Operating Systems
This central research area deals with concepts and techniques for the targeted allocation of computing resources to competing processes and the associated control of the execution of their machine programs. Both analytical and constructive approaches are pursued. Tailor-made operating system structures are developed to meet the requirements of a given application domain, either from scratch or as an extension or adaptation of existing solutions. In functional as well as non-functional terms, the approach is both application-oriented and hardware-centric. The work is mainly shaped by multi-core and many-core processor technology and the associated challenges in coordinating the communication and competition of parallel processes.
Projects:
Dynamic Operating-System Specialisation
(Third Party Funds Single)
Project leader:
Project members: , , ,
Start date: 1. May 2022
Acronym: DOSS
Funding source: DFG-Einzelförderung / Sachbeihilfe (EIN-SBH)
URL: https://sys.cs.fau.de/research/doss
Abstract:
An operating system sits between two fronts: "above" it are the machine programs of the applications, with their sometimes very different functional and non-functional requirements, and "below" it is the computer hardware, whose features should ideally be made available to the applications "unfiltered" and "noise-free". However, a general-purpose system cannot be as efficient in any of its functions as a system designed for a specific purpose, and less demanding applications should not be forced to pay for the resources consumed by functions they do not need. It is therefore not uncommon for large systems, once put into operation, to be subject to frequent changes, precisely in order to achieve a better fit to changing application requirements.
The ideal operating system offers exactly what the respective application requires, no more and no less, while also taking the characteristics of the hardware platform into account. However, such an ideal is realistic, if at all, only for a uni-programming mode of operation. With multi-programming, the various applications would need "sufficiently similar" functional and non-functional requirements so that none of them is burdened with the overhead of unneeded functions. An operating system with these characteristics is a special-purpose operating system: it is tailored to the needs of applications of a certain type.
This is in contrast to the general-purpose operating system, where the hope is that an application will not be burdened with excessive overhead from unneeded functions. At the least, one can try to minimise the "background noise" in the operating system where necessary, ideally with a different "discount" depending on the program type. The operating system would then not only have to shed unnecessary ballast dynamically and shrink for less demanding applications, but also be able to grow again, with the necessary additional functions, for more demanding ones. Specialising an operating system for the respective application thus ultimately means functional reduction and enrichment. A suitable system software design is desirable for this, but often can no longer be achieved, especially with legacy systems.
One trigger for specialising an operating system is a measure explicitly initiated "from outside". This affects selected system calls as well as tasks such as bootstrapping and the loading of machine programs, operating system kernel modules, or programs that are to be executed in sandbox-like virtual machines within the operating system kernel. This form of specialisation also enables the dynamic generation of targeted protective measures in response to particularly vulnerable operating system operations, such as loading external modules of the operating system kernel. The other trigger is a measure initiated implicitly "from within". This concerns possible reactions of an operating system to changes in its own runtime behaviour that only become noticeable during operation, so that it can adapt its resource management strategies to the respective workload and seamlessly integrate the corresponding software components into the existing system.
The project addresses dynamic operating system specialisation triggered by extrinsic and intrinsic events. The focus is on concepts and techniques that (a) are independent of a specific programming paradigm or hardware approach and (b) are based on just-in-time (JIT) compilation of parts of the operating system (kernel), so that these parts can be loaded on demand or replaced in anticipation of the respective conditions on the "operating system fronts". Existing general-purpose systems such as Linux are the subject of investigation.
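The idea of replacing a generic operating system function with a variant generated on demand can be illustrated with a small user-space sketch. It is not the project's mechanism: the names `HandlerTable`, `generic_read` and `specialise_read` are hypothetical, and Python's built-in `compile`/`exec` merely stands in for a JIT compiler.

```python
from typing import Callable, Dict


class HandlerTable:
    """Toy dispatch table whose entries can be replaced at run time."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[bytes], bytes]] = {}

    def install(self, name: str, handler: Callable[[bytes], bytes]) -> None:
        # Loading on demand and replacing an entry are the same operation.
        self._handlers[name] = handler

    def unload(self, name: str) -> None:
        # Shrink the table again when a function is no longer needed.
        self._handlers.pop(name, None)

    def call(self, name: str, buf: bytes) -> bytes:
        try:
            handler = self._handlers[name]
        except KeyError:
            raise LookupError(f"handler {name!r} is not loaded") from None
        return handler(buf)


def generic_read(buf: bytes, limit: int = 4096) -> bytes:
    """General-purpose variant: validates its parameter on every call."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return bytes(buf[:limit])


def specialise_read(limit: int) -> Callable[[bytes], bytes]:
    """Generate a variant with `limit` folded in as a constant."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    source = f"def read(buf):\n    return bytes(buf[:{limit}])\n"
    namespace: dict = {}
    exec(compile(source, "<specialised>", "exec"), namespace)
    return namespace["read"]


table = HandlerTable()
table.install("read", generic_read)
generic_result = table.call("read", b"abcdef")

# An extrinsic event (for example, a new application) triggers specialisation.
table.install("read", specialise_read(4))
specialised_result = table.call("read", b"abcdef")
```

The caller's interface stays the same across the swap; only the table entry changes, which is what makes growing and shrinking the function set at run time possible in this toy model.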
Publications:
Luci: Loader-based Dynamic Software Updates for Off-the-shelf Shared Objects
2023 USENIX Annual Technical Conference (Boston, MA, 10. July 2023 - 12. July 2023)
In: 2023 USENIX Annual Technical Conference (USENIX ATC 23) 2023
Open Access: https://www.usenix.org/system/files/atc23-heinloth.pdf
URL: https://www.usenix.org/system/files/atc23-heinloth.pdf
Towards Just-In-Time Compiling of Operating Systems
12th Workshop on Programming Languages and Operating Systems (PLOS 2023)
DOI: 10.1145/3623759.3624551
Non-volatility in energy-aware operating systems
(Third Party Funds Single)
Project leader:
Start date: 1. January 2022
Acronym: NEON
Funding source: DFG-Einzelförderung / Sachbeihilfe (EIN-SBH)
URL: https://sys.cs.fau.de/en/research/neon-note
Abstract:
The current trend toward fast, byte-addressable non-volatile memory (NVM), with latencies and write endurance closer to SRAM and DRAM than to flash, positions NVM as a possible replacement for established volatile technologies. While non-volatility and low leakage power, among other advantageous features, make NVM an attractive candidate for new system designs, there are also major challenges, especially for programming such systems. For example, power failures, combined with the use of NVM to preserve the computation state, result in control flows that can unexpectedly transform a sequential process into a non-sequential one: a program has to deal with its own state from earlier, interrupted runs.
If programs can be executed directly in NVM, conventional volatile main memory becomes functionally superfluous. Volatile memory is then found only in the caches and in device and processor registers ("NVM-pure"). An operating system designed for this can dispense with many, if not all, of the persistence measures that would otherwise be implemented, and thereby reduce its level of background noise. In detail, this allows energy demand to be reduced, computing performance to be increased and latencies to be shortened. In addition, eliminating these persistence measures makes an "NVM-pure" operating system leaner than its functionally identical twin of conventional design. This contributes to better analysability of the operating system's non-functional properties and results in a smaller attack surface and trusted computing base.
The project follows an "NVM-pure" approach. An impending power failure raises an interrupt request (power failure interrupt, PFI), upon which a checkpoint of the unavoidable volatile system state is created. In addition, in order to tolerate possible PFI losses, sensitive operating system data structures are secured in a transactional manner, analogous to methods of non-blocking synchronisation. Furthermore, methods of static program analysis are applied to (1) cleanse the operating system of superfluous persistence measures, which otherwise only generate background noise, (2) break up uninterruptible instruction sequences with excessive interruption latencies, which can cause the PFI-based checkpoint to fail, and (3) define the work areas of the dynamic energy demand analysis. To demonstrate that an "NVM-pure" operating system can operate more efficiently than its functionally identical conventional twin, in terms of both time and energy, the work is carried out with Linux as an example.
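Securing a data structure "in a transactional manner" can be sketched with a generic shadow-copy update, in which a new version only becomes visible through a single final commit step. This is a simplified illustration under stated assumptions, not the project's implementation: `PersistentCell` is a hypothetical name, ordinary Python objects stand in for NVM, and the power failure is simulated by a flag.

```python
class PersistentCell:
    """Two-slot record; `active` selects the committed slot."""

    def __init__(self, initial):
        self.slots = [initial, None]
        self.active = 0

    def read(self):
        return self.slots[self.active]

    def update(self, value, power_fails_before_commit=False):
        shadow = 1 - self.active
        # Step 1: write the new version into the inactive slot.
        self.slots[shadow] = value
        if power_fails_before_commit:
            # Simulated power loss: the commit below never happens,
            # so the previously committed version stays visible.
            return False
        # Step 2: commit by switching a single index.
        self.active = shadow
        return True


cell = PersistentCell({"runnable": 3})
interrupted = cell.update({"runnable": 4}, power_fails_before_commit=True)
after_failure = cell.read()
committed = cell.update({"runnable": 4})
after_commit = cell.read()
```

A reader never observes a half-written version: after the simulated failure the old value is still returned, and only a completed update exposes the new one.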
Publications:
Luci: Loader-based Dynamic Software Updates for Off-the-shelf Shared Objects
2023 USENIX Annual Technical Conference (Boston, MA, 10. July 2023 - 12. July 2023)
In: 2023 USENIX Annual Technical Conference (USENIX ATC 23) 2023
Open Access: https://www.usenix.org/system/files/atc23-heinloth.pdf
URL: https://www.usenix.org/system/files/atc23-heinloth.pdf
On the Performance of NVRAM-based Operating Systems: A Case Study with Linux and FreeBSD
(2023)
ISSN: 2191-5008
DOI: 10.25593/issn.2191-5008/CS-2023-01
On energy awareness in NVRAM-based operating systems – NEON and PAVE
In: Schloss Dagstuhl -- Leibniz-Zentrum für Informatik (ed.): Power and Energy-Aware Computing on Heterogeneous Systems (PEACHES), 2023, p. 43-44 (Dagstuhl Reports, Vol.Dagstuhl Seminar 22341)
DOI: 10.4230/DagRep.12.8.31
Back to the Core-Memory Age: Running Operating Systems in NVRAM only
Architecture of Computing Systems. ARCS 2023 (Athens, 13. June 2023 - 15. June 2023)
In: Georgios Goumas, Sven Tomforde, Jürgen Brehm, Stefan Wildermann, Thilo Pionteck (ed.): Lecture Notes in Computer Science 2023
DOI: 10.1007/978-3-031-42785-5_11
Towards Just-In-Time Compiling of Operating Systems
12th Workshop on Programming Languages and Operating Systems (PLOS 2023)
DOI: 10.1145/3623759.3624551
Power-fail aware byte-addressable virtual non-volatile memory
(Third Party Funds Group – Sub project)
Overall project: SPP 2377: Disruptive Memory Technologies
Project leader: ,
Start date: 5. April 2021
End date: 14. May 2026
Acronym: PAVE
Funding source: DFG / Schwerpunktprogramm (SPP)
URL: https://sys.cs.fau.de/en/research/pave-note
Abstract:
Virtual memory (VM) subsystems blur the distinction between storage and memory, so that both volatile and non-volatile data can be accessed transparently via CPU instructions. Every VM subsystem tries hard to keep highly contended data in fast volatile main memory to mitigate the high access latency of secondary storage, irrespective of whether the data is considered volatile or not. The recent advent of byte-addressable NVRAM does not change this scheme in principle, because current technology can replace neither DRAM as fast main memory, due to its significantly higher access latencies, nor secondary storage, due to its significantly higher price and lower capacity. Ideally, VM subsystems should therefore be NVRAM-aware and be extended so that all available byte-addressable memory technologies can be employed to their respective advantages. By means of an abstraction anchored in the VM management of the operating system, legacy software should then be able to benefit, unchanged and efficiently, from byte-addressable non-volatile main memory.
Because most VM subsystems are complex, highly tuned software systems that have evolved over decades of development, we follow a minimally invasive approach and integrate NVRAM-awareness into an existing VM subsystem instead of developing an entirely new system from scratch. NVRAM will serve as an immediate DRAM substitute in case of main memory shortage and inherently support processes with large working sets. However, due to the high access latencies of NVRAM, non-volatile data also needs to be kept, at least temporarily, in fast volatile main memory and the volatile CPU caches. Our new VM subsystem, for which we want to adapt FreeBSD, then takes care of migrating pages between DRAM and NVRAM where the available resources allow. The DRAM is thus effectively managed as a large software-controlled volatile page cache for NVRAM.
Consequently, this raises the question of data loss caused by power outages. The VM subsystem therefore needs to keep its own metadata in a consistent and recoverable state, and modified pages in volatile memory need to be copied to NVRAM to avoid losses. The former requires an extremely efficient transactional mechanism for modifying complex, highly contended VM metadata, while the latter must cope with potentially large numbers of modified pages with limited energy reserves.
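One way to reason about the tension between many modified pages and limited energy reserves is to bound the number of dirty pages in DRAM so that they can always be written back within the available energy. The following is a toy model under stated assumptions, not the PAVE design: `DramPageCache`, the FIFO write-back order and the energy figures in microjoules are all hypothetical.

```python
class DramPageCache:
    """Toy DRAM page cache in front of NVRAM with a bounded dirty set."""

    def __init__(self, energy_budget_uj: int, cost_per_page_uj: int) -> None:
        if cost_per_page_uj <= 0:
            raise ValueError("cost per page must be positive")
        # Largest number of pages that can be flushed on residual energy.
        self.max_dirty = energy_budget_uj // cost_per_page_uj
        if self.max_dirty < 1:
            raise ValueError("budget must cover at least one page")
        self.dram = {}
        self.nvram = {}
        self.dirty = []  # page numbers, oldest modification first

    def write(self, page: int, data: bytes) -> None:
        if page not in self.dirty:
            if len(self.dirty) >= self.max_dirty:
                # Write back the oldest dirty page early to stay in budget.
                self._write_back(self.dirty[0])
            self.dirty.append(page)
        self.dram[page] = data

    def _write_back(self, page: int) -> None:
        self.nvram[page] = self.dram[page]
        self.dirty.remove(page)

    def power_fail(self) -> int:
        """Flush all dirty pages; returns the number of pages copied."""
        flushed = 0
        for page in list(self.dirty):
            self._write_back(page)
            flushed += 1
        return flushed


cache = DramPageCache(energy_budget_uj=30, cost_per_page_uj=10)
for number in range(5):
    cache.write(number, b"page %d" % number)
dirty_before_failure = len(cache.dirty)
flushed_on_failure = cache.power_fail()
```

With a budget for three pages, writing five pages forces two early write-backs, so the flush on power failure never has to copy more than three pages.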
Publications:
On the Performance of NVRAM-based Operating Systems: A Case Study with Linux and FreeBSD
(2023)
ISSN: 2191-5008
DOI: 10.25593/issn.2191-5008/CS-2023-01
Back to the Core-Memory Age: Running Operating Systems in NVRAM only
Architecture of Computing Systems. ARCS 2023 (Athens, 13. June 2023 - 15. June 2023)
In: Georgios Goumas, Sven Tomforde, Jürgen Brehm, Stefan Wildermann, Thilo Pionteck (ed.): Lecture Notes in Computer Science 2023
DOI: 10.1007/978-3-031-42785-5_11
NVall: A Crash-Resistant and Kernel-Compatible Memory Allocator for NVRAM
FG-BS Herbsttreffen 2023 (Bamberg, 28. September 2023 - 29. September 2023)
DOI: 10.18420/fgbs2023h-02
URL: https://dl.gi.de/items/8d0686f6-a88e-4e96-af34-2743d49b99a4
Towards Just-In-Time Compiling of Operating Systems
12th Workshop on Programming Languages and Operating Systems (PLOS 2023)
DOI: 10.1145/3623759.3624551
Resilient Power-Constrained Embedded Communication Terminals
(Third Party Funds Group – Sub project)
Overall project: SPP 2378 Resilient Worlds
Project leader: ,
Project members: , ,
Start date: 26. March 2021
Acronym: SPP 2378 ResPECT
Funding source: Deutsche Forschungsgemeinschaft (DFG)
Abstract:
Within the broad subject of resilience in networked worlds, ResPECT focuses on a core element of all networked systems: sensor and actuator nodes in cyber-physical systems. To this day, communication is understood and implemented as an auxiliary function of embedded systems. The system itself is disruption-tolerant and able to handle power failures or, to a limited extent, even hardware problems, but communication is not part of the overall design. At best, it can make use of the underlying system resilience. ResPECT develops a holistic operating system and communication protocol stack on the assumption that conveying information (receiving control data for actuators or sending sensor data) is a core task of all networked components. Consequently, it must become part of the operating system's management functionality.
ResPECT builds on two pillars: non-volatile memory and transactional operation. In recent years, non-volatile memory has evolved into a serious element of the storage hierarchy, and even embedded platforms with exclusively non-volatile memory are becoming conceivable. Network communication, unlike social communication, is transactional by design: data is collected and then transmitted between the communication partners under channel constraints such as latency, error resilience and energy consumption, and under content constraints such as the age, and hence the value, of the information. Unlike operating systems, however, this communication faces many external disruptions and influences. In addition, the duration of a disruption can have severe implications for the validity of already completed transactions, such as the persistence of the physical connection. All of this has to be taken into account on resumption.
Through interdisciplinary research by operating system and communication experts, ResPECT will therefore develop a model based on transactions and will use non-volatile memory to ensure that the states during the flow of transactions are known at any point in time and are stored persistently. This monitoring and storing functionality must be very efficient (with respect to energy consumption as well as to the amount of data to be stored in non-volatile memory) and must hence be implemented as a core functionality of the operating system. To ensure generalizability and to make the model available for a variety of future platforms, ResPECT will focus on IP networks and use communication networks that are typically operated as WAN, LAN or PAN (wide, local or personal area networks).
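How a persistently stored transaction state might be evaluated on resumption can be sketched as follows. This is an illustrative model only, not the ResPECT protocol: the states, the `max_age` validity window and the function `resume` are hypothetical, and a plain Python object stands in for a record kept in non-volatile memory.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    COLLECTED = 1  # data gathered, not yet transmitted
    SENT = 2       # transmitted, acknowledgement outstanding
    ACKED = 3      # receipt confirmed by the peer


@dataclass
class Transaction:
    payload: bytes
    created_at: float  # seconds, from a clock that survives the outage
    max_age: float     # how long the information keeps its value
    state: State = State.COLLECTED


def resume(tx: Transaction, now: float) -> str:
    """Decide what to do with a persisted transaction after an outage."""
    if tx.state is State.ACKED:
        return "done"
    if now - tx.created_at > tx.max_age:
        # The disruption lasted too long: the data has lost its value.
        return "discard"
    # COLLECTED or SENT without acknowledgement: try again.
    return "retransmit"


tx = Transaction(payload=b"t=21.5", created_at=0.0, max_age=10.0)
short_outage = resume(tx, now=5.0)
long_outage = resume(tx, now=20.0)
```

The decision depends on both the stored state and the duration of the disruption, which is why the state has to be known and persistent at every step of the transaction.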
Publications:
WIP: Towards a Transactional Network Stack for Power-Failure Resilience
In: Proceedings of the 21st IEEE Consumer Communications & Networking Conference (CCNC) - Work-In-Progress 2024
DOI: 10.1109/CCNC51664.2024.10454781
URL: https://ieeexplore.ieee.org/document/10454781
Towards Just-In-Time Compiling of Operating Systems
12th Workshop on Programming Languages and Operating Systems (PLOS 2023)
DOI: 10.1145/3623759.3624551
WoCA: Avoiding Intermittent Execution in Embedded Systems by Worst-Case Analyses with Device States
25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2024) (Copenhagen, Denmark, 24. June 2024 - 28. June 2024)
In: Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2024) 2024
URL: https://sys.cs.fau.de/publications/2024/raffeck_24_lctes.pdf
VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions
RAID'24: The 27th International Symposium on Research in Attacks, Intrusions and Defenses (Padua, 30. September 2024 - 2. October 2024)
In: Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses, RAID 2024
DOI: 10.1145/3678890.3678907
URL: https://arxiv.org/pdf/2405.00078
PfIP: A UDP/IP Transactional Network Stack for Power-Failure Resilience in Embedded Systems
22nd IEEE Consumer Communications & Networking Conference (CCNC) (Las Vegas, NV, USA, 10. January 2025 - 13. January 2025)
In: Proceedings of the Consumer Communications & Networking Conference (CCNC 2025) 2025
Migration-Aware Multi-Core Real-Time Executive
(Own Funds)
Project leader:
Project members:
Start date: 11. August 2020
Acronym: maRE
Abstract:
This research proposal investigates the predictability of task migration in multi-core real-time systems. To this end, we propose maRE, a migration-aware real-time executive in which migration decisions are no longer based on generic performance parameters but are systematically derived from application-specific knowledge of the real-time tasks. These so-called migration hints relate to temporal and spatial aspects of real-time tasks; they mark potential migration points in their non-sequential (multi-threaded) machine programs. Migration hints enable the operating system to reach decisions that have the most favorable impact on overall predictability and system performance. The proposal assumes that application-specific hints on admissible and particularly favorable program points for migration represent a cost-effective way to leverage multi-core platforms with existing real-time systems and scheduling techniques. The object of investigation is multi-core platforms with heterogeneous memory architectures. The focus is on the worst-case overhead caused by migration in such systems, which depends mainly on the current size and location of the real-time tasks' resident core-local data. This data set, which varies in size over execution time, is determined using tool-based static analysis techniques that derive usable migration hints at design time. In addition, the proposal develops migration-aware variants of standard real-time operating systems, which provide specialized interfaces and mechanisms to use these migration hints as favorable migration points at runtime, in order to provide predictable migrations and optimize the overall schedulability and performance of the system.
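The role of migration hints can be illustrated with a small selection routine that picks, among the program points marked at design time, the one with the smallest resident core-local data set whose worst-case migration cost fits a given budget. The linear cost model, the names `MigrationHint` and `best_hint`, and all numbers are assumptions made for this sketch, not results or interfaces of the project.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MigrationHint:
    program_point: str   # label of an admissible migration point
    resident_bytes: int  # core-local data resident at that point


def migration_cost_us(hint: MigrationHint, bytes_per_us: float,
                      fixed_overhead_us: float) -> float:
    """Assumed worst-case cost: fixed overhead plus data transfer time."""
    return fixed_overhead_us + hint.resident_bytes / bytes_per_us


def best_hint(hints: Iterable[MigrationHint], bytes_per_us: float,
              fixed_overhead_us: float,
              budget_us: float) -> Optional[MigrationHint]:
    """Return the cheapest hint within the budget, or None if none fits."""
    admissible = [
        h for h in hints
        if migration_cost_us(h, bytes_per_us, fixed_overhead_us) <= budget_us
    ]
    if not admissible:
        return None
    return min(admissible, key=lambda h: h.resident_bytes)


hints = [
    MigrationHint("loop_entry", 4096),
    MigrationHint("after_filter", 512),
    MigrationHint("before_io", 2048),
]
chosen = best_hint(hints, bytes_per_us=256.0, fixed_overhead_us=2.0,
                   budget_us=12.0)
none_fits = best_hint(hints, bytes_per_us=256.0, fixed_overhead_us=2.0,
                      budget_us=3.0)
```

Because the hints and their data sizes are known before run time, the selection itself is cheap and its worst-case cost can be bounded, which is the property a real-time executive needs.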
Publications:
Contact Persons:
Participating Scientists:
- Rüdiger Kapitza
- Wolfgang Schröder-Preikschat
- Luis Gerhorst
- Phillip Raffeck
- Peter Wägemann
- Jürgen Kleinöder
- Dustin Tien Nguyen
- Thomas Preisner
- Maximilian Ott
Publications:
Luci: Loader-based Dynamic Software Updates for Off-the-shelf Shared Objects
2023 USENIX Annual Technical Conference (Boston, MA, 10. July 2023 - 12. July 2023)
In: 2023 USENIX Annual Technical Conference (USENIX ATC 23) 2023
Open Access: https://www.usenix.org/system/files/atc23-heinloth.pdf
URL: https://www.usenix.org/system/files/atc23-heinloth.pdf
Nowa: A Wait-Free Continuation-Stealing Concurrency Platform
35th IEEE International Parallel & Distributed Processing Symposium (IPDPS) (Portland, Oregon, 17. May 2021 - 21. May 2021)
In: 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2021
DOI: 10.1109/IPDPS49936.2021.00044
URL: https://www4.cs.fau.de/~flow/papers/schmaus2021nowa.pdf