Research Projects
Our recent funded projects:
-
Energy-, Latency- and Resilience-aware Networking
(Third Party Funds Group – Sub project)
Overall project: SPP 1914 "Cyber-Physical Networking (CPN)"
Term: since 1. January 2020
Funding source: DFG / Schwerpunktprogramm (SPP)
URL: https://www.nt.uni-saarland.de/project/latency-and-resilience-aware-networking-larn/
-
Non-volatility in energy-aware operating systems
(Third Party Funds Single)
Term: since 1. January 2022
Funding source: DFG-Einzelförderung / Sachbeihilfe (EIN-SBH)
The current trend toward fast, byte-addressable non-volatile memory (NVM), with latencies and write endurance closer to SRAM and DRAM than to flash, positions NVM as a possible replacement for established volatile technologies. Non-volatility and low leakage power, among other advantageous features, make NVM an attractive candidate for new system designs, but there are also major challenges, especially for the programming of such systems. For example, when NVM preserves the computation state across power failures, the resulting control flows can unexpectedly turn a sequential process into a non-sequential one: a program has to deal with its own state from earlier, interrupted runs.
If programs can be executed directly in NVM, conventional volatile main memory becomes functionally superfluous. Volatile memory is then found only in the cache and in device and processor registers ("NVM-pure"). An operating system designed for this can dispense with many, if not all, of the persistence measures that would otherwise be implemented, and thereby reduce its level of background noise. In detail, this allows energy demand to be reduced, computing performance to be increased, and latencies to be shortened. In addition, eliminating these persistence measures makes an "NVM-pure" operating system leaner than its functionally identical twin of conventional design. This contributes to better analysability of the non-functional properties of the operating system and results in a smaller attack surface and trusted computing base.
The project follows an "NVM-pure" approach. An impending power failure raises an interrupt request (power failure interrupt, PFI), upon which a checkpoint of the unavoidably volatile system state is created. In addition, to tolerate possible PFI losses, sensitive operating system data structures are secured in a transactional manner, analogous to methods of non-blocking synchronisation. Furthermore, methods of static program analysis are applied to (1) cleanse the operating system of superfluous persistence measures, which otherwise only generate background noise, (2) break up uninterruptible instruction sequences with excessive interruption latencies, which can cause PFI-based checkpointing to fail, and (3) define the work areas of the dynamic energy demand analysis. To demonstrate that an "NVM-pure" operating system can operate more efficiently than its functionally identical conventional twin, in terms of both time and energy, the work is carried out with Linux as an example.
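As an illustration only, the transactional protection of operating system data structures mentioned above can be pictured as a shadow-copy update with a single commit step. The following Python sketch is not the project's implementation: `SimulatedNVM`, `transactional_update`, `recover`, and the `fail_before_commit` switch are hypothetical names, and the assumption that the one-word selector is updated atomically is ours.

```python
class PowerFailure(Exception):
    """Simulated loss of power in the middle of an update."""


class SimulatedNVM:
    """Hypothetical stand-in for NVM: two data slots plus a one-word
    selector that is assumed to be written atomically."""

    def __init__(self, initial):
        self.slots = [dict(initial), dict(initial)]
        self.active = 0  # index of the slot holding the committed state


def transactional_update(nvm, changes, fail_before_commit=False):
    """Apply `changes` so that a power failure leaves either the old or
    the new state visible, never a mixture of both."""
    shadow = 1 - nvm.active
    # Step 1: build the new version in the inactive (shadow) slot.
    nvm.slots[shadow] = dict(nvm.slots[nvm.active])
    for key, value in changes.items():
        nvm.slots[shadow][key] = value
        if fail_before_commit:
            # Simulate power loss after the first, partial write.
            raise PowerFailure("power lost before commit")
    # Step 2: commit by flipping the selector (assumed atomic).
    nvm.active = shadow


def recover(nvm):
    """After a restart, only the committed slot is consulted."""
    return nvm.slots[nvm.active]


nvm = SimulatedNVM({"head": 0, "tail": 0})
try:
    transactional_update(nvm, {"head": 5, "tail": 7}, fail_before_commit=True)
except PowerFailure:
    pass
state_after_failure = dict(recover(nvm))   # still the old, consistent state

transactional_update(nvm, {"head": 5, "tail": 7})
state_after_commit = dict(recover(nvm))    # the new state, fully applied
```

The half-written shadow slot is simply ignored on recovery, which is why a lost PFI does not leave the structure inconsistent in this model.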
-
Power-Aware Critical Sections
(Third Party Funds Single)
Term: 1. January 2015 - 30. September 2022
Funding source: DFG-Einzelförderung / Sachbeihilfe (EIN-SBH)
Race conditions among concurrent processes within a computing system may cause partly inexplicable phenomena or even defective run-time behaviour. The cause lies in critical sections of non-sequential programs. Solutions for protecting critical sections generally face a multi-dimensional problem space: (1) processor-local interrupts, (2) shared-memory multi/many-core multiprocessors with (2a) coherent or (2b) incoherent caches, (3) distributed-memory systems with a global address space, and (4) interference with the process management of the operating system. In each case, the protection method makes pessimistic or optimistic assumptions about the occurrence of access contention.
The number of contending processes depends on the use case and has a large impact on the effectiveness of their coordination at all levels of a computing system. Overhead, scalability, and dedication of the protective function thereby constitute decisive performance-affecting factors. This influencing quantity accounts not only for varying process run-times but also for differing energy use. The former results in noise or jitter in the program flow: non-functional properties that are especially problematic for highly parallel or real-time dependent processes. The latter, in contrast, has economic importance as well as ecological consequences and also touches the scalability limit of many-core processors (dark silicon).
Depending on the structural complexity of a critical section and its sensitivity to contention, a trade-off becomes apparent, which the project tackles by analytical and constructive measures. The objects of investigation are the group's own special-purpose operating systems, which were designed primarily to support parallel and partly also real-time dependent data processing, and Linux.
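The trade-off between pessimistic and optimistic protection under varying contention can be sketched as a simple cost comparison. The Python example below is a hedged illustration, not the project's infrastructure: the `Policy` class, the linear cost model, the numeric costs, and the `switch_overhead` parameter are all assumptions made for the example, with costs in abstract units that could stand for time or energy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical protection policy with a linear cost model."""
    name: str
    base_cost: float        # assumed cost per entry without contention
    contention_cost: float  # assumed extra cost per contending process

    def cost(self, contenders: int) -> float:
        return self.base_cost + self.contention_cost * contenders


def choose_policy(current, candidates, contenders, entries_expected,
                  switch_overhead):
    """Keep `current` unless another policy saves more over the expected
    number of section entries than switching itself costs."""
    best = min(candidates, key=lambda p: p.cost(contenders))
    saving = (current.cost(contenders) - best.cost(contenders)) * entries_expected
    return best if saving > switch_overhead else current


# Assumed numbers: optimistic protection is cheap when uncontended but
# degrades quickly; pessimistic protection costs more up front.
optimistic = Policy("optimistic", base_cost=1.0, contention_cost=4.0)
pessimistic = Policy("pessimistic", base_cost=3.0, contention_cost=0.5)
both = [optimistic, pessimistic]

# High contention: the saving outweighs the switching overhead.
high = choose_policy(optimistic, both, contenders=8,
                     entries_expected=100, switch_overhead=50.0)

# Mild contention, few entries: the saving is too small to justify a switch.
mild = choose_policy(optimistic, both, contenders=1,
                     entries_expected=10, switch_overhead=50.0)

# No contention: switching back to the optimistic variant pays off.
idle = choose_policy(pessimistic, both, contenders=0,
                     entries_expected=100, switch_overhead=50.0)
```

In this model a section changes its protection only when the expected gain exceeds the cost of reconfiguring, so short bursts of contention do not cause the policy to flip back and forth.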
The goal is to provide (a) a software infrastructure for load-dependent change of protection against critical race conditions of concurrent processes, self-organized by the program sections themselves, and (b) tools for preparing, characterising, and capturing those sections. Hotspots that are caused by increased process activity and manifest themselves as rising energy use and temperature shall be avoided or attenuated, on demand or in anticipation, by a section-specific dispatch policy. The overhead induced by the particular dispatch policy is factored into the decision on dynamically reconfiguring a critical section, so that a change is made only if a real practical gain over the original solution can be expected. Before-and-after comparisons based on the investigated operating systems shall demonstrate the effectiveness of the approach developed.
-
Resource-Efficient Fault and Intrusion Tolerance
(Third Party Funds Single)
Term: since 1. October 2009
Funding source: DFG-Einzelförderung / Sachbeihilfe (EIN-SBH)
URL: https://www4.cs.fau.de/Research/REFIT/
Internet-based services play a central role in today's society. With such services progressively taking over from traditional infrastructures, their complexity steadily increases. On the downside, this leads to more and more faults occurring. As improving software-engineering techniques alone will not do the job, systems have to be prepared to tolerate faults and intrusions.
REFIT investigates how systems can provide fault and intrusion tolerance in a resource-efficient manner. The key technology to achieve this goal is virtualization, as it enables multiple service instances to run in isolation on the same physical host. Server consolidation through virtualization not only saves resources in comparison to traditional replication, but also opens up new possibilities to apply optimizations (e.g., deterministic multi-threading).
Resource efficiency and performance of the REFIT prototype are evaluated using a web-based multi-tier architecture, and the results are compared to non-replicated and traditionally replicated scenarios. Furthermore, REFIT develops an infrastructure that supports the practical integration and operation of fault- and intrusion-tolerant services, for example in the context of cloud computing.
-
Invasive Run-Time Support System (iRTSS) (C01)
(Third Party Funds Group – Sub project)
Overall project: TRR 89: Invasive Computing
Term: 1. July 2010 - 30. June 2022
Funding source: DFG / Sonderforschungsbereich / Transregio (SFB / TRR)
Subproject C1 investigates system software for invasive-parallel applications. It provides methods, principles, and abstractions for the application-aware extension, configuration, and adaptation of invasive computing systems by means of a novel, highly flexible operating system infrastructure. For practical use, this infrastructure is integrated into a Unix host system. The research covers (1) new design and implementation approaches for concurrency-aware operating systems, (2) novel AOP-like methods for the static and dynamic (re)configuration of operating systems, and (3) agent-based approaches for the scalable and flexible management of resources.
-
Security in Invasive Computing Systems (C05)
(Third Party Funds Group – Sub project)
Overall project: TRR 89: Invasive Computing
Term: 1. July 2010 - 30. June 2022
Funding source: DFG / Sonderforschungsbereich / Transregio (SFB / TRR)
The project investigates requirements and mechanisms for protecting resource-aware, reconfigurable hardware/software architectures against malicious attackers. The focus is on implementing information flow control by means of isolation mechanisms at the application, operating system, and hardware levels. The aim is to gain insight into the interactions between security and the predictability of critical properties of an invasive computing system.