Documentation
Preparation
In this course we will extend our StuBS OS (developed during the operating systems lecture's exercises) with common isolation features. A basic StuBS implementation to build upon is provided via GitLab – you only have to choose between the two flavors:
- OOStuBS (single core)
- MPStuBS (multi core, required for 7.5 ECTS)
Please note: We strongly recommend using the provided source. There is a good chance that your own StuBS code from last semester still contains undiscovered bugs and lacks some useful features (like a dynamic allocator), so you should avoid extending it and stick to our skeleton 😉.
In contrast to the operating system exercises, you will not receive any updated sources from us (unless we have urgent bug fixes), so it is entirely up to you how you structure your code for the upcoming assignments. However, this also means you will have to stick with it until the end of the semester, so try to organize it as well as possible!
User Applications in Ring 3
In this course, you will successively extend your operating system to use modern isolation techniques, fully separating each user process from the kernel and other user processes.
As a first step, you have to modify StuBS so that application code always runs in protection ring 3 and only the handling of interrupts (especially time slice scheduling interrupts) is performed in ring 0.
Global Descriptor Table
Currently, in long mode StuBS employs a Global Descriptor Table (GDT) that only contains entries for the kernel (ring 0). In order to be able to execute code in user mode, you have to extend the GDT with entries for code and data in ring 3. Additionally, you have to introduce a descriptor entry for each Task State Segment (TSS, see below).
Further information and a detailed description of the data structures can be found in the Intel Software Developer’s Manual Volume 3 in section 3.4.5 Segment Descriptors.
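As a rough illustration of the descriptor layout, the following sketch packs base, limit, access byte and flags into the 8-byte format from the Intel manual and builds flat ring 3 code and data descriptors. The function make_descriptor and all constant names are assumptions made for this example and are not taken from the StuBS skeleton, which may organize its GDT differently.

```cpp
#include <cstdint>

// Access byte: P (bit 7), DPL (bits 6:5), S (bit 4), type (bits 3:0).
constexpr uint8_t ACCESS_PRESENT = 0x80;
constexpr uint8_t ACCESS_SEGMENT = 0x10;  // S = 1: code or data segment
constexpr uint8_t ACCESS_CODE_RX = 0x0A;  // executable and readable
constexpr uint8_t ACCESS_DATA_RW = 0x02;  // readable and writable data

// Descriptor privilege level (ring) shifted into bits 6:5 of the access byte.
constexpr uint8_t access_dpl(unsigned ring) {
    return static_cast<uint8_t>((ring & 3) << 5);
}

// Flags nibble: G (bit 3), D/B (bit 2), L (bit 1), AVL (bit 0).
constexpr uint8_t FLAG_GRANULARITY = 0x8;  // limit counted in 4 KiB pages
constexpr uint8_t FLAG_32BIT       = 0x4;  // D/B, used here for data segments
constexpr uint8_t FLAG_LONG        = 0x2;  // L, 64-bit code segment

// Hypothetical helper: pack all fields into one 8-byte segment descriptor.
constexpr uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                   uint8_t access, uint8_t flags) {
    return  static_cast<uint64_t>(limit & 0xFFFF)                  // limit 15:0
         | (static_cast<uint64_t>(base & 0xFFFFFF) << 16)          // base 23:0
         | (static_cast<uint64_t>(access) << 40)                   // access byte
         | (static_cast<uint64_t>((limit >> 16) & 0xF) << 48)      // limit 19:16
         | (static_cast<uint64_t>(flags & 0xF) << 52)              // flags
         | (static_cast<uint64_t>((base >> 24) & 0xFF) << 56);     // base 31:24
}

// Flat ring 3 descriptors (base 0, maximum limit).
constexpr uint64_t USER_CODE = make_descriptor(
    0, 0xFFFFF,
    ACCESS_PRESENT | access_dpl(3) | ACCESS_SEGMENT | ACCESS_CODE_RX,
    FLAG_GRANULARITY | FLAG_LONG);   // 0x00AFFA000000FFFF

constexpr uint64_t USER_DATA = make_descriptor(
    0, 0xFFFFF,
    ACCESS_PRESENT | access_dpl(3) | ACCESS_SEGMENT | ACCESS_DATA_RW,
    FLAG_GRANULARITY | FLAG_32BIT);  // 0x00CFF2000000FFFF
```

The only difference to the corresponding ring 0 descriptors is the DPL field, so the existing kernel entries can serve as a template for the new ones.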
Introduce User/Kernel Stacks
Code running in ring 0 should use a dedicated kernel stack (separate from the user stack). For this, we have to exploit the ancient x86 mechanism for hardware tasks. The TSS structure is, like the GDT, a silent reminder of this originally intended purpose, and nowadays most of its content is there for legacy reasons only. Modern operating systems (like StuBS, of course) use software tasks instead, and the TSS is solely used to keep track of the kernel stack pointer: During a switch from ring 3 to ring 0, the CPU loads the stack pointer from the corresponding value stored in the TSS. Consequently, a separate TSS (and hence a descriptor for it in the GDT, see above) is required for each core: For a single-core system like OOStuBS, one TSS is sufficient, while a multi-core system requires one for each possible core (Core::MAX in MPStuBS). Using the task register (load instruction ltr), each core is able to determine (through an indirection via the GDT) the location of its TSS. The Intel manual describes the TSS structure and the load procedure in Section 7.2 Task Management Data Structures.
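The GDT entry for a TSS can be sketched as follows. In long mode a TSS descriptor is 16 bytes wide, so it occupies two consecutive GDT slots per core. The names TssDescriptor, make_tss_descriptor and selector are hypothetical and only serve this example.

```cpp
#include <cstdint>

// A 64-bit TSS descriptor spans two 8-byte GDT slots.
struct TssDescriptor {
    uint64_t low;   // same layout as a regular descriptor
    uint64_t high;  // bits 31:0 hold base 63:32, the rest is reserved (zero)
};

// Hypothetical helper: build the descriptor for a TSS located at `base`.
// `limit` is the size of the TSS in bytes minus one.
constexpr TssDescriptor make_tss_descriptor(uint64_t base, uint32_t limit) {
    return TssDescriptor{
          static_cast<uint64_t>(limit & 0xFFFF)                 // limit 15:0
        | ((base & 0xFFFFFF) << 16)                             // base 23:0
        | (static_cast<uint64_t>(0x89) << 40)                   // present, DPL 0, type 9
                                                                // (available 64-bit TSS)
        | (static_cast<uint64_t>((limit >> 16) & 0xF) << 48)    // limit 19:16
        | (((base >> 24) & 0xFF) << 56),                        // base 31:24
        base >> 32                                              // base 63:32
    };
}

// Segment selector for a GDT index: index in bits 15:3, RPL in bits 1:0.
// This is the value a core would pass to ltr for its own TSS entry.
constexpr uint16_t selector(unsigned index, unsigned rpl = 0) {
    return static_cast<uint16_t>((index << 3) | (rpl & 3));
}
```

With one such entry per core, each core would execute ltr with the selector of its own entry once during startup.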
On each task switch in Dispatcher you have to update the TSS so it will contain the kernel stack pointer of the current thread – extend Thread / StackPointer accordingly.
- Attention
- To avoid stack overflows with all their strange behavior, you should make sure to always set the top-of-(kernel-)stack in the TSS instead of the current (kernel-)stack pointer.
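The following sketch shows the 64-bit TSS layout and how the kernel stack top could be stored on a task switch. The struct and field names, the helper kernel_stack_top and the function on_dispatch are assumptions for illustration. How Thread and StackPointer expose the stack base and size is up to your design.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit Task State Segment; only rsp0 matters for ring 3 -> ring 0 switches.
struct TSS {
    uint32_t reserved0;
    uint64_t rsp0;        // stack pointer loaded on a switch to ring 0
    uint64_t rsp1;
    uint64_t rsp2;
    uint64_t reserved1;
    uint64_t ist[7];      // interrupt stack table (unused in this sketch)
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;  // offset of the I/O permission bitmap
} __attribute__((packed));

static_assert(sizeof(TSS) == 104, "64-bit TSS must be 104 bytes");
static_assert(offsetof(TSS, rsp0) == 4, "rsp0 directly follows the first reserved field");

// Top of the kernel stack (highest address, 16-byte aligned), NOT the
// current kernel stack pointer of the thread.
constexpr uint64_t kernel_stack_top(uint64_t stack_base, uint64_t stack_size) {
    return (stack_base + stack_size) & ~uint64_t{15};
}

// Hypothetical hook, called by the dispatcher for the thread being resumed.
inline void on_dispatch(TSS& tss, uint64_t stack_base, uint64_t stack_size) {
    tss.rsp0 = kernel_stack_top(stack_base, stack_size);
}
```

Storing the top of the stack means every entry from ring 3 starts on an empty kernel stack, regardless of how deep the thread was in the kernel when it was last switched away.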
Initial Switch to Ring 3
The start of user threads uses a kickoff function similar to Thread::kickoff. However, the new kickoff has to perform a switch to user mode (ring 3) before it is able to call the target function (Application::action()). To do the switch, you will have to exploit the fact that interrupted threads automatically return to their previous ring using iretq: You have to fake an interrupt stack in which the two least significant bits of the segment selector entries specify the desired protection level (ring 3 in your case).
In case you struggle with the structure of the stack layout for interrupts, Section 6.12 Exception and Interrupt Handling of the Intel manual might enlighten you.
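A minimal sketch of such a faked interrupt stack is shown below. In 64-bit mode, iretq pops the instruction pointer, code segment, flags, stack pointer and stack segment in this order. The GDT indices, the struct InterruptFrame and the helper make_user_frame are hypothetical and have to be adapted to your own GDT layout.

```cpp
#include <cstdint>

// Assumed GDT indices of the ring 3 descriptors (adapt to your GDT).
constexpr uint16_t USER_CODE_INDEX = 3;
constexpr uint16_t USER_DATA_INDEX = 4;
constexpr uint16_t RING3           = 3;  // requested privilege level

constexpr uint64_t RFLAGS_RESERVED = 1u << 1;  // bit 1 always reads as 1
constexpr uint64_t RFLAGS_IF       = 1u << 9;  // interrupts enabled in user mode

// Layout in order of increasing addresses, i.e. in the order iretq pops it.
struct InterruptFrame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

// Build the frame that makes iretq "return" to `entry` in ring 3.
constexpr InterruptFrame make_user_frame(uint64_t entry, uint64_t user_stack_top) {
    return InterruptFrame{
        entry,
        static_cast<uint64_t>((USER_CODE_INDEX << 3) | RING3),  // lowest two bits = 3
        RFLAGS_RESERVED | RFLAGS_IF,
        user_stack_top,
        static_cast<uint64_t>((USER_DATA_INDEX << 3) | RING3)
    };
}
```

The kickoff code would place such a frame on the kernel stack, let the stack pointer refer to its first field and then execute iretq.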
- Attention
- Please be aware that we still need kernel threads running on ring 0 (e.g., IdleThread)!
Also note that privileged instructions are not permitted in ring 3. While this may sound obvious, the usage of such instructions in our code might not be obvious at all: kout uses the hardware cursor, which is accessed through IOPort (inb and outb). These instructions are allowed in ring 0 only and will otherwise leave you with a General Protection Fault (GPF). To avoid these obstacles, it is sufficient for your Application to just print to a custom TextStream (without hardware cursor) in an endless loop (without calling any additional functions like GuardedBell::sleep() or Keyboard::getKey()).
At some point (after you've fixed all the GPFs), you might find yourself asking "Am I already/really in user mode now...?". There is an easy approach to answer this question: Just read the code segment register (cs, use inline assembly) and check its two least significant bits. If both are set, you are (finally) in ring 3!
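The check described above could look like the following sketch, written for GCC-style inline assembly. The function names privilege_level, read_cs and in_user_mode are made up for this example.

```cpp
#include <cstdint>

// The two least significant bits of a selector encode the privilege level.
constexpr unsigned privilege_level(uint16_t selector) {
    return selector & 3;
}

#if defined(__x86_64__) && defined(__GNUC__)
// Read the current code segment selector (not a privileged instruction).
inline uint16_t read_cs() {
    uint16_t cs;
    asm volatile("mov %%cs, %0" : "=r"(cs));
    return cs;
}

// True if the calling code currently executes in ring 3.
inline bool in_user_mode() {
    return privilege_level(read_cs()) == 3;
}
#endif
```

Since the result cannot be printed via kout from ring 3, one option is to write it to the custom TextStream used by the application.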
CPU Core Local Storage (required in MPStuBS)
In some way, StuBS already has core local variables: For example, the epilogue queue in Guard consists of an array with one element per core and employs Core::getID() (which itself utilizes LAPIC::getID() in conjunction with a look-up table) for each access. A more advanced approach employs the currently unused gs segment register (the other extra segment, fs, is reserved for thread-local storage according to the System V ABI (Section 10.3)): On each core, gs points to a different part of memory (set via Core::MSR), each having the same structure (for example implemented as a struct array with Core::MAX elements, ideally cache aligned). Access it in assembly with gs:OFFSET, where OFFSET is the byte position of the required element in the struct (the GCC intrinsic __builtin_offsetof can be useful).
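A possible shape of such a core local structure and a gs-relative read is sketched below. The struct CoreLocal, its fields, MAX_CORES (standing in for Core::MAX) and gs_core_id are assumptions for this example. The access function is only meaningful in the kernel after the gs base of the current core has been set up.

```cpp
#include <cstddef>
#include <cstdint>

constexpr unsigned    MAX_CORES  = 8;   // stand-in for Core::MAX
constexpr std::size_t CACHE_LINE = 64;  // typical x86 cache line size

// One instance per core; cache alignment avoids false sharing between cores.
struct alignas(CACHE_LINE) CoreLocal {
    uint64_t self;  // hypothetical: address of this structure
    uint32_t id;    // core identifier, filled in during startup
};

// The gs base of core i would point to core_local[i].
static CoreLocal core_local[MAX_CORES];

#if defined(__x86_64__) && defined(__GNUC__)
// Read the id field of the current core via gs:OFFSET.
// %c1 emits the offset as a plain constant, e.g. "movl %gs:8, %eax".
inline uint32_t gs_core_id() {
    uint32_t id;
    asm volatile("movl %%gs:%c1, %0"
                 : "=r"(id)
                 : "i"(offsetof(CoreLocal, id)));
    return id;
}
#endif
```

Compared to the look-up based Core::getID(), this read needs neither an LAPIC access nor a table look-up, only a single memory access relative to gs.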
Since an application in user mode can access gs as well, we want to have a separate value for the kernel, swapped on each ring switch using the assembly instruction swapgs. The initial value is stored in the Model Specific Register MSR_GS_BASE, which gets swapped with MSR_SHADOW_GS_BASE during this instruction.
- Note
- Make sure to only swap the segment on an actual ring switch (an interrupt while executing code in ring 0 should not swap the segment!). For testing purposes, you should assign different values to these MSRs (since we only use core local storage in ring 0, the userspace gs base can simply be set to 0).
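The rule from the note can be modeled in plain C++ to make the intended behavior explicit. This is only a simulation of the two base registers, not kernel code. GsBases and the function names are hypothetical, and the decision is based on the code segment selector saved in the interrupt frame.

```cpp
#include <cstdint>

// Model of the two base values involved in swapgs.
struct GsBases {
    uint64_t active;  // corresponds to MSR_GS_BASE (used by gs: accesses)
    uint64_t shadow;  // corresponds to MSR_SHADOW_GS_BASE
};

// Effect of the swapgs instruction: exchange both values.
constexpr GsBases swapgs(GsBases g) {
    return GsBases{g.shadow, g.active};
}

// The interrupted code ran in ring 3 if the saved cs has a nonzero RPL.
constexpr bool came_from_user(uint64_t saved_cs) {
    return (saved_cs & 3) != 0;
}

// Interrupt entry: swap only if the interrupt caused a ring switch.
constexpr GsBases on_interrupt_entry(GsBases g, uint64_t saved_cs) {
    return came_from_user(saved_cs) ? swapgs(g) : g;
}

// Interrupt exit: swap back only if we return to ring 3.
constexpr GsBases on_interrupt_exit(GsBases g, uint64_t saved_cs) {
    return came_from_user(saved_cs) ? swapgs(g) : g;
}
```

With a user base of 0 and a distinct kernel base, an unconditional swap in a nested interrupt would make the kernel use base 0, which is exactly the kind of error the different test values are meant to expose.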
For this exercise, you have to implement a getID() function using this technique, which should return the same value as Core::getID(). Measure the performance of both functions using the TSC according to Intel's How to Benchmark Code Execution Times, both in an emulator (QEMU/KVM) and on real bare-metal hardware.
- Attention
- Don't forget to use CPUID to check if required instructions like rdtscp are available on the current (virtual) hardware!
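The availability check and a simple measurement loop could be sketched as follows, using the cpuid.h and x86intrin.h headers shipped with GCC and Clang. The helper names has_rdtscp, cpu_has_rdtscp and min_cycles are made up for this example, and the loop only loosely follows the sequence from the Intel paper (CPUID, RDTSC, measured code, RDTSCP, CPUID). A real measurement should additionally disable interrupts and account for the measurement overhead itself.

```cpp
#include <cstdint>

#if defined(__x86_64__) && defined(__GNUC__)
#include <cpuid.h>
#include <x86intrin.h>
#endif

// CPUID leaf 0x80000001 reports RDTSCP support in EDX bit 27.
constexpr uint32_t CPUID_EXT_FEATURES = 0x80000001;
constexpr uint32_t EDX_RDTSCP         = 1u << 27;

constexpr bool has_rdtscp(uint32_t edx) {
    return (edx & EDX_RDTSCP) != 0;
}

#if defined(__x86_64__) && defined(__GNUC__)
// Query the current (virtual) CPU for RDTSCP support.
inline bool cpu_has_rdtscp() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(CPUID_EXT_FEATURES, &eax, &ebx, &ecx, &edx) == 0)
        return false;  // leaf not supported at all
    return has_rdtscp(edx);
}

// Run fn() `runs` times and return the smallest observed cycle count.
// Only call this if cpu_has_rdtscp() returned true.
template <typename F>
uint64_t min_cycles(F&& fn, unsigned runs) {
    uint64_t best = UINT64_MAX;
    for (unsigned i = 0; i < runs; ++i) {
        unsigned a, b, c, d, aux;
        __cpuid(0, a, b, c, d);         // serialize before reading the TSC
        uint64_t start = __rdtsc();
        fn();
        uint64_t end = __rdtscp(&aux);  // waits for fn() to complete
        __cpuid(0, a, b, c, d);         // keep later code out of the window
        if (end - start < best)
            best = end - start;
    }
    return best;
}
#endif
```

Taking the minimum over many runs is one simple way to filter out disturbances. The Intel paper discusses more thorough statistics.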
After a successful verification, you can modify the PerCore wrapper to use the new getID(), which will slightly improve the performance of Guard and Scheduler (however, the impact might not really be noticeable to the user).