Our OS lacks a synchronous way back from user mode (ring 3) to kernel (ring 0). To overcome this issue, you have to introduce a system call interface.
1. Interrupt-based System Calls
A very basic approach is to trigger a software interrupt (vector 0x80, as in Linux, for example) to switch to the kernel.
A trap via the int instruction is subject to a privilege check: when used in ring 3 without special preparation, it just causes a General Protection Fault (exception number 13) instead of invoking the desired system call vector. However, it is possible to modify the Interrupt Descriptor Table (IDT) and allow a specific system call vector to be triggered from ring 3. Additionally, since our interrupt_handler is designed for device interrupts, you are advised to create a custom system call entry (in assembly) with a corresponding high-level handler function (in C++). Register the new entry function using IDT::handle() for your system call vector (the parameter dpl defines the allowed privilege level).
- All system calls should be synchronized using the epilogue level (employ Guarded in your system call handler)!
Since you switch the stack from user to kernel, passing arguments over the stack would be highly cumbersome. It is better to pass the arguments via registers only. Conveniently, this is also the default for (the first six) function parameters according to the x64 SystemV ABI (Section 3.2.3).
For the upcoming assignments, you should prepare for system calls with up to five parameters. Since all system calls use the same vector, you'll also need an identifier to distinguish them; you can handle them in a big switch statement.
It is strongly recommended to extensively test the passing of all five arguments and the return value using a custom test system call.
- If done properly, you are able to avoid copying or saving/restoring registers – it's not just less assembly code but also faster.
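One way to distinguish the system calls by their identifier is a switch-based dispatcher in the high-level handler. The following is only a minimal sketch: the enum Syscall, the function sys_test and the choice of passing the identifier as the sixth parameter are assumptions for illustration, not part of the provided code base. Passing the identifier last leaves the five arguments in exactly the registers in which the SystemV ABI expects the first five function parameters.

```cpp
#include <stddef.h>

// Hypothetical system call identifiers (names and numbering are assumptions).
enum class Syscall : size_t { TEST = 0, NOP = 1 };

// Hypothetical test system call: weights every parameter differently, so a
// swapped or clobbered argument changes the returned value.
static size_t sys_test(size_t p1, size_t p2, size_t p3, size_t p4, size_t p5) {
    return p1 + 10 * p2 + 100 * p3 + 1000 * p4 + 10000 * p5;
}

// High-level handler called from the assembly entry function.
// In the kernel, a Guarded object would be created at the top of this
// function so that every system call runs on the epilogue level.
extern "C" size_t syscall_handler(size_t p1, size_t p2, size_t p3,
                                  size_t p4, size_t p5, size_t id) {
    switch (static_cast<Syscall>(id)) {
        case Syscall::TEST:
            return sys_test(p1, p2, p3, p4, p5);
        case Syscall::NOP:
            return 0;
        default:
            // Unknown identifier: report an error value to the caller.
            return static_cast<size_t>(-1);
    }
}
```

Calling the test system call with the arguments 1, 2, 3, 4, 5 should yield 54321; any other result points to a register that was lost or reordered on the way into the kernel.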
Functionality provided by the System Calls
Implement the following system calls (with reasonable semantics):
size_t write(int fd, const void *buf, size_t len);
size_t read(int fd, void *buf, size_t len);
void sleep(int ms);
int sem_init(int semid, int value);
void sem_destroy(int semid);
void sem_wait(int semid);
void sem_signal(int semid);
Separating each function into a stub (for ring 3) and a skeleton (ring 0) part (with the system call handler acting as dispatcher) might improve the readability of your code structure. It may also be advisable to employ the write system call in an OutputStream-compatible wrapper to retain the simple and accustomed output functionality.
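A wrapper around the write system call could look roughly like the following sketch. The class SyscallStream, its buffer size and the WriteFn type are hypothetical; the actual OutputStream interface of the code base may differ, so treat this as an illustration of the buffering idea only. The write stub is passed in as a function pointer so that the sketch does not depend on a particular stub implementation.

```cpp
#include <stddef.h>

// Signature of the ring 3 write stub (matches the prototype listed above).
using WriteFn = size_t (*)(int fd, const void *buf, size_t len);

// Hypothetical buffered output stream for user applications.
class SyscallStream {
    WriteFn writer;
    int fd;
    char buffer[80];
    size_t pos = 0;

 public:
    SyscallStream(WriteFn writer, int fd) : writer(writer), fd(fd) {}
    ~SyscallStream() { flush(); }

    // Hand all buffered characters to the kernel with a single system call.
    void flush() {
        if (pos > 0) {
            writer(fd, buffer, pos);
            pos = 0;
        }
    }

    // Buffer one character; flush on newline or when the buffer is full.
    SyscallStream &operator<<(char c) {
        buffer[pos++] = c;
        if (pos == sizeof(buffer) || c == '\n') {
            flush();
        }
        return *this;
    }

    // Buffer a zero-terminated string character by character.
    SyscallStream &operator<<(const char *s) {
        while (*s != '\0') {
            *this << *s++;
        }
        return *this;
    }
};
```

Buffering keeps the number of ring transitions low: a whole line costs one system call instead of one per character.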
2. Fast System Calls
While interrupt-based system calls are fairly easy to implement, they cause significant overhead, leading to a notable performance degradation. To overcome this issue, lightweight mechanisms like sysenter/sysexit (Intel) and syscall/sysret (AMD) have been introduced. For compatibility reasons, x64 systems use the latter.
To prepare your system for those fast system calls, you should start by adjusting the GDT according to the required layout described in Intel's manual at 5.8.8 Fast System Calls in 64-Bit Mode for the Model Specific Register MSR_STAR. The pointer to a new assembly entry function (calling the system call handler already implemented for the interrupt-based variant) needs to be assigned to MSR_LSTAR. In MSR_SFMASK you can define all bits which should automatically be cleared from the flags register upon executing syscall (bit 9, the interrupt enable flag, would be an excellent choice). And above all: don't forget to enable the syscall instruction by setting the MSR_EFER_SCE bit in the Extended Feature Enable Register.
- To return from the kernel during a fast system call, you have to use the instruction o64 sysret in NASM (without the o64 only a 32-bit sysret will be performed) or sysretq (with the suffix q) in GCC inline assembly!
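The values for the MSR setup described above can be computed as in the following sketch. The constant names reuse the identifiers from the text, but their definitions here, the helper star_value and the GDT ordering (kernel code, kernel data, one spare slot, user data, 64-bit user code) are assumptions; check them against your own GDT. The layout follows from how the selectors are derived: syscall loads CS from STAR[47:32] and SS from that value plus 8, while a 64-bit sysret loads SS from STAR[63:48] plus 8 and CS from STAR[63:48] plus 16.

```cpp
#include <stdint.h>

// MSR numbers as documented in the Intel and AMD manuals.
constexpr uint32_t MSR_EFER = 0xC0000080;    // Extended Feature Enable Register
constexpr uint32_t MSR_STAR = 0xC0000081;    // segment selectors for syscall/sysret
constexpr uint32_t MSR_LSTAR = 0xC0000082;   // 64-bit entry point for syscall
constexpr uint32_t MSR_SFMASK = 0xC0000084;  // flag bits cleared on syscall

constexpr uint64_t MSR_EFER_SCE = 1ULL << 0;  // enables syscall/sysret
constexpr uint64_t FLAGS_IF = 1ULL << 9;      // interrupt enable flag

// Assumed GDT order (8 bytes per entry, entry 0 is the null descriptor).
constexpr uint16_t KERNEL_CS = 1 * 8;             // kernel data must follow at +8
constexpr uint16_t SYSRET_BASE = 3 * 8;           // spare slot before the user segments
constexpr uint16_t USER_DS = SYSRET_BASE + 8;     // loaded into SS by sysret
constexpr uint16_t USER_CS = SYSRET_BASE + 16;    // loaded into CS by a 64-bit sysret

// Combine both selector bases into the 64-bit value for MSR_STAR.
constexpr uint64_t star_value(uint16_t kernel_cs, uint16_t sysret_base) {
    return (static_cast<uint64_t>(sysret_base) << 48) |
           (static_cast<uint64_t>(kernel_cs) << 32);
}

// Value for MSR_SFMASK: interrupts are disabled on kernel entry.
constexpr uint64_t SFMASK_VALUE = FLAGS_IF;
```

These values would then be written with wrmsr using whatever MSR helper your kernel provides: star_value(KERNEL_CS, SYSRET_BASE) to MSR_STAR, the address of the entry function to MSR_LSTAR, SFMASK_VALUE to MSR_SFMASK, and the current content of MSR_EFER with MSR_EFER_SCE set back to MSR_EFER.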
Arguments can be passed by register, similar to the interrupt-based approach. However, the rcx register (4th parameter) is also used by the syscall instruction; maybe you can work around this issue by switching the parameter to another (scratch) register?
Functionality provided by the System Calls
Each fast system call should provide the same functionality as the corresponding interrupt-based one. To distinguish between them, you can prefix them with something like fast_. You probably want to use a custom test function to validate the passing of the maximum number of parameters.
Benchmark (7.5 ECTS)
For the extended exercise, you have to introduce an additional system call nop for both variants, which does exactly what you would expect from its name: no operation, nothing. Its only purpose is performance analysis:
Use your benchmark experience gathered in the previous assignment to measure the performance of both variants using the nop system call. Bear in mind that some of the proposed benchmark instructions might not work in ring 3.
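A measurement loop for the nop system call could be structured like the sketch below. The function average_ticks is hypothetical, and both the time source and the operation are passed in as parameters: this way the same loop can be used for either variant, and the time source can be swapped for one that works in ring 3.

```cpp
#include <stdint.h>

// Runs op() `iterations` times and returns the average cost of one call,
// measured in ticks of the supplied counter.
//   now: callable returning the current counter value as uint64_t
//   op:  callable performing the operation under test (e.g. the nop stub)
template <typename Clock, typename Op>
uint64_t average_ticks(Clock now, Op op, uint64_t iterations) {
    if (iterations == 0) {
        return 0;  // nothing measured, avoid division by zero
    }
    const uint64_t start = now();
    for (uint64_t i = 0; i < iterations; i++) {
        op();
    }
    const uint64_t end = now();
    return (end - start) / iterations;
}
```

Measuring many iterations and dividing afterwards amortizes the cost of reading the counter itself; comparing the result against a run with an empty operation gives an estimate of the loop overhead.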