Arrow of time

How do you explain an OS kernel to a layperson? + an old text about The Matrix as an Operating System



Earlier this year I was a guest at a gathering of people who were mostly involved with social sciences and politics, and a topic soon arose in which I had to explain some of the things I work with. In this group of 10-ish people there were one or two who had even a vague idea what a kernel is (they were engineering students, actually), and were suitably impressed, but the rest of the group simply offered encouraging blank stares - they were waiting for me to explain what I had just said. Well, the go-to approach when dealing with anything vaguely engineer-ish is to make an ad-hoc metaphor involving a car (at least for a male audience...). But... actually... how DO you compare something from a car with the kernel? You can't compare it to the engine, or to the chassis, or even to the road the car travels on. It's even worse when trying to explain what a kernel programmer can do within an operating system. What would THAT be in the world of car metaphors? A kernel programmer is definitely not a mechanic. Maybe a car... architect? Of sorts? After stuttering long enough I just went all in and tried to impress the girls of the group by saying it's like the kernel developer designs the very laws of physics on which the car runs. They were, sadly, still unconvinced.

Can you do better?

The Matrix came out in 1999, and soon after I became interested in operating systems, which eventually led me to working as a kernel developer on the FreeBSD project. But in the meantime I was learning, and spent considerable time being appropriately geekily fascinated with the topic.

Sometime in the early 2000s, probably somewhere around 2003 or 2004, I wrote what I think is a quite clever comparison between the characters and the happenings in The Matrix and its sequels and a (fictitious) operating system. To save this text from being lost in time like bytes on forgotten servers, and bring it Into The Cloud, I'm copy-pasting the original text below. Be warned: you can stretch metaphors too far.

Since one of my fascinations is operating systems design, implementation and maintenance, ever since I first saw the Matrix movies I've thought some of the concepts in them can be related to familiar concepts in operating systems:

  • The Matrix world: a running operating system, with userland (the "common" world, in which people live) and the kernel (the "Matrix" proper). Apparently it's a pretty buggy OS...
  • People: processes, both kernel processes and user processes. There's a big distinction between normal, "unprivileged" people, and daemons with root privileges - "agents". Root daemons can open privileged ports, kill random processes, manage memory, etc.
  • Matrix: the kernel. It looks like a message passing kernel, not necessarily a microkernel (though there are some microkernel aspects, such as the abundance of kernel processes, strict separation of duty between them, and the already mentioned message passing). The kernel manages all processes, and performs operations on their behalf (such as keeping them alive, servicing them and recycling them). But there's an apparent security defect: some userland processes can (because of a bug) transfer and execute parts of their programs in the kernel space. Only certain syscalls are affected (the "phones"), and this kind of privilege escalation garbles the userland process' return stack, such that if the process receives a signal, it segfaults and is garbage collected (if you're killed in the Matrix, you're dead for real).
  • Oracle: the process (task) scheduler. Has all the numbers from process monitoring (resource usage) and knows in advance (broadly) how to schedule them to run to their optimum.
  • Agents: system monitoring / intrusion detection / prevention system (IDS / IPS) with heuristic operation. Most of them have a kernel part (kernel module) but are basically daemons run with superuser privileges in the userland. They are tasked with finding and killing processes which attempt to violate system security.
  • The trainman: kernel-userland gateway / message passing queue. You've got to go through him if you want to validly pass data between userland and kernel. You also might be stuck in the queue forever.
  • The Merovingian: networking / IPC stack. It's his business to know everything going on between processes. Has a bug manifesting in occasional input / output data corruption.
  • Vampires / ghosts: compatibility shims for older API / KPI versions. Their code is rudimentary and, for historical reasons, interfaces with parts of the kernel that normal processes shouldn't touch (i.e. they have lots of layering violations).
  • The Architect: kernel monitoring infrastructure (supervisor), tasked with monitoring processes, killing those that wedge and restarting those that crashed. Since it's a realtime high-availability OS, the debugging and monitoring infrastructure has the absolute highest priority and is "blessed" to be infallible (thus, to limit the possibility of error, is very limited in its complexity). It's been misconfigured to be overzealous, does availability checking too often, taking too many resources, and so interferes with the normal operation of the operating system.
  • Keymaster: security / privilege subsystem. It's stable, but unfortunately relies on the VM system and the IPC system which are buggy, and can be exploited by processes to gain more privileges from him.
  • THE PLOT: There's a design bug between the VM (virtual memory) system, the process management system and the scheduler, manifesting under high system load (lots of processes, high memory pressure). It is a compound error, which results in at least three things:
      * Memory pages can get corrupted or misassigned to processes that don't own them. Since kernel and userland share the VM, processes on either side can end up with memory pages from the other, revealing sensitive data and making way for privilege escalations. Mixing up the VM pages bypasses address space protection between the processes.
      * The IPC subsystem, bad as it already is, gets even worse when its data structures get corrupted or the memory load gets so high it deadlocks waiting for buffers.
      * The system monitor goes berserk, killing and restarting processes in a loop, unaware that it makes things worse by building additional memory pressure and process load, eventually greatly helping to spread the VM page corruption between the processes.
  • Agent Smith: privileged IPC daemon with part of it implemented as a kernel module. It's so closely tied to the kernel module part that it shares data structures with it without sanity checking. Once it was killed by another privileged process, but it was in the middle of a syscall, so when the monitor restarted it, the corruption which was already done to its process descriptor resulted in most of its program being executed in the kernel context. It continued to work in this corrupted state for a long time, wedged in a loop, erroneously tagging processes as security breaches and overwriting some of their memory pages with its own.
  • Neo: Initially a userland network server process, the VM corruption resulted in it being assigned both superuser privileges and high priority (CPU time). Eventually, it got its executable memory pages mixed up with those of the IDS process Smith, but not the data pages. Before long it also started killing processes, including Smith and his corrupted copies.
  • THE ENDING: process Smith eventually tries to kill the scheduler process, but since it is itself scheduled by it, cannot do so reliably. The system gets wedged because the scheduler cannot perform its tasks anymore, including interrupt servicing, but the part of Smith's code in the scheduler's VM image (which is accidentally also the part shared with process Neo) still runs. Since there are only two processes running, they are both trying to kill each other. Meanwhile, since interrupts are no longer being serviced, the hardware watchdog timer expires and raises an NMI, which wakes up the monitoring system. It decides the system is in a critical state and proceeds to kill all processes, then restarts them to bring the system up again. The End.
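
For readers who have never met a watchdog timer, the mechanism in the ending can be sketched in a few lines of Python. This is only a toy simulation, not how any real kernel or hardware does it: the names `Watchdog`, `pet`, `tick` and `run` are made up for illustration, and "ticks" stand in for hardware timer periods. A healthy system keeps resetting the timer; once the system wedges and stops doing that, the counter runs out and the watchdog fires.

```python
class Watchdog:
    """Toy watchdog timer (hypothetical, for illustration only)."""

    def __init__(self, timeout_ticks: int):
        self.timeout_ticks = timeout_ticks
        self.ticks_since_pet = 0

    def pet(self) -> None:
        # A healthy system resets the counter regularly.
        self.ticks_since_pet = 0

    def tick(self) -> bool:
        # Advance time by one tick; True means the watchdog has expired.
        self.ticks_since_pet += 1
        return self.ticks_since_pet >= self.timeout_ticks


def run(ticks: int, timeout_ticks: int, wedged_at: int):
    """Return the tick at which the watchdog fires, or None if it never does."""
    wd = Watchdog(timeout_ticks)
    for t in range(ticks):
        if t < wedged_at:
            wd.pet()  # interrupts are still being serviced
        if wd.tick():
            return t  # this is where the NMI would be raised
    return None
```

With a timeout of 3 ticks and a system that wedges at tick 4, `run(10, 3, 4)` reports the watchdog firing two ticks later; a system that never wedges never trips it.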

Post mortem analysis: There appears to be an inherent flaw in the design of the operating system, especially in the VM, IPC and monitoring subsystems, resulting in a global memory corruption among processes and critical failure of address space protection for a small number of processes.

Recommendation: More fine tuning is needed to settle the proper process priorities and to reduce priority inversions and imbalance. The VM system probably needs to be rewritten and the IDS system replaced with a less resource intensive version. The system monitor needs to be modified not to start extensive operations if the system load is above a threshold.
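
The last recommendation could look something like the following minimal sketch. Everything in it is an assumption made for illustration: the `Supervisor` class, the 0.8 `load_threshold` and the idea of expressing load as a fraction of capacity are not from any real monitoring system. The point is only that a crashed process is queued instead of restarted while the load is high, so the monitor stops feeding the very overload it is reacting to.

```python
from dataclasses import dataclass, field


@dataclass
class Supervisor:
    """Toy process monitor that defers restarts under high load (hypothetical)."""

    load_threshold: float = 0.8  # assumed cut-off, as a fraction of capacity
    deferred: list = field(default_factory=list)
    restarted: list = field(default_factory=list)

    def handle_crash(self, name: str, system_load: float) -> str:
        # Above the threshold, do not add more work: just remember the process.
        if system_load > self.load_threshold:
            self.deferred.append(name)
            return "deferred"
        self.restarted.append(name)
        return "restarted"

    def drain(self, system_load: float) -> list:
        """Restart deferred processes once the load is at or below the threshold."""
        if system_load > self.load_threshold:
            return []
        done, self.deferred = self.deferred, []
        self.restarted.extend(done)
        return done
```

For example, `Supervisor().handle_crash("smith", 0.95)` defers the restart, and a later `drain(0.3)` brings the process back once things have calmed down.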

There! An interpretation of The Matrix without involving "free will" in any way.
