37. Mach, Another Distributed Microkernel

Part of the 22C:116 Lecture Notes for Fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Introduction

Recently, Mach has come to prominence as the microkernel underlying MacOS X. Mach was designed at Carnegie Mellon University in the late 1970's and early 1980's, with the specific goal of producing a new distributed foundation on which Unix systems could be implemented. The goal of providing absolute compatibility with Unix forced some compromises in the design of this system, and it also means that most Mach users are unaware of Mach itself, viewing it as merely a new foundation layer for Unix implementation. In fact, Mach is far more than this!

Mach is a microkernel, in much the same way that Demos and Amoeba are. As with these systems, the primary emphasis of Mach is to provide communication tools that allow efficient implementation of user-level services such as file systems. Above the microkernel, most system functions in Mach are expected to take the form of servers, and most classical system calls end up being implemented as client-server interactions.

Like Amoeba and Demos, the Mach kernel supports processes and interprocess communication, and does not provide higher level services such as file systems. Like Amoeba, Mach supports kernel threads and has a process model that defines a process as a protection domain in which one or more threads run. Like Demos but unlike Amoeba, Mach's message passing model uses kernel-protected capabilities, so all of the interprocess communication capabilities of a process are packaged in a C-list associated with that process. Like Amoeba, Mach supports a segmented memory model, but unlike Amoeba, Mach allows individual pages of a segment to be either mapped to page frames or unmapped.

Mach Processes

Each Mach process is made up of a virtual address space, a collection of threads, and a C-list. The capabilities in the C-list refer to communication ports and semaphores. (Note that capabilities for semaphores were not part of Amoeba or Demos.) Processes are passive! The kernel maintains, on behalf of each process, four communications ports. Each process has an exception port; whenever a thread in the process raises an exception, for example, by dividing by zero, making an illegal memory reference, or causing a similar problem, the kernel sends a message to the exception port. This message identifies the cause of the exception in detail.

Messages sent to a process's process port are interpreted by the kernel. All kernel operations on processes that may be carried out by other processes are implemented by messages to the process port instead of by kernel calls. As such, no kernel calls take process ID's as parameters; instead, operations that might take that form are formulated as messages to the process port of the process being manipulated. Process termination, suspension, resumption, priority changes, and so on are all handled this way.

The bootstrap port is used to start a process. The initial thread of the process typically begins by reading from the bootstrap port; when a process is started, the first thing its creator does is send a message to that process's bootstrap port containing the information that process needs in order to run. Typically, this message will include the ports that make up its standard environment and the parameters with which it was created.

Finally, note that each process is initially created with one thread, and this thread has a thread port. Messages sent to the thread port of a thread are interpreted by the kernel as operations on that thread. Because capabilities for all threads of a process will typically be found in the C-list of that process, it is possible for threads to operate on each other's states as well as on their own.
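The process structure described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the names Port, Thread, Process and kernel_interpret, and the "suspend" and "resume" operation strings, are hypothetical and are not Mach's actual interfaces. A capability is modeled as nothing more than a reference to a port object.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Port:
    """A toy message queue standing in for a Mach port (hypothetical model)."""
    name: str
    queue: deque = field(default_factory=deque)

    def send(self, message):
        self.queue.append(message)

    def receive(self):
        return self.queue.popleft()


@dataclass
class Thread:
    """A thread, with the port through which it is manipulated."""
    thread_port: Port
    suspended: bool = False


class Process:
    """A passive container: address space, threads, C-list and kernel ports."""

    def __init__(self, name):
        self.regions = []                      # the virtual address space
        self.exception_port = Port(name + ".exception")
        self.process_port = Port(name + ".process")
        self.bootstrap_port = Port(name + ".bootstrap")
        first = Thread(Port(name + ".thread0"))
        self.threads = [first]                 # created with one thread
        # The C-list holds capabilities; here, simply references to ports.
        self.c_list = [first.thread_port]


def kernel_interpret(thread):
    """Stand-in for the kernel acting on messages sent to a thread port."""
    while thread.thread_port.queue:
        op = thread.thread_port.receive()
        if op == "suspend":
            thread.suspended = True
        elif op == "resume":
            thread.suspended = False


# The creator sends start-up information to the bootstrap port ...
child = Process("child")
child.bootstrap_port.send({"args": ["demo"], "stdout": Port("console")})
# ... and the child's initial thread begins by reading it.
startup = child.bootstrap_port.receive()

# Any holder of a thread-port capability can operate on that thread.
child.c_list[0].send("suspend")
kernel_interpret(child.threads[0])
```

Note that nothing in the sketch takes a process ID as a parameter; every operation is expressed as a message sent to a port that the caller already holds.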

Memory

Mach has a paged model of memory, with the virtual address space of a process divided up into regions, where regions are separated by undefined areas of the address space, and regions are divided up into pages. Regions may correspond to segments on those machines where the address space is paged and segmented, but the term region is used in order to decouple the user-level notions of addressing from the implementation in terms of any particular memory management unit or higher level system built on top of Mach.

The Mach kernel does not handle page faults! Instead, the Mach trap handler for memory management faults simply stops the thread that caused the trap and sends a message to the page-fault port of the memory object (the region) in which the fault occurred. The message contains the information necessary for the fault handler -- a thread of some process -- to handle the fault and then restart the thread that caused the fault. It is easy to infer that this information includes the thread-handle of the thread that caused the fault, allowing access to the registers of that thread and allowing the fault handler to restart that thread, and it includes information from the MMU about the nature of the fault. If the fault handler decides that the memory reference was indeed illegal, the fault handler must raise an exception by passing a message to the process's exception port. Therefore, the fault message to the page fault handler must include information about the process that hosted the thread that raised the fault.

If a thread accesses a memory location that is not in any of the memory regions assigned to the thread's process, the kernel sends a message directly to the process's exception port. Note that this implies that all threads of a process will have the same exception handler!
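The routing decision described in the last two paragraphs can be sketched as follows. This is a minimal model, not Mach code: the page size, the class and field names, and the use of plain lists in place of ports are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass
class Region:
    start: int                                        # first address in region
    length: int                                       # size in bytes
    mapped_pages: set = field(default_factory=set)    # pages that have frames
    pager_inbox: list = field(default_factory=list)   # stands in for the page-fault port

    def contains(self, address):
        return self.start <= address < self.start + self.length


@dataclass
class Process:
    regions: list
    exception_inbox: list = field(default_factory=list)  # stands in for the exception port


def memory_trap(process, thread_id, address):
    """Toy version of the kernel's trap handler: it only routes a message."""
    for region in process.regions:
        if region.contains(address):
            page = (address - region.start) // PAGE_SIZE
            if page in region.mapped_pages:
                return "ok"                      # no fault at all
            # Unmapped page of a valid region: ask that region's fault handler.
            # The process reference stands in for a capability.
            region.pager_inbox.append(
                {"thread": thread_id, "process": process, "page": page})
            return "page-fault"
    # Address lies in no region: notify the process's exception handler.
    process.exception_inbox.append(
        {"thread": thread_id, "cause": "illegal address", "address": address})
    return "exception"


heap = Region(start=0x10000, length=4 * PAGE_SIZE, mapped_pages={0})
proc = Process(regions=[heap])
results = [
    memory_trap(proc, 1, 0x10010),                # page 0, mapped
    memory_trap(proc, 1, 0x10000 + PAGE_SIZE),    # page 1, unmapped
    memory_trap(proc, 2, 0x90000),                # outside every region
]
```

The exception inbox belongs to the process rather than to a thread, which is why threads 1 and 2 in this sketch would necessarily share one exception handler.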

The kernel services for memory management are quite primitive: Physical page frames may be allocated and associated with a page of a region, and they may be deallocated. Regions are first class objects and may be shared.

Threads

Mach uses relatively lightweight kernel threads. Thread creation and management are all done by kernel calls, either directly or by sending messages to a process or a thread management port. Obviously, the create-thread message is sent to the process management port of the process where that thread will be created, and the messages to manipulate the state of an existing thread are directed to the thread management port of that thread. Aside from this, Mach threads are fairly conventional kernel threads, so little more needs to be said about them.

Ports

Mach ports are not attributes of a process. Instead, ports are first class objects. Thus, both the sender and receiver of a message must have capabilities that name the port through which the message is sent. Ports may have multiple potential recipients, for example, processes that cooperatively share the load of processing messages to a particular port. However, Mach does not permit multiple recipients to actively read from one port, so Mach communication is not completely symmetrical. Rather, at any time, only one process may be registered to read from any particular port. This allows fault tolerant servers to be constructed in much the same way they are in Amoeba.
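The asymmetry between senders and the single registered receiver can be illustrated with a small sketch. The Port class, its method names, and the use of strings to identify processes are assumptions made for this example only; they do not correspond to real Mach calls.

```python
from collections import deque


class Port:
    """Toy first-class port: many senders, one registered receiver at a time."""

    def __init__(self):
        self.queue = deque()
        self.receiver = None          # the single process registered to read

    def register_receiver(self, process_name):
        self.receiver = process_name

    def send(self, message):
        # Any holder of a capability for the port may send.
        self.queue.append(message)

    def receive(self, process_name):
        if process_name != self.receiver:
            raise PermissionError(process_name + " is not the registered receiver")
        return self.queue.popleft()


service = Port()
service.register_receiver("primary")
service.send("request 1")
service.send("request 2")

first = service.receive("primary")

# A backup server may not read while the primary is registered ...
try:
    service.receive("backup")
    backup_blocked = False
except PermissionError:
    backup_blocked = True

# ... but if the primary fails, the backup can take over the same port,
# and the requests still queued on the port are not lost.
service.register_receiver("backup")
second = service.receive("backup")
```

Because the port outlives any one receiver, clients keep sending to the same port across the takeover, which is the property a fault tolerant server needs.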

Mach processes have capability lists, and these lists indicate which ports and semaphores the process may reference!

All threads within one process, and all processes that share one memory region, must run on the same shared-memory machine, possibly with more than one processor, but ports may be shared by processes running on many different machines in the network environment. Port creation and destruction are primitive kernel operations, and capabilities for ports may be included in messages, along with data. Unlike Amoeba and like Demos, though, capabilities sent in messages are strictly segregated from other data in the message.
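One way to picture the segregation of capabilities from data is a message with two separate sections, where the kernel interprets only the capability section. The sketch below assumes hypothetical names throughout (PortCapability, Message, kernel_deliver); the delivery rule of installing capabilities in the receiver's C-list and returning their indices is an illustrative guess at how a kernel-protected scheme could behave, not a description of Mach's message format.

```python
from dataclasses import dataclass, field


class PortCapability:
    """Opaque token naming a port; it cannot be built from message bytes."""

    def __init__(self, port_name):
        self.port_name = port_name


@dataclass
class Message:
    data: bytes = b""                                  # uninterpreted bytes
    capabilities: list = field(default_factory=list)   # kept apart from data


def kernel_deliver(message, receiver_c_list):
    """Toy delivery: install each capability in the receiver's C-list and
    hand the receiver indices into that list rather than the tokens."""
    if not all(isinstance(c, PortCapability) for c in message.capabilities):
        raise TypeError("capability section may contain only capabilities")
    indices = []
    for cap in message.capabilities:
        receiver_c_list.append(cap)
        indices.append(len(receiver_c_list) - 1)
    return message.data, indices


c_list = [PortCapability("bootstrap")]
reply = Message(data=b"open ok", capabilities=[PortCapability("file-42")])
data, slots = kernel_deliver(reply, c_list)
```

Keeping the two sections apart means a process cannot manufacture a capability by writing suitable bytes into the data section, since the kernel never looks for capabilities there.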

Virtual Memory in Mach

A typical page fault exception handler for Mach might have a structure like this:

    loop
        await page-fault exception message
        -- assert it's an illegal memory reference!
        get from message, capabilities for:
            the process that caused the fault
            the thread in that process that caused the fault
            the segment that the fault was in
        get from message
            the virtual address in the segment
        pick a page to replace (in that segment?)
        -- each of the following operations involves
        -- sending a message to some other server
        map the segment into the handler's address space
        write that page to disk
        unmap that page from its segment (frees one page frame)
        map the page that caused the fault (allocates a page frame)
        read the contents of that page from disk
        start the thread that caused the fault
    end loop

This page fault handler runs entirely outside the kernel! If a user of the Mach kernel wants to write their own fault handler, they may; all the user has to do is create a memory region with a user-provided page fault handler instead of using the standard handler. The user's handler is free to handle things in an eccentric way, including forwarding exception messages to the standard handler!
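The handler loop above can be mimicked in a few lines of Python to show the order of operations. This is a sketch under several assumptions: direct function calls replace the messages a real handler would exchange with other servers, the replacement policy is FIFO within the faulting region (the pseudocode leaves the policy open), pages are four bytes long, and all names are hypothetical.

```python
from collections import OrderedDict


class Region:
    """Toy memory region with a fixed budget of page frames."""

    def __init__(self, frame_limit):
        self.frame_limit = frame_limit
        self.resident = OrderedDict()   # page number -> contents, oldest first
        self.disk = {}                  # backing store: page number -> contents


def handle_fault(region, page):
    """One pass through the handler loop for a fault on `page` of `region`."""
    if len(region.resident) >= region.frame_limit:
        # Pick a page to replace (FIFO here), write it to disk, and unmap it.
        victim, contents = region.resident.popitem(last=False)
        region.disk[victim] = contents
    # Map the faulting page and read its contents from disk (zero-fill if new).
    region.resident[page] = region.disk.get(page, bytes(4))
    return "restart thread"


def access(region, page, write=None):
    """Stand-in for a thread touching a page; faults go to the handler."""
    faults = 0
    if page not in region.resident:
        handle_fault(region, page)
        faults = 1
    if write is not None:
        region.resident[page] = write
    return region.resident[page], faults


region = Region(frame_limit=2)
access(region, 0, write=b"AAAA")
access(region, 1, write=b"BBBB")
access(region, 2, write=b"CCCC")        # evicts page 0 to disk
value, faults = access(region, 0)       # faults page 0 back in, evicts page 1
```

Swapping in a different handle_fault for one region, while other regions keep the default, corresponds to creating a memory region with a user-provided page fault handler.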

Some of the most interesting experiments with distributed virtual memory were conducted on the Mach system, with fault handlers on different machines cooperating to send pages back and forth between machines so that the processes on the different machines thought they had shared memory when they, in fact, were connected by a network.