35. Amoeba, a Cryptographic Security Example

Part of the 22C:116 Lecture Notes for Fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Introduction

Andrew Tannenbaum's Amoeba operating system illustrates a complete server-side implementation of capability-based security in a distributed system. Unlike the example presented in the previous lecture, access rights are included in each capability, and the integrity of the access rights field of each capability is guaranteed by the server-side checking mechanism.

Amoeba Capabilities

Tannenbaum's Amoeba system supports capabilities formatted as follows:

          | server id | resource id | rights |  check   |
          |    48     |      24     |    8   |   48     |
          |___________|_____________|________|__________|
          |___________|_____________|________|__________|
Amoeba uses server-side authentication; thus, the kernel is only concerned with the server ID field; this is used to send messages to the appropriate server. The interpretation of the other fields is entirely up to the server! The resource ID field is similar to the queue-number field of a Demos link, and the check field serves essentially the same role of the cookies used for the server-side Demos like model presented in the previous lecture.

The server ID used in Amoeba is not the machine ID; rather, it is a "free floating" number with no direct connection to any particular machine or process. Servers may register themselves with the local kernel as serving some particular server ID; if the registration is legitimate, the process will receive requests for service at that ID. The model Amoeba uses to determine the legitimacy of these registrations is also cryptographic, but we will ignore it for the time being.

Amoeba provides a standard set of interpretations for the fields described above, but this interpretation is not enforced. Processes may depart from the standard, although there is no good reason to do so. In the standard interpretation, these fields are used as follows:

The resource ID field is used by the server to identify which of many objects it is being asked to operate on. Amoeba assumes that each server process will manage many different objects. This, in turn, implies large servers, which implies an assumption that process creation will be a heavyweight operation and should be done infrequently.

Contrast this design with a hypothetical system where process creation is very inexpensive, and therefore, it would be reasonable to represent objects by processes, so the server-id field would identify the object itself and the resource-id field could therefore be either very small or entirely absent. Here, we obviously assume that each server will support operations on a large number of objects. In effect, an Amoeba server is expected to represent an entire class and not just one object.

The rights field confers access rights to the object that the server has extended to the client. Each bit is assumed to confer one right; if the rights field is all ones, the user has all rights. If the object is a file, for example, the rights might refer to such operations as read and write.

The check field is used by the server to assure that the capability is valid and that the rights field has not been corrupted. For each object allocated by a server, it will typically assign a random check number that must match the check number provided with a capability by the client. This prevents the accidental forgery of capabilities. The server must store the check numbers for all objects it creates so that it can check the legitimacy of capabilities presented to it.

The problem of preventing clients from asserting rights they do not have is solved by an ingeneous cryptographic trick. The check field given to the client is not an exact copy of the check number held by the server; instead if C is the server's private check number for the object, and the user's rights are R, the user's check field must be f(C xor R). When the user presents a capability to the server, it may easily compute f(C xor R) and compare it with the check field. The standard function f is a trapdoor function, that is, a function which is very easy to compute in the forward direction but with a very difficult to compute inverse.

For an ideal trapdoor function operating on an n-bit argument, the time-space product to compute f would be O(nk), while the time-space product for computing f-1 would be O(2n). Note for comparison that an add can be done in O(log n) time using O(n log n) gates in parallel hardware, or it can be done serially in O(n) time using O(1) gates. Similarly, multiplication can also be done in parallel in O(log n) time using O(n2 log n) gates, or using a shift and add algorithm in O(n log n) time using O(n log n) gates or using the same algorithm with bit-serial addition in O(n2) time using O(1) gates. In contrast, table-lookup evaluation for an n-bit function takes takes a table of size O(2n) and requires at least O(log n) time for address decoding.

The use of a trapdoor function to verify capabilities means that only the server may restrict the rights on a capability. This, in turn, means that every standard Amoeba server includes a "restrict" operation that takes a new rights mask and a capability as parameters, and returns a restricted version of that capability. To avoid the expense of this operation on capabilities with a rights mask of all ones, the server's check number C is included literally in such capabilities and the one-way function f is not used. Because there is a standard one-way function available to all users, a user of a capability with a rights mask of all ones may restrict it locally. This introduces no security loophole precisely because that user already had all rights to the object, and so, that user gained nothing by being given the "secret key" used by the server to manipulate that object.

Amoeba includes a clever trick to allow the owner of an object, that is, to allow a process with all rights to an object to perform restriction without consulting the server. In the special case where the rights field is 11111111, the Amoeba system includes the server's private check number as the check field, literally. This gives nothing away, because the owner of an object has all rights to that object, so could not use this information to forge additional rights. The trick with the trapdoor function is only needed to protect capabilities that confer some subset of the rights.

Amoeba Message Delivery

The Amoeba system does not use the actual ID of a process for message delivery. Instead, the server ID field of an Amoeba capability is simply an integer, with no fixed connection to either the recipient process or to any particular machine.

Each machine in an Amoeba system runs a local kernel, a large part of which is concerned with message delivery. This kernel maintains a table, local to each machine, showing what local processes are currently registered to receive messages addressed to particular server IDs.

The correctness of Amoeba's message delivery system rests on the fact that, when a machine wishes to send a message using an unknown server ID, the machine can broadcast a request to all kernels asking "what machine is currently running a server for this ID?" If one of the kernels has that ID in its local server table, it replies with its network address, allowing the sending machine to deliver the message.

For efficiency, this broadcast protocol must be used only rarely. The first step in ensuring that it is only rarely used is a local cache maintained by each kernel. This cache lists recently encountered server IDs and the network addresses where the servers reside. These cache entries are created by replies to the kinds of broadcasts mentioned above, and they are retained using something like an LRU replacement policy.

If an Amoeba system was enlarged to the point where this cache scheme was insufficient to limit the use of this broadcast protocol traffic to an acceptable level, the system could easily be modified to use a low-level name server, with broadcasts being used only when both the local cache and the name server had no record of some server. If the system was enlarged even more, so that the name server became a bottleneck, it could be broken down into a server hierarchy, with local, regional and global servers.

Amoeba Server Registration

Amoeba servers must register their server ID with the local kernel. This registration would open a security loophole if it were done the obvious way, by allowing any process to register an ID it wanted. The Amoeba system avoids this potential problem by using mechanisms very similar to those used to create the check field of a capability.

In the Amoeba system, a distinction is made between the private server ID, typically a value that is a closely held secret of the server, and the public server ID that is used in all capabilities that address resources held by that server. The Amoeba kernel service allowing a server to register a new ID takes the private ID as a parameter and computes, inside the kernel, the public ID that is being registered. As a result, the only way to register a particular public ID is to know the corresponding private ID.

The public ID is derived from the private ID by a trapdoor function, the same function that is used to protect capabilities. Therefore, even if a user knows the trapdoor function and the public ID of a server, the user cannot easily compute the private ID of the server.

Amoeba Fault Tolerance

The reason Amoeba does not have a hard connection between the server ID and the actual process that serves the requests is because a fault tolerant system must allow for server failure. The Demos system contained no such provisions!

In Amoeba, a fault tolerant service would typically be delivered by a cluster of server processes, each running on a different machine, with all processes monitoring each other. If these processes detect that a failure has occured, the survivors would conduct an election to determine which survivor registers to offer the service.

Each transaction between a client and the current public representative of such a fault tolerant server would typically be conducted as a distributed atomic transaction, with updates to parallel data structures in all of the member processes of the server cluster. As a result, after any failure, the elected representative of the survivors would have all of the data necessary to continue offering the service.

Amoeba Process Migration

Process migration under Amoeba rests on the same registration mechanism. If an Amoeba process wishes to migrate, it communicates with a process management server to launch a new copy of itself, then communicates with that new copy, installing all necessary state information in the new copy before telling the new copy to begin execution.

Immediately after starting the new copy, the old copy of the process terminates itself. Immediately after it is started, the new copy registers itself to receive all requests that would otherwise have gone to the old copy. This is possible because the old copy gave the new copy its private ID along with all the other state information it passed along.

Once the old copy terminates, any messages in its incoming queues may be simply deleted, leaving it to the fault tolerance mechanisms of the Amoeba remote procedure call model to recover from this. After migration, the message delivery cache entries in other machines referring to the old copy of the process are invalid, and this will be discovered when an attempt is made to use them. When a cache entry is found to be invalid, the sender falls back on a broadcast attempting to locate the server that was lost.

Note that, in the spirit of server-side authentication, this migration mechanism is entirely a matter of user-level code. Applications may be written that notice when they are running on overloade machines and automatically migrate away, but the kernel itself remains simple, offering only the necessary tools and doing none of the work.

Amoeba Server Discovery

Note that the Amoeba mechanisms do pose one problem: How does a user application discover the public ID of a server. There are two solutions to this problem. First, a process may register with a name server. In the case of Amoeba, the directory server is a high-level name server, associating textual names with capabilities. While these are usually capabilities for files or directories, a process wishing to register, for example, a random-number service, could enter a capability for itself under the name /servers/random (assuming, of course, that everyone had agreed to use the directory /servers to hold capabilities for the public request ports of various servers.

Another solution common on Amoeba is to simply publish the public ID of a server. Typically, these IDs are encoded as textual constants in the include files that are used to compile programs that use those servers, so if the server is ever moved to a new private ID, user code relying on that server can be simply recompiled to make use of the new public ID.

The risk of publishing the public ID of a server in an include file is that it prevents the ID from being changed more frequently than applications are recompiled. This means that the ID remains exposed to long-term attack by potential adversaries. In contrast, IDs that are registered in directory servers may be changed routinely, and need only remain valid for as long as the longest expected lifetime of a capability referencing a resource managed by that server.

Because Amoeba was never intended for critical applications, it was not designed to stand up to protracted attacks from outside, and therefore, the level of security described above was considered sufficient. The customer was, after all, the European Space Agency, with an interest in inexpensive supercomputers built from clusters of commodity machines for use in analysis of scientific data. If the customer had been the defense department of any large country, the design would verly likely have been quite different!

Amoeba Threads

Just by inspecting the Amoeba capability representation, it is easy to see that Amoeba servers are intended to be heavyweight entities. The capability format allows each server to support as many as 16 million distinct objects, and this requires a table of as many as 16 million private check codes, one per object. Thus, each Amoeba server process is expected to manage a huge numbers of objects on behalf of its clients. Huge servers imply heavyweight processes.

Amoeba is based on a blocking model of interprocess communication using remote procedure calls. Thus, the primary services offered to user processes are get_request -- await an RPC from a client, put_reply -- send a reply to a client, and trans -- send an RPC request and await a reply (a function that really ought to have been named call). Both get_request and trans are blocking primitives because all data is passed using synchronous non-buffered message passing.

In Amoeba, the thread that does a get_request must follow (after any number of trans operations) with a put_reply. It may not get other requests, and only the thread that got a request may put a reply. Reply transmission is always to a particular process on a particular machine, with none of the complexity of abstract resource IDs that characterized client to server communication. This efficiency is possible because Amoeba uses a custom protocol stack and not a general purpose protocol stack such as TCP/IP.

The use of blocking primitives limits the opportunities for parallelism in Amoeba; these primitives make it difficult for a client to perform other useful computing while awaiting a response from a server, and they make it difficult for a server to perform useful computation while awaiting a request from a client. If applications wish to do this, they must do it with threads.

Amoeba threads are implemented by the kernel. An Amoeba process is a heavyweight entity with a list of segment descriptors; an Amoeba thread is an attribute of a process with a stack pointer, program counter and register set. The threads of a program may interact using semaphores, binary semaphores (mutexes), and asynchronous interrupts.

If an Amoeba process needs to perform a nonblocking operation, it must spawn a thread that will be blocked by that operation while the other threads of the process continue working. Threads are sufficiently lightweight that this can be done at reasonable cost.