3. Using Unix Access Rights

Part of the 22C:169 Lecture Notes for Spring 2006
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Search Paths

In the Unix shells, when you type a command name, the shell tries to execute that command from several different directories. In the very first version of Bourne's shell, it may have been the case that you had to type in the full file name of the command to be executed. This was very quickly replaced by a shell that searched in /bin and in the user's current directory for the file name. As /bin filled up, a new binary directory, /usr/bin, was added to the standard Unix distribution, and many users started gathering their executable files into personal binary directories, for example, /Users/jones/bin.

In all current Unix shells, the shell environment variable PATH is used to control the set of directories the shell searches and the order of the search. If you type echo $PATH on a Unix system, you will see your search path, for example:

% echo $PATH
/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/opt/local/bin:/Users/jones/bin:.

This path says: first look up the command name in /bin, and if it is not there, look in /sbin, then /usr/bin, then /usr/sbin, and so on through the list. The very last item in the path is a dot, which means the current directory. If I wanted my commands to take precedence over system commands with the same name, I could set the path as follows:

.:/Users/jones/bin:/bin: ...

This new path means that, when I type in a command name, the shell should first look in my current working directory, then in my personal private directory of executables, then in the list of standard system directories.
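The effect of search order can be checked directly. What follows is a minimal sketch, assuming a POSIX shell and the widely available mktemp command. The directory names sys and mine and the function resolve are made up for this illustration, and the two pear files are tiny stand-ins, not real commands.

```shell
#!/bin/sh
# Two hypothetical directories, each holding a command named pear.
demo=$(mktemp -d)
mkdir "$demo/sys" "$demo/mine"
printf '#!/bin/sh\necho system pear\n' > "$demo/sys/pear"
printf '#!/bin/sh\necho my pear\n' > "$demo/mine/pear"
chmod +x "$demo/sys/pear" "$demo/mine/pear"

# resolve PATHLIST: print the file the shell would run for "pear"
# when PATHLIST is the search path. The subshell keeps the caller's
# own PATH untouched.
resolve() {
    ( PATH=$1; command -v pear )
}

resolve "$demo/sys:$demo/mine"    # system directory listed first
resolve "$demo/mine:$demo/sys"    # personal directory listed first
```

The first call prints the path ending in sys/pear and the second the path ending in mine/pear. The same name resolves to different files purely because of the order of the directories.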

There is something very natural about the idea of having the system search my directory before it searches a system directory. If I need a command named, for example, pear, I should be able to create that file and use it without having to worry about what's in /usr/bin. As it turns out, there is a file in /usr/bin on my computer called pear. I have no idea what it does, but when I typed the pear command it spat a long list of commands at me that mentioned something about PEAR packages. Wonderful!

The point is, pear is a useful command name for me, in my context, so why not let my context take priority over the context set by some system administrator who decided to install a package called pear that has nothing to do with what I am doing? This seems perfectly analogous to the situation in high-level programming languages, where local definitions take precedence over global definitions, so how could it be unsafe?

A Puzzle -- the danger in local paths

Many of the executables in /usr/bin are actually shell scripts. For example, on the Linux system I use, the commands allcm, allneeded, anytopnm, apr-config, and apropos, among others, begin with the "magic" text #!/bin/sh.

Suppose, among such files, I find a file that has the set user-id bit set and the owner is root, the super user. The newgrp and sudo commands have this bit set. The former changes the group ID of its caller, while the latter allows an authorized unprivileged user to execute privileged code after authenticating (by default with the user's own password, not the root password). For self protection, these two commands are public execute only, so I cannot tell if they are shell scripts. In contrast, most commands are public read-execute.
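One way to look for such files is the -perm test of the standard find command. Here is a minimal sketch, assuming a POSIX find and a file system that lets the owner set the bit. The scratch directory and the file names plain and privileged are invented so that the example does not depend on what any particular system has installed.

```shell
#!/bin/sh
scratch=$(mktemp -d)
: > "$scratch/plain"
: > "$scratch/privileged"
chmod 0755 "$scratch/plain"
chmod 4755 "$scratch/privileged"    # 4000 is the set user-id bit

# list_setuid DIR: print regular files under DIR with set user-id on
list_setuid() {
    find "$1" -type f -perm -4000
}

list_setuid "$scratch"
ls -l "$scratch/privileged" | cut -c1-10    # the owner's x shows as s
```

On a real system the same test would be pointed at /usr/bin, with -user root added to restrict it to files owned by the super user.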

Here is the risk that the author of a set user-id shell script with root as the owner must guard against. If the shell script executes a command, the shell will search the current path for that command. This path is passed to the shell as part of the user's execve command that launched the script. So, if the user passes a path with the user's directories in front of the system's directories, and if the user knows the name of some program executed by the script, then the user can put a program by that name in the user's directory so that the system will run the user's program with root privileges.
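The mechanism can be rehearsed harmlessly, with no set user-id bit involved, so nothing below gains any privilege. This is only a sketch under stated assumptions: the scripts report and report_safe, the directory evil, and the choice of uname as the command being impersonated are all hypothetical. Resetting PATH inside the script is shown as one plausible precaution, not as a complete defense.

```shell
#!/bin/sh
lab=$(mktemp -d)
mkdir "$lab/evil"

# A stand-in for a system script that calls a helper by bare name.
cat > "$lab/report" <<'EOF'
#!/bin/sh
uname
EOF

# The same script, except that it resets the search path first.
cat > "$lab/report_safe" <<'EOF'
#!/bin/sh
PATH=/bin:/usr/bin
export PATH
uname
EOF

# The impostor: a program named uname in a directory the user controls.
cat > "$lab/evil/uname" <<'EOF'
#!/bin/sh
echo "impostor ran"
EOF
chmod +x "$lab/report" "$lab/report_safe" "$lab/evil/uname"

# run_with_user_path SCRIPT: launch SCRIPT with the user's directory
# placed in front of the system directories in the inherited path
run_with_user_path() {
    ( PATH="$lab/evil:$PATH"; export PATH; "$1" )
}

run_with_user_path "$lab/report"         # runs the impostor
run_with_user_path "$lab/report_safe"    # runs the system's uname
```

The first script trusts whatever path its caller supplied and so runs the user's program. The second ignores the caller's path. Had the first been a set user-id root script, the impostor would have run with root privileges.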

To defend against this attack, most Unix versions erect multiple barriers:

Either of the first two defenses would be sufficient. Using both protects against careless system programmers making a mistake. The third defense would be totally unnecessary if either the first or second defense was properly carried out, but it makes failures less dangerous by making it harder to find system commands that have been improperly implemented.

Puzzle 2 -- students can't see what everyone else can

Access rights are reported by the ls -l command as follows:

-rwxr-xr-x    1 jones  staff  11632 23 Mar  2005 a.out
drwxr-xr-x    9 jones  staff    306  9 Jul  2002 amulet

The first letter indicates whether the file is a directory (d) or not (-), while the remaining letters are triples, rwx, indicating read, write and execute, with the first triple referring to the owner's rights, the second to the group rights, and the third to the public rights. Where the set user-id bit is set, the user's x bit is reported as s, while if the set group-id bit is set, the same thing is reported with the group x bit.
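The layout of the mode string can be made concrete with a small decoder. This is a minimal sketch in POSIX shell. The function name describe_mode and its output format are invented for illustration, and it handles only the cases described above (d or -, the three triples, and a lowercase s in the x position).

```shell
#!/bin/sh
# describe_mode MODE: split the ten-character mode string from ls -l
describe_mode() {
    mode=$1
    case $mode in
        d*) kind=directory ;;
        -*) kind=file ;;
        *)  kind=other ;;
    esac
    rest=${mode#?}            # drop the type character
    owner=${rest%??????}      # first triple
    group=${rest#???}         # drop the first triple ...
    group=${group%???}        # ... and the last, leaving the middle
    public=${rest#??????}     # last triple
    out="$kind owner=$owner group=$group public=$public"
    case $owner in *s) out="$out set-user-id" ;; esac
    case $group in *s) out="$out set-group-id" ;; esac
    echo "$out"
}

describe_mode -rwxr-xr-x
describe_mode drwxr-xr-x
describe_mode -rwsr-xr-x
```

For the first line of the listing above this prints "file owner=rwx group=r-x public=r-x". The third call adds "set-user-id" at the end.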

Suppose the access rights to a file are set as follows:

drwx---rwx    9 jones  students 306  4 Jul  2005 gradebook

Here, members of the group students have no access, while the owner and the public have unlimited access.

A naive user might imagine that the presence of all rights in the public field would give everyone public access. After all, everyone is a member of the public! In fact, Unix does not work this way. Instead, you get the exact rights granted by the first of your IDs to match the file ID. Owner rights take precedence over group rights, which take precedence over public rights.

As a result, any user in group students will have no access at all to the above file, while everyone else will have unlimited access to the file, assuming that they can reach it through the directory structure.
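The first-match rule can be written out as a few lines of shell. This is a sketch of the rule as described above, not of any kernel's actual code, and it ignores the super user, who bypasses these checks. The function name rights_for and the user names alice and bob are hypothetical.

```shell
#!/bin/sh
# rights_for MODE FILE_OWNER FILE_GROUP USER "USER_GROUPS"
# Print the one triple that applies to USER: the first match wins.
rights_for() {
    mode=$1
    fowner=$2
    fgroup=$3
    user=$4
    ugroups=$5
    rest=${mode#?}                    # drop the type character
    if [ "$user" = "$fowner" ]; then
        echo "${rest%??????}"         # owner triple
        return 0
    fi
    for g in $ugroups; do             # unquoted on purpose: split the list
        if [ "$g" = "$fgroup" ]; then
            mid=${rest#???}
            echo "${mid%???}"         # group triple
            return 0
        fi
    done
    echo "${rest#??????}"             # public triple
}

rights_for drwx---rwx jones students alice "students"
rights_for drwx---rwx jones students bob   "staff"
rights_for drwx---rwx jones students jones "students staff"
```

For the gradebook example, alice (in group students) gets "---", bob (not in that group) gets "rwx", and jones, the owner, gets "rwx" even though he also belongs to students.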

Puzzle 3 -- It's yours, but you can't read it

Consider the following directory structure under a Unix system:

/  drwxr-xr-x root
|
|---/Users  drwxr-xr-x root
    |
    |---/Users/yourdirectory  drwx------ you
        |
        |---/Users/yourdirectory/theirfile  -rwx------ they

Each file in the above directory tree is given with its full path name, its access rights, and its owner. For this example, group rights are not involved.

If you try to open /Users/yourdirectory/theirfile, you can get to it because you have the right to read your directory, but you have no access to the file itself because only they have any access. Actually, if you try to open it without requesting either read or write access, so that the only permitted operation on the open file is close(), you might be allowed to open it, but this grants you no useful access to the file.

If they try to open /Users/yourdirectory/theirfile, they cannot because they cannot traverse the directory it is in. The file still counts against their disk quota, since they are the file owner, but they have no access to it and cannot even delete the link to it.
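The two outcomes can be traced with a small model of path traversal. This is a simplified sketch: it considers only owner and public rights (as in the example, groups are left out), it ignores the super user, and the function names triple_for and can_read are invented here.

```shell
#!/bin/sh
# triple_for MODE OWNER USER: the owner triple if USER owns the object,
# otherwise the public triple
triple_for() {
    rest=${1#?}
    if [ "$2" = "$3" ]; then
        echo "${rest%??????}"
    else
        echo "${rest#??????}"
    fi
}

# can_read USER ENTRY...: each ENTRY is "MODE OWNER", given in order
# from the root directory down to the file itself
can_read() {
    user=$1
    shift
    while [ "$#" -gt 0 ]; do
        mode=${1%% *}
        owner=${1##* }
        t=$(triple_for "$mode" "$owner" "$user")
        if [ "$#" -gt 1 ]; then
            # a directory along the way: it must grant x (search)
            case $t in
                ??x) ;;
                *) echo "stopped at a directory owned by $owner"; return 0 ;;
            esac
        else
            # the file itself: it must grant r
            case $t in
                r??) echo "readable" ;;
                *) echo "reached, but not readable" ;;
            esac
        fi
        shift
    done
}

can_read you  "drwxr-xr-x root" "drwxr-xr-x root" "drwx------ you" "-rwx------ they"
can_read they "drwxr-xr-x root" "drwxr-xr-x root" "drwx------ you" "-rwx------ they"
```

The first call reports that you reach the file but cannot read it. The second reports that they are stopped at the directory you own, before the file's own rights are ever consulted.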

Puzzle 4 -- How to create the previous puzzle

How did the situation above get created? The key lies in the ln shell command which, in turn, uses the link kernel call. Consider the following shell command:

	ln a b

This is equivalent to the following kernel call:

	link( "a", "b" );

What either of these does is create a new directory entry, b, that refers to the exact same object that was referenced by the directory entry a. The file is not duplicated. Rather, a new directory entry is created that references the exact same file.

This is possible because, under the Unix file system, files (with their associated ownership, group membership and access rights) are actually known only by a numerical file number, also known as the i-node number of the file. Textual directory entries simply associate names with file numbers, so it is quite possible to have one file known by two completely different names. This also means that directory entries can be created and destroyed freely, at low cost, while the large heavy file sits unchanged, with no expensive disk operations involved in link manipulation.
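That two names share one file number can be seen with the -i option of ls. Here is a minimal sketch, assuming a scratch directory on a file system that supports hard links. The helper inode_of is a name made up for this example.

```shell
#!/bin/sh
work=$(mktemp -d)
echo "some contents" > "$work/a"
ln "$work/a" "$work/b"          # same effect as link("a", "b")

# inode_of NAME: print the file number that a directory entry refers to
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

inode_of "$work/a"
inode_of "$work/b"              # same number as for a

rm "$work/a"                    # removes one name, not the file
cat "$work/b"                   # the contents are still there
```

Both calls to inode_of print the same number, and the file remains readable under the name b after the name a is removed.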

Consider this initial directory tree:

/  drwxr-xr-x root
|
|---/Users  drwxr-xr-x root
|   |
|   |---/Users/yourdirectory  drwx------ you
|   |
|   |---/Users/theirdirectory drwx------ they
|       |
|       |---/Users/theirdirectory/theirfile  -rwx------ they
|
|---/tmp  drwxrwxrwx root

Here, they have created theirfile in their own directory, so they have complete access to their file, and naturally, they can change the access rights any way they want and create any content they want. Also note the directory /tmp; this directory is open to the world. Anyone can create or delete links here. This is the common situation on Unix systems, where /tmp exists as the standard place to put temporary files. Usually, the system administrator promises to delete (usually automatically) all files over some age that are found there.

Now, if they execute the following commands:

	ln /Users/theirdirectory/theirfile /tmp/thefile
	rm /Users/theirdirectory/theirfile

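The effect is to create a new link to their file and then cut the old link, so that their file is now known only as /tmp/thefile. If you then link /tmp/thefile into your own directory in the same way and remove the link in /tmp, the result is this directory tree: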

/  drwxr-xr-x root
|
|---/Users  drwxr-xr-x root
|   |
|   |---/Users/yourdirectory  drwx------ you
|   |   |
|   |   |---/Users/yourdirectory/theirfile  -rwx------ they
|   |
|   |---/Users/theirdirectory drwx------ they
|
|---/tmp  drwxrwxrwx root

At this point, we have created exactly the situation we set out to create. The file has been moved between users, with the access rights set up so that nobody has access to the file itself. You, having this file, could pass it onward to someone else, who could pass it, eventually, back to the file's owner. You have custody, and you might act as some kind of escrow agent, holding it while unable to touch it, so that they, when they get it back, have a guarantee that nobody was able to change it.
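Moving a file by linking and unlinking can be rehearsed in a scratch directory. In this sketch a single user performs both hops, so it shows only that the file and its access rights travel unchanged from name to name, not the cross-user access checks. The directory names mirror the example, and the helpers move_by_link and mode_of are hypothetical.

```shell
#!/bin/sh
# Stand-ins for the three directories in the example.
root=$(mktemp -d)
mkdir "$root/theirdirectory" "$root/tmp" "$root/yourdirectory"
echo "contents" > "$root/theirdirectory/theirfile"
chmod 700 "$root/theirdirectory/theirfile"

# move_by_link OLD NEW: add the new name, then remove the old one
move_by_link() {
    ln "$1" "$2" && rm "$1"
}

# mode_of FILE: the ten-character mode string that ls -l reports
mode_of() {
    ls -ld "$1" | cut -c1-10
}

move_by_link "$root/theirdirectory/theirfile" "$root/tmp/thefile"
move_by_link "$root/tmp/thefile" "$root/yourdirectory/theirfile"
mode_of "$root/yourdirectory/theirfile"
```

The final line prints -rwx------, the same rights the file had before it moved, because the rights belong to the file and not to any of its names.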

A better way to pass files around

The above example shows the use of /tmp, as has traditionally been possible on Unix systems. In fact, this is a bad idea. In attempting to pass theirfile to the other user, it was put in a public place where anyone could link to it or interfere with its passage. What we really want is a private way to convey files from one user to another. Here is what we probably ought to have done, using an approach that is fully within what the Unix file system allows:

/  drwxr-xr-x root
|
|---/Users  drwxr-xr-x root
    |
    |---/Users/yourdirectory  drwx--x--x you
    |   |
    |   |---/Users/yourdirectory/public  drwxr-xr-x you
    |   |
    |   |---/Users/yourdirectory/dropbox  drwx-wx-wx you
    |
    |---/Users/theirdirectory drwx--x--x they
        |
        |---/Users/theirdirectory/public  drwxr-xr-x they
        |
        |---/Users/theirdirectory/dropbox  drwx-wx-wx they

Here, each user has two directories, one that is world readable, so that files put there by the user are available to the public, and one which is world writable so that the public may place files there that are intended for delivery to the user. The user's directory allows those in the world to traverse the path to these two public directories, but the public cannot find out the names of the user's files.
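Setting up one user's half of this structure takes only mkdir and chmod. The following is a sketch in which a scratch directory stands in for /Users/yourdirectory. The octal modes are simply the numeric forms of the rights shown in the tree.

```shell
#!/bin/sh
# A scratch directory stands in for /Users/yourdirectory.
home=$(mktemp -d)
mkdir "$home/public" "$home/dropbox"

chmod 711 "$home"            # drwx--x--x: others may pass through, not list
chmod 755 "$home/public"     # drwxr-xr-x: others may list and read
chmod 733 "$home/dropbox"    # drwx-wx-wx: others may add files, not list

ls -ld "$home" "$home/public" "$home/dropbox"
```

The listing should show the three mode strings from the tree above. Write permission on dropbox without read permission is what lets others deliver files without seeing what has already been delivered.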

Of course, an exhaustive search of all possible file names could locate the files in the user directory, but this is a massive project. Clearly, users in the world described above should avoid short file names at the top of their directory structure, but even if outsiders find these files, they will be of little use unless the user also sets the access rights to allow public access.

It would have been reasonable to use a directory structure such as is suggested above to implement the electronic mail system of Unix, but unfortunately, this was invented without benefit of such structures, so by tradition, all Unix e-mail is stored in files within /usr/spool and is entirely outside of the control of the individual user.

Of course, given that Unix is over 30 years old, it should be no surprise that introducing new conventions is almost impossible. The weight of established practice makes it very hard to abandon old directories such as /tmp because so many applications expect these directories to be there.