22C:116, Lecture Notes, Lecture 3, Fall 2000

Douglas W. Jones
University of Iowa Department of Computer Science

Resource Management

One of the classical ways of studying operating systems is to look at the system as a resource manager. Traditionally, each resource is viewed as a scarce resource with many competing users, and the system is viewed primarily as a resource allocation and control system that arbitrates between competing users. This view is old, but it remains relevant in this era of gigabyte main memories and gigaflop CPUs because the resource demands of our applicatons have grown to keep pace with the resources we have available.

Memory resources may be viewed in many ways, depending both on what objects are clients of the memory manager and on how the manager allocates memory.
Memory may be allocated in pages or segments. These terms are usually defined with reference to memory management unit hardware, but when only software is involved, the same principles apply, and the same terms may be used.
- Pages are fixed size blocks of memory at block-aligned addresses.
```
 A page descriptor  ______________________________
                   |___________________|__________|
                   |  address of page  |0000000000|
```
  The descriptor of a page is a single memory address, with the least significant n bits of the address all zero, where the page size is 2ⁿ addressable units of memory. The addreable unit of memory may be the byte or word, depending on other aspects of the architecture.
- Segments are variable size contiguous blocks of memory, sometimes made up of contiguous ranges of pages, and sometimes with no internal page structure.
```
 A segment descriptor  ___________________
                base  |___________________| 
              length  |___________________|
```
  The descriptor for a segment gives the address of the first word in the segment and the length of that segment in some multiple of the addressable unit of memory. While some memory management units allow segment lengths to differ by as little as one addressable unit of memory, other machines constrain the segment length to be some larger multiple. For example, a machine might have byte addressible memory but require that segments begin and end on a word boundary.
- Segments may be allocated within pages, or segments may consist of multiple pages. In either case, this implies a hierarchy of memory managers (a page-based memory manager allocating from within segments, or a segment-based memory manager composing segments from pages).
The clients of a memory manager may be user programs or other parts of the operating system. Allocation to user programs need not use the same mechanism as allocation to other parts of the system, although a proliferation of mechanisms clearly adds complexity to the system.
For example, under UNIX, the kernel uses a segmented memory manager to allocate space for small kernel data structures, while using paged management to allocate space to user programs. In turn, most UNIX application programming environments use a program called a heap manager to allocate segments of memory from within their pages to user data structures.
The use of paged or segmented memory allocation does not necessarily rest on the use of memory management hardware. For example, old versions of MacOS for the MC68000 processors in the original Apple Macintosh computers used pages of 64K words each, with each open application being allocated, at minimum, one stack page and one code page. The heap manager for most versions of Pascal and C allocates segments from the heap. The heap itself is frequently implemented as a contiguous range of pages in the user's virtual address space, and this range, in turn, is frequently managed as a segment.
While paged and segmented memory management schemes were originally proposed as solutions to the scarcity of main memory, they remain very important today for a different reason: While having a gigabyte of main memory may eliminate scarcity, it does not eliminate protection and addressing problems. With virtual addressing hardware, each process can be given a separate virtual address space so that information sharing is strictly controlled and errors in one process are unlikely to corrupt memory required by another.
Sharing CPU time is essential on any system with more processes than processors. It is common to talk about the contending processes as if each belongs to a different end user of the system. While multiprocessors have been in use since the 1960's, the number of logical processes is almost always larger than the number of CPUs, even on the few systems that have been built with thousands of CPUs.
Since 1966, with the Berkeley Timesharing System, many systems have allowed users to run more than one process in parallel, so not only do users compete with each other for CPU time, but the processes of one user also compete with each other.
Aside: Surveys done of UNIX users done in the 1970's show that a single user at a remote terminal typically had 4 processes, one of which is interactive, one of which was waiting for the former process to terminate, and two of which may have been waiting or running in the background.
On window-oriented systems, it is common to see one or more processes behind each window on the display screen.
Input-output time must typically be allocated between contending requests for input-output transfers on all but the smallest systems.
Operating systems typically allocate I/O to processes competing for the limited bandwidth of the available channels. This may involve code to schedule I/O transactions that is as complex as the code used to schedule CPU time, or on low performance personal computers, the system may ignore this issue.
Secondary memory is the single most important resource managed by most operating systems, at least when seen from the user perspective. In the extreme, some users may view the operating system as if it was nothing more than a command line interface or a point-and-click program launcher plus a file system.
From the user perspective, the central function of a file system, as opposed to the I/O subsystem, is the allocation of space on disk for competing files. Allocation of disk space usually involves algorithms quite different from those used for main memory, both because the user model of disk files is not the same as the user model of main memory, and because secondary storage devices are typically block structured with a complex relationship between block addresses and I/O latency times.
Security and reliability are frequently seen differently by the system and by users.
Protection mechanisms are necessary for both security or reliability, but many small computer systems do without, even though the technology needed to build secure systems has been well known since the mid 1960's.
Security demands that system resources be protected malicious attack.
Reliability demands that system resources be protected accidental damage.
Aside: As a general rule, anything which a malicious user might intentionally do to damage a system might also be done by a careless user. Traditionally, security has been concerned with protection against malicious users, while carelessness was viewed as the concern of fault tolerance. The recognition that malice and carelessness frequently produce the same result has led to the recognition of the relationship between fault tolerance and secure systems.
Resource Protection is generally demanded by users when they perceive a security threat.
Typically, users demand that segments of main memory, files on secondary memory, and terminals allocated to them all be protected from unauthorized use or inspection by other users. There are uniform access control and protection mechanisms that can control access to all of these on the same basis, but on most modern systems, each class of resource is protected using ad-hoc mechanisms that are generally different form those used to protect other resources.
Resource sharing becomes an issue when there are multiple users or multiple processes that must share access to some resources while other resources are private and protected.
Files, windows or memory segments are all appropriate objects that might be shared or might be protected. Sharing is only trivial when there is no protection, and protection is only trivial when there is no sharing.
Early timesharing systems viewed resource sharing in terms of allocating free resources to one or another user, exclusively. If this is done, the system is effectively creating one isolated personal computer for each user, so there is essentially no sharing and the result is of only limited utility.
With the advent of systems such as Multics and the Berkeley Timesharing System (around 1966), the focus shifted to the management of information that was actively shared between users. Thus, each user might have some memory regions that are entirely private, and others that are shared with some, but not all, other users. Management of this shared data, whether on disk or in main memory, poses a significant challenge.
This challenge was largely ignored for much of a decade between 1975 and 1985 as personal computers displaced timesharing systems from growing numbers of applications, but it has re-emerged with a vengence with the advent of computer networks.
Reliability, or absence of failures, is a natural goal of the system designer. Protection mechanisms play a strong part in assuring the reliability of systems by preventing failures such as programming errors in one part of a system from having unlimited effects on other parts of the system.
On early systems, it was generally assumed that any failure in one part of a system would lead to the failure of the entire system, but as technology has advanced, we have learned to build systems where failures lead to incremental degradation of performance.
Early examples of this include memory managers which detect memory errors and remove the bad memory addresses from the pool of memory available for allocation to various system applications. This is now routinely done both with disk systems and with main memory.
More modern examples include multiprocessors where, when a processor fails, the remaining processors continue to operate.
Aside 1: With centralized systems, the assumption that system failure was an all-or-nothing thing was natural. In fact, this was never true, but partial failures of a centralized system were rare enough that they are the subject of folklore -- for example, the timesharing system where the floating point unit failed. The operating system didn't use floating point, so none of the system managers knew anything was wrong until users started complaining about strange results.
Aside 2: With large-scale distributed systems, partial system failures are the norm! Many networks today are large enough that the normal state of the system is that one or more machines are down at any instant.
exercise: Consider the internet, with around one million machines. Assuming each machine has a mean-time between failures of one month, what is the rate of failures, per unit time?

Process Management

The traditional picture of the states of a process, as found in many elementary operating systems textbooks, is as follows:

            Dead _
            ^|  |\
            ||    \
        kill||     \Terminate
            ||      \
            ||       \
        <----|------- \
Waiting     ||  Wait   Running ---
       \    ||        _           |
        \   ||Create  /|          |
         \  ||       /            |
    Signal\  |      / Schedule    |
           \ v     /              | Preempt
            _|    /               | Relinquish
             Ready <--------------

Process States
- Running
  Exactly one process is in the running state on each CPU in the system, never more, and only less if it is possible to halt unneeded CPUs.
- Ready
  There is a queue of processes ready to use the CPU. This may be a FIFO queue or a priority queue
If no process in the system needs the CPU, it is usual to leave a running process on the CPU! For this reason, operating systems usually include some kind of background process to soak up the otherwise unused CPU cycles. Sometimes, this is referred to as the background process (as opposed to other background processes) or the idle process.
On early UNIX systems, one well known use for the background process was for computing more digits of pi. This is silly, and there are useful housekeeping jobs the background process can do. For example, it can run CPU and memory diagnostics, or it can manage garbage collection.
The ready queue may have any of a number of interesting data structures, but it always has two basic operations, enqueue a process and dequeue a process. A FIFO queue gives a system where all processes have equal priority. A priority queue allows high priority processes to have more access to the CPU.
Fixed priority processes are used in many real-time applications. With UNIX, an interesting variable priority scheme was introduced, in which the priority of a process rises when it voluntarily relinquishes the CPU, for example, to wait for I/O completion, and the priority of a process falls when it is preempted at the end of a time-slice. This favors interactive users over compute bound processes.
Preemption
A running process is preempted when it is forced into the ready state in order to give the CPU to some other process. This might be done, for example, by the real-time-clock interrupt service routine at the end of a time-slice in a timesharing system. This is illustrated in the following pseudocode:
```
Clock interrupt service routine:
   begin
      Disable Interrupts (if not already disabled)

      -- Save process state of interrupted process
      current.pc := interrupt-return.pc
      current.registers := interrupt-return.registers

      -- Change interrupted process state to ready
      Enqueue( current, ready queue )

      -- Make new process into the current process
      current := Dequeue( ready queue )

      -- Restore process state of new process
      interrupt-return.pc := current.pc
      interrupt-return.registers := current.registers

      Enable Interrupts (if necessary)
      return
   end
```
Note that, in addition to involuntary preemption, many systems include provisions for a running process to voluntarily move itself from running to ready. This is sometimes called voluntarily relinquishing the CPU, and the code for this is essentially the same as that given above for preemption, except for differences in the calling sequence.
In old systems, clock interrupts were periodic, for example, every 1/60 of a second. Modern hardware usually has a programmable timer, so it is possible for the scheduler to vary the length of the time-slice (interval between interrupts) depending on the behavior of the process being scheduled.
The Berkeley Timesharing System did this by the late 1960's, giving long time slices to jobs that appeared compute bound, in an effort to minimize context switching, while giving short timeslices to jobs that appeared I/O bound.
Interrupts must be disabled during the interrupt service routine so that some other interrupt doesn't cause scheduling activity when the ready queue is halfway through being updated or when the process state is only partially saved or restored.
Process States
- Waiting
  There may be many events for which processes may wait. Any number of processes may be waiting for each.
  Waiting processes are typically enqueued on a semaphore.
  A semaphore has two parts: A queue of waiting processes and An integer count.
  One semaphore is needed for each condition on which some process might wait.
Semaphores were invented by E.W.Dijkstra in the late 1960's and they were only widely accepted in the early 1970's.
Prior to this, most operating systems had more ad-hoc and frequently incorrect implementations of the waiting state.
UNIX, which was developed in the late 1960's, does not entirely reflect the post-Dijkstra view of processes. With the exception of some later versions (for example, System V UNIX), UNIX does not provide user-level semaphore services, so UNIX programmers must avoid any such services that are provided if they wish to write portable programs.

Semaphore Operations

The name P is Dijkstra's original name for the wait operation; it comes from the Dutch word which is the cognate of Pause in English. Here is some pseudocode that implements this on a uniprocessor:

Wait( sem ) or P( sem ):
   begin
      Disable Interrupts
      if sem.counter > 0
	 sem.counter := sem.counter - 1
      else
         -- Save process state
         current.pc := proc-return.pc
         current.regs := proc-return.regs

         -- Change state to waiting
	 enqueue( current, sem.queue )

         -- Make new process current process
         current := Dequeue( ready queue )

         -- Restore process state
         proc-return.pc := current.pc
         proc-return.regs := current.regs
      endif
      Enable Interrupts
      return
   end

Signal( sem ) or V( sem ):
   begin
      Disable Interrupts
      if empty( sem.queue )
         sem.count := sem.count + 1
      else
         temp := dequeue( sem.queue );
         enqueue( temp, ready queue )
      endif
      Enable Interrupts
      return
   end

The above implementations are correct only if all processes have equal priority. If some have a higher priority, and if a low priority process issues a signal on a semaphore for which a high priority process is waiting, then the above code lets the low priority process continue running while the high priority process waits in the ready queue. This is wrong.