1. Overview of the Hawk Computer

Part of the Hawk Manual
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Contents

1.1. Summary
1.2. Memory Resources
1.3. CPU Resources
1.3.1. General Purpose Registers
1.3.2. Program Counter
1.3.3. Coprocessor Status Register
1.3.4. Processor Control Registers
1.3.4.1. Processor Status Word and Condition Codes
1.3.4.2. Trap Program Counter
1.3.4.3. Trap Memory Address
1.3.4.4. Trap Save Register
1.3.4.5. Cycle Counter Register
1.3.4.6. Memory-Management Unit Data Register
1.3.5. Memory-Management Unit
1.4. The Instruction Execution Cycle
1.5. Other Notational Conventions


1.1. Summary

The Hawk computer is a fictional machine that incorporates many features of modern RISC processors without slavish adherence to any particular real machine. The Hawk instruction set is based on a 32-bit word, typical for modern machines, and includes 15 general registers, a modest number by modern standards. All instructions are single-cycle, in the RISC tradition of machines like the IBM/Apple/Motorola Power architecture, the DEC Alpha architecture, the SGI MIPS architecture and the HP PA architecture. Unlike these machines, some of the Hawk instructions are variable-length, as with the older Intel 80x86/Pentium and Motorola 680x0 architectures.

With all computers, it is conventional to number the bits and other fields within a word. The numbering is entirely arbitrary; for the Hawk, we will arrange things as shown below:

Word Format
bits 31 to 24: byte 3 (part of halfword 1)
bits 23 to 16: byte 2 (part of halfword 1)
bits 15 to 8: byte 1 (part of halfword 0)
bits 7 to 0: byte 0 (part of halfword 0)

This numbering conforms to the numbering used in DEC's 16, 32 and 64 bit computers, a numbering scheme that was later used by Intel. IBM mainframes and Motorola's microcomputers number bits and fields in the opposite order.

The Hawk word is 32 bits, while the 16-bit quantity is called a halfword. This usage is exactly the same as has been used on IBM mainframes since 1965, but it differs from early Intel, Motorola and DEC usage, all of which used 16-bit words in the early 1970's and continued to call 16 bits a word on their 32-bit machines. As with bit numbering, the particular unit called a word or a byte is arbitrary.
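As a minimal sketch of the numbering above, the following Python fragment extracts numbered bytes and halfwords from a 32-bit word. The function names byte_of and halfword_of are illustrative only and are not part of the Hawk manual.

```python
def byte_of(word, n):
    """Return byte n (0 to 3) of a 32-bit word; byte 0 is bits 7 to 0."""
    return (word >> (8 * n)) & 0xFF


def halfword_of(word, n):
    """Return halfword n (0 or 1) of a 32-bit word; halfword 0 is bits 15 to 0."""
    return (word >> (16 * n)) & 0xFFFF


w = 0x12345678
print([hex(byte_of(w, n)) for n in range(4)])
print([hex(halfword_of(w, n)) for n in range(2)])
```

With this numbering, byte 0 and halfword 0 are always the least significant parts of the word.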
 

1.2. Memory Resources

A Hawk memory address is a 32-bit quantity (one word) with the following layout:

Memory Address Format
bits 31 to 12: page
bits 11 to 2: word in page
bits 1 to 0: byte

The high 30 bits of the address specify one word in memory, while the low two bits may be used to specify a byte in that word. Thus, the Hawk memory consists of up to 2^30 words or up to 2^32 bytes, 4 gigabytes. These figures are typical of computers designed since the 1970's, but as with most computers, it is unlikely that all memory addresses will refer to real memory.

For some purposes, the word field of a memory address may be broken into page and word-in-page fields. Each page holds 2^10 words, and the entire address space contains 2^20 pages.
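The field boundaries follow from the sizes just given: 20 bits of page number, 10 bits of word in page and 2 bits of byte. A small sketch, using the hypothetical helper names split_address and join_address:

```python
def split_address(addr):
    """Split a 32-bit address into (page, word_in_page, byte) fields."""
    addr &= 0xFFFFFFFF
    page = addr >> 12                  # bits 31 to 12
    word_in_page = (addr >> 2) & 0x3FF  # bits 11 to 2
    byte = addr & 0x3                  # bits 1 to 0
    return page, word_in_page, byte


def join_address(page, word_in_page, byte):
    """Rebuild a 32-bit address from its three fields."""
    return ((page & 0xFFFFF) << 12) | ((word_in_page & 0x3FF) << 2) | (byte & 0x3)


print(split_address(0x00001007))
```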

In the following, the Hawk memory will be viewed as an array of words, with addresses ranging from 0 to 2^32 - 1, with groups of 4 consecutive addresses all referencing the same word in memory. Thus, addresses 0 through 3 all refer to the first word in memory, addresses 4 through 7 all refer to the second word, and so on up to addresses 2^32 - 4 through 2^32 - 1, which all refer to the final word. To simplify discussion, memory addresses will frequently be described as records with a 30 bit word field and a 2 bit byte field.

The Hawk memory may also be viewed as an array of bytes, with addresses ranging from 0 to 2^32 - 1, so that address 0 refers to the first byte, which is stored in the first word, and address 1 refers to the second byte, which is also stored in the first word. Finally, the memory may be viewed as an array of halfwords, so that addresses 0 and 1 refer to the first halfword, while addresses 2 and 3 refer to the second halfword.
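The three views can be sketched over a single array of words. This is an illustration only; it assumes that the byte field of an address selects the byte with the same number in the word format of Section 1.1 (byte 0 in bits 7 to 0), and the names memory, load_word, load_halfword and load_byte are hypothetical.

```python
# Two words of example memory, indexed by the word field of the address.
memory = [0x44332211, 0x88776655]


def load_word(addr):
    """All 4 addresses within a word refer to the same word."""
    return memory[addr >> 2]


def load_halfword(addr):
    """Addresses 0 and 1 select halfword 0; addresses 2 and 3 select halfword 1."""
    word = memory[addr >> 2]
    return (word >> (16 * ((addr >> 1) & 1))) & 0xFFFF


def load_byte(addr):
    """The 2-bit byte field selects one of the 4 bytes in the word."""
    word = memory[addr >> 2]
    return (word >> (8 * (addr & 3))) & 0xFF


print(hex(load_word(3)), hex(load_halfword(2)), hex(load_byte(1)))
```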

This view of memory as either an array of bytes, halfwords or words dates back to the IBM System 360/370 family of mainframes, introduced in 1965, and this view is shared by all but the very smallest of modern computers.

The Hawk instruction set does not directly support non-aligned memory references, but software can be written to store a word in bytes 3 through 6, for example, or to load a halfword from bytes 3 and 4.
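One way such software could work is to assemble the non-aligned value from individual bytes. The sketch below is not taken from the Hawk manual; it assumes that byte n of a word occupies bits 8n+7 to 8n, and all names in it are illustrative.

```python
# Example memory as an array of words; the third word is padding.
memory = [0x44332211, 0x88776655, 0x00000000]


def load_byte(addr):
    """Aligned building block: fetch one byte from its containing word."""
    return (memory[addr >> 2] >> (8 * (addr & 3))) & 0xFF


def load_unaligned_halfword(addr):
    """Combine 2 consecutive bytes, least significant byte first."""
    return load_byte(addr) | (load_byte(addr + 1) << 8)


def load_unaligned_word(addr):
    """Combine 4 consecutive bytes, least significant byte first."""
    value = 0
    for i in range(4):
        value |= load_byte(addr + i) << (8 * i)
    return value


print(hex(load_unaligned_halfword(3)))  # halfword from bytes 3 and 4
print(hex(load_unaligned_word(3)))      # word from bytes 3 through 6
```

A real implementation would more likely combine two aligned word loads with shifts, but the byte-at-a-time form makes the idea easier to see.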

1.3. CPU Resources

The Hawk Central Processing Unit contains 18 registers visible to application programs and 5 additional registers visible only to privileged code. Fifteen of these registers are general-purpose registers (see Section 1.3.1), each holding one word; these are used for variables and addresses during program execution. The program counter (see Section 1.3.2) holds the address of the next instruction to be fetched by the CPU. The condition code field of the processor status word (see Section 1.3.4.1) holds flags that allow the CPU to test results of arithmetic operations, and the coprocessor status register (see Section 1.3.3) controls interaction with any coprocessors (see Section 11.4) that may be present.

1.3.1. General Purpose Registers

The Hawk has 15 general-purpose registers, numbered 1 to 15; these may be viewed as an array of 15 words. Reference to general purpose register 0 is allowed, but the meaning of this special case depends on the context. This register model comes from the IBM 360/370 family, dating back to 1965; many other computers, including the highly influential DEC VAX, share similar ideas. Depending on the context, references to register 0 either force a result to be discarded, refer to the program counter (see Section 1.3.2) or refer to the constant zero.

The assembly language header file hawk.h should define the names R0 to R15 for access to these registers. Formally, they will be referenced as r[0] through r[15].

1.3.2. Program Counter

The CPU operates by fetching and executing successive instructions from memory. The program counter holds the address of the next instruction in memory awaiting execution. All Hawk instructions consist of 16-bit halfwords (see Section 1.1), so the program counter holds the address of a halfword in memory; as such, the least significant bit of this 32-bit register should always be 0. Thirty bits of the program counter specify the word containing the next instruction, while 1 bit selects the halfword to use. In formal descriptions, the program counter is referenced as pc, with fields pc.word (the top 30 bits) and pc.byte (the bottom 2 bits).

bits 31 to 2: word
bits 1 to 0: byte

The assembly language header file hawk.h should define the name PC as a synonym for R0 because many instructions use R0 to refer to the program counter.

An attempt to set the least significant bit of pc to 1 will cause a bus trap (see Chapter 13) with the cause code in the trap memory address register (see Section 1.3.4.3) set to 01.

1.3.3. Coprocessor Status Register

Some systems may have up to seven coprocessors. Interface with these is through two instructions, COGET and COSET (see Section 11.4), which transfer data between the CPU and any of up to 15 registers in each coprocessor. If no coprocessors are present, the coprocessor interface instructions may be unimplemented. Coprocessor register zero (COSTAT) is shared by all coprocessors and may be thought of as part of the CPU. This register indicates which, if any, of the coprocessors is currently active and it sets options for the active coprocessor. Typical coprocessors might support floating-point arithmetic (see Chapter 15) or cryptographic operations.

bits 31 to 16: unused
bits 15 to 12: coop
bit 11: unused
bits 10 to 8: select
bits 7 to 1: enable
bit 0: unused

Bits 1 to 7 are coprocessor enable bits, one per coprocessor. Turning one on enables the associated coprocessor; turning it off disables it, possibly powering it down. If a coprocessor is not present, the corresponding bit can still be turned on or off.

Bits 8 to 10 select a particular coprocessor. Only the selected coprocessor may be manipulated by COGET and COSET (see Section 11.4). A coprocessor that is present and enabled but not selected is sleeping; its registers retain their values in this state. Setting select to zero forces all enabled coprocessors to sleep. If COGET or COSET is used to access a nonexistent or disabled coprocessor, there will be a coprocessor trap. This allows software virtualization of missing coprocessors.

Bits 12 to 15 (coop) set the operating mode of the selected coprocessor (if it has modes).
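The field positions given above can be checked with a short sketch. The helper names costat_fields, coprocessor_enabled and selected_and_enabled are hypothetical; whether a coprocessor is actually present cannot be determined from COSTAT alone, so that part of the trap condition is not modeled.

```python
def costat_fields(costat):
    """Return the (enable, select, coop) fields of a COSTAT value."""
    enable = (costat >> 1) & 0x7F   # bits 1 to 7, one bit per coprocessor
    select = (costat >> 8) & 0x7    # bits 8 to 10
    coop = (costat >> 12) & 0xF     # bits 12 to 15
    return enable, select, coop


def coprocessor_enabled(costat, n):
    """True if the enable bit for coprocessor n (1 to 7) is on."""
    return bool((costat >> n) & 1)


def selected_and_enabled(costat):
    """True if some coprocessor is selected and its enable bit is on."""
    _, select, _ = costat_fields(costat)
    return select != 0 and coprocessor_enabled(costat, select)


# coop = 3, select = 2, enable bit for coprocessor 2 turned on
costat = (3 << 12) | (2 << 8) | (1 << 2)
print(costat_fields(costat), selected_and_enabled(costat))
```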

1.3.4. Processor Control Registers

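The Hawk CPU includes 6 special registers: the processor status word (PSW), the trap program counter (TPC), the trap memory address register (TMA), the trap save register (TSV), the cycle counter (CYC) and the memory-management unit data register (MMUDATA).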

The bottom four bits of the processor status word matter to all programmers. These condition codes report results of selected instructions. The other special registers will rarely matter to most programmers.

1.3.4.1. Processor Status Word and Condition Codes

Processor status word fields, from bit 31 (most significant) down to bit 0:
level, prior: priority & privilege
unused
bcd carry: extended arithmetic
unused
N Z V C: condition codes (bits 3 to 0)

The condition code field is the only part of the processor status word most programmers use. These 4 bits summarize the results from selected instructions for later testing. The IBM System 360 from 1965 had condition codes; the form used here comes from the DEC PDP-11 from 1970. The condition codes are typically set as follows:

In formal descriptions, the processor status word will be called psw, and the individual named bits will be referenced by their names. In assembly language, the name PSW should be used.

The bcd-carry field is set by add instructions to simplify binary-coded-decimal arithmetic (see Section 11.2).

The level and prior (old level) fields support trap and interrupt service (see Chapter 13) and control whether the memory-management unit (see Section 1.3.5) is active.
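The following sketch shows how condition codes of this kind might be computed for a 32-bit add. It rests on assumptions not stated in this section: that N, Z, V and C occupy bits 3, 2, 1 and 0 in the order shown in the layout, and that they carry their usual PDP-11 meanings (negative, zero, signed overflow, carry out). The function name add_with_codes is illustrative.

```python
MASK32 = 0xFFFFFFFF
# Assumed bit positions within the bottom four bits of psw.
N, Z, V, C = 0x8, 0x4, 0x2, 0x1


def add_with_codes(a, b):
    """Add two 32-bit values; return (result, condition codes)."""
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    codes = 0
    if result >> 31:
        codes |= N          # sign bit of the result is set
    if result == 0:
        codes |= Z          # result is zero
    # Signed overflow: operands agree in sign, result disagrees.
    if ((~(a ^ b) & (a ^ result)) >> 31) & 1:
        codes |= V
    if total >> 32:
        codes |= C          # carry out of bit 31
    return result, codes


print(add_with_codes(0x7FFFFFFF, 1))
```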

1.3.4.2. Trap Program Counter

bits 31 to 2: word
bits 1 to 0: byte

When an instruction causes a trap (see Chapter 13), for example, when an instruction uses an invalid operation code or references an illegal address, the program counter is saved in the trap program counter register, TPC in assembly language. The value saved is the address of the offending instruction. When an interrupt is requested, the trap program counter is used to save the address of the instruction to be executed on return from interrupt.

1.3.4.3. Trap Memory Address

bits 31 to 12: page
bits 11 to 2: word in page
bits 1 to 0: cause

When a trap (see Chapter 13) is caused by an illegal memory reference, the offending memory address is saved in the high 30 bits of the Trap Memory Address register, TMA in assembly language. The byte-in-word field of the address (see Section 1.2) is replaced by the reason the memory address was illegal. In the case of a bus trap, caused by an illegal physical memory address, the following reason codes apply:

If the memory-management unit (see Section 1.3.5) is turned on (psw.level > 7), it will trigger an MMU trap when an instruction violates the access allowed by the R, W, X and V bits in the memory-management unit data register (see Section 1.3.4.6). The violation will be reported in the reason codes:

1.3.4.4. Trap Save Register

bits 31 to 0: saved value

This is a simple 32-bit register, with no special hardware function. In assembly language, it is referenced as TSV. It is needed as a temporary register in order to save and restore registers during entry to and exit from a trap-service routine (see Chapter 13).

1.3.4.5. Cycle Counter Register

bits 31 to 0: count

This is a simple 32-bit register that increments whenever the CPU performs any memory operation (fetch, load or store). In assembly language, it is known as CYC. Sampling the cycle count is a useful way to measure program performance.

1.3.4.6. Memory-Management Unit Data Register

MMU data register fields, from bit 31 (most significant) down to bit 0:
physical page number (bits 31 to 12)
unused
G C R W X V (the least significant bits)

The memory-management unit data register is used to access information in the memory-management unit (see Section 1.3.5) about pages in the virtual address space. In assembly language, it is referred to as MMUDATA.

A read from the MMU data register gives the information associated with the page currently selected by the Trap Memory Address register; a write to the MMU data register causes the MMU to associate the data with the page selected by the Trap Memory Address register.

The least significant bits of the MMU data register give the access rights to a page in memory. These are defined as follows:

Normally, the G bit will be zero, indicating that the MMU data register applies to only one page in the virtual address space. If the G bit is one on write, the contents of the C, R, W, X and V bits are ANDed with the corresponding field for every page known to the MMU. This may be used to invalidate all pages known to the MMU.
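The effect of the G bit can be modeled in a few lines. This is a sketch under stated assumptions: the bits G, C, R, W, X and V are taken to occupy bits 5 down to 0 in the order shown in the layout, the MMU state is modeled as a Python dictionary from virtual page number to data word, and the name write_mmudata is hypothetical. What the hardware does to the selected page itself on a G write is not modeled.

```python
RIGHTS_MASK = 0x1F   # assumed: C R W X V in bits 4 to 0
G_BIT = 0x20         # assumed: G in bit 5


def write_mmudata(pages, selected_page, value):
    """Model a write to MMUDATA; pages maps virtual page -> data word."""
    if value & G_BIT:
        # Global write: AND the rights bits into every known page.
        for page in pages:
            kept = pages[page] & ~RIGHTS_MASK
            pages[page] = kept | (pages[page] & value & RIGHTS_MASK)
    else:
        # Ordinary write: associate the data with the selected page.
        pages[selected_page] = value & ~G_BIT


pages = {1: (5 << 12) | 0x0F, 2: (9 << 12) | 0x1F}
write_mmudata(pages, 0, G_BIT | 0x1E)   # clears V on every page
print({page: hex(data) for page, data in pages.items()})
```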

1.3.5. Memory-Management Unit

The Hawk CPU contains a memory-management unit that translates virtual addresses used by user programs into physical memory addresses. This unit is fairly simple compared to those of many computers. As on other machines, most programmers can ignore the presence of the memory-management unit, but system programmers must deal with it.

The Hawk memory-management unit is turned off whenever the high bit of the processor status word level field is zero (see Section 1.3.4.1). Interrupts and traps always turn off the memory-management unit (see Chapter 13).

The Hawk memory-management unit consists of an array of mapping registers, each of which is seen as two 32-bit fields by the software. The first of these fields corresponds to the Trap Memory Address register, while the second field corresponds to the MMU data register:

First field (corresponds to the Trap Memory Address register):
bits 31 to 12: virtual page number
bits 11 to 0: unused

Second field (corresponds to the MMU data register), from bit 31 down to bit 0:
physical page number (bits 31 to 12)
unused
C R W X V (the least significant bits)

In operation, whenever the CPU issues a virtual address, if the memory-management unit is enabled, it searches for a mapping register that contains the corresponding virtual page number field, and then substitutes the physical page number for the virtual page number in order to construct a physical address (see Chapter 13).

The memory-management unit will request an MMU trap (see Chapter 13) for any of the following reasons:

When an MMU trap is requested, the virtual address that caused the trap will be stored in the trap memory address register (see Section 1.3.4.3) and the physical page number and access rights from the mapping register will be shown in the MMU data register (see Section 1.3.4.6). If there was no match, the MMU data register will be marked as invalid.

The number of mapping registers and the search algorithm used by the memory-management unit may vary between machines. When a new association is stored in the memory-management unit, an internal MMU replacement policy determines which mapping register to use. None of these details matter to software, although they have a great impact on performance. Generally, the system software will use the MMU mapping registers as a cache to view a larger data structure, the page table, that shows physical location of each page of the virtual address space.
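The address translation described in this section can be sketched as a search over a list of mapping registers. This is an illustration, not the actual hardware algorithm: it assumes the V bit is bit 0 of the second field, it does not check the R, W or X rights, and it returns None where the hardware would request an MMU trap. The name translate is hypothetical.

```python
V_BIT = 0x1   # assumed position of the valid bit in the second field


def translate(mapping_registers, vaddr):
    """Translate a virtual address; None stands for an MMU trap request.

    mapping_registers is a list of (first_field, second_field) pairs,
    with page numbers held in bits 31 to 12 of each field.
    """
    vpage = (vaddr >> 12) & 0xFFFFF
    for first, second in mapping_registers:
        if (first >> 12) == vpage:
            if not (second & V_BIT):
                return None            # matching entry is invalid
            # Substitute the physical page number for the virtual one.
            return (second & 0xFFFFF000) | (vaddr & 0xFFF)
    return None                        # no mapping register matched


registers = [(0x00001000, 0x00ABC001)]   # virtual page 1 -> physical page 0xABC
print(hex(translate(registers, 0x00001234)))
```

System software would handle the None case by consulting its page table and storing a new association in the memory-management unit.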

The contents of all mapping registers may be invalidated by storing to the MMU data register with the G bit set to one and the V bit set to zero.

1.4. The Instruction Execution Cycle

The instruction execution cycle of any computer can be described by a computer program that performs the same computations. Therefore, the definitive descriptions of most new architectures developed since 1970 have been given in an algorithmic form. At the top level, the Hawk architecture looks like many others, with an instruction execution cycle described as follows:

    repeat
        if pc.byte = 0
            ir = Memory[ pc.word ](bits 15 to 0)
        else (pc.byte = 2)
            ir = Memory[ pc.word ](bits 31 to 16)
        pc = pc + 2
        decode ir
        execute ir
    forever

This instruction execution cycle is essentially that proposed by Burks, Goldstine and von Neumann in 1946, even to the details of packing two instructions into each machine word. The original von Neumann architecture was much simpler than the Hawk, with just one accumulator instead of many general purpose registers, a much simpler instruction format and a word size of 40 bits instead of the now-common 32 bits.
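The fetch half of the cycle above can be written as a short executable sketch. Memory is modeled as a Python list of 32-bit words indexed by pc.word; decoding and execution are omitted. The name fetch is illustrative.

```python
def fetch(memory, pc):
    """Fetch one 16-bit instruction halfword; return (ir, updated pc)."""
    word = memory[pc >> 2]             # pc.word selects the word
    if (pc & 3) == 0:                  # pc.byte = 0: low halfword
        ir = word & 0xFFFF
    else:                              # pc.byte = 2: high halfword
        ir = (word >> 16) & 0xFFFF
    return ir, (pc + 2) & 0xFFFFFFFF


memory = [0xBBBBAAAA, 0xDDDDCCCC]
pc = 0
for _ in range(4):
    ir, pc = fetch(memory, pc)
    print(hex(ir), pc)
```

Successive fetches take the low halfword of a word first, then the high halfword, then move on to the next word.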

Instruction Register Format

The instruction register (ir) has four fields of four bits each. Four bits can select one of 16 registers or one of 16 alternative operations. The instruction register is a two-byte register. It can also be viewed as a halfword, but the first byte (bits 0 to 7) is the most significant from the point of view of instruction decoding and is therefore shown to the left, against the usual Hawk convention for byte order. The leftmost 4-bit field of the first byte of an instruction largely determines the use made of the other fields. Each of the other fields has multiple names depending on its use.

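Fields in documentation order (first byte on the left):
bits 7 to 4: op
bits 3 to 0: dst, also named srcx or cond
bits 15 to 12: s1, also named op1
bits 11 to 8: s2, also named x, src or op2
const, disp: further field names, defined with the instructions that use them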

Note the discontinuity in the numbering of the bits in the above, reflecting the fact that byte 0 of each instruction halfword is the most important. This byte-reversed presentation is merely a convention used in documentation. Instruction halfwords in memory look like this:

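Fields in memory order (bit 15 on the left):
bits 15 to 12: s1, also named op1
bits 11 to 8: s2, also named x, src or op2
bits 7 to 4: op
bits 3 to 0: dst, also named srcx or cond
const, disp: further field names, defined with the instructions that use them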

The most significant field is the op field, bits 4 to 7, the most significant bits of the first byte. All of the other fields have multiple names that document their uses in different contexts, as determined by the op field. Details of the fields used by each group of instructions are given with the definitions of those instructions. In all cases, the op field(s) are used to determine which instruction is used. In summary:

Zero has a special meaning in some fields. This is given in parentheses: "src (16)" means that zero encodes the constant 16, "src (pc)" means it refers to the program counter, and "dst (x)" means discard the value.
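A sketch of field extraction from an instruction halfword follows. The op field position (bits 4 to 7) is stated in the text; placing dst in bits 0 to 3, s1 in bits 12 to 15 and s2 in bits 8 to 11 is an assumption read from the order of the field names in the layouts above. The names decode and documentation_order are hypothetical.

```python
def decode(ir):
    """Split a 16-bit instruction halfword into its four 4-bit fields."""
    return {
        "op": (ir >> 4) & 0xF,    # bits 7 to 4, stated in the text
        "dst": ir & 0xF,          # bits 3 to 0 (assumed)
        "s1": (ir >> 12) & 0xF,   # bits 15 to 12 (assumed)
        "s2": (ir >> 8) & 0xF,    # bits 11 to 8 (assumed)
    }


def documentation_order(ir):
    """Swap the two bytes to get the byte-reversed documentation view."""
    return ((ir & 0xFF) << 8) | ((ir >> 8) & 0xFF)


ir = 0x5612   # halfword as stored in memory
print(decode(ir), hex(documentation_order(ir)))
```

In the documentation view the fields read left to right as op, dst, s1, s2, which is why the manual presents instructions byte-reversed.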

1.5. Other Notational Conventions

In descriptions of the effects of instructions, the following notations will commonly be used:

a ∧ b   a ∨ b   ~a
Logical and, or and not operations. When applied to words or other identical objects, each bit of the result is computed from the corresponding bits of the operands using the indicated logical function.
a ⊕ b   a ≡ b
Logical exclusive or and equivalence operations. Exclusive or is true when the bits being compared are different, while equivalence is true when the bits being compared are the same. As with and and or, these may be applied to words, where they operate bit by bit.
a « b   a » b   a ›» b
Shift operators applied to words. The left operand, a, is shifted left or right the number of places indicated by b. In the case of a left shift, zeros are shifted in from the right. For right shifts, » is a signed shift, shifting in copies of the sign bit, while ›» is unsigned, shifting in zeros.
a:b   a:b:c
Bit number b of the word (or other object) a and bit numbers b through c of the word a. Bits are numbered from the right starting with zero. For example, if a is a 32-bit word, a»5 = a:31:5.
sx(x)
Sign extends x to the size of the field to which it is being assigned. That is, if x is a 4 bit value 1011, sx(x) is ...11111011, while if x is 0101, sx(x) is ...00000101. Where sx() is not used, padding with zeros is implied.
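For readers who want to experiment, the notations above can be approximated with Python functions on 32-bit values. The function names are illustrative; in particular, sx here takes an explicit source width and always extends to 32 bits, which is a simplification of the rule that sx extends to the size of the destination field.

```python
MASK32 = 0xFFFFFFFF


def shl(a, b):
    """a « b: left shift, zeros shifted in from the right."""
    return (a << b) & MASK32


def shr_signed(a, b):
    """a » b: right shift, copies of the sign bit shifted in."""
    a &= MASK32
    if a >> 31:
        a -= 1 << 32        # reinterpret as a negative Python integer
    return (a >> b) & MASK32


def shr_unsigned(a, b):
    """a ›» b: right shift, zeros shifted in."""
    return (a & MASK32) >> b


def bit(a, b):
    """a:b, bit number b of a."""
    return (a >> b) & 1


def bits(a, b, c):
    """a:b:c, bits b down through c of a, with b >= c."""
    return (a >> c) & ((1 << (b - c + 1)) - 1)


def equiv(a, b):
    """a ≡ b: bitwise equivalence, the complement of exclusive or."""
    return ~(a ^ b) & MASK32


def sx(x, width):
    """Sign extend a width-bit value x to 32 bits."""
    sign = 1 << (width - 1)
    x &= (sign << 1) - 1
    return ((x ^ sign) - sign) & MASK32


print(hex(sx(0b1011, 4)), hex(sx(0b0101, 4)))
```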