Lecture 35, ARM Calling Sequences

Part of the notes for CS:4980:1
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Experiment 1

Consider the following C program:

int main() {
	int i;
	i = 0;
}

If we compile this with the command cc -S ... on a Raspberry Pi computer, the compiler outputs the following assembly code:

	.arch armv6
	.eabi_attribute 27, 3
	.eabi_attribute 28, 1
	.fpu vfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 2
	.eabi_attribute 30, 6
	.eabi_attribute 18, 4
	.file	"tt.c"
	.text
	.align	2
	.global	main
	.type	main, %function
main:
	@ args = 0, pretend = 0, frame = 8
	@ frame_needed = 1, uses_anonymous_args = 0
	@ link register save eliminated.
	str	fp, [sp, #-4]!
	add	fp, sp, #0
	sub	sp, sp, #12
	mov	r3, #0
	str	r3, [fp, #-8]
	mov	r0, r3
	add	sp, fp, #0
	ldmfd	sp!, {fp}
	bx	lr
	.size	main, .-main
	.ident	"GCC: (Debian 4.6.3-14+rpi1) 4.6.3"
	.section	.note.GNU-stack,"",%progbits

The Prologue and Epilogue

We can break this down into a number of distinct pieces. The first and most obvious of these is made up of a prologue and an epilogue that surround the output that is specific to this program:

	.arch armv6
	.eabi_attribute 27, 3
	.eabi_attribute 28, 1
	.fpu vfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 2
	.eabi_attribute 30, 6
	.eabi_attribute 18, 4
	.file	"tt.c"
	.text
------------- everything else -------------
	.ident	"GCC: (Debian 4.6.3-14+rpi1) 4.6.3"
	.section	.note.GNU-stack,"",%progbits

Some pieces of the prologue and epilogue are obvious: there are directives identifying the instruction set for which this code is compiled (armv6) and the specific floating point unit to be used (vfp), and there is a signature of the compiler (GCC 4.6.3). Other parts, the eabi_attribute directives, are more of a mystery. They record, by way of the assembler, various arcane build attributes in the object file, describing the conventions and hardware features the compiler assumed. The attribute settings listed above are correct for the Raspberry Pi, so you can merely parrot them without understanding them. This is cargo-cult programming, but it is acceptable in this context.

Framing the code for a function

The next bit of code we can identify above is a frame around the function main. This frame contains no machine code, and is all about informing the linker that the associated code is a function, how big it is, and its name:

	.align	2
	.global	main
	.type	main, %function
------------- the code for main -------------
	.size	main, .-main

This declares the alignment required (.align 2 means align to the next 4-byte boundary, because the operand is a power of two). Following this, it declares that the identifier main is globally defined (relative to the linker's scope system) and that it is a function. Finally, after the code for the function, there is an assembly directive that computes the size of the function, so that the linker can know how much memory to set aside for the code. These assembly directives must frame every globally visible function. The linker does not need to know about local functions that are, effectively, the private property of one compilation unit, so long as their memory requirements are folded into one of the memory blocks known to the linker.
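
The alignment rule can be checked with a little arithmetic. The following is a minimal sketch in C, not anything the assembler provides: align_up is a hypothetical helper that rounds an address up to the next multiple of 2 to the power n, which is the effect the ARM version of gas gives to .align n.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of 2^n bytes, mimicking the effect
   of ".align n" on the location counter.  align_up is a hypothetical
   name used only for this illustration. */
uint32_t align_up(uint32_t addr, unsigned n) {
    assert(n < 32);                           /* keep the shift well defined */
    uint32_t mask = ((uint32_t)1 << n) - 1;   /* n low-order one bits */
    return (addr + mask) & ~mask;             /* bump, then clear the low bits */
}
```

With n = 2, an address that is already a multiple of 4 is left alone, while any other address is pushed forward to the next multiple of 4.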

The Entry and Return Sequences

For any subroutine, it is possible to identify three closely related sequences of instructions:

the calling sequence: the sequence of instructions used to call that subroutine
the receiving sequence: the sequence of instructions at the start of the subroutine that receives control from the caller
the return sequence: the sequence of instructions at the end of the subroutine that returns control to the calling sequence

In the code in question, we can see the receiving and return sequences:

main:
	@ args = 0, pretend = 0, frame = 8
	@ frame_needed = 1, uses_anonymous_args = 0
	@ link register save eliminated.
	str	fp, [sp, #-4]!
	add	fp, sp, #0
	sub	sp, sp, #12
------------- the code for the subroutine body -------------
	add	sp, fp, #0
	ldmfd	sp!, {fp}
	bx	lr

Note that the ARM version of the GNU assembler, gas, uses the at-sign as a comment marker. The compiler has helpfully provided us with minimal comments, one of which notes that the receiving and return sequences given here are slightly optimized by eliminating the save of the link register.

Most of the code here involves management of the activation record or stack frame of the subroutine. The sub instruction that subtracts 12 from the stack pointer allocates the activation record for this routine. The two add instructions save the former stack pointer in the frame pointer (on entry) and restore the stack pointer (before return). The str and ldmfd instructions save and restore the caller's frame pointer. In the more general case, these also save and restore the link register.
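
To see that these instructions balance, it can help to trace them on a toy model. The sketch below is only an illustration under stated assumptions: Machine, leaf_entry, leaf_return and leaf_frame_balanced are hypothetical names, the stack is a small array of words, and the starting stack pointer is assumed to be a multiple of 4 between 16 and 256.

```c
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* A toy model of the two registers and the stack memory involved.
   All names here are hypothetical. */
typedef struct {
    uint32_t sp, fp;
    uint32_t mem[STACK_BYTES / 4];   /* one element per 4-byte word */
} Machine;

/* str fp, [sp, #-4]! ; add fp, sp, #0 ; sub sp, sp, #12 */
void leaf_entry(Machine *m) {
    m->sp -= 4;                      /* pre-decrement with writeback */
    m->mem[m->sp / 4] = m->fp;       /* save the caller's frame pointer */
    m->fp = m->sp;                   /* fp now marks the base of the frame */
    m->sp -= 12;                     /* allocate the activation record */
}

/* add sp, fp, #0 ; ldmfd sp!, {fp} */
void leaf_return(Machine *m) {
    m->sp = m->fp;                   /* discard the activation record */
    m->fp = m->mem[m->sp / 4];       /* restore the caller's frame pointer */
    m->sp += 4;                      /* pop the saved word */
}

/* Returns 1 if sp and fp come back unchanged after entry, a store to
   the local variable at [fp, #-8], and return. */
int leaf_frame_balanced(uint32_t sp0, uint32_t fp0) {
    Machine m = { .sp = sp0, .fp = fp0 };
    leaf_entry(&m);
    m.mem[(m.fp - 8) / 4] = 0;       /* str r3, [fp, #-8] with r3 = 0 */
    leaf_return(&m);
    return m.sp == sp0 && m.fp == fp0;
}
```

Starting with sp = 128, entry leaves fp = 124 and sp = 112, so the local variable at fp - 8 = 116 lies inside the 12 bytes just allocated.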

Experiment 2

Consider the following C program:

void f(){
        int i;
        i = 1;
}
int main(){
        int i;
        i = 2;
        f();
        return i;
}

The Function f

The above code allows us to find out a bit more. Looking at the assembly code it produces, we find first that the function f has the same optimized entry and return sequence as the main program did in our first example. Because it is a void function, it contains no code to return anything:

f:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 1, uses_anonymous_args = 0
        @ link register save eliminated.
        str     fp, [sp, #-4]!
        add     fp, sp, #0
        sub     sp, sp, #12
        mov     r3, #1
        str     r3, [fp, #-8]
        add     sp, fp, #0
        ldmfd   sp!, {fp}
        bx      lr

The General Entry and Return Sequence

The main program now includes the general entry and return sequence, because it is forced to save its link register by the call to a subsidiary subroutine.

main:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 1, uses_anonymous_args = 0
        stmfd   sp!, {fp, lr}
        add     fp, sp, #4
        sub     sp, sp, #8
------------- the code for the subroutine body -------------
        sub     sp, fp, #4
        ldmfd   sp!, {fp, pc}

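In the general case, the stmfd at the head of the subroutine is matched by an ldmfd at the end. The frame pointer and link register are saved by the former. The latter restores the frame pointer and loads the saved link register value directly into the program counter, so the ldmfd also performs the return and no separate bx lr is needed.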
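
The reason the link register must be saved shows up clearly in a toy model. The following sketch rests on several assumptions: Cpu, general_entry, general_return and survives_nested_call are hypothetical names, the stack is a small word array, and the call in the body is reduced to its one relevant effect, overwriting lr.

```c
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* Toy model of the registers and stack memory; names are hypothetical. */
typedef struct {
    uint32_t sp, fp, lr, pc;
    uint32_t mem[STACK_BYTES / 4];   /* one element per 4-byte word */
} Cpu;

/* stmfd sp!, {fp, lr} ; add fp, sp, #4 ; sub sp, sp, #8 */
void general_entry(Cpu *c) {
    c->sp -= 8;                      /* room for two saved registers */
    c->mem[c->sp / 4]     = c->fp;   /* lower-numbered register, lower address */
    c->mem[c->sp / 4 + 1] = c->lr;
    c->fp = c->sp + 4;               /* fp points at the saved lr */
    c->sp -= 8;                      /* allocate the activation record */
}

/* sub sp, fp, #4 ; ldmfd sp!, {fp, pc} */
void general_return(Cpu *c) {
    c->sp = c->fp - 4;               /* sp points at the saved fp again */
    c->fp = c->mem[c->sp / 4];       /* restore the caller's frame pointer */
    c->pc = c->mem[c->sp / 4 + 1];   /* saved lr goes straight into pc */
    c->sp += 8;                      /* pop both saved words */
}

/* Returns 1 if the routine still returns to lr0, with sp and fp
   restored, even though a call in its body overwrote lr. */
int survives_nested_call(uint32_t sp0, uint32_t fp0, uint32_t lr0) {
    Cpu c = { .sp = sp0, .fp = fp0, .lr = lr0 };
    general_entry(&c);
    c.lr = 0xDEADBEEF;               /* a bl inside the body overwrites lr */
    general_return(&c);
    return c.pc == lr0 && c.sp == sp0 && c.fp == fp0;
}
```

Had the routine used the optimized sequence and returned with bx lr, the model would have branched to the overwritten value instead of the original return address.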

Calling a Procedure (a Void Function)

The main program contains a call to a parameterless void function.

        bl      f

This illustrates that the rather complex entry and return sequence we have seen has a payoff! The calling sequence is, in this case, just one instruction, bl (branch and link).
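
What bl does can be sketched in a few lines of C. This is a simplified model, not a description of the hardware: Regs, bl, bx_lr and resume_address are hypothetical names, pc is taken to hold the address of the instruction being executed, and every instruction is assumed to be 4 bytes long, as in ARM (not Thumb) code.

```c
#include <stdint.h>

/* Only the two registers that matter for call and return.  In this toy
   model, pc holds the address of the instruction being executed, which
   simplifies how the real ARM pipeline exposes pc. */
typedef struct {
    uint32_t pc, lr;
} Regs;

/* bl target: remember where to come back to, then branch. */
void bl(Regs *r, uint32_t target) {
    r->lr = r->pc + 4;               /* address of the instruction after the bl */
    r->pc = target;
}

/* bx lr: return by branching to the saved address. */
void bx_lr(Regs *r) {
    r->pc = r->lr;
}

/* Hypothetical helper: call a leaf routine at callee from a bl located
   at call_site, and report where execution resumes after the return. */
uint32_t resume_address(uint32_t call_site, uint32_t callee) {
    Regs r = { .pc = call_site, .lr = 0 };
    bl(&r, callee);
    bx_lr(&r);
    return r.pc;
}
```

In this model, a bl at address 0x8000 followed by a return from the callee resumes execution at 0x8004, the instruction after the call.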

Returning a Value from a Function

The final statement of the main program is return i. This compiles to produce the following sequence of instructions:

        ldr     r3, [fp, #-8]
        mov     r0, r3

The first, ldr, loads r3 with the value of the local variable i, indexed from the frame pointer with the displacement -8. The second instruction moves the return value to r0. The presence of this move instruction demonstrates that all optimization has been turned off in this compiler; an optimizer would have loaded the return value directly into r0.