The next step is some kind of support for functions. Functions are perhaps the most fundamental language feature for abstraction and code reuse. It allows us to refer to some piece of code by a name. All we need to know is how many arguments are needed, what type of arguments, and what the function returns, and what the function computes to use the function.
In particular, it's not necessary to know how the function does what it does.
How does assembly language give us this kind of support.
To think about what's required, let's think about what happens in a function call.
Programming languages make functions easy to maintain and write by giving each function its own section of memory to operate in.
For example, suppose you have the following function.
int pickMin( int x, int y, int z ) {
int min = x ;
if ( y < min )
min = y ;
if ( z < min )
min = z ;
return min ;
}
You declare parameters x, y, and z. You also
declare local variables, min. You know that these variables
won't interfere with other variables in other functions, even if
those functions use the same variable names.
In fact, you also know that these variables won't interfere with separate invocations of itself.
For example, consider this recursive function,
int fact( int n ) {
if ( n == 0 )
return 1 ;
else
return fact( n - 1 ) * n ;
}
Each call to fact produces a new memory location for n. Thus, each separate call (or invocation) to fact has its own copy of n.
How does this get implemented? In order to understand function calls, you need to understand the stack, and you need to understand how assembly languages like MIPS deal with the stack.
Let's look at a stack.
The stack pointer is usually a register that contains the top of the stack. The stack pointer contains the smallest address x such that any address smaller than x is considered garbage, and any address greater than or equal to x is considered valid.
In the example above, the stack pointer contains the value 0x0000 1000, which was somewhat arbitrarily chosen.
The shaded region of the diagram represents valid parts of the stack.
It's useful to think of the following aspects of a stack.
Usually, push and pop are defined as follows:
The stack pointer, by convention, is $r29. That is, it's register 29. We say, "by convention", because MIPS ISA does not force you to use register 29. However, because it's a convention, you should follow it, if you expect your code to work with other pieces of code.
Here's how to implement the equivalent of push $r2 in MIPS,
which is to push register $r2 onto the stack.
push: addi $sp, $sp, -4 # Decrement stack pointer by 4
sw $r3, 0($sp) # Save $r3 to stack
The label "push" is unnecessary. It's there just to make it
easier to tell it's a push. The code would work just fine without
this label.
Here's a diagram of a push operation.
The diagram on the left shows the stack before the push operation.
The diagram in the center shows the stack pointer being decremented by 4.
The diagram on the right shows the stack after the 4 bytes from register 3 has been copied to address 0x000f fffc
You might wonder why it's necessary to update the stack pointer.
Couldn't you just do the following?
push: sw $r3, -4($sp) # Copy $r3 to stack
Certainly, this is equivalent in behavior as far as register 3
being saved to the stack.
However, we'd like to maintain the invariant that all addresses greater than or equal to the stack pointer hold valid data, and all the addresses less than the stack pointer hold invalid data. If we ran the above code, then there would be valid data at an address smaller than the stack pointer.
It just makes bookkeeping that much easier if we follow these conventions.
pop: lw $r3, 0($sp) # Copy from stack to $r3
addi $sp, $sp, 4 # Increment stack pointer by 4
Here's a diagram of the pop operation.
The diagram on the left is the initial state of the stack.
The data is copied from the stack to the register 3. Thus, the diagram in the center is the same as the one on the left. If I had drawn a picture of the contents of register 3, it would have been updated with the data from the stack.
Then, the stack pointer is moved down (shown in the diagram on the right).
As you can see, the data still is on the stack, but once the pop operation is completed, we consider that part of the data invalid. Thus, the next push operation overwrites this data. But that's OK, because we assume that after a pop operation, the data that's popped off is considered garbage.
If you've ever made the error of returning a pointer to a local variable or to a parameter that was passed by value and wondered why the value stayed valid initially, but later on got corrupted, you should now know the reason.
The data still stays on the garbage part of the stack until the next push operation overwrites it (that's when the data gets corrupted).
For example, suppose you want to push registers $r2, $r3, and
$r2 on the stack. This is code to do that:
push: addi $sp, $sp, -12 # Decrement stack pointer by 12
sw $r2, 0($sp) # Save $r2 to stack
sw $r3, 4($sp) # Save $r3 to stack
sw $r4, 8($sp) # Save $r4 to stack
Since each register takes up 4 bytes and since each memory address
stores 1 byte, we need to decrement the value of the stack pointer
by 12 to give us the space to store 3 registers.
At this point, we can copy the contents of registers 2, 3, and 4 to the stack. It's somewhat arbitrary which order we put the registers on the stack. In the code above, register 2 is on the very top of the stack. We could just have easily put register 3 or 4 on the top of the stack. The way shown above just seems more "orderly".
Popping is very similar.
pop: sw $r2, 0($sp) # Copy from stack to $r2
sw $r3, 4($sp) # Copy from stack to $r3
sw $r4, 8($sp) # Copy from stack to $r4
addi $sp, $sp, 12 # Increment stack pointer by 12
Let's imagine we're starting in main() in a C program. The stack looks something like this:
We'll call this the stack frame for main(). It is also called the activation record. A stack frame exists whenever a function has started, but yet to complete.
Suppose, inside of body of main() there's a call to foo(). Suppose foo() takes two arguments. One way to pass the arguments to foo() is through the stack. Thus, there needs to be assembly language code in main() to "push" arguments for foo() onto the the stack. The result looks like:
As you can see, by placing the arguments on the stack, the stack frame for main() has increased in size. We also reserved some space for the return value. The return value is computed by foo(), so it will be filled out once foo() is done.
Once we get into code for foo(), the function foo() may need local variables, so foo() needs to push some space on the stack, which looks like:
foo() can access the arguments passed to it from main() because the code in main() places the arguments just as foo() expects it.
We've added a new pointer called FP which stands for frame pointer. The frame pointer points to the location where the stack pointer was, just before foo() moved the stack pointer for foo()'s own local variables.
Having a frame pointer is convenient when a function is likely to move the stack pointer several times throughout the course of running the function. The idea is to keep the frame pointer fixed for the duration of foo()'s stack frame. The stack pointer, in the meanwhile, can change values.
Thus, we can use the frame pointer to compute the locations in memory for both arguments as well as local variables. Since it doesn't move, the computations for those locations should be some fixed offset from the frame pointer.
And, once it's time to exit foo(), you just have to set the stack pointer to where the frame pointer is, which effectively pops off foo()'s stack frame. It's quite handy to have a frame pointer.
We can imagine the stack growing if foo() calls another function, say, bar(). foo() would push arguments on the stack just as main() pushed arguments on the stack for foo().
So when we exit foo() the stack looks just as it did before we pushed on foo()'s stack frame, except this time the return value has been filled in.
Once main() has the return value, it can pop that and the arguments to foo() off the stack.
It returns back the same way as well.
For example, consider the code we had for factorial.
int fact( int n ) {
if ( n == 0 )
return 1 ;
else
return fact( n - 1 ) * n ;
}
Suppose fact(-1) is called. Then, the base case
is never reached (well, it might be reached once we decrement
so far that n wraps around to 0 again). This causes
one stack frame after another to be pushed. Once the stack
limit has been reached, we enter into invalid memory addresses,
and the operating system takes over and kills your programming,
telling you your program has a stack overflow.
Probably the most common cause of stack overflow is a recursive function that doesn't hit the base case soon enough. For fans of recursion, this can be a problem, so just keep that in mind.
Some languages (say, ML) can convert certain kinds of recursive functions (called "tail-recursive" functions) into loops, so that only a constant amount of space is used.
This is how CPUs used to do it. With the RISC revolution (admittedly, nearly 20 years old now) and large numbers of registers used in typical RISC machines, the goal is to (try and) avoid using the stack.
Why? The stack is in physical memory, which is RAM. Compared to accessing registers, accessing memory is much slower---probably on the order of 100 to 500 times as slow to access RAM than to access a register.
MIPS has many registers, so it does the following:
In general, this makes calling functions a little easier. In particular, the calling function usually does not need to place anything on the stack for the function being called.
However, this is clearly not a panacea. In particular, imagine main() calls foo(). Arguments are passed using $a0 and $a1, say.
What happens when foo() calls bar()? If foo() has to pass arguments too, then by convention, it's supposed to pass them using $a0 and $a1, etc. What if foo() needs the argument values from main() afterwards?
To prevent its own arguments from getting overwritten, foo() needs to save the arguments to the stack.
Thus, we don't entirely avoid using the stack.
Eventually, we have to run code from a leaf procedure. This is a function that does not make any calls to any other functions. Clearly, if there were no leaf procedures, we wouldn't exit the program, at least, not in the usual way. (If this isn't obvious to you, try to think about why there must be leaf procedures).
In a leaf procedure, there are no calls to other functions, so there's no need to save arguments on the stack. There's also no need to save return values on the stack. You just use the argument values from the registers, and place the return value in the return value registers.
There's also a complicated language feature called continuations (which appear in languages like Scheme). Continuations can be coded in a way that doesn't require a stack, and in effect, acts like a queue.
Nevertheless, the vast majority of languages that implement functions use the stack.
The code can be written in, say, C in a few hundred lines of code, but is rather unwieldy to write in assembly language.
Notice that the heap in memory is not the same as the heap data structure. Those two are different kinds of heaps, and aren't really related.
We've discussed how arguments and return values are used in the stack, and how each function call results in a stack frame being pushed on the stack. We've also talked about how a stack can overflow, typically, from a recursive function that's either got a bad base case or unexpected arguments. Occasionally, stack overflow can just occur because the stack just grows too big.