CHAPTER 2 - Globals, the Stack and Heap space
OBJECTIVE: An understanding of the three different memory storage areas accessible to C programs with emphasis on the role of the stack. Also introduce the more accurate notion of "parameter copying".
There are three memory spaces the application programmer needs to be familiar with : application global space, the stack and heap space. Although different microcomputers organize their memory in different ways, most are similar. Here is the organization of a typical personal computers memory:
Figure 2.1 A microcomputers memory
2.1 Globals:
Application global space is where all global variables are stored. These variables will be stored for the entire runtime of your program. The storage for global variables is allocated at compile time. In C programs, you store variables in global space by declaring them outside of any function.
/* globals.c */
#include <stdio.h>
#include <stdlib.h>
/* the next two variables will be stored in the global application space. Any function within the file may read or write to these variables. */
int global_int;
char global_char;
int main()
{
int local_int;
printf("this program does nothing except show the ");
printf(" declaration of global and local variables\n");
}
Source 2.1 globals.c
2.2 The Stack:
The stack is critical to understanding argument passing between functions (especially in a mixed language environment) ; however, most beginning programmers are usually only told that the stack is a place where function calls are pushed and popped during recursion (recursion is when a function calls itself until it hits an end condition). The stack stores all function calls, function arguments, function variables and the state of the processor registers. Let's first examine what stack data structure is and then discuss how the compiler uses this data structure.
The classic example to describe a stack is the spring-loaded dish bin in a cafeteria line.
Figure 2.2 A classic stack
In the dish-stack example, the dishwasher would be busily putting (pushing) plates onto the top of the dish stack while kids in the cafeteria line would be taking (popping) dishes off of the top of the stack. As a data storage structure you can see how the stack is useful in keeping track of the order of stored items. The most recent items pushed on the stack are the first to be popped off which is why it is a "Last In First Out" structure.
Figure 2.3 Stack as a LIFO structure
You can create a stack data-structure in your application program to store data; but more important is the fact that the compiler uses a system stack to run your application program. All modern computers include a stack pointer as well as assembly language instructions to push to and pop from the stack. So even though you may not use a stack in your application program, the compiler is using a stack to run your program.
2.3 The compiler and the application stack:
The compiler uses the application stack for two purposes:
· Passing arguments
· Storing call frames (also called stack frames) and local variables.
Good program design calls for tight, modular code. Modular code involves breaking down each function or task of an application into one simple module (in C a module is called a function). Each function of an application usually processes data input and produces data outputs. The input data to a function is called the function parameters. Introductory programming courses describe function parameters as being passed into the function by the calling program. This is absolutely not true and in fact I believe that this line of thinking is dangerous because beginning programmers get the impression that the calling function's variables are actually being passed to the called function! This is not the case at all! What really happens in C is that the values entered into the function call are copied onto the stack! Then execution jumps to the called procedure who refers to the parameter values as offset locations from the stack pointer (some computers store an argument pointer to the start of the functions arguments) !
There are two specific reasons why the notion of "passing arguments" is erroneous and confusing to beginning programmers:
· The term "passing" gives a false impression that variables in one function are physically transferred into a second function. This follows from just using the term "passing." If I pass you a ball, I throw the ball to you and I am left without a ball. There is only one ball! If we carry this analogy over to programming: if one function passes another function some of its variables the first function should no longer have those variables. This is obviously not what is going on inside the computer.
The notion of "passing arguments" supports an abstraction not used in the C language. The abstraction that "passing arguments" supports is that of representing a function as a black box. Here is an example of how many introductory programming courses discuss functions:
Figure 2.4 function as a black box
In this high-level concept the definition of a function is a set of instructions that process inputs to produce outputs. This abstraction is useful in its similarity to a factory which makes it intuitive to grasp. This is why languages like Pascal and Fortran support this black box paradigm; however, the C language is more interested in assembly language power than ease of instruction! This is a great credit to the C language that it keeps us closer to the machine by design. Now we begin to understand students confusion with passing addresses to a function (this will be discussed in the next section): they have been taught the black box analogy of functions and are now faced with a programming language that tears aside the abstraction and brings them right to the machine. Therefore, to understand pointers we need to lose the black box abstraction and understand copying arguments to the stack!
Let's examine function arguments being copied to the stack:
Figure 2.5 Parameter Copying
Let's trace how the compiler executes the call to Mod_char(a,2,mychar). Before jumping to the starting address of the function the arguments must be copied onto the stack. C compilers always push the last argument on the stack first and the first argument last. This convention is used to allow a variable number of parameters to be passed (copied) to a function. This way the called procedure can always find the first parameter in a known place. After the function arguments are pushed onto the stack and execution jumps to the new function, the first job of the new function is to build its stack frame on the stack. The stack frame contains state information needed by the return instruction to restore the registers, clean up the stack and return control. Essentially, the stack frame is simply all the data needed for the CPU to continue execution in the function that was "suspended" when the CPU jumped to the new function.
2.4 The Heap:
The heap is where all dynamic memory management is performed. By dynamic memory management, I mean the ability for an application to dynamically (during runtime) allocate and deallocate blocks of memory (note: on a machine with Virtual Memory the memory may not be actual physical RAM. Virtual Memory is an operating system technique that allows a hard disk to be used as Memory by swapping in to physical memory only data that is immediately needed for processing.).
There are two important points to keep in mind about the heap:
1. It grows toward the stack (on microcomputers): as shown in figure 2.1, the memory map, the stack and heap grow towards each other. Therefore you should know how much memory you have available at your applications disposal at all times (if you do not have virtual memory). All popular microcomputers will have a system function to allow you to check the amount of available memory.
2. You must manage the heap: your application is in full control of allocation and deallocation which means it is your responsibility to return all the memory that you use! If you do not you may run out of dynamic memory.
Chapter 2 Questions
1. When is the storage for global variables allocated?
2. What does LIFO stand for?
3. Are arguments actually passed to a function?
4. What other items besides function arguments are pushed onto the application stack?
5. Why is a heap necessary?
6. What would happen if you never returned any of the memory you allocated?
Further Reading.
Levy, Henry M. and Eckhouse, Richard H. Jr., Computer Programming and Architecture,The VAX-11, © 1980, Digital Equipment Corporation
Mak, Ronald, Writing Compilers and Interpreters: An Applied Approach, © 1991, John Wiley and Sons, Inc.
Tanenbaum, Andrew S., Operating Systems: Design and Implementation, © 1987, Prentice-Hall Inc.