Swapping and Simple Segmentation

Swapping

Look at the movement of instructions and data (process data) between main memory and secondary memory:

Swapping:
- move entire processes between main and secondary memory
- informally, the term "swapping" is also applied to the movement of partial processes
- swapping is needed when there are more processes than will fit in main memory
Two uses for disk:
- swap space (backing store)
  - a temporary storage place for all of or part of processes that currently do not fit in main memory
- file system
  - a semi-permanent storage place for files

Swapping is:

a simple memory management policy
done by moving a complete process in or out of memory
- process data moved include:
  - process control block (pcb)
  - data variables (i.e. heap, stack)
  - instructions (in machine language)

Segmentation

Divide a process up into logically connected chunks which vary in size
- e.g. segment: a logically connected chunk of a process' address space
- it is the virtual address space that is being divided up

Three Segments in a process (e.g., C or C++ program)

reference: Gray, Interprocess Communications in UNIX, Sec. 1.8.
text segment
- contains the executable code (also called the 'code' segment), which is machine language code and is NOT ASCII text.
data segment - contains global, static, and dynamically allocated data
- A global variable is one declared outside of any function. Such variables are always "static" in their space requirements while the program is running.
- A local static variable is one declared inside a function with the `static` keyword. Such a variable keeps its value across function calls, i.e., if it has the value 57 when you return from the function call, it still has that value when you call the function again.
- Like global variables, local static variables are always "static" in their space requirements while the program is running.
- A local automatic variable is any variable declared inside a function without the `static` keyword.
- All global variables and local static variables are kept in the data area, separate from local automatic variables, which are kept on the stack.
- Kept separately because there is only one instance of a global/static variable; thus, the compiler knows its exact size.
- Three parts of the data area:
  - initialized global/static variables (size is known, stored in executable file).
  - uninitialized global/static variables (size is known, not stored in executable file).
  - heap: dynamically allocated space, which is allocated as needed when the program calls malloc or calloc in C or uses new in C++. For example, linked lists are stored here so that they are not mixed up with the stack. Addresses increase as the heap grows.
    - For the 2003-10 term, we will refer to this space as the heap segment, and allow it to be noncontiguous with the other parts of the data area.
stack segment
- Each function call causes an activation record to be stored on the top of the stack.
- An activation record contains the address to return to, the values or addresses of all parameters, and space for the local variables of the function.
- Addresses decrease as the stack grows.


Example Process Illustrating Addresses in the Segments (C program)
Example C program:

Source code: memory_segments.cpp
Output on hercules for CC
Output on hercules for cc (different)
You will get the same output if you run it again, or if you recompile it (with no change to compiler or Operating System) and run it again.
Picture for example program for hercules CC
Template for Lab 2

The code segment starts somewhere around virtual address 10000000 (hex). The code for main, proc1, and proc2 is stored in this segment. The same virtual addresses will be used every time this program is run to create a process, although the physical addresses (not shown anywhere) may differ. In all your work with pointers, you always deal with virtual addresses, rather than physical addresses.
The compiler sorts the global variables into the two groups, depending on whether they have an initial value, and decides which addresses to use for each variable. For example, given two integers that should have 4 bytes each, it may locate them at the next virtual address that is divisible by 4 and the one beyond that.
The initialized global/static variables section starts somewhere after the end of the text segment (indicated by etext) at virtual address 10001b90 (hex). Global variables g1 and g3 are stored at 10014534 and 10014538, respectively; the compiler has left 4 bytes for g1.
I would expect edata to indicate the end of the initialized variables section, but here it is shown as 10015000 (a default value), which is somewhere in the g1 array. Unexplained mystery at this time (Nov. 19, 2001).
The uninitialized global variables are stored starting at 10014648 (hex). Global variables g0 and g2 are stored here at 10014648 and 10014650; here 8 bytes have been left for g0.
Odd feature: The compiler has sorted uninitialized and initialized static variables into two groups, but has placed them all with the initialized global variables.
The local variables of the main program are stored on the stack. They are called local automatic variables because space is allocated for them automatically during execution, as needed. The stack segment starts at virtual address 7fffffff (this is 2^31 - 1, showing that 31-bit virtual addresses are being used). As the stack grows, the addresses decrease.
The first (uninitialized) local automatic variable that we can see is lc2 at location 7fff2ef0 on the stack. The next uninitialized variable is lc0 at location 7fff2eec (which is 4 less, because of the four bytes used to store the first variable). The next variable is lc1 (initialized) at location 7fff2ee0 (which is 12 less).
The local variables for functions proc1 and proc2 are also stored on the stack. Local variables are created when a function is called. If the same function is called from several points in the code, the virtual addresses of the local variables may vary.
When proc1 finishes, its local variables are removed from the stack. Thus, these same addresses on the stack are available for the local variables of proc2 when it is called.
Since both proc1 and proc2 have two variables of exactly the same types, the variables lc4 and lc6 use the same addresses (at different times!), as do lc3 and lc5.
Also during execution, the heap1 and heap2 pointers are set to two locations in the heap area. The location pointed to by heap1 is 100172b8 and that pointed to by heap2 is 100172c8. These are 16 bytes apart.
If we compile the same program with another compiler, we get different memory addresses. However, if we compile again with the same compiler, we get the same results.

Loading

- place the program in main memory
- give your process the segments of memory it needs for code, static variables and initial allocations for the stack and heap

Three tricks used in loading the main memory:

1. Trick 1 (RISC OS UNIX used this):

only load part of the instructions (code segment), load the rest from the executable file when needed
the executable file cannot be deleted while the program is running (mark it as "busy")

2. Trick 2 (UNIX uses this):

share segments with other processes
examples:
- code (vi) - one shared copy for everyone logged in
- data (after fork) - until one process changes it

3. Trick 3 (UNIX uses this):

dynamic linking
a dynamically linked library is also called a DLL
share code for library routines among processes and users
- users cannot change the library routines
keep printf executable code in main memory as long as any process needs it
must be linked at execution (so the function call knows the address)

The rest of the memory management discussion will ignore these tricks and treat user processes separately.

1. Segmentation: logical divisions with different sized chunks
2. Paging: equal divisions with equal chunks
- say each page is 4k
- that is, split the program into 4k chunks and put the process into memory 4k at a time
