Swapping and Simple Segmentation
Swapping
Look at the movement of instructions and data (process data) between
main memory and secondary memory:
- Swapping:
- move entire processes between main and secondary memory
- informally, the term "swapping" is also applied to the movement of partial processes
- needed when there are more processes than will fit in main memory
- Two uses for disk:
- swap space (backing store)
- a temporary storage place for all or part of the processes that currently do not fit in main memory
- file system
- a semi-permanent storage place for files
Swapping is:
- a simple memory management policy
- done by moving a complete process in or out of memory
- the process data moved includes:
- process control block (PCB)
- data (global and static variables, heap, stack)
- instructions (in machine language)
Segmentation
Divide a process into logically connected chunks that vary in size
- segment: a logically connected chunk of a process's address space
- it is the virtual address space that is being divided up
Three Segments in a process (e.g., C or C++ program)
- reference: Gray, Interprocess Communications in UNIX, Sec. 1.8.
- text segment
- contains the executable code (also called the 'code' segment), which is machine language code and is NOT ASCII text.
- data segment - contains global, static, and dynamically allocated data
- A global variable is one declared outside of any function (such variables are always "static" in their space requirements while the program is running).
- A local static variable is one declared inside a function with the static keyword. Such a variable keeps its value across function calls, i.e., if it has the value 57 when you return from the function call, it still has that value when you call the function again.
- Like global variables, local static variables are always "static" in their space requirements while the program is running.
- A local automatic variable is any variable declared inside a function without the static keyword.
- All global variables and local static variables are kept in the data area, separate from local automatic variables, which are kept on the stack.
- They are kept separately because there is only one instance of each global/static variable for the whole run, so the compiler knows exactly how much space they need.
- Three parts of the data area:
- initialized global/static variables (size is known, stored in executable file).
- uninitialized global/static variables (size is known, not stored in executable file).
- heap: dynamically allocated space, which is allocated as needed when the program calls malloc or calloc in C or uses new in C++. For example, linked lists are stored here so that they are not mixed up with the stack. Addresses increase as the heap grows.
- For the 2003-10 term, we will refer to this space as the heap segment, and allow it to be noncontiguous with the other parts of the data area.
- stack segment
- Each function call causes an activation record to be stored on the top of the stack.
- An activation record contains the address to return to, the values or addresses of all parameters, and space for the local variables of the function.
- Addresses decrease as the stack grows.
Example Process Illustrating Addresses in the Segments (C program)
Example C program:
- The code segment starts somewhere around virtual address 10000000 (hex). The code for main, proc1, and proc2 is stored in this segment. The same virtual addresses will be used every time this program is run to create a process, although the physical addresses (not shown anywhere) may differ. In all your work with pointers, you always deal with virtual addresses, rather than physical addresses.
- The compiler sorts the global variables into two groups, depending on whether they have an initial value, and decides which addresses to use for each variable. For example, given two integers that need 4 bytes each, it may place the first at the next virtual address that is divisible by 4 and the second 4 bytes beyond that.
- The initialized global/static variables section starts somewhere after the end of the text segment, which etext shows to be at virtual address 10001b90 (hex). Global variables g1 and g3 are stored at 10014534 and 10014538, respectively; the compiler has left 4 bytes for g1.
- I would expect edata to indicate the end of the initialized variables section, but here it is shown as 10015000 (a default value), which is somewhere in the g1 array. Unexplained mystery at this time (Nov. 19, 2001).
- The uninitialized global variables are stored starting at 10014648 (hex). Global variables g0 and g2 are stored here at 10014648 and 10014650; here 8 bytes have been left for g0.
- Odd feature: the compiler has sorted uninitialized and initialized static variables into two groups, but has placed them all with the initialized global variables.
- The local variables of the main program are stored on the stack. They are called local automatic variables because space is allocated automatically for them during execution as needed. The stack segment starts at virtual address 7fffffff (hex), which is 2^31 - 1, showing that 31-bit virtual addresses are being used. As the stack grows, the addresses decrease.
- The first (uninitialized) local automatic variable that we can see is lc2 at location 7fff2ef0 on the stack. The next uninitialized variable is lc0 at location 7fff2eec (which is 4 less, because of the four bytes used to store the first variable). The next variable is lc1 (initialized) at location 7fff2ee0 (which is 12 less).
- The local variables for functions proc1 and proc2 are also stored on the stack. Local variables are created when a function is called. If the same function is called from several points in the code, the virtual addresses of its local variables may vary, because the stack may be at a different depth at each call.
- When proc1 finishes, its local variables are removed from the stack. Thus, these same addresses on the stack are available for the local variables of proc2 when it is called.
- Since both proc1 and proc2 have two variables of exactly the same types, the variables lc4 and lc6 use the same addresses (at different times!), as do lc3 and lc5.
- Also during execution, the heap1 and heap2 pointers are set to two locations in the heap area. The location pointed to by heap1 is 100172b8 and that pointed to by heap2 is 100172c8. These are 16 bytes apart.
- If we compile the same program with another compiler, we get different memory addresses. However, if we compile again with the same compiler, we get the same results.
Loading
- place the program in main memory
- give your process the segments of memory it needs for
code, static variables and initial allocations
for the stack and heap
Three tricks used in loading the main memory:
1. Trick 1 (RISC OS UNIX used this):
- only load part of the instructions (code segment),
load the rest from the executable file when needed
- the executable file cannot be deleted while the program is running (mark it as "busy")
2. Trick 2 (UNIX uses this):
- share segments with other processes
- examples:
- code (vi) - one shared copy for everyone logged in
- data (after fork) - until one process changes it
3. Trick 3 (UNIX uses this):
- dynamic linking
- a dynamically linked library is called a shared library (shared object) on UNIX and a DLL on Windows
- share code for library routines among processes and users
- users cannot change the library routines
- keep the executable code for printf in main memory as long as any process needs it
- must be linked at execution time (so the function call knows the address)
The rest of the memory management discussion will ignore these tricks
and treat user processes separately.
1. Segmentation:
logical divisions with different sized chunks
2. Paging:
equal divisions with equal sized chunks (pages)
- say each page is 4 KB
- that is, split the program into 4 KB chunks
and put the process into memory 4 KB at a time