--------------------------------------------------------------------------------
Data representation = Information Coding
- Information, including numbers, symbols, program instructions, and memory 
  addresses, is represented using binary patterns in main memory and bulk 
  storage.
- Memory addresses are unsigned integers that, as usual, are represented as 
  binary patterns.

--------------------------------------------------------------------------------

REPRESENTING INTEGERS IN THE BINARY SYSTEM 

Base b Positional Number System: Each digit is assigned a weight according to 
its position. A weight is a power of the base b. Compute each digit times its 
weight and sum the results to get the value of the number.

Base 10: ordinary

Base 2: binary system  

Base 10		Base 2		    Some Powers of 2
=======================		    ================
      0		0		    2 ** 0 = 1
      1		1		    2 ** 1 = 2
      2		10		    2 ** 2 = 4
      3		11		    2 ** 3 = 8
      4		100		    2 ** 4 = 16
      5		101		    2 ** 5 = 32
      6		110		    2 ** 6 = 64
      7		111		    2 ** 7 = 128
      8		1000		    2 ** 8 = 256
      9		1001		    2 ** 9 = 512
     10		1010		    2 ** 10 = 1K = 1024
     11		1011		    2 ** 20 = 1M = ~1 million
     12		1100		    2 ** 30 = 1G = ~1 billion
     13		1101		    2 ** 32 = 4G = ~4 billion
     14		1110		    2 ** 40 = 1T = ~1 trillion
     15		1111		    2 ** 50 = 1P = ~1 quadrillion
     16		10000		    2 ** 60 = 1X = ~1 quintillion

K = kilo, M = mega, G = giga, T = tera, P = peta, X = exa
Note:  here I use K, M, G, T, P, and X as numbers, not units of something

Examples:
	Base 10              Base 2
	 14 (d)            1110 (b)
	100 (d)		1100100 (b)
	256 (d)       100000000 (b)

Conversion: n-bit binary -> decimal 
	total is initialized to 0			11001
	for positions i = 1 to n			= 2^0 + 2^3 + 2^4 
		if bit(i) is 1 then			= 1 + 8 + 16
			add 2^(i - 1) to total		= 25
		end if
	end for
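
The procedure above can be sketched in Python. This is a minimal
illustration, not part of the course material; the function name
binary_to_decimal and the choice to pass the bits as a string are
assumptions made for the example.

```python
def binary_to_decimal(bits: str) -> int:
    """Add 2**(i - 1) for every position i that holds a 1.

    Positions are counted from the right, starting at 1, as in the notes.
    """
    total = 0
    n = len(bits)
    for i in range(1, n + 1):
        # Position i from the right is index n - i from the left.
        if bits[n - i] == "1":
            total += 2 ** (i - 1)
    return total


print(binary_to_decimal("11001"))    # 1 + 8 + 16 = 25
print(binary_to_decimal("1100100"))  # 100
```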

Conversion: decimal -> binary 
	remainder is initialized to the decimal value
	for positions i = n to 1
		if remainder >= 2^(i - 1) then
			remainder = remainder - 2^(i - 1)
			place a 1 in position i of the binary number
		else
			place a 0 in position i of the binary number
		end if
	end for
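
A possible Python version of this subtraction method is sketched below.
The function name decimal_to_binary and the explicit width parameter n
are assumptions for the example. The comparison is written as >= so that
a remainder exactly equal to a weight (for example 8 = 1000) still
produces a 1 in that position.

```python
def decimal_to_binary(value: int, n: int) -> str:
    """Convert a non-negative integer to an n-bit binary string."""
    if not 0 <= value < 2 ** n:
        raise ValueError("value does not fit in n bits")
    remainder = value
    bits = []
    # Work from the leftmost position (weight 2**(n - 1)) down to position 1.
    for i in range(n, 0, -1):
        weight = 2 ** (i - 1)
        if remainder >= weight:
            remainder -= weight
            bits.append("1")
        else:
            bits.append("0")
    return "".join(bits)


print(decimal_to_binary(25, 5))  # 11001
print(decimal_to_binary(8, 4))   # 1000
```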

Unsigned Integers (standard binary)
	0 .. 255 (8 bit word);   0 .. 4G-1 (32 bit word)

Signed Integers (negative and positive):  Probably not needed for CS330
= Two's Complement Representation
- Most commonly used because easiest to implement and fastest in hardware
- One's complement (or just complement): convert all zeros to ones and 
  vice versa
- Procedure for obtaining the two's complement of a binary number
	- Perform one's complement (invert zeros and ones)
	- Add 1
- Example:  to represent -5 in 4 bits, find the two's complement of 5
	5 (d) = 0101 (b)     ==> 1010   (take 1's complement)
			        +   1   (add 1)
			         1011   (represents -5 in 4 bits)
- 4 bits: -1 1111; -2 1110; -3 1101; -4 1100; -5 1011; -6 1010; -7 1001; -8 1000
- To negate any value: take its two's complement; e.g. negate -5
	1011 ==> 0100 + 1 = 0101
- Addition example:  3 + 4 in 4 bits is 0011 + 0100 = 0111 = 7
- Adding negative values:  ignore the carry out of the leftmost bit
- Example: -3 + -4 in 4 bits is 1101 + 1100 = 11001 ==> 1001 = -7
- Subtraction is implemented as adding the negated value:
	A - B = A + (-B)
- In the CPU, only an Add circuit (and a one's complement circuit) has to be 
  implemented
- Additional examples to try:  17 + 3; 17 - 3; 17 - (-3)
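
The two's complement rules above can be checked with a short Python
sketch. The helper names (twos_complement, encode, decode, add,
subtract) are hypothetical and chosen only for this illustration; bit
patterns are held as non-negative Python integers. The 17 examples use
8 bits here, which is an assumption, since 17 does not fit in 4 bits.

```python
def twos_complement(pattern: int, bits: int) -> int:
    """Invert every bit (one's complement), add 1, and drop any carry."""
    mask = (1 << bits) - 1
    ones_complement = ~pattern & mask
    return (ones_complement + 1) & mask


def encode(value: int, bits: int) -> int:
    """Bit pattern that represents a signed value in the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given number of bits")
    return value if value >= 0 else twos_complement(-value, bits)


def decode(pattern: int, bits: int) -> int:
    """Signed value of a bit pattern; a leading 1 means negative."""
    if pattern & (1 << (bits - 1)):
        return -twos_complement(pattern, bits)
    return pattern


def add(a: int, b: int, bits: int) -> int:
    """Add two patterns and ignore the carry out of the leftmost bit."""
    return (a + b) & ((1 << bits) - 1)


def subtract(a: int, b: int, bits: int) -> int:
    """A - B is computed as A + (-B)."""
    return add(a, twos_complement(b, bits), bits)


print(format(encode(-5, 4), "04b"))                          # 1011
print(decode(add(encode(-3, 4), encode(-4, 4), 4), 4))       # -7
print(decode(add(encode(17, 8), encode(3, 8), 8), 8))        # 20
print(decode(subtract(encode(17, 8), encode(3, 8), 8), 8))   # 14
print(decode(subtract(encode(17, 8), encode(-3, 8), 8), 8))  # 20
```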

------------------------------------------------------------------------------

MEMORY ADDRESSING

- memory consists of a set of bytes
- to access a particular byte, we specify its address, which is its position in
the memory.  This is called "byte addressable memory".
- if there are 32 bytes in memory, we specify a number from 0 to 31 as the 
address of the byte we are interested in
- if there are 4Mb of main memory, then there are 4 * 2**20 bytes, since M is 
an abbreviation for 2**20, which is just over a million.  Since 4 = 2**2, then 
4M = 2**2 * 2**20 = 2**(2+20) = 2**22.  Thus, there are 2**22 bytes in this 
memory.  This is the size of the memory.  Using a calculator, we can calculate 
2**22 = 4,194,304.
- in general if you have 2**n bytes in memory, you require n bits in the 
address.  This is based on how many bits it takes to represent a number.  It
has nothing to do with the number of bits in a byte, which happens to be 8.  
- so if we have 32 bytes in memory, we need to represent 32 different numbers 
to serve as addresses of these bytes.  So it takes log_2(32) = 5 bits to 
represent 32 different numbers.  
- we use log_2(32) because this asks, "For what number x, does 2**x = 32?"  On 
your calculator, log_2(32) = log_10(32) / log_10(2) or ln(32) / ln(2).
- if we have 4Mb of main memory, then 4M = 4194304 different numbers are needed
to serve as addresses of these 4194304 bytes. It takes log_2(4194304) = 22 bits
to represent the numbers from 0 to 4194303.
- similarly, if we have an address containing n bits, we can use it to address 
a space containing 2**n bytes
- suppose we had an address containing 3 bits (very small!).  We can represent 
2**3 = 8 different numbers in 3 bits.  The binary numbers are 000, 001, 010, 
011, 100, 101, 110, 111.  We use these numbers to address bytes 0 through 7.
- if we had an address containing 32 bits, we could represent 2**32 =
4.29 * 10**9 different numbers.  Thus we could address 4.29 * 10**9 bytes.
We may prefer to say that 2**32 = 2**2 * 2**30 = 4G.  Thus, with 32 bits,
we can address 4G bytes, which is abbreviated as 4Gb.  
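
The relationship between memory size and address width can be sketched
in Python as follows. The names address_bits and addressable_bytes are
assumptions for this example, and the memory size is assumed to be an
exact power of 2, as in all the cases above.

```python
def address_bits(num_bytes: int) -> int:
    """Bits needed to address num_bytes bytes, i.e. log_2(num_bytes)."""
    if num_bytes <= 0 or num_bytes & (num_bytes - 1) != 0:
        raise ValueError("num_bytes must be a power of 2")
    # For a power of 2, bit_length() - 1 is exactly log_2.
    return num_bytes.bit_length() - 1


def addressable_bytes(bits: int) -> int:
    """Number of bytes that an address of the given width can reach."""
    return 2 ** bits


M = 2 ** 20
G = 2 ** 30

print(address_bits(32))                # 5
print(address_bits(4 * M))             # 22
print(addressable_bytes(3))            # 8
print(addressable_bytes(32) == 4 * G)  # True
```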

------------------------------------------------------------------------------
ADDRESSES and ADDRESS SPACES

- we are interested in 2 types of address spaces and 2 types of
addresses:
	- logical address space (virtual address space):
	the set of addresses that a process uses....

		==> the address space that it imagines it has

	- logical address:  identifies a particular byte in the 
	logical address space.

	- physical address space:  the set of addresses actually
	available in the computer's memory.  For example, if your
	computer has 8Mb, then it has a physical address space
	containing 8M bytes.

	- physical address:  identifies a particular byte in the 
	physical address space.

----------------------------------------------------------------------------

SAMPLE PROBLEM
- suppose that the logical address space is 64Kb, 
the page size is 1024 bytes, and the physical address has 20 bits
- determine 
(1) # of bits in logical address
(2) # of bits in page offset
(3) size of physical address space
(4) number of pages in the logical address space
(5) number of frames in main memory

(1) logical address space is 64Kb = 64 * 2**10 bytes = 2**6 * 2**10 bytes
= 2**16 bytes.  It takes 16 bits to address a space that contains 2**16 bytes
OR log_2(64K) = log_2(64*1024) = log_2(65536) = 16

(2) page offset is from 0 to 1023.  It requires log_2(1024) = log_2(2**10)
= 10 bits

(3) the physical address contains 20 bits, so the physical address
space contains 2**20 bytes = 1Mb

(4) # of pages = total logical size / page size = 64K / 1024 = 65536 / 1024 = 64
OR  2**16 / 2**10 = 2**6 = 64

(5) # of frames = total physical size / page size = 2**20 / 2**10 = 2**10 
= 1024 frames
OR 1Mb / 1024 = 1048576 / 1024 = 1024 frames
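
The five answers can be double-checked with a few lines of Python. This
is only a sketch; the variable names and the helper log2_exact are
assumptions introduced for the check.

```python
K = 2 ** 10


def log2_exact(n: int) -> int:
    """log_2 of a power of 2, as an integer."""
    if n <= 0 or n & (n - 1) != 0:
        raise ValueError("n must be a power of 2")
    return n.bit_length() - 1


# Given values from the problem statement.
logical_space = 64 * K          # bytes
page_size = 1024                # bytes
physical_address_bits = 20

logical_address_bits = log2_exact(logical_space)   # (1)
offset_bits = log2_exact(page_size)                # (2)
physical_space = 2 ** physical_address_bits        # (3) in bytes
num_pages = logical_space // page_size             # (4)
num_frames = physical_space // page_size           # (5)

print(logical_address_bits, offset_bits, physical_space, num_pages, num_frames)
# 16 10 1048576 64 1024
```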

----------------------------------------------------------------------------

ANOTHER SAMPLE PROBLEM
- suppose the virtual address has 32 bits, the physical address has 
  20 bits, and the page size is 64 Kb, which is large.
- determine: 
(1) the size of the physical address space
(2) the size of the virtual address space
(3) the number of bits in the page offset
(4) the number of pages in the virtual address space
(5) the number of frames in the physical memory
(6) the size of the page table (in bytes) if there is one entry per page 
    and each entry requires 4 bytes


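(1)  size of physical address space = 2 ^ x
     where x = total number of bits to store frame# and offset
     x = 20, so the size is 2 ^ 20 bytes = 1 048 576 bytes = 1 Mb

(2)  size of virtual address space = 2 ^ x
     where x = total number of bits to store page# and offset
     x = 32, so the size is 2 ^ 32 bytes = 4 294 967 296 bytes = 4 Gb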

(3)  offset is 0..65 535 (65 536 values) and that requires 16 bits
     log_10(65 536) / log_10(2) = 16
     Therefore, page# bits = 32 - 16 = 16 bits
                frame# bits = 20 - 16 = 4 bits

(4)  number of pages in virtual address space = 
     2 ^ 32 bytes / 1024 bytes = 4 194 304 Kb
     4 194 304 Kb / 64 Kb = 65 536 pages

(5)  number of frames in physical memory =
     2 ^ 20 bytes / 1024 bytes = 1024 Kb   
     1024 Kb / 64 Kb = 16 frames

(6)  size of the page table
     64 K pages * 4 bytes = 256 Kb --> 256 * 1024 bytes = 262 144 bytes   OR 
     65 536 * 4 bytes = 262 144 bytes

- redo parts (3) to (6) with a page size of 4 Kb.
   
(3)  offset is 0..4095 (4096 values) and that requires 12 bits 
     log_10(4096) / log_10(2) = 12
     Therefore, page# bits = 32 - 12 = 20 bits
                frame# bits = 20 - 12 = 8 bits

(4)  number of pages in virtual address space =
     2 ^ 32 bytes / 1024 bytes = 4 194 304 Kb
     4 194 304 Kb / 4 Kb = 1 048 576 pages

(5)  number of frames in physical memory =
     2 ^ 20 bytes / 1024 bytes = 1024 Kb
     1024 Kb / 4 Kb = 256 frames

(6)  size of the page table
     1 048 576 pages * 4 bytes = 4 194 304 bytes = 4096 Kb = 4 Mb
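
Both versions of this problem can be recomputed with a small Python
sketch. The function name paging_summary and its dictionary keys are
hypothetical. It assumes the page size is a power of 2 and, as the
problem states, one page table entry per page, so the table size is the
number of pages times the entry size.

```python
K = 2 ** 10


def paging_summary(virtual_bits: int, physical_bits: int,
                   page_size: int, entry_bytes: int = 4) -> dict:
    """Compute the quantities asked for in parts (1) to (6)."""
    offset_bits = page_size.bit_length() - 1       # log_2(page_size)
    num_pages = 2 ** (virtual_bits - offset_bits)
    num_frames = 2 ** (physical_bits - offset_bits)
    return {
        "physical_space": 2 ** physical_bits,       # (1) bytes
        "virtual_space": 2 ** virtual_bits,         # (2) bytes
        "offset_bits": offset_bits,                 # (3)
        "num_pages": num_pages,                     # (4)
        "num_frames": num_frames,                   # (5)
        "page_table_bytes": num_pages * entry_bytes,  # (6)
    }


large = paging_summary(32, 20, 64 * K)
small = paging_summary(32, 20, 4 * K)

print(large["offset_bits"], large["num_pages"],
      large["num_frames"], large["page_table_bytes"])
# 16 65536 16 262144
print(small["offset_bits"], small["num_pages"],
      small["num_frames"], small["page_table_bytes"])
# 12 1048576 256 4194304
```

Shrinking the page from 64 Kb to 4 Kb multiplies the number of pages,
and therefore the page table size, by 16.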