Lecture 1

In this lecture we will take a look at how modern computers are organized internally and the kinds of low-level data that computers work with.  Having a good understanding of the lowest levels of the computer will help us understand many aspects of programming in high level languages such as Java.

Basic Computer Organization

Here is a diagram showing how most modern computers, such as the Macs and PCs you use on a daily basis, are organized internally:

The CPU, or Central Processing Unit, is the device that makes the computer tick, or more accurately, execute programs.  It works by reading instructions out of the main memory.  Each instruction specifies a very simple task for the CPU to perform, such as adding two numbers.  Programs consist of sequences of instructions.  By arranging the instructions in the right way, a programmer can make the CPU carry out any desired computation.

CPUs have a small number of registers.  Collectively, the registers act as a "scratch pad" for the CPU: they hold data that the CPU can access very quickly.  However, there are only a small number of registers, typically 64 or fewer.  (The CPU registers are often referred to collectively as the register file.)  Therefore, in general it is not possible for all of the data a program needs to access to fit in registers alone.

The main memory stores two things:

  1. Data being used by running programs

  2. The running programs themselves

This is a very interesting property: programs are simply a special kind of data!  A computer in which main memory stores both data and programs is said to have a von Neumann architecture.  All modern desktop and server CPU architectures work this way.

The main memory is much larger than the CPU's register file.  For example, on my PC the CPU has 8 registers, each of which can store 4 bytes of data.  That's 32 bytes of data in the register file.  The main memory on my PC can store 2 gigabytes of data: that's 2,147,483,648 bytes.  So, the main memory is 67,108,864 times larger than the CPU register file.  Because main memory is so much larger, it is used to store all important data used by programs.  The tradeoff is that main memory is considerably slower to access than CPU registers.

Secondary storage refers to devices like hard disks, optical disks (CD and DVD ROM drives), flash memory, etc.  Secondary storage devices are much (hundreds or thousands of times) slower than main memory.  However, they are non-volatile, meaning that when you turn off the computer, data stored in secondary storage devices persists, unlike data stored in registers and main memory, which simply disappears.  Therefore, any data or programs that need to be saved when the computer is turned off reside in secondary storage.

So far, we have not described any devices that allow the computer to access the outside world: such devices fall under the category of input/output, or I/O, devices.  Examples of I/O devices include displays, keyboards, and mice, which allow human users to interact with the computer.  Another important kind of I/O device is the network interface, which allows the computer to communicate with other computers.

Connecting all of the components of the computer is a bus, which is simply a device that allows the various devices to communicate with each other.  For example, if the CPU wants to load a value from main memory, it uses the bus to request the value.  The memory responds to the request by locating the requested value and then using the bus to transfer it to the CPU.

Bits, Bytes, and Machine Data Types

Computers are machines for manipulating data.  By "data", we mean information.  So far we've talked about data at a high level.  To program a computer we need to know exactly how data is represented and manipulated by the computer.

How the computer represents data depends on what kind of CPU the computer uses.  In the early days of computing, there was little standardization, meaning that different kinds of CPUs might differ dramatically with regard to how data was represented.  However, recent CPU architectures use a very standard data representation.  This section attempts to describe machine data types in a way that summarizes the conventions used by most modern CPUs.

The bit is the fundamental unit of data.  "Bit" is short for "binary digit": in other words, a bit is a single digit in a base 2 numbering system.  A bit of information can have one of two possible values: 0 or 1.

A byte is a sequence of 8 bits.  The byte is the smallest addressable unit of data.  "Addressable" means that each byte has an address uniquely identifying its location in memory.  Main memory is simply a linear sequence of bytes.  The addresses of those bytes start at zero (the lowest address) and increase sequentially up to the maximum amount of memory that the computer has installed.  For example, in a computer with 1 gigabyte of memory, the addresses of bytes of main memory would range from 0 to 1,073,741,823.

In the same way that 8 bits are combined to form a byte, bytes are combined to form larger units of data.  A half word consists of two bytes, a word consists of four bytes, and a double word consists of eight bytes.  Each of these larger machine data types is addressable in the same way that bytes are addressable, because each larger type is simply a sequence of bytes.  However, most CPU architectures impose alignment restrictions on the larger data types.  Specifically, an address may only be used to refer to a multi-byte data type if the address is an exact multiple of the size (in bytes) of the data type.  For example, an address may only be used to refer to a double word if it is a multiple of eight.  This is similar to tick marks on a ruler:

The longest ticks (multiples of 8) can be used as the address of any kind of data: double word, word, half word, or byte.  The next-to-longest ticks (multiples of 4) can be used as a word, half word, or byte address.  The next-to-shortest ticks (multiples of 2) can be used as a half-word or byte address.  The shortest ticks can only be used as a byte address.
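The alignment rule can be sketched in a few lines of Java.  This is only an illustration: the class name AlignmentDemo and the method isAligned are made up for this example, and real CPUs enforce alignment in hardware rather than with code like this.

```java
// Minimal sketch of the alignment rule: an address may refer to a data type
// only if it is a multiple of that type's size in bytes.
// AlignmentDemo and isAligned are hypothetical names used for illustration.
class AlignmentDemo {
    // Returns true if address is a multiple of sizeInBytes.
    static boolean isAligned(long address, int sizeInBytes) {
        return address % sizeInBytes == 0;
    }

    public static void main(String[] args) {
        long[] addresses = {0, 2, 4, 6, 8, 13};
        int[] sizes = {1, 2, 4, 8}; // byte, half word, word, double word
        for (long address : addresses) {
            StringBuilder line = new StringBuilder("address " + address + " can hold sizes:");
            for (int size : sizes) {
                if (isAligned(address, size)) {
                    line.append(' ').append(size);
                }
            }
            System.out.println(line);
        }
    }
}
```

Address 8 qualifies for every size, while an odd address such as 13 can only be used as a byte address.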

The primitive data types used in Java programs correspond closely to machine data types:

Machine type   Java type       Number of bytes
byte           byte            1
half word      short or char   2
word           int             4
double word    long            8

The Java primitive data types are important because they are the lowest-level building blocks of all data structures.
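If you want to confirm these sizes yourself, the standard wrapper classes expose them as constants.  The class name PrimitiveSizes below is just a placeholder for this sketch.

```java
// Prints the size in bytes of each Java integer primitive type,
// using the BYTES constants from the standard wrapper classes.
class PrimitiveSizes {
    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.BYTES);
        System.out.println("short: " + Short.BYTES);
        System.out.println("char:  " + Character.BYTES);
        System.out.println("int:   " + Integer.BYTES);
        System.out.println("long:  " + Long.BYTES);
    }
}
```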

Binary Arithmetic

Computers represent information using bits, bytes, and the machine data types.  But what do these data types really mean?

Consider a byte, which consists of a sequence of 8 bits:

10010110

What does this byte mean?  On the surface, it looks like a number, and indeed, that is usually how information in a computer is interpreted.  But, which number is it?  In decimal (base 10) numbering, this sequence of digits means 10 million, ten thousand, one hundred and ten.  However, computers use binary---base 2---arithmetic.  Here is how you find out the value of a binary number containing n binary digits.  Read from left to right, number the binary digits (bits) from n-1 down to 0.  We'll call the elements of this sequence

b_{n-1}, b_{n-2}, ..., b_1, b_0

The value of the binary number is computed using the following polynomial:

b_{n-1} x 2^{n-1} + b_{n-2} x 2^{n-2} + ... + b_1 x 2^1 + b_0 x 2^0

The value of our example byte 10010110 is thus

   1 x 2^7 + 0 x 2^6 + 0 x 2^5 + 1 x 2^4 + 0 x 2^3 + 1 x 2^2 + 1 x 2^1 + 0 x 2^0
= 128 + 0 + 0 + 16 + 0 + 4 + 2 + 0
= 150

Interpreted in this way, a byte value can represent any integer value in the range 0..255.
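The same computation can be written as a short Java method.  This is a sketch for illustration: the names BinaryValue and unsignedValue are invented here, and the method assumes its input is a string of at most 63 characters, each '0' or '1'.

```java
// Evaluates the polynomial b_{n-1} x 2^{n-1} + ... + b_1 x 2^1 + b_0 x 2^0
// for a string of binary digits. Names are hypothetical; input is assumed
// to contain only '0' and '1' and to be at most 63 digits long.
class BinaryValue {
    static long unsignedValue(String bits) {
        int n = bits.length();
        long value = 0;
        for (int i = 0; i < n; i++) {
            long bit = bits.charAt(i) - '0'; // leftmost character is b_{n-1}
            int power = n - 1 - i;           // exponent for this digit
            value += bit << power;           // bit x 2^power
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(unsignedValue("10010110")); // prints 150
        // The standard library performs the same conversion:
        System.out.println(Integer.parseInt("10010110", 2)); // prints 150
    }
}
```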

The larger machine data types can represent larger ranges of integers.  In particular, a machine data type with n bits can represent values in the range

0 .. 2^n - 1

The fact that only a finite range of values can be represented using machine data types has important consequences.  As an example, say that we add two byte values:

11111111 + 00000001

These byte values represent the decimal values 255 and 1.  What will the sum of this addition be?  Mathematically, it should be 256.  However, that integer cannot be represented as a byte value.  What happens is that the addition overflows.  On overflow, the result "wraps" around to the smallest possible value, in this case, 0.

We can see what is happening by explicitly adding the binary numbers and observing the carries that are needed when adding each pair of digits:

 11111111   (carries)
  11111111
+ 00000001
  --------
  00000000

The carries are shown in the top row, each one written above the column it is carried into.  Although there is a carry in the addition of the leftmost (highest) bits, the result has no room to store it, and therefore it is discarded.

Another way to look at this phenomenon is that it is like a mechanical odometer in a car.  When the odometer reaches the maximum number of miles it can represent, it rolls over to zero.
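Here is a small Java sketch of the wrap-around.  Java's own byte type is signed, so this example keeps the values in an int and masks the result to its low 8 bits to imitate an unsigned byte; the names ByteOverflow and addUnsignedBytes are made up for the illustration.

```java
// Imitates 8-bit unsigned addition: the sum is computed in an int and then
// masked with 0xFF, which throws away any carry out of the highest bit.
// ByteOverflow and addUnsignedBytes are hypothetical names.
class ByteOverflow {
    static int addUnsignedBytes(int a, int b) {
        return (a + b) & 0xFF; // keep only the low 8 bits
    }

    public static void main(String[] args) {
        System.out.println(addUnsignedBytes(255, 1));   // prints 0
        System.out.println(addUnsignedBytes(200, 100)); // prints 44 (300 - 256)
    }
}
```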

The important message here is that computer arithmetic is finite.  The machine data types in a computer are not integers in the mathematical sense.  When writing a program, you need to choose data types that are "large enough" for the problem you are trying to solve.  Fortunately, for most problems, the 4 and 8 byte types are sufficient.  For example, the range of a 4 byte data type (4 bytes = 32 bits) is 0 to 4,294,967,295.

Signed Integers and Two's Complement

So far, we've treated machine data types as representing non-negative integers.  What if we need to have negative integers?

All modern computers use a representation scheme called two's complement for negative integers.  In two's complement, any value whose high (leftmost) bit is 1 is considered to be negative: the high bit is considered to be the sign bit.  The numeric value of a machine data type in two's complement is computed almost exactly the same way as for the plain unsigned version of the data type.  The only difference is that rather than multiplying the sign bit by 2^{n-1}, you multiply it by -2^{n-1}.  This form of representation is convenient for a variety of reasons: see the Wikipedia article for details.  One of the main motivations for two's complement is that it allows arithmetic operations (addition, subtraction, etc.) to work correctly regardless of whether the operands are positive or negative.

The important practical consequence of two's complement is that for a machine data type with n bits, the range of signed values that can be represented is

-2^{n-1} .. 2^{n-1} - 1

In other words, in two's complement arithmetic, the number of negative values that can be represented is one greater than the number of positive values that can be represented.
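To make the rule concrete, here is a sketch that evaluates a bit string as a two's complement value by giving the sign bit a negative weight.  The names TwosComplement and signedValue are invented for this example, and the method assumes a string of at most 63 characters, each '0' or '1'.

```java
// Evaluates a string of binary digits as a two's complement value: the same
// polynomial as for unsigned values, except that the leftmost (sign) bit is
// weighted by -2^{n-1} instead of 2^{n-1}. Names are hypothetical.
class TwosComplement {
    static long signedValue(String bits) {
        int n = bits.length();
        long value = 0;
        for (int i = 0; i < n; i++) {
            long bit = bits.charAt(i) - '0';
            long weight = 1L << (n - 1 - i);
            if (i == 0) {
                weight = -weight; // the sign bit counts negatively
            }
            value += bit * weight;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(signedValue("10010110")); // prints -106
        System.out.println(signedValue("10000000")); // prints -128
        System.out.println(signedValue("01111111")); // prints 127
        // Java's byte type uses the same representation and range:
        System.out.println(Byte.MIN_VALUE + " .. " + Byte.MAX_VALUE); // prints -128 .. 127
    }
}
```

The example byte 10010110, which is 150 when read as unsigned, is -106 when read as two's complement: -128 + 16 + 4 + 2.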

You don't need to be concerned about all of the details of how negative integers are represented.  These details will have little consequence when writing actual programs.  However, you should keep in mind how the two's complement representation determines the range of values allowed for signed integer types.

So, when we are looking at a particular value, how do we know whether it is a signed two's complement value or a plain unsigned value?  The answer is that it is up to the program whether to treat values as signed or unsigned.  In Java, most values are treated as signed.  In fact, Java has only one integer data type treated as unsigned: the char type used to represent characters.  Since it is very rare to perform arithmetic on characters, almost all arithmetic in Java is done using signed values.
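The following sketch shows the same 16 bits being interpreted both ways.  The class and method names are placeholders chosen for this example.

```java
// The same 16-bit pattern means different numbers depending on whether the
// program treats it as a signed short or an unsigned char.
// SignedVsUnsigned, asSigned, and asUnsigned are hypothetical names.
class SignedVsUnsigned {
    // Interprets the low 16 bits as a signed (two's complement) value.
    static int asSigned(int bits) {
        return (short) bits;
    }

    // Interprets the low 16 bits as an unsigned value.
    static int asUnsigned(int bits) {
        return (char) bits;
    }

    public static void main(String[] args) {
        int bits = 0xFFFF; // sixteen 1 bits
        System.out.println(asSigned(bits));   // prints -1
        System.out.println(asUnsigned(bits)); // prints 65535
    }
}
```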

Floating Point

The data types we have described so far are useful for representing integer values.  However, what if we need to represent numbers that have a fractional part?

Floating point data types represent numbers in a way that allows fractions to be represented.  The idea is to divide a machine word or double word into two fixed-length parts: a mantissa and an exponent.  Both the mantissa and the exponent are ordinary signed integers.  The numeric value of a floating point value is computed from the mantissa and exponent according to the following formula:

value = mantissa x 2^{exponent}

Let's consider an example.  Say we have a floating point value with mantissa 17 and exponent -2.  Then numerically, this floating point value represents

17 x 2^{-2} = 17 x 1/4 = 4.25
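The formula can be checked directly in Java.  This sketch follows the simplified mantissa-and-exponent model described here, not the exact bit layout a real CPU uses; the names FloatModel and value are hypothetical.  Math.scalb is a standard library method that multiplies a number by a power of two.

```java
// Computes mantissa x 2^exponent for the simplified floating point model
// in the text. FloatModel and value are hypothetical names.
class FloatModel {
    static double value(long mantissa, int exponent) {
        return Math.scalb((double) mantissa, exponent); // mantissa x 2^exponent
    }

    public static void main(String[] args) {
        System.out.println(value(17, -2)); // prints 4.25
    }
}
```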

Arithmetic using floating point values is subject to some limitations.  Some values cannot be represented exactly: for example, it is not possible to represent the value one tenth (0.1) exactly using binary floating point.  The precision and range of floating point values are also limited, so some values are too small or too large to represent.  For these reasons, you should consider floating point arithmetic to be inherently "fuzzy": a floating point value is "close to" the one you want, but may not be exactly equal to it.
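A classic demonstration of this fuzziness is adding 0.1 and 0.2.  The sketch below also shows one common way of coping with it, comparing against a small tolerance; the names FuzzyCompare and nearlyEqual, and the tolerance of 1e-9, are arbitrary choices made for this example.

```java
// Shows that 0.1 + 0.2 is not exactly 0.3 in binary floating point, and
// compares with a tolerance instead. FuzzyCompare, nearlyEqual, and the
// tolerance value are illustrative choices, not a standard API.
class FuzzyCompare {
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                   // prints false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));  // prints true
    }
}
```

An appropriate tolerance depends on the size of the numbers involved, so the fixed value used here should not be taken as a general recommendation.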

The IEEE 754 standard defines the conventions for floating point values that are implemented in all modern CPUs.  Java also defines floating point arithmetic using this standard.  There are two floating point data types used by Java:

Machine type   IEEE 754 type      Java type   Number of bytes   Approximate range
word           single precision   float       4                 +/- 10^{-45} to 10^{38}
double word    double precision   double      8                 +/- 10^{-324} to 10^{308}

Note that the floating point types correspond to machine data types (word and double word) that can also be used to store integer values.  It is up to the program to decide whether a value is an integer or a floating point value.
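Java lets you see this directly: the standard methods Float.intBitsToFloat and Float.floatToIntBits reinterpret the same 32 bits as either an int or a float.  The class name SameBits and the particular bit pattern are chosen for this illustration.

```java
// The same 32-bit word read first as an integer and then as an IEEE 754
// single precision value. SameBits is a hypothetical name; the bit pattern
// was chosen so that the float interpretation is 4.25.
class SameBits {
    public static void main(String[] args) {
        int bits = 0x40880000;
        System.out.println(bits);                        // the word as an int
        System.out.println(Float.intBitsToFloat(bits));  // prints 4.25
        // Going the other way recovers the original bit pattern:
        System.out.println(Integer.toHexString(Float.floatToIntBits(4.25f))); // prints 40880000
    }
}
```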

How the CPU Executes Programs

So far we have described how data is stored in a computer's main memory.  The next question to answer is how the computer carries out computations on the data.

As mentioned earlier, programs are just a kind of data stored in the computer's main memory.  Specifically, a program is a sequence of machine instructions, where each machine instruction tells the CPU to perform a specific task.  Some examples of machine instructions might be:

As you can see, the amount of work done by a single machine instruction is fairly small.  However, by combining sequences of machine instructions in interesting ways, a program can perform any computation.

The following diagram shows how the CPU executes a program's machine instructions:

The CPU fetches instructions from main memory at the address specified by a special register called the program counter, or PC.  An instruction is simply a sequence of bytes in memory.  After fetching the instruction, the CPU decodes and executes it, carrying out whatever computation the instruction specifies.  Ordinarily, after executing one instruction the CPU changes the PC address to refer to the instruction immediately after the just-executed instruction.  This means that sequences of instructions are executed one after another.  However, if the executed instruction is a branch, then the processor changes the PC to contain the branch target address specified by the instruction.  Branches allow a program to vary the flow of execution according to the data values that the program is working with.
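The fetch, decode, and execute cycle can be imitated with a toy simulator.  Everything about this machine is invented for the sketch: the four opcodes, the three-cell instruction format, and the four registers do not correspond to any real CPU.  The point is only to show the program counter advancing from one instruction to the next and being overwritten by a branch.

```java
// Toy simulator of the fetch/decode/execute cycle. The instruction set here
// is hypothetical: each instruction occupies three memory cells
// (opcode, operand a, operand b), and the machine has four registers.
class ToyCpu {
    static final int HALT = 0;      // stop the machine
    static final int LOAD_IMM = 1;  // reg[a] = b
    static final int ADD = 2;       // reg[a] = reg[a] + reg[b]
    static final int BRANCH_NZ = 3; // if reg[a] != 0, jump to address b

    // Runs the program stored in memory and returns the final register values.
    static int[] run(int[] memory) {
        int[] reg = new int[4];
        int pc = 0; // program counter: address of the next instruction
        while (true) {
            // Fetch the instruction the PC refers to.
            int opcode = memory[pc];
            int a = memory[pc + 1];
            int b = memory[pc + 2];
            pc += 3; // by default, continue with the following instruction
            // Decode and execute.
            switch (opcode) {
                case HALT:
                    return reg;
                case LOAD_IMM:
                    reg[a] = b;
                    break;
                case ADD:
                    reg[a] += reg[b];
                    break;
                case BRANCH_NZ:
                    if (reg[a] != 0) {
                        pc = b; // branch: replace the PC with the target address
                    }
                    break;
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    // A program that adds 5 + 4 + 3 + 2 + 1 into register 0.
    static int[] sumProgram() {
        return new int[] {
            LOAD_IMM, 0, 0,   // address 0:  sum = 0
            LOAD_IMM, 1, 5,   // address 3:  counter = 5
            LOAD_IMM, 2, -1,  // address 6:  step = -1
            ADD, 0, 1,        // address 9:  sum += counter
            ADD, 1, 2,        // address 12: counter += step
            BRANCH_NZ, 1, 9,  // address 15: loop back while counter != 0
            HALT, 0, 0        // address 18: done
        };
    }

    public static void main(String[] args) {
        int[] reg = run(sumProgram());
        System.out.println(reg[0]); // prints 15
    }
}
```

Note that the program itself is just an array of integers in the simulated memory, which echoes the earlier observation that programs are a special kind of data.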

Each CPU family defines its own machine language, which is the specific format and meaning of the instructions supported by the processor.  This means that a program written in machine language can only be executed on the kind of CPU it was designed for.  This is one of the reasons why you can't run a Windows program written for a Pentium CPU on a Sun workstation with a SPARC CPU.