Lecture 2

In this lecture we will discuss high-level languages, the notion of object-oriented programming, and start to look at the Java language.

Machine Language and High-Level Languages

Programming in machine language is cumbersome.  Machine language instructions express computation directly in bits and bytes.  To be productive, we need to be able to create high-level abstractions that define the computation in terms that are meaningful to human beings.

Another reason why we wouldn't want to do all of our programming in machine language is that machine language is tied to a particular CPU architecture.  For example, it is not possible to run a machine language program written for Intel Pentium CPUs on a PowerPC CPU.

A high-level language allows humans to write programs that employ higher-level abstractions.  The bits and bytes are still there behind the scenes, but we organize them into objects that represent concepts in our problem domain.  For example, if we are writing a drawing program, then we might have objects that represent lines, circles, squares, rectangles, and other shapes.

Some examples of high-level languages include:

Compilers and Virtual Machines

To execute a program written in a high-level language, we must first compile it.  Compiling a program translates it from a high-level source language into low-level machine language:

In the example above, a C program (Prog.c) is translated into an executable file (Prog.exe).  An executable file is a file that contains the machine language instructions for a program, and any initial data the program will use, in a format that is convenient for the computer operating system to load into memory and run.

The compilation model for Java is somewhat different from the compilation model for C (and similar languages like C++):

The example shows a Java source file (Prog.java) being compiled into a class file.  A class file is a lower-level representation of the program than the source file; however, it does not contain machine instructions for any real CPU.  Instead, it contains machine instructions for a Java Virtual Machine, or JVM.  A Java Virtual Machine is a program that loads class files into memory and translates their virtual machine instructions into actual (CPU) machine instructions.  One advantage of using a JVM is that it allows compiled Java programs to run on any kind of computer that has a JVM available, rather than being tied to a particular CPU architecture (as would be the case for executable files generated by a C compiler).
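As a minimal sketch of this workflow, suppose Prog.java contained the class below.  Only the file name comes from the lecture; the class body and the greeting method are assumptions made for illustration.  The comments show the two steps: javac produces the class file, and java starts a JVM that loads and runs it.

```java
// Prog.java (hypothetical contents; only the file name comes from the lecture)
//
// Compile:  javac Prog.java   produces Prog.class (JVM instructions, not CPU instructions)
// Run:      java Prog         starts a JVM, which loads Prog.class and executes it
class Prog {
    // Illustrative helper: builds the message that main prints.
    static String greeting() {
        return "Hello from the JVM";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

The same Prog.class file could be copied to a machine with a different CPU and run there unchanged, provided a JVM is installed on that machine.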

Classes and Objects

Java is an object-oriented programming language.  In an object-oriented language, programs are expressed in terms of objects.  So, what is an object?

An object is a bundle of data and methods

Data defines the state of the object.  Methods define the behavior of the object: in other words, what kinds of things the object is capable of doing.

Here is an example.  Say that we are writing an implementation of the video game Pong.  One of the objects that the game will have is the ball that the players hit back and forth.  The ball object will have both state and behavior.

Some of the state that a ball will have includes:

The behavior of the ball includes the operations that define how the ball will interact with other parts of the program.  Some possible operations that the ball object might support include

A class is a template for defining instances of a particular type of object.  For example, we could define a class called "Ball" that would describe how to create ball objects and their operations.

The distinction between classes and objects is a fundamental one.  The class is an abstraction representing all instances of a particular abstract category of objects: for example, the Ball class represents the category of all ball objects.  An object is a particular instance, or individual, belonging to the category represented by its class.  Thus a ball object is an instance of the Ball class.

Let's visualize this distinction concretely.  Say that we are going to make our Pong game a bit more exciting, and have two balls instead of just one.  The Ball class allows us to define as many instances (objects) as we would like.  Each instance has its own private data (state), but shares the common methods (operations) defined by the Ball class:
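The same idea can be sketched in Java.  The code below is only an illustration: the field names (x, y, dx, dy), the move method, and the PongDemo class are assumptions, not part of any particular Pong implementation.  It creates two Ball objects from one Ball class and shows that moving one ball leaves the other ball's state unchanged.

```java
// Hypothetical demo class: creates two instances of the Ball class.
class PongDemo {
    public static void main(String[] args) {
        Ball first = new Ball(0.0, 0.0, 1.0, 2.0);
        Ball second = new Ball(10.0, 5.0, -1.0, 0.5);

        first.move();  // changes only the state of the first ball

        System.out.println(first.getX() + ", " + first.getY());    // prints 1.0, 2.0
        System.out.println(second.getX() + ", " + second.getY());  // prints 10.0, 5.0
    }
}

// Hypothetical Ball class.  The fields and methods are assumptions
// chosen for illustration.
class Ball {
    // State: every Ball object has its own copy of these fields.
    private double x;   // horizontal position
    private double y;   // vertical position
    private double dx;  // horizontal distance travelled per step
    private double dy;  // vertical distance travelled per step

    Ball(double x, double y, double dx, double dy) {
        this.x = x;
        this.y = y;
        this.dx = dx;
        this.dy = dy;
    }

    // Behavior: this method is defined once and shared by all Ball objects.
    void move() {
        x += dx;
        y += dy;
    }

    double getX() { return x; }
    double getY() { return y; }
}
```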

Classes and objects are a very powerful way of expressing concepts in the form of programs.  We will explore object-oriented programming in Java a bit later in the semester.

Java Lexical Elements

Programming languages are like human languages.  They have a lexicon of words and punctuation, and a grammar that determines how the words and punctuation are combined to form phrases, sentences, and documents.  Let's make this correspondence explicit:

Human language    Programming language
Phrase            Expression
Sentence          Statement
Document          Program

Once you know the lexicon and grammar rules, you can use them to define any possible Java program.  For now, let's look at the lexical elements of Java.

Tokens and Whitespace

At the lexical level, Java programs consist of a sequence of tokens separated by whitespace or comments.  The tokens are the words and punctuation of the program.  Whitespace and comments are used to format and annotate the program in order to make it easier to read and understand.

Whitespace consists of any sequence of space, tab, or newline characters.  It has no effect on the meaning of the program.  The purpose of whitespace is to separate tokens and make the program easier to read.  For example, the Java compiler would be perfectly happy for your entire program to be written on a single line.  However, a program written this way would be very hard to read.
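To illustrate, here is a small sketch in which the same method body is written twice, once crammed onto one line and once spread out.  The class and method names are made up for this example.  Note that even the compact version needs a space between int and total, because whitespace is what separates those two tokens.

```java
// Hypothetical example: two methods built from the same tokens.
class WhitespaceDemo {
    // Written on a single line: legal Java, but hard to read.
    static int sumCompact(int a,int b){int total=a+b;return total;}

    // The same tokens, laid out with whitespace.
    static int sumFormatted(int a, int b) {
        int total = a + b;
        return total;
    }

    public static void main(String[] args) {
        // Both methods compute the same result.
        System.out.println(sumCompact(2, 3) == sumFormatted(2, 3));  // prints true
    }
}
```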

Comments are annotations added by the programmer to describe what the program is trying to do.  They also have no effect on the meaning of the program.  However, comments provide useful cues for human observers who are reading the program and trying to understand what it does.  There are two kinds of comments in Java.

End of line comments look like this:

// This is an end of line comment

In an end of line comment, everything between the "//" characters and the end of the line is ignored by the compiler.  However, the following line is not part of the comment.

Multi-line comments look like this:

/*
  This is a multi-line comment.
  Like the name suggests, it can span multiple lines.
*/

Everything from the opening "/*" characters to the closing "*/" characters is ignored by the Java compiler.  Sometimes you will see multi-line comments that look like this:

/*
 * Multi-line comments often have a column of asterisks
 * on the left-hand side.
 */

Identifiers

Identifiers are simply names.  An identifier must begin with either a letter or the underscore ("_") character.  After the initial character, it may contain letters, underscores, or digits ("0"-"9").  (Java also permits the dollar sign, "$", but by convention it is not used in hand-written code.)  The purpose of identifiers is to name variables, methods, and classes.

Examples of identifiers:

Identifiers in Java are case-sensitive: "Ball" and "ball" are different identifiers.
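The following sketch shows these rules in use.  All of the names in it are invented for illustration.  Because identifiers are case-sensitive, ball and Ball are declared as two separate variables without any conflict.

```java
// Hypothetical example of legal identifiers.
class IdentifierDemo {
    static int distinctNames() {
        int ball = 1;          // begins with a letter
        int Ball = 2;          // differs from "ball" only in case, so it is a different variable
        int _count = 3;        // begins with an underscore
        int player2Score = 4;  // digits are allowed after the first character
        return ball + Ball + _count + player2Score;
    }

    public static void main(String[] args) {
        System.out.println(distinctNames());  // prints 10
    }
}
```

A name such as 2ndPlayer would be rejected by the compiler, because an identifier may not begin with a digit.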

Keywords

Keywords are tokens that superficially resemble identifiers, but have a special meaning to the Java compiler.  They cannot be used as identifiers.  Some examples:

  • if and else are keywords that are used to specify conditional execution
  • int and double are keywords that specify a type (recall our discussion of types in Lecture 1)

The complete list of Java keywords may be found on Sun's Java website.
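Here is a short sketch that uses the four keywords mentioned above.  The class name, method name, and parameter names are assumptions chosen for this example.

```java
// Hypothetical example using the keywords if, else, int, and double.
class KeywordDemo {
    // int and double name the types of the two parameters;
    // if and else choose which return statement runs.
    static String describe(int score, double threshold) {
        if (score > threshold) {
            return "above";
        } else {
            return "not above";
        }
    }

    // A declaration such as "int if = 3;" would not compile,
    // because if is a keyword and cannot be used as an identifier.

    public static void main(String[] args) {
        System.out.println(describe(7, 5.5));  // prints above
        System.out.println(describe(3, 5.5));  // prints not above
    }
}
```

Other words in this example, such as class, static, and return, are also Java keywords.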

Literals

Literals are tokens that specify a literal numeric, character, or string value.  Some examples:

Kind of literal    Java type    Example
Integer            int          42
Floating point     double       3.14159
Character          char         'Q'
String             String       "Hello, CS 101"

Each kind of literal has a type.  The first three kinds of literals (integer, floating point, and character) have the primitive Java types int, double, and char, which correspond closely to machine data types.  String constants, unlike the other kinds of literals, have the Java type String.  That means that a String constant is represented by a String object.

Note that String constants can contain space characters.  As long as the spaces are between the opening and closing double quote characters, the Java compiler will know that they are part of the String.
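As a small sketch, each literal from the table can be stored in a variable of the matching type.  The class, method, and variable names here are assumptions chosen for illustration.

```java
// Hypothetical example: one literal of each kind from the table.
class LiteralDemo {
    static String describeLiterals() {
        int answer = 42;                    // integer literal
        double pi = 3.14159;                // floating point literal
        char letter = 'Q';                  // character literal (single quotes)
        String greeting = "Hello, CS 101";  // String literal (double quotes); the spaces are part of it
        return answer + " " + pi + " " + letter + " " + greeting;
    }

    public static void main(String[] args) {
        System.out.println(describeLiterals());  // prints 42 3.14159 Q Hello, CS 101
    }
}
```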

Operators

Operators compute functions based on one or two input values.  For example, the standard operators for performing arithmetic are

Operator    Meaning
+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Modulus (integer remainder)

The logical operators && and || implement boolean functions.  && is the boolean and function: it evaluates as true if both of its operands are true and false otherwise.  || is the boolean or function, which evaluates as true if either operand is true and false otherwise.
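The sketch below applies each of these operators to sample values.  The class and method names, and the values 17 and 5, are arbitrary choices for illustration.  One detail worth noticing is that dividing one int by another discards the fractional part.

```java
// Hypothetical example of the arithmetic and logical operators.
class OperatorDemo {
    // && is true only when both comparisons are true.
    static boolean inRange(int value, int low, int high) {
        return value >= low && value <= high;
    }

    // || is true when at least one comparison is true.
    static boolean outOfRange(int value, int low, int high) {
        return value < low || value > high;
    }

    public static void main(String[] args) {
        int a = 17;
        int b = 5;
        System.out.println(a + b);  // prints 22
        System.out.println(a - b);  // prints 12
        System.out.println(a * b);  // prints 85
        System.out.println(a / b);  // prints 3 (int division discards the fraction)
        System.out.println(a % b);  // prints 2 (the remainder of 17 divided by 5)

        System.out.println(inRange(a, 0, 10));     // prints false
        System.out.println(outOfRange(a, 0, 10));  // prints true
    }
}
```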

Punctuation

Some tokens serve as punctuation: their meaning is to group or separate other constructs.  For example:

Token(s)    Meaning
{ }         Grouping
;           Terminate a statement
,           Separate values in a list
.           Choose a method or field in a class or object
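All four punctuation tokens appear in even a very small program.  The sketch below is an illustration only; the class and method names are assumptions.  It uses Math.max and String.length, which are part of the standard Java library.

```java
// Hypothetical example showing the punctuation tokens from the table.
class PunctuationDemo {                        // { } group the members of the class
    static int larger(int first, int second) { // , separates the two parameters
        return Math.max(first, second);        // . chooses the max method in the Math class
    }                                          // ; terminates the return statement above

    public static void main(String[] args) {
        String text = "Pong";
        System.out.println(text.length());     // . chooses the length method in a String object; prints 4
        System.out.println(larger(3, 8));      // prints 8
    }
}
```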