Lecture 23: C/C++ Build Process, Makefiles

C/C++ Build Process

The process for building C and C++ programs --- translating them from source code into an executable form --- is somewhat more complicated than the build process for virtual-machine-based languages like Java.

Source and Header Files

At the source level, a C/C++ program is a collection of

  • header files (.h files)
  • source files (.c or .cpp files)

The header files contain declarations that need to be shared among multiple source files. For example:

  • data types such as struct and class types
  • function prototypes (i.e., function declarations)
  • constant values
  • enumeration types

The source files contain the actual executable functions and methods of the program, as well as any global data.

Each source file may use the #include directive to cause the declarations in a particular header file to be visible within the source file. In addition, one header file may include other header files. So, let's say that

  • Game.cpp includes Game.h
  • Game.h includes Item.h

That would mean that all of the declarations in both Game.h and Item.h are visible in Game.cpp.

The inclusion of a header file creates a dependency. If the header file is changed, that change affects the source or header file that includes it. Dependencies are transitive: in the example above, if Item.h changes, that change affects Game.cpp, because Item.h is included by Game.h, which in turn is directly included by Game.cpp.

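More formally, header file dependencies form a directed acyclic graph: each file is a node, and an edge from one file to another means that the first file includes the second.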

Here is a possible graph of header file dependencies for a dungeon exploration game:

[Figure: header file dependency graph (figures/headerDep.png)]

Separate Compilation

Because the overall C/C++ program may contain a large number of functions and methods, it is helpful to organize the program using multiple source files. For C++ programs, a common strategy is to have one source file per class. E.g., Game.cpp will contain all of the methods for the Game class.

Each source file is compiled into an object file which contains the generated machine language instructions for each function and method in the source file, as well as descriptions of any global variables defined in the source file. Object files generally have a .o or .obj file extension.

Once each source file has been compiled into an object file, all of the object files are combined by the linker into a single executable file. The executable file can then be loaded by the operating system and executed.

Here is how the source files in a dungeon exploration game would be compiled and linked:

[Figure: compiling and linking the source files (figures/compileAndLink.png)]

Make

Make is the original tool for implementing a build system. It is a specialized programming language for describing how derived files can be created from source files.

Make may take some getting used to because it uses a different programming paradigm than you are probably used to. The programming languages you have (probably) used so far are imperative languages. Examples of imperative languages are C, C++, C#, Java, etc.

Make is a declarative language in the spirit of logic programming. A logic programming language uses rules to describe what can be computed, rather than spelling out a sequence of steps. In the case of Make, the rules describe how derived files may be constructed from source files.

Macros

A make macro is essentially a variable with a string value. Macros are useful for defining text that will be referenced multiple times. For example, it is common to define a SRC macro which specifies the names of all of the source files. E.g.:

SRC = calc.c expr.c parse.c err.c

The value of a macro can be expanded using the syntax $(macro name). E.g., an occurrence of

$(SRC)

would expand to

calc.c expr.c parse.c err.c
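
As a further illustration, one macro can be defined in terms of another. The sketch below assumes GNU make (or another make that supports substitution references). The OBJ and CC macro names and the show target are illustrative choices, not something defined elsewhere in these notes, and the show target uses a rule, which is explained in the next section.

```make
# List the source files once.
SRC = calc.c expr.c parse.c err.c

# Substitution reference: replace the .c suffix of every word in SRC
# with .o, giving "calc.o expr.o parse.o err.o".
OBJ = $(SRC:.c=.o)

# A macro can also name a tool, so that it can be changed in one place.
CC = gcc

# Hypothetical helper target that prints the expanded values.
# Each command line must begin with a tab character.
show :
	@echo "sources:  $(SRC)"
	@echo "objects:  $(OBJ)"
	@echo "compiler: $(CC)"
```

Running "make show" with this makefile should echo the three expanded macro values, which is a quick way to check that a macro holds what you expect.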

Rules

The general form of a make rule looks like this:

targets : dependencies
	command

The targets and dependencies are simply lists of files. The dependencies are the files that are required to build each of the target files.

The command describes how to run a program that will use the dependencies to generate (or regenerate) the targets. Note that the command is optional: we'll see why in a moment.

Example: if a C source file is a complete program, then a single command can be used to compile an executable from that source file. For example:

hello.exe : hello.c
	gcc -o hello.exe hello.c

This rule means:

in order to build the target hello.exe:

  1. hello.c must exist
  2. the command "gcc -o hello.exe hello.c" must be run

A target is out of date if it does not exist, or if at least one of its dependency files has a modification time that is newer (more recent) than the target. When a target is out of date, make knows that it needs to re-run the command used to create that target.
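
Rules can be chained: a dependency of one rule may itself be the target of another rule. The following sketch assumes that the four source files from the SRC example make up a program called calc.exe, with one explicit rule per object file. It ties the rule syntax back to the compile-then-link process described under Separate Compilation.

```make
# Link step: the executable depends on every object file.
calc.exe : calc.o expr.o parse.o err.o
	gcc -o calc.exe calc.o expr.o parse.o err.o

# Compile steps: each object file depends on its own source file.
calc.o : calc.c
	gcc -c calc.c -o calc.o

expr.o : expr.c
	gcc -c expr.c -o expr.o

parse.o : parse.c
	gcc -c parse.c -o parse.o

err.o : err.c
	gcc -c err.c -o err.o
```

With this makefile, if only expr.c is modified, make finds that expr.o is out of date and recompiles it, then finds that calc.exe is older than expr.o and re-runs the link command. The other three object files are left alone.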

Pattern Rules

One of the most powerful features of make is pattern rules. A pattern rule describes how to generate a category of output files based on a category of input files.

For example, when compiling a program consisting of multiple C source files, each source file is first compiled into an object file. Object files usually have the file extension ".o". The following pattern rule describes how to compile a C source file into an object file:

%.o : %.c
	gcc -c $*.c -o $*.o

The "%" character in the target and dependency is a wildcard: it must match the same sequence of characters in both the target filename and the dependency filename. That means, for example, that the source file "calc.c" will be compiled into an object file called "calc.o", because the wildcard matches the common portion of each filename, namely "calc".

In the command associated with the pattern rule, the special character sequence "$*" can be used to specify whatever sequence of characters was matched by the "%" wildcard. That means when the rule is applied to compile expr.o from expr.c, the command executed will be

gcc -c expr.c -o expr.o
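
The same pattern rule is often written with two other automatic variables provided by GNU make: "$<" (the first dependency) and "$@" (the target). The sketch below is one common way to write it. The CC and CFLAGS macros and the particular flags are assumptions chosen for illustration.

```make
# Compiler and flags, kept in macros so they can be changed in one place.
CC = gcc
CFLAGS = -Wall -g

# $< expands to the first dependency (the .c file),
# $@ expands to the target (the .o file).
%.o : %.c
	$(CC) $(CFLAGS) -c $< -o $@
```

When this rule is applied to build expr.o, "$<" expands to expr.c and "$@" expands to expr.o, so the command has the same effect as the "$*" version above, with the extra flags added.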

Dependency rules

One limitation of pattern rules is that a particular derived file may depend on other files in addition to the dependency file described by the pattern. For example, the object file "expr.o" might require not only "expr.c" as a source file, but also "parse.h", "err.h", and "expr.h". This is a common situation in languages like C and C++, where header files are used for declarations shared between multiple source code files.

We can solve this problem using a dependency rule: a rule that lists target files and dependency files, but does not specify a command. For example:

expr.o : expr.c parse.h err.h expr.h

This dependency rule will tell Make that "expr.o" needs to be regenerated whenever expr.c, parse.h, err.h, or expr.h is modified. When Make detects that expr.o must be regenerated, it searches for a pattern rule that can be applied. In this case, we have a pattern rule describing how to generate a .o file from a .c file, so Make will run the command associated with that rule.
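
Putting the two kinds of rule side by side may make the division of labor clearer. In this sketch the header list for parse.o is hypothetical. Only the expr.o dependencies come from the example above.

```make
# Pattern rule: supplies the command for building any .o from its .c.
%.o : %.c
	gcc -c $*.c -o $*.o

# Dependency rules: no command, only additional dependencies.
expr.o : expr.c parse.h err.h expr.h

# Hypothetical header list, for illustration only.
parse.o : parse.c parse.h err.h
```

Here, touching err.h makes both expr.o and parse.o out of date, and make uses the pattern rule's command to rebuild each of them.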

Automatic header dependency generation

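Maintaining header file dependencies "by hand" is tedious and error prone. Fortunately, most compilers can automate the generation of dependencies. For example, gcc has a -M option to generate a dependency rule for a particular source file. The related -MM option does the same but leaves out system header files such as stdio.h, which usually keeps the output shorter.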

For example, the command

gcc -M expr.c

would output something like

expr.o: expr.c parse.h err.h expr.h

The output of gcc -M for all source files can be collected into an auxiliary makefile. This auxiliary makefile can then be automatically loaded by the main makefile so that the latest header file dependencies are accounted for. This is often done by a depend target. E.g.

depend :
	gcc -M $(SRC) > depend.mak

This will generate dependency rules for each file defined in the SRC macro, saving them in a file called depend.mak. We can then add an include directive in the make program to automatically include this file if available:

ifeq ($(shell if [ -r depend.mak ]; then echo "yes"; fi),yes)
include depend.mak
endif

Note that the ifeq directive tests to see whether the depend.mak file exists before including it.
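
GNU make also offers a shorter way to express "include this file if it exists": an include directive prefixed with a hyphen. The sketch below assumes GNU make and reuses the SRC macro and depend.mak file name from above. It also uses gcc -MM rather than -M so that system headers are left out of the generated rules.

```make
SRC = calc.c expr.c parse.c err.c

# Regenerate the dependency rules for every source file.
depend :
	gcc -MM $(SRC) > depend.mak

# The leading "-" tells GNU make to carry on without an error
# if depend.mak does not exist yet.
-include depend.mak
```

The effect should be the same as the ifeq test: the first time make runs, depend.mak is silently skipped, and once "make depend" has been run, its rules are loaded on every later invocation.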

This approach still requires the developer to build the depend target whenever source or header files are modified in a way that might affect header dependencies. If the developer forgets to do this, then the executable might not be built correctly. A very simple approach to making header dependency generation happen automatically is to ensure that the depend target is a prerequisite for all other targets in the makefile. E.g.:

all : depend
	$(MAKE) calc.exe

In this example, whenever the all target is built, the depend target must be built first, regenerating the depend.mak file containing the latest header file dependencies. Once that is done, a recursive invocation of make builds the calc.exe target, taking the new header file dependencies into account.
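
A different technique, not the one used in these notes, is to let the compiler write dependency files as a side effect of each compilation, so that no separate depend step is needed. The sketch below assumes gcc and GNU make. The DEP and OBJ macros and the .d file names are illustrative.

```make
SRC = calc.c expr.c parse.c err.c
OBJ = $(SRC:.c=.o)
DEP = $(SRC:.c=.d)

# -MMD writes a dependency rule for each compilation into a .d file
# next to the object file (expr.d for expr.o), omitting system headers.
# -MP adds an empty rule for each header, so that deleting a header
# does not cause a "no rule to make target" error.
%.o : %.c
	gcc -MMD -MP -c $*.c -o $*.o

calc.exe : $(OBJ)
	gcc -o calc.exe $(OBJ)

# Load whatever .d files exist; missing ones are ignored.
-include $(DEP)
```

The trade-off is that dependency information for a file only exists after that file has been compiled once. That is usually sufficient, because a file that has never been compiled has to be compiled anyway.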

Makefiles, Makefile syntax

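A Make program is conventionally written in a file called "Makefile". (Note that there is no file extension.) When make is run without naming a makefile explicitly, it looks for this file in the current directory.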

A makefile consists of a series of rules. Make does not require blank lines between rules, but separating rules with one or more blank lines makes the file much easier to read.

If a rule has a command, the command must be written immediately below the first line of the rule (the one with the targets and dependencies), and must be indented with a single tab character.

Pattern rules should appear first, before the rules that build specific targets.

The first non-pattern rule in the makefile is the default rule. When you run the make command and do not explicitly specify a target, make will attempt to build the target(s) specified by the default rule. Typically, a makefile will contain a default rule for a target called all.
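
To show how these conventions fit together, here is a sketch of a complete makefile for the calc example, assembled from the fragments in these notes. The OBJ macro, the clean target, and the .PHONY line are additions made for illustration. They assume GNU make and are not part of the original examples.

```make
SRC = calc.c expr.c parse.c err.c
OBJ = $(SRC:.c=.o)

# Targets that do not name real files (GNU make feature).
.PHONY : all depend clean

# Pattern rule first: how to compile any .c file into a .o file.
%.o : %.c
	gcc -c $*.c -o $*.o

# Default rule: the first non-pattern rule in the file.
all : depend
	$(MAKE) calc.exe

# Link all of the object files into the executable.
calc.exe : $(OBJ)
	gcc -o calc.exe $(OBJ)

# Regenerate the header dependency rules.
depend :
	gcc -M $(SRC) > depend.mak

# Hypothetical convenience target: remove all derived files.
clean :
	rm -f calc.exe $(OBJ) depend.mak

# Included last, so that a rule in depend.mak cannot become the default.
ifeq ($(shell if [ -r depend.mak ]; then echo "yes"; fi),yes)
include depend.mak
endif
```

Running "make" with no arguments builds all, which regenerates depend.mak and then builds calc.exe in a recursive invocation of make. Running "make clean" removes the derived files so that the next build starts from the sources.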