(Handout) Lab 1 - C Programming and Makefiles
Why C?
Here are some of the highlights: C is an imperative programming language that was developed mainly as a
systems programming language for writing operating systems. Its main features include low-level access to
memory, a small set of keywords, and a clean style; these features make C suitable for, and widely used in,
systems programming. C gives you a huge amount of power over what the computer does, which helps you
optimize the performance of your programs and lets you write low-level software that interacts directly with
hardware. It also gives you the awesome feeling of really being in control. But with that power comes the
responsibility to use it correctly: C has very few safeguards to protect your program’s data or to exit
gracefully when you make mistakes, and it will happily overwrite your memory with garbage or make your
program explode. Don’t worry, though, we’ll help you find and avoid these mistakes!
Assignment
Assignment Setup
Start with the cs300-s23-labs-YOURNAME repository you cloned in Lab 0. Remember, you need to do all these
steps inside your container. So, run your container first, attach to it from inside VS Code and use the VS Code
terminal.
First, cd into the directory of your cloned repository and ensure that it has a handout remote. Type:
$ git pull
$ git pull handout main
This will merge the Lab 1 stencil code with your previous work from Lab 0.
If you get the following error:
fatal: could not read Username for 'https://github.com': terminal prompts disabled
then you need to generate an SSH key, add it to your GitHub account, and make SSH the default login
method for git. Use the following links to see how you can do this.
1. Generate SSH Key: Generating a new SSH key and adding it to the ssh-agent - GitHub Docs
2. Add SSH Key: Adding a new SSH key to your GitHub account - GitHub Docs
3. Make SSH the default login method: https://stackoverflow.com/a/44247040/7737870
After completing these steps, run the git pull commands again. They should now run without errors.
If you have any “conflicts” from Lab 0 (although this is unlikely), resolve them before continuing further.
Run git push to save your work back to your personal repository.
Setup
After you set up the lab, you should find the following files within the lab1 folder:
| File | Description |
|------------------|--------------------|
| reverse.h | Header file for `reverse.c`. Contains declarations for the function you should be implementing. (Explained below) |
| reverse.c | You will be writing your code in this file. |
| test_reverse.c | Contains the test suite in which your implementation will be tested. |
Header Files
You’ll notice that there are three files in the provided stencil code. reverse.c and test_reverse.c are similar
to what we’ve seen before, containing C code. But what about reverse.h ?
Files ending in the .h extension are called header files, and declare functions so that they can be used in
multiple different .c files. Without a header file, test_reverse.c wouldn’t be able to use the functions that
you create in reverse.c , which would make testing impossible!
reverse.h includes the signature of the reverse_arr function, but with no implementation. test_reverse.c
pulls in that signature by including the header at the top of the file:
#include "reverse.h"
This way, when the reverse_arr function is used in test_reverse.c , the C compiler checks reverse.h for a
matching signature, and then checks reverse.c for an implementation of the function.
Since store is defined as a char pointer, store will point to a byte of memory that stores a character. And if
you increment the value of the pointer by 1 (going to the next box) and dereference that value, you would get
the next character of the string. This raises the question: couldn’t you just keep incrementing this pointer?
How would you know where the end of the string is?
The answer: all strings in C are terminated by a NUL byte (a char storing numeric value 0), also known
as \0 . This byte indicates that you have reached the end of the string.
Consider the following memory layout for the example program above:
If i == 0, then *(store + 0) dereferences the memory address stored in store, which is 0x2000, and
at address 0x2000, the character “h” is stored as a single byte. This is equivalent to writing store[0].
If i == 1, then *(store + 1) dereferences the memory address store + 1, which is 0x2001.
At 0x2001, there is the character “e”. This is equivalent to writing store[1].
What you saw here is an example of pointer arithmetic, that is, arithmetic on memory addresses.
reverse_arr takes two inputs: a char* array and the number of elements in the array.
It reverses the given array with the help of another function called swap.
Note: For this part (Part I), you can assume that you will have the same number of elements
in the array as specified by the second argument, and you will not have to reverse an empty
array and all elements will be defined (i.e., not NULL ).
swap will take in two elements from the array and swap them.
This generates an executable called reverse_test. An executable is a special file that contains machine
instructions encoded as 0s and 1s. Running this file causes the computer to perform the operations
specified by those instructions.
In this case, those instructions are to run the program starting from main(), which first parses input from the
command line, reverses the array given, and calls functions that run the tests found in test_reverse(). (One
of the tests opens a file called test.txt in the current directory, reverses each line of the file, and writes
the result to an output file called testout.txt.)
You run your executable via:
For instance:
will print out the results of reversing the input array and running the test suite. If you fail a test, the output
provides the expected result at a given index in the array and the actual result.
To debug, you may find it helpful to print what’s happening in your swap and reverse_arr functions. For
instance, if you wanted to print the variable store from the code sample above, you can do:
to print a string.
Hint: If you want to print out the value of a pointer, use the %p syntax for printf .
Once all of your tests pass, you are ready to move on!
Assignment Part II: More on Compiling
As you saw from the previous section, you compiled your program by running:
$ gcc test_reverse.c reverse.c -o reverse_test
With the -o flag, you can direct the output of the gcc compiler into a file specified by the argument following
the flag. If you didn’t use the -o flag, you could run:
$ gcc test_reverse.c reverse.c
And this will produce an executable file called a.out (this is just a default filename defined by the compiler),
which you can run by typing ./a.out .
Flags
The gcc compiler supports hundreds of different flags, which we can use to customize the
compilation process. Flags, typically prefixed by a dash or two (-<flag> or --<flag>), can help us in many
ways, from warning us about programming errors to optimizing our code so that it runs faster.
The general structure for compiling a program with flags is:
Warning Flags:
1. -Wall
One of the most common flags is the -Wall flag. It will cause the compiler to warn you about
technically legal but potentially problematic syntax, including:
Uninitialized and unused variables
Incorrect return types
Invalid type comparisons
2. -Werror
The -Werror flag forces the compiler to treat all compiler warnings as errors, meaning that your
code won’t be compiled until you fix the errors. This may seem annoying at first, but in the long run,
it can save you lots of time by forcing you to take a look at potentially problematic code.
3. -Wextra
This flag adds a few more warnings (which will appear as errors thanks to -Werror) that are not
covered by -Wall. Some problems that -Wextra will warn you about include:
Assert statements that always evaluate to true because of the datatype of the argument
Unused function parameters (only when used in conjunction with -Wall )
Empty if/else statements.
Task 2
Add the -Wall, -Werror, and -Wextra flags when compiling test_reverse.c and fix the errors that come
up.
Notice that in test_reverse.c the main() function takes in two parameters:
What’s argc supposed to do?
argc indicates the number of arguments passed into the program.
Debugging with Sanitizers: The warning flags don’t catch all errors. For example, memory leaks, stack or
heap corruption, and cases of undefined behavior are often not detected by the compiler. You can use
sanitizers to help with identifying these bugs! Sanitizers sacrifice efficiency to add additional checks and
perform analysis on your code. You will be using these flags in the next lab in greater detail.
4. -fsanitize=address
This flag enables the AddressSanitizer program, which is a memory error detector developed by
Google. It can detect bugs such as out-of-bounds accesses to heap, stack, and global variables, and
uses of dangling pointers (using a pointer after the object being pointed to is freed). In practice,
this flag also adds another sanitizer, the LeakSanitizer, which detects memory leaks (also
available via -fsanitize=leak).
5. -fsanitize=undefined
This flag enables the UndefinedBehaviorSanitizer program. It can detect and catch various kinds of
undefined behavior during program execution, such as using null pointers, or signed integer
overflow.
6. -g
This flag asks the compiler to generate and embed debugging information in the
executable, such as which source file and line each instruction came from. This provides more
specific debugging information when you’re running your executable with gdb or the address
sanitizer. You will see this flag being utilized in the next lab.
Optimizations
In addition to flags that let you know about problems in your code, there are also optimization flags that
will speed up the runtime of your code at the cost of longer compilation times. Higher optimization levels
will optimize by running analyses on your program to determine if the compiler can make certain
changes that improve its speed. The higher the optimization level, the longer the compiler will take to
compile the program, because it performs more sophisticated analyses on your program. These are the
capital O flags, which include -O0 , -O1 , -O2 , -O3 , and -Os .
7. -O0
This will compile your code without optimizations — it’s equivalent to not specifying the -O option at
all. Because higher optimization levels will often remove and modify portions of your original code,
it’s best to use this flag when you’re debugging with gdb or address sanitizers.
8. -O3
This will enable the most aggressive of these optimization levels, which usually makes your code run the fastest.
Task 3
Time your program before you add the -O3 flag and then after you’ve added the -O3 flag to your
compilation. Because this program is so small, you probably won’t be able to detect a difference in
speed, but in future assignments where there is a lot more code, the optimization flag will come in
handy.
The -O3 flag will ask the compiler to examine what your code is trying to do and rather than following
the provided code verbatim it will replace it with machine instructions that functionally do the same thing,
but in a more efficient manner.
You can time your program by running the time command in your Docker container. For this exercise,
pay attention to the real time, but if you’re curious about the different types of times below, check out
this post.
time ./reverse_test
real 0m0.007s
user 0m0.002s
sys 0m0.000s
A Makefile consists of one or more rules. The basic structure of a Makefile rule is:
<target>: <dependencies>
[ tab ]<shell_command>
The target is the name of an output file generated by this rule, or a rule label that you choose in certain
special cases.
The dependencies are the files or other targets that this target depends on.
The shell command is the command which is run when the target or dependencies are out of date.
General Rules:
- From gnu.org: A target is out of date if it does not exist or if it is older than any of the dependencies (by
comparison of last-modification times). The idea is that the contents of the target file are computed
based on information in the dependencies, so if any of the dependencies changes, the contents of the
existing target file are no longer necessarily valid.
- If a target is out of date, running make <target> will first remake any of its target dependencies and
then run the <shell_command> .
- In general, the name of the Makefile target should be the same as the name of the output file, because
then running make <target> will rebuild the target when the output file is older than its dependencies.
Linking is the process of combining many object files and libraries into a single (usually executable) file.
If you look at the top of the file test_reverse.c, you can see an #include "reverse.h". This lets
the tests use the functions that you wrote, and as you can see, reverse_arr is called
in the function test_reverse. You can link these two files together with the following Makefile rule:
reverse_test: test_reverse.c reverse.c reverse.h
	gcc test_reverse.c reverse.c -o reverse_test
The target is the executable named reverse_test, and the dependencies are test_reverse.c, reverse.c,
and reverse.h. To compile, instead of typing the full shell command, you can just type:
$ make reverse_test
This will cause make to run the reverse_test rule, which executes the command gcc test_reverse.c reverse.c
-o reverse_test if a reverse_test executable doesn’t exist or if the reverse_test executable is older than any
of the dependencies. Notice how this only works properly if the name of the output executable is the same as
the target name.
That was a lot of reading and information, but now you are ready to create your own Makefile!
Task 4
Variables
Makefiles support defining variables, so that you can reuse flags and names you commonly use. MY_VAR =
"something" will define a variable that can be used as $(MY_VAR) or ${MY_VAR} in your rules. A common way
to define flags for C program compilation is to have a CFLAGS variable that you include whenever you run gcc.
For example, you can then rewrite your target like this:
Automatic Variables
Automatic variables are special variables that can have a different value for each rule in a Makefile and
are designed to make writing rules simpler. They can only be used in the command portion of a rule!
Here are some common automatic variables:
$@ represents the name of the current rule’s target.
$^ represents the names of all of the current rule’s dependencies, with spaces in between.
$< represents the name of the current rule’s first dependency.
If we wanted to avoid writing out test_reverse.c and reverse.c repeatedly, we could rewrite our target
like this:
Task 5
Use regular variables (i.e. CFLAGS) and automatic variables to simplify your Makefile and add the
-O3 flag.
Note: you can do MY_VAR += <additional flags> if you want to compile with more flags and only use
one variable.
Phony Targets
There are also targets known as ‘phony’ targets. These are targets that themselves create no files, but rather
exist to provide shortcuts for doing other common operations, like making all the targets in our Makefile or
getting rid of all the executables that we made.
Here are some common phony targets that we’ll be using in this course:
all target
We use the all target to build all of the executables (non-phony targets) in our project with a single
command. This is what it generally looks like:
all: target1 target2 target3
As you can see, there are no shell commands associated with the all target. In fact, we don’t need to
include shell commands for all , because by including each target (target1, target2, target3) as
dependencies for the all target, the Makefile will automatically build those targets in order to fulfill the
requirements of all .
In other words, since the all target depends on all the executables in a project, building the all target
causes make to first build every other target in our Makefile.
clean target
We also have a target for getting rid of all the executables (and other files we created with make) in our
project. This is the clean target.
The clean target generally looks like this:
clean:
	rm -f exec1 exec2 obj1.o obj2.o
As you can see, the clean target is fundamentally just a shell command to remove all the executables and
object files that we made earlier. By convention, the clean target should remove all content automatically
generated by make. It must be a phony target, because by definition, make clean doesn’t generate output
files (but rather removes them)!
Note: Be careful which files you put after the rm -f command, as they will be deleted when you run make
clean. Don’t put your .c or .h files there, because you might lose the code that you wrote!
format target
In this lab, you will notice that all of the Makefiles also contain a format target, which uses a command
called clang-format to style your .c and .h files following a specified standard. A typical format command
would look like this:
format:
	clang-format -style=Google -i <file1>.h <file2>.c ...
The above command will format any listed files according to Google’s coding conventions (a set of stylistic
and technical conventions that Google engineers agreed to use).
Note: When using this, keep in mind the order of your #include lines, because formatting might reorder
them. This matters if, for example, one of your header files relies on standard library headers that are
included only by the file that includes it. To avoid this, make sure that your header files are
self-contained (i.e., include all the headers they need).
check target
You’ll also notice a check target in the Makefiles we provide in future labs. If you were to create a check
target in this particular instance, the dependency for the check target is the reverse_test executable.
Note (from gnu.org): A phony target will not work if a file with the same name as the target exists.
To understand this, consider the clean phony target that we mentioned earlier. It will cease to work
if anything ever creates a file named clean in the directory. Since the rule has no prerequisites, the
file clean would always be considered up to date, and its commands would not be executed. To avoid
this problem, you can explicitly declare the target to be phony, using the special
target .PHONY as follows:
.PHONY : clean
Once this is done, make clean will run the commands regardless of whether there is a file named clean.
Since it knows that phony targets do not name actual files that could be remade from other
files, make skips the implicit rule search for phony targets (see Using Implicit Rules:
https://ftp.gnu.org/old-gnu/Manuals/make-3.79.1/html_node/make_94.html#SEC93). This is why
declaring a target phony is good for performance, even if you are not worried about the actual file
existing. Thus, you first write the line that states that clean is a phony target, then you write the
rule, like this:
.PHONY: clean
clean:
	rm *.o temp
Task 6
Running make without any targets will run the first target in your Makefile. Consequently, you
should place the all target as the first target so that typing make will automatically generate all the
executables.
Don’t forget to mark these targets as phony!
Simplifying Linking
It is often a good idea to break compilation of a large program into smaller sub-steps. Consider, for example,
this command you used earlier:
gcc test_reverse.c reverse.c -o reverse_test
For this program, gcc creates two separate .o files, one for test_reverse.c and one for reverse.c and then
links them together. But what if you had hundreds of source files?
Large vs. Small Projects: For small projects, the above works well. However, for large projects it can be
much faster to generate intermediate .o files (so-called “object files”) and then separately link the .o files
together into an executable. Linking is the process of combining multiple object files (which already contain
machine code, but not a full program) into a full executable program.
Why does this make sense? Imagine a project that generates two shared libraries and three executables, all of
which separately link a file called data.c. Let’s say data.c takes 1 second to compile into data.o. If you
compile and link each of the five outputs in one command (without creating intermediate .o files), gcc will
rebuild data.o five times, resulting in 5 seconds of build time. If you build data.o separately, you’ll
compile data.c only once (taking 1 second) and then link data.o (which is much faster than compiling from
scratch, especially with large source files) into each output. So, if linking takes 0.2 seconds per output,
the total build time will be 1 + 5 × 0.2 = 2 seconds instead of 5 seconds.
Although this technique won’t yield a huge performance benefit in the case of our small lab, let’s try it to
drive the concept of linking home! We can then use our Makefile to automate this process for us, so that we
don’t have to regenerate all object files every time we edit one source file.
To create the object files without linking them, we use the -c flag when running gcc. For example, to create
object files for test_reverse.c and reverse.c, we would run:
$ gcc -c test_reverse.c -o test_reverse.o
$ gcc -c reverse.c -o reverse.o
This will generate reverse.o and test_reverse.o files. Then, to link the object files into an executable, we
would run:
$ gcc test_reverse.o reverse.o -o reverse_test
The advantage of creating object files independently is that when a source file is changed, we only need to
recreate the object file for that source file. For example, if we changed reverse.c, we would just have to
run gcc -c reverse.c -o reverse.o to get the new object file, and then gcc reverse.o test_reverse.o -o
reverse_test to get the final executable, without also regenerating test_reverse.o.
Task 7
In your Makefile, create targets for test_reverse.o and reverse.o , that each include the corresponding
source file as a dependency.
Each of these targets should compile its source file into an object file (not an executable). They
also need the correct flags for optimization and debugging.
Update your reverse_test target to use the .o files.
Update your clean and format targets.
Thanks to this, make will only recompile each individual object file if that file’s source was changed.
It may not make the biggest difference for this lab, but in a larger project doing this will save you
lots of time.
Pattern Rules
The last Makefile technique we’ll discuss is the pattern rule, which is very commonly used in Makefiles. A
pattern rule uses the % character in the target to create a general rule. As an example:
file_%: %.c
	gcc $< -o $@
The % will match any nonempty substring in the target, and the % used in the dependencies will be replaced
by the string that was matched in the target. In this case, the rule specifies how to make any file_<name>
executable with another file called <name>.c as a dependency. If <name>.c doesn’t exist and can’t be made,
make will report an error.
As you may have noticed, both the test_reverse.o and reverse.o targets are running the same command,
which means that we can simplify it.
Task 8
Use pattern rules to simplify your Makefile targets such that you can
generate reverse.o and test_reverse.o using only one rule rather than two separate rules.
If you need help, this documentation might help.
Submission
After you complete each task, commit and push it to your CS300 GitHub repo. Make sure to
complete all the tasks within the lab container. Before each push, use the commit
message "Completed Task X", where X is the task number. You should complete the tasks in order by
going through each section sequentially.
All of your commits must be made before the assignment deadline mentioned in the Google
Classroom. If any commits are made after the assignment deadline, marks will be deducted.
So make sure to complete the tasks and commit them before the deadline.
In the Google Classroom assignment, you have to submit a text file containing the link to your CS300
GitHub repo. Also add imtiajahmed@iut-dhaka.edu as a collaborator in the GitHub repo before
submission.
Again, complete all the steps before the deadline.