01) Linux - Lab Manual
for
CSE220 Programming for
Computer Engineering
Kevin Burger
Rev 1.5(a)
1 Sep 2009
Contents
1 Introduction to Unix and GNU/Linux................................................................... 3
1.1 Short History of Unix ................................................................................... 3
1.2 The GNU Project and the Linux Kernel ....................................................... 5
1.3 GNU/Linux Distributions .............................................................................. 5
1.4 Other Unix and Unix-Like Distros ................................................................ 7
1.5 Cygwin ........................................................................................................ 7
2 Logging in to GNU/Linux .................................................................................... 8
2.1 KDE, Konqueror, the Terminal Window ....................................................... 9
2.2 Logging Out ............................................................................................... 11
2.3 Logging In To "research.asu.edu" ............................................................. 11
2.4 Logging Out Of research.asu.edu ............................................................. 12
2.5 Transferring Files with WinSCP................................................................. 12
3 Shells ............................................................................................................... 15
3.1 The Bash Shell .......................................................................................... 15
4 The *nix File System and Common Commands .............................................. 18
4.1 Common Directories in *nix ....................................................................... 18
4.2 Pathnames ................................................................................................ 20
4.3 Commands, Arguments, and Options ....................................................... 20
4.4 Common File and Directory Commands ................................................... 21
4.5 How Do I Know Which Directory I Am In? ................................................. 21
4.6 List Files, File Permissions, The Chmod Command .................................. 22
4.7 Deleting Files............................................................................................. 23
4.8 Creating, Changing, and Removing Directories ........................................ 23
4.9 Filename Globbing .................................................................................... 24
4.10 More on Deleting Subdirectories ............................................................. 25
4.11 Copying Files and Directories.................................................................. 26
4.12 Moving and Renaming Files .................................................................... 27
4.13 Finding a File ........................................................................................... 27
4.14 Displaying a File at the Terminal: Cat and Less ...................................... 28
4.15 Archiving Files ......................................................................................... 29
4.16 Compressing Files ................................................................................... 30
4.17 Hexdump ................................................................................................. 31
5 Getting Help ..................................................................................................... 32
5.1 Man Pages ................................................................................................ 32
5.1.1 Reading a Man Page .......................................................................... 32
5.2 Apropos ..................................................................................................... 36
5.3 Info ............................................................................................................ 36
5.4 Final Notes ................................................................................................ 37
6 Environment Variables, Search Path, Prompt .................................................. 38
6.1 Changing your Prompt .............................................................................. 38
7 Searching Files ................................................................................................ 40
8 I/O and Redirection .......................................................................................... 41
http://en.wikipedia.org/wiki/Unix.
An operating system is software which manages the resources in a computer system.
Ken Thompson[3], Dennis Ritchie[4], and others. Thompson had been working on
an earlier operating system project named Multics (Multiplexed Information and
Computing Service) in conjunction with MIT and General Electric. While working
on the project, Thompson had written a game called Space Travel for the GE
mainframe being used for Multics development. After AT&T Bell Labs withdrew
from the Multics project, Thompson began porting his Space Travel game to a
DEC PDP-7 minicomputer at Bell Labs, and in the process began work on a new
operating system he named Unics (for Uniplexed Information and Computing
Service); the spelling was later changed to Unix and when it was owned by AT&T
was written in all capitals as UNIX.
Over time, Thompson, Ritchie, and others at Bell Labs continued developing
UNIX. In 1973 the entire operating system was rewritten in Ritchie's new C programming language, and during this time AT&T began giving the operating system to universities, research institutions, and the US government under licenses,
which included all source code. In 1982, when AT&T realized that they had an
OS on their hands that was worth something, they changed their licensing model
and began selling UNIX as a commercial product without source code.
Meanwhile, at the University of California, Berkeley, Bill Joy and others had begun to develop BSD[5] Unix as an alternative to the now-commercialized AT&T
Unix. BSD Unix went on to become immensely popular, especially in the academic world, and Bill Joy eventually left Berkeley to go to work for a small startup
named Sun Microsystems where he created the SunOS version of Unix which
eventually became Sun Solaris.
Throughout the next two decades, many companies and organizations produced
their own versions of Unix, usually under license from AT&T or from the Berkeley
branch. This led to a proliferation of differing operating systems that were called
Unix. For example, at one time, one could acquire "Unix" from various entities
such as: AT&T (System V), UC-Berkeley (BSD Unix), IBM (AIX), SGI (IRIX), Microsoft (Xenix), SCO, HP (HP-UX), and others. During the late 1980's, it was
thought that Unix would become the dominant operating system among PC users, but the divisions among the participants and the wrangling for control of Unix
created an opportunity for Microsoft Windows NT to fill the gap in the market, and
for Microsoft Windows to, instead, dominate the industry[6].
Eventually, while getting out of the computer business, AT&T washed their hands
of Unix and sold all rights to Novell, which eventually transferred control to the
Open Group. The Open Group is not a company, but rather is an industry consortium[7] which sets standards for Unix. The Open Group published the Single Unix Specification (SUS) in 2002, a family of standards that defines what Unix is, what it is not, and which OS's qualify for the name. The SUS is now maintained by the Austin Group for the Open Group.
[3] http://en.wikipedia.org/wiki/Ken_Thompson_(computer_programmer)
[4] http://en.wikipedia.org/wiki/Dennis_Ritchie. Dennis Ritchie is the creator of the C programming language. It was based on an earlier programming language named B that was created by Ken Thompson.
[5] Berkeley Software Distribution.
[6] Supposedly. Sigh.
Often, additional software packages are added to complete the particular software distribution, or "distro" as these are called. These additional packages often include Open Office[13], a free productivity suite with applications similar to Microsoft Office (Open Office includes a word processor, spreadsheet, presentation, database, and graphics drawing program). The quality of Open Office may not be as high as Microsoft Office[14], but it is very good, and additionally it is free (and open source).
Graphics support in most GNU/Linux distros is based on the X Window System[15], first developed at MIT in 1984. Modern, open-source graphical user interfaces (GUI's) such as KDE[16] and Gnome[17][18] are written to work with X as the underlying graphics system, while KDE and Gnome make the interface intuitive and "pretty" for the user, much as the GUI does in Microsoft Windows and Apple's OS X. Many GNU/Linux distros include support for both KDE and Gnome, and you can choose which to run when logging in. I personally prefer KDE, but many people swear by Gnome. Truth be told[19], they do more-or-less the same thing.
Much information about GNU/Linux distros can be found at DistroWatch[20], where there are links to hundreds, if not over a thousand, different Linux distros. Currently, the most popular distros seem to include: Ubuntu (based on Debian), Debian, RedHat Fedora, Mandriva, CentOS, SUSE, Knoppix, PCLinuxOS, Slackware, Gentoo, and Puppy.
A lot of distros are based on other distros. For example, Ubuntu is based on Debian, CentOS on RedHat Enterprise Linux, Mint on Ubuntu, PCLinuxOS on Mandrake (Mandriva), etc. In general, the underlying OS is more-or-less the same
(the kernels may be slightly different versions) but the packages included in the
distro may differ. For example, Ubuntu is "like" Debian, but it is packaged so as
to make it easier to install and configure, which is, of course, important for beginners. I personally have been using Debian for a number of years, and have not
seen a great reason to switch, but if you are starting out you might want to look at
Ubuntu; it has become very popular. Support varies among the distros, and with
the larger ones, you are more likely to find online support, especially through online forums.
Another option, if you don't want to install GNU/Linux to your hard drive, is to boot from a LiveCD[21] (or Live USB[22]), which is a version of GNU/Linux that is bootable
http://www.openoffice.org.
It's buggy, but then again, so is Office.
http://en.wikipedia.org/wiki/X_Window_System. Commonly called X11 or just X.
http://en.wikipedia.org/wiki/KDE.
http://en.wikipedia.org/wiki/GNOME
There are other GUI's for GNU/Linux but KDE and Gnome are the two most popular.
Don't shoot me.
http://distrowatch.com. More info than you could possibly use.
http://en.wikipedia.org/wiki/Live_CD
http://en.wikipedia.org/wiki/Live_USB
off of a CD-ROM disc (or USB flash drive). Another, newer, cool option is to use Wubi, which is an installer that will install Kubuntu[23] directly onto one of your existing Windows partitions. It will modify the PC bootup screen so you can select to boot into Windows or Kubuntu. The nice thing about this is that it does not require you to create a separate hard drive partition for GNU/Linux, and if you decide you do not want Wubi/Kubuntu on your system anymore, all you have to do is uninstall it like any other Windows application.
It is possible, of course, to build your own GNU/Linux OS by starting with the
kernel and adding GNU source and other packages to that. One such project
teaching you how to do this is Linux from Scratch[24] which provides step-by-step
instructions for building a working GNU/Linux OS from source (i.e., this means
you build/compile all of the necessary programs yourself, including the kernel). I
haven't tried it yet, and although it sounds fun, it also sounds like a lot of work.
1.5 Cygwin
Cygwin[26] is a Linux-like environment for Windows. It consists of a DLL (dynamic-link library, named cygwin1.dll) which acts as a GNU/Linux API (application programming interface) emulation layer. When a program makes a Linux system call, this API layer traps the call and transforms it into a functionally equivalent Windows system call. From the program's perspective, it appears to be running on GNU/Linux, even though it is actually executing on a Windows machine.
In addition to the Cygwin DLL, there are a large number of native GNU/Linux applications which have been ported to Cygwin (native GNU/Linux applications will
not run directly, but rather, must be modified in order to run with Cygwin at the
API layer).
To use Cygwin, you would want to install at a minimum the following packages
and tools:
Base    everything
Devel   binutils, gcc-core, gcc-g++, gdb, make
Doc     cygwin-doc, man
Shells  bash
Utils   bzip2, cygutils
It will work best if you let Cygwin install itself where it wants to, in C:\Cygwin; the main thing is to not put it in C:\Program Files or any other Windows folder with a space in the folder name.
The default Cygwin shell is Bash, which we will discuss in Chapter 3.
2 Logging in to GNU/Linux
The machines in the Brickyard BY214 engineering lab are set up to dual boot both Windows and GNU/Linux (the particular GNU/Linux distribution, that is, "distro", is CentOS[27], which is based on RedHat's Enterprise Linux distro). To boot up to GNU/Linux, turn off the machine, turn it back on, and select CentOS from the boot window when it appears.
After CentOS boots, the login screen will be displayed. Your user identifier and password are your ASURITE login identifier and password.
Historically, the Unix user interface was text-based[28], as it was developed in the early 1970's before graphical terminals were common. Today, most GNU/Linux distros support a graphical user interface. The two GUI's that are most commonly used are the K Desktop Environment (KDE[29]) and Gnome[30].
Some distros include KDE but not Gnome. Others support Gnome, but not KDE.
Some include both and give you the option of selecting which one you wish to
use when you log in. In CentOS, you can select which desktop environment you wish to use through the Session link in the lower left corner of the login screen. By default, Linux will start with the desktop manager you last used. In the remainder of this chapter we assume you are using KDE.
The two main applications you need to learn to use for this course are the file manager and the terminal. The KDE file manager is Konqueror; it is launched by clicking on the K menu icon and then Home. Konqueror allows you to view the files in your directories, move/copy/delete/rename files, etc.
[28] http://en.wikipedia.org/wiki/Text-based
[29] http://www.kde.org
[30] http://www.gnome.org; pronounced "gnome" and not "guh-nome".
To launch the terminal window (i.e., the shell), click on the K menu icon, then select System and then Terminal. Note that you can have multiple terminals running at one time.
Then, once connected, you will be asked to enter your password. Enter your ASURITE password. Once logged in, you will be at the shell prompt (see Chapter 3 for a discussion of the shell). If you want to access research.asu.edu off campus, another very good free secure-shell program is PuTTY[32]. I highly recommend it.
[31] http://en.wikipedia.org/wiki/Secure_Shell
[32] http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
Click the New button and the following dialog box will appear.
[33] http://winscp.net/eng/index.php
In the textfield for Host Name, enter research.asu.edu. In the textfield for User name, enter your ASURITE user identifier. You may enter your ASURITE password in the Password textfield if you do not want to be prompted to log in every time you connect. Make sure the Port Number is set to 22, and that SFTP is selected for the File protocol. When you click Save you may be presented with a dialog box that says something about saving your password (this only occurs if you entered a password). Click OK. You are then asked to give the session a name; you might use ASU - Research Cluster, for example.
Then to connect to research.asu.edu, select ASU - Research Cluster from the
list of saved sessions, and click the Login button. You may get a dialog box that
says something about the server's host key and the cache. Just click Yes to accept it. If everything succeeds, the following window will be displayed.
In the left pane of the window are the files on your local machine; the right pane
displays the files in your home directory on the remote machine. You can copy
and move files between these two machines by drag-and-drop. You can also delete files, rename files, create directories, etc.
[34] http://winscp.net/eng/docs/scripting
3 Shells
In *nix, the shell[35] is a text-based interface to the operating system. The kernel[36] is the lowest-level, innermost part of the operating system; the shell is so named because it surrounds the kernel.
Over the years, various shell programs have been written. The first was the Thompson shell (commonly referred to as simply sh[37]), written by Ken Thompson at Bell Labs. Other popular shells that have been written over the years include the Berkeley Unix C shell (csh), the TC shell (tcsh; the T comes from the TENEX operating system), the Korn shell (ksh), the Z shell (zsh), the Bourne shell (sh), and the Bourne-again shell (Bash[38]).
Bash was the shell originally developed by GNU for the GNU operating system. It is commonly available in most GNU/Linux distributions, and this manual is written assuming you are using Bash. Extensive online documentation for Bash is available[39]. Another reference is also available[40].
You can see which shells are installed on your system by typing cat /etc/shells (the chsh -l command simply displays this file). If you are using GNU/Linux, your default shell is most likely Bash. An easy way to tell what shell you are running is to type less /etc/passwd and look for the line containing your username. The last entry on the line is your default login shell.
If you wish to change your default login shell you can use the chsh -s shell
command. This will modify the shell entry for your username in /etc/passwd so
the next time you login you will be running the new shell.
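As a concrete sketch of the checks described above (the root entry is used below only because every system has one; your own line will show your username and shell):

```shell
# Which shells are installed on this system?
cat /etc/shells

# Which login shell does a given user have?  The last colon-separated
# field of a /etc/passwd entry is that user's login shell:
grep '^root:' /etc/passwd
```

On a typical GNU/Linux system the grep line prints something like root:x:0:0:root:/root:/bin/bash, where the final field is the login shell.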
[35] http://en.wikipedia.org/wiki/Shell_(computing)
[36] http://en.wikipedia.org/wiki/Kernel_(computer_science)
[37] http://en.wikipedia.org/wiki/Thompson_shell
[38] http://www.gnu.org/software/bash
[39] http://www.gnu.org/software/bash/manual/bashref.html
[40] http://tiswww.case.edu/php/chet/bash/bashref.html
The $ prompt is the Bash shell prompt. It is Bash's way of letting you know that it
is waiting for you to type a command. The research.asu.edu:~ part lets me
know the name of the machine I am logged in to and my current directory. The ~
symbol in Bash always refers to a user's home directory. Your home directory is
where you are placed when you first log in and it is the area in the file system
where you can store your files. The prompt string can be changed; see Section 6.1.
When you type commands at the prompt and press the Enter key, Bash will
read the command string you typed in, parse it to break it into words and operators, perform some post-processing, and will then execute the command. Unless
you tell it otherwise, Bash will wait for the command to finish executing before
displaying the $ prompt again (see Chapter 9 on Processes for how to run a
command in the background).
Some of the commands that you may use are built in to Bash, i.e., the code that gets executed is inside the Bash program itself. However, most *nix commands are standalone programs that are found in the /bin, /usr/bin, and /sbin directories (see Section 4.1 for a discussion of these directories). For example, we can determine where the GNU C compiler is installed on research.asu.edu by typing,
$ whereis gcc
gcc: /usr/bin/gcc /usr/lib/gcc /usr/libexec/gcc /usr/share/man/man1/gcc.1.gz
whereis is a command which will search for the program, source, and man-page files for the name given to it. In this case we are searching for gcc, and whereis tells us that files named "gcc" can be found in four separate places. In fact, the first one
/usr/bin/gcc is the actual executable file which is the GNU C compiler. For fun,
try to determine where the whereis program is installed,
$ whereis whereis
whereis: /usr/bin/whereis /usr/share/man/man1/whereis.1.gz
We can see that it is also installed in /usr/bin. In fact, most of the basic *nix
commands (i.e. executable programs) are located in this directory.
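One quick way to see the difference between a builtin and a standalone program is the type command (itself a shell builtin); the exact wording and paths in its output vary slightly from shell to shell and system to system:

```shell
type cd     # reports that cd is a shell builtin
type ls     # reports a path such as /bin/ls, a standalone program
```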
Bash will also execute special programs called scripts written in the Bash shell
scripting language. See Chapter 11 for a basic introduction to Bash shell scripting.
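As a small taste of what Chapter 11 covers, here is a minimal, hypothetical script (the file name greet.sh and its contents are made up for illustration):

```shell
#!/bin/bash
# greet.sh -- print a greeting.
# Usage: bash greet.sh [name]
name=${1:-world}     # use the first argument, or "world" if none given
echo "Hello, $name"
```

Running bash greet.sh Fred prints Hello, Fred; running it with no argument prints Hello, world.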
Overall, Bash is a very complex program, and we will only learn a small part of it in this course. However, what we learn should provide you with a solid foundation on which to build further knowledge of *nix and Bash.
[41] http://en.wikipedia.org/wiki/Berkeley_Fast_File_System
[42] http://amath.colorado.edu/computing/unix/tree.html
[43] http://www.pathname.com/fhs/pub/fhs-2.3.html
4 The *nix File System and Common Commands
4.1 Common Directories in *nix
/home
Users' personal directories are located here. Normally each user is given a home
directory where he/she may store his/her files. This is the directory where, as a
user, you would spend most of your time. A typical home directory for user fredf
would be /home/fredf. Note that the special symbol ~ is used to refer to the
home directory in the shell (discussed later).
/lib
Contains shared library images needed to boot the system and to run the commands in /bin and /sbin.
/mnt
Used to temporarily mount another file system under the current file system,
e.g., to connect the file system for a floppy disk or a cdrom to the current file system. How these devices are mounted is not something ordinary users care about.
/opt
Third-party application programs can be installed here, each within its own subdirectory.
/root
This is root's home directory. The special login identifier root is the master login
for the entire system. Root has privileges that other users do not have, and he or
she is typically the system administrator who is responsible for administering the
system, installing programs, adding users, mounting file systems, etc.
/sbin
This directory contains system binary files, i.e., executables that are normally
used only by root in administering the system. Other system binaries may be
found in /usr/sbin and /usr/local/sbin.
/tmp
Temporary files are often placed here.
/usr
The files and directories under /usr are mounted read-only (except to root). Underneath /usr are several important subdirectories.
/usr/bin
Additional user commands (binaries). For example, the gcc program might be installed here.
/usr/include
Include header files for the C compiler are placed here.
/usr/lib
Contains object files and library files used in software development.
/usr/local
Another place where root can install user programs. For example, if you install the Firefox web browser, you may install it under /usr/local/firefox. Multiple networked host systems may access the files in /usr/local.
/usr/share
Shared data files, e.g., font files, man pages, other documentation, icon
images files, etc.
/usr/src
Source code, generally for reference only (i.e., not for development purposes). On many GNU/Linux systems, the Linux source code will be installed in /usr/src/linux (during installation, you usually have the choice to download and install the Linux source code).
/usr/X11
Files used by X Window (X is part of the graphical system of your Linux
installation).
/var
Variable data files (i.e., data files that contain data that are changing) are stored
here. These would include temporary files and log files.
4.2 Pathnames
When referring to files in the file system, the shell maintains a current directory pointer which "points" to the current directory. When you log in to the system, you will be placed in your home directory, e.g., /home/fredf if your login identifier were fredf. That is now your current directory, and /home/fredf is called an absolute pathname because it specifies the complete path that must be followed from the root directory to get to the directory fredf. Another example of an absolute pathname might be /usr/local/bin/firefox/v1.4.0/bin/firefox.
A relative pathname is a path to a file or directory which is specified relative to your current directory. Two special relative pathnames are . (pronounced dot) and .. (pronounced dot dot). Dot always refers to your current directory, and dot dot refers to the directory above your current directory (i.e., the parent directory). For example, if my current directory is /usr/local/bin then . is /usr/local/bin and .. is /usr/local. These are quite useful as we will see.
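A short self-contained sketch of . and .. in action (the directory names under /tmp are made up for this example):

```shell
mkdir -p /tmp/pathdemo/src    # create a scratch directory tree
cd /tmp/pathdemo/src
pwd                           # prints /tmp/pathdemo/src -- the directory . refers to
cd ..                         # .. is the parent directory
pwd                           # prints /tmp/pathdemo
ls ./src                      # ./src is a relative pathname from here
```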
$ pwd
/home/fredf
$ ls -l
-rwxr-xr-x  1 fredf fredf 20987 ... a.out
-rwxr-xr-x  1 fredf fredf  1098 ... file1.c
-rwxr-xr-x  1 fredf fredf   654 ... file3
drwxr-xr-x+ 1 fredf fredf     0 ... src
The first column displays the permissions for the files[44]. Permission attributes are changed with the chmod command[45].
$ chmod go= a.out
$ ls -l
-rwx------  1 fredf fredf 20987 ... a.out
-rwxr-xr-x  1 fredf fredf  1098 ... file1.c
-rwxr-xr-x  1 fredf fredf   654 ... file3
drwxr-xr-x+ 1 fredf fredf     0 ... src
The above command removes all of the group and others' permission bits.
$ chmod ugo=rwx a.out
$ ls -l
-rwxrwxrwx  1 fredf fredf 20987 ... a.out
-rwxr-xr-x  1 fredf fredf  1098 ... file1.c
-rwxr-xr-x  1 fredf fredf   654 ... file3
The above command gives the user, everyone in the group, and all others full permissions to a.out.
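Besides the symbolic modes shown above, chmod also accepts numeric (octal) modes, where r=4, w=2, x=1 and one digit each is given for user, group, and other. A self-contained sketch using a scratch file (the filename /tmp/demo.out is made up):

```shell
touch /tmp/demo.out
chmod 700 /tmp/demo.out    # same as chmod u=rwx,go=   -> -rwx------
chmod 755 /tmp/demo.out    # same as chmod u=rwx,go=rx -> -rwxr-xr-x
ls -l /tmp/demo.out
```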
See the online website[46] for more explanation.
[44] http://zzee.com/solutions/unix-permissions.shtml
[45] http://catcode.com/teachmod
[46] http://catcode.com/teachmod
$ ls -al
-rw-------  1 fredf fredf   36 Sep 23  7:53 .bash_profile
-rwxr-xr-x  1 fredf fredf 1098 Mar  7  2005 file1.c
drwxr-xr-x+ 1 fredf fredf    0 Jan  1  2005 src
To change your working (current) directory, use the cd (change dir) command,
$ cd cse220
$ pwd
/home/fredf/cse220
$ ls
assgn02
$ cd ..
$ pwd
/home/fredf
The command cd by itself will always change you back to your home directory. The symbol ~ also always refers to your home directory.
$ cd /usr/local/bin
$ pwd
/usr/local/bin
$ cd
$ pwd
/home/fredf
$ cd /usr/local/bin
$ pwd
/usr/local/bin
$ cd ~
$ pwd
/home/fredf
$ cd cse220
$ cd assgn02
$ pwd
/home/fredf/cse220/assgn02
$ rm ~/file1.c
http://man-wiki.net/index.php/7: glob
$ ls cse220
ls: cannot access cse220: No such file or directory
This last rm command deletes everything in your home directory (all files, all directories, all subdirectories, and all files in all subdirectories). The option -f
means force and will cause even files marked as read-only to be deleted.
To copy all the files in a directory (and its subdirectories) to another directory,
use the -r command line option with cp (-r stands for recursive),
$ cd
$ cp -r src src-backup
$ ls src-backup
file01.c file02.c file03.c file01.o file02.o file03.o
Note that the . is necessary in the command to specify the destination for the file that is being copied. Without it you will get an error message,
$ cp ../dir/file
cp: missing destination file operand after `../dir/file'
Try `cp --help' for more information.
To rename a file in *nix you move it to a new file, e.g., to rename file1.c to
file01.c,
$ mv file1.c file01.c
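mv also moves files: if the last argument is an existing directory, the named files are moved into it. A self-contained sketch (the paths under /tmp are made up):

```shell
mkdir -p /tmp/mvdemo/src
touch /tmp/mvdemo/file01.c
mv /tmp/mvdemo/file01.c /tmp/mvdemo/src   # move the file into the src directory
ls /tmp/mvdemo/src                        # lists file01.c
```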
Suppose you know there is a file named mlocate.db somewhere in the file system but you don't know where. The command to type would be,
$ find / -name mlocate.db
This command tells the find program to start at the root directory (the / part) and search for a file named (-name) mlocate.db.
It should be noted that a *nix installation might have 10,000 or more files in the
file system, so this command could run for quite a long time before it finds the
file. If you have a hunch that the file you are looking for is in a subdirectory below
root, say /usr, then you could narrow down your search by typing,
$ find /usr -name mlocate.db
Now, only the directory /usr and its subdirectories will be searched. To perform a
case-insensitive search on the file use -iname rather than -name.
With cat, the contents of the file will scroll right off the top of the window if the file
is too large. You can use the less command to display a file on the window and
then use the Page Up (or b for backwards) and Page Down (or f for forwards)
keys to scroll through the file.
$ less file1.c
(the contents of file1.c are displayed in the window)
Note that the "tarred" files are not removed. This tar command would create an
archive file named archive.tar which contains the files file1.c and file3. The v
option to tar specifies that the program be verbose and display the files it is archiving as they are being archived. The c option means create an archive file. The
f option means that what follows the options (the word "archive.tar") is to be the
name of the archive file. It is common to give this file a .tar extension. Following
the name of the archive file is a list of one or more filenames (or directories) to be
archived. To tar all of the .c files in a directory,
$ ls src
file01.c file02.c file03.c file01.o file02.o file03.o
$ cd src
$ tar cvf archive-c.tar *.c
file01.c
file02.c
file03.c
$ ls
archive-c.tar file01.c file02.c file03.c file01.o file02.o file03.o
To list the contents of a tar archive to the screen, i.e., to see what is in the archive, use the t option (t stands for table of contents),
$ tar tf archive-c.tar
file01.c
file02.c
file03.c
To extract the files from a tar archive use the x (extract) option,
$ cd
$ mkdir gromulate
$ mv src/archive-c.tar gromulate
$ cd gromulate
$ tar xvf archive-c.tar
file01.c
file02.c
file03.c
$ ls
file01.c file02.c file03.c
The file names will be displayed as each of them is extracted (because of the v option). Tar is often used to archive an entire directory and its subdirectories,
$ cd
$ ls src
file01.c file02.c file03.c file01.o file02.o file03.o
$ tar cvf src.tar src
src/
src/file01.c
src/file02.c
src/file03.c
src/file01.o
src/file02.o
src/file03.o
$ ls
a.out file1.c file3 src src.tar
$ tar tf src.tar
src/
src/file01.c
src/file02.c
src/file03.c
src/file01.o
src/file02.o
src/file03.o
4.16 Compressing Files
To compress a file with gzip,
$ gzip file1.c
This compresses the file and leaves file1.c.gz in the current directory; the original file file1.c is replaced by the .gz file. To decompress the file,
$ gzip -d file1.c.gz
$ ls
a.out file1.c file3 src
This uncompresses the file and leaves file1.c in the current directory; the .gz file
is removed. The program bzip2 is an alternative compression program which
uses a different algorithm than gzip; often compressing with bzip2 will result in a
smaller compressed file than if gzip were used. It works similarly to gzip,
$ bzip2 file1.c
This compresses the file and leaves file1.c.bz2 in the current directory; the original file file1.c is replaced by the .bz2 file. To decompress the file,
$ bzip2 -d file1.c.bz2
This uncompresses the file and leaves file1.c in the current directory; the .bz2
file is removed. To compress every .c file in a directory,
$ cd
$ ls src
file01.c file02.c file03.c file01.o file02.o file03.o
$ cd src
$ gzip *.c
$ ls
file01.c.gz file02.c.gz file03.c.gz file01.o file02.o file03.o
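The replace-in-place behavior described above is easy to verify with a throwaway file (the name and contents here are made up for the sketch),

```shell
# Create a file and compress it; the original is replaced by the .gz file.
printf 'hello compression\n' > sample.txt
gzip sample.txt
ls sample.txt.gz

# Decompress; the .gz file is removed and sample.txt comes back unchanged.
gzip -d sample.txt.gz
cat sample.txt
```

bzip2 and bzip2 -d work the same way on the same file, producing sample.txt.bz2 instead.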
4.17 Hexdump
A hex dump is a display of the contents of a file in hexadecimal format. When
we want to "see inside" a binary file, we often will use a program that produces a
hex dump. In GNU/Linux the program is, aptly, named hexdump. In its simplest
form, you would type hexdump file to view the contents of file in hex. More
commonly, it is useful to use "canonical" mode by typing a -C option, e.g.,
hexdump -C file. Note that hexdump will display the entire contents of the file on
the terminal; you will probably want to pipe the output through less, for example,
hexdump -C file | less. There are many other options you can explore by reading the man page.
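A quick sketch of canonical mode, assuming hexdump is installed (it is on most GNU/Linux systems; the file name is made up). The output line shows the byte offset, the hex bytes, and the printable characters between vertical bars,

```shell
# Write the two characters "Hi" plus a newline: the bytes 48 69 0a in hex.
printf 'Hi\n' > tiny.txt

# Canonical mode: offset, hex bytes, and an ASCII column like |Hi.|
# (non-printable bytes such as the newline show up as a dot).
hexdump -C tiny.txt
```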
5 Getting Help
5.1 Man Pages
Historically the online Unix manual was divided into eight sections:
1. Commands
2. System Calls
3. Library Functions
4. Special Files
5. File Formats
6. Games
7. Miscellaneous Information
8. System Administration
The man (manual) command can be used in *nix to get help on many commands. For example, type man cd to display the "man page" for the cd command. Typing man kill would display information on the kill command from section 1 of the manual. However, if there is also information about "kill" in section 2,
one could type man 2 kill to display that entry (on some systems you may have
to enter man -s 2 kill instead).
When reading man pages you may see a command or phrase followed by a
number in parentheses, such as chmod(2). This tells you that man information
about chmod can be found in section 2.
Man output is displayed with less so you can use the Page Up and Page Down
keys to scroll through the manual page (or use the b and f keys; b for backward; f
for forward). Hit q to get out of less.
5.1.1 Reading a Man Page
Man pages can be difficult to read sometimes because of the complexity of the
commands and the various options and arguments. Nonetheless, if you really
want to become a proficient *nix user, you will have to learn how to read them.
There is no standard format for a man page, but many of them are organized in a
similar format:
Heading       Meaning
Name          name and purpose of the command
Synopsis      syntax of the command; shows whether it accepts options or args
Description   full description of the command; may be quite lengthy (e.g., bash)
Environment   environment variables used by the command
Author        the person or persons who wrote the program
Files         list of important files to this command
Copyright     who holds the copyright to the program
See Also      where to look for related information
Heading       Meaning
Diagnostics   possible errors and warnings
Bugs          known mistakes or shortcomings
For example, let's consider the man page for the cp command.
CP(1)
User Commands
CP(1)
NAME
cp - copy files and directories
SYNOPSIS
cp [OPTION]... [-T] SOURCE DEST
cp [OPTION]... SOURCE... DIRECTORY
cp [OPTION]... -t DIRECTORY SOURCE...
DESCRIPTION
Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.
Mandatory arguments to long options are mandatory for short options too.
-a, --archive
same as -dpR
--backup[=CONTROL]
make a backup of each existing destination file
-b
like --backup but does not accept an argument
--copy-contents
copy contents of special files when recursive
-d
same as --no-dereference --preserve=links
-f, --force
if an existing destination file cannot be opened, remove it and try again
-i, --interactive
prompt before overwrite
-H
follow command-line symbolic links in SOURCE
-l, --link
link files instead of copying
-L, --dereference
always follow symbolic links in SOURCE
-P, --no-dereference
never follow symbolic links in SOURCE
-p
same as --preserve=mode,ownership,timestamps
--preserve[=ATTR_LIST]
preserve the specified attributes (default: mode,ownership,timestamps),
if possible additional attributes: context, links, all
--no-preserve=ATTR_LIST
don't preserve the specified attributes
--parents
use full source file name under DIRECTORY
-R, -r, --recursive
copy directories recursively
--remove-destination
remove each existing destination file before attempting to open it
(contrast with --force)
--sparse=WHEN
control creation of sparse files
--strip-trailing-slashes
remove any trailing slashes from each SOURCE argument
-s, --symbolic-link
make symbolic links instead of copying
-S, --suffix=SUFFIX
override the usual backup suffix
-t, --target-directory=DIRECTORY
copy all SOURCE arguments into DIRECTORY
-T, --no-target-directory
treat DEST as a normal file
-u, --update
copy only when the SOURCE file is newer than the destination file or when
the destination file is missing
-v, --verbose
explain what is being done
-x, --one-file-system
stay on this file system
--help display this help and exit
--version
output version information and exit
By default, sparse SOURCE files are detected by a crude heuristic and the
corresponding DEST file is made sparse as well. That is the behavior
selected by --sparse=auto. Specify --sparse=always to create a sparse
DEST file whenever the SOURCE file contains a long enough sequence
of zero bytes. Use --sparse=never to inhibit creation of sparse files.
The backup suffix is `~', unless set with --suffix or SIMPLE_BACKUP_SUFFIX.
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable. Here are the values:
none, off
never make backups (even if --backup is given)
numbered, t
make numbered backups
existing, nil
numbered if numbered backups exist, simple otherwise
simple, never
always make simple backups
As a special case, cp makes a backup of SOURCE when the force and backup options
are given and SOURCE and DEST are the same name for an existing, regular file.
AUTHOR
Written by Torbjorn Granlund, David MacKenzie, and Jim Meyering.
REPORTING BUGS
Report bugs to <bug-coreutils@gnu.org>.
COPYRIGHT
Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL
version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
The full documentation for cp is maintained as a Texinfo manual.
If the info and cp programs are properly installed at your site, the command
info cp
should give you access to the complete manual.
The following sections are present in this man page: Name, Synopsis, Description, Author, Reporting bugs, Copyright, See Also.
The Name section is very simple: it states that the name of the command is cp
and that it performs the action, "copy files and directories."
The Synopsis section states that there are three different "forms" of the cp command. The brackets indicate that an item is optional and the ellipsis indicates that
the preceding item may be repeated (as in one or more). Therefore each of these three forms of the cp
command starts with the word "cp" followed by zero or more options.
The first form (cp [OPTION]... [-T] SOURCE DEST) states that the options
can be followed by a -T option and then a source (SOURCE) and a destination
(DEST). To know what -T does you have to read about it in the Description section. There it tells us that -T tells cp to treat DEST as a normal file and not as a
destination directory. Thus we are copying SOURCE to a file and not to a directory. What is SOURCE? It's a single file. An example of this command would be cp
-T afile ../someotherfile. Note that if ../someotherfile is a directory, you will get
an error message.
The second form (cp [OPTION]... SOURCE... DIRECTORY) states that multiple source files can be copied to a single directory. Note that the directory must
be the last argument on the command line. An example of this command would
be cp file1 file2 file3 somedir.
The final form (cp [OPTION]... -t DIRECTORY SOURCE...) requires the -t
option. Looking in the Description section we see that -t means, "copy all
SOURCE arguments into DIRECTORY". The target directory is specified following the -t option, and the list of one or more source files is at the end of the
command line. An example of this command would be cp -t ../../somedir file1
file2 file3.
Next is the Description section. This is usually the lengthiest section in a man
page. Among other things, the Description section will provide a description of
each of the options. We're not going to go through all of them here, but let's just
look at a couple of them.
The -i (or --interactive) option will prompt you before it overwrites the destination
file, e.g.,
$ cp -i file1 file2
cp: overwrite `file2'?
At this point you hit 'y' or 'n' to either allow or disallow the overwrite.
The -u (--update) option will copy the source to the destination only when the
source is newer or when the destination is missing.
The Author section is self-explanatory. Note that if you find a genuine bug in the cp
program, you are told whom to contact in the Reporting Bugs section. Finally, the
See Also section tells you that there is more information about cp in info (described in Section 5.3).
5.2 Apropos
Sometimes you know what you would like to do but you are not sure what command to use. For example, suppose you want to rename a file but you don't know
the command. You could type man -k rename and man would display a list of
manual pages with a one-line description that contains the word "rename". You
could look through that list to see if there is an entry for a command that renames
files. The apropos foo command is equivalent to man -k foo. Note that apropos
is pronounced "ap-pro-po" and not "ap-pro-pose".
5.3 Info
Info is an online help system, separate from the *nix manual, that is used to document GNU utilities. To get help on the bzip2 command you would type info
bzip2. Where info differs from man is that info documentation is organized into a
hierarchical document of nodes (it's a tree structure) and the user uses keypresses to navigate around the document. Info is, like the Emacs editor, incredibly complicated to use at first. Important info commands are summarized below.
General Commands
q       quit
z       start help tutorial
?       display command summary
^x 0    get out of the command summary window pane (that's a zero)

Reading a Node
Pg Dn   display next screenful of text
Pg Up   display previous screenful of text
b       jump to first line of text in node
Up      move cursor up one line of text
Down    move cursor down one line of text
Left    move cursor one char to the left
Right   move cursor one char to the right

Moving Around Nodes
n       jump to next node in document
p       jump to previous node in document
t       jump to the top node in the document
u       move up from this node to the parent node
Home    move to top line of node
End     move to bottom line of node
Hyperlinks to other nodes are of the form * node:. For example, shown below is
part of the info node for the bzip2 command. The first hyperlink in this node is *
Menu:. The cursor is currently in column one of row 1. Pressing tab will move the
cursor to the Menu link. Pressing Return, then, would display the "Menu" node.
48. http://tldp.org
49. http://www.gnu.org/software/bash/manual/bashref.html
50. http://www.computerhope.com/unix.htm
51. http://gcc.gnu.org
52. http://gcc.gnu.org/onlinedocs
In the second set command, the output is being piped through the less program.
We will discuss piping in more detail in Chapter 8. To change the value of an environment variable, use varname=value,
$ MYNAME=Kevin
$ set | grep MYNAME
MYNAME=Kevin
Setting an environment variable at the Bash command prompt only changes the
value of the environment variable for this session. To make the change permanent (so the variable will be defined when you log in again), edit your .bashrc file
in your home directory, and add the lines,
MYNAME=Kevin; export MYNAME
PATH=/sbin:$PATH:.; export PATH
The special file .bashrc contains commands that are executed whenever a "new
environment" is created, i.e., whenever a new shell is started [54]. The next time
you log in the environment variables will be updated. To delete an environment
variable, use the unset command, e.g., unset MYNAME.
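Setting, exporting, and deleting a variable can be sketched in one short session (the variable name is just for illustration; exporting is what makes the value visible to child processes such as programs you run),

```shell
# Define and export a variable.
MYNAME=Kevin
export MYNAME

# Exported variables are inherited by child processes:
sh -c 'echo $MYNAME'     # prints: Kevin

# Delete it again.
unset MYNAME
```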
54. The file .bash_profile contains commands that are executed when you log in to the *nix system. It differs from .bashrc in that the commands in .bash_profile are only executed once, during log in. On the other hand the commands in .bashrc are executed by the shell every time a new shell is started.
55. http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_02.html
56. http://tldp.org/HOWTO/Bash-Prompt-HOWTO
    http://tldp.org/HOWTO/Bash-Prompt-HOWTO/x329.html
7 Searching Files
It is very common to want to search a file for a certain string. The grep (global regular expression print) command [57] can be used to do this,
$ cd
$ grep MYNAME .bash_profile
MYNAME=Kevin
The string you are searching for is typed first and then you type the filename of
the file in which you wish to search. To search for the string in every file of the
current directory try grep string * (or grep string .* to include hidden files).
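A quick sketch of both forms, searching one file and then several (the file names and contents are made up for the illustration). When more than one file is searched, grep prefixes each matching line with the filename,

```shell
# Build two throwaway files to search.
printf 'alpha\nbeta\n'  > notes1.txt
printf 'beta\ngamma\n'  > notes2.txt

# Search one file; each matching line is printed.
grep beta notes1.txt          # prints: beta

# Search several files; matches are prefixed with the filename.
grep beta notes1.txt notes2.txt
```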
Grep can be used for more than simply searching a file or group of files for a
fixed string. Grep allows searching files for patterns which are specified using a
particularly obtuse syntax. We are not going to discuss grep patterns in this
course, but remember there is a lot of good online documentation that can be
used to learn more. For example, see [58] or [59].
57. http://en.wikipedia.org/wiki/Grep
58. http://www.gnu.org/software/grep/doc/grep.html
59. http://www.panix.com/~elflord/unix/grep.html
The > symbol is used to send cat's output to the file fooie rather than to the console window. Note that we have just made a copy of file01.c. The < symbol is
used to redirect stdin. Consider a C program that reads in two integers and prints
out the larger one.
/* larger.c */
#include <stdio.h>
int main() {
    int n1, n2;
    fprintf(stdout,"Enter an integer: "); fscanf(stdin,"%d",&n1);
    fprintf(stdout,"Enter an integer: "); fscanf(stdin,"%d",&n2);
    fprintf(stdout,"The largest integer was the ");
    if (n1 > n2) fprintf(stdout,"first one: %d\n",n1);
    else fprintf(stdout,"second one: %d\n",n2);
    return 0;
}
Since this program reads its input from stdin and writes its output to stdout, we
can redirect its input and output so it reads its input from a file and sends its output to a file. For example, suppose we create a text file named larger.in which
has two lines, the first with a 2 and the second with a 4. The following command
will execute our program and send its output to larger.out.
$ cat larger.in
2
4
$ larger < larger.in > larger.out
$ cat larger.out
Enter an integer: Enter an integer: The largest integer was the second
one: 4
This is a commonly used quick-and-dirty way to create a small, simple text file
without invoking a text editor.
Note that < is really 0< and > is really 1>, so the above command is equivalent
to,
$ larger 0< larger.in 1> larger.out 2> larger.err
Stderr is not used in C and C++ programs as much as stdout is. The idea with
stderr is that error messages, which are special in that they are not part of the
normal program output, can be sent to a different location than the normal output. For example, when running a program, you might send stdout to an output
file and stderr to the console window,
$ blobx < data.in 1> data.out
Since the normal connection for stderr is to the console window, we only need to
redirect stdout to the output data file. Any error messages printed to stderr will
appear on the console window.
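Since blobx is a stand-in for any program, the behavior is easy to demonstrate with a one-line command that writes to both streams (the file names here are made up). Each stream can be redirected independently,

```shell
# A stand-in for "blobx": one line to stdout, one line to stderr.
sh -c 'echo normal output; echo an error >&2' 1> data.out 2> data.err

cat data.out     # prints: normal output
cat data.err     # prints: an error
```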
Anything written to stdout within the program will be redirected to /dev/null and
will not be displayed. Stderr can also be redirected to /dev/null.
red blue
Here, the contents of the files file1, file2, and file3 have been concatenated together when creating the new file file4. Now you can see where the name cat
came from (it is short for concatenate).
The drawback is that you have to specify the same filename twice. To send
stdout and stderr to the same file without having to specify the filename twice,
use 1>&2 or 2>&1,
$ blobx 1> data.out 2>&1
The notation 2>&1 tells the shell to send stderr's output (that is, 2>'s output) to
wherever stdout (that is 1>) is going. Note that if 2>&1 is used then 1> must appear before 2>&1. If 1>&2 is used (to send stdout to wherever stderr is going)
then 2> must appear before 1>&2. If you do not do that, then something else
happens. For example, $ blobx 2>&1 1> file.err would send stdout to file.err
and stderr would go to wherever stdout normally goes, i.e., the console window.
The notation >> is equivalent to 1>>, and the notation 2>> can be used to append stderr's output to a file.
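The order-sensitivity of 2>&1 is worth seeing once with a concrete command (again using a one-liner as a stand-in for blobx; the log file names are made up),

```shell
# Correct order: redirect stdout to the file first, then point stderr at it.
sh -c 'echo to-stdout; echo to-stderr >&2' 1> both.log 2>&1
grep -c 'to-' both.log        # prints: 2  (both lines landed in both.log)

# Reversed order: 2>&1 is resolved BEFORE stdout is redirected, so stderr
# goes to the console and only the stdout line reaches the file.
sh -c 'echo to-stdout; echo to-stderr >&2' 2>&1 1> only-stdout.log
```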
8.8 Pipes
To connect the stdout of one program to the stdin of another program, in *nix we
use a pipe (the output flows out one program through the pipe to the input of the
other program). The pipe symbol is | and the way to send the output of one program to the input of another program is program1 | program2. For example, to
display a long listing of the files in a directory and sort them based on the time
the file was last modified,
$ ls -l | sort +7
The program sort sorts its input (which is a text file) and sends the sorted output
to stdout. The option +7 tells sort to sort the input file based on the data in field 7
which in the ls -l output is the column which contains the timestamp. To sort on
size try ls -l | sort + 4. To display the contents of a directory and display the output using less,
$ ls -l | less
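The general pattern, stdout of one program feeding stdin of the next, works for any pair of programs, and pipes can be chained. A small self-contained sketch (the fruit names are just sample data),

```shell
# sort reads from stdin when no file is given, so it can sit after a pipe.
printf 'pear\napple\nquince\n' | sort
# apple
# pear
# quince

# Pipes chain: count the lines of the sorted output with wc -l.
printf 'pear\napple\nquince\n' | sort | wc -l    # prints: 3
```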
9 Processes
When you run a program in *nix, the OS loads it into memory and sets up some
other information concerning your program. The loaded and running program is
called a process. When the program terminates, so does the process.
Every process has a system-wide unique identifier called the process id, or pid;
it is just an integer. The process is owned by the user who launched it. The ps
command will show you a list of the running processes of which you are the
owner.
$ ps
  PID TTY          TIME CMD
17776 pts/18   00:00:00 bash
19041 pts/18   00:00:00 ps
I am running two programs right now: bash and the ps program itself. The PID of
each process is shown. The time column is the cumulative amount of CPU time
the process has consumed; this is not the same as wall time, i.e., how many
seconds it has been since you ran the program.
To see all of the processes running on the system, type ps -A or ps -e. The
command ps -f will display full process information including the user id of the
process owner. It can be combined with -A or -e. To sort this list by user id and
view it in less, use ps -ef | sort | less.
When you run a program (i.e., a process) you can ask the OS to run it in the
background. Do this by putting a & on the end of the command line,
$ some-program &
$ now-you-can-type-another-command
Now some-program will run in the background and you can get back to work
typing commands. To bring a running program back into the foreground, type the
fg command,
$ fg
Now some-program would be running and if it is displaying output, the output
would be going to the console window. To put the program back into the background, hit Ctrl+Z to pause it, and then type bg to send it back to the background.
To see a list of your processes running in the background use the jobs command. If you have more than one background job they will be numbered with a
job number 1, 2, 3, .... To bring job number 3 to the foreground, type fg %3. To
send this job back to the background, hit Ctrl+Z, and type bg.
To terminate a runaway process, use the kill command. The syntax is kill pid. You cannot kill processes that you do not own.
To kill a process running in the background, issue a jobs command to see your
running processes and determine the job number. Then use kill %job-number
to kill the job; e.g., kill %3 to kill job number 3.
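Killing by pid can be sketched in a script-friendly way (job numbers like %3 only work in an interactive shell; $! holds the pid of the most recent background command),

```shell
# Start a long-running process in the background and capture its pid.
sleep 60 &
pid=$!

# Terminate it by pid.
kill $pid

# Reap it; wait returns nonzero here because the process was killed.
wait $pid 2>/dev/null || true
```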
The top command will display a table of the processes which are currently using
the most CPU time. It is a highly configurable program; try man top. (Be aware
that top is a fairly resource-intensive program; it's not cool or nice to run top a
lot).
The nice command was designed to allow a user to run a program with lower
than normal priority, so it will not hog the CPU as much. The idea is that other
people may be peeved if you're sucking up too much CPU time, so you can be
nice about it and still run your program. The syntax is nice -n adjust command
where command is the program you want to run (e.g., it could be a big make
process). Adjust is an integer which determines the adjusted process priority:
-20 is highest, to 19 lowest. No one ever uses nice.
10 Editing Files
10.1 Absolutely Painless Introduction to Vim
The VI editor [60] is the standard editor available on most *nix systems. Some
people who know it very well, love it. I don't know it very well; I don't love it. I've
never taken the time to learn it very well because I would rather gouge my eyes
out with infected hypodermic syringes found in the dumpster behind the plasma
donation center until bloody pus oozes out of them than learn VI [61]; I simply do
not have the time for it. Nonetheless, learning a bit of VI can be useful; if nothing else, you may find yourself in a situation where it is the only editor that is
available on the system you are on. At the least, knowing how to load a file,
make some simple changes to it, and save it can be useful.
The version of VI found on most GNU/Linux systems is Vim (VI Improved) [62].
Purdue University has a good VI tutorial [63]. Also see [64].
10.1.1 Starting VI
At the Bash command prompt, type vim or vim some-filename. VI is a modal
editor in which you are either in editing mode or command mode.
10.1.2 Getting out of VI
This is the most important thing you need to know about VI: how to get out of it. If
you are in editing mode, hit ESC to enter command mode. Enter :wq to write
your file and quit. To quit without writing, enter :q (this only works if you have
made no changes). If you have made changes and really want to quit without
saving them, enter :q!. To save the file under a different name, try :wq new-filename.
10.1.3 Switching Between Editing Mode and Command Mode
If you are in editing mode, hit ESC to enter command mode. If you are in command mode hit i to enter insert (editing) mode. If you cannot tell which mode you
are in, trying hitting ESC several times to make sure you are in command mode,
then type :set showmode. This may tell VI to let you know when you are in insert mode (it depends on the version of VI you are using).
10.1.4 Other Useful Settings
All of these are recommended in addition to :set showmode. In command mode,
type :set nocompatible to enable advanced VIM features. Type :set ruler to tell
VI to display your current cursor position. Type :set number to display line numbers.
60. http://en.wikipedia.org/wiki/Vi
61. It would be preferable if I were also on fire and being eaten alive by rabid hyenas at the same time. That would still be more pleasurable than using VI.
62. http://en.wikipedia.org/wiki/Vim_(text_editor). It's really not improved because it's still VI.
63. http://engineering.purdue.edu/ECN/Support/KnowledgeBase/Docs/20020202121609
64. http://thomer.com/vi/vi.html
65. http://en.wikipedia.org/wiki/Editor_wars
66. http://en.wikipedia.org/wiki/Ed_(Unix)
11.1 Variables
Shell variables are used to store values. A variable can be defined at the same
time it is assigned by the = operator, e.g.,
LONG_NAME="Flintstone Fred"
SHORTNAME=Fred
Note in the first case that quotation marks are required because the string contains spaces. The backquote operator will execute the enclosed command and
assign the output to the designated variable, e.g.,
LS_OUT=`ls`
When referring to a variable, its name must be preceded by $, e.g., X1=$Y2.
Here we are assigning the value of Y2 to X1. Note that Y2 is preceded by $ whereas X1 is not.
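The assignment, reference, and backquote rules above can be sketched in one short session (the variable names and values are made up),

```shell
# No spaces around =; quotes protect a value containing spaces.
LONG_NAME="Flintstone Fred"
SHORTNAME=Fred

# $ retrieves a variable's value; the left-hand side has no $.
X1=$SHORTNAME
echo $X1                 # prints: Fred

# Backquotes run the enclosed command and capture its output.
GREETING=`echo hello`
echo $GREETING           # prints: hello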
67. http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
68. http://tldp.org/LDP/abs/html
Bash's arithmetic operators include + - / % ~ >> and its comparison operators include < <= > >= !=.
Variables can be assigned integer values using normal assignment, or using let,
e.g.,
x=33
let y=12
let z = 95    # error
The last statement will cause the shell script to fail; there should be no spaces to
the left and right of the = operator. Expressions are evaluated by enclosing them
in $[ ], e.g.,
x=33
let y=12
z=$[x/y]
echo $z
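The $[ ] form still works in Bash but has been deprecated for years; the portable POSIX spelling is $(( )). A minimal sketch with the same values,

```shell
x=33
y=12
z=$((x / y))     # integer division: 33 / 12
echo $z          # prints: 2
r=$((x % y))     # remainder: 33 mod 12
echo $r          # prints: 9
```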
#!/bin/bash
# usage: rmo
mv tmp
rm *.o
$ ./rmo.sh
./rmo.sh: Permission denied
The shell script failed to run because shell scripts are not executable files by default. To make the shell script file executable, you must change the permissions
using the chmod command to add x,
$ ls -l rmo.sh
-rw-r--r-- 1 kburger2 kburger 122 Oct 17 11:35 rmo.sh
$ chmod 744 rmo.sh
$ ls -l rmo.sh
-rwxr--r-- 1 kburger2 kburger 122 Oct 17 11:35 rmo.sh
$ ./rmo.sh
Here dirname and pattern are variables. Note that when the variable is assigned
to (using =), we do not put a $ in front of the variable name, but we do use a $
when referring to the variable later on. If you don't, the shell will think that the variable name is the filename of a program to be executed and it will probably fail
with an error message.
11.5 Aliases
An alias is another name for something. In Bash, you can define aliases for
commands using the syntax alias name=command,
$ alias rmwild="./rmwild"
$ rmwild tmp *.o
$ alias gohome="cd ~"
$ pwd
/usr/bin
$ gohome
$ pwd
/home/fredf
Aliases defined on the command line only exist for the current Bash session. If
you want the alias name to be available every time you log in, define the alias in
your .bashrc file.
To delete an alias that was defined at the command prompt, use the unalias
command, e.g., unalias rmwild. To delete all of your aliases, type unalias -a.
mv $1.tar.bz2 mybackups
echo "$1 backed up."
$ bu src
src backed up.
$ ls mybackups
src.tar.bz2
Or
#!/bin/bash
DOW=$(date +%a)
MONTH=$(date +%b)
DAY=$(date +%d)
YEAR=$(date +%Y)
echo "Today is ${DOW} ${MONTH}-${DAY}-${YEAR}"
$ alias today="./today.sh"
$ today
Today is Wed Oct-17-2007
11.10 If Statements
The syntax of the if statement is,
if [ conditional-expression ]; then
some commands
fi
or
if [ conditional-expression ]; then
some commands
else
some other commands
fi
The character "[" actually refers to a built-in Bash command; it is a synonym for
another command named test. The result of the conditional expression is an exit
status, with 0 meaning true (success) and nonzero meaning false. Some examples,
Use VI to edit if.sh
--------------------------------
today=0
if [ $today = 1 ]; then
    echo "today = 1"
else
    echo "today <> 1"
fi
--------------------------------
$ chmod 755 if.sh
$ ./if.sh
today <> 1
The test command has command line options for checking the status of files,
e.g.,
test -d file    true if file exists and is a directory
test -e file    true if file exists
test -r file    true if file exists and is readable
test -s file    true if file exists and has a size greater than zero
test -w file    true if file exists and is writable
test -x file    true if file exists and is executable
In a shell script, the ! symbol is used as the NOT logical operator, e.g.,
if [ ! -d src ]; then
echo '"'src'"' does not exist, creating directory.
mkdir src
fi
To learn more about conditional expressions I suggest you try $ man test.
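The file-status tests combine naturally with if; a self-contained sketch (the directory and file names are made up for the illustration),

```shell
# Create a directory and an empty file so the tests have something to inspect.
mkdir -p playpen
touch playpen/data.txt

if [ -d playpen ]; then echo "playpen is a directory"; fi
if [ -e playpen/data.txt ]; then echo "data.txt exists"; fi
if [ ! -s playpen/data.txt ]; then echo "data.txt is empty"; fi
```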
The syntax of the for statement is,
for var in list; do
    some commands
done
Note the semicolon; it is important not to omit it. The commands are executed
once for every item in the list. The current item is accessed through the variable
$var, e.g.,
Use VI to edit for.sh
--------------------------------
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo -n "$i "
done
echo
--------------------------------
$ chmod 755 for.sh
$ ./for.sh
1 2 3 4 5 6 7 8 9 10
The option -n to echo means do not print a newline character after outputting the
text.
Another example,
for i in $(ls); do
echo $i
done
This script displays the name of each of the files in the current directory, one per
output line. Another example,
Use VI to edit ls-dirs.sh
--------------------------------
for i in *; do
if [ -d "$i" ]; then
echo "$i [dir]"
else
echo $i
fi
done
--------------------------------
$ mkdir foo1
$ mkdir foo2
$ chmod 755 ls-dirs.sh
$ alias ls-dirs=./ls-dirs.sh
$ ls
foo1 foo2 for.sh if.sh ls-dirs.sh
$ ls-dirs
foo1 [dir]
foo2 [dir]
The seq command (type man seq for help) can be used to generate an integer
sequence. The syntax is,
seq first increment last
which will generate an integer sequence starting at first, incrementing by increment, and stopping at last. For example, here is a loop which displays the odd
integers from 1 to 99,
for i in `seq 1 2 100`; do
echo -n $i " "
done
Note that the seq command must be enclosed in backquotes so the shell
will execute it to generate the desired output. This loop prints the same odd
integers in reverse order,
for i in `seq 99 -2 1`; do
echo -n $i " "
done
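The loop's output can be captured with command substitution, which makes it easy to check; a short sketch using a smaller range (printf is used instead of echo -n so the sketch also works in plain sh),

```shell
# Collect the odd integers from 1 to 9 into a variable.
odds=$(for i in `seq 1 2 9`; do printf '%s ' "$i"; done)
echo "$odds"      # prints: 1 3 5 7 9
```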
69. http://tldp.org/LDP/abs/html/loops1.html
$ cat testsum.sh
#!/bin/bash
v1=12
v2=39
v3=`expr $v1 + $v2`
echo $v3
$ ./testsum.sh
51
$
Note that the expr command must be enclosed in backquotes `` for this to
properly work. Typing man expr will display a help screen about the expr command.
The expr substr command can be used to extract a substring from a string variable, e.g., this program prints the individual characters of a string stored in a variable v,
v="foobar"
len=`expr length $v`
for i in `seq 1 1 $len`; do
echo -n `expr substr $v $i 1`
done
echo
The general form is expr substr string pos len, which will extract len characters
from the string string starting at position pos. Note that the first character of the
string string is at position 1.
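A compact sketch of expr length and expr substr (note that length and substr are GNU extensions to expr, so they may not exist on non-GNU systems),

```shell
v=foobar
len=`expr length $v`
echo $len                    # prints: 6

# Extract 3 characters starting at position 4 (positions start at 1).
mid=`expr substr $v 4 3`
echo $mid                    # prints: bar
```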
In this course we will focus on learning to use the GCC C/C++ compilers, GNU
Make, and GDB. The compilers and GNU make will be discussed in this chapter.
The GNU debugger GDB is discussed in the next chapter.
The simplest way to invoke the compiler is gcc file.c, where file.c is the filename
of the C program source code file. This command will
compile the code and link it with the C libraries to create an executable file. This
70. http://en.wikipedia.org/wiki/Front_panel
71. http://en.wikipedia.org/wiki/GNU_toolchain
72. http://gcc.gnu.org/onlinedocs
executable will typically be named a.exe in Cygwin and a.out in *nix. To run the
program,
$ ./a.out
Then log out, and log back in. This line appends "." to the end of the list of directories already set in the PATH environment variable. The shell will examine the
directories in the order they are listed in the path when you type a command. It
will execute the program from the directory in which it first finds the program executable file. Hence, because "." appears last, it will only run the program in the
current directory if it is not present in any of the other directories in the PATH.
Now to execute your program you can type:
$ a.out
73. http://en.wikipedia.org/wiki/Command_line_argument
The output above tells me that the version of the GCC C compiler I have installed
is 3.4.4, and that it was compiled for the Cygwin/Mingw platform (I am using
Cygwin right now).
12.1.2.2 -ansi
Tells GCC to compile the C code as if it conforms to ANSI Standard C. This is
equivalent to the command line option -std=c89. Code that does not conform to
ANSI Standard C will be rejected as syntactically incorrect.
12.1.2.3 -Dname[=def]
This option defines a macro named name. These macros are interpreted by the
C preprocessor74. One common use of preprocessor macros is to include code
that is conditionally compiled only if the macro is defined. For example,
/* file.c */
#include <stdio.h>

/* init(), process(), and terminate() are assumed to be defined elsewhere */
int main() {
#ifdef DEBUG
    printf("Calling init()\n");
#endif
    init();
#ifdef DEBUG
    printf("Calling process()\n");
#endif
    process();
#ifdef DEBUG
    printf("Calling terminate()\n");
#endif
    terminate();
    return 0;
}
$ gcc -DDEBUG file.c
If we compile with DEBUG defined, then the printf() statements will be compiled
as well, so when we run the program, the debugging printf() statements will be
executed. If we compile with DEBUG not defined, then the printf() statements will
not be compiled. This is a handy way to put debugging printf() statements in the
program and decide at compile time if we want them to be output or not.
As another example, consider writing C code for a program which depends on
the endianness75 of the bytes in an integer. Generally in the Intel x86 world, multibyte integers are stored in little-endian format; Motorola's PowerPC architecture uses big-endian. In order to make our C code portable (i.e., so it will run on
either an x86 system or a PowerPC system), we might do something like the following:
74. http://en.wikipedia.org/wiki/Preprocessor
75. http://en.wikipedia.org/wiki/Endianness
/* file.c */
...
#ifdef ARCH_X86
/* little-endian specific code here */
#elif defined(ARCH_PPC)
/* big-endian specific code here */
#endif
...
$ gcc -DARCH_X86 file.c
12.1.2.4 -g
This option tells the C compiler to include additional debugging information in the
executable file so we can debug the program using GDB. If you do not compile a
file with -g, you can still use GDB to debug the program, but symbolic information
(names and types of variables, line numbers, etc.) will be missing.
12.1.2.5 -o file
Put the output in file. For example,
$ gcc file.c -o file
$ ./file
This command will compile and link the source code file file.c and will name the
executable file file rather than a.out.
12.1.2.6 -S (capital)
Stops the compiler after the assembly code generation phase, e.g.,
$ gcc -S file.c
This command will convert the code in file.c into functionally equivalent assembly language in the file named file.s. It will not assemble, link, or produce an
executable. This is a handy way to help yourself learn assembly language: write
some C code, compile it to assembly code, and study the assembly code to see
how it is implemented.
12.1.2.7 -c
Compile only, do not link.
$ gcc -c file.c
This command will compile file.c and produce the output file file.o. However, it
will stop at that point and not attempt to link to produce an executable.
Normally, when you make a project containing multiple C source code files, you
will compile all of the source code files individually using -c and compile the main
source code file last without using -c. We will discuss this in more detail when we
talk about makefiles later in this chapter. For example, suppose we have a
project with four C source code files: file1.c, file2.c, file3.c, and file_main.c:
$ gcc -c file1.c                                      produces file1.o
$ gcc -c file2.c                                      produces file2.o
$ gcc -c file3.c                                      produces file3.o
$ gcc -o file file_main.c file1.o file2.o file3.o     produces executable named file
12.1.2.8 -w
Inhibits all warnings. Warnings are messages from the compiler that something
about the code it is compiling seems fishy but it is not syntactically incorrect. The
canonical example would be,
if (a = b) {
...
}
This is perfectly legal C, but often it is a mistake because the programmer really
intended to write a == b. Some compilers will issue a warning message when
they compile this code to let the programmer know that what the compiler is seeing may not be what the programmer expected.
If you compile with -w these sorts of warning messages will not be displayed. In
general, warning messages are a good thing, and your code should compile successfully without any errors or warnings. If you do get warnings from the compiler
you should fix them. For warnings which are issued by the compiler for code
which is actually okay, pragmas76 or __attribute__ annotations77 should be used to inform the
compiler that what you wrote is what you intended.
12.1.2.9 -Wall
Turns on all warning checks. I highly advise you use this option and check all of
the warning messages generated by using it. This may be helpful in tracking
down difficult-to-find bugs which are the result of syntactically correct code.
12.1.2.10 Optimization Options
-O0 (i.e., dash capital-oh zero) turns off all optimization. This is the default. -O1,
-O2, and -O3 successively increase the amount of optimization that the compiler
performs on the final machine language code. -Os optimizes for size over speed.
76. http://gcc.gnu.org/onlinedocs/cpp/Pragmas.html
77. http://unixwiz.net/techtips/gnu-c-attributes.html
12.1.2.11 -Idir
(That is: dash capital-eye followed by dir). Add dir to the list of directories
searched for included header files. By default the GCC system directories and
the current directory will be searched.
12.1.2.12 -M and -MM
Useful for generating dependencies for makefiles. We will discuss this more
when we discuss makefiles.
12.1.2.13 -std=standard
Treats the code as conforming to C/C++ standard standard.
-std=c89
-std=c99
-std=gnu89
-std=gnu99
-std=c++98
-std=gnu++98
$ g++ -Ddbg=YES -I./include -L./lib -o foo -O3 -w foo.c bar.cpp baz.cpp
This command does all of the following. Compile using the g++ compiler. Define the macro dbg to YES. Search the directory include located under the current directory for header files. Search the directory lib located under the current directory for library files. Send the output
executable to the file named foo. Optimize the heck out of the code. Disable all
warnings. And compile the files foo.c, bar.cpp, and baz.cpp.
Note that C++ programs commonly have an extension of .cc, .cpp, or occasionally .c++. C++ header files are usually either .h or .hpp or do not have a filename extension.
78. http://www.faqs.org/docs/Linux-HOWTO/Program-Library-HOWTO.html
79. http://en.wikipedia.org/wiki/Executable_and_Linking_Format
The option r to the ar command means insert the named file(s) into the archive,
replacing any files already in the archive with the same filename. The c option
tells ar to create the archive without complaining if it does not already exist. The
s option writes an index of the object files contained in the archive. There are additional command line options, but these are sufficient to create a simple archive.
For more help, type man ar at the $ prompt.
The major disadvantage of static libraries is that each executable file that is produced by linking with the library will end up with identical copies of the object
code linked from the library. For example, in libm.a there is a function named
sin() that can be used to obtain the sine of an angle given in radians. If Bob
compiles a program executable named foo which uses sin() by linking with
libm.a then Bob's executable file will contain the machine language code for the
sin() function. If Alice does the same thing, then her program executable will also
contain the same machine language code.
Now consider when Bob and Alice both run their programs at the same time.
Each program will be loaded into memory with two identical copies of the machine language code for sin() taking up room in memory. This is not a very efficient allocation of resources.
12.3.2 Shared Libraries
A shared library is designed to overcome the major problem with static libraries.
When Bob links his code against the shared library, rather than including the machine language code for the sin() function in Bob's executable, a reference to the
machine language code will be inserted. This reference does not take up much
space. Alice's program will end up with the same sort of reference.
Now suppose Bob runs his program first. When it is loaded into memory, the OS
will check to see if the machine language code for the sin() function is loaded. If
it is not, it will be loaded. Then Bob's program code will be modified to point to
the address of the sin() function so that when Bob's program makes a call to
sin() it will jump to that address and execute the code of the shared sin() function. Alice's program will work similarly.
Shared libraries have a more complex naming scheme than static libraries. We
won't get into it in this course, but if you are interested, I would recommend reading the FAQ referenced at the beginning of this section.
The gcc command can be used to compile files and create a shared library. Also,
as with static libraries, we can link our code with existing shared libraries. In fact,
this happens automatically behind the scenes when we compile C programs calling common library functions such as printf(); gcc handles this for us.
Note that gcc already looks in /lib for library files by default so we do not need to
use -L/lib on the command line to tell gcc to look there. Note also that the syntax
-lm means link with the library named libm.a. In general, the syntax -lfoo means
link with the static library named libfoo.a (gcc will look in /lib and /usr/lib). Finally,
note that because the linking stage comes after the compilation stage in the build
process, the library must be listed after the C source code file.
It is an error to do this,
$ gcc -lm myprogram.c -o myprogram
The correct command is,
$ gcc myprogram.c -o myprogram -lm
Most of the standard *nix libraries are found in the /lib and /usr/lib directories,
e.g., /usr/lib/libc.a is the Standard C library, which is linked automatically so we
do not need to include -lc on the command line of the gcc command.
Many Standard C library header files are in /usr/include, e.g., /usr/include/stdio.h.
Use #include <> for system libraries. The compiler will know to look in /usr/include and its subdirectories for the header file.
On the other hand, using #include "" will tell the compiler to look in the current
directory and any directories specified with the -I command line option for the
header file. In general, <> is used for system header files, and "" for your own
header files.
12.4 Makefiles
Most software projects of a reasonable size consist of multiple source code files;
very large projects may contain thousands. Consider the sizes of projects such as
Debian 4.0, Windows Vista, and Mac OS X 10.4, each of which comprises many
millions of source code lines. The project may also include libraries (prebuilt as
well as built) and many other types of resource files.
Dependencies: One source code file (A) may depend on another source code
file (B). Changes to B may require A to be recompiled. In GNU terminology, dependencies are called prerequisites.
One of the problems with large software projects containing hundreds or thousands of files is that we don't want to recompile every source code file every time
we do a build. Therefore, some automated way of recompiling only those source
code files that need to be recompiled is desired.
12.4.1 Building a Project the Hard Way
Now the hard way to build a multifile program would be,
$ gcc -c file1.c
wait
$ gcc -c file2.c
wait
$ gcc -c file3.c
wait
...
$ gcc -c file100.c
wait
$ gcc file1.o file2.o ... file100.o -I./include -o myprog -L./lib -lmylib
$ ./myprog
Oops, found a bug. Edit file4.c and correct the bug. Start typing again. Sigh.
12.4.2 Building a Project with a Script File
A slightly better approach would be to put all of these commands in a shell script:
#!/bin/bash
#File: b.sh
gcc -c file1.c
gcc -c file2.c
gcc -c file3.c
...
gcc -c file100.c
gcc file1.o file2.o ... file100.o -I./include -o myprog -L./lib -lmylib
$ alias b=./b.sh
$ chmod 755 b.sh
$ b
You would know the build is complete when the $ prompt comes back. Note that
you could send the build to the background,
$ b&
$ tetris
Problems with this approach? Suppose our program grows to contain 1000 files.
Every time a new source code file is added to the project, someone has to edit
the b.sh script to add the new file. Or, every time a source code file is removed
from the project, someone has to edit the b.sh script to delete the file. Or, every
time we rename a source code file... I hope you get the idea.
Furthermore, there is nothing in the script file that determines if a source code file
really has to be recompiled. Note that the only time a source code file must be
recompiled is when its text is modified. Recompiling 100 files over and over
every time we perform a build, when only one or two of those files have changed,
is simply too inefficient (unless you're on a contract and getting paid by the hour).
Surely there has to be a better way.
12.4.3 Introduction to Make
Make is a common *nix utility that was designed to automate the build process.
According to Wikipedia81, the first version was created at Bell Labs in 1977.
Make has evolved over the years and there are many variants of it (e.g., Apache
ANT and Maven for Java programmers). GNU Make82 is the GNU project's version of make (online documentation83).
Make is an extremely complicated program and would require us to spend much
more time than we have to fully discuss it. Therefore, we will focus on a very
simplified make process to give you a feel for how it is used. I highly recommend
you read the online documentation for further understanding.
Make processes one or more makefiles which contain rules (or instructions) for
how to build the program. The syntax of these makefiles is, in usual *nix fashion,
weird if you don't understand it84.
81. http://en.wikipedia.org/wiki/Make_(software)
82. http://www.gnu.org/software/make/
83. http://www.gnu.org/software/make/manual/make.html
84. It's weird even if you do understand it.
We will start with a simple makefile and build upon it to make (ha ha) a complete,
final makefile. Suppose my program is comprised of two C source code files
main.c and mutils.c. Here is an initial, reasonably simple makefile,
# File: Makefile
myprog: main.o mutils.o
	gcc main.o mutils.o -o myprog
main.o: main.c
	gcc -c main.c
mutils.o: mutils.c
	gcc -c mutils.c
Note that the lines starting with gcc are indented with a hard tab, i.e., not spaces. The words myprog, main.o, and mutils.o are called targets. They specify
what make is to build. For the target myprog, main.o and mutils.o are dependencies (or prerequisites). This says in English that the target myprog depends
on main.o and mutils.o. If either one of main.o or mutils.o changes, then myprog is "out of date" and must be rebuilt.
How does make know when a file has changed? It looks at the timestamp, i.e.,
the date and time the file was last modified. If main.o was last modified (written)
at 10/01/2007 11:15:23pm and main.c was last modified at 10/01/2007 11:18:21pm,
then make assumes main.c has changed since main.o was last built (because
the timestamp on main.c is more recent than the timestamp on main.o), so
main.o must be rebuilt.
The target myprog is the default target because it appears first in the file. If we
do not specify a target for make, it will attempt to build the default target.
$ make
gcc -c main.c
gcc -c mutils.c
gcc main.o mutils.o -o myprog
$ make
make: 'myprog' is up to date.
$
Note that when you type make, GNU Make looks for its rules in a file named
GNUmakefile, then makefile, then Makefile, reading the first one it finds.
12.4.4 Make Command Line Options
Make's default behavior can be modified with command line options. Some of the
more useful ones are described here.
-f file
Read make commands from named file rather than from Makefile.
-d
Print debugging information about how make decides which targets to rebuild.
-n
Print the commands that would be executed, but do not execute them.
-s
Silent operation; do not echo the commands as they are executed.
--version
Print the version of make and exit.
-B
Unconditionally rebuild all targets, even those that are up to date.
-i
Ignore all errors in the commands executed to rebuild targets.
-k
Keep going as far as possible after an error; targets that do not depend on the failed target are still built.
--help
Print a summary of the command line options and exit.
There has to be a better way to specify that every .c file should be recompiled to
a .o file when the .c file is out of date. There is, using implicit or suffix rules. Try,
$ make -p > make-defaults.txt
$ less make-defaults.txt
CC = cc
OUTPUT_OPTION = -o $@
...
COMPILE.c = $(CC) $(CFLAGS) $(CPPFLAGS) -c
...
%.o: %.c
$(COMPILE.c) $(OUTPUT_OPTION) $<
The rule starting with %.o is a suffix rule. The characters %.o specify that this is
a pattern rule. $@ refers to the name of the target .o file. $< refers to the name
of the prerequisite .c file. Collectively, $@ and $< are called automatic variables
in make terminology (there are other automatic variables as well; you should read
the online manual about $?, which is also very useful).
Therefore, since make already has a predefined (or built-in) rule for building a .o
file from a .c file, we can modify our makefile:
# File: Makefile
myprog: file1.o file2.o file3.o ... file1000.o
	gcc file1.o file2.o ... file1000.o -l<libs> -o myprog
Or,
# File: makefile
myprog: main.o mutils.o
	gcc main.o mutils.o -o myprog
Now make,
$ make
cc    -c -o main.o main.c
cc    -c -o mutils.o mutils.c
gcc main.o mutils.o -o myprog
$
Then, with the line CFLAGS = -O2 -g added to the makefile, make again (!m
uses bash history expansion to rerun the last command starting with m),
$ !m
make
cc -O2 -g    -c -o main.o main.c
cc -O2 -g    -c -o mutils.o mutils.c
$
Thus, we can modify $(CXXFLAGS) to specify g++ compiler command line options ($(CFLAGS) serves the same purpose for gcc, and $(CPPFLAGS) holds options for the C preprocessor).
12.4.7 Handling Header Files
Notice that mutils.h is not listed in the makefile. Header files are not compiled
separately, but .c files do depend on them, e.g., if we change mutils.h then we
have to recompile main.c because main.c includes mutils.h (in other words,
changing the text of mutils.h is equivalent to changing the text of any .c file
which includes mutils.h).
Our makefile does not work correctly then, because if I change mutils.h and perform a make, make tells me that myprog is up to date. So how do we specify
that changes to .h files should cause the .c files that depend on them to be recompiled? We modify the prerequisites for the .o file to include the .h file(s) as well as
the .c file(s),
# File: makefile
CFLAGS = -O2 -g
myprog: main.o mutils.o
	gcc main.o mutils.o -o myprog
main.o: main.c mutils.h
mutils.o: mutils.c mutils.h
The line starting main.o basically says that main.o depends on main.c and mutils.h. A change to one or both of those files would cause main.c to be recompiled to produce an up-to-date main.o.
But, wouldn't having to remember and list which .h files a .o file depends on get
tedious for a large project? (Trust me, it would.) What if I add one or more new
#include header files to a .c source code file, or remove one or more header
files? Sounds like a headache. Never fear, gcc to the rescue. The -M option to
gcc will generate a dependency rule listing all of the .h files that a .o file depends on,
$ gcc -M main.c
main.o: main.c /usr/include/stdio.h /usr/include/_ansi.h \
/usr/include/newlib.h /usr/include/sys/config.h \
/usr/include/machine/ieeefp.h /usr/include/cygwin/config.h \
/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/stddef.h \
/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/stdarg.h \
/usr/include/sys/reent.h /usr/include/_ansi.h
/usr/include/sys/_types.h \
/usr/include/sys/lock.h /usr/include/sys/types.h \
/usr/include/machine/_types.h /usr/include/machine/types.h \
/usr/include/sys/features.h /usr/include/cygwin/types.h \
/usr/include/sys/sysmacros.h /usr/include/stdint.h \
/usr/include/endian.h /usr/include/sys/stdio.h
/usr/include/sys/cdefs.h \
mutils.h
$
Note that system header files do not change that often (usually only when a
newer version of the compiler is installed). Hence, they do not need to be listed in
the makefile. Use the -MM command line option to gcc to omit system header
files:
$ gcc -MM main.c
main.o: main.c mutils.h
$
We can use gcc -MM to generate these dependency rules automatically. Consider the following makefile,
CFLAGS  = -O2 -g
SOURCES = main.c \
          mutils.c
myprog: main.o mutils.o
	gcc main.o mutils.o -o myprog
%.d: %.c
	rm -f $@; gcc -MM $< > $@
include $(SOURCES:.c=.d)
The %.d: %.c rule says that a file with a .d extension depends on a file with a .c
extension, i.e., foo.d depends on foo.c. The automatic variable $@ refers to the
target (the .d file) and $< to the prerequisite (the .c file).
When this rule is applied to a .c file, the corresponding .d file is first deleted (because
of the "rm -f $@" command). Then "gcc -MM" is performed on the .c file and the
output is redirected to the .d file (this is the "> $@" part). For main.c and mutils.c
this would create the files main.d and mutils.d:
this would create the files main.d and mutils.d:
$ cat main.d
main.o: main.c mutils.h
$ cat mutils.d
mutils.o: mutils.c mutils.h
$
The SOURCES= line defines a macro variable which contains the names of all of
the .c files in the project. In the include line, the substitution reference
$(SOURCES:.c=.d) changes every filename in $(SOURCES) that ends with .c
to end with .d instead. Then every one of those .d files is included in the
makefile. This is equivalent to writing the line,
include main.d mutils.d
but is more flexible, because now if we add a new C source code file to the
project, all we have to do is add the name of the source code file to the
SOURCES= line. Now that we know that trick, we can make another change,
# File: makefile
BINARY  = myprog
CFLAGS  = -O2 -g
SOURCES = main.c \
          mutils.c
OBJECTS = $(SOURCES:.c=.o)
$(BINARY): $(OBJECTS)           # the binary depends on the object files
	gcc $(OBJECTS) -o $(BINARY) # link the object files to produce the binary
%.d: %.c
	rm -f $@; gcc -MM $< > $@
include $(SOURCES:.c=.d)
.PHONY: clean
clean:
	rm -f $(OBJECTS)
	rm -f *.d
	rm -f myprog
Now run,
$ make clean
rm -f main.o mutils.o
rm -f main.d mutils.d
rm -f myprog
$
Open-source programs that you might download from the Internet and build on
your GNU/Linux box often come with makefiles provided. A common way to build
the software and install it is to perform the commands,
$ ./configure
$ make
$ make install
where configure is a shell script that actually creates the Makefile file, and install is a phony target which causes make to copy/move the executable and other installation files to their final destinations (e.g., to /usr/local/bin).
As I mentioned earlier, make is a very complicated program. We have only
scratched the surface of what it can do here, but I urge you to read the online tutorials and manuals and learn more about it. Someday you will make your employer happy that you did.
13 Debugging
13.1 Text-Based Debugging with GDB
13.1.1 Launching and Quitting gdb
To start gdb and debug the executable a.out, at the command prompt type gdb
a.out. Note that in order to debug a.out you must have built a.out with the gcc -g
option; it is also best if you do not compile with optimization, i.e., use -O0. You will
see a screen similar to the following. The gdb prompt is (gdb), which means gdb is
waiting for input.
The normal way to generate a segmentation fault is to try to read from or write to
memory that does not belong to you. Seems simple enough to avoid that, but
when you're using pointers, crazy things can happen. The dump tells me which
program I was running, ".../Homework02/a.out". I was executing code in a function named set_elem() and you can see the input parameters to the function
were array which was assigned the value 0x501010, row which was 1, col which
was 0, and value which was 1. This code is on line 30 of the source code file
Assgn03-04-Bonus.c. The very last line in the display (starting with 30) is the
source code statement on line 30 which caused the exception.
Now that I know the line number of the statement which caused the exception, I
can go back to my source code and examine it to see if I can find the bug. Note
that just because that is the line we were executing when the segmentation fault
occurred does not mean that that is the line where the actual bug lies. C is an
extremely powerful and exceedingly dangerous language. It is entirely possible
that some other statement somewhere accidentally modified this block of memory I am now accessing and trashed it. Hence, the bug may not be on this line of
code but somewhere else. These bugs are hard as heck to find; I once spent two
8-hr work days tracking one down.
13.1.3 Listing Source Code
Gdb is a source code debugger, which means you can display your source code
as you are debugging and also view the assembly language code. You can step
through your program line by line executing instructions and viewing the contents
of registers and memory. This is extremely powerful when attempting to locate
bugs. To print source code lines, use one of the variants of the list command
(abbreviated l; that is, lowercase ell):
list linenum
Prints lines centered around line number linenum. The number of lines is controlled by set listsize count. The default
value for count is 10. To see the current value of count, use
show listsize.
list function
Prints lines centered around the beginning of function.
list
Prints the next listsize lines following the lines last printed.
list -
Prints the listsize lines preceding the lines last printed.
list first,last
Prints the lines from first through last.
list file:number
Prints lines centered around line number in the source file file.
list file:function
Prints lines centered around the beginning of function in file.
disas addr1
Disassembles the function surrounding the address addr1.
disas addr1,addr2
Disassembles the range of memory from addr1 up to addr2.
To find the memory address of the beginning of a line of source code, use info
line line, e.g., info line 30. Gdb will respond with the starting memory address of
the assembly language code for the statement and the ending memory address
(a single HLL statement may compile down to multiple assembly language instructions). The command x/i will begin displaying instructions or you can use
disas addr1, addr2 to display the entire range. Pressing x repeatedly after x/i
will display one instruction at a time.
13.1.5 Examining Data and Variables
When your program is running, or when it crashes, you may want to see what
values were in certain memory locations at the time of the crash. This is done
with the print or p command.
p file::var
Prints the value of the variable var defined in the file file.
p var
Prints the value of the variable var.
x/5 addr
Examines (displays) five words of memory starting at address addr.
x/3d addr
Displays three words, formatted as signed decimal, starting at addr.
x/4b 0x501010
Displays four bytes of memory starting at address 0x501010.
x/4t 0x501010
Displays four values in binary (t) starting at address 0x501010.
x/fg 0x501010
Displays one giant (8-byte) value in floating point format at 0x501010.
x/5fg 0x501010
Displays five 8-byte floating point values starting at 0x501010.
display/i $pc
Adds the instruction at the program counter to the display list; it is shown each time the program stops.
delete display 1
Deletes display number 1 from the display list.
disable display 2
Temporarily disables display number 2.
enable display 2
Re-enables display number 2.
info display
Lists the current displays and their states.
A single high-level language statement may compile down to one or more assembly language instructions; the actual bug will be found during execution of one of these assembly language instructions.
step n
Executes the next n lines of source code, stepping into functions.
next
Executes the next line of source code, stepping over function calls.
next n
Executes the next n lines of source code, stepping over function calls.
finish
Run until the function you are in returns, then stop and display the return value.
until
Continues execution until a source line past the current line, in the current stack frame, is reached. Useful for exiting a loop.
stepi
Abbreviated si. Execute one machine language instruction. Using display/i $pc in conjunction with si is an easy
way to display each machine language instruction before it is
executed, and then execute the instruction and stop. That is,
you would type display/i $pc once, and then you could do
si, si, ... to single-step through the program executing one
machine language instruction at a time. Note that stepi will
step into functions.
si n
Executes the next n machine language instructions.
nexti
Abbreviated ni. Executes one machine language instruction, stepping over function calls.
ni n
Executes the next n machine language instructions, stepping over function calls.
Once you enter one of these commands, for example, s or si, continuing to hit
ENTER will execute the same command again.
13.1.12 Breakpoints
A breakpoint is a marker which tells the debugger to stop at that marker. For
example, you can set a breakpoint on a certain source code line, and then run
your program with r. When that line is reached during execution, the debugger
will stop. You can then examine the state of memory and variables to try and find
the bug. Then you could single step from that point on, or you could continue
running until the next breakpoint is reached, e.g., if the line containing the breakpoint is reached again during execution, the debugger will break again.
Breakpoints can be set on source code lines or machine language instructions.
Often it is helpful to identify the function where the bug is, then to find the line
within the function, and then to finally identify the machine language instruction
containing the bug if need be. Breakpoints are set with the break or b command.
Breakpoints are disabled with the disable command. Breakpoints are enabled
with the enable command. Breakpoints are deleted with the delete (d) command. Information on breakpoints is obtained with the info (i) command.
continue
Abbreviated c. Resumes execution until the next breakpoint is hit or the program exits.
b function
Sets a breakpoint at the beginning of function.
b linenum
Sets a breakpoint at line linenum of the current source file.
b file:linenum
Sets a breakpoint at line linenum of the source file file.
b file:function
Sets a breakpoint at the beginning of function in file.
b *addr
Sets a breakpoint at the machine language instruction at address addr.
tbreak args
Sets a temporary breakpoint; args are the same as for b. The breakpoint is deleted after the first time it is hit.
rbreak rexpr
Argument rexpr is a regular expression. This command will set a breakpoint on all functions matching
the rexpr pattern. Since the regular expression .
matches every function, rbreak . will set a breakpoint
on every function in your program. Then you can hit c
to continue execution until the next function gets
called.
info break
Prints a table of all breakpoints and watchpoints with their numbers and states.
disable bps
Disables the breakpoints numbered in the list bps (all breakpoints if bps is omitted).
enable bps
Enables the breakpoints numbered in the list bps (all breakpoints if bps is omitted).
en bps once
Enables the breakpoints once, then once each breakpoint is hit, disables the breakpoint again.
en bps delete
Enables the breakpoints once, then once each breakpoint is hit, deletes the breakpoint.
delete bps
Deletes the breakpoints numbered in the list bps (all breakpoints if bps is omitted).
cond bnum
Removes the condition from breakpoint bnum; cond bnum expr makes breakpoint bnum stop only when the expression expr is true.
rwatch foo
Sets a read watchpoint: execution stops when the variable foo is read.
awatch foo
Sets an access watchpoint: execution stops when foo is either read or written.
info wat
Prints a table of all watchpoints.
The expression to watch does not have to be a simple variable. We could tell gdb
that we want it to watch a certain region of memory and break when that region is
read from or written to.
watch *(int *)0x501010
Breaks when the four-byte integer stored at address 0x501010 is written.
watch *(char (*)[0x100])0x500
Breaks when any byte in the 0x100-byte region beginning at address 0x500 is written.
You can run make from inside gdb without leaving the debugger; gdb runs the
make program in the current directory. If the make succeeds, you can reload the
executable using file a.out.
To log the output of a debugging session, use the set logging on command. The
default log file is gdb.txt. Use set logging file file to change the logging file from
gdb.txt to file. By default gdb appends to the logging file; use set logging
overwrite on to switch to overwrite mode. Use set logging overwrite off to
switch back to append mode.
Gdb will repeat the previous command if you just press the RETURN key. For
example, if you are single-stepping through the program using si, just type si
once, and from then on you can just keep hitting the RETURN key to continue
single stepping.
By default, gdb assumes numbers you enter are in decimal. Use 0x to specify
hex. To change the input radix to hex permanently, use set input-radix 16. From
now until the end of your debugging session, gdb will assume numbers you enter
are in hex without you having to type 0x.
Getting help is accomplished by using help or Google.
http://www.gnu.org/software/ddd
In DDD you can interact with gdb through the console window at the bottom of
the DDD window. The command tool is turned on by selecting View | Command
Tool.
To set a breakpoint, move the mouse pointer to
the left margin of the source line where you want
to set a breakpoint, and right-click. Select Set
Breakpoint from the popup menu. A stop sign
icon will be displayed on the line.
To start running the program, click Program |
Run from the main menu, or click Run on the
command tool. This will run the program non-stop. To start running and stop on
the first line of main(), type start in the console window, or set a breakpoint on
the first line of the code where you want to stop.
To inspect the value of a variable, right-click on it and select Print. Its value will
be printed in the console window. To add the variable to the display, right-click on
it, and select Display. If you don't see the display pane, click View | Data Window on the main menu. If you hold the mouse pointer over a variable, after .5
seconds a popup box will appear with the value of the variable.
Clicking View | Machine Code Window will show you the disassembly listing.
You can set breakpoints here as in the source code listing window.
To see the contents of the registers, click Status | Registers.
To display line numbers in the source code listing window, click Source | Display Line Numbers.
If you click Source | Edit Source, VI will be loaded with your source code. You
can make changes to the code, save it in VI, exit VI, and the code will be reloaded into DDD.
If you want to remake the project without exiting DDD, you can type make in the
gdb console window to make your project (assuming you have a makefile). Then
issue the file your-exec-file command to reload the remade binary file.
To view information about the variables of the current stack frame, click Data |
Status Displays. Then click Local Variables of Current Stack Frame. The local variables will be displayed in the data window and will be updated as the program is being executed step-by-step, see below.
To examine memory, click Data | Memory. In the box that pops up you can select the format for the display (e.g., decimal, binary, hex, char), the number of
words to display, and the starting memory address. Clicking print will dump the
memory contents to the gdb console window. Clicking display will display the
memory contents in the data window (see below).
Well, that's it for this basic introduction to gdb and DDD. For more info, consult
the online documentation.