Language Fundamentals
Exam Objectives
SECTION 2.1: LANGUAGE BUILDING BLOCKS 21
Supplementary Objectives
Lexical Tokens
The low-level language elements are called lexical tokens (or just tokens for short)
and are the building blocks for more complex constructs. Identifiers, operators and
special characters are all examples of tokens that can be used to build high-level
constructs like expressions, statements, methods and classes.
Identifiers
A name in a program is called an identifier. Identifiers can be used to denote classes,
methods and variables.
In Java an identifier is composed of a sequence of characters, where each character
can be either a letter, a digit, a connecting punctuation (such as underscore _) or any
currency symbol (such as $, ¢, ¥ or £), and cannot start with a digit. Since Java
programs are written in the Unicode character set (p. 24), the definitions of letter
and digit are interpreted according to this character set.
Note that Java is case-sensitive, e.g. price and Price are two different identifiers.
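The identifier rules above can be checked programmatically. The following is a minimal sketch, assuming a hypothetical helper class IdentifierCheck (not part of the Java API); it relies on the standard methods Character.isJavaIdentifierStart() and Character.isJavaIdentifierPart(), and it does not reject keywords.

```java
// Hypothetical helper class for illustration; not part of the Java API.
class IdentifierCheck {

    // Returns true if every character of name is legal at its position.
    // Keywords such as "class" are NOT rejected by this sketch.
    static boolean isLegalIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalIdentifier("price"));     // true
        System.out.println(isLegalIdentifier("$total_2"));  // true
        System.out.println(isLegalIdentifier("2ndPrice"));  // false: starts with a digit
        System.out.println("price".equals("Price"));        // false: case matters
    }
}
```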
Keywords
Keywords are reserved identifiers that are predefined in the language, and cannot
be used to denote other entities. Incorrect usage results in compilation errors.
Keywords currently defined in the language are listed in Table 2.1. In addition,
three identifiers are reserved as predefined literals in the language: null, true, false
(Table 2.3). Keywords currently reserved, but not in use, are listed in Table 2.2. All
these reserved words cannot be used as identifiers. The index contains references
to relevant sections where currently defined keywords are explained.
Reserved keywords not currently in use: const, goto
Literals
A literal denotes a constant value. This value can be numerical (integer or floating-
point), character, boolean or a string. In addition there is the null literal (null)
which represents the null reference.
Examples of integer literals: 2000, 0, -7
Integer Literals
The integer datatypes comprise the following primitive types: int, long, byte
and short.
The default type of an integer literal is int, but it can be specified as long by
appending the suffix L (or l) to the integer value; for example 2000L, 0l. There is no
way to specify a short or a byte literal.
Decimal   Octal   Hexadecimal
0         0       0
1         1       1
2         2       2
3         3       3
4         4       4
5         5       5
6         6       6
7         7       7
8         10      8
9         11      9
10        12      a
11        13      b
12        14      c
13        15      d
14        16      e
15        17      f
16        20      10
In Java, octal and hexadecimal numbers are specified with the prefixes 0 and 0x
respectively. Some examples of octal and hexadecimal literals are shown in Table 2.6.
Decimal        Octal            Hexadecimal
8              010              0x8
10             012              0xa
16             020              0x10
27             033              0x1b
90             0132             0x5a
2147483647     017777777777     0x7fffffff
-2147483648    020000000000     0x80000000
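The prefixes and the L suffix can be tried out directly. The sketch below is illustrative only; the class name IntegerLiterals is an assumption, while Integer.toOctalString() and Integer.toHexString() are standard library methods.

```java
// Hypothetical class name; the literals follow the rules described above.
class IntegerLiterals {

    // The same value written in decimal, octal and hexadecimal notation.
    static boolean sameValue() {
        return 27 == 033 && 27 == 0x1b;
    }

    public static void main(String[] args) {
        System.out.println(sameValue());                  // true
        System.out.println(Integer.toOctalString(90));    // 132
        System.out.println(Integer.toHexString(90));      // 5a

        long big = 2147483648L;  // L suffix required: the value does not fit in an int
        System.out.println(big > Integer.MAX_VALUE);      // true

        byte b = 100;             // no byte literal: an in-range int constant is narrowed
        short s = (short) 40000;  // out of range for short: explicit cast, value wraps
        System.out.println(b + " " + s);                  // 100 -25536
    }
}
```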
Floating-point Literals
Floating-point datatypes come in two flavors: float and double.
The default type of a floating-point literal is double, but this can be explicitly
designated by appending the suffix D (or d) to the value. A floating-point literal can
also be specified to be a float by appending the suffix F (or f).
Floating-point literals can also be specified in scientific notation, for example 5E-1
is equivalent to 5*10^-1, i.e. 0.5, where E (or e) stands for Exponent.
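A short sketch of the suffixes and the scientific notation described above; the class name FloatLiterals is chosen for illustration only.

```java
// Hypothetical class name; illustrates floating-point literal forms.
class FloatLiterals {

    // 5E-1 is a double literal with the value 0.5.
    static double scientific() {
        return 5E-1;
    }

    public static void main(String[] args) {
        double d1 = 0.5;    // double by default
        double d2 = 0.5D;   // explicitly designated as double
        float  f  = 0.5F;   // the F suffix is required for a float literal
        // float g = 0.5;   // would not compile: 0.5 is a double

        System.out.println(scientific() == d1);  // true
        System.out.println(d1 == d2);            // true
        System.out.println(f == d1);             // true: 0.5 is exactly representable
    }
}
```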
Boolean Literals
Boolean truth-values can be denoted using the reserved literals true or false.
Character Literals
A character literal is quoted in single-quotes (').
All characters are represented by 16-bit Unicode. The Unicode character set
subsumes the 8-bit ISO-Latin-1 and the 7-bit ASCII characters. In Table 2.7, note that
digits (0 to 9), upper-case letters (A to Z) and lower-case letters (a to z) have
contiguous Unicode values.
Unicode Literals
Alternatively, a character literal can be defined by quoting the Unicode value, as
shown in Table 2.8.
Escape Sequences
Certain escape sequences define special character values as shown in Table 2.9. These
escape sequences can be single-quoted to define character literals. For example, the
character literals '\t' and '\u0009' are equivalent.
Escape sequence   Unicode value   Character
\b                \u0008          Backspace
\t                \u0009          Horizontal tabulation
\n                \u000a          Linefeed
\f                \u000c          Form feed
\r                \u000d          Carriage return
\'                \u0027          Apostrophe-quote
\"                \u0022          Quotation mark
\\                \u005c          Backslash
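The equivalence between an escape sequence and its Unicode value can be verified with a comparison, as in the sketch below (the class name EscapeSequences is an assumption). Only the tab is written in both forms here: Unicode escapes are translated before the source is tokenized, so the Unicode forms of linefeed and carriage return cannot be written inside a character literal in source code, and the escape sequences \n and \r should be used instead.

```java
// Hypothetical class name; compares escape sequences with their character values.
class EscapeSequences {

    // The escape sequence and the Unicode escape denote the same tab character.
    static boolean tabFormsAreEqual() {
        return '\t' == '\u0009';
    }

    public static void main(String[] args) {
        System.out.println(tabFormsAreEqual());   // true
        System.out.println((int) '\n');           // 10
        System.out.println((int) '\\');           // 92
        System.out.println("a\tb".length());      // 3: the escape is a single character
    }
}
```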
String Literals
A string literal is a sequence of characters, which must be quoted in quotation
marks and which must occur on a single line.
Escape sequences as well as Unicode values can appear in string literals:
"Here comes a tab.\t And here comes another one\u0009!" // (1)
"What's on the menu?" // (2)
"\"String literals are double-quoted.\"" // (3)
In (1), the tab character is specified using the escape sequence and the Unicode
value respectively. In (2), the single apostrophe need not be escaped in strings, but
it would be if specified as a character literal ('\''). In (3), the double quotes in
the string must be escaped. Printing these strings would give the following result:
Here comes a tab. And here comes another one !
What's on the menu?
"String literals are double-quoted."
White Spaces
A white space is a sequence of spaces, tabs, form feeds and line terminator charac-
ters. Line terminators can be newline, carriage return or carriage return-newline
sequence in a Java source file.
A Java program is a free-format sequence of characters which is tokenized by the
compiler, i.e. broken into a stream of tokens for further analysis. Separators and
operators help to distinguish tokens, but sometimes white space has to be inserted
explicitly. For example, the identifier classRoom will be interpreted as a single token,
unless white space is inserted to distinguish the keyword class from the identifier
Room.
White space aids not only in separating tokens, but also in formatting the program
so that it is easy for humans to read. The compiler ignores the white spaces once
the tokens are identified.
Comments
A program can be documented by inserting comments at relevant places. These
comments are for documentation purposes and are ignored by the compiler.
Java provides three types of comments to document a program:
• A single-line comment
• A multiple-line comment
• A documentation (or Javadoc) comment
Comments cannot be nested, regardless of their type. The comment-start
sequences (//, /*, /**) are not treated differently from other characters when
occurring within comments.
Single-line Comment
All characters after the comment-start sequence // through to the end of the line
constitute a single-line comment.
// This comment ends at the end of this line.
Multiple-line Comment
A multiple-line comment, as the name suggests, can span several lines. Such a
comment starts with /* and ends with */.
/* A comment
on several
lines.
*/
Documentation Comment
A documentation comment is a special-purpose comment which when placed at
appropriate places in the program can be extracted and used by the javadoc utility
to generate HTML documentation for the program. Documentation comments are
usually placed in front of class, interface, method and variable definitions. Groups
of special tags can be used inside a documentation comment to provide more
specific information. Such a comment starts with /** and ends with */:
/**
* This class implements a gizmo
* @author K.A.M.
* @version 1.0
*/
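The three comment types can appear together in one source file. The following sketch uses a hypothetical class CommentDemo and the standard Javadoc tags @param and @return; it also shows that a comment-start sequence inside another comment has no special meaning.

```java
/**
 * Hypothetical utility class illustrating the three kinds of comments.
 */
class CommentDemo {

    /**
     * Returns twice the given value.
     *
     * @param n the value to double
     * @return  the doubled value
     */
    static int twice(int n) {
        // Single-line comment: a /* here does not start a multiple-line comment.
        /* Multiple-line comment: a // here
           does not start a single-line comment. */
        return 2 * n;
    }

    public static void main(String[] args) {
        System.out.println(twice(21));  // 42
    }
}
```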
Review questions
• Boolean type:
The datatype boolean represents truth-values true and false.
Primitive data values are atomic and are not objects. Each primitive datatype
defines the range of values in the datatype, and operations on these values are
defined by special operators in the language.
Each primitive datatype has a corresponding wrapper class that can be used to
represent a primitive value as an object. Wrapper classes are discussed in Section
10.3.
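As a brief illustration of the relationship between a primitive value and its wrapper object, the sketch below uses the standard classes Integer and Boolean; the class name WrapperDemo and the method box() are assumptions made for this example.

```java
// Hypothetical class name; shows a primitive value represented as an object.
class WrapperDemo {

    // Wraps a primitive int in an Integer object.
    static Integer box(int value) {
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        Integer boxed = box(42);        // object representing the value 42
        int back = boxed.intValue();    // primitive value extracted again

        System.out.println(back);                                 // 42
        System.out.println(boxed.equals(Integer.valueOf(42)));    // true
        System.out.println(Boolean.valueOf(true).booleanValue()); // true
    }
}
```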
A variable declaration, in its simplest form, can be used to specify the name and
the type of variables. This implicitly determines their size and the values that can
be stored in them.
char a, b, c; // a, b and c are character variables.
double area; // area is a floating-point variable.
boolean flag; // flag is a boolean variable.
A declaration can also include initialization code to specify an initial value for the
variable:
int i = 10, // i is an int variable with initial value 10.
j = 101; // j is an int variable with initial value 101.
long big = 2147483648L; // big is a long variable with specified initial value.
In Java, variables can only store values of primitive datatypes and references to
objects.
Initializers for initializing member variables in objects, classes and interfaces are
discussed in Section 8.2.
It is important to note that the declarations above do not create objects of class
Pizza or Hamburger. They only create variables which can store references to objects
of these classes.
A declaration can also include an initializer to create an object that can be assigned
to the reference variable:
Pizza yummyPizza = new Pizza("Hot&Spicy"); // Declaration with initializer.
The reference variable yummyPizza can reference objects of class Pizza. The keyword
new, together with the constructor call Pizza("Hot&Spicy"), creates an object of class
Pizza. The reference to this object is assigned to the variable yummyPizza. The newly
created object of class Pizza can now be manipulated through the reference stored
in this variable.
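The difference between declaring a reference variable and creating an object can be sketched as follows. The Pizza class shown here is a hypothetical stand-in with a single name field; the class used in the text may look different.

```java
// Hypothetical demo; the Pizza class below is a minimal stand-in.
class PizzaDemo {
    public static void main(String[] args) {
        Pizza noPizzaYet;                            // declaration only: no object is created
        Pizza yummyPizza = new Pizza("Hot&Spicy");   // declaration with initializer
        Pizza samePizza = yummyPizza;                // copies the reference, not the object

        System.out.println(samePizza == yummyPizza); // true: both denote the same object
        System.out.println(yummyPizza.getName());    // Hot&Spicy
    }
}

class Pizza {
    private final String name;

    Pizza(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}
```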
2.4 Integers
Table 2.10 Range of Integer Values
Integer values are represented as signed with 2’s complement (Section 3.12, p. 64).
int i = -215; // int literal
int max = 0x7fffffff; // 2147483647 as hex int literal
int min = 0x80000000; // -2147483648 as hex int literal
long isbn = 05402202647L; // octal long literal
long phone = 55584152L; // long literal
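The ranges of the integer datatypes are available as constants in the wrapper classes. The sketch below prints them and shows the wrap-around behaviour that follows from the 2's complement representation; the class name IntegerRanges and the method wrapAround() are assumptions.

```java
// Hypothetical class name; prints the ranges of the integer datatypes.
class IntegerRanges {

    // Adding 1 to the largest int wraps around to the smallest int.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);       // -128 to 127
        System.out.println(Short.MIN_VALUE + " to " + Short.MAX_VALUE);     // -32768 to 32767
        System.out.println(Integer.MIN_VALUE + " to " + Integer.MAX_VALUE); // -2147483648 to 2147483647
        System.out.println(Long.MIN_VALUE + " to " + Long.MAX_VALUE);
        System.out.println(wrapAround() == Integer.MIN_VALUE);              // true
        System.out.println(Integer.toHexString(-1));                        // ffffffff
    }
}
```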
2.5 Characters
Table 2.11 Range of Character Values
The char datatype encompasses all the 65536 (2^16) characters in the Unicode
character set as 16-bit values. The first 128 characters of the Unicode set are the same as
the 128 characters of the 7-bit ASCII character set, and the first 256 characters of the
Unicode set correspond to the 256 characters of the 8-bit ISO Latin-1 character set.
See Section 18.4 on page 570 for a discussion on character encodings.
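Because char values are 16-bit unsigned numbers and digits and letters occupy contiguous ranges, simple arithmetic on characters is possible. The helper methods digitValue() and toUpper() below are hypothetical and assume their argument lies in the stated range.

```java
// Hypothetical class name; treats char values as 16-bit numbers.
class CharValues {

    // Numeric value of a digit character; assumes c is in '0'..'9'.
    static int digitValue(char c) {
        return c - '0';
    }

    // Upper-case form of a letter; assumes c is in 'a'..'z'.
    static char toUpper(char c) {
        return (char) (c - 'a' + 'A');
    }

    public static void main(String[] args) {
        System.out.println((int) 'A');                  // 65, as in ASCII
        System.out.println((int) Character.MAX_VALUE);  // 65535
        System.out.println(digitValue('7'));            // 7
        System.out.println(toUpper('q'));               // Q
    }
}
```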
Floating-point numbers conform to the IEEE 754-1985 standard. Table 2.12 shows
the range of values for positive floating-point numbers, but these apply equally to
negative floating-point numbers with the '-' sign as prefix. Zero can be either 0.0
or -0.0.
Since the size for representation is finite, certain floating-point numbers can only
be represented as approximations.
float pi = 3.14159F;
double p = 314.159e-2;
double fraction = 1.0/3.0;
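The effect of approximate representation shows up when floating-point values are compared. The following sketch is one common way to deal with it, assuming a hypothetical method nearlyEqual() and a tolerance chosen by the caller.

```java
// Hypothetical class name; demonstrates approximate floating-point values.
class FloatApprox {

    // Compares two doubles within a caller-supplied tolerance.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                   // false: both are approximations
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));  // true
        System.out.println(0.0 == -0.0);                  // true: the two zeros compare equal
    }
}
```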
2.7 Booleans
Table 2.13 Boolean Values
The boolean datatype is used to represent logical values that can be either the literal
true or the literal false.
Boolean values are returned by all relational (Section 3.8), conditional (Section 3.11)
and boolean logical operators (Section 3.10), and are primarily used to govern the
flow of control during program execution.
Note that boolean values cannot be converted to other primitive data values, and
vice versa.
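Since no cast exists between boolean and the other primitive datatypes, any mapping has to be written out explicitly. A minimal sketch, with hypothetical method names toInt() and isPositive():

```java
// Hypothetical class name; boolean values cannot be cast to or from numbers.
class BooleanDemo {

    // Explicit mapping from boolean to int using the conditional operator.
    static int toInt(boolean b) {
        return b ? 1 : 0;
    }

    // A relational operator yields a boolean value.
    static boolean isPositive(int n) {
        return n > 0;
    }

    public static void main(String[] args) {
        // int n = (int) true;     // would not compile
        // boolean b = (boolean) 1; // would not compile
        System.out.println(toInt(true));      // 1
        System.out.println(isPositive(-5));   // false
    }
}
```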
Review questions
2.4 Which of the following does not denote a primitive data value in Java?
Select all valid answers.
(a) "t"
(b) 'k'
(c) 50.5F
(d) "hello"
(e) false
2.6 Which integral type in Java has the exact range from -2147483648 (-2^31) to
2147483647 (2^31-1), inclusive?
Datatype                             Default value
boolean                              false
char                                 '\u0000'
Integer (byte, short, int, long)     0
Static variables in a class are initialized to default values when the class is loaded,
if they are not explicitly initialized.
Instance variables are also initialized to default values when the class is
instantiated, if they are not explicitly initialized.
Note that a reference variable is initialized with the value null.
// Instance variables
int noOfWatts = 100; // Explicitly set to 100.
boolean indicator; // Implicitly set to default value false.
String location; // Implicitly set to default value null.
SECTION 2.9: INITIAL VALUES FOR VARIABLES 35
Example 2.1 illustrates default initialization of member variables. Note that static
variables are initialized when the class is loaded the first time, and instance
variables are initialized accordingly in every object created from the class Light.
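The following sketch is not the text's Example 2.1; it is a hypothetical class Light, reusing the instance variables shown earlier and adding an assumed static variable counter, to show which values member variables receive when no initializer is given.

```java
// Hypothetical sketch of a Light class; not the original Example 2.1.
class Light {
    static int counter;     // static variable: default value 0 when the class is loaded

    int noOfWatts = 100;    // explicitly set to 100
    boolean indicator;      // default value false
    String location;        // default value null

    public static void main(String[] args) {
        Light bulb = new Light();
        System.out.println(Light.counter);    // 0
        System.out.println(bulb.noOfWatts);   // 100
        System.out.println(bulb.indicator);   // false
        System.out.println(bulb.location);    // null
    }
}
```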
In Example 2.2, the compiler complains that the local variable thePrice in the
println statement at (1) may not be initialized. However, from the program it can
be seen that the local variable thePrice gets the value 100 in the last if-statement
before it is used in the println statement. The compiler does not perform a rigorous
analysis of the program in this regard. The program will compile correctly if the
variable is initialized in the declaration, or if an unconditional assignment is
made to the variable in the method.
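One way to satisfy the compiler is to make sure that every execution path assigns the local variable before it is read. The sketch below is not Example 2.2; the class name PriceDemo, the method priceFor() and the price values are assumptions chosen for illustration.

```java
// Hypothetical sketch; not the original Example 2.2.
class PriceDemo {

    static int priceFor(int weight) {
        int thePrice;            // local variable: no default value is assigned
        if (weight < 10) {
            thePrice = 1000;
        } else {
            thePrice = 100;      // every path through the if-else assigns thePrice
        }
        return thePrice;         // accepted: thePrice is assigned on all paths
    }

    public static void main(String[] args) {
        System.out.println(priceFor(5));    // 1000
        System.out.println(priceFor(50));   // 100
    }
}
```

Replacing the if-else with a series of separate if-statements without an else would bring back the compiler error, even if the conditions together cover every case.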
In Example 2.3, the compiler complains that the local variable oneLongString in the
println statement may not be initialized. Objects should be created and their state
initialized appropriately (for example, in a constructor) before use. If the variable
oneLongString is set to the value null, the program will compile. However, at
runtime, a NullPointerException will be thrown since the variable oneLongString will
not reference any object. The golden rule is to ensure that a reference variable
denotes an object before invoking methods via the reference, i.e. it is not null.
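The golden rule can be applied with an explicit null check before a method is invoked. The sketch below is not Example 2.3; the class name NullCheck and the method safeLength() are assumptions, and the try-catch is included only to make the runtime failure visible.

```java
// Hypothetical sketch; not the original Example 2.3.
class NullCheck {

    // Returns the length of s, or 0 if s does not denote an object.
    static int safeLength(String s) {
        return (s == null) ? 0 : s.length();
    }

    public static void main(String[] args) {
        String oneLongString = null;   // compiles, but references no object

        System.out.println(safeLength(oneLongString));  // 0

        try {
            System.out.println(oneLongString.length()); // fails at runtime
        } catch (NullPointerException e) {
            System.out.println("NullPointerException caught");
        }
    }
}
```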
Arrays and their default values are discussed in Section 4.1 on page 88.
Review questions
2.8 In which of these variable declarations will the variable remain uninitialized
unless explicitly initialized?
Select all valid answers.
(a) Declaration of an instance variable of type int.
(b) Declaration of a static class variable of type float.
(c) Declaration of a local variable of type float.
(d) Declaration of a static class variable of type Object.
(e) Declaration of an instance variable of type int[].
SECTION 2.10: JAVA SOURCE FILE STRUCTURE 37
// Filename: NewApp.java
// PART 1: (OPTIONAL)
// Package name
package com.company.project.fragilePackage;
// PART 2: (ZERO OR MORE)
// Packages used
import java.util.*;
import java.io.*;
// PART 3: (ZERO OR MORE)
// Definitions of classes and interfaces
class C1 { }
interface I1 { }
// ...
class Cn { }
interface Im { }
// end of file
Review questions
package com.acme.toolkit;
class Other {
int value;
}
The command
java TooSmartClass
results in a call to the TooSmartClass.main() method. Note that any class can have a
main() method. Only the main() method of the class specified to the Java interpreter
is executed.
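A minimal sketch of such a class is shown below. Only the class name TooSmartClass comes from the command above; the helper method describe() and the printed text are assumptions made for illustration.

```java
// Minimal sketch of a class that can be started with: java TooSmartClass
class TooSmartClass {

    // Hypothetical helper that reports how many program arguments were passed.
    static String describe(String[] args) {
        return "main() called with " + args.length + " argument(s)";
    }

    // The interpreter calls this method for the class named on the command line.
    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running java TooSmartClass one two would pass the strings "one" and "two" in the args array.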
SECTION 2.11: THE MAIN() METHOD 39
Review questions
Chapter summary
The following information was included in this chapter:
• Explanation of identifiers, keywords, literals, white spaces, and comments.
• Explanation of all the primitive datatypes in Java.
• Declaration, initialization and usage of variables, including reference variables.
• Usage of default values for member variables.
• Structure of a Java source file.
• Declaration of main() method.
Programming exercises
2.1 The following program has several errors. Modify it so that it will compile and run
without errors.
import java.util.*;
package com.acme;
2.2 The following program has several errors. Modify it so that it will compile and run
without errors.
// Filename: Temperature.java
PUBLIC CLASS temperature {
PUBLIC void main(string args) {
double fahrenheit = 62.5;
*/ Convert /*
double celsius = f2c(fahrenheit);
System.out.println(fahrenheit + 'F = ' + celsius + 'C');
}