Compiler Design Laboratory
1. Lexical Analysis using LEX/Flex
Question: Write a LEX program to identify and count the number of
keywords, identifiers, operators, and numbers from a given C source file.
Answer:
This program uses LEX, a tool for generating lexical analyzers. The logic is
defined in a .l file. The rules section uses regular expressions to match
different patterns (tokens).
counter.l file:
%{
#include<stdio.h>
int kw_count = 0;
int id_count = 0;
int op_count = 0;
int num_count = 0;
%}
/* Rule Section */
%%
"int"|"char"|"float"|"double"|"if"|"else"|"while"|"for" { kw_count++; printf("\nKeyword: %s", yytext); }
[a-zA-Z_][a-zA-Z0-9_]* { id_count++; printf("\nIdentifier: %s", yytext); }
[0-9]+ { num_count++; printf("\nNumber: %s", yytext); }
"+"|"-"|"*"|"/"|"="|"=="|"<"|">"|"<="|">="|"!=" { op_count++; printf("\nOperator: %s", yytext); }
. { /* Ignore all other characters */ }
\n { /* Ignore newlines */ }
%%
int yywrap(void) { return 1; } /* 1 = no further input files */
int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <source-file>\n", argv[0]);
        return 1;
    }
    FILE *file = fopen(argv[1], "r");
    if (!file) {
        perror(argv[1]);
        return 1;
    }
    yyin = file;
    yylex(); /* Start the lexical analysis */
    fclose(file);
    printf("\n\n--- Analysis Report ---\n");
    printf("Total Keywords: %d\n", kw_count);
    printf("Total Identifiers: %d\n", id_count);
    printf("Total Numbers: %d\n", num_count);
    printf("Total Operators: %d\n", op_count);
    return 0;
}
How to compile and run:
1. Save the code as counter.l.
2. Run the LEX tool: lex counter.l (This generates lex.yy.c).
3. Compile the C file: gcc lex.yy.c -o counter
4. Create a test C file, say test.c, with content like: int a = b + 10;
5. Run the program: ./counter test.c
Expected Output:
Keyword: int
Identifier: a
Operator: =
Identifier: b
Operator: +
Number: 10
--- Analysis Report ---
Total Keywords: 1
Total Identifiers: 2
Total Numbers: 1
Total Operators: 2
2. Syntax Analysis using YACC/Bison
Question: Write a YACC program to validate arithmetic expressions like a +
b * c. The expression should handle +, *, and parentheses ().
Answer:
This requires two files: a LEX file to provide tokens to the parser and a YACC
file to define the grammar rules.
lex.l file (for tokenization):
%{
#include "y.tab.h" // Header file generated by YACC
%}
%%
[0-9]+ { return NUMBER; }
[a-zA-Z]+ { return IDENTIFIER; }
[+\-*/()] { return yytext[0]; }
\n { return 0; } /* End of input */
. { /* Ignore other characters */ }
%%
int yywrap(void) {
    return 1;
}
yacc.y file (for grammar rules):
%{
#include <stdio.h>
int yylex(void);
void yyerror(char const *s);
%}
%token NUMBER IDENTIFIER
/* Operator precedence and associativity */
%left '+' '-'
%left '*' '/'
%%
/* Grammar Rules */
/* The lexer returns 0 (end of input) at the newline, so one line is one program. */
program:
      statement
    ;
statement:
      expression { printf("Valid Expression\n"); }
    ;
expression:
      expression '+' expression
    | expression '-' expression
    | expression '*' expression
    | expression '/' expression
    | '(' expression ')'
    | NUMBER
    | IDENTIFIER
    ;
%%
void yyerror(char const *s) {
    fprintf(stderr, "Invalid Expression: %s\n", s);
}
int main(void) {
    printf("Enter an arithmetic expression:\n");
    yyparse();
    return 0;
}
How to compile and run:
1. Save the files as lex.l and yacc.y.
2. Run YACC: yacc -d yacc.y (This generates y.tab.c and y.tab.h).
3. Run LEX: lex lex.l.
4. Compile everything: gcc y.tab.c lex.yy.c -o parser.
5. Run the parser: ./parser.
6. Enter an expression like a * (b + 5) and press Enter.
Expected Output:
Enter an arithmetic expression:
a * (b + 5)
Valid Expression
3. Intermediate Code Generation
Question: Generate Three-Address Code (TAC) for the statement x = (a + b)
* (c - d / e).
Answer:
Three-Address Code (TAC) is an intermediate representation where each
instruction has at most three operands. We use temporary variables (t1, t2,
etc.) to store intermediate results.
The statement is x = (a + b) * (c - d / e).
The TAC is generated as follows:
1. t1 = d / e
2. t2 = c - t1
3. t3 = a + b
4. t4 = t3 * t2
5. x = t4
Representation as Quadruples:
Quadruples are a common way to implement TAC, using four fields:
(operator, arg1, arg2, result).
Operator   Arg1   Arg2   Result
/          d      e      t1
-          c      t1     t2
+          a      b      t3
*          t3     t2     t4
=          t4     -      x
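The quadruple table maps naturally onto an array of records. The following is a minimal sketch in C, assuming single-character operators and "-" as the marker for an unused argument; the names Quad, quads, format_quad, and print_quads are illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* One quadruple: (operator, arg1, arg2, result). */
typedef struct {
    char op;             /* '+', '-', '*', '/', or '=' for a plain copy */
    const char *arg1;
    const char *arg2;    /* "-" when the operator needs no second argument */
    const char *result;
} Quad;

/* The quadruples for x = (a + b) * (c - d / e), in evaluation order. */
const Quad quads[] = {
    {'/', "d",  "e",  "t1"},
    {'-', "c",  "t1", "t2"},
    {'+', "a",  "b",  "t3"},
    {'*', "t3", "t2", "t4"},
    {'=', "t4", "-",  "x"},
};
const int quad_count = sizeof quads / sizeof quads[0];

/* Writes one quadruple back into buf as a TAC line. */
void format_quad(const Quad *q, char *buf, size_t size) {
    if (q->op == '=')
        snprintf(buf, size, "%s = %s", q->result, q->arg1);
    else
        snprintf(buf, size, "%s = %s %c %s", q->result, q->arg1, q->op, q->arg2);
}

/* Prints the whole table as numbered TAC lines. */
void print_quads(void) {
    char line[64];
    for (int i = 0; i < quad_count; i++) {
        format_quad(&quads[i], line, sizeof line);
        printf("%d. %s\n", i + 1, line);
    }
}
```

Calling print_quads() from main reproduces the five numbered TAC lines listed earlier in this answer.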
4. Code Optimization
Question: Apply Common Subexpression Elimination to optimize the
following code block:
a = b * c + d;
e = b * c * f;
Answer:
Common Subexpression Elimination is an optimization technique that
finds instances of identical expressions and replaces them with a single
variable holding the computed value.
1. Original Three-Address Code:
t1 = b * c
a = t1 + d
t2 = b * c
t3 = t2 * f
e = t3
2. Analysis:
The expression b * c is computed twice. This is a "common subexpression".
3. Optimized Three-Address Code:
We compute b * c only once and store its result in t1. We then reuse t1 for
the second computation.
t1 = b * c
a = t1 + d
t3 = t1 * f // Replaced t2 with t1
e = t3
This optimized code computes b * c once instead of twice, saving one multiplication. ⚡
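The detection step can be sketched as a small pass over one basic block. This is only an illustration under stated assumptions: Instr and eliminate_cse are hypothetical names, operands are compared textually (so b * c and c * b are not matched), and the repeated computation is rewritten into a copy of the earlier result (t2 = t1). Replacing the later uses of t2 with t1 and dropping the copy is a separate copy-propagation step that is not shown.

```c
#include <string.h>

#define NAME_LEN 8

/* result = arg1 op arg2; op == '=' is a plain copy and arg2 is "". */
typedef struct {
    char result[NAME_LEN];
    char arg1[NAME_LEN];
    char arg2[NAME_LEN];
    char op;
} Instr;

/* Returns 1 if any instruction in [from, to) assigns to name. */
static int assigned_between(const Instr *code, int from, int to, const char *name) {
    for (int k = from; k < to; k++)
        if (strcmp(code[k].result, name) == 0)
            return 1;
    return 0;
}

/* Local CSE over one basic block. Returns how many instructions were rewritten. */
int eliminate_cse(Instr *code, int n) {
    int rewritten = 0;
    for (int i = 1; i < n; i++) {
        if (code[i].op == '=')
            continue;                       /* a copy computes nothing to share */
        for (int j = i - 1; j >= 0; j--) {
            if (code[j].op != code[i].op ||
                strcmp(code[j].arg1, code[i].arg1) != 0 ||
                strcmp(code[j].arg2, code[i].arg2) != 0)
                continue;
            /* The operands and the earlier result must be unchanged in between. */
            if (assigned_between(code, j, i, code[i].arg1) ||
                assigned_between(code, j, i, code[i].arg2) ||
                assigned_between(code, j + 1, i, code[j].result))
                continue;
            code[i].op = '=';               /* reuse the earlier value */
            strcpy(code[i].arg1, code[j].result);
            code[i].arg2[0] = '\0';
            rewritten++;
            break;
        }
    }
    return rewritten;
}
```

Run on the five original instructions, the pass rewrites only the third one, turning t2 = b * c into t2 = t1.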
5. Target Code Generation
Question: Generate simple assembly-like code for the following Three-Address Code:
t1 = a - b
t2 = t1 + c
Answer:
Target Code Generation is the final phase of the compiler, where
intermediate code is translated into machine-understandable code (like
assembly). We'll assume a simple register-based machine.
Three-Address Code   Assembly-like Code   Description
t1 = a - b           MOV a, R0            Move the value of a into Register R0.
                     SUB b, R0            Subtract the value of b from Register R0.
t2 = t1 + c          ADD c, R0            Add c to R0. The result t2 is now in R0.
                     MOV R0, t2           (Optional) Store the final result in t2.
Final Assembly Code:
MOV a, R0
SUB b, R0
ADD c, R0
MOV R0, t2
This sequence uses a single register (R0) to perform the calculations
efficiently.
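The translation can be sketched as a tiny generator that remembers which value currently sits in R0. It is a sketch under assumptions, not a real code generator: the machine has one register, the instruction format is the "OP source, destination" form used in this answer, MUL and DIV are assumed to exist, and the names Tac, emit, and generate are illustrative.

```c
#include <stdio.h>
#include <string.h>

/* One TAC instruction: result = arg1 op arg2. */
typedef struct {
    const char *result, *arg1, *arg2;
    char op;
} Tac;

/* Mnemonic for an operator; MUL and DIV are assumed to exist on the machine. */
static const char *mnemonic(char op) {
    switch (op) {
    case '+': return "ADD";
    case '-': return "SUB";
    case '*': return "MUL";
    case '/': return "DIV";
    default:  return "???";    /* operator not covered by this sketch */
    }
}

/* Appends "OP src, dst\n" to buf if there is room left. */
static void emit(char *buf, size_t size, const char *op,
                 const char *src, const char *dst) {
    size_t used = strlen(buf);
    if (used < size)
        snprintf(buf + used, size - used, "%s %s, %s\n", op, src, dst);
}

/* Translates n TAC instructions into single-register code written to out. */
void generate(const Tac *code, int n, char *out, size_t size) {
    const char *in_r0 = "";    /* name whose value R0 currently holds */
    if (size == 0)
        return;
    out[0] = '\0';
    for (int i = 0; i < n; i++) {
        int left_ready = strcmp(in_r0, code[i].arg1) == 0;
        if (!left_ready) {
            if (in_r0[0] != '\0')
                emit(out, size, "MOV", "R0", in_r0);      /* save displaced result */
            emit(out, size, "MOV", code[i].arg1, "R0");   /* load left operand */
        }
        /* If the right operand is the value still sitting in R0, use R0 itself. */
        const char *src = (left_ready && strcmp(in_r0, code[i].arg2) == 0)
                              ? "R0" : code[i].arg2;
        emit(out, size, mnemonic(code[i].op), src, "R0");
        in_r0 = code[i].result;
    }
    if (in_r0[0] != '\0')
        emit(out, size, "MOV", "R0", in_r0);              /* store the final result */
}
```

For the two instructions in the question, the generated text is the same four-line sequence as the final assembly code. Because t1 is already in R0 when the second instruction starts, no reload is emitted.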
An Alternative Explanation of the Same Topics
🧩 Compiler Design Laboratory — Question & Answer Set
Experiment 1: Lexical Analyzer using C
Q: Write a C program to implement a simple lexical analyzer for tokenizing a given expression.
A:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
int main(void) {
    char expr[100];
    int i = 0;
    printf("Enter an expression: ");
    /* fgets limits the input to the buffer size, which gets cannot do */
    if (fgets(expr, sizeof expr, stdin) == NULL)
        return 1;
    while (expr[i] != '\0') {
        if (isspace((unsigned char)expr[i])) {
            i++;
            continue;
        }
        if (isalpha((unsigned char)expr[i])) {
            printf("Identifier: ");
            while (isalnum((unsigned char)expr[i])) {
                printf("%c", expr[i]);
                i++;
            }
            printf("\n");
        } else if (isdigit((unsigned char)expr[i])) {
            printf("Number: ");
            while (isdigit((unsigned char)expr[i])) {
                printf("%c", expr[i]);
                i++;
            }
            printf("\n");
        } else {
            printf("Operator: %c\n", expr[i]);
            i++;
        }
    }
    return 0;
}
🧠 Concept: Breaks input into tokens (identifiers, numbers, operators).
Experiment 2: Lexical Analyzer using LEX Tool
Q: Write a LEX program to count the number of words, lines, and characters in a given text.
A:
%{
#include <stdio.h>
int lines = 0, words = 0, chars = 0;
%}
%%
\n { lines++; chars++; }
[ \t]+ { chars += yyleng; } /* count every blank in the run */
[^ \t\n]+ { words++; chars += yyleng; }
%%
int yywrap(void) { return 1; }
int main(void) {
    yylex();
    printf("Lines: %d\nWords: %d\nCharacters: %d\n", lines, words, chars);
    return 0;
}
🧠 Concept: Uses pattern matching to analyze text.
Experiment 3: Recognize Valid Identifier
Q: Write a LEX program to identify valid identifiers.
A:
%{
#include <stdio.h>
%}
%%
[a-zA-Z_][a-zA-Z0-9_]* { printf("Valid Identifier: %s\n", yytext); }
[0-9][a-zA-Z0-9_]* { printf("Invalid Identifier (starts with digit): %s\n", yytext); }
.|\n { /* ignore */ }
%%
int yywrap(void) { return 1; }
int main(void) {
    yylex();
    return 0;
}
🧠 Concept: Identifiers start with a letter or underscore, followed by letters, digits, or underscores.
Experiment 4: Keyword Recognition
Q: Write a LEX program to recognize C keywords.
A:
%{
#include <stdio.h>
%}
%%
"if"|"else"|"while"|"do"|"for"|"int"|"float"|"char"|"return" { printf("Keyword: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]* { printf("Identifier: %s\n", yytext); }
.|\n { }
%%
int yywrap(void) { return 1; }
int main(void) {
    yylex();
    return 0;
}
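🧠 Concept: Matches reserved keywords with literal patterns. The keyword rule is listed before the identifier rule because LEX picks the earlier rule when two patterns match the same text.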
Experiment 5: Arithmetic Expression Evaluation (YACC)
Q: Write a YACC program to evaluate arithmetic expressions.
A (lex + yacc):
lex file (expr.l):
%{
#include <stdlib.h> /* atoi */
#include "y.tab.h"
%}
%%
[0-9]+ { yylval = atoi(yytext); return NUM; }
[\t ]+ ;
\n { return 0; }
. { return yytext[0]; }
%%
int yywrap(void) { return 1; }
yacc file (expr.y):
%{
#include <stdio.h>
#include <stdlib.h>
int yylex(void);
void yyerror(const char *msg);
%}
%token NUM
%left '+' '-'
%left '*' '/'
%%
S: E { printf("Result = %d\n", $1); }
 ;
E: E '+' E { $$ = $1 + $3; }
 | E '-' E { $$ = $1 - $3; }
 | E '*' E { $$ = $1 * $3; }
 | E '/' E { if ($3 == 0) { yyerror("division by zero"); YYABORT; } $$ = $1 / $3; }
 | NUM { $$ = $1; }
 ;
%%
int main(void) {
    printf("Enter expression: ");
    yyparse();
    return 0;
}
void yyerror(const char *msg) {
    (void)msg; /* the message text is not shown */
    printf("Invalid Expression\n");
}
🧠 Concept: Demonstrates syntax parsing and evaluation.
Experiment 6: Operator Precedence Parser
Q: Write a C program to implement an operator precedence parser.
A (outline):
#include <stdio.h>
#include <string.h>
char stack[20];
int top = -1;
void push(char c) { stack[++top] = c; }
char pop(void) { return stack[top--]; }
int main(void) {
    char input[20];
    printf("Enter expression: ");
    if (scanf("%19s", input) != 1) /* width limit prevents overflow */
        return 1;
    printf("\nParsing steps:\n");
    for (size_t i = 0; i < strlen(input); i++) {
        push(input[i]); /* shift the next symbol */
        printf("Stack: %s\n", stack);
    }
    printf("\nExpression Parsed Successfully!\n");
    return 0;
}
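🧠 Concept: Demonstrates the operator precedence parsing process. This outline only traces the shift steps; a full operator precedence parser also compares the precedence of the symbol on top of the stack with the next input symbol to decide when to reduce.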
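The shift/reduce decision itself can be sketched as follows. This is an illustration under assumptions, not a complete parser: operands are single letters or digits, only + - * / are supported (no parentheses or spaces), all operators are left-associative, and each reduction is recorded by writing the operator to a postfix string. The names prec and op_precedence_parse are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Precedence level of an operator; 0 means "not an operator". */
static int prec(char op) {
    switch (op) {
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:            return 0;
    }
}

/*
 * Parses input and writes the order of reductions to out in postfix form.
 * Returns 1 if the expression is accepted, 0 on a syntax error
 * (out is not meaningful in that case).
 */
int op_precedence_parse(const char *input, char *out, size_t size) {
    char ops[64];                 /* operator stack; ops[0] is the '$' marker */
    int top = 0;
    size_t n = 0, len = strlen(input);
    int expect_operand = 1;

    ops[0] = '$';
    for (size_t i = 0; i <= len; i++) {
        int at_end = (i == len);
        char c = at_end ? '$' : input[i];
        if (!at_end && isalnum((unsigned char)c)) {
            if (!expect_operand || n + 1 >= size)
                return 0;
            out[n++] = c;         /* shift the operand and reduce it at once */
            expect_operand = 0;
            continue;
        }
        if (expect_operand || (!at_end && prec(c) == 0))
            return 0;             /* missing operand or unknown symbol */
        /* Stack top has higher or equal precedence: reduce E op E. */
        while (top > 0 && prec(ops[top]) >= prec(c)) {
            if (n + 1 >= size)
                return 0;
            out[n++] = ops[top--];
        }
        if (at_end)
            break;
        if (top + 1 >= (int)sizeof ops)
            return 0;
        ops[++top] = c;           /* stack top has lower precedence: shift */
        expect_operand = 1;
    }
    out[n] = '\0';
    return 1;
}
```

For the input a+b*c, the * is shifted on top of + because it binds tighter, so b*c is reduced first and the recorded order is abc*+.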
Experiment 7: Recursive Descent Parser
Q: Write a recursive descent parser for the grammar
E → E+T | T
T → T*F | F
F → (E) | id
A:
#include <stdio.h>
#include <string.h>
char input[10];
int i = 0;
int error = 0; /* set when a syntax error is found */
void E(void);
void T(void);
void F(void);
void E(void) {
    T();
    if (input[i] == '+') {
        i++;
        E();
    }
}
void T(void) {
    F();
    if (input[i] == '*') {
        i++;
        T();
    }
}
void F(void) {
    if (input[i] == '(') {
        i++;
        E();
        if (input[i] == ')')
            i++;
        else {
            printf("Missing )\n");
            error = 1;
        }
    } else if (input[i] == 'i' && input[i+1] == 'd') {
        i += 2;
    } else {
        printf("Error at position %d\n", i);
        error = 1;
    }
}
int main(void) {
    printf("Enter the string: ");
    if (scanf("%9s", input) != 1) /* width limit prevents overflow */
        return 1;
    E();
    if (!error && input[i] == '\0')
        printf("String accepted\n");
    else
        printf("String rejected\n");
    return 0;
}
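🧠 Concept: Simulates a top-down parsing approach. The code uses the right-recursive forms E → T [+ E] and T → F [* T], because calling E() first inside E() for the left-recursive rule E → E+T would recurse forever.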
Experiment 8: Intermediate Code Generation
Q: Write a program to generate three-address code for simple expressions.
A:
Input: a = b + c * d
Output:
t1 = c * d
t2 = b + t1
a = t2
🧠 Concept: Converts high-level expressions into 3-address intermediate form.
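One possible program for this experiment is a small recursive descent translator. It is a sketch under assumptions: the statement has the form "v = expression", operands are single letters or digits, only + - * / and parentheses are supported, and the names generate_tac, emit_binary, and append are illustrative.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static const char *src;    /* next unread input character */
static char *dst;          /* buffer that receives the TAC lines */
static size_t dst_size;
static int temp_count;     /* temporaries created so far */
static int failed;         /* set when the input does not fit the grammar */

static void skip_spaces(void) { while (*src == ' ') src++; }

/* Appends one line to the output buffer if it still fits. */
static void append(const char *line) {
    size_t used = strlen(dst);
    if (used < dst_size)
        snprintf(dst + used, dst_size - used, "%s\n", line);
}

/* Emits "tN = place op rhs" and stores tN back into place. */
static void emit_binary(char *place, char op, const char *rhs) {
    char line[64], temp[16];
    snprintf(temp, sizeof temp, "t%d", ++temp_count);
    snprintf(line, sizeof line, "%s = %s %c %s", temp, place, op, rhs);
    append(line);
    strcpy(place, temp);
}

static void expr(char *place);

/* factor -> letter | digit | '(' expr ')' */
static void factor(char *place) {
    skip_spaces();
    place[0] = '\0';
    if (*src == '(') {
        src++;
        expr(place);
        skip_spaces();
        if (*src == ')') src++;
        else failed = 1;
    } else if (isalnum((unsigned char)*src)) {
        place[0] = *src++;
        place[1] = '\0';
    } else {
        failed = 1;
    }
}

/* term -> factor { ('*' | '/') factor } */
static void term(char *place) {
    char rhs[16];
    factor(place);
    for (skip_spaces(); !failed && (*src == '*' || *src == '/'); skip_spaces()) {
        char op = *src++;
        factor(rhs);
        if (!failed) emit_binary(place, op, rhs);
    }
}

/* expr -> term { ('+' | '-') term } */
static void expr(char *place) {
    char rhs[16];
    term(place);
    for (skip_spaces(); !failed && (*src == '+' || *src == '-'); skip_spaces()) {
        char op = *src++;
        term(rhs);
        if (!failed) emit_binary(place, op, rhs);
    }
}

/* Translates "v = expression" into TAC lines in out. Returns 1 on success. */
int generate_tac(const char *statement, char *out, size_t size) {
    char place[16], line[64], target;
    if (size == 0) return 0;
    src = statement; dst = out; dst_size = size;
    temp_count = 0; failed = 0;
    out[0] = '\0';
    skip_spaces();
    if (!isalpha((unsigned char)*src)) return 0;
    target = *src++;
    skip_spaces();
    if (*src != '=') return 0;
    src++;
    expr(place);
    skip_spaces();
    if (failed || *src != '\0') return 0;
    snprintf(line, sizeof line, "%c = %s", target, place);
    append(line);
    return 1;
}
```

Passing "a = b + c * d" to generate_tac fills the buffer with the three lines shown in the output above. Because term is parsed before expr combines its operands, c * d receives the first temporary.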
Experiment 9: Symbol Table Construction
Q: Write a C program to construct a symbol table.
A:
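#include <stdio.h>
#include <string.h>
#define MAX_SYMBOLS 20
struct symbol {
    char name[20];
    char type[10];
} table[MAX_SYMBOLS];
int n = 0;
void insert(const char name[], const char type[]) {
    if (n >= MAX_SYMBOLS) {
        printf("Symbol table full\n");
        return;
    }
    /* snprintf truncates names that are too long instead of overflowing */
    snprintf(table[n].name, sizeof table[n].name, "%s", name);
    snprintf(table[n].type, sizeof table[n].type, "%s", type);
    n++;
}
void display(void) {
    printf("\nSymbol Table:\n");
    printf("Name\tType\n");
    for (int i = 0; i < n; i++)
        printf("%s\t%s\n", table[i].name, table[i].type);
}
int main(void) {
    insert("a", "int");
    insert("b", "float");
    insert("sum", "int");
    display();
    return 0;
}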
🧠 Concept: Maintains variable name/type mapping.
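A name/type mapping is only useful if it can be searched, for example to reject a second declaration of the same name. The following sketch adds a linear-search lookup; lookup and insert_unique are hypothetical helper names, and a real compiler would more likely use a hash table.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 20

struct symbol {
    char name[20];
    char type[10];
};

static struct symbol table[MAX_SYMBOLS];
static int count = 0;

/* Returns the index of name in the table, or -1 if it is absent. */
int lookup(const char *name) {
    for (int i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Adds name with type; returns its index, or -1 if it is a duplicate
   or the table is full. */
int insert_unique(const char *name, const char *type) {
    if (lookup(name) != -1 || count >= MAX_SYMBOLS)
        return -1;
    snprintf(table[count].name, sizeof table[count].name, "%s", name);
    snprintf(table[count].type, sizeof table[count].type, "%s", type);
    return count++;
}
```

With this version, inserting "a" a second time returns -1 and leaves the first entry untouched.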
Experiment 10: Code Optimization (Constant Folding)
Q: Write a simple program to perform constant folding optimization.
Example:
Input: x = 4 + 5
Output: x = 9
A (Pseudo code):
if (both operands of expr are constants)
    replace expr with its computed value   /* x = 4 + 5 becomes x = 9 */
🧠 Concept: Pre-computes constant expressions at compile time.
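A minimal runnable sketch of the idea, assuming statements of the exact form "name = integer operator integer" and results that fit in an int; fold_statement and keep are illustrative names. Anything that does not match this form, including division by zero, is copied through unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Copies stmt unchanged into out and reports "not folded". */
static int keep(const char *stmt, char *out, size_t size) {
    snprintf(out, size, "%s", stmt);
    return 0;
}

/*
 * Folds "name = <int> <op> <int>" into "name = <value>".
 * Returns 1 if the statement was folded, 0 if it was left as it is.
 */
int fold_statement(const char *stmt, char *out, size_t size) {
    char name[32], op;
    int lhs, rhs, value, end = -1;

    /* %n records how much input was consumed, so trailing text is detected. */
    if (sscanf(stmt, " %31[A-Za-z0-9_] = %d %c %d %n",
               name, &lhs, &op, &rhs, &end) != 4
        || end < 0 || stmt[end] != '\0')
        return keep(stmt, out, size);

    switch (op) {
    case '+': value = lhs + rhs; break;
    case '-': value = lhs - rhs; break;
    case '*': value = lhs * rhs; break;
    case '/':
        if (rhs == 0)
            return keep(stmt, out, size);   /* leave the error for run time */
        value = lhs / rhs;
        break;
    default:
        return keep(stmt, out, size);       /* unknown operator */
    }
    snprintf(out, size, "%s = %d", name, value);
    return 1;
}
```

Given "x = 4 + 5" the function writes "x = 9", while "x = y + 5" is returned unchanged because y is not a constant.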