CSC3120 Tutorial 2 - JavaCC: JavaCC Grammar File (BantamJava - JJ)
Introduction
JavaCC stands for Java Compiler Compiler. It is a parser generator [like TOLD, covered in
CSC3180 last term]. It describes languages in BNF with regular expressions, and it is
implemented purely in Java.
[Figure: javacc and javac build the Bantam Java Compiler (BantamJava.class), which takes
Bantam Java source code (test.btm) as input and outputs MIPS code (test.s).]
Environment Setup
Using JavaCC is not as simple as using TOLD. Here we provide three ways to set up a proper
working environment: Command-Line Interface (CLI), NetBeans IDE and Eclipse IDE. We
assume you have already set up the Java environment properly and that you work in the lab.
We encourage you to install the software on either the H drive or a USB drive.
4. If the environment is set up correctly, you should be able to do something like what is
shown in the screenshots.
[Screenshots: under Windows; under SPARC]
NetBeans IDE
1. Download the latest version of NetBeans IDE from
http://www.netbeans.org/downloads/. Select “OS Independent Zip” as the platform,
then download the Java SE version.
6. Download the latest version of the JavaCC plugin for NetBeans IDE from
http://plugins.netbeans.org/PluginPortal/faces/PluginDetailPage.jsp?pluginid=5402
Then switch to the Miscellaneous tab and find the JavaCC sub-tab. Fill in JavaCC Home in
the textbox (e.g. H:\javacc-4.2\lib) and click OK. Note that this differs from the CLI
setup, which locates JavaCC at H:\javacc-4.2\bin, because NetBeans IDE requires the
location of the JAR file rather than that of the executable.
10. Create a project to contain the JavaCC file (.jj).
First, select File > New Project... from the menu.
Choose Java from the Categories and Java Applications from Projects, then click Next.
Fill in the Project Name (e.g. TestJavaCC) and Location (e.g. H:\NetBeansProjects), then
click Finish. A project is created.
Then create a new file inside the project. Choose Other from Categories and
Empty JavaCC file from File Types, then click Next. (If you cannot find this file type,
the JavaCC plugin is probably not installed properly.)
Provide the file name of the JavaCC file (e.g. newJavaCCTemplate) and the folder (e.g.
src), then click Finish.
11. Finally, edit the Ant file of the project (build.xml, found in the Files tab) by adding
the following target, so that we can run JavaCC (i.e. generate Java files by parsing the .jj file):
<target name="javacc" depends="-init-check">
    <echo>${javacc.outdir}</echo>
    <javacc outputdirectory="${javacc.outdir}"
            javacchome="${javacc.home}"
            target="${javacc.file}"
            static="true"/>
</target>
12. Right-click on the JavaCC file and select JavaCC from the pop-up menu.
4. Extract the zip file into the Eclipse folder. The zip file contains
two folders, “features” and “plugins”. These two folders also exist
in the Eclipse folder (figure on the right).
Choose JavaCC > JavaCC Template File from the tree, and click Next.
Give the JavaCC file a name (e.g. new_file) and click Finish.
Grammar Syntax
To allow JavaCC to generate a parser for a particular language, we need to express the
language's grammar [cross-reference to week #1 lecture notes for presentation pp.53-62].
We store it in a JavaCC grammar file (with the .jj extension). The file contains three
parts, in order: JavaCC Options, Parser Declaration and Productions.
JavaCC Options
This part is optional. Its basic structure is as follows:
options {
    JDK_VERSION = "1.5";
}
Usually we do not need it, but the example above is required for the Eclipse JavaCC Plugin.
For more details on options, see
https://javacc.dev.java.net/doc/javaccgrm.htm#prod2.
Parser Declaration
This part declares the entry point of the generated parser. Its basic structure is as follows:
PARSER_BEGIN(parser_name)
...
class parser_name ... {
}
...
PARSER_END(parser_name)
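For a parser named ParserName, JavaCC generates three files accordingly: ParserName.java
(the generated parser), ParserNameTokenManager.java (the lexer), and
ParserNameConstants.java (constants such as the token kinds). It also emits a few
supporting classes such as Token.java and ParseException.java.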
Productions
There are four kinds of productions: Regular Expression Productions for defining grammar
tokens, BNF Productions for defining the grammar, JAVACODE Productions for writing grammar
rules in Java code, and Token Manager Declarations for extending the Token Manager. For
more details on productions, see
https://javacc.dev.java.net/doc/javaccgrm.htm#prod5.
We focus on the first kind, regular expression productions. There are four kinds of regular
expression productions, but for now we only need two of them: SKIP, which makes the token
manager throw away the matched string, and TOKEN, which creates a Token Java object and
sends it to the parser. Their basic structures are as follows:
SKIP : {
    " " | "\r" | "\t" | "\n"
}
TOKEN : {
    <PLUS: "+"> | <MINUS: "-"> | <NUMBER: (["0"-"9"])+>
}
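To make the difference between SKIP and TOKEN concrete, here is a minimal hand-written
Java sketch of roughly what a token manager does with the two productions above. It is
not the code JavaCC generates: the names SketchTokenManager, Kind, Tok and tokenize are
hypothetical and exist only for this illustration, and reporting bad input with
IllegalArgumentException is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written stand-in for a generated token manager (illustration only). */
final class SketchTokenManager {

    /** Token kinds mirroring the TOKEN names in the grammar above. */
    enum Kind { PLUS, MINUS, NUMBER }

    /** A matched token: its kind plus the matched text. */
    static final class Tok {
        final Kind kind;
        final String image;

        Tok(Kind kind, String image) {
            this.kind = kind;
            this.image = image;
        }

        @Override
        public String toString() {
            return kind + "(" + image + ")";
        }
    }

    /** Splits the input into tokens, discarding everything SKIP would match. */
    static List<Tok> tokenize(String input) {
        List<Tok> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\r' || c == '\t' || c == '\n') {
                i++;                                  // SKIP: matched, then thrown away
            } else if (c == '+') {
                tokens.add(new Tok(Kind.PLUS, "+"));  // TOKEN <PLUS: "+">
                i++;
            } else if (c == '-') {
                tokens.add(new Tok(Kind.MINUS, "-")); // TOKEN <MINUS: "-">
                i++;
            } else if (c >= '0' && c <= '9') {
                int start = i;                        // TOKEN <NUMBER: (["0"-"9"])+>
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;                              // take the longest run of digits
                }
                tokens.add(new Tok(Kind.NUMBER, input.substring(start, i)));
            } else {
                // Assumption of this sketch: reject anything no production matches.
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Whitespace never reaches the output; only TOKEN matches do.
        System.out.println(tokenize("12 + 3\t- 45\n"));
    }
}
```

Running main prints [NUMBER(12), PLUS(+), NUMBER(3), MINUS(-), NUMBER(45)]: the spaces,
tab and newline are consumed but produce nothing, while each TOKEN match becomes an object
that a parser could consume.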
All characters and strings must be in quotation marks (so treat everything as a string):
Character list