marcohu/rules_antlr

ANTLR Rules for Bazel

These build rules are used for processing ANTLR grammars with Bazel.

Support Matrix

antlr4 antlr3 antlr2
C Gen Gen
C++ Gen + Runtime Gen + Runtime Gen + Runtime
Go Gen + Runtime
Java Gen + Runtime Gen + Runtime Gen + Runtime
ObjC Gen
Python2 Gen + Runtime Gen + Runtime Gen + Runtime
Python3 Gen + Runtime Gen + Runtime

Gen: Code Generation
Runtime: Runtime Library bundled

Setup

Add the following to your WORKSPACE file to include the external repository and load the necessary Java dependencies for the antlr rule:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_antlr",
    sha256 = "26e6a83c665cf6c1093b628b3a749071322f0f70305d12ede30909695ed85591",
    strip_prefix = "rules_antlr-0.5.0",
    urls = ["https://github.com/marcohu/rules_antlr/archive/0.5.0.tar.gz"],
)

load("@rules_antlr//antlr:repositories.bzl", "rules_antlr_dependencies")
rules_antlr_dependencies("4.8")

More detailed instructions can be found in the Setup document.

Build Rules

To add ANTLR code generation to your BUILD files, you first have to load the extension for the desired ANTLR release.

For ANTLR 4:

load("@rules_antlr//antlr:antlr4.bzl", "antlr")

For ANTLR 3:

load("@rules_antlr//antlr:antlr3.bzl", "antlr")

For ANTLR 2:

load("@rules_antlr//antlr:antlr2.bzl", "antlr")

You can then invoke the rule:

antlr(
    name = "parser",
    srcs = ["Hello.g4"],
    package = "hello.world",
)

It's also possible to use different ANTLR versions in the same file via aliasing:

load("@rules_antlr//antlr:antlr4.bzl", antlr4 = "antlr")
load("@rules_antlr//antlr:antlr3.bzl", antlr3 = "antlr")

antlr4(
    name = "parser",
    srcs = ["Hello.g4"],
    package = "hello.world",
)

antlr3(
    name = "old_parser",
    srcs = ["OldHello.g"],
    package = "hello.world",
)

Refer to the rule reference documentation for the available rules and attributes.

Basic Java Example

Suppose you have the following directory structure for a simple ANTLR project:

HelloWorld/
└── src
    └── main
        └── antlr4
            ├── BUILD
            └── Hello.g4
WORKSPACE

HelloWorld/src/main/antlr4/Hello.g4

grammar Hello;
r  : 'hello' ID;
ID : [a-z]+;
WS : [ \t\r\n]+ -> skip;

To add code generation to a BUILD file, load the desired build rule and create a new antlr target. The output (here a .srcjar archive containing the generated source files) can be used as input for other rules.

HelloWorld/src/main/antlr4/BUILD

load("@rules_antlr//antlr:antlr4.bzl", "antlr")

antlr(
    name = "parser",
    srcs = ["Hello.g4"],
    package = "hello.world",
    visibility = ["//visibility:public"],
)

Building the project generates the lexer/parser files:

$ bazel build //HelloWorld/...
INFO: Analyzed 2 targets (23 packages loaded, 400 targets configured).
INFO: Found 2 targets...
INFO: Elapsed time: 15.295s, Critical Path: 14.37s
INFO: 8 processes: 6 processwrapper-sandbox, 2 worker.
INFO: Build completed successfully, 12 total actions

To compile the generated files, add the generating target as input for the java_library or java_binary rules and reference the required ANTLR dependency:

load("@rules_java//java:defs.bzl", "java_library")

java_library(
    name = "HelloWorld",
    srcs = [":parser"],
    deps = ["@antlr4_runtime//jar"],
)
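
The same pattern applies when the generated parser should end up in an executable via java_binary. The following is only a sketch: the target name hello, the source file Main.java, and the main class hello.world.Main are hypothetical and not part of this repository.

```starlark
load("@rules_java//java:defs.bzl", "java_binary")

java_binary(
    name = "hello",                   # hypothetical target name
    srcs = [
        "Main.java",                  # hypothetical hand-written entry point
        ":parser",                    # generated sources from the antlr target
    ],
    main_class = "hello.world.Main",  # hypothetical; must match the class in Main.java
    deps = ["@antlr4_runtime//jar"],  # ANTLR runtime, as in the java_library example
)
```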

Refer to the examples directory for further samples.

Project Layout

The ANTLR rules store all generated source files in a target-name.srcjar zip archive below your workspace's bazel-bin folder. Depending on the ANTLR version, there are three ways to control namespacing and directory structure for generated code, each with its own pros and cons.

  1. The package rule attribute (antlr4 only). Setting the namespace via the package attribute generates the corresponding target-language-specific namespacing code (where applicable) and places the generated source files below a matching directory structure. To suppress the directory structure, set the layout attribute to flat.
    This option is very expressive and allows language-independent grammars. However, it is only available with ANTLR 4, requires a separate run for each namespace, might complicate refactoring, and conflicts with language-specific code in @header {...} sections, because the two are mutually exclusive.

  2. Language-specific application code in the grammar's @header {...} section. To suppress the corresponding directory structure, set the layout attribute to flat.
    This option allows different namespaces to be processed in a single run and requires no changes to build files upon refactoring. However, it ties grammars to a specific language and conflicts with the package attribute, because the two are mutually exclusive.

  3. The project layout (antlr4 only). Putting your grammars below a common project directory lets the rule derive the namespace and the corresponding directory structure for the generated source files from the relative project path. The ANTLR rules use different defaults for the different target languages (see below), but you can define the root directory yourself via the layout attribute.
    This option allows different namespaces to be processed in a single run without language coupling. However, it requires conformity to a specific (albeit configurable) project layout, and certain languages require the layout attribute.
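
As a rough sketch of the first option, the target below sets the namespace via package and suppresses the directory structure with the flat layout. The target name flat_parser is hypothetical; the attribute names and values are taken from the description above, and the exact behavior should be checked against the rule reference.

```starlark
load("@rules_antlr//antlr:antlr4.bzl", "antlr")

antlr(
    name = "flat_parser",     # hypothetical target name
    srcs = ["Hello.g4"],
    package = "hello.world",  # namespace emitted into the generated code
    layout = "flat",          # do not create a hello/world/ directory structure
)
```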

Common Project Directories

The antlr4 rule supports a common directory layout from which it derives namespacing based on the relative directory structure. The table below lists the default paths for the different target languages. The version number at the end of the directory name is optional.

Language                     Default Directory
C                            src/antlr4
Cpp                          src/antlr4
CSharp, CSharp2, CSharp3     src/antlr4
Go                           (none)
Java                         src/main/antlr4
JavaScript                   src/antlr4
Python, Python2, Python3     src/antlr4
Swift                        (none)

For languages with no default, you have to set the root directory explicitly via the layout attribute.
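
For a language without a default directory, a target might look like the following sketch. It assumes that the antlr4 rule's language attribute selects the Go target and that the grammars live below a hypothetical src/grammars directory. The directory, the grammar path, and the target name are illustrative only, so check the rule reference before relying on them.

```starlark
load("@rules_antlr//antlr:antlr4.bzl", "antlr")

antlr(
    name = "go_parser",                      # hypothetical target name
    srcs = ["src/grammars/hello/Hello.g4"],  # hypothetical grammar location
    language = "Go",                         # assumed attribute for choosing the target language
    layout = "src/grammars",                 # root directory; the namespace is expected to follow the path below it
)
```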