Skip to content

Commit

Permalink
Update README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
BjoernLoetters committed Jan 31, 2025
1 parent dadfe14 commit 7cfa557
Showing 1 changed file with 85 additions and 32 deletions.
117 changes: 85 additions & 32 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,61 +1,82 @@
![Test Status](https://github.com/BjoernLoetters/Java-Parser-Combinators/actions/workflows/test.yml/badge.svg?branch=main)
![Test Status](https://img.shields.io/github/v/release/BjoernLoetters/Java-Parser-Combinators?label=Release&logo=github)

# Java Parser Combinators
# Jar Jar Parse
*"Yousa needin’ a parser? Meesa help!"*

**Java Parser Combinators** is a lightweight library designed for the rapid prototyping of parsers in Java.
It aims to balance ease of use and functionality, deliberately placing less emphasis on performance (even though first tests show that we can parse ~ 130k lines of JSON data in around 1 second on a recent M3 chip).
### 🚀 What is Jar Jar Parse?
**Jar Jar Parse** (or **JJParse** for short) is a lightweight library designed for the rapid prototyping of parsers in Java.
It is highly inspired by the [Scala Parser Combinators](https://github.com/scala/scala-parser-combinators) and aims to balance ease of use and functionality, deliberately placing less emphasis on performance.
If you are new to the idea of parser combinators feel free to check out the [Section "What are Parser Combinators?"](#what-are-parser-combinators).
See [Section "Example"](#example) for a quick overview and [Section "Getting Started"](#getting-started) to install and use this library.
If you find this project interesting, please consider [contributing](#contributing) or sharing it with others.

### Parser Combinators
### 🎯 Features

In general, parser combinators are a powerful technique for constructing complex parsers by composing smaller, reusable parsers.
Each parser is responsible for recognizing a specific part of the input, and combinators are functions that combine these basic parsers into more complex ones (hence the name).
This approach allows the rapid development of parsers with all the benefits of ordinary program code.
So, instead of generating a parser using a third party tool and a grammar specification, we "program" the syntax of our language in our primary programming language.
- [x] Easy to use and fully typed ♥️
- [x] Supports backtracking by default
- [x] Scanner-less parsing with regular expressions
- [x] Built-in support for whitespace skipping
- [x] 🌟 Unicode 🦄 support with code point positions
- [x] Abstract parsers and inputs for advanced use cases
- [x] Tested and documented ✅

In Java Parser Combinators, parsers are implemented as functions:
They take an `Input` (which is basically a `String`) and produce a `Result<T>`.
Here, the type `T` indicates the type of the result.
For example:
- A character parser takes an `Input` and produces a `Result<Character>`
- A string parser takes an `Input` and produces a `Result<String>`
- A parser which concatenates a character with a string parser produces a `Result<Tuple<Character, String>>`
### 📖 Getting Started

##### Installation

At the moment, **JJParse** is only available as a `jar`-release via the [release page](https://github.com/BjoernLoetters/Java-Parser-Combinators/releases).
In the near future, it will also be available as a maven package, so stay tuned!

To install **JJParse**, follow these steps:
1. Download the latest `jar` file from the [release page](https://github.com/BjoernLoetters/Java-Parser-Combinators/releases).
2. Add the `jar` file to your project's classpath.
- In **IntelliJ IDEA**: Right-click the `jar` file → Select **"Add as Library ..."**
- In **Eclipse**: Right-click your project → **Build Path****Add External Archives ...** and select the `jar` file.
- If using **Gradle** or **Maven**, you can manually place the `jar` in a `libs/` directory and reference it in your configuration.
3. (Optional) Add the JavaDoc `jar` file to your environment.

Of course, a parse may also fail.
For this reason, a `Result<T>` may either be a `Success<T>` (containing the desired result) or a `Failure<T>` (containing an error message).
##### First Steps

### Getting Started
To start using **JJParse**, create a class that extends either `Parsing` or `StringParsing`:
- `Parsing` offers the core functionality and allows defining parsers for **arbitrary input types** (e.g., a custom token stream provided by a scanner).
- `StringParsing` is the best choice if you simply want to prototype a language. It provides utilities for **character-based and scanner-less parsing** (including regular expressions) and already skips any white space characters by default.

##### Installation
In general, a parser in **JJParse** is simply a function that transforms an `Input<T>` into a `Result<U>`.
Here, an instance of `Input<T>` can be seen as a stream of tokens of type `T`.
In case of `StringParsing`, this type `T` is fixed to be `Character`, while for `Parsing` it is up to you to choose an appropriate token type.
The `Result<U>` can either be a `Success<U>`, containing the parsed value of type `U`, or a `Failure<U>`, carrying an error message with further details.

At the moment, Java Parser Combinators are only available as a `jar`-release via the [release page](https://github.com/BjoernLoetters/Java-Parser-Combinators/releases).
In the near future, they will also be available as a maven package.
So, to get started just download the latest `jar`-release and add it to the classpath of your project.
In IntelliJ this can be done by right-clicking on the `jar`-file and selecting `Add as library ...`.
To parse an input, instantiate your parser class and call `parse(parser, input)` (to parse the whole input) or `parser.apply(input)` (to parse only a portion of the input).
For a quick start, also have a look at the [Section "Example"](#example).

##### Usage Example
### 🔍 Example

The example below provides a brief overview of the library. For a more detailed example, check out the subprojects in the [examples directory](examples).

```java
import static jjparse.StringParsing;
import jjparse.StringParsing;
import jjparse.input.Input;

// Extend 'StringParsing' for convenient character-based parsers.
public class MyParser extends StringParsing {

// A parser which parses an integer.
// A parser which parses a signed integer.
public Parser<Integer> number = regex("[+-]?[0-9]+").map(Integer::parseInt);

// A parser which parses additions.
// A parser which parses additions and returns their result.
public Parser<Integer> add = number.keepLeft(character('+')).and(number)
.map(product -> product.first() + product.second());

public static void main(final String[] arguments) {
// Create an input of characters with the name 'My Test Input' (for error reporting).
Input<Character> input = Input.of("My Test Input", "42 + 0");

// Use the 'add' parser to parse the above input. Note how a parser is
// just a function that takes an input and returns a parse result.
final MyParser parser = new MyParser();
// Use the 'add' parser to parse the above input.
MyParser parser = new MyParser();
Result<Integer> result = parser.parse(parser.add, input);

// The following statement prints 'Success: 42'.
switch (result) {
case Success<Integer> success:
System.out.printf("Success: %s\n", success.value);
Expand All @@ -69,12 +90,44 @@ public class MyParser extends StringParsing {
}
```

### Contributing
### 💡 What are Parser Combinators?

In traditional parsing approaches, commonly taught in university courses, we often rely on parser generators.
These tools take a grammar specification as input and generate a parser implementation based on it.
While the generated parsers are typically fast, this method has some downsides.

Firstly, modifying the grammar requires regenerating the entire parser, which can be cumbersome.
Additionally, to produce meaningful output with your parser, you must embed code from your programming language within the grammar specification itself.
The tool support for such embedded code can sometimes be difficult to work with and frustrating to debug.

This is where parser combinators come into play.
A parser combinator is a powerful technique for constructing complex parsers by combining simpler, reusable parsers.
Each individual parser is responsible for recognizing a specific part of the input, and combinators are functions that combine these basic parsers into more complex ones — hence the name "combinators".

The key advantage of this approach is that it enables the rapid development of parsers within the constraints and benefits of a general-purpose programming language.
Instead of relying on a third-party tool to generate a parser from a grammar specification, parser combinators allow us to "program" the syntax of a language directly in our favourite programming language.

In the case of **JJParse**, parsers are implemented as functions that take an `Input<T>` and produce a `Result<U>`.
Here ...
- ... the type `T` represents the element type of the input, which is `Character` in many cases.
- ... the type `U` denotes the type of the result.

For example:
- A character parser takes an `Input<Character>` and produces a `Result<Character>`.
- A string parser takes an `Input<Character>` and produces a `Result<String>`.
- A parser that combines a character parser with a string parser produces a `Result<Tuple<Character, String>>`.

As with any parsing method, things don’t always go smoothly.
A parsing attempt can fail, which is why a `Result<U>` may either be a `Success<U>` (containing the successful result) or a `Failure<U>` (containing an error message).

This approach, while simple, provides flexibility and expressiveness that traditional parser generators often lack, making it an ideal solution for day-to-day parsing tasks.

### 🤝 Contributing

Contributions of any kind are welcome!
Whether it’s reporting issues, suggesting features, or submitting pull requests, your help is appreciated.
If you find this library useful, consider sharing it with others.

### License
### ⚖️ License

This project is licensed under the [MIT](LICENSE) license.

0 comments on commit 7cfa557

Please sign in to comment.