Compile each format into only one decoder by taking the union of nexts. #139

Draft
wants to merge 1 commit into
base: main
@mikeday (Contributor) commented Dec 18, 2023

The compilation of a format that ends with a union or a repeat depends on the format that follows it, because the following format may influence the match tree used for lookahead. For that reason we initially compiled each format into multiple decoders, one for each possible "next".

This pull request compiles each format to a single decoder instead, taking the union of all the "nexts". I think this is sound: if it's valid for F to be followed by A and valid for F to be followed by B then it should be valid for F to be followed by (A|B).
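
To make the idea concrete, here is a toy sketch in Python. It is not the code in this pull request: it assumes a format that is just a repeat of a single byte, and it models each "next" as the set of bytes that may start the following format. The names `compile_repeat` and `union_of_nexts` are hypothetical and exist only for this illustration.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Decoder = Callable[[bytes], Tuple[int, int]]


def compile_repeat(item: int, next_first: FrozenSet[int]) -> Decoder:
    """Compile `item*` into a decoder, given the bytes that may follow it.

    `next_first` stands in for the lookahead information taken from the
    "next" format (a toy replacement for a real match tree).
    """
    if item in next_first:
        # One byte of lookahead cannot tell "repeat again" from "stop".
        raise ValueError("ambiguous: item byte also starts a next format")

    def decode(data: bytes) -> Tuple[int, int]:
        pos = 0
        while pos < len(data) and data[pos] == item:
            pos += 1
        # Stop only if we hit end of input or a byte that a "next" accepts.
        if pos < len(data) and data[pos] not in next_first:
            raise ValueError(f"unexpected byte {data[pos]:#x} at offset {pos}")
        return pos, pos  # (number of items decoded, new offset)

    return decode


def union_of_nexts(nexts: Iterable[FrozenSet[int]]) -> FrozenSet[int]:
    """Merge the lookahead sets of every format that can follow."""
    merged: FrozenSet[int] = frozenset()
    for first in nexts:
        merged |= first
    return merged


ITEM = ord("a")
NEXT_A = frozenset(b"b")  # format A starts with 'b'
NEXT_B = frozenset(b"c")  # format B starts with 'c'

# Old scheme: one decoder per possible "next".
decode_before_a = compile_repeat(ITEM, NEXT_A)
decode_before_b = compile_repeat(ITEM, NEXT_B)

# New scheme: a single decoder compiled against the union (A|B).
decode_merged = compile_repeat(ITEM, union_of_nexts([NEXT_A, NEXT_B]))
```

In this toy model the merged decoder returns the same result as `decode_before_a` on input that continues with A, and the same result as `decode_before_b` on input that continues with B. It is also slightly more permissive: it stops at a `c` even where only A is expected, leaving the following decoder to reject that input.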

It's nice to create exactly one decoder per format. However, this still requires "whole program analysis", in the sense that a format cannot be compiled independently of how it is used, as you would hope a function or module could be.

Also, the code feels slightly fragile because it relies on some subtle invariants on the decoder indices. That could probably be improved a little.
