Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port of Feature File Parser to Antlr 4 #1367

Merged
merged 4 commits into from
Jun 8, 2021

Conversation

iterumllc
Copy link
Collaborator

@iterumllc iterumllc commented May 31, 2021

Description

This is primarily a port of the feature file parser to Antlr 4.9. As Antlr 4 does not have a C backend the port is written in C++, and that change requires a number of other changes diminishing with distance from the hotconv library. In the library itself some of the code needed to be restructured so that include files can be shared. At the application level the "main" function is now a C++ stub (as compiling main as C++ is the traditional way of ensuring that statics are initialized, etc.) Further out under the shared (formerly public) directory there is some const change leakage.

It also changes the build to be CMake-based (this PR supersedes #1360). The build should now work in XCode and Visual Studio. The tests have also been shimmed into cmake/ctest (at a coarse grain) so that they can be run prior to installation without fiddling.

The PR is partly self-documenting in that changes to files in the docs explain both how to use the new build system and a bit about how the new Feature File parse tree translator works. The specification has also been updated to reflect some clarifications and changes in light of the new parser.

All tests pass. There is one new (aalt-related) test and some changes to tests. Most of the changes just sync up with the new console message format and/or library and program version numbers. eii-1.bad.fea was edited to be genuinely beyond the lower bound of int16_t. expected_output/fealib/GPOS_3.ttx had a bad AnchorPoint value due to a bug in the old feat.c code.

The new system has also been tested against a number of full-fledged fonts by comparing the ttx output of files prepared with the old and new versions of makeotfexe. Even so, given the scope of the feature file semantics it is unlikely we have full test coverage of all relevant cases.

Closes #1124
Closes #548
Closes #83

Checklist:

  • I have followed the Contribution Guidelines
  • I have added test code and data to prove that my code functions correctly
  • I have verified that new and existing tests pass locally with my changes
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation

@iterumllc
Copy link
Collaborator Author

One of the goals of this port was to keep the lexer and parser grammars as free of implementation as possible, to enable other projects to use them for parsers in other target languages. Given some small changes to the spec (e.g. slightly narrowing the set of valid tags) the only part of the spec that required some implementation was anon blocks (to match the end tag up with the start tag while lexing, see also #1352).

Accordingly, the lexer is broken up into FeatLexerBase.g4 (with almost all of the tokens, and free of any implementation) and FeatLexer.g4 (which "imports" FeatLexerBase and adds anon-block support). FeatParser.g4 uses the token set generated from FeatLexer and contains no implementation. Therefore any project wanting to use these grammars can port FeatLexer.g4 to their target language and then just use the other files unmodified.

@iterumllc
Copy link
Collaborator Author

One other aspect of this: I've tried to replace immediate FATALs with ERRORs where possible to increase the information available from a given run. Sometimes this requires adding a bit of fake data. In any case this increases the chances of hitting a crash (segfault, etc.) on error paths. If someone hits one of these they are easy enough to fix, either by adding more fake data or (easier) by putting the FATAL error back.

@iterumllc
Copy link
Collaborator Author

The lgtm failure is a repeated github timeout (downloading utfccp, which is brought in by the Antlr 4 runtime). I've heard at least one other tester report this, but its strange coming from the github build infrastructure itself.

If we absolutely have to copy everything into the repo and restructure the build around that we can but it seems beset to avoid if possible.

@iterumllc
Copy link
Collaborator Author

I mitigated the timeout by running

git config --global url.https://github.com/.insteadOf git://github.com/

in the workflow before building.

@iterumllc
Copy link
Collaborator Author

@khaledhosny Note that my comments were just attempting to clarify some of the thinking leading to this PR. I'll make similar notes on new issues as they are raised but otherwise let you and the reviewers work out these questions.

@iterumllc
Copy link
Collaborator Author

The already-high commit count almost doubled due to the Github workflow debugging so I squashed the PR for the sake of clarity. It doesn't look like Github managed to re-associate the existing review comments but the discussion has enough context to trace back to the relevant files.

@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@lgtm-com
Copy link

lgtm-com bot commented Jun 7, 2021

This pull request fixes 2 alerts when merging f8e3fea into 1523970 - view on LGTM.com

fixed alerts:

  • 1 for Use of potentially dangerous function
  • 1 for Empty branch of conditional

@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@adobe-type-tools adobe-type-tools deleted a comment from lgtm-com bot Jun 7, 2021
@lgtm-com
Copy link

lgtm-com bot commented Jun 7, 2021

This pull request fixes 2 alerts when merging 35f29a2 into 1523970 - view on LGTM.com

fixed alerts:

  • 1 for Use of potentially dangerous function
  • 1 for Empty branch of conditional

@lgtm-com
Copy link

lgtm-com bot commented Jun 8, 2021

This pull request fixes 2 alerts when merging 4ed60cd into 1523970 - view on LGTM.com

fixed alerts:

  • 1 for Use of potentially dangerous function
  • 1 for Empty branch of conditional

Copy link
Collaborator

@josh-hadley josh-hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much @iterumllc for this great set of improvements!

@josh-hadley josh-hadley merged commit 01a35da into adobe-type-tools:develop Jun 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants