-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Full parser for license expressions (version II) #29
Conversation
* Use a custom Lexer to handle License vs Exception vs Reference * Don't copy strings, slice them from the input * Handle whole ABNF from https://spdx.org/spdx-specification-21-web-version#h.jxpfx0ykyb60
@withoutboats - You seem to be the main developer on this codebase historically, could you have a look at this PR please? |
I no longer maintain this code, but I'm pretty dubious of adding lalrpop (which is a heavy dependency that takes quite a bit of time to compile) to parse license expressions, which have a very simple grammar. This will seriously degrade the compile time of any users who aren't already depending on lalrpop. |
This is basically the same lexer introduced in ehuss/license-exprs#29
Doing some cleanup; it looks like this change is stale and not what's ultimately desired anyway, so I'm closing this PR. Thank you! |
@carols10cents - Who should I be talking to about the correct fix here? I read @withoutboats's suggestion but, as he's no longer a maintainer, I was hoping to get an opinion from one of the current maintainers before starting to re-work the change. I understand the worry about the dependency explosion, but I think most of these dependencies are already present in Cargo (and I don't believe that writing a parser by hand is a good idea, it's likely to be buggy and the lalrpop generator "just works":tm:) As a heads-up, there seems to be an alternative crate for handling SPDX being created (including some of this very change) at https://github.com/EmbarkStudios/spdx, they've hand-rolled a Shunting-Yard-based parser which I'm sure they'd be happy to share if that's a preferred solution? |
* Use a better lexer, taken from ehuss/license-exprs#29 * Add a simple parser/evaluator * Add an update exe for pulling new SDPX information, also taken from https://github.com/rust-lang-nursery/license-exprs * Add support for some of the metadata available from the SPDX format, namely "IsDeprecated", "IsFSFLibre", "IsOSIApproved" Resolves: #1
We'd be happy to share it or add more maintainers if people are interested, as @withoutboats mentioned, the expressions are rather simple and hand parsing wasn't difficult, though if you find any test cases that aren't correct I would love to incorporate them! |
@Jake-Shadle this crate doesn't handle license expressions with parens in them (as in |
I saw #5, which adds a full blown parser to this crate in order to resolve #3 and to add support for
DocumentRef-<blah>:LicenseRef-<blah>
. It seems that PR has been somewhat abandoned (last touched in 2016), and doesn't actually support returning the parsed license expression.This PR is similar in structure to that one, adding a lalrpop parser to handle the expression format, but has a few differences:
This does introduce a few breaking changes:
failure::Error
(subsuming Use failure for error handling #20 as well)MIT WITH
orMIT AND
- I assume this is fine, as the old behaviour was simply incorrectAnd finally it does introduce one technical bug, the SPDX specification states:
Ironically (?) this is harder to implement in most parser toolchains, since most lexers ignore whitespace between tokens (including lalrpop's default one, and my custom one). Currently this means that the code in this PR will accept
MIT +
as meaning the same asMIT+
. I think this is fixable by making the lexer more complicated, but I've not tried and I worry that it'll make the code less maintainable. Thoughts?Finally, I've not done anything around Cargo's slightly non-compliant use of the specification (
NOASSERTION
andNONE
), since those can be handled out of band if Cargo wishes (since there's no need for a parser to determine if the expression is one of those special cases). Again I'm interested in your thoughts on this - while I'm doing this work in the context of some non-OSS code, I can tell you that Cargo is no longer the only user of this crate 😀.