Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPDX license expression parser using lalrpop library. #5

Closed
wants to merge 3 commits into from
Closed

SPDX license expression parser using lalrpop library. #5

wants to merge 3 commits into from

Conversation

mspiegel
Copy link

@mspiegel mspiegel commented May 9, 2016

First off: apologies. It wasn't my intention to use an entirely new parser. It's only that I am new to Rust and this was the only path to success that I could implement. I was investigating spdx and Rust and discovered rust-lang/cargo#2039 which led me to #3 and then this patch was created to both become more familiar with Rust and to solve the reported issue.

Maybe the most controversial change is renaming the crate from "license-exprs" to "license_exprs". When I was writing the tests I tried extern crate "license-exprs" as license_exprs but that generated a compiler error.

  • Rename create from "license-exprs" to "license_exprs"
    I tried following the guidelines from
    github.com/rust-lang/rfcs/blob/master/text/0940-hyphens-considered-harmful.md
    I could not get the rename on import working so I bailed and
    renamed the package.
  • LicenseExpr enum removed
  • ParseError::InvalidStructure construction changed from LicenseExpr to str
  • version number bumped to 2.0.0 due to previous changes
  • Fixes Parens for setting precedence #3

Michael Spiegel added 2 commits May 9, 2016 14:18
- Rename create from "license-exprs" to "license_exprs"
  I tried following the guidelines from
  github.com/rust-lang/rfcs/blob/master/text/0940-hyphens-considered-harmful.md
  I could not get the rename on import working so I bailed and
  renamed the package.
- LicenseExpr enum removed
- ParseError::InvalidStructure construction changed from LicenseExpr to str
- version number bumped to 2.0.0 due to previous changes
- Fixes #3
These tests were accidentally removed in branch.
@withoutboats
Copy link
Collaborator

withoutboats commented May 10, 2016

Hey, thanks for this PR! I'll look into it more this week, but at the surface level it looks awesome; thanks for turning this into more of a real library.

Maybe the most controversial change is renaming the crate from "license-exprs" to "license_exprs". When I was writing the tests I tried extern crate "license-exprs" as license_exprs but that generated a compiler error.

I don't think this is necessary. The way RFC 940 was written is very unclear if you weren't around before it was accepted. It used to work that you'd have to use extern crate "license-exprs" as license_exprs; now a crate called "license-exprs" will be imported as extern crate license_exprs without doing anything special. crates.io won't allow you to upload a crate "foo_bar" if "foo-bar" already exists, and vice versa. For the purpose of crate names, underscore and hyphen are the same character.

A good example of this is the crate rustc-serialize.

Also, to help me review the PR, what aspect of the changes that you made are breaking changes, and why are they needed? Specifically, I would consider these to be breaking changes:

  • Changing the type signatures in the existing API (adding new API surface isn't breaking).
  • Rejecting a license expression that was accepted before (I don't consider accepting previously rejected license expressions a breaking change, since that was a bug that no one ought to rely on).

(It looks like you changed the definition of the error enum; I guess what I'm asking is if you could explain why.)

Misunderstood RFC 940. Everything works without renaming.
@mspiegel
Copy link
Author

Thanks! Yes the definition of the error enum is the only breaking change. I think. I've double-checked I hope I didn't miss another breaking change by accident.

The version of ParseError that is in master has the InvalidStructure variant that is constructed from a LicenseExpr. The LicenseExpr in the branch has been removed. I attempted two alternatives in the branch: in the first alternative I was constructing an AST that could be been exposed through a new public function. In that alternative the LicenseExpr would have been changed to store the AST. The second alternative is not to generate an AST. In this alternative the parser returns a void type and we are using the parser to report either success or failure. I found the second alternative a lot easier to implement the first alternative. In either of these cases LicenseExpr would have been changed leading to the change in ParseError.

In addition to changing the parameters for ParseError it was renamed to SpdxParseError only to reduce confusion with the ParseError enum in the LALRPOP library. I think the same goal could be achieved with "import foo as bar".

@withoutboats withoutboats self-assigned this May 11, 2016
@withoutboats
Copy link
Collaborator

I'm so sorry that reviewing this slipped my mind!

type Error = &'input str;
}

pub Expr: () = {
Copy link
Collaborator

@withoutboats withoutboats May 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the SPDX spec say that License OR License AND License is valid without parens? (this grammar makes that valid) Does it say License AND License OR License is not valid without parens? (this grammar makes that invalid).

If not, would it be simpler and more correct to define this like:

Expr = Term AND Term | Term OR Term | Term

Term = Id WithClause? | ( Expr )

Or this:

Expr = Expr AND Expr | Expr OR Expr | Term

Term = Id WithClause? | ( Expr )

Depending on which is correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the SPDX spec say that License OR License AND License is valid without parens? (this grammar makes that valid) Does it say License AND License OR License is not valid without parens? (this grammar makes that invalid).

The SPDX 2.1 spec defines operator precedence, and AND is higher than OR (as in C and likely lots of other syntaxes). So both of your examples are valid and well-defined.

@withoutboats
Copy link
Collaborator

Left some comments, most of them are very minor, a few are not. I'd like to know what you think we should do about the last, largest one. Thanks again for the PR, and apologies again for taking 2 weeks to look at it.

@carols10cents
Copy link
Contributor

Doing some cleanup; it looks like this change is stale and not what's ultimately desired anyway, so I'm closing this PR. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parens for setting precedence
4 participants