-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
purl with name contain whitespace can not be built from purl string #51
Labels
Comments
tommyknows
added a commit
to tommyknows/packageurl-go
that referenced
this issue
Jul 1, 2023
This commit refactors the `ToString` and `FromString` functions / methods to use the `url.URL` type, instead of trying to parse URLs directly. The [PURL Spec](https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#a-purl-is-a-url) explicitly says: > A purl is a URL: > A purl is a valid URL and URI that conforms to the URL definitions or > specifications ... So why not actually use Go's URL type to do operations on it? This commit does exactly that, and removes a lot of the previously dense parsing-logic with simpler checks based on the URL's fields. Especially `ToString()` has gotten a lot simpler through that. Additionaly, by switching to the URL package, this commit fixes a couple of outstanding bugs: - When a qualifier contained `+` signs, which are valid in URL paths, but not in URL queries, the sign did not get escaped. The previous code relied on `url.PathUnescape` to unescape keys and values, which should have used `url.QueryUnescape` instead. Further, encoding Qualifiers used `url.PathEscape` instead of `url.QueryEscape`. This could have led to pURLs losing `+` signs if they were properly encoded. - Fixes package-url#51, where spaces in names have not gotten encoded correctly. - Fixes most test-cases from package-url#22 that are round-trip-save (e.g. all cases where input == output), except the "pre-encoded qualifier value is unchanged" testcase that is wrong - a qualifier shouldn't be encoded with `%20` for a space, but with a plus-sign (query encoding). - Fixes most cases from package-url#41 as well, except: - where the query encoding in the test-cases are wrong (" " -> "+", "+" -> "%20") (test-cases `pre-encoded_qualifier_value_is_unchanged` and `unencoded_qualifier_value_is_encoded`), - `characters_are_unencoded_where_allowed`: `<` should be encoded, as far as I can tell. The Go stdlib also encodes it, so this should be fine. - `explicit_characters_are_encoded`: `@` does not need to be encoded. When reading through that list and all the nuances, it makes it clear that we shouldn't do this ourselves! And this commit doesn't, it simply relies on the Go standard library to handle all of these cases correctly. All of the aforementioned issues should have test-cases, presumably added to the test-suite-data.
shibumi
pushed a commit
that referenced
this issue
Jul 1, 2023
This commit refactors the `ToString` and `FromString` functions / methods to use the `url.URL` type, instead of trying to parse URLs directly. The [PURL Spec](https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#a-purl-is-a-url) explicitly says: > A purl is a URL: > A purl is a valid URL and URI that conforms to the URL definitions or > specifications ... So why not actually use Go's URL type to do operations on it? This commit does exactly that, and removes a lot of the previously dense parsing-logic with simpler checks based on the URL's fields. Especially `ToString()` has gotten a lot simpler through that. Additionaly, by switching to the URL package, this commit fixes a couple of outstanding bugs: - When a qualifier contained `+` signs, which are valid in URL paths, but not in URL queries, the sign did not get escaped. The previous code relied on `url.PathUnescape` to unescape keys and values, which should have used `url.QueryUnescape` instead. Further, encoding Qualifiers used `url.PathEscape` instead of `url.QueryEscape`. This could have led to pURLs losing `+` signs if they were properly encoded. - Fixes #51, where spaces in names have not gotten encoded correctly. - Fixes most test-cases from #22 that are round-trip-save (e.g. all cases where input == output), except the "pre-encoded qualifier value is unchanged" testcase that is wrong - a qualifier shouldn't be encoded with `%20` for a space, but with a plus-sign (query encoding). - Fixes most cases from #41 as well, except: - where the query encoding in the test-cases are wrong (" " -> "+", "+" -> "%20") (test-cases `pre-encoded_qualifier_value_is_unchanged` and `unencoded_qualifier_value_is_encoded`), - `characters_are_unencoded_where_allowed`: `<` should be encoded, as far as I can tell. The Go stdlib also encodes it, so this should be fine. - `explicit_characters_are_encoded`: `@` does not need to be encoded. When reading through that list and all the nuances, it makes it clear that we shouldn't do this ourselves! And this commit doesn't, it simply relies on the Go standard library to handle all of these cases correctly. All of the aforementioned issues should have test-cases, presumably added to the test-suite-data.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The text was updated successfully, but these errors were encountered: