python-cataloger: normalize package names #3069

Merged (1 commit) on Jul 25, 2024

@Mikcl (Contributor) commented Jul 24, 2024

Fixes #3064

(Provided there is agreement to normalize the names; see the discussion in the issue.)

This PR adds a normalization function that follows the Python packaging name normalization specification (https://packaging.python.org/en/latest/specifications/name-normalization/) and applies it to each of the Python package types. It also adds new tests and updates the existing ones.

The package name and the PURL now use the normalized name; the "metadata" still preserves the unnormalized name.
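
To make the split between normalized and preserved fields concrete, here is a minimal sketch. The `pythonPackage` struct, the `newPythonPackage` constructor, and the `nameSeparators` variable are hypothetical and exist only for illustration; they are not syft's actual types, and the PURL is built by plain string concatenation without the encoding a real PURL library would apply.

```go
package main

import (
	"regexp"
	"strings"
)

// pythonPackage is a hypothetical, simplified record used only for this
// illustration. It is NOT syft's pkg.Package type.
type pythonPackage struct {
	Name         string // normalized name
	Version      string
	PURL         string // built from the normalized name
	MetadataName string // name exactly as it was found in the package metadata
}

// nameSeparators matches runs of '-', '_' and '.' as described in the
// Python packaging name normalization specification.
var nameSeparators = regexp.MustCompile(`[-_.]+`)

// newPythonPackage normalizes the name for the Name and PURL fields while
// keeping the raw name untouched in MetadataName.
func newPythonPackage(rawName, version string) pythonPackage {
	normalized := strings.ToLower(nameSeparators.ReplaceAllString(rawName, "-"))
	return pythonPackage{
		Name:    normalized,
		Version: version,
		// Assumption: simple concatenation, no percent-encoding of the version.
		PURL:         "pkg:pypi/" + normalized + "@" + version,
		MetadataName: rawName,
	}
}
```

With this sketch, `newPythonPackage("Flask_SQLAlchemy", "3.1.1")` would carry `flask-sqlalchemy` in the name and PURL, while `MetadataName` keeps `Flask_SQLAlchemy`.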

Signed-off-by: mikcl <mikesmikes400@gmail.com>
@Mikcl Mikcl force-pushed the mikcl/normalized-python-names branch from d4a8368 to 85057d8 Compare July 24, 2024 19:37
@@ -10,7 +12,16 @@ import (
"github.com/anchore/syft/syft/pkg"
)

func normalize(name string) string {
	// https://packaging.python.org/en/latest/specifications/name-normalization/
	re := regexp.MustCompile(`[-_.]+`)
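
The hunk above is cut off at the review comment anchor. For readers who want to see the whole rule, the following is a minimal sketch of the normalization described in the linked specification (replace runs of `-`, `_` and `.` with a single `-`, then lowercase). The names `normalizeName` and `separatorRuns` are assumptions made for this example, and the body is not necessarily identical to the merged code; compiling the regular expression once at package level is a design choice of this sketch, not something shown in the diff.

```go
package main

import (
	"regexp"
	"strings"
)

// separatorRuns matches one or more consecutive '-', '_' or '.' characters.
// It is compiled once at package level instead of on every call.
var separatorRuns = regexp.MustCompile(`[-_.]+`)

// normalizeName is a hypothetical stand-in for the PR's normalize function.
// It follows the rule from the Python packaging specification:
// collapse separator runs into a single '-' and lowercase the result.
func normalizeName(name string) string {
	return strings.ToLower(separatorRuns.ReplaceAllString(name, "-"))
}
```

Under this rule, the spellings listed in the specification, such as `Friendly-Bard`, `friendly.bard`, `friendly_bard` and `FrIeNdLy-._.-bArD`, all map to `friendly-bard`.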


@spiffcs (Contributor) commented Jul 25, 2024

Nice! There are no matching or downstream concerns here, given that we already normalize these values for the PURL when using grype. This change should help consumers of syft SBOMs going forward, so 🟢

@spiffcs spiffcs merged commit 36f95d6 into anchore:main Jul 25, 2024
11 checks passed
@spiffcs spiffcs added the enhancement New feature or request label Jul 25, 2024
Development

Successfully merging this pull request may close these issues.

Python packages: name normalization
3 participants