Malwarebytes/purl-license-checker

purl-license-checker

Retrieve missing licenses for purl documented dependencies.

This CLI utility takes one or more purl-formatted URLs from stdin and tries to find the license attached to each of them by querying the databases of various package managers.

This is particularly useful for filling the gap left by GitHub's Dependabot, which misses 90% of licenses when working at scale, for instance with ghas-cli.

Supported package managers:

Installation

Builds are available in the Releases tab and on PyPI.

  • PyPI:
pip install purl-license-checker
  • Manually:
python -m pip install /full/path/to/purl-license-checker-xxx.whl

# e.g: python3 -m pip install Downloads/purl-license-checker-0.5.0-none-any.whl

Usage

To show the help message for each command, run purl-license-checker --help:

Usage: purl-license-checker [OPTIONS] COMMAND [ARGS]...

  Retrieve licenses for purl documented dependencies.

  Get help: `@jboursier-mwb` on GitHub

Options:
  --help  Show this message and exit.

Commands:
  get_license
  load_file
  merge_csv

Get a license

get_license PURL GITHUB_TOKEN

e.g.:

get_license pip:ghas-cli gh-123456789qwerty
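
The identifiers in this README use a short manager:name form, such as pip:ghas-cli. As a rough illustration of what a lookup involves, the sketch below splits such an identifier and, for pip packages, reads the license field from PyPI's public JSON endpoint. The names split_purl, license_from_pypi_payload and fetch_pypi_license are hypothetical and are not part of purl-license-checker; the tool's own resolution logic may differ.

```python
import json
import urllib.request


def split_purl(purl: str) -> tuple[str, str]:
    """Split 'manager:name' into (manager, name). Hypothetical helper."""
    manager, sep, name = purl.strip().partition(":")
    if not sep or not manager or not name:
        raise ValueError(f"expected 'manager:name', got {purl!r}")
    return manager, name


def license_from_pypi_payload(payload: dict) -> str:
    """Return the license string from a PyPI JSON payload, or 'Unknown'."""
    info = payload.get("info") or {}
    return (info.get("license") or "").strip() or "Unknown"


def fetch_pypi_license(name: str) -> str:
    """Query PyPI's JSON API for one package (performs a network request)."""
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=10) as response:
        return license_from_pypi_payload(json.load(response))
```

For example, split_purl("pip:ghas-cli") returns ("pip", "ghas-cli"), and the name can then be passed to fetch_pypi_license. Other package managers would need their own lookup.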

Find licenses for a csv-list of purl dependencies

load_file PATH GITHUB_TOKEN

For example, with a PATH CSV file formatted as follows:

repo_name, purl, version, license

Where missing licenses are set to Unknown, for instance:

ghas-cli, com.github.Malwarebytes/ghas-cli,, MIT
ghas-cli, pip:charset-normalizer,3.3.2, MIT
ghas-cli, pip:colorama,0.4.6, BSD-2-Clause AND BSD-3-Clause
ghas-cli, pip:click,8.1.7, BSD-2-Clause AND BSD-3-Clause
ghas-cli, pip:python-magic,0.4.27, MIT
ghas-cli, pip:urllib3,2.2.3, MIT
ghas-cli, pip:requests,2.32.3, Apache-2.0
ghas-cli, pip:configparser,7.1.0, MIT
ghas-cli, pip:certifi,2024.8.30, MPL-2.0
ghas-cli, pip:idna,3.10, BSD-2-Clause AND BSD-3-Clause
ghas-cli, actions:actions/checkout,4.*.*, Unknown
ghas-cli, actions:github/codeql-action/analyze,3.*.*, Unknown
ghas-cli, actions:github/codeql-action/init,3.*.*, Unknown
ghas-cli, actions:actions/dependency-review-action,4.*.*, Unknown

load_file does its best to find a license for every row whose license field is Unknown and writes its results to output.csv.
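
To make the input format concrete, here is a minimal sketch of selecting the rows that still need a license. It assumes the four-column layout shown above with no header row, tolerates the spaces after commas, and skips rows that do not have exactly four fields. unknown_purls is a hypothetical name, not a function of the tool.

```python
import csv
import io


def unknown_purls(csv_text: str) -> list[str]:
    """Return the distinct purls whose license column is 'Unknown'."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    pending = []
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed rows
        _repo, purl, _version, license_name = (field.strip() for field in row)
        if license_name == "Unknown" and purl not in pending:
            pending.append(purl)
    return pending


sample = """\
ghas-cli, pip:requests,2.32.3, Apache-2.0
ghas-cli, actions:actions/checkout,4.*.*, Unknown
ghas-cli, actions:github/codeql-action/init,3.*.*, Unknown
"""
print(unknown_purls(sample))
# ['actions:actions/checkout', 'actions:github/codeql-action/init']
```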

The output format is as follows:

purl, license

For instance:

npm:unicode-match-property-ecmascript, MIT
npm:unicode-match-property-value-ecmascript, MIT
npm:unicode-property-aliases-ecmascript, MIT
npm:universalify, MIT
npm:unpipe, MIT
npm:use-sync-external-store, MIT
npm:util-deprecate, MIT
npm:utils-merge, MIT

Fill an existing partial csv list of purl licenses

merge_csv LICENSES_INPUT_PATH DEPENDENCIES_OUTPUT_PATH GITHUB_TOKEN

Fills the Unknown licenses in DEPENDENCIES_OUTPUT_PATH (formatted as repo_name, purl, version, license) using LICENSES_INPUT_PATH, which contains only purl, license pairs. This is particularly useful with a workflow based on ghas-cli.
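
The merge described here can be pictured as a dictionary lookup keyed on the purl. The sketch below is an illustration under stated assumptions, not the tool's implementation: merge_licenses is a hypothetical name, both inputs are passed as CSV text without header rows, and the repository and package names in the usage note are made up.

```python
import csv
import io


def merge_licenses(licenses_csv: str, dependencies_csv: str) -> list[list[str]]:
    """Fill 'Unknown' licenses in dependency rows from a purl -> license table."""
    # Build the lookup table from the two-column "purl, license" input.
    known = {}
    for row in csv.reader(io.StringIO(licenses_csv), skipinitialspace=True):
        if len(row) == 2:
            known[row[0].strip()] = row[1].strip()

    # Rewrite only the rows whose license is still 'Unknown'.
    merged = []
    for row in csv.reader(io.StringIO(dependencies_csv), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        repo, purl, version, license_name = (field.strip() for field in row)
        if license_name == "Unknown":
            license_name = known.get(purl, "Unknown")
        merged.append([repo, purl, version, license_name])
    return merged
```

Calling merge_licenses("npm:unpipe, MIT\n", "my-repo, npm:unpipe,1.0.0, Unknown\n") returns [["my-repo", "npm:unpipe", "1.0.0", "MIT"]]. Rows whose purl is absent from the license table keep Unknown.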

Development

Build

Install Poetry first, then:

make dev

Bump the version number

  • Bump the version number: poetry version x.x.x
  • Update the __version__ field in src/cli.py accordingly.

Publish a new version

syft must be installed to generate the SBOM.

  1. Bump the version number as described above
  2. make deps to update the dependencies
  3. make release to build the packages
  4. git commit -a -S -m "Bump to version 1.1.2" and git tag -s v1.1.2 -m "1.1.2"
  5. Upload dist/*, checksums.sha512 and checksums.sha512.asc to a new release in GitHub.
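
Since checksums.sha512 is published with each release, a downloaded artifact can be checked against it. The sketch below assumes the file uses the common sha512sum layout (a hex digest, whitespace, then a file name relative to the checksum file); that layout is an assumption, not something this README states, and sha512_of and verify_checksums are hypothetical names. It does not verify the .asc signature.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path) -> str:
    """Return the hex SHA-512 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file: Path) -> dict[str, bool]:
    """Check each '<hex digest>  <file name>' line against the file on disk."""
    results = {}
    base = checksum_file.parent
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha512sum marks binary mode with '*'
        target = base / name
        results[name] = target.is_file() and sha512_of(target) == expected.lower()
    return results
```

verify_checksums(Path("checksums.sha512")) returns a mapping from each listed file name to True or False; a missing file counts as False.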

Miscellaneous

This repository is provided as-is and isn't bound to Malwarebytes' SLA.