Bug: failure in correctly parsing installed namespace packages #38

Closed
NicolaDonelli opened this issue Jun 7, 2023 · 0 comments · Fixed by #39
Labels
bug Something isn't working

NicolaDonelli commented Jun 7, 2023

System info

  • OS: MacOS
  • Version: 13.4

Describe the bug

I just realised that licensecheck does not correctly parse locally installed namespace packages (it does not seem to have the same problem with namespace packages looked up on PyPI).

Suppose you have a namespace package called 'hexagonal-repository-gcs' with the following nested structure:

hexagonal
└─── repository
      └───  gcs
             │ __init__.py
             │ ...

The issue arises in the function packageinfo.getPackageInfoLocal: the call to resources.files(requirement) with requirement='hexagonal-repository-gcs' fails with a ModuleNotFoundError, because it ends up calling importlib.import_module('hexagonal-repository-gcs'), which raises that error.

importlib behaves correctly here: my dependency is a module inside a namespace, so the right call would have been importlib.import_module('hexagonal.repository.gcs'). From the point of view of packageinfo.getPackageInfoLocal, however, the behaviour is wrong. The exception aborts execution even though the package is actually installed and most of the package info has already been retrieved correctly. Only the size is missing, and that is ancillary information that is not used in the core of licensecheck.
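The mismatch between distribution name and import name can be reproduced without installing anything. The following is a minimal sketch, assuming Python 3.9+ (where importlib.resources.files exists); lookupError is a hypothetical helper introduced only for this illustration:

```python
import importlib
from importlib import resources

# Distribution name from this report; it is not a valid import name.
DIST_NAME = "hexagonal-repository-gcs"


def lookupError(func, name):
    """Return the name of the exception raised by func(name), or None."""
    try:
        func(name)
    except Exception as error:
        return type(error).__name__
    return None


# Both calls try to import a module literally named like the distribution.
print(lookupError(importlib.import_module, DIST_NAME))
print(lookupError(resources.files, DIST_NAME))
```

Both calls are expected to report ModuleNotFoundError, whereas lookups through importlib.metadata are keyed by the distribution name and never import anything.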

As of now the code is:

def getPackageInfoLocal(requirement: str) -> PackageInfo:
	"""Get package info from local files including version, author
	and the license.

	:param str requirement: name of the package
	:raises ModuleNotFoundError: if the package does not exist
	:return PackageInfo: package information
	"""
	try:
		# Get pkg metadata: license, homepage + author
		pkgMetadata = metadata.metadata(requirement)
		lice = licenseFromClassifierlist(pkgMetadata.get_all("Classifier"))
		if lice == UNKNOWN:
			lice = pkgMetadata.get("License", UNKNOWN)
		homePage = pkgMetadata.get("Home-page", UNKNOWN)
		author = pkgMetadata.get("Author", UNKNOWN)
		name = pkgMetadata.get("Name", UNKNOWN)
		version = pkgMetadata.get("Version", UNKNOWN)
		size = 0
		try:
			packagePath = resources.files(requirement)
			size = getModuleSize(cast(Path, packagePath), name)
		except TypeError:
			pass
		# append to pkgInfo
		return PackageInfo(
			name=name,
			version=version,
			homePage=homePage,
			author=author,
			size=size,
			license=lice,
		)

	except (metadata.PackageNotFoundError, ModuleNotFoundError) as error:
		raise ModuleNotFoundError from error

I'd suggest using the size computed by metadata.Distribution directly, to avoid trying to import the package:

def getPackageInfoLocal(requirement: str) -> PackageInfo:
	"""Get package info from local files including version, author
	and the license.

	:param str requirement: name of the package
	:raises ModuleNotFoundError: if the package does not exist
	:return PackageInfo: package information
	"""
	try:
		# Get pkg metadata: license, homepage + author
		pkgMetadata = metadata.metadata(requirement)
		lice = licenseFromClassifierlist(pkgMetadata.get_all("Classifier"))
		if lice == UNKNOWN:
			lice = pkgMetadata.get("License", UNKNOWN)
		homePage = pkgMetadata.get("Home-page", UNKNOWN)
		author = pkgMetadata.get("Author", UNKNOWN)
		name = pkgMetadata.get("Name", UNKNOWN)
		version = pkgMetadata.get("Version", UNKNOWN)
		# Sum the file sizes recorded for the distribution; .files may be None
		files = metadata.Distribution.from_name(requirement).files or []
		size = sum(pp.size for pp in files if pp.size is not None)
		# append to pkgInfo
		return PackageInfo(
			name=name,
			version=version,
			homePage=homePage,
			author=author,
			size=size,
			license=lice,
		)

	except metadata.PackageNotFoundError as error:
		raise ModuleNotFoundError from error
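The size computation proposed above can be isolated into a small standalone function for experimentation. This is only a sketch: getDistributionSize is a hypothetical helper name, not part of licensecheck, and it treats a distribution without a file listing (where .files is None) as having size 0.

```python
from importlib import metadata


def getDistributionSize(requirement: str) -> int:
    """Sum the file sizes recorded for an installed distribution.

    Hypothetical helper: looks the distribution up by its distribution
    name (e.g. 'hexagonal-repository-gcs') and never imports the package.
    """
    # Raises metadata.PackageNotFoundError if the distribution is unknown
    files = metadata.Distribution.from_name(requirement).files
    if files is None:  # no RECORD (or equivalent) file listing available
        return 0
    # Entries without a recorded size are skipped
    return sum(pp.size for pp in files if pp.size is not None)
```

Because the lookup is by distribution name, this works for namespace packages just as for regular ones; callers would still need to handle metadata.PackageNotFoundError, as getPackageInfoLocal does.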

Note: the size computed by metadata.Distribution is not the same as the one retrieved from PyPI (for published packages, of course), and the two can differ significantly. Since this information is of no real use to licensecheck, I wouldn't spend much effort on understanding the reason for the difference or trying to resolve it.
