Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(datasource): add debian datasource #30071

Open
wants to merge 64 commits into
base: main
Choose a base branch
from

Conversation

oxdev03
Copy link
Contributor

@oxdev03 oxdev03 commented Jul 7, 2024

Changes

  • Built upon @Ka0o0 implementation of the Debian datasource.
  • Includes minor improvements.
  • Addresses coverage issues identified in the previous review of the Debian datasource.
  • Fixes earlier review findings (except the caching issue).

Context

Resolves: #7041, #24906

Concept

Based on the discussion and the reused implementation (see the related issue for more details):

  1. Evaluation of Provided or Default Registry URL

    • Example: The URL https://ftp.debian.org/debian?release=bullseye&components=main,contrib&binaryArch=amd64 is evaluated to:
      • .../debian/dists/bullseye/main/binary-amd64
      • .../debian/dists/bullseye/contrib/binary-amd64
    • Supports multiple component URLs.
  2. Check for Existing Compressed Package

    • If the compressed Packages.gz file has already been downloaded and extracted:
      • Send a HEAD request to check if the upstream file has been updated.
    • If not already downloaded:
      • Download the compressed file.
      • Extract the contents.
      • Delete the compressed file.
    • Currently, only gzip compression is supported.
  3. Iterate through Package Index

    • Gather relevant information for the requested package.
    • Stop iteration, if found
  4. Return relevant release information

Documentation (please check one with an [x])

  • I have updated the documentation, or
  • No documentation update is required

included documentation for new datasource

How I've tested my work (please select one)

I have verified these changes via:

  • Code inspection only, or
  • Newly added/modified unit tests, or
  • No unit tests but ran on a real repository, or
  • Both unit tests + ran on a real repository

Real Test

Open Points

  • Does the current caching implementation for the datasource need to be reworked? Can the Renovate team provide a concept/details for this? (e.g., invalidation of the compressed file based on its timestamp)

  • Is the current approach generic enough? Are there any points that were not considered?

  • If significant rework is required, please let me know in advance.

@CLAassistant
Copy link

CLAassistant commented Jul 7, 2024

CLA assistant check
All committers have signed the CLA.

@viceice viceice self-requested a review July 7, 2024 17:15
Copy link
Collaborator

@secustor secustor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution, and the well documented functionality!

This is my first pass as there is a lot going on here and we have to be sure that the caching is setup efficiently.

Would this work for every APT repository?

lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/util/fs/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
@secustor
Copy link
Collaborator

secustor commented Jul 7, 2024

Does the current caching implementation for the datasource need to be reworked? Can the Renovate team provide a concept/details for this? (e.g., invalidation of the compressed file based on its timestamp)

We should start simple and then start to optimize.
The Package.gz will be cached non transparently by the http layer, so the file will be downloaded only when necessary.
I suggest to simply cache the extracted file on a time base ( extract it to a different function and use the annotation ).

@oxdev03
Copy link
Contributor Author

oxdev03 commented Jul 7, 2024

This is my first pass as there is a lot going on here and we have to be sure that the caching is setup efficiently.

Thx for the initial review, I will fix the findings soon.

Would this work for every APT repository?

should work with every apt repository, which follows the https://wiki.debian.org/DebianRepository standard.

lib/modules/datasource/deb/index.spec.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/readme.md Outdated Show resolved Hide resolved
lib/modules/datasource/repology/readme.md Outdated Show resolved Hide resolved
@rarkins rarkins requested review from viceice and secustor July 20, 2024 11:12
lib/modules/datasource/deb/index.spec.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/readme.md Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/util/fs/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/url.ts Show resolved Hide resolved
@rarkins rarkins requested a review from secustor August 20, 2024 08:38
@oxdev03
Copy link
Contributor Author

oxdev03 commented Aug 20, 2024

This will merge results from different registries. Is it really what you want here? This is only used if there is a a second registry which has more metadata.

The registryStrategy=merge is required for the following use case: when there are multiple apt packages, some of which are only included in the remote repository and others in the internal Artifactory. With the current registryStrategy (first), it would require two separate configurations and additional labeling of internal dependencies

Co-authored-by: Sebastian Poxhofer <secustor@users.noreply.github.com>
secustor
secustor previously approved these changes Aug 20, 2024
@oxdev03
Copy link
Contributor Author

oxdev03 commented Aug 20, 2024

This will merge results from different registries. Is it really what you want here? This is only used if there is a a second registry which has more metadata.

The registryStrategy=merge is required for the following use case: when there are multiple apt packages, some of which are only included in the remote repository and others in the internal Artifactory. With the current registryStrategy (first), it would require two separate configurations and additional labeling of internal dependencies

Can we add this though or something speaking against it?

@secustor
Copy link
Collaborator

This will merge results from different registries. Is it really what you want here? This is only used if there is a a second registry which has more metadata.

The registryStrategy=merge is required for the following use case: when there are multiple apt packages, some of which are only included in the remote repository and others in the internal Artifactory. With the current registryStrategy (first), it would require two separate configurations and additional labeling of internal dependencies

Can we add this though or something speaking against

Add it, the only downside is for scenarios in which the metadata conflicts.
Though in that case we are not extracting much of them.

secustor
secustor previously approved these changes Aug 21, 2024
lib/modules/datasource/deb/checksum.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/util/cache/package/types.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Show resolved Hide resolved
* @param config - Configuration for fetching releases.
* @returns The release result if the package is found, otherwise null.
*/
async getReleases({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zharinov should we enable the cachable property to true? so this call is also cached?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is already cached in getPackageIndex, so it would be redundant

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @cache decorator is the more functional alternative, e.g. you can set the TTL

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will get cache bust for each configuration option and their order in the registryUrl though.
The current implementation checks cache after breaking down this options.

Copy link
Contributor Author

@oxdev03 oxdev03 Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant its already done here.

  @cache({
    namespace: `datasource-${DebDatasource.id}`,
    key: (componentUrl: string) => componentUrl,
  })
  async getPackageIndex(
    componentUrl: string,
  ): Promise<Record<string, PackageDescription[]>> {
    const { extractedFile, lastTimestamp } =
      await this.downloadAndExtractPackage(componentUrl);
    return await this.parseExtractedPackageIndex(extractedFile, lastTimestamp);
  }

No async operations happen in getReleases, if the cache of getPackageIndex is not stale.

but I still added this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me revert the change again, github seems bugging and not loading latest the conversations 😅.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pleas add cache deocator to this function, so the aggregated result is cached too. (perf)

@oxdev03 oxdev03 requested a review from viceice August 22, 2024 07:10
lib/modules/datasource/deb/checksum.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/checksum.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/checksum.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/checksum.spec.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/checksum.spec.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
lib/modules/datasource/deb/index.ts Outdated Show resolved Hide resolved
* @param config - Configuration for fetching releases.
* @returns The release result if the package is found, otherwise null.
*/
async getReleases({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pleas add cache deocator to this function, so the aggregated result is cached too. (perf)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants