-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Packagist] Update Packagist service to use v2 api #6508
Conversation
Packagist deprecated the original `packagist.org/p/username/package` endpoint in favor of v2 `packagist.org/p2/username/package` endpoint. Because of this, new packages aren't being found using v1. This PR updates the Packagist service to use the new endpoint.
|
Thank you for the PR @GeoffSelby! The changes seem to have broken some of the badges, more details can be found in the CI output (e.g. https://app.circleci.com/pipelines/github/badges/shields/6648/workflows/ba687673-7fbf-4cb5-8579-5a699f2b9972/jobs/143021). Could you please take a look when you get a chance? Let us know if you have any questions |
Some packages don't return the same data structure as others with the new api endpoints. This changes the validation schema to account for the potential differences.
Thanks, @calebcartwright! Looks like it was the validation schema causing problems. Should be good when these tests wrap up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates! I started going through this but as I'd noted in the inline comments I feel like I could use some more info about the change. Is there some background context, documentation, etc. you can share with us about the nature of the upstream changes and why there's all the cascading impacts relative to the branches?
Shields integrates with so many upstream services that even within the maintainer team we've got quite a few whose APIs we're not terribly familiar with, so any additional info you could share would be most appreciated!
@chris48s for some reason I feel like I recall you have some more hands on PHP experience than others (certainly more so than myself) so do you have any background on what we were doing before vs. the changes with the new API?
license: Joi.array().required(), | ||
license: Joi.array(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a need to drop .required()
here? Is this because the new schema may not always define the key? If that's the case then I think in the transform function we need to account for that fact and throw a not found/invalid response error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new api endpoint now only returns the license array with the first item in the response (most recent release). At least that's what I've found in my testing. because of this, the schema validation would fail when the response contains more than 1 version. I do think that throwing a not found error would be good just in case there isn't a license at all for some reason. I'll add that in shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ideally you would run the MetadataMinifier::expand
logic on the response before validating the schema, and then you can restore the required()
I'd say. Note that packagist.org does not enforce the presence of a license key, so it may well be missing if the package author did not include it. I'd not expect they'd display a shields.io license badge if they haven't defined a license though but who knows :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Seldaek i thought about that too but I don’t think it would be possible with the way Shields makes a request since it’s shared with other services. Maybe with some sort of event hook system but that seems like overkill for this one situation.
const branch = this.getDefaultBranch(json, user, repo) | ||
if (!branch) { | ||
throw new NotFound({ prettyMessage: 'default branch not found' }) | ||
} | ||
const { license } = branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I'm not sure I know what we were doing here before. Do you know what, if any, the chances are of this API change resulting in us finding and rendering a different license because of the branch change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't ever find or render the wrong license because we only care about the most recent version which is always the first item in the json.packages[packageName]
array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, though my question isn't about "right" or "wrong". We take backwards compatibility pretty seriously, and one of the things we have to be mindful of is whether a change we make to the Shields code has the potential to modify any of our existing badge users without them changing anything in their corresponding project/repo/package/etc.
It's still not clear to me whether that's a possibility here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it is technically possible here but there is only one way and it's likely never going to be an issue. The only way it could be a problem is if the default branch has a different license than the most recent tagged release. For example, if a project had an MIT license before (so in the most recent tagged release) but you changed it to Apache License 2.0 in your default branch, main, master, dev, whatever you set your default branch to, and did not tag a new release. The current Shields, with the old api, would return Apache License 2.0 but this PR would return MIT because it's the license of the most recent tagged release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's helpful clarity, thank you!
Based on a cursory glance of the docs under https://packagist.org/apidoc#getting-package-data it seems like we could maintain access to the full set of dev/branch data if we fetched packageName-dev.json
instead of packageName.json
Do you see any issues with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm taking a look at Composer's MetadataMinifier package and I realized that the expand method doesn't actually do what we need. All it's doing is "expanding" the array of releases with data we already have. In other words, it doesn't do anything useful if we want to get the PHP version requirements or license (or anything really) for the latest stable release if it's not the first item in the array.
Take a simplified version of the response from /p2/symfony/symfony.json
:
"packages": {
"symfony/symfony": [
{
"version": "v5.3.0-RC1",
"license": [
"MIT"
],
"require": {
"php": ">=7.2.5"
},
},
{
"version": "v5.3.0-BETA4",
},
{
"version": "v5.3.0-BETA3",
},
{
"version": "v5.3.0-BETA2",
},
{
"version": "v5.3.0-BETA1",
},
{
"version": "v5.2.9",
"require": {
"php": ">=7.2.5",
},
}
]
}
After "expanding", license
for v5.2.9 (latest stable release) will be whatever
licenseis for the first item,
v5.3.0-RC1 in this case. In other words, if Symfony were to change its license in a pre-release, that would be the license used for the badge regardless. The same could go for PHP version requirements though not in the case of Symfony right now.
The only way to ensure we actually have the right data is to use the standard JSON API like packagist.org/packages/symfony/symfony.json
. This is a more expensive request because the response is massive, but it does respond with all of the data we would need for every release. This poses some other issues as well. As far as I can tell, it still wouldn't return any branches/dev versions like dev-master
or dev-main
. If we needed dev
we could fall back to /p2/user/repo~dev.json
. The other issue is that it seems the response is in no particular order. It's not a huge deal but determining the most recent stable release could be a little time-consuming since it would have to go through the entire response.
I'll set some time aside tomorrow to start tackling this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After "expanding", license for v5.2.9 (latest stable release) will be whatever licenseis for the first item,v5.3.0-RC1 in this case. In other words, if Symfony were to change its license in a pre-release, that would be the license used for the badge regardless.
This is not the case. Please trust the code instead of trying to infer things :) If the license changed in a prerelease, then the minified json would look like:
"packages": {
"symfony/symfony": [
{
"version": "v5.3.0-RC1",
"license": [
"NEW-5.3-LICENSE" // new license here
],
"require": {
"php": ">=7.2.5"
},
},
{
"version": "v5.3.0-BETA4",
},
{
"version": "v5.3.0-BETA3",
},
{
"version": "v5.3.0-BETA2",
},
{
"version": "v5.3.0-BETA1", // new license was added in beta1 let's assume, so it is valid until here
},
{
"version": "v5.2.9",
"require": {
"php": ">=7.2.5",
},
"license": [
"MIT" // old license now is set as it changed from the versions above
],
}
]
}
In both cases, after running MetadataMinifier::expand, you would have license:["MIT"]
present in v5.2.9.
/p2/user/repo~dev.json
includes only dev releases, so if you're looking for latest stable this is entirely useless. Please just use /p2/user/repo.json
with a MetadataMinifier::expand port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please everyone take a couple minutes to read through https://packagist.org/apidoc#get-package-data which explains our new v2 api in details. Note the point about caching and If-Modified-Since
.. I don't know if you can easily integrate this here to save everyone some bandwidth but it would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the case. Please trust the code instead of trying to infer things :) If the license changed in a prerelease, then the minified json would look like:
@Seldaek so you’re saying that if there is a change between lisences, the minified response would still include the license with the last version to have that license before the change? I guess I just didn’t fully understand what was minified in the response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you iterate through versions, the license will always be the last license you saw above. If it isn't anymore, then the license is present in the version object. If an older version lacks a license, the license is set to "__unset"
. But really IMO the best is to run the expand
logic and then use the objects as you would normally, so you don't need to consider any of this.
'1.0.0': { require: { php: '^5.6 || ^7' } }, | ||
'2.0.0': { require: { php: '^7.2' } }, | ||
'3.0.0': { require: { php: '^7.4 || 8' } }, | ||
'dev-main': { require: { php: '^8' }, 'default-branch': true }, | ||
}, | ||
'samwise/gardening': { | ||
'1.0.x-dev': {}, | ||
'2.0.x-dev': {}, | ||
}, | ||
'pippin/mischief': { | ||
'1.0.0': {}, | ||
'dev-main': { require: {}, 'default-branch': true }, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it typically concerns me when we have to modify tests as part of some other change. I realize that some updates are inevitable here since the new api has a different response structure, but I feel like there's more going on or at least that I'm missing some context (though I presume this is all branch related)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is solely because of the change in response structure as seen in the diff.
I'll try and have a proper look over this in the next couple of days and see if I have anything useful to add |
I did start looking at this one tonight and I've made some notes but I still need to spend a bit more time on it before I can post something useful. |
I'm kind of hesitant to bring this one up: I don't want to get too bogged down in the 1% case because I think fixing this for the 99% case (packagist.org) is the priority and that is currently broken or producing unexpected behaviour for some users. That said, this transition is slightly complicated by the fact that the packagist integration allows the user to specify a custom (self-hosted) registry using the If we completely drop compatibility with the v1 API and move to only allowing the v2 API to be used, if anyone is currently using this with a self-hosted registry that is running an older version of self-hosted packagist, their badges are going to break. I think with one of the other services (sonar? nexus?) we support multiple API versions and we try the most recent first then fall back to the second (but allow you to explicitly specify the version). This situation is slightly complicated by the fact that it seems like the old API is retained but serves stale data so falling back by default might not be a great idea either :/ I think my view on this is we should just optimise to work for packagist.org and other custom registries that maintain feature parity with packagist.org. I think a fallback is going to further complicate an integration which already has a few pain points to serve a use-case which may turn out to be hypothetical in any case. |
@chris48s I have added the changes you mentioned regarding the PHP version support badge and license badge. |
@GeoffSelby sorry I've not had a chance to get to this one yet. I've been a bit over-committed this week and I've been slightly putting this review off as I know I need to find a decent block of time to read over it properly but I'm not ignoring it. I will find some time for this over the weekend. |
I suggest we override class BasePackagistService extends BaseJsonService {
async _requestJson({ schema, url, options = {}, errorMessages = {} }) {
const mergedOptions = {
...{ headers: { Accept: 'application/json' } },
...options,
}
const { buffer } = await this._request({
url,
options: mergedOptions,
errorMessages,
})
const compressedJson = this._parseJson(buffer)
// decompress the response here
const expandedJson = this.decompressResponse(compressedJson)
return this.constructor._validateJson(expandedJson, schema)
}
} then because we apply the schema after decompressing we can make the things we need On the subject of validation:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few more comments but this is coming along nicely. Thanks
@@ -59,17 +58,26 @@ module.exports = class PackagistLicense extends BasePackagistService { | |||
} | |||
|
|||
transform({ json, user, repo }) { | |||
const branch = this.getDefaultBranch(json, user, repo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can delete getDefaultBranch()
from the base class now. It is not being called anywhere.
version, | ||
server | ||
) | ||
} catch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stylistically we always use } catch (e) {
we should probably have an eslint rule for that
} | ||
} | ||
|
||
async transform({ json, user, repo, version = '', server }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic path for this badge is a bit muddled
Our sequence of execution is fetch --> transform --> render (or sometimes just fetch --> render)
- Fetch gets data from APIs
- Transform re-shapes the data (if necessary)
- Render turns it into a badge
Here we are making an API call in transform, so that's a good indicator we've split the logic up incorrectly or named things poorly.
In this situation, calling fetch()
, trying to find the latest version in there, falling back to calling fetchDev()
if necessary is all part of the "fetch" stage.
The "transform" stage (to the extent there is one for this badge) is just extracting .require.php
findRelease(json, versions = []) { | ||
json.forEach(version => { | ||
versions.push(version.version) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not that clearly named. How about findLatestRelease()
?
Also, we should have one method of determining the latest (stable) version that we use consistently - we shouldn't have 2 different methods for different badges. More commentry on this on the packagist-version.js
file..
I don't think versions
needs to be a function param here. We never pass anything in when we call it. How about we make this just
const versions = json.map(version => version.version)
@@ -90,25 +85,27 @@ class PackagistVersion extends BasePackagistService { | |||
|
|||
transform({ includePrereleases, json, user, repo }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted on findRelease()
, we should have one method for working out the latest (stable) version. I think most of the logic in this function was there to handle dev branches being mixed in with the releases. Now that they are on a different endpoint ..~dev.json
, I think we can ditch all this can't we (do correct me if I'm wrong)? If so, this simplifies to something like:
const packageSchema = Joi.array().items(
Joi.object({version: Joi.string().required()})
)
transform({ includePrereleases, json, user, repo }) {
const versionsData = json.packages[this.getPackageName(user, repo)]
const versions = versionsData.map( version => version.version)
if (includePrereleases) {
return { version: latest(versions) }
} else {
const stableVersion = latest(versions.filter(isStable))
return { version: stableVersion || latest(versions) }
}
}
Then at that point, the logic for identifying the latest release is the same logic as in your findRelease()
(which we should rename) but with a includePrereleases
flag.. so we can move this to the base class and call with includePrereleases=false
when we use it for the license/PHP version badges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Packagist do not have separate tagged version on different branch. Even the main branch is considered a development branch. The ~dev.json
endpoint is useful for accessing information of branch HEAD, regardless if it is tagged or not. The current version of documentation stated:
The
p2/$vendor/$package.json
file contains only tagged releases. If you want to fetch information about branches (i.e. dev versions) you need to downloadp2/$vendor/$package~dev.json
.
Sometimes the tagged pre-release versions (e.g. v0.1.2) of libraries are published and used in production by others. In which case, the latest version would be "not stable" but is not shown on the ~dev.json
endpoint but is still relevant.
It is possible for either .json to be empty as long as no tag exists. For ~dev.json to be empty is very unlikely and would indicate a broken package most likely, I wouldn't really care about handling this gracefully but it is an eventuality. If there is a package version object present then yes the version field will always be present. |
@GeoffSelby have you had a chance to read through @chris48s's latest review comments? It would be nice to get this landed. 😉 |
Given that there hasn't been any activity for a while, I'm going to close this pull request. Feel free to reopen it if you get a chance to look into it again. If someone else wants to pick this up, you can simply fetch the commits by running |
Address issues raised by @chris48s in badges#6508, which this PR is base on. Includes: * Remove getDefaultBranch() from base class for it is no longer used. * Change try-catch statement syntax to align code style. * Rename findRelease() to findLatestRelease() for clarity.
* Update Packagist service to use v2 api Packagist deprecated the original `packagist.org/p/username/package` endpoint in favor of v2 `packagist.org/p2/username/package` endpoint. Because of this, new packages aren't being found using v1. This PR updates the Packagist service to use the new endpoint. * Adjust validation schema Some packages don't return the same data structure as others with the new api endpoints. This changes the validation schema to account for the potential differences. * Fix typo * Throw NotFound when license not found * Expand response and find the correct release * Address issues raised by reviewer Address issues raised by @chris48s in #6508, which this PR is base on. Includes: * Remove getDefaultBranch() from base class for it is no longer used. * Change try-catch statement syntax to align code style. * Rename findRelease() to findLatestRelease() for clarity. * remove unusued param * throw if no version found * require version key * use a single consistent method for identifying the latest (tagged) release * don't throw in render() * rename method (this is not really 'transform' in our usual parlance) * Improve BasePackagistService testing * Change BasePackagistService.decompressResponse to static method BasePackagistService.expandPackageVersions. * Fix expandPackageVersions implementation. * Add unit test for the function. * Extend BasePackagistService.findLatestRelease * extend BasePackagistService.findLatestRelease to also handle PackagistVersion. * remove PackagistVersion.transform. * Improve BasePackagistService.findLatestRelease * Update findLatestRelease to throw NotFound itself. * Update PackagistLicense and PackagistVersion and remove the NotFound throwing logics. * Revert unneeded change * Corrected packagist php version spec * The test was on a false assumption that '__unset' might appear inside an array leaf while the composer's MetadataMinifier::minify never does thing recursively. The '__unset' should be value of the top level key. Co-authored-by: Geoff Selby <geoff@geoffcodesthings.com> Co-authored-by: Koala Yeung <koalay@gmail.com> Co-authored-by: repo-ranger[bot] <39074581+repo-ranger[bot]@users.noreply.github.com>
Packagist deprecated the original
packagist.org/p/username/package
endpoint in favor of v2packagist.org/p2/username/package
endpoint. Because of this, new packages aren't being found using v1.This PR updates the Packagist service to use the new endpoint.