/simple serves HTML that can't be parsed by Python's xml.etree if package has yanked releases #7886
Comments
Hi @boegel thanks for the report. From https://docs.python.org/3/library/xml.etree.elementtree.html:
PEP 503, which defines the simple API, says that this page is HTML, not XML:
Therefore I wouldn't expect to be able to parse a As is, this page is valid HTML5, so I don't think there's anything for us to do here. I'd recommend using
Thanks for the feedback! Unfortunately It's a shame that there's no interest in fixing this on the PyPI side, since it seems like it could be an easy fix, for example by indicating yanked releases with I didn't see anything about the
Yes, in PEP 592. Unfortunately adding a value to this attribute has meaning according to the PEP, so we can't just make it "yes" or something similar.
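As a rough illustration of why the value matters: under PEP 592 the mere presence of data-yanked marks a file as yanked, and any value is read as the reason for yanking. Below is a minimal sketch of a consumer built on Python's standard html.parser. The class name SimpleLinkParser and the sample links are made up for illustration and are not taken from Warehouse or pip.

```python
from html.parser import HTMLParser


class SimpleLinkParser(HTMLParser):
    """Collect (href, yanked, reason) for every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Presence of the attribute is what signals "yanked".
        yanked = "data-yanked" in attr_map
        # html.parser reports a bare attribute as None and data-yanked="" as "";
        # both are treated here as "yanked, no reason given".
        reason = attr_map.get("data-yanked") or None
        self.links.append((attr_map.get("href"), yanked, reason))


# Illustrative sample, not real PyPI output.
SAMPLE = """
<a href="pkg-1.0.tar.gz">pkg-1.0.tar.gz</a>
<a href="pkg-1.1.tar.gz" data-yanked>pkg-1.1.tar.gz</a>
<a href="pkg-1.2.tar.gz" data-yanked="">pkg-1.2.tar.gz</a>
<a href="pkg-1.3.tar.gz" data-yanked="broken build">pkg-1.3.tar.gz</a>
"""

parser = SimpleLinkParser()
parser.feed(SAMPLE)
for link in parser.links:
    print(link)
```

With this reading, a value such as "yes" would show up to users as the yank reason, which is why it can't be used as a plain marker.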
@di How about using
Yes, I think that would depend on how
I'm not super enthused by attempting to maintain compatibility with an XML parser and I think it's likely going to be error prone since this response is HTML5 and not XML. In this case we can possibly do it, but I'm not sure that holds true moving forward. Maybe simple changes infrequently enough and is weird enough that it isn't a big deal and it's worth doing, I dunno. I just worry that long term it's a futile effort.
We've been relying on The issue I reported is quite annoying for us, since it effectively breaks the auto-download-from-PyPI feature we have in all existing EasyBuild releases. That shouldn't be the big motivation here though, of course, but I suspect other people will be running into this too (and it's not trivial to pinpoint the exact issue either if you see the error popping up; it looks like a fluke in PyPI at first, to be honest). Maybe it's sufficient to have a test somewhere that checks whether a page like https://pypi.python.org/simple/pip can be parsed by If I can help with that in any way, I'd love to hear it. I'll try and make sure we don't rely on
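A check along those lines could look roughly like the sketch below. It assumes the parser in question is xml.etree.ElementTree (as in the issue title) and runs against a hand-written sample page rather than live PyPI output. The helper name parses_as_xml and the TEMPLATE markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(page):
    """Return True if the page text is well-formed enough for xml.etree."""
    try:
        ET.fromstring(page)
    except ET.ParseError:
        return False
    return True


# Hand-written stand-in for a /simple page; {attr} is filled in below.
TEMPLATE = """<html>
  <head><title>Links for pkg</title></head>
  <body>
    <h1>Links for pkg</h1>
    <a href="pkg-1.0.tar.gz"{attr}>pkg-1.0.tar.gz</a><br/>
  </body>
</html>"""

# Bare attribute: valid HTML5, but not well-formed XML.
bare = TEMPLATE.format(attr=" data-yanked")
# Empty-valued attribute: valid in both HTML5 and XML.
empty = TEMPLATE.format(attr=' data-yanked=""')

print("bare attribute parses: ", parses_as_xml(bare))
print("empty value parses:    ", parses_as_xml(empty))
```

A real regression test would render an actual page with a yanked release instead of using a template, but the assertion would be the same: the rendered text should not raise ParseError.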
To add on
Those are not only equivalent to pip, they are in fact defined to be equivalent in HTML5: https://stackoverflow.com/a/23752239/1930508 So I see no downside in adding that. A question, however: pip for Python 2 is still (kinda) supported (at least up to some version). Does that access the /simple endpoint too? How does it parse the page?
Oh, even better. Hopefully that makes it less of an issue to change to
Latest
Pip uses https://pypi.org/project/html5lib/ to parse
Looking at https://github.com/pypa/warehouse/pull/7916/files#diff-5e24a0b38c92bf7d3bb2982c25b10156R22 then yes it will be
@boegel @Flamefire that's correct. The attribute after the change would now be one of: absent,
@boegel Correct. The deployment takes about 5 minutes from merge; I can let you know once it's live if you'd like. In addition, it takes about 24 hours for these pages to fall out of our cache, so it might be a while until 100% of
OK, I can keep an eye on this, give it another shot tomorrow, thanks!
This is now deployed, you can see it on https://pypi.org/simple/pip/ for example, for which I have manually purged the cache.
Describe the bug
Parsing HTML served by the /simple endpoint results in xml.etree.ElementTree.ParseError.
Expected behavior
No parse error, as it was before when there were no yanked releases yet or with packages that don't have any yanked releases (yet).
To Reproduce
Python script test.py that contains:
run it with python test.py, for example (on macOS):
The problem is the data-yanked part in lines like:
My Platform
Additional context
data-yanked part (see strip out 'data-yanked' from HTML page with package source URLs served by PyPI easybuilders/easybuild-framework#3303), but this issue still occurs in EasyBuild releases that worked perfectly fine before package releases were getting yanked.