Modify Upload API to be asynchronous #7730
Continuing the IRC transcript:
cc @woodruffw
Alright, so here's the rough idea I have for the workflow of an asynchronous upload API that is also extensible in the future to support PEP 480 (assuming we ever accept and implement PEP 480). This should itself be a PEP since it defines a standards-based API, but just to get the very rough idea out there, here's a quick summary of what I'm thinking:
The basic idea here is that the Upload API effectively becomes 3 endpoints, and an upload request "flows" through them. This is basically modeled after the YouTube upload API and gives us a few extra nice properties that we didn't have before.
This gives us all of the metadata up front and allows us to check permissions, check filenames, validate the metadata, etc. (basically anything that we can do without the actual file) before the user has ever attempted to upload the file, and to return an appropriate error if any of that fails. Assuming all of the checks above pass, this would return with a
Once the server has received the data for all of the files present in this upload, it will then finish processing the request (including any async checks it wishes to do now that the full contents of each file are available) and then publish the files.
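To make the flow above a little more concrete, here is a minimal in-memory sketch. It is not the proposed API: the class and method names (`FakeRepository`, `begin_upload`, `send_file`, `status`), the metadata keys, and the filename check are all hypothetical, and the "async checks" are collapsed into a single synchronous step. It only illustrates the ordering: metadata is validated first, files arrive afterwards, and nothing is published until every announced file has been received.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


class UploadError(Exception):
    """Raised when a step of this hypothetical upload flow is rejected."""


@dataclass
class UploadSession:
    # Filenames announced up front in the initial metadata request.
    expected: Set[str]
    # File contents received so far, keyed by filename.
    received: Dict[str, bytes] = field(default_factory=dict)
    published: bool = False


class FakeRepository:
    """In-memory stand-in for the server side (an assumption, not Warehouse)."""

    def __init__(self, allowed_projects):
        self.allowed_projects = set(allowed_projects)
        self.sessions = {}
        self._next_id = 1

    def begin_upload(self, metadata):
        # Step 1: everything that can be checked without the file contents.
        if metadata.get("name") not in self.allowed_projects:
            raise UploadError("no permission to upload to this project")
        filenames = metadata.get("files", [])
        if not filenames:
            raise UploadError("no files announced")
        for filename in filenames:
            if not filename.endswith((".whl", ".tar.gz")):
                raise UploadError("unsupported filename: " + filename)
        session_id = self._next_id
        self._next_id += 1
        self.sessions[session_id] = UploadSession(expected=set(filenames))
        return session_id

    def send_file(self, session_id, filename, data):
        # Step 2: accept the contents of one previously announced file.
        session = self.sessions[session_id]
        if filename not in session.expected:
            raise UploadError("file was not announced: " + filename)
        session.received[filename] = data
        # Once every announced file has arrived, finish processing. A real
        # server would run its checks asynchronously before publishing.
        if set(session.received) == session.expected:
            session.published = True

    def status(self, session_id):
        # Step 3: let the client ask whether the upload has been published.
        return "published" if self.sessions[session_id].published else "pending"
```

A client in this sketch would call `begin_upload` once, `send_file` once per announced file, and then poll `status`. Because publishing only happens after the last file arrives, a failed or abandoned upload never leaves a partial release visible.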
This change has a number of positives:
It is not without its downsides, however:
Overall, I think this API for uploads allows us a lot more flexibility and gives us a much nicer UX, at the cost of some additional complexity in implementation (most of which is on the Warehouse side, but some is on the client side). This would not change anything in the Simple API or anything else besides the actual act of uploading. I'm going to start writing this up into a proper PEP, but I wanted to give a sort of brain dump of my thoughts here to see if anyone else had any thoughts on it.
I had a quick call with @trishankkarthik just to make sure that, in any hypothetical PEP 480 world, the above API wouldn't lock us into place. It appears that we're perfectly fine to stick the TUF metadata as a sub key under the overall JSON object that gets sent in (1). For plans even beyond that, if we ever implement in-toto (which, who knows if we will), we would need the ability to upload multiple files in a single transaction, because in-toto uses additional files beyond the actual file payloads. One of the problems with the existing API is the lack of multi-file atomicity, which led to the ability to upload multiple files in a single "transaction"; we could then also use that for something like in-toto if we so desired.
If/when PEP 480 gets implemented, we would also have the option of uploading the TUF metadata as an additional file instead of baking it into the JSON object in the initial publish. There is some benefit to doing that in terms of the total size of that initial request, but the downside is that we can't verify that the TUF metadata is acceptable prior to accepting file uploads.
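As a rough illustration of the "sub key" option mentioned above, the initial JSON object could carry the TUF metadata next to the package metadata. The key names here (`name`, `version`, `files`, `tuf`) and the helper `build_initial_request` are purely hypothetical, since no schema has been defined in this thread:

```python
import json


def build_initial_request(name, version, filenames, tuf_metadata=None):
    """Build the JSON body for the first step of the upload (hypothetical schema)."""
    body = {
        "name": name,
        "version": version,
        # All files belonging to this single upload "transaction".
        "files": list(filenames),
    }
    if tuf_metadata is not None:
        # TUF metadata rides along as a sub key, so the server can check it
        # before accepting any file contents.
        body["tuf"] = tuf_metadata
    return json.dumps(body, sort_keys=True)
```

Sending the TUF metadata as a separate file instead would mean omitting the `tuf` key here and listing the metadata file alongside the others, which keeps this request smaller but defers verification until that file has been uploaded.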
@dstufft You mentioned
Have you heard any thoughts outside this issue? And how is the PEP going?
Per pypi/warehouse#7730 , overhauling Warehouse's upload API requires a new PEP. The fundable improvement guidelines say we shouldn't ask for money for something till we have consensus for the idea, meaning that any PEPs are finished and approved. Thus, this commit removes the upload work from the list of features in the PyPI API revamp task. Signed-off-by: Sumana Harihareswara <sh@changeset.nyc>
There is now PEP 694: Upload 2.0 API for Python Package Repositories, which is relevant to this issue and is being discussed on discuss.python.org.
What's the problem this feature will solve?
Right now, uploading to Warehouse is synchronous. This is a pain when we want to implement upload gating like #5420 or other checks, and cramps our style regarding TUF (@ewdurbin can go into that further).
Describe the solution you'd like
We would change the Warehouse API (or add a new version) to make uploads asynchronous.
Additional context
From IRC today: