New S3 Boto3 backend (closes #57) #111
What would it take for this to be reviewed and/or merged in? Manually running "diff s3boto.py s3boto3.py" should show the changes and why the 2 backends are so incompatible (unfortunately GitHub doesn't make this easy). A refactor to have the 2 backends share code is liable to make both unmaintainable, while replacing s3boto with the new s3boto3 is dangerous too because it is not completely a drop-in replacement, so I intentionally made it sit side by side with the existing s3boto.
I was just thinking about this as I was falling asleep last night. I'm going to push out a 1.3.2 later today and then review this. I'd like to get it into a 1.4 by the end of the week. The way the code is architected is completely correct; I am the only holdup.
self.name = name[len(self._storage.location):].lstrip('/')
self._mode = mode
self.obj = storage.bucket.Object(storage._encode_name(name))
# TODO: Is this explicitly necessary? Done to emulate old
I'd prefer to remove this sort of questionable backwards compat code for a new backend. If there is an issue we can deal with it.
There have been a few fixes that have landed in master since this was first opened. If we can get those moved over as well I will merge. I'm going to release a 1.4 that is just changing the PyPI package name back to django-storages (!!) but will do a fast 1.5 following with this code.
* Based on existing s3boto module
* Replace Boto2 headers settings with parameters
* Does not support proxies, alternate host/port
If I were to specify a KMS Key Id, should I put it in |
For the use case that I wrote this port for, we happen to use an alias for the KMS ID (e.g. 'alias/foobar'), but I think it'll support any of an alias, an ARN, or a key ID for SSEKMSKeyId.
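As a rough sketch of what such parameters could look like, assuming they are supplied as a plain dict of Boto 3 upload keyword arguments (the helper name `kms_parameters` is hypothetical and not part of this pull request):

```python
def kms_parameters(key_ref):
    """Build extra upload parameters for KMS server-side encryption.

    key_ref is passed through unchanged as SSEKMSKeyId; per the comment
    above it is expected to be an alias ('alias/foobar'), an ARN, or a
    bare key ID.
    """
    return {
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": key_ref,
    }


# Example using the alias form mentioned above.
params = kms_parameters("alias/foobar")
```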
Any timeline for this to be released? Happy to test it out and provide some feedback if necessary 😄
@alejandrodnm how would you install the fork?
@jplaza I added to my requirements.txt file The thing is that if you do a |
Cool! Thank you @alejandrodnm
Any update on this? I left a comment on b95ec91 about 1 1/2 months back noting my doubts about a specific change you were asking me to merge in, but otherwise it is up to date with what was checked in at that time.
Non-existent file raises IOError in _open for backwards compatibility with s3boto
Don't let the ClientError bubble up
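The idea in this commit message can be sketched roughly as follows. This is not the backend's actual code: `ClientError` here is a local stand-in for `botocore.exceptions.ClientError` (only its `.response` dict is assumed), and `open_or_ioerror` and its `load` callback are hypothetical names used for illustration.

```python
class ClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError (assumption: only
    the .response dict is needed for this sketch)."""

    def __init__(self, response):
        super().__init__(response)
        self.response = response


def open_or_ioerror(load, name):
    """Call load(); turn a 404 ClientError into IOError, as s3boto did.

    Any other ClientError is re-raised unchanged.
    """
    try:
        return load()
    except ClientError as err:
        status = err.response["ResponseMetadata"]["HTTPStatusCode"]
        if status == 404:
            raise IOError("File does not exist: %s" % name)
        raise
```

Callers written against s3boto that catch IOError for missing files would then keep working without knowing about ClientError.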
What is the current status of boto3 support? I'm guessing currently unsupported in latest PyPI release? The RTD documentation has instructions for it, but an attempt to use it results in an import failure for original boto.
I, too, would like to see this PR get merged. My company's upcoming website offering is going to be relying on django-storages, and we'd definitely rather not be left in the dust with old-boto as it becomes deprecated.
+1
# Preserve the trailing slash after normalizing the path.
# TODO: Handle force_http=not self.secure_urls like in s3boto
name = self._normalize_name(self._clean_name(name))
if self.custom_domain:
@mbarrien This skips generate_presigned_url for custom domains, which means that for those we lose the AWS_QUERYSTRING_AUTH parameter. I don't think that's intended.
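To illustrate the control flow being questioned, here is a simplified sketch. It is not the backend's real `url` method: `fake_presign`, the `bucket.example` host, and the made-up query string are all assumptions standing in for a presigned URL generator.

```python
def fake_presign(name):
    # Stand-in for a presigned URL generator; host and query are made up.
    return "https://bucket.example/%s?Signature=abc" % name


def url(name, custom_domain=None, querystring_auth=True, presign=fake_presign):
    """Illustrative only: shows why the custom-domain branch ignores
    the querystring_auth flag."""
    if custom_domain:
        # Early return: presign() is never called on this branch, so the
        # querystring_auth flag has no effect for custom domains.
        return "https://%s/%s" % (custom_domain, name)
    signed = presign(name)
    if querystring_auth:
        return signed
    # Unsigned URL: drop the query string from the presigned one.
    return signed.split("?", 1)[0]
```

With a custom domain set, the result is the same whether `querystring_auth` is true or false, which is the behaviour the comment above points out.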
@mbarrien Why not create a separate django-s3boto3-storage package instead? The bundling of all these different storage back ends in one package doesn't contribute to any use case I can think of.
The backends only work for django-storages so there's very little point to having a package per-backend. The real work for this is done in boto3, which is its own package. This backend just takes 30KB and has no overhead aside from the space on the drive.
With that said, I do think somebody should go through the new backend and devolve as much as possible up to boto3. Code like:
if err.response['ResponseMetadata']['HTTPStatusCode'] == 301:
    raise ImproperlyConfigured("Bucket %s exists, but in a different "
                               "region than we are connecting to. Set "
                               "the region to connect to by setting "
                               "AWS_S3_REGION_NAME to the correct region." % name)
seems very low level. I have to imagine boto3 abstracts all the HTTP responses into status codes and messages that can simply be passed to the user directly rather than running through custom logic. Where custom logic is necessary, there should at least be a TODO added that links to a ticket in boto3 trying to get that logic implemented upstream as appropriate.
@adamn Believe me, I went looking for something higher level for that and could find nothing. Perhaps things have changed since then; I haven't touched the code in a few months and perhaps Boto3 has added it.
@adamn I disagree. If this were its own package, it could directly depend on a correct minimum version of boto3. I ONLY want the s3boto3 backend, so I copied it into my project from this pull request and made some changes that I needed, but I'd rather depend on a single-purpose, clean package with tests instead of the full django-storages suite, which doesn't even have tests written for several of the back ends.
The problem with multiple backend projects is that they would all have to be coordinated for almost no gain. What would make sense might be multiple requirements.txt files. But since this project doesn't even have a requirements.txt file, that seems like a whole other issue to be addressed. I wholeheartedly agree though that any future requirements.txt files should not install drivers for the backends automatically - since that would be burdensome.
Why would they have to be coordinated? The storage system is a part of Django, not local to django-storages. Also, package dependencies are specified in setup.py for a package, not a requirements file.
Sorry, should have said setup.py. Nonetheless, there are no required packages in the current setup.py file. Coordination would be a problem if Django changes its storage interface. But anyway, this isn't really my decision to make - just an opinion.
@jschneier I've been using this in production for a few months now and I can confirm it works great. You said back in January that you'd like to land this soon but it's blocked on you. Do you need help maintaining the library? I can't promise much of my time but as long as we're using the library in our app, I can help review PRs and merge some stuff in.
BTW, I do think the PR needs some improvements (cf. my comment on |
+1 to just merging it and putting other work in future PRs.
I love this PR :) Waiting for it to be merged.
This pull request implements #57 by adding a Boto 3 backend that tries to be a close-to-drop-in replacement for Boto 2. Due to the slight differences, I have kept it a separate backend, but with a lot of copy and paste code. Given that Boto 2 is heading towards maintenance mode according to boto/boto@e3dd996, I don't think it's worth trying to have 2 backends sharing code when the Boto 2 implementation looks on the way to being deprecated.
Note that this isn't just me blithely throwing away the Boto 2 implementation; the fundamental underlying operations are VERY different and worthy of a separate backend. Boto 2 operates on the assumption that you can set arbitrary headers by passing in a dictionary; Boto 3 restricts you to specific named parameters and as such the 2 approaches are very incompatible with one another. You can try to do minor mappings here and there to try to map some of the headers in the AWS_HEADERS setting, but trying to map every possible header value to the right argument name in the right method is pretty tedious and error prone. Instead, this pull request embraces Boto 3's use of parameters as its way of taking in these extra arguments, leaving the remapping up to the django-storages user who wants to switch backends. For the limited number of extra headers and parameters they'll use, this mapping is easier to do, and looking up in Boto 3's documentation for the parameter name is straightforward.
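For someone switching backends, the remapping described above might look roughly like this. It is a sketch under stated assumptions: the `HEADER_TO_PARAMETER` table and the `headers_to_parameters` helper are hypothetical names, the table covers only a handful of headers, and the right-hand names are Boto 3 upload parameter names that should be checked against the Boto 3 documentation.

```python
# Assumed, non-exhaustive mapping from HTTP header names (as they might
# appear in an AWS_HEADERS dict) to Boto 3 upload parameter names.
HEADER_TO_PARAMETER = {
    "cache-control": "CacheControl",
    "content-disposition": "ContentDisposition",
    "content-encoding": "ContentEncoding",
    "content-language": "ContentLanguage",
    "x-amz-storage-class": "StorageClass",
    "x-amz-server-side-encryption": "ServerSideEncryption",
}


def headers_to_parameters(headers):
    """Split a Boto 2 style header dict into (parameters, unmapped).

    parameters: dict keyed by Boto 3 parameter name.
    unmapped: header names with no known equivalent, left for the user
    to look up by hand.
    """
    params = {}
    unmapped = []
    for header, value in headers.items():
        key = HEADER_TO_PARAMETER.get(header.lower())
        if key is None:
            unmapped.append(header)
        else:
            params[key] = value
    return params, unmapped


params, unmapped = headers_to_parameters(
    {"Cache-Control": "max-age=86400", "X-Custom": "1"}
)
```

Returning the unmapped headers separately, rather than guessing, reflects the point made above: a complete automatic mapping is tedious and error prone, so anything unknown is handed back to the user.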
This pull request replaces #66, adding unit tests and incorporating changes due to pull requests accepted into Boto3/botocore. A substantially similar version of this (minus the recent merges from the past few days) has been run in production with Django 1.6.11 for several months without problems.
Also note that while I was at it, I made the necessary change to support #95 if you switch to this backend.
Because this is based on s3boto.py, you should be able to manually perform a diff of s3boto.py and the new s3boto3.py to understand the changes.
Changes:
Known issues:
This has been tested using s3v4 signatures with both signed and unsigned URLs, along with response content disposition and KMS server-side encryption key arguments.