Remove key prefix from filename in recursive mode #3641

Merged: 4 commits into aws:v2 from fix-recursive-output-format, Oct 9, 2018

@stealthycoin stealthycoin commented Oct 8, 2018

fixes #604

codecov-io commented Oct 8, 2018

Codecov Report

Merging #3641 into v2 will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##               v2    #3641      +/-   ##
==========================================
+ Coverage   94.87%   94.88%   +<.01%     
==========================================
  Files         177      177              
  Lines       13521    13516       -5     
==========================================
- Hits        12828    12824       -4     
+ Misses        693      692       -1
Impacted Files                             Coverage Δ
awscli/customizations/s3/subcommands.py    97.38% <100%> (+0.17%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 580a15d...8cb123f.

@kyleknap kyleknap left a comment

Had a comment on an edge case. We will probably want to add a functional test for this. Also, I think we should feel free to refactor the internals if it simplifies the logic.

@@ -533,6 +534,9 @@ def _display_page(self, response_data, use_basename=True):
                 filename = filename_components[-1]
             else:
                 filename = content['Key']
+            if strip_prefix:
+                if filename.startswith(strip_prefix):

Unfortunately I do not think it will be that simple. Specifically, if a user provides the following command:

$ aws s3 ls s3://mybucket/a --recursive

It should print things like:

abc
aws-cli/__init__.py

By solely stripping out the prefix, we will start getting this instead:

bc
ws-cli/__init__.py

I tried fixing this a long time ago: #1009. It may be worth looking at the edge cases I ran into for reference. Looks like I at least put a good amount of comments for reference.
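
To make the edge case concrete, here is a minimal Python sketch. Both helper names (naive_strip, relative_key) are hypothetical and are not the code in this PR or in #1009. The second helper rests on one assumption: only the part of the user-supplied prefix up to its last '/' should be removed from each key.

```python
def naive_strip(prefix, key):
    """Drop the prefix characters outright (the behavior described above)."""
    if prefix and key.startswith(prefix):
        return key[len(prefix):]
    return key


def relative_key(prefix, key, delimiter='/'):
    """Drop only the part of the prefix that ends at a delimiter (assumed rule)."""
    cut = prefix.rfind(delimiter) + 1  # 0 when the prefix contains no delimiter
    if key.startswith(prefix[:cut]):
        return key[cut:]
    return key


keys = ['abc', 'aws-cli/__init__.py']

# Prefix 'a' has no delimiter, so naive stripping eats the first character.
print([naive_strip('a', k) for k in keys])   # ['bc', 'ws-cli/__init__.py']
print([relative_key('a', k) for k in keys])  # ['abc', 'aws-cli/__init__.py']

# A prefix ending in a delimiter is removed in full by both helpers.
print(relative_key('foo/', 'foo/bar'))       # bar
```

Under that assumption, a prefix such as a is treated as a filter on key names rather than as a directory to strip, which keeps the listed names intact.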

joguSD commented Oct 8, 2018

Also worth noting that this removes the ability to print the full key paths. I think it could be a separate PR, but we should look into adding an --absolute parameter for anyone who might be depending on this functionality.

@kyleknap kyleknap left a comment

Looks good. Just had some suggestions on how we can refactor this a bit more.

@@ -199,6 +199,40 @@ def test_requester_pays_with_no_args(self):
'Prefix': 'foo/'
})

def test_recursive_list_at_prefix(self):

It may be worth adding a test for the in-between delimiter cases. For example:

$ aws s3 ls s3://mybucket/f --recursive
2017-11-22 12:48:50       1448 fo
2017-11-22 12:48:50       1448 foo/bar
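
A hedged sketch of what such a check might look like, written outside the project's test harness. relative_key, format_line, and list_recursive are hypothetical stand-ins (not the PR's _get_relative_key or the real test helpers), and the column layout is an assumption modeled on the sample output above.

```python
def relative_key(prefix, key, delimiter='/'):
    # Hypothetical helper: remove only the prefix portion that ends at a delimiter.
    cut = prefix.rfind(delimiter) + 1
    return key[cut:] if key.startswith(prefix[:cut]) else key


def format_line(last_modified, size, name):
    # Assumed layout: timestamp, size right-justified to 10 columns, key name.
    return f"{last_modified} {str(size).rjust(10)} {name}"


def list_recursive(prefix, objects):
    # objects: (last_modified, size, key) tuples standing in for a list response.
    return [format_line(ts, size, relative_key(prefix, key))
            for ts, size, key in objects if key.startswith(prefix)]


objects = [
    ('2017-11-22 12:48:50', 1448, 'fo'),
    ('2017-11-22 12:48:50', 1448, 'foo/bar'),
]

# Prefix 'f' sits in between delimiters: nothing should be stripped.
lines = list_recursive('f', objects)
assert [line.split()[-1] for line in lines] == ['fo', 'foo/bar']

# Prefix 'foo/b' ends mid-segment: only 'foo/' should be stripped.
lines_mid = list_recursive('foo/b', objects)
assert [line.split()[-1] for line in lines_mid] == ['bar']
```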

-            if strip_prefix:
-                if filename.startswith(strip_prefix):
-                    filename = filename[len(strip_prefix) + 1:]
+            filename = self._get_relative_key(key, content['Key'])

I wonder if we can simplify this method a little more? Specifically, I think we have generalized the _get_relative_key() method enough that we may be able to remove the use_basename parameter completely and just make the key required.

@@ -512,7 +512,7 @@ def _list_all_objects(self, bucket, key, page_size=None,
self._display_page(response_data)

     def _display_page(self, response_data, use_basename=True,
-                      strip_prefix=None):
+                      key=''):

Would it be possible to rename this to something like provided_prefix or user_prefix? The name key is not quite correct, since that value was used as the Prefix value for the ListObjectsV2 call, and it is ambiguous whether the key value came from the response or from user input.

kyleknap commented Oct 9, 2018

@joguSD, we talked offline about this, but I wanted to capture what we discussed. I agree: the --absolute parameter would help a lot, especially if you are scripting off of the output. We should add it if we get enough requests for it. However, the purpose of this PR is to make the formats consistent, and relative output seems like a good default for interactively inspecting objects in a bucket. Specifically, it would get annoying to run the aws s3 ls command and have the entire key name returned for deeply nested objects.

Removed a branch from the _display_page method and added a functional
and integration test for another case.
@stealthycoin stealthycoin force-pushed the fix-recursive-output-format branch from 31ef7bc to 42c7b1d Compare October 9, 2018 23:06
@stealthycoin stealthycoin force-pushed the fix-recursive-output-format branch from 42c7b1d to 8cb123f Compare October 9, 2018 23:11
@kyleknap kyleknap left a comment

Looks good. I really like the refactoring. 🚢

@stealthycoin stealthycoin merged commit c6fe069 into aws:v2 Oct 9, 2018
@stealthycoin stealthycoin deleted the fix-recursive-output-format branch October 9, 2018 23:26