Errors and warnings are printed to stderr. #919

Merged: 2 commits, Sep 25, 2014
3 changes: 3 additions & 0 deletions CHANGELOG.rst
@@ -9,6 +9,9 @@ Next Release (TBD)
   ``aws configure set`` command now write out all credential
   variables to the shared credentials file ``~/.aws/credentials``
   (`issue 847 <https://github.com/aws/aws-cli/issues/847>`__)
+* bugfix:``aws s3``: Write warnings and errors to standard error as
+  opposed to standard output.
+  (`issue 919 <https://github.com/aws/aws-cli/pull/919>`__)


1.4.4
14 changes: 11 additions & 3 deletions awscli/customizations/s3/executor.py
@@ -274,13 +274,15 @@ def run(self):

     def _process_print_task(self, print_task):
         print_str = print_task.message
+        print_to_stderr = False
Member:
With the newly added lines here, this method is now 62 lines long. Let's split this up to make it easier to manage.

         if print_task.error:
             self.num_errors_seen += 1
+            print_to_stderr = True
         warning = False
-        if print_task.warning:
-            warning = True
-            self.num_warnings_seen += 1
+        if print_task.warning:
+            warning = True
+            self.num_warnings_seen += 1
+            print_to_stderr = True
         final_str = ''
         if warning:
             final_str += print_str.ljust(self._progress_length, ' ')
@@ -309,6 +311,12 @@ def _process_print_task(self, print_task):
             self._num_parts += 1
             self._file_count += 1

+        # If the message is an error or warning, print it to standard error.
+        if print_to_stderr == True and not self._quiet:
Member:
Don't use equality checks to compare booleans; just write if print_to_stderr and ...

+            uni_print(final_str, sys.stderr)
+            sys.stderr.flush()
Member:
I just noticed that every usage of uni_print is followed by a flush of the same stream. Why not have uni_print flush the fileobj before returning?

Contributor Author:
uni_print is used by other classes. I checked those classes and they also call flush(), so it should be safe to change.

+            final_str = ''
Member:
Why do we want to not return after this line? Do we still want to print the progress on an error? For example:

$ aws s3 sync . s3://bucket/syncme/
warning: Skipping file /private/tmp/syncme/baz. File/Directory is not readable.
Completed 0 part(s) with ... file(s) remaining

Looks a bit off to me.

Contributor Author:
We do not want to return. The progress bar should always be the last thing printed for each print task, unless it is the last file. The example you gave is legitimate since the ... implies that there are files whose upload will still be attempted, and the user should know that. It is also true that 0 parts have been uploaded, because that file is skipped. The progress bar is no longer printed once the number of completed tasks equals the number of expected tasks; warnings/skips are not included in this count. The behavior you described is the current behavior of the PrintThread even without my change. My change only prints it to stderr as opposed to stdout.

If I were to put a return at the end of that code block, there could be scenarios where the user is without the progress bar for a long period of time. For example, if you had thousands of failed uploads, the progress bar would never show up because we never print it, and therefore the user would have no idea when the command is reaching an end.

Member:
In the example I gave above, it is the last file we try to sync. Specifically, I have three files in my directory, one of which is unreadable. Running the sync command I get:

# First sync, this output looks reasonable:

$ aws s3 sync . s3://bucket/syncme2/
warning: Skipping file /private/tmp/syncme/baz. File/Directory is not readable.
upload: ./foo to s3://bucket/syncme2/foo
upload: ./bar to s3://bucket/syncme2/bar
$

# Now running the same command again I get:

$ aws s3 sync . s3://bucket/syncme2/
warning: Skipping file /private/tmp/syncme/baz. File/Directory is not readable.
Completed 0 part(s) with ... file(s) remaining
$

The last line seems out of place to me and I don't think it should be there. I'm not saying we shouldn't have progress messages, but it seems like there are cases where we print a progress message that shouldn't be printed, such as in my example above.

Given this behavior existed prior to this change (I only noticed because you're modifying this area of the code and it stood out), I would be ok with addressing this as a separate PR.

Contributor Author:
Hmm, I just tested this and I cannot reproduce this output:

$ aws s3 mb s3://knapp-new-bucket-2
make_bucket: s3://knapp-new-bucket-2/

$ aws s3 sync . s3://knapp-new-bucket-2/
warning: Skipping file /Users/kyleknap/Documents/GitHub/aws-cli/tests/integration/customizations/s3/wierd_dir/unreadable. File/Directory is not readable.
upload: ./bar to s3://knapp-new-bucket-2/bar     
upload: ./foo to s3://knapp-new-bucket-2/foo

$ aws s3 sync . s3://knapp-new-bucket-2/
warning: Skipping file /Users/kyleknap/Documents/GitHub/aws-cli/tests/integration/customizations/s3/wierd_dir/unreadable. File/Directory is not readable.

$ 

Member:
It depends on the order of the files. For example:

$ ls
bar baz foo
$ aws s3 sync . s3://jamesls-test-temp/
warning: Skipping file /private/tmp/syncme/baz. File/Directory is not readable.
Completed 0 part(s) with ... file(s) remaining


$ sudo mv baz unreadable
$ ls
bar        foo        unreadable
$ aws s3 sync . s3://jamesls-test-temp/
warning: Skipping file /private/tmp/syncme/unreadable. File/Directory is not readable.

$

Contributor Author:
Makes sense. It is an easy fix. We just need to add a check so that if self._num_parts == 0, we do not print the progress bar, probably somewhere around here:
https://github.com/aws/aws-cli/blob/develop/awscli/customizations/s3/executor.py#L312

I can make another pull request later to fix it.

Contributor Author:
Or, I meant self._file_count.


         is_done = self._total_files == self._file_count
         if not is_done:
             prog_str = "Completed %s " % self._num_parts
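The follow-up fix proposed in the thread above (do not print the progress line when no files have completed yet) can be sketched as follows. This is a simplified standalone model for illustration, not the real PrintThread; the attribute names mirror the ones visible in the diff, but the ProgressPrinter class and its methods are hypothetical.

```python
class ProgressPrinter:
    """Toy model of the progress-line decision in PrintThread."""

    def __init__(self, total_files):
        self._total_files = total_files
        self._file_count = 0
        self._num_parts = 0

    def record_completion(self, parts=1):
        # Mirrors the increments visible in the diff above.
        self._num_parts += parts
        self._file_count += 1

    def progress_line(self):
        """Return the progress string, or None when it should be suppressed."""
        is_done = self._total_files == self._file_count
        # Proposed guard: if nothing has completed yet (only warnings/skips
        # so far), suppress the progress bar instead of printing
        # "Completed 0 part(s) ...".
        if is_done or self._file_count == 0:
            return None
        return "Completed %s part(s) with %s file(s) remaining" % (
            self._num_parts, self._total_files - self._file_count)

printer = ProgressPrinter(total_files=3)
print(printer.progress_line())  # None: only a warning so far, nothing completed
printer.record_completion()
print(printer.progress_line())
```

With this guard, the "Completed 0 part(s) with ... file(s) remaining" line from the sync example above would no longer appear when every file was skipped.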
20 changes: 11 additions & 9 deletions awscli/customizations/s3/utils.py
@@ -223,24 +223,26 @@ def __init__(self):
self.count = 0


-def uni_print(statement):
+def uni_print(statement, out_file=None):
     """
-    This function is used to properly write unicode to stdout. It
-    ensures that the proper encoding is used if the statement is
-    not in a version type of string. The initial check is to
-    allow if ``sys.stdout`` does not use an encoding
+    This function is used to properly write unicode to a file, usually
+    stdout or stderr. It ensures that the proper encoding is used if the
+    statement is not a string type. The initial check is to allow if
Member:
The "The initial check is to allow if..." sentence is confusing.

+    ``out_file`` does not use an encoding.
     """
-    encoding = getattr(sys.stdout, 'encoding', None)
+    if out_file is None:
+        out_file = sys.stdout
+    encoding = getattr(out_file, 'encoding', None)
     if encoding is not None and not PY3:
-        sys.stdout.write(statement.encode(sys.stdout.encoding))
+        out_file.write(statement.encode(out_file.encoding))
     else:
         try:
-            sys.stdout.write(statement)
+            out_file.write(statement)
         except UnicodeEncodeError:
             # Some file like objects like cStringIO will
             # try to decode as ascii. Interestingly enough
             # this works with a normal StringIO.
-            sys.stdout.write(statement.encode('utf-8'))
+            out_file.write(statement.encode('utf-8'))


def guess_content_type(filename):
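The reviewer's suggestion that uni_print could flush the stream itself, instead of every call site calling flush(), can be sketched like this. This is a simplified Python 3-only rendering of the helper with a flush parameter folded in; the flush parameter is our addition for illustration (the PR itself keeps explicit flush() calls at the call sites), and the error handling here omits the Python 2 encoding branch shown in the diff.

```python
import sys

def uni_print(statement, out_file=None, flush=True):
    """Write unicode text to out_file (default: stdout).

    Sketch only: folds the reviewer's suggestion of flushing the
    stream before returning into the helper itself.
    """
    if out_file is None:
        out_file = sys.stdout
    try:
        out_file.write(statement)
    except UnicodeEncodeError:
        # Some file-like objects (e.g. cStringIO on Python 2)
        # try to decode as ascii; fall back to explicit utf-8.
        out_file.write(statement.encode('utf-8'))
    if flush:
        # Callers no longer need their own out_file.flush().
        out_file.flush()

# Route a warning to stderr, mirroring the new call in executor.py:
uni_print("warning: Skipping file ...\n", sys.stderr)
```

The design question the thread raises is simply whether the flush belongs in the helper or at the call sites; since every existing caller flushes immediately after, moving it into the helper removes the duplication without changing behavior.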
30 changes: 15 additions & 15 deletions tests/integration/customizations/s3/test_plugin.py
@@ -164,10 +164,10 @@ def assert_no_errors(self, p):
         self.assertEqual(
             p.rc, 0,
             "Non zero rc (%s) received: %s" % (p.rc, p.stdout + p.stderr))
-        self.assertNotIn("Error:", p.stdout)
-        self.assertNotIn("failed:", p.stdout)
-        self.assertNotIn("client error", p.stdout)
-        self.assertNotIn("server error", p.stdout)
+        self.assertNotIn("Error:", p.stderr)
+        self.assertNotIn("failed:", p.stderr)
+        self.assertNotIn("client error", p.stderr)
+        self.assertNotIn("server error", p.stderr)


class TestMoveCommand(BaseS3CLICommand):
@@ -458,7 +458,7 @@ def test_download_non_existent_key(self):
         expected_err_msg = (
             'A client error (NoSuchKey) occurred when calling the '
             'HeadObject operation: Key "foo.txt" does not exist')
-        self.assertIn(expected_err_msg, p.stdout)
+        self.assertIn(expected_err_msg, p.stderr)


class TestSync(BaseS3CLICommand):
@@ -645,7 +645,7 @@ def testFailWithoutRegion(self):
         p2 = aws('s3 sync s3://%s/ s3://%s/ --region %s' %
                  (self.src_bucket, self.dest_bucket, self.src_region))
         self.assertEqual(p2.rc, 1, p2.stdout)
-        self.assertIn('PermanentRedirect', p2.stdout)
+        self.assertIn('PermanentRedirect', p2.stderr)

def testCpRegion(self):
self.files.create_file('foo.txt', 'foo')
@@ -695,9 +695,9 @@ def extra_setup(self):
     def test_no_exist(self):
         filename = os.path.join(self.files.rootdir, "no-exists-file")
         p = aws('s3 cp %s s3://%s/' % (filename, self.bucket_name))
-        self.assertEqual(p.rc, 2, p.stdout)
+        self.assertEqual(p.rc, 2, p.stderr)
         self.assertIn('warning: Skipping file %s. File does not exist.' %
-                      filename, p.stdout)
+                      filename, p.stderr)

@unittest.skipIf(platform.system() not in ['Darwin', 'Linux'],
'Read permissions tests only supported on mac/linux')
@@ -711,9 +711,9 @@ def test_no_read_access(self):
         permissions = permissions ^ stat.S_IREAD
         os.chmod(filename, permissions)
         p = aws('s3 cp %s s3://%s/' % (filename, self.bucket_name))
-        self.assertEqual(p.rc, 2, p.stdout)
+        self.assertEqual(p.rc, 2, p.stderr)
         self.assertIn('warning: Skipping file %s. File/Directory is '
-                      'not readable.' % filename, p.stdout)
+                      'not readable.' % filename, p.stderr)

@unittest.skipIf(platform.system() not in ['Darwin', 'Linux'],
'Special files only supported on mac/linux')
@@ -723,10 +723,10 @@ def test_is_special_file(self):
         sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
         sock.bind(file_path)
         p = aws('s3 cp %s s3://%s/' % (file_path, self.bucket_name))
-        self.assertEqual(p.rc, 2, p.stdout)
+        self.assertEqual(p.rc, 2, p.stderr)
         self.assertIn(("warning: Skipping file %s. File is character "
                        "special device, block special device, FIFO, or "
-                       "socket." % file_path), p.stdout)
+                       "socket." % file_path), p.stderr)


@unittest.skipIf(platform.system() not in ['Darwin', 'Linux'],
@@ -806,10 +806,10 @@ def test_follow_symlinks_default(self):

     def test_bad_symlink(self):
         p = aws('s3 sync %s s3://%s/' % (self.files.rootdir, self.bucket_name))
-        self.assertEqual(p.rc, 2, p.stdout)
+        self.assertEqual(p.rc, 2, p.stderr)
         self.assertIn('warning: Skipping file %s. File does not exist.' %
                       os.path.join(self.files.rootdir, 'b-badsymlink'),
-                      p.stdout)
+                      p.stderr)


class TestUnicode(BaseS3CLICommand):
@@ -945,7 +945,7 @@ def test_mb_rb(self):
     def test_fail_mb_rb(self):
         # Choose a bucket name that already exists.
         p = aws('s3 mb s3://mybucket')
-        self.assertIn("BucketAlreadyExists", p.stdout)
+        self.assertIn("BucketAlreadyExists", p.stderr)
         self.assertEqual(p.rc, 1)


8 changes: 4 additions & 4 deletions tests/integration/customizations/s3/test_s3handler.py
@@ -163,12 +163,12 @@ def setUp(self):
         self.s3_files = [self.bucket + '/text1.txt',
                          self.bucket + '/another_directory/text2.txt']
         self.output = StringIO()
-        self.saved_stdout = sys.stdout
-        sys.stdout = self.output
+        self.saved_stderr = sys.stderr
+        sys.stderr = self.output

     def tearDown(self):
         self.output.close()
-        sys.stdout = self.saved_stdout
+        sys.stderr = self.saved_stderr
         clean_loc_files(self.loc_files)
         s3_cleanup(self.bucket, self.session)

@@ -215,7 +215,7 @@ def setUp(self):
         self.session = botocore.session.get_session(EnvironmentVariables)
         self.service = self.session.get_service('s3')
         self.endpoint = self.service.get_endpoint('us-east-1')
-        params = {'region': 'us-east-1', 'acl': ['private']}
+        params = {'region': 'us-east-1', 'acl': ['private'], 'quiet': True}
         self.s3_handler = S3Handler(self.session, params)
self.bucket = make_s3_files(self.session, key1=u'\u2713')
self.bucket2 = create_bucket(self.session)
14 changes: 11 additions & 3 deletions tests/unit/customizations/s3/test_executor.py
@@ -90,12 +90,20 @@ def __call__(self):
         executor.wait_until_shutdown()
         self.assertEqual(open(f.name, 'rb').read(), b'foobar')


 class TestPrintThread(unittest.TestCase):
+    def test_print_error(self):
+        result_queue = queue.Queue()
+        print_task = PrintTask(message="Fail File.", error=True)
+        thread = PrintThread(result_queue, False)
+        with mock.patch('sys.stderr', new=six.StringIO()) as mock_stderr:
+            thread._process_print_task(print_task)
Member:
It's frowned upon to test through an internal API. Unless there's a really good reason, the unittests should be using the public APIs of the objects under test.

Contributor Author:
Yeah, the only reason I tested this way is that these are internal methods of threads, and the only other way to call these methods is to run the thread. That would mean I'd have to be careful that the threads are cleaned up properly. But it should not be an issue if I use the Executor methods to control the threads.

Member:
This is a bit of an aside, but I guess I should also mention that you typically don't want to spin up threads in a unittest unless that's actually the behavior you're testing, i.e. something related to threading/synchronization. I realize that there are some existing S3 tests that do this, but we really shouldn't be doing this in new tests. In this scenario, the thing you're testing has, as you mentioned, nothing to do with threads, so they should ideally not be spun up. Compare that to a test like TestMultipartUploadContext, in which the class it's testing is all about coordinating multiple threads together for a concurrent multipart upload (and nothing else). It makes sense there to spin up multiple threads because that's all that class is used for.

If the functionality is too hard to test in the existing class then it might make sense to extract a new class from this code. It also looks like you might be able to accomplish what you want by using .run() instead of .start(), which does not spin up a new thread. Some of the tests above in this file do this and you might be able to follow a similar pattern.

Contributor Author:
Oh, I forgot that start() is a separate function from run(). For some reason, I thought run() encompassed start(). This is a lot easier now.

+            self.assertIn("Fail File.", mock_stderr.getvalue())

     def test_print_warning(self):
         result_queue = queue.Queue()
         print_task = PrintTask(message="Bad File.", warning=True)
         thread = PrintThread(result_queue, False)
-        with mock.patch('sys.stdout', new=six.StringIO()) as mock_stdout:
+        with mock.patch('sys.stderr', new=six.StringIO()) as mock_stderr:
             thread._process_print_task(print_task)
-            self.assertIn("Bad File.", mock_stdout.getvalue())
+            self.assertIn("Bad File.", mock_stderr.getvalue())
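The run()-instead-of-start() pattern the reviewer suggests can be sketched with a minimal, self-contained analogue. Because run() executes the thread's body synchronously in the current thread, a test needs no thread startup or cleanup. The PrintWorker class and its SHUTDOWN sentinel below are illustrative assumptions, not the real PrintThread (whose queue protocol and shutdown mechanism are not shown in this diff).

```python
import io
import queue
import threading

SHUTDOWN = object()  # hypothetical shutdown sentinel

class PrintWorker(threading.Thread):
    """Toy stand-in for PrintThread: drains a queue, writes to a stream."""

    def __init__(self, result_queue, out_file):
        super().__init__()
        self.result_queue = result_queue
        self.out_file = out_file

    def run(self):
        while True:
            task = self.result_queue.get()
            if task is SHUTDOWN:
                return
            self.out_file.write(task + '\n')

# Test-style usage: queue a task plus the sentinel, then call run()
# directly. No new thread is spawned, so nothing needs cleaning up.
q = queue.Queue()
err = io.StringIO()
q.put("warning: Bad File.")
q.put(SHUTDOWN)
worker = PrintWorker(q, err)
worker.run()  # synchronous; .start() would spin up a real thread
assert "Bad File." in err.getvalue()
```

This exercises the same code path as the thread's normal operation while keeping the test single-threaded and deterministic.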
4 changes: 2 additions & 2 deletions tests/unit/customizations/s3/test_s3handler.py
@@ -140,7 +140,7 @@ def setUp(self):
         self.session = FakeSession()
         self.service = self.session.get_service('s3')
         self.endpoint = self.service.get_endpoint('us-east-1')
-        params = {'region': 'us-east-1', 'acl': ['private']}
+        params = {'region': 'us-east-1', 'acl': ['private'], 'quiet': True}
         self.s3_handler = S3Handler(self.session, params)
         self.s3_handler_multi = S3Handler(self.session, multi_threshold=10,
                                           chunksize=2,
@@ -275,7 +275,7 @@ def setUp(self):
         self.session = FakeSession(True, True)
         self.service = self.session.get_service('s3')
         self.endpoint = self.service.get_endpoint('us-east-1')
-        params = {'region': 'us-east-1'}
+        params = {'region': 'us-east-1', 'quiet': True}
Member:
What's the reason for the quiet arg being added here?

Contributor Author:
Some of the tests involve testing failing multipart tasks. However, the failures get printed to stderr now, and those messages were getting mixed with dots when running nose, which was annoying.

Member:
Yeah, that makes sense. It was just that, from a reviewer's perspective, this change seemed out of place without any comment or explanation in the PR, so it wasn't clear whether it was a mistake or intentional and, if so, what its purpose was.

Not opposed to leaving this.

self.s3_handler_multi = S3Handler(self.session, params,
multi_threshold=10, chunksize=2)
self.bucket = create_bucket(self.session)
9 changes: 7 additions & 2 deletions tests/unit/customizations/s3/test_subcommands.py
@@ -114,12 +114,17 @@ def setUp(self):
         self.bucket = make_s3_files(self.session)
         self.loc_files = make_loc_files()
         self.output = StringIO()
+        self.err_output = StringIO()
         self.saved_stdout = sys.stdout
+        self.saved_stderr = sys.stderr
         sys.stdout = self.output
+        sys.stderr = self.err_output

     def tearDown(self):
         self.output.close()
+        self.err_output.close()
         sys.stdout = self.saved_stdout
+        sys.stderr = self.saved_stderr

super(CommandArchitectureTest, self).tearDown()
clean_loc_files(self.loc_files)
@@ -222,7 +227,7 @@ def test_error_on_same_line_as_status(self):
         output_str = (
             "upload failed: %s to %s Error: Bucket does not exist\n" % (
                 rel_local_file, s3_file))
-        self.assertIn(output_str, self.output.getvalue())
+        self.assertIn(output_str, self.err_output.getvalue())

def test_run_cp_get(self):
# This ensures that the architecture sets up correctly for a ``cp`` get
@@ -362,7 +367,7 @@ def test_run_rb_nonzero_rc(self):
         cmd_arc.create_instructions()
         rc = cmd_arc.run()
         output_str = "remove_bucket failed: %s" % s3_prefix
-        self.assertIn(output_str, self.output.getvalue())
+        self.assertIn(output_str, self.err_output.getvalue())
         self.assertEqual(rc, 1)

