aws s3 sync from s3 to local disk SCRAMBLED MY SYSTEM FILES! #1174
I rolled back to jan 21 (before the corruption). FYI, my Oracle VirtualBox version is 4.3.12 r93733 and I run it on a Windows 8.1 64-bit host. I have 4 physical cores and I had given the machine 100% of them. The more I think about this, the more bizarre it is that any Java process could write to a root-only file; it seems like the kernel would have stopped the write, if it had gone through the kernel. Also, it was a lot of data to write to the .vdi (+48gb). I suppose the only explanation that does make sense is VirtualBox writing out of bounds to the .vdi file on my Windows host. I'm not sure how the ext filesystem works, or whether that could possibly result in the type of file corruption I described above. There also appear to be similar, unresolved issues documented against older versions of vbox: https://www.virtualbox.org/ticket/10031 so it's probably safest to run outside of vms, or on vms with more resources. Most likely it's a very specific evil version combination I have, at either the Windows or VirtualBox level. I will leave this issue in case anyone else reports something similar in the future.
Sorry to hear that happened to you. We have never seen anything like that happen before, nor does it really make any sense, from an AWS CLI developer's point of view, that it happened. Besides the fact that it does not make sense for the CLI to write to a directory that was not even part of the targeted directory, the CLI only uses the permissions of the user it is run by. Given the version of the CLI that you were using, the CLI would actually error out if the user did not have access to the file. In the current version of the CLI, version 1.7.11, we actually check whether we have access and skip the file if we do not, instead of failing completely. It sounds like you updated to the latest CLI, which is good. As for trying to cut down on the computational cost of the s3 sync, take a look at this PR: #1122. We are working on adding documentation for it, but the PR provides a decent description. You can configure the CLI to use fewer threads for s3 operations. By default, I believe it uses 10 threads, so you could drop that down to something lower. Let me know if you run into this issue again or have any other questions.
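For example, a minimal sketch assuming the max_concurrent_requests setting that PR introduces, dropping the default of 10 down to 5:

aws configure set default.s3.max_concurrent_requests 5

This writes the value into the s3 section of the default profile in ~/.aws/config.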
I am closing this issue since it seems that nobody else has run into it. If you or anyone else runs into this issue, please reopen.
My command was:
cd /data; mkdir aj-dynamo-backups/; aws s3 sync s3://aj-dynamo-backups/ aj-dynamo-backups/
NOTE: I ran it several times, because I noticed a few more files appeared to be downloaded each time. It's like it skips some files during the download, and you have to re-run it until it stops reporting new downloads to be sure you got them all (see the loop sketch below). I waited for it to complete between each run. It was run inside a 64-bit VirtualBox VM with Ubuntu 14.04, in case that matters.
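A hedged way to script that re-run-until-quiet pattern, assuming a sync pass that transfers nothing prints nothing to stdout:

while [ -n "$(aws s3 sync s3://aj-dynamo-backups/ aj-dynamo-backups/)" ]; do :; done

Each pass runs to completion before its output is checked, so the loop stops on the first pass that downloads nothing.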
For the next few days afterward, I would see random errors like these when starting gvim, and these when I tried to solve the problem with apt-get update && apt-get dist-upgrade: http://i.imgur.com/pcXTz5M.png

Upon viewing one of the files in less, such as /etc/modprobe.d/alsa-base.conf, I see: http://i.imgur.com/tRyd3y2.png

...clearly a portion of my DynamoDB dump, in the tell-tale quasi-JSON format the DynamoDB exporter writes to S3.
I could not finish my dist-upgrade due to this issue. I am mind-boggled. Some questions: how did anything write to /etc/ at all? Why a file like /etc/modprobe.d/alsa-base.conf? I ran the command as the user developer. That user does have passwordless sudoers access; that is the nearest thing I can think of. But I did not use sudo to run the aws cli command, as you can see in my example above. And the directory permissions for /etc/ were root-owned and owner-writable only.
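For reference, this is the kind of check that shows those permissions (illustrative commands; the original screenshot is not reproduced here):

ls -ld /etc /etc/modprobe.d
ls -l /etc/modprobe.d/alsa-base.conf

On a stock Ubuntu 14.04 install these are owned by root:root with modes 755 (directories) and 644 (the file), i.e. writable only by root.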
I haven't counted how many files are affected, but I saved a snapshot of the VM for forensic analysis. It seems like potentially hundreds of files under /etc/ are affected.

I am afraid to use aws s3 sync in general now. It seems like whatever happened was a catastrophic-level bug. That, or severe VM corruption, but I find that difficult to believe given the way each file was corrupted. It seemed somewhat deterministic: the filesystem was still readable, filenames and paths were still intact, and their contents were readable by less and still legible/recognizable as DynamoDB dump output.

On the plus side, there is 48GB of data downloaded to my /data/aj-dynamo-backups/ dir. It appears to be the complete copy I intended to receive, but I don't know how I would compare it byte-for-byte without some kind of recursive S3 directory checksum (see the sketch below).

Just a warning to other users. I am solving the issue by rolling back my VM to an earlier snapshot. I now doubt whether I can rely on this to play a role in the long-term DynamoDB backup/mirror solution I had intended. Any help appreciated.
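One hedged way to spot-check individual files against S3, assuming the objects were uploaded in a single part (in which case the ETag is the object's MD5 digest; multipart-upload ETags are not plain MD5s):

key=dump/part-00000        # hypothetical object key
aws s3api head-object --bucket aj-dynamo-backups --key "$key" --query ETag --output text
md5sum "aj-dynamo-backups/$key"

If the ETag (minus its surrounding quotes) matches the local MD5, that file arrived intact.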