Hayden Baker
A Rust-based implementation of a file-synchronization service
This README assumes an Ubuntu/Debian development environment, so your mileage may vary if you're trying to build/run this on a different distribution... beware!
Install Rust nightly:

```sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

- Make sure you install the nightly toolchain
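If you didn't select nightly during the rustup installation, you can install it and make it the default afterwards:

```sh
rustup toolchain install nightly
rustup default nightly
```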
Make sure that the Rust compiler (`rustc`) is installed:

```sh
rustc --version
```

- You should see a version like `rustc 1.4x.x-nightly ...`
If this is a fresh VM, update apt:

```sh
apt-get update
```

Install the packages needed to build the binaries:

```sh
apt-get install build-essential pkg-config libssl-dev
```
Configuration files for both services exist, by default, at `~/.config/sync-client` and `~/.config/sync-server`
To configure the services, you must edit the sync-client/sync-server configuration files with:
- valid AWS credentials
- a client-id (only the client config requires this)
- the sync directory, which must be of the form `/dir1/dir2/` (only the client config requires this)
- the S3 bucket name
- the downstream queue handle
- the SQS handle prefix (e.g. `https://sqs.us-east-1.amazonaws.com/xxxxxxxxxxxxxxxx`)
Additionally, you must ensure that the process running the client has permissions to read and write the sync-directory
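As a rough illustration, the configuration boils down to a handful of key/value settings along these lines. The key names and file format shown here are hypothetical, not taken from the repository; check the default files under `~/.config/sync-client` and `~/.config/sync-server` for the actual layout.

```
# Hypothetical sketch only -- real key names/format may differ.
aws_access_key_id     = <your AWS access key>
aws_secret_access_key = <your AWS secret key>
client_id             = <unique client id>                 # client config only
sync_directory        = /dir1/dir2/                        # client config only
s3_bucket             = <bucket name used in the S3 setup below>
downstream_queue      = <downstream queue handle>
sqs_prefix            = https://sqs.us-east-1.amazonaws.com/xxxxxxxxxxxxxxxx
```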
Both services can be run in development mode with:

```sh
cargo run --bin client
cargo run --bin server
```
However, for actual use you should build release binaries for the individual services:

```sh
cargo build --bin client --release
cargo build --bin server --release
```
Both binaries will be built and placed within `target/release`. Be aware that building in release mode will take a few minutes, since this application relies on a few heavyweight crates
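Once built, the binaries can be started directly from the target directory (assuming no command-line arguments are needed beyond the config files):

```sh
./target/release/server
./target/release/client
```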
Before running the services, you must set up and configure the AWS services the application relies on (S3, SQS, and DynamoDB)
- Additionally, ensure that everything you set up is in `us-east-1`
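If you use the AWS CLI for the set-up steps below, you can point it at that region up front:

```sh
aws configure set region us-east-1
```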
Using free-tier EC2 instances, you can boot up Ubuntu boxes and then follow this guide on them. The exact instance configuration is not important for now.
The client and server can technically run on the same instance, since they're decoupled via SQS, but the details are up to you. However, the synchronization server should be as close as possible (lowest latency) to the other AWS services
You should have a single bucket set up with the same name as you specify in your configuration files
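One way to create the bucket is with the AWS CLI (substitute the bucket name you put in your configuration files):

```sh
aws s3 mb s3://<your-bucket-name> --region us-east-1
```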
The actual objects that will be placed in S3 are the chunks of files, which are named based on their content-defined hash (Adler32)
Similar to S3, the downstream queue (main event queue) should be named `sync-downstream.fifo`
The upstream queues will be generated by the individual clients themselves, and they will follow the format `sync-upstream-<client-id>.fifo`
- For the downstream queue, you must enable Content-Based Deduplication when creating it, since it is a FIFO queue
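If you create the downstream queue with the AWS CLI rather than the console, it would look roughly like this:

```sh
aws sqs create-queue \
  --queue-name sync-downstream.fifo \
  --attributes FifoQueue=true,ContentBasedDeduplication=true
```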
All that is required for the DynamoDB set-up is a single table, `file`, with a primary key attribute, `path`
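Assuming `path` is a string attribute, the table can be created with the AWS CLI roughly as follows (the billing mode here is just an example choice):

```sh
aws dynamodb create-table \
  --table-name file \
  --attribute-definitions AttributeName=path,AttributeType=S \
  --key-schema AttributeName=path,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
```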