Skip to content

Commit

Permalink
convert rfs to a work space
Browse files Browse the repository at this point in the history
  • Loading branch information
rawdaGastan committed May 19, 2024
1 parent a2aaa68 commit e61738a
Show file tree
Hide file tree
Showing 23 changed files with 3,666 additions and 446 deletions.
310 changes: 166 additions & 144 deletions Cargo.lock

Large diffs are not rendered by default.

74 changes: 4 additions & 70 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,71 +1,5 @@
[package]
name = "rfs"
version = "0.2.0"
edition = "2018"
[workspace]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[build-dependencies]
git-version = "0.3.5"

[[bin]]
name = "rfs"
path = "src/main.rs"
required-features = ["build-binary"]

[features]
build-binary = [
"dep:polyfuse",
"dep:simple_logger",
"dep:tempfile",
"dep:daemonize",
"dep:clap"
]

[lib]
name = "rfs"
path = "src/lib.rs"

[dependencies]
anyhow = "1.0.44"
time = "0.3.3"
sqlx = { version = "0.7.4", features = [ "runtime-tokio-rustls", "sqlite" ] }
tokio = { version = "1", features = [ "rt", "rt-multi-thread", "macros"] }
libc = "0.2"
futures = "0.3"
thiserror = "1.0"
bytes = "1.1.0"
log = "0.4"
lru = "0.7.0"
nix = "0.23.0"
snap = "1.0.5"
bb8-redis = "0.13"
async-trait = "0.1.53"
url = "2.3.1"
blake2b_simd = "1"
aes-gcm = "0.10"
hex = "0.4"
lazy_static = "1.4"
rand = "0.8"
# next are only needed for the binarys
clap = { version = "4.2", features = ["derive"], optional = true}
simple_logger = {version = "1.0.1", optional = true}
daemonize = { version = "0.5", optional = true }
tempfile = { version = "3.3.0", optional = true }
workers = { git="https://github.com/threefoldtech/tokio-worker-pool.git" }
rust-s3 = "0.34.0-rc3"
openssl = { version = "0.10", features = ["vendored"] }
regex = "1.9.6"
which = "6.0"

[dependencies.polyfuse]
branch = "master"
git = "https://github.com/muhamadazmy/polyfuse"
optional = true

[dev-dependencies]
reqwest = { version = "0.11", features = ["blocking"] }
assert_cmd = "2.0"

[profile.release]
lto = true
codegen-units = 1
members = [
"rfs",
]
141 changes: 12 additions & 129 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,147 +1,30 @@
# rfs

# Introduction
This repo contains the binaries related to rfs.

[![Test](https://github.com/threefoldtech/rfs/actions/workflows/tests.yaml/badge.svg?branch=master)](https://github.com/threefoldtech/rfs/actions/workflows/tests.yaml)

## Introduction

`rfs` is the main tool to create, mount and extract FungiStore lists (FungiList)`fl` for short. An `fl` is a simple format
to keep information about an entire filesystem in a compact form. It does not hold the data itself but enough information to
retrieve this data back from a `store`.

## Building rfs
## Build

To build rfs make sure you have rust installed then run the following commands:
Make sure you have rust installed then run the following commands:

```bash
# this is needed to be run once to make sure the musl target is installed
rustup target add x86_64-unknown-linux-musl

# build the binary
# build all binaries
cargo build --features build-binary --release --target=x86_64-unknown-linux-musl
```

the binary will be available under `./target/x86_64-unknown-linux-musl/release/rfs` you can copy that binary then to `/usr/bin/`
the rfs binary will be available under `./target/x86_64-unknown-linux-musl/release/rfs` you can copy that binary then to `/usr/bin/`
to be able to use from anywhere on your system.

## Stores

A store in where the actual data lives. A store can be as simple as a `directory` on your local machine in that case the files on the `fl` are only 'accessible' on your local machine. A store can also be a `zdb` running remotely or a cluster of `zdb`. Right now only `dir`, `zdb` and `s3` stores are supported but this will change in the future to support even more stores.

## Usage

### Creating an `fl`

```bash
rfs pack -m output.fl -s <store-specs> <directory>
```

This tells rfs to create an `fl` named `output.fl` using the store defined by the url `<store-specs>` and upload all the files under directory recursively.

The simplest form of `<store-specs>` is a `url`. the store `url` defines the store to use. Any `url`` has a schema that defines the store type. Right now we have support only for:

- `dir`: dir is a very simple store that is mostly used for testing. A dir store will store the fs blobs in another location defined by the url path. An example of a valid dir url is `dir:///tmp/store`
- `zdb`: [zdb](https://github.com/threefoldtech/0-db) is a append-only key value store and provides a redis like API. An example zdb url can be something like `zdb://<hostname>[:port][/namespace]`
- `s3`: aws-s3 is used for storing and retrieving large amounts of data (blobs) in buckets (directories). An example `s3://<username>:<password>@<host>:<port>/<bucket-name>`

`region` is an optional param for s3 stores, if you want to provide one you can add it as a query to the url `?region=<region-name>`

`<store-specs>` can also be of the form `<start>-<end>=<url>` where `start` and `end` are a hex bytes for partitioning of blob keys. rfs will then store a set of blobs on the defined store if they blob key falls in the `[start:end]` range (inclusive).

If the `start-end` range is not provided a `00-FF` range is assume basically a catch all range for the blob keys. In other words, all blobs will be written to that store.

This is only useful because `rfs` can accept multiple stores on the command line with different and/or overlapping ranges.

For example `-s 00-80=dir:///tmp/store0 -s 81-ff=dir://tmp/store1` means all keys that has prefix byte in range `[00-80]` will be written to /tmp/store0 all other keys `00-ff` will be written to store1.

The same range can appear multiple times, which means the blob will be replicated to all the stores that matches its key prefix.

To quickly test this operation

```bash
rfs pack -m output.fl -s 00-80=dir:///tmp/store0 -s 81-ff=dir:///tmp/store1 ~/Documents
```

this command will effectively create the `output.fl` and store (and shard) the blobs across the 2 locations /tmp/store0 and /tmp/store1.

```bash
#rfs pack --help

create an FL and upload blocks to provided storage

Usage: rfs pack [OPTIONS] --meta <META> <TARGET>

Arguments:
<TARGET> target directory to upload

Options:
-m, --meta <META> path to metadata file (flist)
-s, --store <STORE> store url in the format [xx-xx=]<url>. the range xx-xx is optional and used for sharding. the URL is per store type, please check docs for more information
--no-strip-password disables automatic password stripping from store url, otherwise password will be stored in the fl.
-h, --help Print help
```

#### Password stripping

During creation of an flist you will probably provide a password in the URL of the store. This is normally needed to allow write operation to the store (say s3 bucket)
Normally this password is removed from the store info so it's safe to ship the fl to users. A user of the flist then will only have read access, if configured correctly
in the store

For example a `zdb` store has the notion of a public namespace which is password protected for writes, but open for reads. An S3 bucket can have the policy to allow public reads, but protected writes (minio supports that via bucket settings)

If you wanna disable the password stripping from the store url, you can provide the `--no-strip-password` flag during creation. This also means someone can extract
this information from the fl and gain write access to your store, so be careful how u use it.

# Mounting an `fl`

Once the `fl` is created it can be distributes to other people. Then they can mount the `fl` which will allow them then to traverse the packed filesystem and also access (read-only) the files.

To mount an `fl` only the `fl` is needed since all information regarding the `stores` is already stored in the `fl`. This also means you can only share the `fl` if the other user can actually reach the store used to crate the `fl`. So a `dir` store is not sharable, also a `zdb` instance that is running on localhost :no_good:

```bash
sudo rfs mount -m output.fl <target>
```

The `<target>` is the mount location, usually `/mnt` but can be anywhere. In another terminal you can now `cd <target>` and walk the filesystem tree. Opening the files will trigger a file download from the store only on read access.

full command help

```bash
# rfs mount --help

mount an FL

Usage: rfs mount [OPTIONS] --meta <META> <TARGET>

Arguments:
<TARGET> target mountpoint

Options:
-m, --meta <META> path to metadata file (flist)
-c, --cache <CACHE> directory used as cache for downloaded file chuncks [default: /tmp/cache]
-d, --daemon run in the background
-l, --log <LOG> log file only used with daemon mode
-h, --help Print help
```

# Unpack an `fl`

Similar to `mount` rfs provides an `unpack` subcommand that downloads the entire content (extract) of an `fl` to a provided directory.

```bash
rfs unpack --help
unpack (downloads) content of an FL the provided location

Usage: rfs unpack [OPTIONS] --meta <META> <TARGET>

Arguments:
<TARGET> target directory to upload

Options:
-m, --meta <META> path to metadata file (flist)
-c, --cache <CACHE> directory used as cache for downloaded file chuncks [default: /tmp/cache]
-p, --preserve-ownership preserve files ownership from the FL, otherwise use the current user ownership setting this flag to true normally requires sudo
-h, --help Print help
```

By default when unpacking the `-p` flag is not set. which means downloaded files will be `owned` by the current user/group. If `-p` flag is set, the files ownership will be same as the original files used to create the fl (preserve `uid` and `gid` of the files and directories) this normally requires `sudo` while unpacking.

# Specifications
## Binaries and libraries

Please check [docs](docs)
- [rfs](./rfs/README.md)
1 change: 1 addition & 0 deletions docs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ The type of the `inode` is defined by its `mode` which is a `1:1` mapping from t
> from the [inode manual](https://man7.org/linux/man-pages/man7/inode.7.html)
```
POSIX refers to the stat.st_mode bits corresponding to the mask
S_IFMT (see below) as the file type, the 12 bits corresponding to
the mask 07777 as the file mode bits and the least significant 9
Expand Down
Loading

0 comments on commit e61738a

Please sign in to comment.