Filet (Filecoin Extract Transform) makes it simple to get CSV data from Filecoin Archival Snapshots using Lily and lily-archiver.
The filet image is available on Google Artifact Registry. Alternatively, you can build it locally with make build.
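If you prefer the prebuilt image, a typical pull looks like this (assuming gcloud is installed and you have access to the registry):
# Let Docker authenticate against the Artifact Registry host
gcloud auth configure-docker europe-west1-docker.pkg.dev
# Pull the prebuilt filet image
docker pull europe-west1-docker.pkg.dev/protocol-labs-data/pl-data/filet:latest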
The following command will generate CSVs from a Filecoin Archival Snapshot:
docker run -it \
-v $PWD:/tmp/data \
europe-west1-docker.pkg.dev/protocol-labs-data/pl-data/filet:latest -- \
/lily/export.sh archival_snapshot.car.zst .
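A minimal end-to-end sketch, assuming you have read access to the public bucket and that export.sh writes its output under /tmp/data (the container side of the -v mount), so the CSVs land in your working directory:
# Copy a snapshot down from the archival bucket (the name here is a placeholder)
gsutil cp gs://fil-mainnet-archival-snapshots/historical-exports/archival_snapshot.car.zst .
# After the docker run above completes, the CSVs should be in $PWD
ls *.csv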
You can use the send_export_jobs.sh script to schedule jobs on Google Cloud Batch. The script takes a file with a list of snapshots as input.
./scripts/send_export_jobs.sh SNAPSHOT_LIST_FILE [--dry-run]
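For example, to preview the jobs for the full snapshot list (generated below) without submitting them:
# Dry run: print the Batch jobs that would be created
./scripts/send_export_jobs.sh all_snapshots.txt --dry-run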
For more details on the scheduled job configuration, you can check the gce_batch_job.json file.
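As a rough orientation, a Google Cloud Batch job config follows the shape below; the field names come from the Batch API (and match the jq path used later to recover failed jobs), but the actual contents of gce_batch_job.json may differ:
{
  "taskGroups": [{
    "taskSpec": {
      "runnables": [{
        "container": {
          "imageUri": "europe-west1-docker.pkg.dev/protocol-labs-data/pl-data/filet:latest",
          "commands": ["/lily/export.sh", "SNAPSHOT.car.zst", "."]
        }
      }]
    }
  }]
}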
SNAPSHOT_LIST_FILE should contain a list of snapshots, one per line. The snapshots should be available in the fil-mainnet-archival-snapshots Google Cloud Storage bucket. You can generate the full list with:
gsutil ls gs://fil-mainnet-archival-snapshots/historical-exports/ | sort --version-sort > all_snapshots.txt
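A quick sanity check on the generated list:
# Count the snapshots found in the bucket
wc -l all_snapshots.txt
# Peek at the first few entries
head -n 3 all_snapshots.txt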
To get the batches, you can filter the list by snapshot height; for example, to select snapshots with heights from 2226480 to 2232002 (assuming the start height is the second underscore-separated field of each snapshot path):
awk -F'_' '$2 >= 2226480 && $2 <= 2232002' all_snapshots.txt
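Alternatively, a simple sketch for fixed-size batches (the size of 50 is arbitrary here) is to split the list and submit each chunk:
# Split the snapshot list into chunks of 50: batch_aa, batch_ab, ...
split -l 50 all_snapshots.txt batch_
# Submit one chunk as a batch of jobs
./scripts/send_export_jobs.sh batch_aa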
If you need to retry a batch of failed jobs, you can use the following commands:
# Get the list of failed jobs
gcloud alpha batch jobs list --format=json --filter="Status.state:FAILED" > failed_jobs.json
# Get the snapshot name from failed jobs
cat failed_jobs.json | jq ".[].taskGroups[0].taskSpec.runnables[0].container.commands[0]" -r | cut -d '/' -f 2 | sort > failed_jobs.list
# Retry the failed jobs
./scripts/send_export_jobs.sh failed_jobs.list
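Before resubmitting, you may want to inspect an individual failed job (JOB_NAME is a placeholder, and europe-west1 is an assumed region):
# Show a single job's status and events
gcloud alpha batch jobs describe JOB_NAME --location=europe-west1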