Parquimetro is a small (10 MB), simple tool for interacting with Parquet files, built around parquet-go.
To check parquet schemas:
parquimetro schema ~/path/to/file.parquet
Options available:
- Format: -f or --format, output format, json or go (default json).
- Tags: --tags, show Go struct tags (only available if format is go).
- Threads: -t or --threads, number of threads to use (default 1).
The schema command can easily be combined with jq:
parquimetro schema ~/path/to/file.parquet | jq .
To read parquet files:
parquimetro read ~/path/to/file.parquet
Options available:
- Count: -c or --count, number of rows to show (default 25).
- Skip: -s or --skip, number of rows to skip (from the beginning).
- Threads: -t or --threads, number of threads to use (default 1).
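As a rough illustration of how skip and count combine, the sketch below selects a window of rows from a slice. The function name window is hypothetical, and this is only an assumed reading of the two options (skip first, then take up to count rows), not the tool's actual implementation.

```go
package main

import "fmt"

// window returns at most count rows after skipping the first skip rows.
// Hypothetical helper: it mirrors the documented meaning of --skip and
// --count, not parquimetro's real code.
func window[T any](rows []T, skip, count int) []T {
	if skip < 0 {
		skip = 0
	}
	if count <= 0 || skip >= len(rows) {
		return nil
	}
	end := skip + count
	if end > len(rows) {
		end = len(rows) // fewer rows left than requested
	}
	return rows[skip:end]
}

func main() {
	rows := []string{"r0", "r1", "r2", "r3", "r4"}
	fmt.Println(window(rows, 1, 2))  // prints [r1 r2]
	fmt.Println(window(rows, 4, 25)) // prints [r4]
}
```

Under this reading, --skip 1 --count 2 would show the second and third rows of the file.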
Just like schema, the read command can easily be combined with jq:
parquimetro read ~/path/to/file.parquet | jq .
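If you would rather consume the output in Go than in jq, something like the sketch below may work. It assumes only that the output is valid JSON, which the jq example suggests; the exact shape (one array, or one object per row) is not documented here, so the decoder simply collects every top-level JSON value. The names decodeValues and decodeString and the sample input are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// decodeValues reads every top-level JSON value from r until EOF.
// It handles both a single JSON array and a stream of JSON objects.
func decodeValues(r io.Reader) ([]any, error) {
	dec := json.NewDecoder(r)
	var values []any
	for {
		var v any
		err := dec.Decode(&v)
		if err == io.EOF {
			return values, nil
		}
		if err != nil {
			return nil, err
		}
		values = append(values, v)
	}
}

// decodeString is a small convenience wrapper for in-memory input.
func decodeString(s string) ([]any, error) {
	return decodeValues(strings.NewReader(s))
}

func main() {
	// Hypothetical sample; real output depends on the parquet file.
	sample := `{"id": 1, "name": "a"}
{"id": 2, "name": "b"}`
	values, err := decodeString(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(len(values), "top-level JSON values")
}
```

To read from a pipe instead of the sample string, pass os.Stdin to decodeValues and run the program on the right-hand side of the pipe.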
To get size information:
parquimetro size ~/path/to/file.parquet
Options available:
- Uncompressed: --uncompressed, show uncompressed size (default true).
- Compressed: --compressed, show compressed size (default false).
- Pretty: --pretty, show the size in the most suitable unit (default true).
- Format: --format or -f, unit used to print the output. Accepted values: KB, MB, GB, TB. Lower priority than pretty; set --pretty=false to use it.
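To make the difference between pretty and a fixed format concrete, here is a minimal Go sketch. The function names humanSize and inUnit are hypothetical, and the 1024-based units and two-decimal output are assumptions of this sketch, not documented behavior of the tool.

```go
package main

import "fmt"

// humanSize picks the largest unit (KB, MB, GB, TB) that keeps the value
// at or above 1, roughly what a "pretty" mode might do.
// Assumption: 1 KB = 1024 bytes.
func humanSize(bytes int64) string {
	units := []string{"KB", "MB", "GB", "TB"}
	if bytes < 1024 {
		return fmt.Sprintf("%d B", bytes)
	}
	value := float64(bytes) / 1024
	unit := 0
	for value >= 1024 && unit < len(units)-1 {
		value /= 1024
		unit++
	}
	return fmt.Sprintf("%.2f %s", value, units[unit])
}

// inUnit formats the byte count in one fixed unit, as a fixed
// --format value might.
func inUnit(bytes int64, unit string) (string, error) {
	factors := map[string]float64{
		"KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40,
	}
	f, ok := factors[unit]
	if !ok {
		return "", fmt.Errorf("unknown unit %q", unit)
	}
	return fmt.Sprintf("%.2f %s", float64(bytes)/f, unit), nil
}

func main() {
	fmt.Println(humanSize(1536))             // prints 1.50 KB
	fmt.Println(humanSize(10 * 1024 * 1024)) // prints 10.00 MB
	s, _ := inUnit(10*1024*1024, "GB")
	fmt.Println(s) // prints 0.01 GB
}
```

The last line shows why a fixed unit can be less readable than the pretty output for small files.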
If you have Go installed:
go install github.com/otaviohenrique/parquimetro@latest
Alternatively, download a release from our releases page and install it.