Skip to content

Latest commit

 

History

History

parquet-tools

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

parquet-tools

parquet-tools is a command line tool that aid in the inspection of parquet files. More functions will be added.

Build

cd parquet-tools && go build parquet-tools

Description

-cmd

schema/size/rowcount

-file

parquet file name;

-tag

print the go struct tags; default is false;

-cat

cat records of parquet file.

Example

Output Schema

bash$ ./parquet-tools -cmd schema -file a.parquet -tag false
bash$
----- Go struct -----
parquet_go_root struct{
  name string
  age int32
  id int64
  weight float32
  sex bool
  day int32
}
----- Json schema -----
{
  "Tag": "name=parquet_go_root, repetitiontype=REQUIRED",
  "Fields": [
    {
      "Tag": "name=name, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=REQUIRED",
      "Fields": null
    },
    {
      "Tag": "name=age, type=INT32, repetitiontype=REQUIRED",
      "Fields": null
    },
    {
      "Tag": "name=id, type=INT64, repetitiontype=REQUIRED",
      "Fields": null
    },
    {
      "Tag": "name=weight, type=FLOAT, repetitiontype=REQUIRED",
      "Fields": null
    },
    {
      "Tag": "name=sex, type=BOOLEAN, repetitiontype=REQUIRED",
      "Fields": null
    },
    {
      "Tag": "name=day, type=DATE, repetitiontype=REQUIRED",
      "Fields": null
    }
  ]
}

Show records

#show first 2 records of a.parquet
./parquet-tools -cmd cat -count 2 -file a.parquet