AgriConnect/influxdb-csv-cleaner


influxdb-csv-cleaner

Tool to clean CSV content exported from InfluxDB.

What it does

  • Remove repeated header lines that appear in the middle of the file.
  • Remove the first column, which is just the table (measurement) name.
  • Convert the timestamp in the "time" column to a readable format, like "2018-01-30 00:00:00".
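
The three steps above can be sketched in a few lines of Python. This is only an illustration of the behaviour described in this README, not the tool's actual Rust implementation; the function name clean_rows is hypothetical, and the sketch assumes timestamps are whole seconds (as produced by "-precision s") and uses a fixed UTC+7 offset in place of a named time zone.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def clean_rows(rows, tz):
    """Yield cleaned rows (hypothetical helper, not the tool's real code)."""
    header = None
    time_col = None
    for row in rows:
        if not row:
            continue  # ignore blank lines
        if header is None:
            header = row
            time_col = row.index("time")
            yield row[1:]  # keep the first header, minus the "name" column
        elif row == header:
            continue  # repeated header line in the middle of the export
        else:
            # Assumes the "time" value is an epoch timestamp in seconds.
            stamp = datetime.fromtimestamp(int(row[time_col]), tz)
            row = list(row)
            row[time_col] = stamp.strftime("%Y-%m-%d %H:%M:%S")
            yield row[1:]  # drop the measurement name


raw = """name,time,temperature
condition,1489544029,29.1
name,time,temperature
condition,1489544039,29.2
"""

# Fixed UTC+7 offset, standing in for a named zone such as Asia/Ho_Chi_Minh.
utc_plus_7 = timezone(timedelta(hours=7))
cleaned = list(clean_rows(csv.reader(io.StringIO(raw)), utc_plus_7))
for row in cleaned:
    print(",".join(row))
```

Where the tz database is available, zoneinfo.ZoneInfo("Asia/Ho_Chi_Minh") can be passed as tz instead of the fixed offset.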

Build

  1. For BeagleBone

    • Install the gcc-6-arm-linux-gnueabihf and binutils-arm-linux-gnueabihf packages.
    • Install the armv7-unknown-linux-gnueabihf target for Rust:
    rustup target install armv7-unknown-linux-gnueabihf
    • If you are cross-compiling from a PC, add this content to the .cargo/config file:
    [target.armv7-unknown-linux-gnueabihf]
    linker = "arm-linux-gnueabihf-gcc"
    • Build with this command:
    cargo build --target=armv7-unknown-linux-gnueabihf --release
    • Strip the compiled file to reduce file size:
    /usr/arm-linux-gnueabihf/bin/strip target/armv7-unknown-linux-gnueabihf/release/influxdb-csv-cleaner
  2. For PC

No extra setup is needed: build natively with cargo, for example with cargo build --release.

Example

You export data from InfluxDB with this command:

influx -database myfarm -precision s -format csv -execute "SELECT temperature FROM condition LIMIT 100"

And save to a file sample.csv:

    name,time,temperature
    condition,1489544029,29.1
    condition,1489544039,29.2

Now, to remove the first column and convert the timestamps to the Asia/Ho_Chi_Minh time zone, run:

influxdb-csv-cleaner sample.csv -t Asia/Ho_Chi_Minh -o clean.csv

The output will be:

    time,temperature
    2017-03-15 09:13:49,29.1
    2017-03-15 09:13:59,29.2

The timestamp column is always converted to a readable format. This also helps avoid confusion when you use the CSV with a graphing tool.
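
As a worked check of the example above, the following Python sketch converts the first sample timestamp by hand. It assumes the value is in epoch seconds and uses a fixed UTC+7 offset to stand in for Asia/Ho_Chi_Minh; it is not taken from the tool's source.

```python
from datetime import datetime, timedelta, timezone

epoch_seconds = 1489544029  # first "time" value in sample.csv
fmt = "%Y-%m-%d %H:%M:%S"

# The raw value counts seconds since 1970-01-01 00:00:00 UTC.
utc_time = datetime.fromtimestamp(epoch_seconds, timezone.utc)

# Shift to UTC+7 (assumed fixed offset for Asia/Ho_Chi_Minh).
local_time = utc_time.astimezone(timezone(timedelta(hours=7)))

utc_text = utc_time.strftime(fmt)      # 2017-03-15 02:13:49
local_text = local_time.strftime(fmt)  # 2017-03-15 09:13:49
print(utc_text, "UTC ->", local_text, "local")
```

The same instant reads seven hours apart depending on the zone. Writing the local time into the CSV explicitly saves a graphing tool from having to guess.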

Note: The header line can appear many times in an InfluxDB export file, because the influx client makes chunked queries to handle large result sets. influxdb-csv-cleaner skips all of them except the first line.

You can also use the tool in a pipeline, passing "-" as the input file to clean data from the stdin stream:

influx -database myfarm -precision s -format csv -execute "SELECT temperature FROM condition LIMIT 100" | influxdb-csv-cleaner - -t Asia/Ho_Chi_Minh
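
The "-" argument follows the common command-line convention of treating a dash as "read from standard input". A minimal Python sketch of that convention is shown here; the helper name open_input is hypothetical, and how influxdb-csv-cleaner handles this internally is not described in this README.

```python
import sys


def open_input(path):
    """Return a text stream for path, treating "-" as standard input.

    Hypothetical helper illustrating the convention, not the tool's code.
    """
    if path == "-":
        return sys.stdin
    # newline="" leaves line endings untouched, as the csv module expects.
    return open(path, newline="", encoding="utf-8")


stream = open_input("-")
print(stream is sys.stdin)
```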

Run

influxdb-csv-cleaner -h

to see other usage options.