#50
shenwei356 committed Sep 13, 2018
1 parent f29a724 commit 56be7c9
Showing 7 changed files with 47 additions and 22 deletions.
5 changes: 3 additions & 2 deletions HISTORY.md
@@ -1,8 +1,9 @@

- [csvtk v0.15.0](https://github.com/shenwei356/csvtk/releases/tag/v0.15.0)
[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/csvtk/v0.15.0/total.svg)](https://github.com/shenwei356/csvtk/releases/tag/v0.15.0)
-- `csvtk mutate2`, add flag `-s/--digits-as-string` for not converting big digits into scientific notation. [#46](https://github.com/shenwei356/csvtk/issues/46)
-- `csvtk sort`, add support for sorting in natural order. [#49](https://github.com/shenwei356/csvtk/issues/49)
+- `csvtk`: add global flag `-E/--ignore-empty-row` to skip empty row. [#50](https://github.com/shenwei356/csvtk/issues/50)
+- `csvtk mutate2`: add flag `-s/--digits-as-string` for not converting big digits into scientific notation. [#46](https://github.com/shenwei356/csvtk/issues/46)
+- `csvtk sort`: add support for sorting in natural order. [#49](https://github.com/shenwei356/csvtk/issues/49)
- [csvtk v0.14.0](https://github.com/shenwei356/csvtk/releases/tag/v0.14.0)
[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/csvtk/v0.14.0/total.svg)](https://github.com/shenwei356/csvtk/releases/tag/v0.14.0)
- `csvtk`: **supporting multi-line fields by replacing [multicorecsv](https://github.com/mzimmerman/multicorecsv ) with standard library [encoding/csv](https://golang.org/pkg/encoding/csv/),
10 changes: 5 additions & 5 deletions README.md
@@ -93,8 +93,8 @@ It could save you much time of writing Python/R scripts.
- `uniq` unique data without sorting
- `freq` frequencies of selected fields
- `inter` intersection of multiple files
-- `filter` filter rows by values of selected fields with artithmetic expression
-- `filter2` filter rows by awk-like artithmetic/string expressions
+- `filter` filter rows by values of selected fields with arithmetic expression
+- `filter2` filter rows by awk-like arithmetic/string expressions
- `join` join multiple CSV files by selected fields
- `split` split CSV/TSV into multiple files according to column values
- `splitxlsx` split XLSX sheet into multiple sheets according to column values
@@ -106,7 +106,7 @@ It could save you much time of writing Python/R scripts.
- `rename2` rename column names by regular expression
- `replace` replace data of selected fields by regular expression
- `mutate` create new columns from selected fields by regular expression
-- `mutate2` create new column from selected fields by awk-like artithmetic/string expressions
+- `mutate2` create new column from selected fields by awk-like arithmetic/string expressions
- `gather` gather columns into key-value pairs

**Ordering**
@@ -308,11 +308,11 @@ Examples
- Using `--any` to print record if any of the field satisfy the condition: `csvtk filter -f "1-3>0" --any`
- **fuzzy fields**: `csvtk filter -F -f "A*!=0"`

-1. **Filter rows by awk-like artithmetic/string expressions** (`filter2`)
+1. **Filter rows by awk-like arithmetic/string expressions** (`filter2`)

- Using field index: `csvtk filter2 -f '$3>0'`
- Using column names: `csvtk filter2 -f '$id > 0'`
-- Both artithmetic and string expressions: `csvtk filter2 -f '$id > 3 || $username=="ken"'`
+- Both arithmetic and string expressions: `csvtk filter2 -f '$id > 3 || $username=="ken"'`
- More complicated: `csvtk filter2 -H -t -f '$1 > 2 && $2 % 2 == 0'`


16 changes: 16 additions & 0 deletions csvtk/cmd/csv.go
@@ -44,6 +44,8 @@ type CSVReader struct {
Ch chan CSVRecordsChunk
MetaLine []byte // meta line of separator declaration used by MS Excel

+IgnoreEmptyRow bool
+
fh *xopen.Reader
}

@@ -101,6 +103,8 @@ func (csvReader *CSVReader) Run() {
chunkData := make([][]string, csvReader.chunkSize)
var id uint64
var i int
+var notBlank bool
+var data string
for {
record, err := csvReader.Reader.Read()
if err == io.EOF {
@@ -115,6 +119,18 @@ func (csvReader *CSVReader) Run() {
if record == nil {
continue
}
+if csvReader.IgnoreEmptyRow {
+	notBlank = false
+	for _, data = range record {
+		if len(data) > 0 {
+			notBlank = true
+			break
+		}
+	}
+	if !notBlank {
+		continue
+	}
+}
chunkData[i] = record
i++
if i == csvReader.chunkSize {
4 changes: 2 additions & 2 deletions csvtk/cmd/filter.go
@@ -34,8 +34,8 @@ import (
// filterCmd represents the filter command
var filterCmd = &cobra.Command{
Use: "filter",
-Short: "filter rows by values of selected fields with artithmetic expression",
-Long: `filter rows by values of selected fields with artithmetic expression
+Short: "filter rows by values of selected fields with arithmetic expression",
+Long: `filter rows by values of selected fields with arithmetic expression
`,
Run: func(cmd *cobra.Command, args []string) {
7 changes: 6 additions & 1 deletion csvtk/cmd/helper.go
@@ -37,7 +37,7 @@ import (
)

// VERSION of csvtk
-const VERSION = "0.15.0-dev2"
+const VERSION = "0.15.0-dev3"

func checkError(err error) {
if err != nil {
@@ -191,6 +191,8 @@ type Config struct {
NoHeaderRow bool

OutFile string
+
+IgnoreEmptyRow bool
}

func isTrue(s string) bool {
@@ -233,6 +235,8 @@ func getConfigs(cmd *cobra.Command) Config {
NoHeaderRow: noHeaderRow,

OutFile: getFlagString(cmd, "out-file"),
+
+IgnoreEmptyRow: getFlagBool(cmd, "ignore-empty-row"),
}
}

@@ -248,6 +252,7 @@ func newCSVReaderByConfig(config Config, file string) (*CSVReader, error) {
}
reader.Reader.Comment = config.CommentChar
reader.Reader.LazyQuotes = config.LazyQuotes
+reader.IgnoreEmptyRow = config.IgnoreEmptyRow

return reader, nil
}
2 changes: 2 additions & 0 deletions csvtk/cmd/root.go
@@ -84,4 +84,6 @@ func init() {
RootCmd.PersistentFlags().BoolP("out-tabs", "T", false, `specifies that the output is delimited with tabs. Overrides "-D"`)
RootCmd.PersistentFlags().BoolP("no-header-row", "H", false, `specifies that the input CSV file does not have header row`)
RootCmd.PersistentFlags().StringP("out-file", "o", "-", `out file ("-" for stdout, suffix .gz for gzipped out)`)
+
+RootCmd.PersistentFlags().BoolP("ignore-empty-row", "E", false, `ignore empty row`)
}
25 changes: 13 additions & 12 deletions doc/docs/usage.md
@@ -82,7 +82,7 @@ Usage
```
A cross-platform, efficient and practical CSV/TSV toolkit
-Version: 0.15.0-dev2
+Version: 0.15.0-dev3
Author: Wei Shen <shenwei356@gmail.com>
@@ -114,8 +114,8 @@ Available Commands:
csv2md convert CSV to markdown format
csv2tab convert CSV to tabular format
cut select parts of fields
-filter filter rows by values of selected fields with artithmetic expression
-filter2 filter rows by awk-like artithmetic/string expressions
+filter filter rows by values of selected fields with arithmetic expression
+filter2 filter rows by awk-like arithmetic/string expressions
freq frequencies of selected fields
gather gather columns into key-value pairs
genautocomplete generate shell autocompletion script
@@ -126,7 +126,7 @@ Available Commands:
inter intersection of multiple files
join join multiple CSV files by selected fields
mutate create new column from selected fields by regular expression
-mutate2 create new column from selected fields by awk-like artithmetic/string expressions
+mutate2 create new column from selected fields by awk-like arithmetic/string expressions
plot plot common figures
pretty convert CSV to readable aligned table
rename rename column names
@@ -150,6 +150,7 @@ Flags:
-C, --comment-char string lines starting with commment-character will be ignored. if your header row starts with '#', please assign "-C" another rare symbol, e.g. '$' (default "#")
-d, --delimiter string delimiting character of the input CSV file (default ",")
-h, --help help for csvtk
+-E, --ignore-empty-row ignore empty row
-l, --lazy-quotes if given, a quote may appear in an unquoted field and a non-doubled quote may appear in a quoted field
-H, --no-header-row specifies that the input CSV file does not have header row
-j, --num-cpus int number of CPUs to use (default value depends on your computer) (default 4)
@@ -952,7 +953,7 @@ Matched parts will be *highlight*
Usage

```
-filter rows by values of selected fields with artithmetic expression
+filter rows by values of selected fields with arithmetic expression
Usage:
csvtk filter [flags]
@@ -1020,9 +1021,9 @@ Examples
Usage

```
-filter rows by awk-like artithmetic/string expressions
+filter rows by awk-like arithmetic/string expressions
-The artithmetic/string expression is supported by:
+The arithmetic/string expression is supported by:
https://github.com/Knetic/govaluate
@@ -1069,15 +1070,15 @@ Examples:
11,Rob,Pike,rob
4,Robert,Griesemer,gri

-1. Artithmetic and string expressions
+1. arithmetic and string expressions

$ cat testdata/names.csv | csvtk filter2 -f '$id > 3 || $username=="ken"'
id,first_name,last_name,username
11,Rob,Pike,rob
2,Ken,Thompson,ken
4,Robert,Griesemer,gri

-1. More artithmetic expressions
+1. More arithmetic expressions

$ cat testdata/digitals.tsv
4 5 6
@@ -1684,9 +1685,9 @@ Examples
Usage

```
-create new column from selected fields by awk-like artithmetic/string expressions
+create new column from selected fields by awk-like arithmetic/string expressions
-The artithmetic/string expression is supported by:
+The arithmetic/string expression is supported by:
https://github.com/Knetic/govaluate
@@ -1711,7 +1712,7 @@ Usage:
Flags:
-L, --digits int number of digits after the dot (default 2)
-s, --digits-as-string treate digits as string to avoid converting big digits into scientific notation
--e, --expression string artithmetic/string expressions. e.g. "'string'", '"abc"', ' $a + "-" + $b ', '$1 + $2', '$a / $b', ' $1 > 100 ? "big" : "small" '
+-e, --expression string arithmetic/string expressions. e.g. "'string'", '"abc"', ' $a + "-" + $b ', '$1 + $2', '$a / $b', ' $1 > 100 ? "big" : "small" '
-h, --help help for mutate2
-n, --name string new column name
