LogQL: Simple JSON expressions (#3280)
* New approach, still rough
* Adding benchmark
* Adding tests
* Minor refactoring
* Appeasing the linter
* Further appeasing the linter
* Adding more tests
* Adding documentation
* Docs fixup
* Removing unnecessary condition
* Adding extra tests from suggestion in review
* Adding JSONParseErr
* Adding test to cover invalid JSON line
* Adding equivalent benchmarks for JSON and JSONExpression parsing
* Adding suffix if label would be overridden
* Reparenting jsonexpr directory to more appropriate location
* Setting empty label on non-matching expression, to retain parity with label_format
* Adding statement about returned complex JSON types
* Added check for valid label name
* Making json expressions shardable

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Danny Kopping authored Feb 9, 2021
1 parent 6c8fdd6 commit feb7fb4
Showing 16 changed files with 1,937 additions and 421 deletions.
128 changes: 94 additions & 34 deletions docs/sources/logql/_index.md
@@ -145,40 +145,100 @@ If an extracted label key name already exists in the original log stream, the ex

We currently support json, logfmt and regexp parsers.

The **json** parsers take no parameters and can be added using the expression `| json` in your pipeline. It will extract all json properties as labels if the log line is a valid json document. Nested properties are flattened into label keys using the `_` separator. **Arrays are skipped**.

For example the json parsers will extract from the following document:

```json
{
    "protocol": "HTTP/2.0",
    "servers": ["129.0.1.1","10.2.1.3"],
    "request": {
        "time": "6.032",
        "method": "GET",
        "host": "foo.grafana.net",
        "size": "55"
    },
    "response": {
        "status": 401,
        "size": "228",
        "latency_seconds": "6.031"
    }
}
```

The following list of labels:

```kv
"protocol" => "HTTP/2.0"
"request_time" => "6.032"
"request_method" => "GET"
"request_host" => "foo.grafana.net"
"request_size" => "55"
"response_status" => "401"
"response_size" => "228"
"response_latency_seconds" => "6.031"
```
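
The flattening behaviour described above (nested keys joined with `_`, arrays skipped) can be sketched in Go. This is a minimal illustration built on the standard `encoding/json` package, not the parser shipped in this commit; the function names `flattenJSON` and `flatten` are hypothetical, and the sketch does not sanitize keys into valid label names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// flattenJSON decodes a log line and returns one label per scalar property.
// Hypothetical helper for illustration; not the function used by Loki.
func flattenJSON(line []byte) (map[string]string, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(line, &doc); err != nil {
		return nil, err // the line is not a valid JSON object
	}
	labels := make(map[string]string)
	flatten("", doc, labels)
	return labels, nil
}

// flatten walks obj recursively, joining nested keys with "_".
func flatten(prefix string, obj map[string]interface{}, labels map[string]string) {
	for key, value := range obj {
		name := key
		if prefix != "" {
			name = prefix + "_" + key
		}
		switch v := value.(type) {
		case map[string]interface{}:
			flatten(name, v, labels)
		case []interface{}:
			// arrays are skipped
		case string:
			labels[name] = v
		case float64:
			labels[name] = strconv.FormatFloat(v, 'f', -1, 64)
		case bool:
			labels[name] = strconv.FormatBool(v)
		}
		// JSON null matches no case and is ignored in this sketch
	}
}

func main() {
	line := []byte(`{"protocol":"HTTP/2.0","servers":["129.0.1.1","10.2.1.3"],"response":{"status":401,"size":"228"}}`)
	labels, err := flattenJSON(line)
	if err != nil {
		fmt.Println("not valid JSON:", err)
		return
	}
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random, so sort for stable output
	for _, name := range names {
		fmt.Printf("%q => %q\n", name, labels[name])
	}
}
```

Numbers are rendered as strings because label values are always strings, which is why `"status": 401` becomes `"401"` in the list above.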
The **json** parser operates in two modes:

1. **without** parameters:

Adding `| json` to your pipeline will extract all json properties as labels if the log line is a valid json document.
Nested properties are flattened into label keys using the `_` separator.

Note: **Arrays are skipped**.

For example the json parsers will extract from the following document:

```json
{
    "protocol": "HTTP/2.0",
    "servers": ["129.0.1.1","10.2.1.3"],
    "request": {
        "time": "6.032",
        "method": "GET",
        "host": "foo.grafana.net",
        "size": "55",
        "headers": {
            "Accept": "*/*",
            "User-Agent": "curl/7.68.0"
        }
    },
    "response": {
        "status": 401,
        "size": "228",
        "latency_seconds": "6.031"
    }
}
```

The following list of labels:

```kv
"protocol" => "HTTP/2.0"
"request_time" => "6.032"
"request_method" => "GET"
"request_host" => "foo.grafana.net"
"request_size" => "55"
"response_status" => "401"
"response_size" => "228"
"response_latency_seconds" => "6.031"
```

2. **with** parameters:

Using `| json label="expression", another="expression"` in your pipeline will extract only the
specified json fields to labels. You can specify one or more expressions in this way, the same
as [`label_format`](#labels-format-expression); all expressions must be quoted.

Currently, we only support field access (`my.field`, `my["field"]`) and array access (`list[0]`), and any combination
of these in any level of nesting (`my.list[0]["field"]`).

For example, `| json first_server="servers[0]", ua="request.headers[\"User-Agent\"]"` will extract from the following document:

```json
{
    "protocol": "HTTP/2.0",
    "servers": ["129.0.1.1","10.2.1.3"],
    "request": {
        "time": "6.032",
        "method": "GET",
        "host": "foo.grafana.net",
        "size": "55",
        "headers": {
            "Accept": "*/*",
            "User-Agent": "curl/7.68.0"
        }
    },
    "response": {
        "status": 401,
        "size": "228",
        "latency_seconds": "6.031"
    }
}
```

The following list of labels:

```kv
"first_server" => "129.0.1.1"
"ua" => "curl/7.68.0"
```

If an expression returns an array or an object, the value is assigned to the label in JSON format.

For example, `| json server_list="servers", headers="request.headers"` will extract:

```kv
"server_list" => `["129.0.1.1","10.2.1.3"]`
"headers" => `{"Accept": "*/*", "User-Agent": "curl/7.68.0"}`
```
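
The expression forms described above (field access, quoted field access, array access, and complex values returned as JSON) can be sketched with the Go standard library. This is an illustrative evaluator under stated assumptions, not the `jsonexpr` implementation added by this commit: `parsePath` and `evaluate` are hypothetical names, a quoted key containing `]` is not handled, and a non-matching expression yields an empty string, in line with the commit note about empty labels.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parsePath splits an expression such as `my.list[0]["field"]` into steps:
// a string step is a field name, an int step is an array index.
func parsePath(expr string) ([]interface{}, error) {
	var steps []interface{}
	i := 0
	for i < len(expr) {
		switch expr[i] {
		case '.':
			i++
		case '[':
			end := strings.IndexByte(expr[i:], ']')
			if end < 0 {
				return nil, fmt.Errorf("unterminated '[' in %q", expr)
			}
			inner := expr[i+1 : i+end]
			if key, err := strconv.Unquote(inner); err == nil {
				steps = append(steps, key) // ["field"]
			} else if idx, err := strconv.Atoi(inner); err == nil {
				steps = append(steps, idx) // [0]
			} else {
				return nil, fmt.Errorf("invalid index %q", inner)
			}
			i += end + 1
		default:
			end := strings.IndexAny(expr[i:], ".[")
			if end < 0 {
				end = len(expr) - i
			}
			steps = append(steps, expr[i:i+end])
			i += end
		}
	}
	return steps, nil
}

// evaluate applies expr to a JSON log line. Scalars are returned as text,
// arrays and objects are returned re-encoded as JSON.
func evaluate(line []byte, expr string) (string, error) {
	steps, err := parsePath(expr)
	if err != nil {
		return "", err
	}
	var current interface{}
	if err := json.Unmarshal(line, &current); err != nil {
		return "", err
	}
	for _, step := range steps {
		switch s := step.(type) {
		case string:
			obj, ok := current.(map[string]interface{})
			if !ok {
				return "", nil
			}
			if current, ok = obj[s]; !ok {
				return "", nil
			}
		case int:
			arr, ok := current.([]interface{})
			if !ok || s < 0 || s >= len(arr) {
				return "", nil
			}
			current = arr[s]
		}
	}
	switch v := current.(type) {
	case string:
		return v, nil
	case nil:
		return "", nil
	default:
		out, err := json.Marshal(v)
		return string(out), err
	}
}

func main() {
	line := []byte(`{"servers":["129.0.1.1","10.2.1.3"],"request":{"headers":{"Accept":"*/*","User-Agent":"curl/7.68.0"}}}`)
	for _, expr := range []string{`servers[0]`, `request.headers["User-Agent"]`, `servers`, `request.headers`} {
		value, err := evaluate(line, expr)
		fmt.Printf("%s => %s (err: %v)\n", expr, value, err)
	}
}
```

Because `json.Marshal` emits compact output, the JSON returned for objects may differ in whitespace from the example above.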

The **logfmt** parser can be added using `| logfmt` and will extract all keys and values from the [logfmt](https://brandur.org/logfmt) formatted log line.

33 changes: 33 additions & 0 deletions pkg/logql/ast.go
@@ -416,6 +416,39 @@ func (e *labelFmtExpr) String() string {
	return sb.String()
}

type jsonExpressionParser struct {
	expressions []log.JSONExpression

	implicit
}

func newJSONExpressionParser(expressions []log.JSONExpression) *jsonExpressionParser {
	return &jsonExpressionParser{
		expressions: expressions,
	}
}

func (j *jsonExpressionParser) Shardable() bool { return true }

func (j *jsonExpressionParser) Stage() (log.Stage, error) {
	return log.NewJSONExpressionParser(j.expressions)
}

func (j *jsonExpressionParser) String() string {
	var sb strings.Builder
	sb.WriteString(fmt.Sprintf("%s %s ", OpPipe, OpParserTypeJSON))
	for i, exp := range j.expressions {
		sb.WriteString(exp.Identifier)
		sb.WriteString("=")
		sb.WriteString(strconv.Quote(exp.Expression))

		if i+1 != len(j.expressions) {
			sb.WriteString(",")
		}
	}
	return sb.String()
}

func mustNewMatcher(t labels.MatchType, n, v string) *labels.Matcher {
	m, err := labels.NewMatcher(t, n, v)
	if err != nil {
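
To see what the `String()` method in the hunk above renders, here is a standalone sketch that mirrors its loop. The `jsonExpression` type and `render` function are hypothetical stand-ins for `log.JSONExpression` and the method itself, and the sketch assumes that `OpPipe` and `OpParserTypeJSON` render as `|` and `json`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// jsonExpression is a hypothetical stand-in for log.JSONExpression.
type jsonExpression struct {
	Identifier string
	Expression string
}

// render mirrors the loop in jsonExpressionParser.String().
// Assumption: OpPipe is "|" and OpParserTypeJSON is "json".
func render(expressions []jsonExpression) string {
	var sb strings.Builder
	sb.WriteString("| json ")
	for i, exp := range expressions {
		sb.WriteString(exp.Identifier)
		sb.WriteString("=")
		// strconv.Quote escapes embedded double quotes, so the rendered
		// query can be parsed again as a quoted string.
		sb.WriteString(strconv.Quote(exp.Expression))
		if i+1 != len(expressions) {
			sb.WriteString(",")
		}
	}
	return sb.String()
}

func main() {
	fmt.Println(render([]jsonExpression{
		{Identifier: "ua", Expression: `request.headers["User-Agent"]`},
		{Identifier: "first_server", Expression: "servers[0]"},
	}))
}
```

The use of `strconv.Quote` matters for expressions such as `request.headers["User-Agent"]`: the inner quotes are escaped, so the rendered string stays a single quoted token when it is parsed again.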
1 change: 1 addition & 0 deletions pkg/logql/ast_test.go
@@ -127,6 +127,7 @@ func Test_SampleExpr_String(t *testing.T) {
`,
`10 / (5/2)`,
`10 / (count_over_time({job="postgres"}[5m])/2)`,
`{app="foo"} | json response_status="response.status.code", first_param="request.params[0]"`,
} {
t.Run(tc, func(t *testing.T) {
expr, err := ParseExpr(tc)
18 changes: 18 additions & 0 deletions pkg/logql/expr.y
@@ -46,6 +46,9 @@ import (
LabelFormatExpr *labelFmtExpr
LabelFormat log.LabelFmt
LabelsFormat []log.LabelFmt
JSONExpressionParser *jsonExpressionParser
JSONExpression log.JSONExpression
JSONExpressionList []log.JSONExpression
UnwrapExpr *unwrapExpr
}

@@ -82,6 +85,9 @@ import (
%type <LabelFormatExpr> labelFormatExpr
%type <LabelFormat> labelFormat
%type <LabelsFormat> labelsFormat
%type <JSONExpressionParser> jsonExpressionParser
%type <JSONExpression> jsonExpression
%type <JSONExpressionList> jsonExpressionList
%type <UnwrapExpr> unwrapExpr
%type <UnitFilter> unitFilter

Expand Down Expand Up @@ -211,6 +217,7 @@ pipelineExpr:
pipelineStage:
lineFilters { $$ = $1 }
| PIPE labelParser { $$ = $2 }
| PIPE jsonExpressionParser { $$ = $2 }
| PIPE labelFilter { $$ = &labelFilterExpr{LabelFilterer: $2 }}
| PIPE lineFormatExpr { $$ = $2 }
| PIPE labelFormatExpr { $$ = $2 }
@@ -226,6 +233,9 @@ labelParser:
| REGEXP STRING { $$ = newLabelParserExpr(OpParserTypeRegexp, $2) }
;

jsonExpressionParser:
JSON jsonExpressionList { $$ = newJSONExpressionParser($2) }

lineFormatExpr: LINE_FMT STRING { $$ = newLineFmtExpr($2) };

labelFormat:
@@ -252,6 +262,14 @@ labelFilter:
| labelFilter OR labelFilter { $$ = log.NewOrLabelFilter($1, $3 ) }
;

jsonExpression:
IDENTIFIER EQ STRING { $$ = log.NewJSONExpr($1, $3) }

jsonExpressionList:
jsonExpression { $$ = []log.JSONExpression{$1} }
| jsonExpressionList COMMA jsonExpression { $$ = append($1, $3) }
;

unitFilter:
durationFilter { $$ = $1 }
| bytesFilter { $$ = $1 }
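
The `jsonExpression` and `jsonExpressionList` rules added above accept one or more `IDENTIFIER EQ STRING` pairs separated by `COMMA`. As a rough illustration of the input shape they describe, the following hand-written sketch accepts the same form using the standard `text/scanner` package. It is not the generated goyacc parser: `JSONExpression` and `parseJSONExpressionList` are hypothetical names, and Go's tokenization rules are only assumed to be close enough to the LogQL lexer for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/scanner"
)

// JSONExpression is a hypothetical stand-in for log.JSONExpression.
type JSONExpression struct {
	Identifier string
	Expression string
}

// parseJSONExpressionList accepts `ident="expr"` pairs separated by commas,
// mirroring the shape of the jsonExpressionList grammar rule.
func parseJSONExpressionList(input string) ([]JSONExpression, error) {
	var s scanner.Scanner
	s.Init(strings.NewReader(input))
	// Silence the default stderr reporting; bad tokens are caught below.
	s.Error = func(*scanner.Scanner, string) {}

	var out []JSONExpression
	for {
		if tok := s.Scan(); tok != scanner.Ident {
			return nil, fmt.Errorf("expected identifier, got %q", s.TokenText())
		}
		id := s.TokenText()
		if tok := s.Scan(); tok != '=' {
			return nil, fmt.Errorf("expected '=', got %q", s.TokenText())
		}
		if tok := s.Scan(); tok != scanner.String {
			return nil, fmt.Errorf("expected quoted string, got %q", s.TokenText())
		}
		// An unterminated string still scans as a String token, but
		// Unquote rejects it.
		expr, err := strconv.Unquote(s.TokenText())
		if err != nil {
			return nil, fmt.Errorf("invalid string %s: %w", s.TokenText(), err)
		}
		out = append(out, JSONExpression{Identifier: id, Expression: expr})

		switch tok := s.Scan(); tok {
		case scanner.EOF:
			return out, nil
		case ',':
			// another expression follows
		default:
			return nil, fmt.Errorf("expected ',' or end of input, got %q", s.TokenText())
		}
	}
}

func main() {
	exprs, err := parseJSONExpressionList(`first_server="servers[0]", ua="request.headers[\"User-Agent\"]"`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, e := range exprs {
		fmt.Printf("%s <- %s\n", e.Identifier, e.Expression)
	}
}
```

The grammar rule is left-recursive (`jsonExpressionList COMMA jsonExpression`), which suits an LALR parser generator; the loop here is the iterative equivalent.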