Feature/1363: Additional configuration for JSON parser #4351
Changes from 16 commits
420dafd
c50bfc2
7f2b2a0
4db6675
885193b
674b957
e25fbed
9196488
22e28e0
6207324
dca5e99
5e22ec4
867c3a7
fa4b2f0
197c474
e2ebce0
ab3433a
8445d14
aff2697
cabf688
145c5ef
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -105,9 +105,27 @@ but can be overridden using the `name_override` config option. | |
|
||
#### JSON Configuration: | ||
|
||
The JSON data format supports specifying "tag keys". If specified, keys | ||
will be searched for in the root-level of the JSON blob. If the key(s) exist, | ||
they will be applied as tags to the Telegraf metrics. | ||
The JSON data format supports specifying "tag_keys", "string_fields", and "json_query". | ||
If specified, keys in "tag_keys" and "string_fields" will be searched for in the root-level | ||
and any nested lists of the JSON blob. All int and float values are added to fields by default. | ||
If the key(s) exist, they will be applied as tags or fields to the Telegraf metrics. | ||
If "string_fields" is specified, the matching string values will be added as fields. | ||
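
The split between tags, numeric fields, and string fields described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the parser's Go implementation: the helper names `flatten` and `to_metric` are hypothetical, the "_" join for nested keys is inferred from the `b_c` field in the example output, and skipping booleans is an assumption of this sketch.

```python
def flatten(obj, prefix=""):
    """Flatten nested objects into one level, joining keys with '_'."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def to_metric(obj, tag_keys=(), string_fields=()):
    """Split one JSON object into (tags, fields)."""
    tags, fields = {}, {}
    for name, value in flatten(obj).items():
        if name in tag_keys:
            tags[name] = str(value)
        elif isinstance(value, bool):
            continue  # assumption: booleans are ignored in this sketch
        elif isinstance(value, (int, float)):
            fields[name] = value  # numbers become fields by default
        elif isinstance(value, str) and name in string_fields:
            fields[name] = value  # strings are kept only when listed
    return tags, fields


# Illustrative input; the tag key "my_tag_2" is absent and is simply skipped.
tags, fields = to_metric(
    {"a": 5, "b": {"c": 6}, "my_tag_1": "foo"},
    tag_keys=["my_tag_1", "my_tag_2"],
)
print(tags, fields)
```

A string value that is listed in neither option is dropped, which is why string keys have to be opted in explicitly.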
|
||
The "json_query" configuration is a gjson path to a JSON object or | ||
list of JSON objects. If this path leads to an array of values or | ||
a single data point, an error will be thrown. If this configuration | ||
is specified, only the result of the query will be parsed and returned as metrics. | ||
|
||
Object paths are specified using gjson path format, which is denoted by object keys | ||
concatenated with "." to go deeper in nested JSON objects. | ||
Additional information on gjson paths can be found here: https://github.com/tidwall/gjson#path-syntax | ||
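
As a rough illustration of the dotted-path idea, the Python sketch below walks a parsed document one key at a time and then applies the "object or list of objects" rule for "json_query". It is a deliberately simplified stand-in: the real gjson syntax also covers wildcards, escaped dots, and queries, none of which are handled here, and the names `resolve` and `check_query_result` are hypothetical.

```python
import json


def resolve(doc, path):
    """Follow a dotted path; numeric parts index into lists."""
    node = doc
    for part in path.split("."):
        if isinstance(node, list):
            node = node[int(part)]
        else:
            node = node[part]
    return node


def check_query_result(node):
    """Return a list of objects, or raise if the path selected anything else."""
    if isinstance(node, dict):
        return [node]
    if isinstance(node, list) and all(isinstance(item, dict) for item in node):
        return node
    raise ValueError("json_query must select an object or a list of objects")


doc = json.loads(
    '{"obj": {"children": ["Sara", "Alex"],'
    ' "friends": [{"first": "Dale"}, {"first": "Roger"}]}}'
)

friends = check_query_result(resolve(doc, "obj.friends"))
first_name = resolve(doc, "obj.friends.0.first")
print(len(friends), first_name)
```

Pointing the query at `obj.children` (an array of plain values) makes `check_query_result` raise, mirroring the error case described above.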
|
||
The JSON data format also supports extracting time values through the | ||
config "json_time_key" and "json_time_format". If "json_time_key" is set, | ||
"json_time_format" must be specified. The "json_time_key" describes the | ||
name of the field containing time information. The "json_time_format" | ||
must be a recognized Go time format. | ||
More info on time formats can be found here: https://golang.org/pkg/time/#Parse | ||
|
||
For example, if you had this configuration: | ||
|
||
|
@@ -125,11 +143,25 @@ For example, if you had this configuration: | |
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md | ||
data_format = "json" | ||
|
||
## List of tag names to extract from top-level of JSON server response | ||
## List of tag names to extract from JSON server response | ||
tag_keys = [ | ||
"my_tag_1", | ||
"my_tag_2" | ||
] | ||
|
||
## List of field names to extract from JSON and add as string fields | ||
# string_fields = [] | ||
|
||
## gjson query path to specify a specific chunk of JSON to be parsed with | ||
## the above configuration. If not specified, the whole file will be parsed. | ||
## gjson query paths are described here: | ||
# json_query = "" | ||
|
||
## holds the name of the tag of timestamp | ||
# json_time_key = "" | ||
|
||
## holds the format of timestamp to be parsed | ||
# json_time_format = "" | ||
``` | ||
|
||
with this JSON output from a command: | ||
|
@@ -150,8 +182,9 @@ Your Telegraf metrics would get tagged with "my_tag_1" | |
exec_mycollector,my_tag_1=foo a=5,b_c=6 | ||
``` | ||
|
||
If the JSON data is an array, then each element of the array is parsed with the configured settings. | ||
Each resulting metric will be output with the same timestamp. | ||
If the JSON data is an array, then each element of the array is | ||
parsed with the configured settings. Each resulting metric will | ||
be output with the same timestamp. | ||
|
||
For example, if the following configuration: | ||
|
||
|
@@ -174,6 +207,19 @@ For example, if the following configuration: | |
"my_tag_1", | ||
"my_tag_2" | ||
] | ||
|
||
## List of field names to extract from JSON and add as string fields | ||
# string_fields = [] | ||
|
||
## gjson query path to specify a specific chunk of JSON to be parsed with | ||
## the above configuration. If not specified, the whole file will be parsed | ||
# json_query = "" | ||
|
||
## holds the name of the tag of timestamp | ||
json_time_key = "b_time" | ||
|
||
## holds the format of timestamp to be parsed | ||
json_time_format = "02 Jan 06 15:04 MST" | ||
This is an interesting example time format because it doesn't include a year. This results in a parsed time that is in year 0. I think after parsing the time we should do something similar to what we did in logparser and use the current year. While this would be pretty annoying if you actually want to store in a metric from year 0, I think it is much more likely to be what is wanted:
|
||
``` | ||
|
||
with this JSON output from a command: | ||
|
@@ -183,27 +229,89 @@ with this JSON output from a command: | |
{ | ||
"a": 5, | ||
"b": { | ||
"c": 6 | ||
"c": 6, | ||
"time":"04 Jan 06 15:04 MST" | ||
}, | ||
"my_tag_1": "foo", | ||
"my_tag_2": "baz" | ||
}, | ||
{ | ||
"a": 7, | ||
"b": { | ||
"c": 8 | ||
"c": 8, | ||
"time":"11 Jan 07 15:04 MST" | ||
}, | ||
"my_tag_1": "bar", | ||
"my_tag_2": "baz" | ||
} | ||
] | ||
``` | ||
|
||
Your Telegraf metrics would get tagged with "my_tag_1" and "my_tag_2" | ||
Your Telegraf metrics would get tagged with "my_tag_1" and "my_tag_2" and fielded with "b_c" | ||
The metric's time will be a time.Time object, as specified by "b_time" | ||
|
||
``` | ||
exec_mycollector,my_tag_1=foo,my_tag_2=baz b_c=6 1136387040000000000 | ||
exec_mycollector,my_tag_1=bar,my_tag_2=baz b_c=8 1168527840000000000 | ||
``` | ||
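
The nanosecond timestamps in the output above can be checked by hand. The sketch below does so in Python rather than Go, so the Go layout "02 Jan 06 15:04 MST" is translated to the `strptime` pattern `%d %b %y %H:%M`; treating the trailing zone abbreviation as UTC is an assumption of this sketch (it is what reproduces the numbers shown), and `to_unix_nanos` is a hypothetical helper.

```python
from datetime import datetime, timezone


def to_unix_nanos(value):
    """Convert e.g. '04 Jan 06 15:04 MST' to Unix nanoseconds."""
    # Drop the zone abbreviation; this sketch assumes a zero UTC offset.
    stamp, _zone = value.rsplit(" ", 1)
    parsed = datetime.strptime(stamp, "%d %b %y %H:%M")
    parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 1_000_000_000


for raw in ("04 Jan 06 15:04 MST", "11 Jan 07 15:04 MST"):
    print(raw, "->", to_unix_nanos(raw))
```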
|
||
If you want to only use a specific portion of your JSON, use the "json_query" | ||
configuration to specify a path to a JSON object. | ||
|
||
For example, with the following config: | ||
```toml | ||
[[inputs.exec]] | ||
## Commands array | ||
commands = ["/usr/bin/mycollector --foo=bar"] | ||
|
||
## measurement name suffix (for separating different commands) | ||
name_suffix = "_mycollector" | ||
|
||
## Data format to consume. | ||
## Each data format has its own unique set of configuration options, read | ||
## more about them here: | ||
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md | ||
data_format = "json" | ||
|
||
## List of tag names to extract from top-level of JSON server response | ||
tag_keys = ["first"] | ||
|
||
## List of field names to extract from JSON and add as string fields | ||
string_fields = ["last"] | ||
|
||
## gjson query path to specify a specific chunk of JSON to be parsed with | ||
## the above configuration. If not specified, the whole file will be parsed | ||
json_query = "obj.friends" | ||
|
||
## holds the name of the tag of timestamp | ||
# json_time_key = "" | ||
|
||
## holds the format of timestamp to be parsed | ||
# json_time_format = "" | ||
``` | ||
|
||
with this JSON as input: | ||
```json | ||
{ | ||
"obj": { | ||
"name": {"first": "Tom", "last": "Anderson"}, | ||
"age":37, | ||
"children": ["Sara","Alex","Jack"], | ||
"fav.movie": "Deer Hunter", | ||
"friends": [ | ||
{"first": "Dale", "last": "Murphy", "age": 44}, | ||
{"first": "Roger", "last": "Craig", "age": 68}, | ||
{"first": "Jane", "last": "Murphy", "age": 47} | ||
] | ||
} | ||
} | ||
``` | ||
You would receive 3 metrics tagged with "first", and fielded with "last" and "age" | ||
|
||
``` | ||
exec_mycollector,my_tag_1=foo,my_tag_2=baz a=5,b_c=6 | ||
exec_mycollector,my_tag_1=bar,my_tag_2=baz a=7,b_c=8 | ||
exec_mycollector,first=Dale last="Murphy",age=44 | ||
exec_mycollector,first=Roger last="Craig",age=68 | ||
exec_mycollector,first=Jane last="Murphy",age=47 | ||
``` | ||
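
To make the "json_query" example concrete, here is a small Python sketch that selects `obj.friends` from the input and renders each element in line-protocol style. It assumes the measurement name `exec_mycollector` from the config above, uses a hypothetical `render` helper, and replaces gjson with a plain key-by-key walk, so it illustrates the expected result rather than the parser itself.

```python
import json

RAW = """
{
  "obj": {
    "name": {"first": "Tom", "last": "Anderson"},
    "age": 37,
    "friends": [
      {"first": "Dale", "last": "Murphy", "age": 44},
      {"first": "Roger", "last": "Craig", "age": 68},
      {"first": "Jane", "last": "Murphy", "age": 47}
    ]
  }
}
"""


def render(measurement, obj, tag_keys, string_fields):
    """Render one flat JSON object as a line-protocol-style string."""
    tags = [f"{key}={value}" for key, value in obj.items() if key in tag_keys]
    fields = []
    for key, value in obj.items():
        if key in tag_keys:
            continue
        if isinstance(value, str):
            if key in string_fields:
                fields.append(f'{key}="{value}"')  # string fields are quoted
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            fields.append(f"{key}={value}")
    return ",".join([measurement] + tags) + " " + ",".join(fields)


# Simplified stand-in for json_query = "obj.friends".
selected = json.loads(RAW)
for part in "obj.friends".split("."):
    selected = selected[part]

lines = [render("exec_mycollector", friend, ["first"], ["last"]) for friend in selected]
print("\n".join(lines))
```

Everything outside the selected path (`name`, the top-level `age`) never reaches the metric, which is the point of the query option.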
|
||
# Value: | ||
|
Add link to where gjson query paths are described