cmd/bosun: InfluxDB query support #1291

Merged 1 commit on Sep 2, 2015

Conversation

nathanielc
Contributor

Here is an updated PR from the original 'influxdb-query' branch. It adds support for an influx function that performs a query against InfluxDB and returns the results as a seriesSet. It also adds support for triple-quoted strings in expressions, so that InfluxDB queries can make full use of single and double quotes.

Other small changes were made from the original PR:

Manipulating the InfluxQL AST instead of doing string manipulation.
Removed the format parameter, since it required the expression to list the desired tags twice. It seems the original reason for having it was to work around a bug in InfluxDB, and it is no longer necessary. See influxdata/influxdb#3059

Now squashed with all changes.

This was referenced Sep 1, 2015
@gbrayut
Contributor

gbrayut commented Sep 1, 2015

Thanks for squashing the commits. We usually prefer things get squashed before merging, and I see you already used the cmd/bosun: prefix. I'll take a look today and see if I can get it running.

Also, for future reference, you can force-push the squashed commits to an existing branch instead of opening a new PR. A new PR is only needed if you are deviating from the original goal or want to reset the discussion. Using the existing PR helps keep all the comments in one place.

I'll let you know how much progress I get on this later today.

@gbrayut
Contributor

gbrayut commented Sep 1, 2015

First off, some notes from how I did my testing:

#Install influxdb using RPM
wget https://s3.amazonaws.com/influxdb/influxdb-0.9.3-1.x86_64.rpm
yum localinstall influxdb-0.9.3-1.x86_64.rpm
service influxdb start

#create influxdb user (required by grafana) and opentsdb database
/opt/influxdb/influx
CREATE USER grafana WITH PASSWORD 'grafana'
CREATE DATABASE opentsdb

#Install telegraf for metrics https://influxdb.com/blog/2015/06/19/Announcing-Telegraf-a-metrics-collector-for-InfluxDB.html 
wget http://get.influxdb.org/telegraf/telegraf-0.1.7-1.x86_64.rpm
yum localinstall telegraf-0.1.7-1.x86_64.rpm
service telegraf start

#update /etc/opt/influxdb/influxdb.conf config file
[meta]
  dir = "/var/opt/influxdb/meta"
  hostname = "ny-gbraylx01.ds.stackexchange.com"
...
[opentsdb]
  enabled = true
  bind-address = ":4242"
  database = "opentsdb"
  retention-policy = ""

#Restart influxdb and point an instance of scollector at it:
service influxdb restart
scollector -h ny-gbraylx01.ds.stackexchange.com:4242

#Add the influxdb remote, fetch it, and build bosun from the branch
git remote add influxdb git@github.com:influxdb/bosun.git
git fetch influxdb
git checkout squashed_influxdb-query
go build bosun.org/cmd/bosun

#edit conf file and run bosun
cd $GOPATH/src/bosun.org/cmd/bosun
echo influxHost = ny-gbraylx01.ds.stackexchange.com:8086 > influxdb.conf
./bosun -c influxdb.conf

Using the above I now have:
InfluxDB admin page at http://ny-gbraylx01.ds.stackexchange.com:8083/
opentsdb service at http://ny-gbraylx01.ds.stackexchange.com:4242/
grafana datasource working using http://ny-gbraylx01.ds.stackexchange.com:8086/

A quick test on the bosun expression page shows that the following returns the expected data:
influx("telegraf","select value from cpu_percentageIdle group by cpu","5m","")

I will continue testing and let you know if I find any issues.

@giganteous
Contributor

Triple quotes give an error: I patched that in parser.go in my branch at https://github.com/giganteous/bosun/tree/squashed_influxdb-query

@gbrayut
Contributor

gbrayut commented Sep 1, 2015

  1. Ideally influxHost should support a full URL if provided, so that a secure https://host:port can be used if everything is configured correctly. Currently, if I use influxHost = http://ny-gbraylx01.ds.stackexchange.com:8086, I get "influx: did not get a valid result from InfluxDB".
  2. Triple quotes don't appear to work correctly. The error message is "expr: invalid syntax". (I see giganteous made a patch already.)
  3. There is a debug writeline at https://github.com/influxdb/bosun/blob/squashed_influxdb-query/cmd/bosun/expr/influx.go#L189 that should be removed.
  4. Also, when starting, it logs an odd "web.go:121: tsdb host:" line:
[root@ny-gbraylx01 bosun]# ./bosun -c ./influxdb.conf
2015/09/01 22:25:04 enabling syslog
2015/09/01 22:25:04 info: bolt.go:118: RestoreState
2015/09/01 22:25:04 info: bolt.go:231: RestoreState done in 1.539129ms
2015/09/01 22:25:04 info: web.go:120: bosun web listening on: :8070
2015/09/01 22:25:04 info: web.go:121: tsdb host:

This doesn't need to be addressed in this PR, but we may want to improve the logging of which backends are being used and what their settings are (tsdbHost, graphiteHost, influxHost, logstashElasticHosts).
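
Regarding point 1, here is a rough sketch of how a setting like influxHost could accept either a bare host:port or a full URL. This is not what the PR does today. The function name influxBaseURL and the choice to default to http are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// influxBaseURL accepts "host:port" or "scheme://host:port" and returns a
// parsed base URL. A bare host:port defaults to the http scheme.
func influxBaseURL(setting string) (*url.URL, error) {
	setting = strings.TrimSpace(setting)
	if setting == "" {
		return nil, fmt.Errorf("influxHost is empty")
	}
	// Parsing "host:8086" directly would not yield a usable host, so add a
	// scheme first when none is present.
	if !strings.Contains(setting, "://") {
		setting = "http://" + setting
	}
	u, err := url.Parse(setting)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("no host in %q", setting)
	}
	return u, nil
}

func main() {
	for _, s := range []string{
		"ny-gbraylx01.ds.stackexchange.com:8086",
		"https://ny-gbraylx01.ds.stackexchange.com:8086",
	} {
		u, err := influxBaseURL(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u.Scheme, u.Host)
	}
}
```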

We should probably make a docker image for testing. I'm happy to file an issue for tracking and work on this next week.

I'll chat with Craig about this and see when we can merge it.

docs: Add docs for influx function
@nathanielc
Contributor Author

@giganteous thanks for the patch. I have applied it and updated this PR.

@gbrayut
Contributor

gbrayut commented Sep 2, 2015

The updated PR LGTM. The following expressions work on my local instance:

influx("telegraf","select value from cpu_percentageIdle group by host,cpu","5m","")
influx("opentsdb",'''SELECT value from "os.mem.percent_free" group by host''',"5m","")
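
The second expression depends on triple-quoted strings, so the query can contain double quotes (and single quotes) without escaping. As a rough idea of how a scanner might pick such a string out of an expression, here is a minimal sketch. It is not the lexer code from this PR, and scanTripleQuoted is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const tripleQuote = `'''`

// scanTripleQuoted reads a triple-quoted string from the start of input and
// returns its contents plus whatever follows the closing delimiter.
// Limitation of this sketch: the first ''' after the opening one closes the
// string, so a single quote placed directly before the closing delimiter is
// not handled.
func scanTripleQuoted(input string) (content, rest string, err error) {
	if !strings.HasPrefix(input, tripleQuote) {
		return "", input, errors.New("input does not start with a triple quote")
	}
	body := input[len(tripleQuote):]
	end := strings.Index(body, tripleQuote)
	if end < 0 {
		return "", input, errors.New("unterminated triple-quoted string")
	}
	return body[:end], body[end+len(tripleQuote):], nil
}

func main() {
	content, rest, err := scanTripleQuoted(`'''SELECT value from "os.mem.percent_free" group by host''',"5m","")`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("query:", content)
	fmt.Println("rest:", rest)
}
```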

I'll merge this and create issues for tracking some follow-up items (how-to guide, docker image, etc.). If you find any problems, please create an Issue/PR using the influxdb label.

Thanks @nathanielc and @giganteous!

@mubblemanish

I am getting the error below while trying to start bosun.

sudo ./bosun -c conf/bosun.conf
2015/10/19 07:03:14 enabling syslog
panic: runtime error: index out of range [recovered]
panic: runtime error: index out of range [recovered]
panic: runtime error: index out of range

goroutine 1 [running]:
bosun.org/cmd/bosun/conf.errRecover(0xc820115c60)
/go/src/bosun.org/cmd/bosun/conf/conf.go:185 +0xbb
bosun.org/cmd/bosun/expr/parse.(*Tree).recover(0xc820293490, 0xc820115350)
/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:191 +0x8d
bosun.org/cmd/bosun/expr.influxTag(0xc82025dec0, 0x4, 0x4, 0xc820293490, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/influx.go:28 +0x224
bosun.org/cmd/bosun/expr/parse.(*FuncNode).Tags(0xc8201a9e00, 0x1, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/parse/node.go:137 +0x73
bosun.org/cmd/bosun/expr.tagFirst(0xc8202a0440, 0x1, 0x1, 0xf91803, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/funcs.go:44 +0x47
bosun.org/cmd/bosun/expr/parse.(*FuncNode).Tags(0xc8201a9d80, 0xc820293490, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/parse/node.go:137 +0x73
bosun.org/cmd/bosun/expr/parse.(*BinaryNode).Check(0xc820017ec0, 0xc820293490, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/parse/node.go:275 +0x3ce
bosun.org/cmd/bosun/expr/parse.(*Tree).parse(0xc820293490)
/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:243 +0xb5
bosun.org/cmd/bosun/expr/parse.(*Tree).Parse(0xc820293490, 0xc820017c20, 0x58, 0xc8202a0420, 0x2, 0x2, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:233 +0xdf
bosun.org/cmd/bosun/expr/parse.Parse(0xc820017c20, 0x58, 0xc8202a0420, 0x2, 0x2, 0xc820293490, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:113 +0x9c
bosun.org/cmd/bosun/expr.New(0xc820017c20, 0x58, 0xc8202a0420, 0x2, 0x2, 0x0, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/expr/expr.go:67 +0xac
bosun.org/cmd/bosun/conf.(*Conf).NewExpr(0xc820294000, 0xc820017c20, 0x58, 0x4)
/go/src/bosun.org/cmd/bosun/conf/conf.go:1125 +0xc2
bosun.org/cmd/bosun/conf.(*Conf).loadAlert(0xc820294000, 0xc82025da80)
/go/src/bosun.org/cmd/bosun/conf/conf.go:822 +0x1416
bosun.org/cmd/bosun/conf.(*Conf).loadSection(0xc820294000, 0xc82025da80)
/go/src/bosun.org/cmd/bosun/conf/conf.go:513 +0xeb
bosun.org/cmd/bosun/conf.New(0x7ffc1b5bd7f6, 0xf, 0xc820264a00, 0x243, 0xc820294000, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/conf/conf.go:368 +0xf7f
bosun.org/cmd/bosun/conf.ParseFile(0x7ffc1b5bd7f6, 0xf, 0xb, 0x0, 0x0)
/go/src/bosun.org/cmd/bosun/conf/conf.go:331 +0xe3
main.main()
/go/src/bosun.org/cmd/bosun/main.go:86 +0x201

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1696 +0x1

goroutine 5 [runnable]:
text/template/parse.lexText(0xc820080080, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc820080080)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 6 [runnable]:
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:20
created by os/signal.init.1
/usr/local/go/src/os/signal/signal_unix.go:28 +0x37

goroutine 7 [runnable]:
text/template/parse.lexText(0xc820080200, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc820080200)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 8 [runnable]:
text/template/parse.lexText(0xc820080300, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc820080300)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 11 [runnable]:
text/template/parse.lexText(0xc8201a9200, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc8201a9200)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 14 [runnable]:
text/template/parse.lexText(0xc8201a9500, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc8201a9500)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 16 [runnable]:
bosun.org/cmd/bosun/conf/parse.lexSpace(0xc820274d20, 0xf95aa8)
/go/src/bosun.org/cmd/bosun/conf/parse/lex.go:166 +0x1be
bosun.org/cmd/bosun/conf/parse.(*lexer).run(0xc820274d20)
/go/src/bosun.org/cmd/bosun/conf/parse/lex.go:135 +0x52
created by bosun.org/cmd/bosun/conf/parse.lex
/go/src/bosun.org/cmd/bosun/conf/parse/lex.go:128 +0xdf

goroutine 18 [runnable]:
text/template/parse.lexText(0xc8201a9c00, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc8201a9c00)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 19 [runnable]:
text/template/parse.lexText(0xc8201a9d00, 0xf97190)
/usr/local/go/src/text/template/parse/lex.go:237 +0x37e
text/template/parse.(*lexer).run(0xc8201a9d00)
/usr/local/go/src/text/template/parse/lex.go:206 +0x52
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:199 +0x15d

goroutine 20 [runnable]:
bosun.org/cmd/bosun/expr/parse.lexItem(0xc82025de40, 0xf95d98)
/go/src/bosun.org/cmd/bosun/expr/parse/lex.go:193 +0x4c1
bosun.org/cmd/bosun/expr/parse.(*lexer).run(0xc82025de40)
/go/src/bosun.org/cmd/bosun/expr/parse/lex.go:163 +0x52
created by bosun.org/cmd/bosun/expr/parse.lex
/go/src/bosun.org/cmd/bosun/expr/parse/lex.go:156 +0xc9

My config file looks like this:

influxHost = 54.254.204.231:8086
smtpHost = mail.stackoverflow.com:25
emailFrom = bosun@example.org

template cpu {
    body = `Alert definition:
    Name: {{.Alert.Name}}
    Crit: {{.Alert.Crit}}

    Tags:{{range $k, $v := .Tags}}
    {{$k}}: {{$v}}{{end}}
    `
    subject = cpu idle at {{.Alert.Vars.q | .E}} on {{.Tags.host}}
}

notification default {
    email = someone@domain.com
    next = default
    timeout = 1h
}

alert influx.os.high.cpu {
    $q = avg(influx("metrics_db",'''select value from cpu_value GROUP BY host''', "5m", ""))
    warn = $q > 20
    crit = $q > 1
}

What is wrong in the configuration?

@giganteous
Contributor

If I'm not mistaken, this bug is fixed if you compile from master.

@gbrayut gbrayut self-assigned this Feb 9, 2016