
No data shown in dashboard for speedtest or airgradient "Service unavailable" #282

Closed
duindain opened this issue Nov 20, 2021 · 3 comments
@duindain

Hey, thanks for creating this project!

I haven't been getting Speedtest data for the last month or two, but I hadn't investigated until now. Today I finished building my first AirGradient sensor, toggled that feature on in the config, and reran ansible-playbook main.yml. Four hours later, both dashboards (Speedtest and Air Sensors) still show no data.

How can I debug what's wrong?
I've tried deleting all Docker containers on the Pi other than Pi-hole and recreating the install, but that didn't change anything.
My config is almost stock, apart from pinning the Grafana version to work around the annotation issue and changing the speedtest interval to 180 minutes.

The Query Inspector in Grafana shows a "Service Unavailable" response coming back from the exporter service:

request: Object
  url: "api/datasources/proxy/11/api/v1/query_range"
  method: "POST"
  data: Object
    query: "pm02{job="airgradient"}"
    start: 1637232720
    end: 1637405520
    step: 120
  hideFromInspector: false
response: Object
  error: "Service Unavailable"
  response: "Service Unavailable"
  message: "Service Unavailable"
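One way to tell whether the 503 comes from Grafana's datasource proxy or from Prometheus itself is to query Prometheus directly, bypassing Grafana. This is only a sketch, assuming Prometheus is published on the stack's usual port 9090:

```shell
# Query Prometheus directly, bypassing Grafana's datasource proxy.
# Assumes Prometheus is reachable on localhost:9090 (an assumption here,
# not confirmed in this thread).
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=pm02{job="airgradient"}'

# List scrape targets and their health; a "down" target points at the exporter,
# while no response at all points at Prometheus itself.
curl -s 'http://localhost:9090/api/v1/targets' | grep -o '"health":"[a-z]*"'
```

If the first command returns data but Grafana still shows "Service Unavailable", the problem is between Grafana and Prometheus rather than in the exporter.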

I've tested the exporter on the Pi over SSH with curl localhost:9925/metrics, which returns:

# HELP instance The ID of the AirGradient sensor.
# instance cfa389
# HELP wifi Current WiFi signal strength, in dB
# TYPE wifi gauge
wifi -69
# HELP pm02 Particulate Matter PM2.5 value
# TYPE pm02 gauge
pm02 0
# HELP rco2 CO2 value, in ppm
# TYPE rco2 gauge
rco2 772
# HELP atmp Temperature, in degrees Celsius
# TYPE atmp gauge
atmp 23.9
# HELP rhum Relative humidity, in percent
# TYPE rhum gauge
rhum 54

and those numbers change frequently, so the Pi does seem to be receiving the sensor data.

config.yml

---
# Location where configuration files will be stored.
config_dir: '~'

# Pi-hole configuration.
pihole_enable: true
pihole_hostname: pihole
pihole_timezone: Australia/Sydney
pihole_password: "apassword"

# Internet monitoring configuration.
monitoring_enable: true
monitoring_grafana_admin_password: "admin"
monitoring_speedtest_interval: 180m
monitoring_ping_interval: 5s
monitoring_ping_hosts:  # [URL];[HUMAN_READABLE_NAME]
  - http://www.google.com/;google.com
  - https://github.com/;github.com
  - https://www.apple.com/;apple.com

# Shelly Plug configuration. (Also requires `monitoring_enable`)
shelly_plug_enable: false
shelly_plug_hostname: my-shelly-plug-host-or-ip
shelly_plug_http_username: username
shelly_plug_http_password: "password"

# AirGradient configuration. (Also requires `monitoring_enable`)
airgradient_enable: true

# Starlink configuration. (Also requires `monitoring_enable`)
starlink_enable: false

inventory.ini

[internet_pi] 
#191.168.2.44 ansible_user=pi 
 
# Comment out the previous line and uncomment this to run directly on the Raspberry Pi.
127.0.0.1 ansible_connection=local ansible_user=pi

I tried adding the Pi's IP to the inventory, but the Ansible run ended with the following output:

PLAY RECAP ***********************************************************************************************************************************************************************************************************************
127.0.0.1                  : ok=20   changed=2    unreachable=0    failed=0    skipped=15   rescued=0    ignored=0   
191.168.2.44               : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0 
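A quick ad-hoc ping through Ansible can confirm SSH reachability before running the whole playbook. This is just a sketch using the values from the inventory above; note that 191.168.0.0/16 is public address space, not the private 192.168.0.0/16 range, so if the Pi's LAN actually uses 192.168.2.x, the leading "191" would by itself explain unreachable=1:

```shell
# Test Ansible's SSH connectivity to the host from inventory.ini before
# running the full playbook. If the LAN is 192.168.2.0/24, the "191" in
# the address is likely a typo and would cause "unreachable=1".
ansible -i inventory.ini 191.168.2.44 -m ping -u pi
```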

Does anyone know where to look to track down the issue, please?

@duindain
Author

duindain commented Nov 20, 2021

Checking the logs of the Docker containers with sudo docker logs <container-id> --timestamps,
the prom/prometheus:v2.25.2 container is restarting frequently and logging lots of memory errors:

2021-11-20T11:36:09.014038866Z panic: runtime error: invalid memory address or nil pointer dereference
2021-11-20T11:36:09.014358293Z [signal SIGSEGV: segmentation violation code=0x1 addr=0xc pc=0x17308a8]
2021-11-20T11:36:09.014428762Z
2021-11-20T11:36:09.014493762Z goroutine 779 [running]:
2021-11-20T11:36:09.033965282Z bufio.(*Writer).Available(...)
2021-11-20T11:36:09.034135491Z 	/usr/local/go/src/bufio/bufio.go:624
2021-11-20T11:36:09.034226116Z github.com/prometheus/prometheus/tsdb/chunks.(*ChunkDiskMapper).WriteChunk(0x4ef0b40, 0x306, 0x0, 0x5e7ca812, 0x17c, 0x5e85bc4a, 0x17c, 0x25f26d8, 0x4b39700, 0x0, ...)
2021-11-20T11:36:09.034295126Z 	/app/tsdb/chunks/head_chunks.go:291 +0x4f0
2021-11-20T11:36:09.034362574Z github.com/prometheus/prometheus/tsdb.(*memSeries).mmapCurrentHeadChunk(0x45bce70, 0x4ef0b40)
2021-11-20T11:36:09.034427939Z 	/app/tsdb/head.go:2102 +0x6c
2021-11-20T11:36:09.034493564Z github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk(0x45bce70, 0x5e85cfd2, 0x17c, 0x4ef0b40, 0x54e7780)
2021-11-20T11:36:09.034559397Z 	/app/tsdb/head.go:2076 +0x24
2021-11-20T11:36:09.034688095Z github.com/prometheus/prometheus/tsdb.(*memSeries).append(0x45bce70, 0x5e85cfd2, 0x17c, 0x536cdbee, 0x3f43f50c, 0x0, 0x0, 0x4ef0b40, 0x56f0001)
2021-11-20T11:36:09.034751533Z 	/app/tsdb/head.go:2232 +0x3a0
2021-11-20T11:36:09.034811689Z github.com/prometheus/prometheus/tsdb.(*Head).processWALSamples(0x3cf8800, 0x5e7ca700, 0x17c, 0x58d0ec0, 0x58d0e80, 0x0, 0x0)
2021-11-20T11:36:09.034872470Z 	/app/tsdb/head.go:397 +0x284
2021-11-20T11:36:09.034931220Z github.com/prometheus/prometheus/tsdb.(*Head).loadWAL.func5(0x3cf8800, 0x4bb7b10, 0x4bb7b20, 0x58d0ec0, 0x58d0e80)
2021-11-20T11:36:09.034991793Z 	/app/tsdb/head.go:491 +0x40
2021-11-20T11:36:09.035051168Z created by github.com/prometheus/prometheus/tsdb.(*Head).loadWAL
2021-11-20T11:36:09.035109814Z 	/app/tsdb/head.go:490 +0x268

2021-11-20T11:30:53.826030985Z panic: mmap, size 134217728: cannot allocate memory
2021-11-20T11:30:53.826209631Z
2021-11-20T11:30:53.826512652Z goroutine 827 [running]:
2021-11-20T11:30:53.828210153Z github.com/prometheus/prometheus/tsdb.(*memSeries).mmapCurrentHeadChunk(0x4ae6e70, 0x4986120)
2021-11-20T11:30:53.828296924Z 	/app/tsdb/head.go:2105 +0x228
2021-11-20T11:30:53.828362758Z github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk(0x4ae6e70, 0x5e85cfd2, 0x17c, 0x4986120, 0x5d01440)
2021-11-20T11:30:53.828425987Z 	/app/tsdb/head.go:2076 +0x24
2021-11-20T11:30:53.828486977Z github.com/prometheus/prometheus/tsdb.(*memSeries).append(0x4ae6e70, 0x5e85cfd2, 0x17c, 0x536cdbee, 0x3f43f50c, 0x0, 0x0, 0x4986120, 0x5160001)
2021-11-20T11:30:53.828550362Z 	/app/tsdb/head.go:2232 +0x3a0
2021-11-20T11:30:53.828611039Z github.com/prometheus/prometheus/tsdb.(*Head).processWALSamples(0x3d7e500, 0x5e7ca700, 0x17c, 0x5ce6c80, 0x5ce6c40, 0x0, 0x0)
2021-11-20T11:30:53.828673122Z 	/app/tsdb/head.go:397 +0x284
2021-11-20T11:30:53.828733018Z github.com/prometheus/prometheus/tsdb.(*Head).loadWAL.func5(0x3d7e500, 0x5104158, 0x5104160, 0x5ce6c80, 0x5ce6c40)
2021-11-20T11:30:53.828794789Z 	/app/tsdb/head.go:491 +0x40
2021-11-20T11:30:53.828853904Z created by github.com/prometheus/prometheus/tsdb.(*Head).loadWAL
2021-11-20T11:30:53.828914477Z 	/app/tsdb/head.go:490 +0x268
2021-11-20T11:31:03.497673928Z level=info ts=2021-11-20T11:31:03.496Z caller=main.go:404 msg="Starting Prometheus" version="(version=2.25.2, branch=HEAD, revision=bda05a23ada314a0b9806a362da39b7a1a4e04c3)"
2021-11-20T11:31:03.498200439Z level=warn ts=2021-11-20T11:31:03.497Z caller=main.go:406 msg="This Prometheus binary has not been compiled for a 64-bit architecture. Due to virtual memory constraints of 32-bit systems, it is highly recommended to switch to a 64-bit binary of Prometheus." GOARCH=arm

This issue says it's a known problem when running 32-bit Prometheus, and can be fixed by migrating to a 64-bit binary, even on a 32-bit OS like Raspbian.

I've removed the hard-coded Prometheus version from the YAML file and changed it to prom/prometheus:latest.

Unfortunately, the memory errors came back after a few minutes and the service is restarting again.

Nothing is coming up in the graphs yet, and it still says "Service Unavailable".
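To confirm the 32-bit theory hinted at by the GOARCH=arm warning in the log above, checking the kernel and userland word size is quick (on Raspberry Pi OS, a 64-bit kernel reports aarch64):

```shell
# Kernel architecture: "armv7l" is a 32-bit ARM kernel, "aarch64" is 64-bit.
uname -m

# Userland word size: prints 32 or 64. A 32-bit userland caps each process
# at a small virtual address space, which is what Prometheus's mmap-heavy
# TSDB runs into.
getconf LONG_BIT
```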

@duindain
Author

I think one of the WAL files is corrupt; it seems to fail loading segment 656 or 657 every time.

Does anyone know how to clear these files? I can't seem to connect to the container; it doesn't have a bash shell.
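The official prom/prometheus image is busybox-based, so it ships sh rather than bash; and because the data sits in a named volume, the WAL can also be removed from a throwaway container. A sketch (the volume name is the one used later in this thread; deleting the WAL discards unflushed samples, so only do this if the data is expendable):

```shell
# The prom/prometheus image has no bash, but busybox sh is available:
docker exec -it <container-id> sh

# Alternatively, delete just the WAL from a throwaway container that
# mounts the same named volume (loses recent unflushed samples only,
# not the older persisted blocks):
docker run --rm -v internet-monitoring_prometheus_data:/prometheus alpine \
  rm -rf /prometheus/wal
```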

@duindain
Author

I managed to delete the volumes and get this sorted, running from the ~/internet-monitoring folder:

docker volume ls
docker-compose down
docker volume rm internet-monitoring_prometheus_data
docker volume rm internet-monitoring_grafana_data
docker-compose up -d

I couldn't find the volume's files on the SD card, but deleting it through Docker worked fine.
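For the record, named volumes aren't somewhere you'd browse on the boot partition; Docker keeps them under its data root, and inspect prints the exact path. A sketch assuming the default data root of /var/lib/docker:

```shell
# Print where Docker actually stores the named volume on disk.
docker volume inspect internet-monitoring_prometheus_data \
  --format '{{ .Mountpoint }}'
# Typically: /var/lib/docker/volumes/internet-monitoring_prometheus_data/_data
```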

I can now see data starting to populate the air sensor dashboard.
The internet one should start working in a few hours, after the first speedtest runs.

Hopefully it won't reoccur when I hit 656 WAL files again; I'm not sure why there were so many.

Thanks
