Outputs.influxdb cannot write any address - error: EOF #7650

Closed
Trovalo opened this issue Jun 8, 2020 · 10 comments
Labels
area/influxdb bug unexpected problem or unintended behavior platform/windows

Comments

@Trovalo
Collaborator

Trovalo commented Jun 8, 2020

I'm gathering data with Telegraf and writing to a Telegraf gateway (input influxdb_listener). Sometimes I get the following error:

2020-06-05T11:08:07Z E! [outputs.influxdb] When writing to [https://____]: Post "https://____/write?db=quantumdatis": EOF
2020-06-05T11:08:07Z E! [agent] Error writing to outputs.influxdb: could not write any address

I'm not sure what causes this; the Telegraf gateway does not report any error.

I found an old issue, #2280, about the same error and tried changing several settings to solve it, but without luck.

Telegraf gatherer conf

  • agent settings
    • reduce metric_batch_size - in case there was too much data
    • increase interval - to give more time to the output
  • output.influxdb settings
    • using content_encoding = "gzip" - to reduce the request body size
    • increase the timeout - to give it more time

Telegraf gateway conf

  • inputs.influxdb_listener settings
    • increased the max_body_size - doubled its size max_body_size = "64MiB"

Relevant telegraf.conf:

# Full conf to be added (the server is offline now), but there is nothing special in the conf itself

System info:

  • Windows Server 2016 Standard
  • Telegraf custom build (last commit f91d083; the custom build adds some queries to the SQL Server plugin, see "Sql server old version compatibility" #7495)
  • Telegraf 1.14.2 (the error also occurs with the official version)

Steps to reproduce:

I doubt this will be easy to replicate, but here are the steps:

  1. Gather data from SQL Server
  2. Write to an influxDB gateway

Expected behavior:

Not getting the EOF error

Actual behavior:

2020-06-05T11:08:07Z E! [outputs.influxdb] When writing to [https://____]: Post "https://____/write?db=quantumdatis": EOF
2020-06-05T11:08:07Z E! [agent] Error writing to outputs.influxdb: could not write any address

Additional info:

@ssoroka
Contributor

ssoroka commented Jun 8, 2020

What InfluxDB server are you connecting to? Is it a self-hosted one? If so, are there any matching error logs in InfluxDB?

@danielnelson
Contributor

I have a few things I'd like to check into for this one around the json decoder and gzip handling.

@danielnelson danielnelson self-assigned this Jun 8, 2020
@danielnelson danielnelson added area/influxdb bug unexpected problem or unintended behavior labels Jun 8, 2020
@Trovalo
Collaborator Author

Trovalo commented Jun 8, 2020

Right now we are having some issues on the server but it should be fixed soon.
The InfluxDB version should be 1.8 (but I will confirm and edit the issue once the server is back online).

In the meantime you can leave instructions here, and I will apply them ASAP.

@Trovalo
Collaborator Author

Trovalo commented Jul 13, 2020

@danielnelson sorry for the long wait, here is the info about my setup and configuration.

Current configuration & setup

The setup

  • several Telegraf clients that monitor one or more SQL Server instances
  • The data are sent to a telegraf gateway and redirected
  • The gateway sends data to different InfluxDBs (using database_tag)

Client configuration

Version: Telegraf unknown (git: master 1b1382c)
The configuration is split into several files, for full detail see attachment
Configuration.zip

Server configuration (gateway)

Version: Telegraf 1.14.2 (git: HEAD fb74eaf)
There are two outputs only because some data is cloned into a specific DB.
gateway.zip

Log since last Telegraf restart

2020-07-13T09:37:48Z I! Loaded inputs: internal sqlserver sqlserver
2020-07-13T09:37:48Z I! Loaded aggregators: 
2020-07-13T09:37:48Z I! Loaded processors: converter strings
2020-07-13T09:37:48Z I! Loaded outputs: influxdb
2020-07-13T09:37:48Z I! Tags enabled: company=quantumdatis_demo host=QDSRVMONITOR
2020-07-13T09:37:48Z I! [agent] Config: Interval:15s, Quiet:false, Hostname:"QDSRVMONITOR", Flush Interval:10s
2020-07-13T09:38:06Z E! [inputs.sqlserver] Error in plugin: Unable to get instances from Sql Server Browser on host SQLCSRV04: read udp 10.0.1.122:50483->10.0.1.110:1434: i/o timeout
2020-07-13T09:47:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T10:01:30Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T10:02:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T10:11:40Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:11:40Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:12:40Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:12:40Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:13:10Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:13:10Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:14:10Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:14:10Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:15:10Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:15:10Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:20:20Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:20:20Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:23:20Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T10:23:20Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T10:55:30Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T11:00:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T11:52:45Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T11:54:25Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T11:54:25Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T11:54:55Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T11:54:55Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T11:57:55Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T11:57:55Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T11:58:25Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T11:58:25Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T12:00:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T12:00:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T12:02:05Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T12:02:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T12:02:55Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T12:02:55Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T12:04:05Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T12:04:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:26:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T13:30:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T13:33:49Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:33:49Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:43:44Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2020-07-13T13:43:44Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:46:10Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:46:10Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:46:40Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:46:40Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:47:10Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:47:10Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:48:10Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:48:10Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:48:40Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:48:40Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:49:30Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:49:30Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:50:40Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:50:40Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:51:20Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:51:20Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:53:50Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:53:50Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T13:54:20Z E! [outputs.influxdb] When writing to [https://_MyUrl_]: Post "https://_MyUrl_/write?db=quantumdatis_demo": EOF
2020-07-13T13:54:20Z E! [agent] Error writing to outputs.influxdb: could not write any address
2020-07-13T14:00:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T14:30:00Z W! [agent] [inputs.sqlserver] did not complete within its interval
2020-07-13T15:00:00Z W! [agent] [inputs.sqlserver] did not complete within its interval

I've noticed that the Telegraf gateway is running an "old" version; I could start by updating it and checking whether the error persists.

If you have something to try in order to get more data about the issue, just let me know and I will configure/apply it as soon as possible.

@danielnelson
Contributor

I tried a few ideas I had to trigger the error, but wasn't successful in reproducing. I have a couple ideas for ways we could proceed though:

  • Use wireshark to collect the traffic to/from the influxdb output.
  • Add trace statements into the http package, and try to determine the origin of the error.

@Trovalo
Collaborator Author

Trovalo commented Jul 14, 2020

There will be lots of data passing by. I've never used Wireshark so far, but it's not a problem to put it on the server sending metrics. Is there anything particular I should know about it?

@danielnelson
Contributor

Ideally we would limit the capture to only the data directly around a single error. It would be good to have the log from Telegraf for the same time period so we can cross-reference.

Use a capture filter to limit the capture to communication with the single server:

port 8086 and host localhost

I'm not sure if HTTPS will be an issue here. It may not be possible to inspect the important part of the data with encryption in place, but we can try without it first.

@Trovalo
Collaborator Author

Trovalo commented Jul 16, 2020

I've tried Wireshark. I can get the encrypted data but I'm unable to decrypt it.
I've tried several ways, from simple RSA to a (Pre)-Master-Secret file, which I'm unable to generate.

If you don't have any additional suggestions, I would consider adding some tracing code in Telegraf itself in order to get the needed data.

@janegilring

FYI: I had a similar error (EOF) when trying to run InfluxDB on Azure Container Instances. It turned out that this setup did not provide the required storage performance, as it was using Azure Files/Samba for the backend storage. Hence, a tip would be to look into the backend storage for InfluxDB and, if possible, test against a different instance using other storage.

More details:
#7650

@danielnelson danielnelson removed their assignment Sep 1, 2021
@powersj
Contributor

powersj commented Jun 20, 2023

I am going to close this issue.

The issue has not had many updates in a while. While I know there have been reports of EOF errors coming up in the past, they can normally be traced to networking-related issues.

Users of the influxdb_v2 plugin have seen this for similar reasons. Additionally, we got a fix in upstream Go to better handle HTTP/2 connections where this was seen.

@powersj powersj closed this as not planned Won't fix, can't repro, duplicate, stale Jun 20, 2023

6 participants