nsq_to_file: log less (or support a "quiet mode") #880

Closed
dominicbarnes opened this issue Apr 8, 2017 · 5 comments

Comments

@dominicbarnes

Currently, nsq_to_file logs every time it flushes to disk, which adds up to a lot of output, especially when the NSQ topic carries a non-trivial volume of data. It would be really great if there were a way to tell nsq_to_file to log less (errors only, for example).

I know the internals just use the golang log package, so levels like "debug" aren't available as a feature there, but it would be great if something similar could be implemented.

@jehiah
Member

jehiah commented Apr 8, 2017

@dominicbarnes can you give some context as to your nsq_to_file configuration and volume? How often is the "syncing xxx" logging happening in practice for you?

For some context, I log many high-volume data streams, but I run my nsq_to_file with --max-in-flight=50000 to flush my gzip stream to disk less often. I end up with this log frequency as a result:

2017/04/08 00:45:44 syncing 42405 records to disk
2017/04/08 00:46:01 syncing 51935 records to disk

@dominicbarnes
Author

dominicbarnes commented Apr 8, 2017

I sampled a single machine, and I saw 50+ logs/second on average. Each had <100 records being flushed to disk. My relevant configuration looks like:

-max-in-flight 300
-rotate-interval 5m
-rotate-size 10485760 # 10MB

What config in general impacts how often flushing to disk happens?

@jehiah
Member

jehiah commented Apr 8, 2017

There are two aspects. One is -max-in-flight 300 combined with the number of connections. It's a little bit of a simplification, but when a single connection reaches its limit (max-in-flight / number of connections), messages get flushed to avoid issues in situations where the source hosts are imbalanced.

The second is that nsq_to_file will always flush at least once every 30 seconds.

@mreiferson
Member

Looks like #892 will land soon and we can apply the same to the other utility applications.

@mreiferson
Member

see #1117
