Add runtime stats logger and pprof #3001

Merged
merged 1 commit into from
Sep 1, 2021

@angelcar angelcar commented Aug 26, 2021

Summary

This change adds a new log file, runtime-stats.log, which contains Go runtime statistics. The intention is to have more information at runtime about memory, CPU, and goroutine usage.

In addition, the ECS_ENABLE_PPROF configuration is introduced in order to enable pprof.

Both of these changes aim to better assess issues such as #2865.

Implementation details

Runtime stats are retrieved by periodically calling runtime.ReadMemStats and runtime.NumGoroutine, and the results are logged to the new runtime-stats.log file. Stats are logged every 5 minutes. The interval is not configurable, since there seems to be no justification for making it so; the list of configuration variables is already large and might be confusing to some, so it should not grow unless strictly necessary.
This log is always enabled and is treated as just another agent log file.

Pprof is only enabled on demand via the ECS_ENABLE_PPROF configuration. In this case, a new configuration was needed since users might not want/need to have pprof enabled in production environments; hence, the configuration is disabled by default.

Testing

Manually tested that pprof is enabled and functional when ECS_ENABLE_PPROF=true, and not reachable otherwise.

Also tested that the runtime stats log logs successfully every 5 minutes. Below is a sample log entry:

FreesCount=416081, HeapMemoryAlloc=5881688, HeapMemoryInUse=8232960, HeapMemoryIdle=57434112, HeapMemoryReleased=56737792, MallocCount=441929, StackMemoryInUse=1441792, GarbageCollectionCount=7, GarbageCollectionTime=453.039µs, GCCPUFraction=0.000016, GoroutineCount=124, UpTime=3h25m0.000145476s

Unit tests were also added or modified as necessary.

New tests cover the changes: yes

Description for the changelog

  • Add runtime-stats log file to periodically log agent's runtime stats such as used memory and CPU
  • Add new configuration setting to enable/disable pprof

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@angelcar angelcar force-pushed the runtime-stats branch 2 times, most recently from 6b8786f to 47c3826 Compare August 26, 2021 21:28
@angelcar angelcar marked this pull request as ready for review August 26, 2021 21:52
@angelcar angelcar force-pushed the runtime-stats branch 2 times, most recently from 28ea88c to aabeeb9 Compare August 26, 2021 22:05
@angelcar angelcar changed the title Add runtime stats logger Add runtime stats logger and pprof Aug 26, 2021
@angelcar angelcar force-pushed the runtime-stats branch 4 times, most recently from 96ebb85 to e850105 Compare August 27, 2021 17:31
mssrivas previously approved these changes Aug 27, 2021

@fenxiong fenxiong left a comment


Looks good, one question.

agent/app/agent.go (resolved)

@sharanyad sharanyad left a comment


is there any reason why pprof output is through introspection endpoint, while runtime stats is through a file?
why not redirect pprof output to a file, so we can capture those as part of ECS logs collector as well?

agent/logger/runtime_stats_logger.go (outdated, resolved)
@angelcar

is there any reason why pprof output is through introspection endpoint, while runtime stats is through a file?
why not redirect pprof output to a file, so we can capture those as part of ECS logs collector as well?

The HTTP pprof endpoints are a standard way in which Go exposes profiling data. For example, tools like go tool pprof can communicate with these endpoints to retrieve and interpret profiling information. If a file is needed at some point, it can easily be generated by invoking the pprof HTTP endpoint with curl and redirecting the output to a file (as I documented in the README of this PR). For more information, see: https://pkg.go.dev/net/http/pprof

Add pprof to agent introspection endpoint

fixup! Add runtime stats logger