Add more validation of stats received from Docker #4295

Merged (1 commit) on Aug 18, 2024

@danehlim danehlim commented Aug 17, 2024

Summary

Add validation of container stats received from Docker so that a stat is considered invalid if it has a read time of 0001-01-01T00:00:00Z (i.e., the zero value of type time.Time) and the associated container has a restart policy enabled.

This is so that if Docker does send ECS Agent a stat with a read time of 0001-01-01T00:00:00Z for a container with a container restart policy enabled, then ECS Agent drops/filters it out.
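The check described above could look roughly like the following sketch. It is not the agent's actual validateDockerStats code: the containerStat type, the validateStat helper, and the restartPolicyEnabled parameter are hypothetical names introduced only for illustration. The one real API it relies on is time.Time.IsZero, which reports whether a time is 0001-01-01T00:00:00Z.

```go
package main

import (
	"errors"
	"time"
)

// containerStat is a hypothetical, pared-down stand-in for the stats payload
// reported by Docker; only the read timestamp matters for this sketch.
type containerStat struct {
	Read time.Time
}

// validateStat is a hypothetical helper (not the agent's validateDockerStats).
// It rejects a nil stat, and rejects a stat whose read time is the zero value
// of time.Time when the container has a restart policy enabled.
func validateStat(stat *containerStat, restartPolicyEnabled bool) error {
	if stat == nil {
		return errors.New("stat is nil")
	}
	if restartPolicyEnabled && stat.Read.IsZero() {
		return errors.New("stat has a zero read time (0001-01-01T00:00:00Z)")
	}
	return nil
}
```

In this sketch a caller would drop any stat for which validateStat returns a non-nil error instead of adding it to the stats queue.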

Implementation details

  • Update existing validateDockerStats function to consider stats with read time of 0001-01-01T00:00:00Z as invalid for containers with a container restart policy enabled
  • Update getAggregatedDockerStatAcrossRestarts function to set an aggregated stat's preread time to be the read time of the last stat stored in the stats queue, in case ECS Agent does receive from Docker a stat with a read time of 0001-01-01T00:00:00Z directly prior to the current stat being processed
  • Enhance logging to indicate what error was encountered (if any) during container stats collection (this allows visibility into specifically what errors ECS Agent could face, including receiving an invalid stat from Docker)
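As a rough illustration of the second bullet, the sketch below overrides an aggregated stat's preread time with the read time of the last stat already stored in the queue. The stat type, its field names, and the withPreReadFromQueue helper are assumptions made for this example; they are not taken from getAggregatedDockerStatAcrossRestarts.

```go
package main

import "time"

// stat is a hypothetical, simplified stats record for this sketch.
type stat struct {
	Read    time.Time // when this stat was read
	PreRead time.Time // read time of the stat that preceded it
}

// withPreReadFromQueue returns a copy of current whose PreRead is the Read
// time of the last stat in queue. If the queue is empty, current is returned
// unchanged. Taking the value from the queue means a dropped stat with a zero
// read time, received just before current, cannot leak into PreRead.
func withPreReadFromQueue(queue []stat, current stat) stat {
	if len(queue) == 0 {
		return current
	}
	current.PreRead = queue[len(queue)-1].Read
	return current
}
```

The design choice illustrated here is to trust only stats that already passed validation and were stored, rather than whatever preread value accompanies the incoming stat.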

Testing

Automated PR tests.

Also tested, via an internal functional test on the latest AL2 AMD64 and AL2023 AMD64 ECS-Optimized AMIs, that the container metadata aggregation across restarts logic from #4206 works as expected even when invalid stats are received from Docker.

New tests cover the changes: yes

Description for the changelog

Add more validation of stats received from Docker

Additional Information

Does this PR include breaking model changes? If so, have you added transformation functions?
No

Does this PR include the addition of new environment variables in the README?
No

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@danehlim danehlim force-pushed the filter-out-invalid-stats branch from 691449e to c979c34 Compare August 17, 2024 15:52
@danehlim danehlim force-pushed the filter-out-invalid-stats branch from c979c34 to f2db481 Compare August 17, 2024 16:03
@danehlim danehlim marked this pull request as ready for review August 17, 2024 18:19
@danehlim danehlim requested a review from a team as a code owner August 17, 2024 18:19
@danehlim danehlim force-pushed the filter-out-invalid-stats branch from f2db481 to dec4150 Compare August 17, 2024 19:45
@danehlim danehlim force-pushed the filter-out-invalid-stats branch from dec4150 to 699b4f2 Compare August 17, 2024 20:13
@danehlim danehlim force-pushed the filter-out-invalid-stats branch from 699b4f2 to 57f030f Compare August 17, 2024 22:04
logger.Debug(fmt.Sprintf(
	"Error processing stats stream of container, backing off %s before reopening", d), logger.Fields{
	loggerfield.DockerId: dockerID,
	loggerfield.Error:    err,

nice!

@danehlim danehlim merged commit 5c60c23 into aws:dev Aug 18, 2024
40 checks passed

4 participants