Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a issue that agent does not clean task execution credentials from credential manager when stopping a task #2993

Merged
merged 1 commit into from
Sep 9, 2021

Conversation

Realmonia
Copy link
Contributor

@Realmonia Realmonia commented Aug 20, 2021

Summary

Fix a issue that agent does not clean task execution credentials from credential manager when stopping a task

Implementation details

Introducing task execution credential removal to cleanupTask method

Testing

make test

Tested with an al2 instance, stop task with taskExecutionRole seems working fine.

Verify if this is related to memory leak issue: tried with a service of which the task definition makes task exits immediately. (run exit in command) and set desired task count to be 10 and run for about 48h. Both instances start with about the same memory usage (13MiB/31.15GiB)

Result

Using dev branch

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
e92dee603223        ecs-agent           6.92%               468.1MiB / 31.15GiB   1.47%               0B / 0B             0B / 7.63GB         17

Using branch with fix

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
ba0db2aaf082        ecs-agent           7.79%               450.2MiB / 31.15GiB   1.41%               0B / 0B             0B / 7.59GB         18

delta is about 18MiB.

The result shows that this issue is a very minor contributing factor to agent memory leak issue. More research on the memory leak issue will be done based on #3001

New tests cover the changes: yes

Description for the changelog

Fix a issue that agent does not clean task execution credentials from credential manager when stopping a task

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@Realmonia Realmonia marked this pull request as draft August 22, 2021 00:40
@yinyic
Copy link
Contributor

yinyic commented Aug 23, 2021

It's definitely the right thing to do to clean up execution role creds. I'm curious how confident we are that this is the issue contributing to the memory leak? The structure used for storing creds contains string fields only (and short ones too it seems), which usually won't take megabytes of memory I feel like.

There could be something else i'm missing. I think it would be great if we could run the experiment a few more times and observe the consistency.

Copy link
Contributor

@sharanyad sharanyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the testing, it looks like some amount of memory leak is mitigated with this change, but not fully. is that right?

agent/engine/task_manager.go Outdated Show resolved Hide resolved
@mythri-garaga
Copy link
Contributor

May be this questions is out of this PR scope but I am wondering if we are currently only cleaning up the latest task and execution credentials received from ACS credentials refresh payload at the time of task deletion. The previous expired credentials still live in the credential manager?

@Realmonia
Copy link
Contributor Author

May be this questions is out of this PR scope but I am wondering if we are currently only cleaning up the latest task and execution credentials received from ACS credentials refresh payload at the time of task deletion. The previous expired credentials still live in the credential manager?

Seems so, we are not doing credentials clean up based on credential expiration time.

@Realmonia Realmonia force-pushed the cleanupExecutionCredentials branch 3 times, most recently from ac0520c to cdfc4de Compare August 27, 2021 18:47
@Realmonia Realmonia marked this pull request as ready for review September 2, 2021 19:29
@Realmonia Realmonia self-assigned this Sep 2, 2021
@Realmonia Realmonia requested a review from a team September 2, 2021 19:30
sharanyad
sharanyad previously approved these changes Sep 2, 2021
Copy link
Contributor

@sharanyad sharanyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment, changes lgtm

agent/engine/task_manager.go Outdated Show resolved Hide resolved
@Realmonia
Copy link
Contributor Author

Wait for the release is done to merge --- this is not in the planned release content.

@Realmonia Realmonia merged commit 1fd2d38 into aws:dev Sep 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants