Merge branch 'boltdb' into dev #2562

fenxiong · 2020-08-11T21:38:22Z

Summary

Merge branch 'boltdb' into dev.

Implementation details

No merge conflict.

Testing

Functional test/manual test on the boltdb branch.

Description for the changelog

Enhancement - The agent's internal state management mechanism is changed from a custom JSON state file to boltdb. This change is made to reduce the agent's resource consumption, especially under high task density or mutation rate.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Details:
* data: implemented SaveContainer, SaveDockerContainer, DelContainer, GetContainers, SaveTask, DelTask, GetTasks. Added a few helper functions to cover common boltdb interaction.
* api/container: added a new field TaskARN to allow easier generation of key when saving a container to db. This field will be populated with correct value in PostUnmarshalTask in a later code change.
* utils: added a helper function to get task id from task arn.
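As an illustration of the key generation mentioned above, here is a minimal Go sketch. The function names `taskIDFromARN` and `containerKey`, and the `<task id>-<container name>` key layout, are assumptions made for this example; they are not taken from the agent's code. The sketch only relies on the task id being the last `/`-separated segment of the ARN's resource part.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// taskIDFromARN (hypothetical name) returns the last "/"-separated segment
// of the resource part of a task ARN.
func taskIDFromARN(taskARN string) (string, error) {
	// An ARN has six colon-separated fields:
	// arn:partition:service:region:account-id:resource
	fields := strings.SplitN(taskARN, ":", 6)
	if len(fields) != 6 || fields[0] != "arn" {
		return "", errors.New("invalid task arn: " + taskARN)
	}
	resource := fields[5]
	idx := strings.LastIndex(resource, "/")
	if idx < 0 || idx == len(resource)-1 {
		return "", errors.New("task arn has no task id: " + taskARN)
	}
	return resource[idx+1:], nil
}

// containerKey (hypothetical name and key layout) builds a database key
// for a container from its task ARN and container name.
func containerKey(taskARN, containerName string) (string, error) {
	id, err := taskIDFromARN(taskARN)
	if err != nil {
		return "", err
	}
	return id + "-" + containerName, nil
}

func main() {
	key, err := containerKey("arn:aws:ecs:us-west-2:123456789012:task/my-cluster/0123abcd", "web")
	fmt.Println(key, err)
}
```

Storing the task ARN on the container means a key like this can be derived from the container alone, without looking up its parent task.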

Changes are made in the following packages:
* acs/handler: save task to boltdb after adding the task to task engine;
* api/task: populate task arn to container in PostUnmarshalTask. This is used to generate database key when saving the container;
* app/agent: initialize boltdb data client and pass it to task engine, acs handler and event handler;
* data: added a no-op client which is used in testing and when ECS_CHECKPOINT is set to false;
* engine:
  - added a file data.go which covers interaction with boltdb;
  - task engine: remove task and containers data from database when cleaning up the task; added a method SetDataClient to set the client similar to how the state manager was set;
  - task manager: save task to boltdb when its desired/known status changes and when resource known status changes; save container to boltdb when its desired/known status changes and when updating its metadata. These are done in handleDesiredStatusChange, handleContainerChange and handleResourceStateChange in task_manager.go;
* eventhandler: save task/container in boltdb after updating their sent status.
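The no-op client mentioned above can be pictured with the following Go sketch. Everything here is an assumption for illustration: the trimmed-down `Client` interface, the `noopClient` and `memClient` types, and the `newClient` constructor are hypothetical, and `memClient` is an in-memory stand-in rather than a real boltdb client.

```go
package main

import "fmt"

// Task is a pared-down stand-in for the agent's task type.
type Task struct{ Arn string }

// Client is a hypothetical, trimmed-down version of a data client interface.
type Client interface {
	SaveTask(t *Task) error
	GetTasks() ([]*Task, error)
}

// noopClient discards every write and returns nothing on reads.
type noopClient struct{}

func (noopClient) SaveTask(*Task) error       { return nil }
func (noopClient) GetTasks() ([]*Task, error) { return nil, nil }

// memClient is an in-memory stand-in for a boltdb-backed client.
type memClient struct{ tasks map[string]*Task }

func (c *memClient) SaveTask(t *Task) error {
	c.tasks[t.Arn] = t
	return nil
}

func (c *memClient) GetTasks() ([]*Task, error) {
	out := make([]*Task, 0, len(c.tasks))
	for _, t := range c.tasks {
		out = append(out, t)
	}
	return out, nil
}

// newClient (hypothetical) picks the no-op client when checkpointing is off.
func newClient(checkpoint bool) Client {
	if !checkpoint {
		return noopClient{}
	}
	return &memClient{tasks: map[string]*Task{}}
}

func main() {
	for _, checkpoint := range []bool{false, true} {
		c := newClient(checkpoint)
		_ = c.SaveTask(&Task{Arn: "example-task"})
		tasks, _ := c.GetTasks()
		fmt.Println("checkpoint:", checkpoint, "tasks stored:", len(tasks))
	}
}
```

With this shape, callers such as a task engine or event handler always call the same interface and need no conditional on the checkpoint setting.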

Save various metadata to the metadata bucket in boltdb. Details:
* acs/handler: made changes to save task manifest seq num to boltdb;
* app: made changes to save agent version, availability zone, cluster name, container instance arn and ec2 instance id to boltdb; removed a redundant unit test TestDoStartHappyPath from agent_unix_test.go as it is covered by TestDoStartRegisterAvailabilityZone in agent_test.go which is basically the same, and renamed the latter as TestDoStartHappyPath;
* data: implemented SaveMetadata and GetMetadata.

For a task in awsvpc network mode, the task engine state holds a mapping between the task's local IP address and the task, and the mapping is saved as part of the state via the state manager. With the migration to boltdb, this mapping is no longer saved. To keep this information in boltdb, the IP address is added as a field of the task struct and is saved together with the task.
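A rough Go sketch of how such a field can stand in for the separately persisted mapping is below. The field name `LocalIPAddress` and the helper `buildIPToTask` are hypothetical, and rebuilding the mapping from loaded tasks is an assumption about how the stored field might be used, not a description of the agent's code.

```go
package main

import "fmt"

// Task is a pared-down stand-in for the agent's task struct.
type Task struct {
	Arn string
	// LocalIPAddress is a hypothetical field name; it is left empty for
	// tasks that are not in awsvpc network mode.
	LocalIPAddress string
}

// buildIPToTask (hypothetical) rebuilds the ip -> task mapping from tasks
// that were loaded from the database.
func buildIPToTask(tasks []*Task) map[string]*Task {
	m := make(map[string]*Task)
	for _, t := range tasks {
		if t.LocalIPAddress != "" {
			m[t.LocalIPAddress] = t
		}
	}
	return m
}

func main() {
	tasks := []*Task{
		{Arn: "task-1", LocalIPAddress: "169.254.172.2"},
		{Arn: "task-2"}, // not an awsvpc task, so no local ip address
	}
	for ip, t := range buildIPToTask(tasks) {
		fmt.Println(ip, "->", t.Arn)
	}
}
```

Because the address travels with the task record, no separate bucket or key scheme is needed for the mapping.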

Implemented logic for loading data from boltdb upon startup, while preserving backward compatibility by falling back to loading from the state file. Details:
* app:
  - In data.go, implemented method `loadData`, which loads data from the previous data file, either boltdb or the state file. In the latter case, data is migrated to boltdb after loading. Three cases are considered:
    1. Agent starts from a fresh instance (no previous state): (1) Try to load from boltdb, get nothing; (2) Try to load from state file, get nothing; (3) Return empty data.
    2. Agent starts with previous state stored in boltdb: (1) Try to load from boltdb, get the data; (2) Return loaded data.
    3. Agent starts with previous state stored in state file (i.e. it was just upgraded from an old agent that uses the state file): (1) Try to load from boltdb, get nothing; (2) Try to load from state file, get something; (3) Save loaded data to boltdb; (4) Return loaded data.
  - In agent.go, invoke the `loadData` method to load data, replacing the existing few lines of code that use the state manager to load data.
  - Updated a few unit tests to use the actual task engine state instead of a mock one, because with the changes it would be tedious to list all the expected calls to the engine state.
* engine: added a method SaveState which saves the whole task engine state to boltdb.
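The three startup cases above can be sketched in Go as follows. This is a simplified illustration under stated assumptions: the `State` type, the `store` and `writableStore` interfaces, and `memStore` are hypothetical stand-ins, and the `loadData` signature here is invented for the example rather than copied from data.go.

```go
package main

import "fmt"

// State is a stand-in for the data the agent restores on startup.
type State struct{ TaskARNs []string }

// empty reports whether nothing was loaded.
func (s *State) empty() bool { return s == nil || len(s.TaskARNs) == 0 }

// store and writableStore are hypothetical interfaces for this sketch.
type store interface {
	Load() (*State, error)
}

type writableStore interface {
	store
	Save(*State) error
}

// loadData tries the database first and falls back to the legacy state file.
func loadData(db writableStore, stateFile store) (*State, error) {
	s, err := db.Load()
	if err != nil {
		return nil, err
	}
	if !s.empty() {
		return s, nil // case 2: previous state is already in the database
	}
	s, err = stateFile.Load()
	if err != nil {
		return nil, err
	}
	if s.empty() {
		return &State{}, nil // case 1: fresh instance, nothing to load
	}
	// Case 3: upgraded from an old agent, so migrate the state file data.
	if err := db.Save(s); err != nil {
		return nil, err
	}
	return s, nil
}

// memStore is an in-memory stand-in for either backing store.
type memStore struct{ state *State }

func (m *memStore) Load() (*State, error) { return m.state, nil }
func (m *memStore) Save(s *State) error   { m.state = s; return nil }

func main() {
	db := &memStore{}
	legacy := &memStore{state: &State{TaskARNs: []string{"legacy-task"}}}
	s, err := loadData(db, legacy)
	fmt.Println(s.TaskARNs, err, "migrated:", db.state != nil)
}
```

Checking the database first means the state file is only read once after an upgrade; on later restarts the second case applies.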

fenxiong and others added 22 commits June 17, 2020 11:52

Added boltdb dependency to vendor.

64a8953

Added new data interface and dummy implementation.

cce6787

Added the new data interface for data management in agent. Added dummy boltdb implementation of the interface and its initialization.

Added boltdb implementation to get, update and delete image state. Up…

0a1b4e3

…dated ImageManager to use data Client instead of state manager to persist image states.

Added boltdb implementation to get, update and delete eni attachments…

8e4f86b

…. Updated agent to use data Client instead of state manager to persist eni attachment data

Save all state to boltdb in termination handler

bb836cc

Add method for task engine to load state from boltdb.

a59d9ea

Merge branch 'dev' into boltdb

0209f31

Merge pull request aws#2532 from fenxiong/boltdb-merge

000f6e4

Merge branch 'dev' into boltdb

Remove usages of state manager in acs and event handler.

482fd46

Also added logic to save attachment sent status as part of task state change, in agent/eventhandler/task_handler.go.

Add more unit tests to api/eni package.

6bd4e0f

Previous commit lowers the test coverage by 0.1% without obvious reason. Raising code coverage in a poorly covered package instead.

Remove state manager usage in task engine.

0f72765

Make sure ip <-> task mapping in state is correctly persisted.

ce953c6

Merge branch 'dev' into boltdb

cbb6d2f

Merge pull request aws#2553 from fenxiong/boltdb-merge

ca26790

Merge branch 'dev' into boltdb

Remove unnecessary db saves and use batch for db update.

876425a

Merge pull request aws#2552 from fenxiong/boltdb-update

b196631

Remove unnecessary db saves and use batch for db update

Merge branch 'boltdb' into dev

ad61341

fenxiong added the bot/test label Aug 11, 2020

amazon-ecs-bot removed the bot/test label Aug 11, 2020

fenxiong added this to the 1.44.0 milestone Aug 11, 2020

fenxiong added the bot/test label Aug 12, 2020

amazon-ecs-bot removed the bot/test label Aug 12, 2020

fenxiong marked this pull request as ready for review August 12, 2020 18:39

fenxiong requested a review from a team August 12, 2020 18:40

shubham2892 approved these changes Aug 12, 2020

yhlee-aws approved these changes Aug 12, 2020

fenxiong merged commit 8c9cc5b into aws:dev Aug 12, 2020
