Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing task volumes in task payload for EBS-backed tasks #3975

Merged
merged 1 commit into from
Oct 21, 2023

Conversation

mye956
Copy link
Contributor

@mye956 mye956 commented Oct 20, 2023

Summary

This PR will address the current issue where if we launch an EBS-backed task, the list of volumes we get from the payload to be invalid in the eyes of agent. An example:

"Volumes": [{
            "DockerVolumeConfiguration": null,
            "EbsVolumeConfiguration": null,
            "EfsVolumeConfiguration": null,
            "FsxWindowsFileServerVolumeConfiguration": null,
            "Host": null,
            "Name": "someName",
            "Type": null
 }]

By default if the volume type is empty or nil, agent will fall back on volume type host and then look to try to parse whatever is in the Host field of the volume object for a host volume configuration. However, since everything except the volume name is empty, it will fail with the following error:

taskStatus="STOPPED" taskReason="UnrecogniedTaskError: Error loading task - invalid volume: empty volume configuration" taskKnownSentStatus="NONE" taskPullStartedAt="0001-01-01T00:00:00Z" taskPullStoppedAt="0001-01-01T00:00:00Z"

which isn't desired for EBS-backed tasks since the EBS volume is in the task attachments.

To accommodate this, we'll mark the corresponding volume object with the same volume name as the EBS volume attachment object to be of type attachment. This volume object will ultimately be replaced when we're parsing the EBS volume from the list of task attachments.

Implementation details

Implemented new functionality in:

  • agent/acs/session/payload_responder.go:
    • New functionality called hasEBSAttachment to check if the task payload that we've received has EBS attachments within the list task attachments. If there are then we'll return the volume name.
    • New functionlality called initializeAttachmentTypeVolume to find a volume object within the list of task volumes by volume name and then initialize/change the type to attachment. This task volume object will ultimately be replaced by the volume defined in the list of task attachments. This is currently only applicable to EBS volumes.
  • agent/api/task/task.go -> agent/api/task/task_attachment_handler.go: Moved functionality to clean up and remove all non-EBS volume configuration with the same volume name as EBS volume configuration within the list of task volumes.
  • agent/api/task/taskvolume.go: Added a new task volume type called attachment for EBS volumes found in the list of task attachments within the payload.

Testing

Unit tests

Manual testing:
Ran an EBS-backed task and it was able to successfully start one up.

Before changing the task volume type

"Volumes": [{
            "DockerVolumeConfiguration": null,
            "EbsVolumeConfiguration": null,
            "EfsVolumeConfiguration": null,
            "FsxWindowsFileServerVolumeConfiguration": null,
            "Host": null,
            "Name": "ebs1",
            "Type": null
        }],

After changing the task volume type

"volumes": [{
        "name": "ebs1",
        "type": "attachment"
    }]

EBS-backed task was started up

CONTAINER ID   IMAGE                            COMMAND                  CREATED              STATUS                    PORTS     NAMES
3e9e05f65c17   mreferre/yelb-db:0.5             "docker-entrypoint.s…"   About a minute ago   Up About a minute                   ecs-evm-db-6-db-9aa3bedbfcb8f1ec9201
4df58e21e339   amazon/amazon-ecs-agent:latest   "/agent"                 25 minutes ago       Up 25 minutes (healthy)             ecs-agent
e137050d8a00   ebs-csi-driver:latest            "/bin/ebs-csi-driver…"   5 hours ago          Up 5 hours                          ecs---ecs-managed-ebs-csi-driver-d2d7c8ddd1f08c91bc01

Mountpoint for the task

"Mounts": [
            {
                "Type": "bind",
                "Source": "/mnt/ecs/ebs/7bcd9e3ff65a489b9e1c43bfd6592384_vol-0501243308b3d16bc",
                "Destination": "/var/stuff",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }

New tests cover the changes:

Description for the changelog

Fix invalid task volumes field for EBS-backed task payload

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mye956 mye956 changed the title Fixing task volumes in task payload for EBS-backed tasks [WIP] Fixing task volumes in task payload for EBS-backed tasks Oct 20, 2023
@mye956 mye956 force-pushed the ebsFixTaskPayload branch 2 times, most recently from ebfed4e to 5a61de2 Compare October 20, 2023 22:13
@mye956 mye956 force-pushed the ebsFixTaskPayload branch from 5a61de2 to 7ecf228 Compare October 20, 2023 22:43
@mye956 mye956 marked this pull request as ready for review October 20, 2023 22:56
@mye956 mye956 requested a review from a team as a code owner October 20, 2023 22:57
@mye956 mye956 changed the title [WIP] Fixing task volumes in task payload for EBS-backed tasks Fixing task volumes in task payload for EBS-backed tasks Oct 20, 2023
@@ -306,3 +317,26 @@ func isTaskStatusStopped(status apitaskstatus.TaskStatus) bool {
func isTaskStatusNotStopped(status apitaskstatus.TaskStatus) bool {
return status != apitaskstatus.TaskStopped
}

func hasEBSAttachment(acsTask *ecsacs.Task) (string, bool) {
// TODO: This will only work if there's one EBS volume per task. If we there is a case where we have multi-attach for a task, this needs to be modified
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit just leave this comment at "// TODO: This will only work if there's one EBS volume per task. "

@@ -106,6 +107,16 @@ func (pmHandler *payloadMessageHandler) addPayloadTasks(payload *ecsacs.PayloadM
allTasksOK = false
continue
}

// Note: If we receive an EBS-backed task, we'll also received an incomplete volume configuration in the list of Volumes
// To accomodate this, we'll first check if the task IS EBS-backed then we'll mark the corresponding Volume object to be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is ebs-specific today but will probably be true for other volumes in future.
Can we just say "// To accommodate new volume patterns, we'll first check if the attachment properties include volume configuration then we'll mark the corresponding volume object to be of type attachment to designate source of truth.

task.Volumes = append(task.Volumes, taskVolume)
}
// We're removing all incorrect volume configuration that were intially passed over from ACS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "all incorrect volume configurations"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are saying that the task payload we get from ECS backend is incorrect? That reads very odd to me. Is there a better way to put it?

Copy link
Contributor Author

@mye956 mye956 Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

er sorry perhaps we can say "incomplete" here or "temporary"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even incomplete sounds wrong... non-blocking but let's rethink to prevent future confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completely agree

@@ -78,6 +79,9 @@ func (tv *TaskVolume) UnmarshalJSON(b []byte) error {
return tv.unmarshalFSxWindowsFileServerVolume(intermediate["fsxWindowsFileServerVolumeConfiguration"])
case apiresource.EBSTaskAttach:
return tv.unmarshalEBSVolume(intermediate["ebsVolumeConfiguration"])
case AttachmentType:
seelog.Warn("Obtaining the volume configuration from task attachments.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this an expected path? if so can this be Debug or Info level?

Copy link
Contributor

@mythri-garaga mythri-garaga Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand, why is this no-op for AttachmentType? will this be updated in future PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, so we want this to be a noop for volume of type attachment since it'll end being removed and replaced by the newly created volume configuration that we make when handling the task attachments.

We originally thought to try to have a special edge case for volumes defined in the list of task attachments here in taskvolume.go but that would mean we'd have to change around some of the existing edge cases and can get messy.

Comment on lines +115 to +118
volName, ok := hasEBSAttachment(task)
if ok {
initializeAttachmentTypeVolume(task, volName)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this tested in any unit tests? The test update seems to be in the input and not in assertions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly, I was hoping to do this in a follow up. A lot of the testing was mostly done through manual testing.

@fierlion fierlion merged commit fc97633 into aws:dev Oct 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants