StepFunctions: Integration pattern RUN_JOB has no effect on GlueStartJobRun #30735

Closed
lauragalera opened this issue Jul 2, 2024 · 4 comments
Labels
@aws-cdk/aws-stepfunctions Related to AWS StepFunctions bug This issue is a bug.


lauragalera commented Jul 2, 2024

Describe the bug

The console lets you run two jobs sequentially by enabling "Wait for task to complete - optional" on the GlueStartJobRun state. A closed issue indicates that the same behavior can be achieved in CDK with the construct property IntegrationPattern.RUN_JOB. However, CDK appears to ignore the property: after deploying, the resource appears without the .sync suffix.

Expected Behavior

The deployment should result in a state machine with the following code (notice the .sync):

{
  "StartAt": "StartJobIngestionMixpanel",
  "States": {
    "StartJobIngestionMixpanel": {
      "Next": "StartJobRedshiftMixpanel",
      "Retry": [
        {
          "ErrorEquals": [
            "States.TaskFailed"
          ],
          "IntervalSeconds": 300,
          "MaxAttempts": 2,
          "BackoffRate": 1
        }
      ],
      "Catch": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "Next": "JobFailed"
        }
      ],
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "JOB-00348-DEV-mixpanel-events-to-s3"
      }
    },
    "StartJobRedshiftMixpanel": {
      "End": true,
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "Parameters": {
        "JobName": "JOB-00351-DEV-mixpanel-events-to-redshift"
      }
    },
    "JobFailed": {
      "Type": "Fail",
      "Error": "Glue job failed after retries"
    }
  }
}

Current Behavior

The resulting code:

{
  "StartAt": "StartJobIngestionMixpanel",
  "States": {
    "StartJobIngestionMixpanel": {
      "Next": "StartJobRedshiftMixpanel",
      "Retry": [
        {
          "ErrorEquals": [
            "States.TaskFailed"
          ],
          "IntervalSeconds": 300,
          "MaxAttempts": 2,
          "BackoffRate": 1
        }
      ],
      "Catch": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "Next": "JobFailed"
        }
      ],
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "Parameters": {
        "JobName": "JOB-00348-DEV-mixpanel-events-to-s3"
      }
    },
    "StartJobRedshiftMixpanel": {
      "End": true,
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "Parameters": {
        "JobName": "JOB-00351-DEV-mixpanel-events-to-redshift"
      }
    },
    "JobFailed": {
      "Type": "Fail",
      "Error": "Glue job failed after retries"
    }
  }
}
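
Since the two definitions differ only in one "Resource" value, a small check can surface the difference without comparing the JSON by eye. The sketch below is plain TypeScript and not part of the CDK; `tasksWithoutSync` and the trimmed `AslDefinition` type are hypothetical names used for illustration, and it assumes the definition has already been parsed from JSON.

```typescript
// Minimal shape of an Amazon States Language definition; only the fields used here.
interface AslState {
  Type: string;
  Resource?: string;
}

interface AslDefinition {
  StartAt: string;
  States: Record<string, AslState>;
}

// Returns the names of Task states whose Resource does not end in ".sync".
function tasksWithoutSync(definition: AslDefinition): string[] {
  return Object.keys(definition.States).filter((name) => {
    const state = definition.States[name];
    return state.Type === 'Task' && !/\.sync$/.test(state.Resource || '');
  });
}

// Trimmed copy of the definition reported under "Current Behavior".
const current: AslDefinition = {
  StartAt: 'StartJobIngestionMixpanel',
  States: {
    StartJobIngestionMixpanel: { Type: 'Task', Resource: 'arn:aws:states:::glue:startJobRun' },
    StartJobRedshiftMixpanel: { Type: 'Task', Resource: 'arn:aws:states:::glue:startJobRun' },
    JobFailed: { Type: 'Fail' },
  },
};

// Lists both Task states, because neither Resource carries the suffix.
console.log(tasksWithoutSync(current));
```

Run against the expected definition, the same check would list only StartJobRedshiftMixpanel, which is intended: the second job is the last step and is not waited on.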

Reproduction Steps

Code for replication

        const jobIngestionMixpanel = new GlueStartJobRun(this, 'StartJobIngestionMixpanel', {
            glueJobName: 'JOB-00348-DEV-mixpanel-events-to-s3',
            integrationPattern: IntegrationPattern.RUN_JOB
        })

        jobIngestionMixpanel.addRetry({
            errors: ['States.TaskFailed'],
            backoffRate: 1,
            maxAttempts: 2,
            interval: Duration.minutes(5)
        })

        jobIngestionMixpanel.addCatch(new Fail(this, 'JobFailed', {
            error: 'Glue job failed after retries'
        }))

        const jobRedshiftMixpanel = new GlueStartJobRun(this, 'StartJobRedshiftMixpanel', {
            glueJobName: 'JOB-00351-DEV-mixpanel-events-to-redshift'
        })

        const definition = jobIngestionMixpanel.next(jobRedshiftMixpanel)

        new StateMachine(this, 'StateMachineJobs', {
            definitionBody: DefinitionBody.fromChainable(definition),
            role: roleStepFunction
        })

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.126.0

Framework Version

No response

Node.js Version

v20.11.0

OS

MacOS 14.2.1

Language

TypeScript

Language Version

No response

Other information

No response

@lauragalera lauragalera added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 2, 2024
@github-actions github-actions bot added the @aws-cdk/aws-stepfunctions Related to AWS StepFunctions label Jul 2, 2024
@ashishdhingra ashishdhingra self-assigned this Jul 2, 2024
@ashishdhingra ashishdhingra added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 2, 2024
@ashishdhingra
Contributor

The Run a Job (.sync) documentation states: "For integrated services such as AWS Batch and Amazon ECS, Step Functions can wait for a request to complete before progressing to the next state. To have Step Functions wait, specify the "Resource" field in your task state definition with the .sync suffix appended after the resource URI."
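
To make the suffix convention concrete, here is a minimal, self-contained sketch of how an integration pattern could map onto the "Resource" value. The `Pattern` enum and the `taskResourceArn` helper are hypothetical names for illustration and are not the CDK's API; the partition is assumed to be `aws`, and the `.sync` and `.waitForTaskToken` suffixes follow the Step Functions service integration patterns.

```typescript
// Hypothetical stand-in for the three Step Functions service integration patterns.
enum Pattern {
  REQUEST_RESPONSE = 'REQUEST_RESPONSE',
  RUN_JOB = 'RUN_JOB',
  WAIT_FOR_TASK_TOKEN = 'WAIT_FOR_TASK_TOKEN',
}

// Suffix appended to the Resource URI for each pattern.
const RESOURCE_SUFFIX: Record<Pattern, string> = {
  [Pattern.REQUEST_RESPONSE]: '',
  [Pattern.RUN_JOB]: '.sync',
  [Pattern.WAIT_FOR_TASK_TOKEN]: '.waitForTaskToken',
};

// Builds the Resource value for a service integration (partition assumed to be "aws").
function taskResourceArn(
  service: string,
  api: string,
  pattern: Pattern = Pattern.REQUEST_RESPONSE,
): string {
  return `arn:aws:states:::${service}:${api}${RESOURCE_SUFFIX[pattern]}`;
}

// prints "arn:aws:states:::glue:startJobRun.sync"
console.log(taskResourceArn('glue', 'startJobRun', Pattern.RUN_JOB));
// prints "arn:aws:states:::glue:startJobRun"
console.log(taskResourceArn('glue', 'startJobRun'));
```

The default of REQUEST_RESPONSE in this sketch mirrors the behavior reported in this issue: when no pattern reaches the task, the Resource is emitted without a suffix and the state machine does not wait for the job.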

@lauragalera Good morning. I'm unable to reproduce the issue on my end using CDK version 2.147.2. Deploying the following CDK stack:

import * as cdk from 'aws-cdk-lib';
import { DefinitionBody, Fail, IntegrationPattern, StateMachine } from 'aws-cdk-lib/aws-stepfunctions';
import { GlueStartJobRun } from 'aws-cdk-lib/aws-stepfunctions-tasks';
import { Construct } from 'constructs';

export class Issue30735Stack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const jobIngestionMixpanel = new GlueStartJobRun(this, 'StartJobIngestionMixpanel', {
      glueJobName: 'JOB-00348-DEV-mixpanel-events-to-s3',
      integrationPattern: IntegrationPattern.RUN_JOB
    })

    jobIngestionMixpanel.addRetry({
      errors: ['States.TaskFailed'],
      backoffRate: 1,
      maxAttempts: 2,
      interval: cdk.Duration.minutes(5)
    })

    jobIngestionMixpanel.addCatch(new Fail(this, 'JobFailed', {
      error: 'Glue job failed after retries'
    }))

    const jobRedshiftMixpanel = new GlueStartJobRun(this, 'StartJobRedshiftMixpanel', {
      glueJobName: 'JOB-00351-DEV-mixpanel-events-to-redshift'
    })

    const definition = jobIngestionMixpanel.next(jobRedshiftMixpanel)

    new StateMachine(this, 'StateMachineJobs', {
      definitionBody: DefinitionBody.fromChainable(definition)
    })
  }
}

generated a state machine with the following definition (notice that the first Task has the .sync suffix in Resource):

{
  "StartAt": "StartJobIngestionMixpanel",
  "States": {
    "StartJobIngestionMixpanel": {
      "Next": "StartJobRedshiftMixpanel",
      "Retry": [
        {
          "ErrorEquals": [
            "States.TaskFailed"
          ],
          "IntervalSeconds": 300,
          "MaxAttempts": 2,
          "BackoffRate": 1
        }
      ],
      "Catch": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "Next": "JobFailed"
        }
      ],
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "JOB-00348-DEV-mixpanel-events-to-s3"
      }
    },
    "StartJobRedshiftMixpanel": {
      "End": true,
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "Parameters": {
        "JobName": "JOB-00351-DEV-mixpanel-events-to-redshift"
      }
    },
    "JobFailed": {
      "Type": "Fail",
      "Error": "Glue job failed after retries"
    }
  }
}

Please try using the latest aws-cdk-lib package (and preferably latest CDK CLI version) and confirm if the issue is resolved.

Thanks,
Ashish

@ashishdhingra ashishdhingra added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Jul 2, 2024
@lauragalera
Author

Hello @ashishdhingra,

Indeed, changing to version 2.147.3 solved it.

Thank you

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jul 3, 2024

@aws aws locked as resolved and limited conversation to collaborators Jul 25, 2024

3 participants