feat(glue-alpha): add job run queuing to Glue job #31830

Merged
18 changes: 18 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/README.md
@@ -127,6 +127,24 @@ The `sparkUI` property also allows the specification of an s3 bucket and a bucke

See [documentation](https://docs.aws.amazon.com/glue/latest/dg/add-job.html) for more information on adding jobs in Glue.

### Enable Job Run Queuing

AWS Glue job run queuing monitors your account-level quotas and limits. If they are insufficient to start a Glue job run, AWS Glue automatically queues the job run and waits for limits to free up. Once limits become available, AWS Glue retries the job run. Job runs queue for limits such as the maximum concurrent job runs per account, the maximum concurrent Data Processing Units (DPUs), and resources that are unavailable because of IP address exhaustion in Amazon Virtual Private Cloud (Amazon VPC).

Enable job run queuing by setting the `jobRunQueuingEnabled` property to `true`.

```ts
new glue.Job(this, 'EnableRunQueuing', {
  jobName: 'EtlJobWithRunQueuing',
  executable: glue.JobExecutable.pythonEtl({
    glueVersion: glue.GlueVersion.V4_0,
    pythonVersion: glue.PythonVersion.THREE,
    script: glue.Code.fromAsset(path.join(__dirname, 'job-script', 'hello_world.py')),
  }),
  jobRunQueuingEnabled: true,
});
```

## Connection

A `Connection` allows Glue jobs, crawlers and development endpoints to access certain types of data stores. For example, to create a network connection to connect to a data source within a VPC:
18 changes: 18 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/job.ts
@@ -502,6 +502,16 @@ export interface JobProps {
*/
readonly description?: string;

/**
* Specifies whether job run queuing is enabled for the job runs for this job.
* A value of true means job run queuing is enabled for the job runs.
* If false or not populated, the job runs will not be considered for queuing.
* If this field does not match the value set in the job run, then the value from the job run field will be used.
*
* @default - no job run queuing
*/
readonly jobRunQueuingEnabled?: boolean;

/**
* The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs.
* Cannot be used for Glue version 2.0 and later - workerType and workerCount should be used instead.
@@ -722,6 +732,9 @@ export class Job extends JobBase {
if (props.workerType && (props.workerType !== WorkerType.G_1X && props.workerType !== WorkerType.G_2X)) {
throw new Error('FLEX ExecutionClass is only available for WorkerType G_1X or G_2X');
}
if (props.jobRunQueuingEnabled === true) {
throw new Error('FLEX ExecutionClass is only available if job run queuing is disabled');
}
}

let maxCapacity = props.maxCapacity;
@@ -743,6 +756,10 @@ export class Job extends JobBase {
throw new Error('Both workerType and workerCount must be set');
}

if (props.jobRunQueuingEnabled === true && props.maxRetries !== undefined && props.maxRetries > 0) {
Contributor review comment:

Please add the condition `!Token.isUnresolved(props.maxRetries)` to the if statement.

This will enable support when `maxRetries` is a token.

Suggested change
- if (props.jobRunQueuingEnabled === true && props.maxRetries !== undefined && props.maxRetries > 0) {
+ if (props.jobRunQueuingEnabled === true && props.maxRetries !== undefined && !Token.isUnresolved(props.maxRetries) && props.maxRetries > 0) {

throw new Error(`Maximum retries was set to ${props.maxRetries}, must be set to 0 with job run queuing enabled`);
}

const jobResource = new CfnJob(this, 'Resource', {
name: props.jobName,
description: props.description,
@@ -756,6 +773,7 @@ export class Job extends JobBase {
glueVersion: executable.glueVersion.name,
workerType: props.workerType?.name,
numberOfWorkers: props.workerCount,
jobRunQueuingEnabled: props.jobRunQueuingEnabled,
maxCapacity: props.maxCapacity,
maxRetries: props.maxRetries,
executionClass: props.executionClass,
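
Taken together, the two checks added in this file and the token guard requested in the review comment can be sketched as one dependency-free function. This is only an illustration of the rules, not the code in `job.ts`: `validateRunQueuing`, `QueuingProps`, and the injected `isUnresolved` predicate are hypothetical names, and `isUnresolved` merely stands in for CDK's `Token.isUnresolved` so the sketch runs without `aws-cdk-lib`.

```typescript
// Simplified stand-in for the ExecutionClass enum used in job.ts (assumption).
type ExecutionClass = 'FLEX' | 'STANDARD';

// Only the props that the queuing checks in this PR read.
interface QueuingProps {
  readonly jobRunQueuingEnabled?: boolean;
  readonly maxRetries?: number;
  readonly executionClass?: ExecutionClass;
}

// `isUnresolved` is a hypothetical hook standing in for Token.isUnresolved.
// The default treats every value as a concrete (resolved) number.
function validateRunQueuing(
  props: QueuingProps,
  isUnresolved: (value: unknown) => boolean = () => false,
): void {
  if (props.jobRunQueuingEnabled !== true) {
    return; // queuing is off or unset, so neither rule applies
  }
  if (props.executionClass === 'FLEX') {
    throw new Error('FLEX ExecutionClass is only available if job run queuing is disabled');
  }
  const retries = props.maxRetries;
  // Skip the numeric comparison for tokens: their value is not known at synth time.
  if (retries !== undefined && !isUnresolved(retries) && retries > 0) {
    throw new Error(`Maximum retries was set to ${retries}, must be set to 0 with job run queuing enabled`);
  }
}

// Usage: passes silently, because maxRetries is 0.
validateRunQueuing({ jobRunQueuingEnabled: true, maxRetries: 0 });
```

Injecting the predicate keeps the rule readable in isolation; in the construct itself the guard would call `Token.isUnresolved(props.maxRetries)` directly, as the review comment asks.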

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

@@ -1754,6 +1754,127 @@
},
"WorkerType": "G.1X"
}
},
"EtlJobWithRunQueuingServiceRole33547334": {
"Type": "AWS::IAM::Role",
"Properties": {
"AssumeRolePolicyDocument": {
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": "glue.amazonaws.com"
}
}
],
"Version": "2012-10-17"
},
"ManagedPolicyArns": [
{
"Fn::Join": [
"",
[
"arn:",
{
"Ref": "AWS::Partition"
},
":iam::aws:policy/service-role/AWSGlueServiceRole"
]
]
}
]
}
},
"EtlJobWithRunQueuingServiceRoleDefaultPolicy5725F511": {
"Type": "AWS::IAM::Policy",
"Properties": {
"PolicyDocument": {
"Statement": [
{
"Action": [
"s3:GetBucket*",
"s3:GetObject*",
"s3:List*"
],
"Effect": "Allow",
"Resource": [
{
"Fn::Join": [
"",
[
"arn:",
{
"Ref": "AWS::Partition"
},
":s3:::",
{
"Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
},
"/*"
]
]
},
{
"Fn::Join": [
"",
[
"arn:",
{
"Ref": "AWS::Partition"
},
":s3:::",
{
"Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
}
]
]
}
]
}
],
"Version": "2012-10-17"
},
"PolicyName": "EtlJobWithRunQueuingServiceRoleDefaultPolicy5725F511",
"Roles": [
{
"Ref": "EtlJobWithRunQueuingServiceRole33547334"
}
]
}
},
"EtlJobWithRunQueuingA1B098B5": {
"Type": "AWS::Glue::Job",
"Properties": {
"Command": {
"Name": "glueetl",
"PythonVersion": "3",
"ScriptLocation": {
"Fn::Join": [
"",
[
"s3://",
{
"Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
},
"/432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855.py"
]
]
}
},
"DefaultArguments": {
"--job-language": "python"
},
"GlueVersion": "4.0",
"JobRunQueuingEnabled": true,
"Name": "EtlJobWithRunQueuing",
"Role": {
"Fn::GetAtt": [
"EtlJobWithRunQueuingServiceRole33547334",
"Arn"
]
}
}
}
},
"Parameters": {
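
To check a snapshot like this one for the new property without reading it by eye, a small helper can scan the template for Glue jobs that have `JobRunQueuingEnabled` set. The sketch below is an assumption-labeled illustration: `jobsWithRunQueuing` and the `Template` types are hypothetical names, and the `snapshot` object is a trimmed copy of the resources shown above rather than the full generated file.

```typescript
// Minimal shape of the template parts this sketch reads (hypothetical types).
interface TemplateResource {
  readonly Type: string;
  readonly Properties?: Record<string, unknown>;
}

interface Template {
  readonly Resources: Record<string, TemplateResource>;
}

// Returns the logical IDs of AWS::Glue::Job resources with JobRunQueuingEnabled set to true.
function jobsWithRunQueuing(template: Template): string[] {
  return Object.entries(template.Resources)
    .filter(([, res]) => res.Type === 'AWS::Glue::Job' && res.Properties?.JobRunQueuingEnabled === true)
    .map(([logicalId]) => logicalId);
}

// Trimmed copy of the snapshot: one role and the Glue job from the diff.
const snapshot: Template = {
  Resources: {
    EtlJobWithRunQueuingServiceRole33547334: { Type: 'AWS::IAM::Role' },
    EtlJobWithRunQueuingA1B098B5: {
      Type: 'AWS::Glue::Job',
      Properties: { GlueVersion: '4.0', JobRunQueuingEnabled: true, Name: 'EtlJobWithRunQueuing' },
    },
  },
};

// Logs the logical ID of the one Glue job that has queuing enabled.
console.log(jobsWithRunQueuing(snapshot));
```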