feat(glue): Job construct #12506

Merged
merged 67 commits into from
Sep 8, 2021
Changes from 2 commits
c158a05
feat(aws-glue): add Job construct (#12443)
humanzz Jan 10, 2021
9ea48b6
support job's event rules and rule-based metrics
humanzz Jan 14, 2021
638129c
add metric helper method
humanzz Jan 14, 2021
b99dbce
add JobSpecialArgumentNames for glue special parameters
humanzz Jan 15, 2021
06b4d55
rebase to use Connection and SecurityConfiguration
humanzz Feb 17, 2021
6bcdad8
address some comments
humanzz Jul 27, 2021
e0eb941
Merge branch 'master' into glue-job
humanzz Jul 27, 2021
b2d866f
improve docs
humanzz Jul 27, 2021
29ff157
rename JobCommandName constants and JobCommand methods
humanzz Jul 28, 2021
c615c8e
Merge branch 'master' into glue-job
humanzz Jul 29, 2021
c31d7e4
drop unnecessry toString() methods
humanzz Jul 30, 2021
f05afa7
indicate PythonVersion.TWO is the default for JobCommand
humanzz Jul 30, 2021
5a46dc5
make metricRule and buildJobArn protected
humanzz Jul 30, 2021
f59d688
address more comments
humanzz Jul 30, 2021
fba208d
drop jobRunId from metric()'s arguments
humanzz Aug 2, 2021
6032807
change how event.Rules caching is done
humanzz Aug 2, 2021
8b74200
Merge branch 'master' into glue-job
humanzz Aug 2, 2021
98c19df
drop JobSpecialArgumentNames
humanzz Aug 2, 2021
933ebcd
introduce JobExecutable and refactor accordingly
humanzz Aug 11, 2021
6b3d31e
refactor JobExecutable and add more tests
humanzz Aug 12, 2021
2f02798
add enableProfilingMetrics to JobProps
humanzz Aug 12, 2021
52ae192
Merge branch 'master' into glue-job
humanzz Aug 12, 2021
fe08089
add @aws-cdk/assert-internal to package.json after merge
humanzz Aug 12, 2021
55b5ee4
add sparkUI optional prop to JobProps
humanzz Aug 12, 2021
3235ae7
add continuousLogging optional prop to JobProps
humanzz Aug 13, 2021
ee74a8a
Merge branch 'master' into glue-job
humanzz Aug 13, 2021
1367813
Merge branch 'master' into glue-job
humanzz Aug 20, 2021
c3c9e80
Merge branch 'master' into glue-job
humanzz Aug 24, 2021
0ff9d01
add GlueVersion.V3_0
humanzz Aug 24, 2021
cd27ef7
Merge branch 'master' into glue-job
humanzz Aug 24, 2021
4353f22
Merge branch 'master' into glue-job
humanzz Aug 30, 2021
fb7e376
address smaller comments
humanzz Aug 30, 2021
818be70
Merge branch 'master' into glue-job
humanzz Aug 30, 2021
7072a6f
address metric comments
humanzz Aug 31, 2021
e860f4d
take 1 at glue.Code (not fulyl tested)
humanzz Aug 31, 2021
1068229
test glue.Code
humanzz Aug 31, 2021
bf896fa
Merge branch 'master' into glue-job
humanzz Aug 31, 2021
9f5f85a
address some comments
humanzz Aug 31, 2021
0f587c9
fix build issues from previous round of comments
humanzz Aug 31, 2021
87dee59
address comments
humanzz Aug 31, 2021
0ded0f2
refactor JobExecutableProps
humanzz Aug 31, 2021
82c1d98
drop @aws-cdk/aws-s3-assets from devDependencies
humanzz Aug 31, 2021
c81e736
restore docs about individual files support
humanzz Aug 31, 2021
8ce0fe8
apply suggestions from comments
humanzz Sep 1, 2021
2ed79e1
add optional role to JobAttributes
humanzz Sep 1, 2021
1eb9c20
drop @aws-cdk/assert-internal in favour of @aws-cdk/assertions
humanzz Sep 1, 2021
16e3265
Merge branch 'master' into glue-job
humanzz Sep 1, 2021
7df0c1f
update README
humanzz Sep 1, 2021
8884ad7
increase test coverage to 100% for the new files
humanzz Sep 1, 2021
13b03e8
increase test coverage to 100%
humanzz Sep 1, 2021
2b5f47e
Merge branch 'master' into glue-job
humanzz Sep 1, 2021
bc82d60
tweak tests
humanzz Sep 3, 2021
550b919
Merge branch 'master' into glue-job
humanzz Sep 3, 2021
70b3e24
tweak tests #2
humanzz Sep 3, 2021
a500cf2
Merge branch 'master' into glue-job
humanzz Sep 3, 2021
32ba2ae
remove role from IJob
BenChaimberg Sep 8, 2021
babd3ec
address some comments
humanzz Sep 8, 2021
7379066
Merge branch 'master' into glue-job
humanzz Sep 8, 2021
b62a868
simplify job.test.ts
humanzz Sep 8, 2021
80c3f15
simplify testing success/failure/timeout rules and metrics
humanzz Sep 8, 2021
094929c
better handling for extraPythonFiles with non-Python jobs
humanzz Sep 8, 2021
f01c0be
update integ.job.ts
humanzz Sep 8, 2021
cd2d2ee
fix issues identified trying to run jobs from integ tests
humanzz Sep 8, 2021
ea32eab
update integ test verification documentation
humanzz Sep 8, 2021
98cc575
update Code.bind signature and PythonShell supported glue versions
humanzz Sep 8, 2021
0328615
narrow the permissions granted by S3Code
humanzz Sep 8, 2021
6bb18eb
Merge branch 'master' into glue-job
humanzz Sep 8, 2021
2 changes: 1 addition & 1 deletion packages/@aws-cdk/aws-glue/README.md
@@ -76,7 +76,7 @@ This can be used to schedule and run tasks that don't require an Apache Spark en
```ts
 new glue.Job(stack, 'PythonShellJob', {
   executable: glue.JobExecutable.pythonShell({
-    glueVersion: glue.GlueVersion.V2_0,
+    glueVersion: glue.GlueVersion.V1_0,
     pythonVersion: PythonVersion.THREE,
     script: glue.Code.fromBucket(bucket, 'script.py'),
   }),
15 changes: 8 additions & 7 deletions packages/@aws-cdk/aws-glue/lib/code.ts
@@ -3,7 +3,7 @@ import * as fs from 'fs';
 import * as s3 from '@aws-cdk/aws-s3';
 import * as s3assets from '@aws-cdk/aws-s3-assets';
 import * as cdk from '@aws-cdk/core';
-import * as constructs from 'constructs';
+import { Job } from './';

 /**
  * Represents a Glue Job's Code assets (an asset can be a scripts, a jar, a python file or any other file).
@@ -31,7 +31,7 @@ export abstract class Code {
   /**
    * Called when the Job is initialized to allow this object to bind.
    */
-  public abstract bind(scope: constructs.Construct): CodeConfig;
+  public abstract bind(job: Job): CodeConfig;
 }

 /**
@@ -42,7 +42,8 @@ export class S3Code extends Code {
     super();
   }

-  public bind(_scope: constructs.Construct): CodeConfig {
+  public bind(job: Job): CodeConfig {
+    this.bucket.grantRead(job);
     return {
       s3Location: {
         bucketName: this.bucket.bucketName,
@@ -69,18 +70,18 @@ export class AssetCode extends Code {
     }
   }

-  public bind(scope: constructs.Construct): CodeConfig {
+  public bind(job: Job): CodeConfig {
     // If the same AssetCode is used multiple times, retain only the first instantiation.
     if (!this.asset) {
-      this.asset = new s3assets.Asset(scope, `Code${this.hashcode(this.path)}`, {
+      this.asset = new s3assets.Asset(job, `Code${this.hashcode(this.path)}`, {
         path: this.path,
         ...this.options,
       });
-    } else if (cdk.Stack.of(this.asset) !== cdk.Stack.of(scope)) {
+    } else if (cdk.Stack.of(this.asset) !== cdk.Stack.of(job)) {
       throw new Error(`Asset is already associated with another stack '${cdk.Stack.of(this.asset).stackName}'. ` +
         'Create a new Code instance for every stack.');
     }
-
+    this.asset.grantRead(job);
     return {
       s3Location: {
         bucketName: this.asset.s3BucketName,
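
The `bind` change in `code.ts` moves the read grant into the code object: binding to a job now both resolves the S3 location and grants that job read access. A minimal sketch of that pattern follows. `Grantable`, `FakeBucket`, `CodeSketch` and `S3CodeSketch` are hypothetical stand-ins written for illustration, not the real CDK classes, and the grant is modeled as a simple list of names rather than an IAM policy.

```ts
// Hypothetical stand-ins for the CDK types; none of these are real aws-cdk classes.
interface Grantable {
  readonly name: string;
}

interface CodeConfigSketch {
  readonly s3Location: { bucketName: string; objectKey: string };
}

// Records which principals were granted read access, so the effect of bind() is observable.
class FakeBucket {
  public readonly readers: string[] = [];

  constructor(public readonly bucketName: string) {}

  public grantRead(grantee: Grantable): void {
    if (this.readers.indexOf(grantee.name) === -1) {
      this.readers.push(grantee.name);
    }
  }
}

abstract class CodeSketch {
  // Mirrors the new signature: bind() receives the job rather than a bare scope.
  public abstract bind(job: Grantable): CodeConfigSketch;
}

class S3CodeSketch extends CodeSketch {
  constructor(private readonly bucket: FakeBucket, private readonly key: string) {
    super();
  }

  public bind(job: Grantable): CodeConfigSketch {
    // The grant happens as a side effect of binding, so callers cannot forget it.
    this.bucket.grantRead(job);
    return { s3Location: { bucketName: this.bucket.bucketName, objectKey: this.key } };
  }
}

const demoBucket = new FakeBucket('bucketName');
const demoJob: Grantable = { name: 'Job1' };
const demoConfig = new S3CodeSketch(demoBucket, 'script').bind(demoJob);
const scriptLocation = `s3://${demoConfig.s3Location.bucketName}/${demoConfig.s3Location.objectKey}`;

console.log(scriptLocation); // s3://bucketName/script
console.log(demoBucket.readers); // [ 'Job1' ]
```

Passing the job instead of a generic scope is what makes the grant possible here: a plain construct scope has no principal to grant to.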
2 changes: 1 addition & 1 deletion packages/@aws-cdk/aws-glue/lib/job-executable.ts
@@ -287,7 +287,7 @@ export class JobExecutable {
   if (config.language !== JobLanguage.PYTHON) {
     throw new Error('Python shell requires the language to be set to Python');
   }
-  if ([GlueVersion.V0_9, GlueVersion.V1_0].includes(config.glueVersion)) {
+  if ([GlueVersion.V0_9].includes(config.glueVersion)) {
     throw new Error(`Specified GlueVersion ${config.glueVersion.name} does not support Python Shell`);
   }
 }
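
The `job-executable.ts` change narrows the rejected set for Python shell jobs from Glue 0.9 and 1.0 to Glue 0.9 only. A simplified sketch of the resulting check is below. The string version names and the helper functions are assumptions made for illustration; the real code compares `GlueVersion` instances inside `JobExecutable`.

```ts
// Hypothetical, simplified model: plain version names stand in for GlueVersion instances.
type GlueVersionName = '0.9' | '1.0' | '2.0' | '3.0';

// After this change only Glue 0.9 is rejected for Python shell jobs.
const pythonShellUnsupported: ReadonlyArray<GlueVersionName> = ['0.9'];

function validatePythonShell(glueVersion: GlueVersionName): void {
  if (pythonShellUnsupported.indexOf(glueVersion) !== -1) {
    throw new Error(`Specified GlueVersion ${glueVersion} does not support Python Shell`);
  }
}

// Demo helper: returns the error message, or undefined when validation passes.
function validationError(glueVersion: GlueVersionName): string | undefined {
  try {
    validatePythonShell(glueVersion);
    return undefined;
  } catch (e) {
    return e instanceof Error ? e.message : String(e);
  }
}

console.log(validationError('1.0')); // undefined
console.log(validationError('0.9')); // Specified GlueVersion 0.9 does not support Python Shell
```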
208 changes: 189 additions & 19 deletions packages/@aws-cdk/aws-glue/test/code.test.ts
@@ -6,6 +6,7 @@ import * as glue from '../lib';

describe('Code', () => {
let stack: cdk.Stack;
let script: glue.Code;

beforeEach(() => {
stack = new cdk.Stack();
@@ -15,26 +16,185 @@ describe('Code', () => {
const key = 'script';
let bucket: s3.IBucket;

test('with valid bucket name and key and calling bind() returns correct s3 location', () => {
test('with valid bucket name and key and bound by job sets the right path and grants the job permissions to read from it', () => {
bucket = s3.Bucket.fromBucketName(stack, 'Bucket', 'bucketName');
expect(glue.Code.fromBucket(bucket, key).bind(stack)).toEqual({
s3Location: {
bucketName: 'bucketName',
objectKey: 'script',
script = glue.Code.fromBucket(bucket, key);
new glue.Job(stack, 'Job1', {
executable: glue.JobExecutable.pythonShell({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
});

Template.fromStack(stack).hasResourceProperties('AWS::Glue::Job', {
Command: {
ScriptLocation: 's3://bucketName/script',
},
});

// Role policy should grant reading from the assets bucket
Template.fromStack(stack).hasResourceProperties('AWS::IAM::Policy', {
PolicyDocument: {
Statement: [
{
Action: [
's3:GetObject*',
's3:GetBucket*',
's3:List*',
],
Effect: 'Allow',
Resource: [
{
'Fn::Join': [
'',
[
'arn:',
{
Ref: 'AWS::Partition',
},
':s3:::bucketName',
],
],
},
{
'Fn::Join': [
'',
[
'arn:',
{
Ref: 'AWS::Partition',
},
':s3:::bucketName/*',
],
],
},
],
},
],
},
Roles: [
{
Ref: 'Job1ServiceRole7AF34CCA',
},
],
});
});
});

describe('.fromAsset()', () => {
const filePath = path.join(__dirname, 'job-script/hello_world.py');
const directoryPath = path.join(__dirname, 'job-script');

test('with valid and existing file path and calling bind() returns an s3 location and sets metadata', () => {
const codeConfig = glue.Code.fromAsset(filePath).bind(stack);
expect(codeConfig.s3Location.bucketName).toBeDefined();
expect(codeConfig.s3Location.objectKey).toBeDefined();
beforeEach(() => {
script = glue.Code.fromAsset(filePath);
});

test("with valid and existing file path and bound to job sets job's script location and permissions stack metadata", () => {
new glue.Job(stack, 'Job1', {
executable: glue.JobExecutable.pythonShell({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
});

expect(stack.node.metadata.find(m => m.type === 'aws:cdk:asset')).toBeDefined();
Template.fromStack(stack).hasResourceProperties('AWS::Glue::Job', {
Command: {
ScriptLocation: {
'Fn::Join': [
'',
[
's3://',
{
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3Bucket4E517469',
},
'/',
{
'Fn::Select': [
0,
{
'Fn::Split': [
'||',
{
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3VersionKeyF7753763',
},
],
},
],
},
{
'Fn::Select': [
1,
{
'Fn::Split': [
'||',
{
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3VersionKeyF7753763',
},
],
},
],
},
],
],
},
},
});
// Role policy should grant reading from the assets bucket
Template.fromStack(stack).hasResourceProperties('AWS::IAM::Policy', {
PolicyDocument: {
Statement: [
{
Action: [
's3:GetObject*',
's3:GetBucket*',
's3:List*',
],
Effect: 'Allow',
Resource: [
{
'Fn::Join': [
'',
[
'arn:',
{
Ref: 'AWS::Partition',
},
':s3:::',
{
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3Bucket4E517469',
},
],
],
},
{
'Fn::Join': [
'',
[
'arn:',
{
Ref: 'AWS::Partition',
},
':s3:::',
{
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3Bucket4E517469',
},
'/*',
],
],
},
],
},
],
},
Roles: [
{
Ref: 'Job1ServiceRole7AF34CCA',
},
],
});
});

test('with an unsupported directory path throws', () => {
@@ -43,7 +203,6 @@ describe('Code', () => {
});

test('used in more than 1 job in the same stack should be reused', () => {
const script = glue.Code.fromAsset(filePath);
new glue.Job(stack, 'Job1', {
executable: glue.JobExecutable.pythonShell({
glueVersion: glue.GlueVersion.V2_0,
@@ -64,7 +223,7 @@ describe('Code', () => {
[
's3://',
{
Ref: 'AssetParameters894df8f835015940e27548bfbf722885cb247378af70effdc8ecbe342419fc6bS3Bucket252142A8',
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3Bucket4E517469',
},
'/',
{
@@ -74,7 +233,7 @@ describe('Code', () => {
'Fn::Split': [
'||',
{
Ref: 'AssetParameters894df8f835015940e27548bfbf722885cb247378af70effdc8ecbe342419fc6bS3VersionKey7D45B377',
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3VersionKeyF7753763',
},
],
},
@@ -87,7 +246,7 @@ describe('Code', () => {
'Fn::Split': [
'||',
{
Ref: 'AssetParameters894df8f835015940e27548bfbf722885cb247378af70effdc8ecbe342419fc6bS3VersionKey7D45B377',
Ref: 'AssetParameters432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855S3VersionKeyF7753763',
},
],
},
@@ -96,6 +255,7 @@ describe('Code', () => {
],
],
};

expect(stack.node.metadata.find(m => m.type === 'aws:cdk:asset')).toBeDefined();
// Job1 and Job2 use reuse the asset
Template.fromStack(stack).hasResourceProperties('AWS::Glue::Job', {
@@ -122,13 +282,23 @@ describe('Code', () => {
});
});

test('throws if used in more than 1 stack', () => {
const stack2 = new cdk.Stack();
const asset = glue.Code.fromAsset(filePath);
asset.bind(stack);
test('throws if trying to rebind in another stack', () => {
new glue.Job(stack, 'Job1', {
executable: glue.JobExecutable.pythonShell({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
});
const differentStack = new cdk.Stack();

expect(() => asset.bind(stack2))
.toThrow(/associated with another stack/);
expect(() => new glue.Job(differentStack, 'Job2', {
executable: glue.JobExecutable.pythonShell({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script: script,
}),
})).toThrow(/associated with another stack/);
});
});
});
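
The last two tests exercise the caching behavior of `AssetCode.bind`: the first bind creates the asset, later binds in the same stack reuse it, and a bind from a different stack throws. A minimal sketch of that guard is below. `FakeStack`, `FakeAsset` and `AssetCodeSketch` are hypothetical stand-ins, and stack identity is compared directly instead of through `cdk.Stack.of`.

```ts
// Hypothetical stand-ins: FakeStack and FakeAsset are not CDK classes.
class FakeStack {
  constructor(public readonly stackName: string) {}
}

class FakeAsset {
  constructor(public readonly stack: FakeStack, public readonly path: string) {}
}

class AssetCodeSketch {
  private asset?: FakeAsset;

  constructor(private readonly path: string) {}

  public bind(jobStack: FakeStack): FakeAsset {
    // If the same code object is bound several times, retain only the first asset.
    if (!this.asset) {
      this.asset = new FakeAsset(jobStack, this.path);
    } else if (this.asset.stack !== jobStack) {
      throw new Error(`Asset is already associated with another stack '${this.asset.stack.stackName}'. ` +
        'Create a new Code instance for every stack.');
    }
    return this.asset;
  }
}

const stackA = new FakeStack('StackA');
const stackB = new FakeStack('StackB');
const sharedScript = new AssetCodeSketch('job-script/hello_world.py');

// Two jobs in the same stack share one asset.
const firstBind = sharedScript.bind(stackA);
const secondBind = sharedScript.bind(stackA);
console.log(firstBind === secondBind); // true

// Binding the same code object from a second stack is rejected.
let rebindError = '';
try {
  sharedScript.bind(stackB);
} catch (e) {
  rebindError = e instanceof Error ? e.message : String(e);
}
console.log(rebindError);
```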