(aws-lambda.Function): Platform configuration in BundlingOptions is not respected #30748

Closed
martonjuhasz98 opened this issue Jul 4, 2024 · 3 comments
Labels
@aws-cdk/aws-lambda Related to AWS Lambda bug This issue is a bug. closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

martonjuhasz98 commented Jul 4, 2024

Describe the bug

In TypeScript, I have a CDK Pipeline (aws-cdk-lib/pipelines/CodePipeline) with a CodeBuild synth step running aws/codebuild/standard:7.0. This pipeline bundles a Python Function (aws-cdk-lib/aws-lambda/Function) targeting an ARM64 architecture.

The Python code and its dependencies are installed and bundled with the following configuration:

import * as lambda from 'aws-cdk-lib/aws-lambda';

const lambdaRuntime : lambda.Runtime = lambda.Runtime.PYTHON_3_12;
const lambdaArchitecture : lambda.Architecture = lambda.Architecture.ARM_64;
const lambdaFunction = new lambda.Function(this, 'Function', {
  handler: 'index.handler',
  runtime: lambdaRuntime,
  architecture: lambdaArchitecture,
  code: lambda.Code.fromAsset('./path/to/code', {
    bundling: {
      command: ['bash', '-c', 'echo "Architecture: $(uname -m)" && pip install --target /asset-output -r ./requirements.txt && cp -au . /asset-output'],
      image: lambdaRuntime.bundlingImage,
      platform: lambdaArchitecture.dockerPlatform,
      user: 'root',
    }
  })
});

My goal is for the cdk synth command to bundle the Python Lambda code assets using the Docker image that reflects my image and platform configurations specified in the BundlingOptions within code.bundling.

When I run cdk synth on my local machine (M1 / aarch64), the synthesizing process pulls the public.ecr.aws/sam/build-python3.12:latest Docker image and bundles without any platform specification. The pip install logs show that packages are installed for my local architecture:

Unable to find image 'public.ecr.aws/sam/build-python3.12:latest' locally
latest: Pulling from sam/build-python3.12
...
Digest: sha256:99ff8f6076ce3b435d8524b00a5a06227c5883ec71c5590ed0503536af64cab9
Status: Downloaded newer image for public.ecr.aws/sam/build-python3.12:latest
Architecture: aarch64
Collecting boto3 (from -r ./requirements.txt (line 1))
...
Downloading charset_normalizer-3.3.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (137 kB)
...

For debugging purposes, I print out the architecture name of the Docker image using the uname -m command.

However, when I run the same process through the CDK pipeline, the results differ:

Unable to find image 'public.ecr.aws/sam/build-python3.12:latest' locally
latest: Pulling from sam/build-python3.12
...
Downloading charset_normalizer-3.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (141 kB)
...

In this case, the same public.ecr.aws/sam/build-python3.12:latest Docker image is pulled, but during installation, x86_64 specific pip wheels are used instead of the target ARM_64 architecture configured in BundlingOptions.
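A mismatch like this can be caught before deployment by inspecting the wheel filenames that pip reports. The following is a minimal sketch, not part of CDK or pip: `wheelArch` and `mismatchedWheels` are hypothetical helpers, and they assume the architecture appears at the end of the wheel's platform tag, as it does in the two log lines above.

```typescript
// Hypothetical helpers (not part of CDK or pip): classify the CPU architecture
// a wheel targets, based on the platform tag at the end of its filename.
type WheelArch = 'arm64' | 'x86_64' | 'any' | 'unknown';

function wheelArch(filename: string): WheelArch {
  // Wheel filenames end with "-{python tag}-{abi tag}-{platform tag}.whl".
  if (!/\.whl$/.test(filename)) return 'unknown';
  const stem = filename.replace(/\.whl$/, '');
  const platformTag = stem.split('-').pop() ?? '';
  if (platformTag === 'any') return 'any'; // pure-Python wheel
  if (/(aarch64|arm64)$/.test(platformTag)) return 'arm64';
  if (/(x86_64|amd64)$/.test(platformTag)) return 'x86_64';
  return 'unknown';
}

// Return the wheels that are built for an architecture other than the target.
function mismatchedWheels(filenames: string[], target: 'arm64' | 'x86_64'): string[] {
  return filenames.filter((name) => {
    const arch = wheelArch(name);
    return arch !== 'any' && arch !== 'unknown' && arch !== target;
  });
}
```

Feeding it the filenames from the pipeline log with a target of `'arm64'` would flag the `x86_64` build of `charset_normalizer`.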

Although this does not cause immediate synthesizing or deployment failures, it leads to hidden issues that only surface during runtime. For example, when attempting to import langchain_community.llms.Bedrock in the deployed Python Lambda function, I encountered the following error:

INIT_START Runtime Version: python:3.12.v28	Runtime Version ARN: arn:aws:lambda:redacted-region::runtime:redacted01234567890123456789012345678901234567890123456789
START RequestId: redacted01234567890123456789012345678901234567890123456789 Version: $LATEST
LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue in the function code. However, in rare cases, a Lambda runtime update can cause unexpected function behavior. For functions using managed runtimes, runtime updates can be triggered by a function change, or can be applied automatically. To determine if the runtime has been updated, check the runtime version in the INIT_START log entry. If this error correlates with a change in the runtime version, you may be able to mitigate this error by temporarily rolling back to the previous runtime version. For more information, see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html
[ERROR] ModuleNotFoundError: No module named 'orjson.orjson'
Traceback (most recent call last):
  File "/opt/python/aws_lambda_powertools/logging/logger.py", line 447, in decorate
    return lambda_handler(event, context, *args, **kwargs)
  File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 317, in decorate
    response = lambda_handler(event, context, **kwargs)
  File "/opt/python/aws_lambda_powertools/middleware_factory/factory.py", line 135, in wrapper
    response = middleware()
  File "/opt/python/aws_lambda_powertools/utilities/data_classes/event_source.py", line 39, in event_source
    return handler(data_class(event), context)
  File "/var/task/index.py", line 123, in handler
    from langchain_community.llms import Bedrock
  File "/var/task/langchain_community/llms/__init__.py", line 24, in <module>
    from langchain_core.language_models.llms import BaseLLM
  File "/var/task/langchain_core/language_models/__init__.py", line 47, in <module>
    from langchain_core.language_models.chat_models import BaseChatModel, SimpleChatModel
  File "/var/task/langchain_core/language_models/chat_models.py", line 30, in <module>
    from langchain_core.callbacks import (
  File "/var/task/langchain_core/callbacks/__init__.py", line 23, in <module>
    from langchain_core.callbacks.manager import (
  File "/var/task/langchain_core/callbacks/manager.py", line 29, in <module>
    from langsmith.run_helpers import get_run_tree_context
  File "/var/task/langsmith/run_helpers.py", line 40, in <module>
    from langsmith import client as ls_client
  File "/var/task/langsmith/client.py", line 47, in <module>
    import orjson
  File "/var/task/orjson/__init__.py", line 3, in <module>
    from .orjson import *
END RequestId: redacted01234567890123456789012345678901234567890123456789
REPORT RequestId: redacted01234567890123456789012345678901234567890123456789	Duration: 928.49 ms	Billed Duration: 929 ms	Memory Size: 1024 MB	Max Memory Used: 143 MB	Init Duration: 2840.25 ms	
XRAY TraceId: redacted01234567890123456789012345678901234567890123456789	SegmentId: redacted01234567890123456789012345678901234567890123456789	Sampled: true	

Expected Behavior

I expect that the BundlingOptions image and platform parameters are respected and enforced when bundling Lambda code assets, regardless of the platform it is executed on.

Specifically, I expect that CodeBuild (running on x86_64) would pull the ARM_64 specific image public.ecr.aws/sam/build-python3.12:latest-arm64 or build specifically for the target architecture.
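To make the expectation concrete, here is a small sketch of the relationship between the Lambda architecture, the Docker platform, and the architecture-specific image tag. The names `LambdaArch`, `DOCKER_PLATFORM`, and `archSpecificImage` are hypothetical. The platform strings are, to my knowledge, the values CDK exposes as `Architecture.dockerPlatform`, and the `latest-<arch>` tag scheme is the one named in this issue.

```typescript
// Assumed mapping: Lambda architecture name -> Docker platform string.
type LambdaArch = 'arm64' | 'x86_64';

const DOCKER_PLATFORM: Record<LambdaArch, string> = {
  arm64: 'linux/arm64',
  x86_64: 'linux/amd64',
};

// Hypothetical helper: append the architecture-specific tag to an untagged image name.
function archSpecificImage(baseImage: string, arch: LambdaArch): string {
  return `${baseImage}:latest-${arch}`;
}
```

For an ARM_64 function, either `archSpecificImage('public.ecr.aws/sam/build-python3.12', 'arm64')` or the generic image combined with `DOCKER_PLATFORM.arm64` should yield an ARM64 bundling environment.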

When running cdk synth locally on an M1 / aarch64 machine, I expect it to use the specified image and platform configuration. For instance, if I configure:

const BundlingOptions: cdk.BundlingOptions = {
  command: ['bash', '-c', 'echo "Architecture: $(uname -m)" && pip install --target /asset-output -r ./requirements.txt && cp -au . /asset-output'],
  image: lambda.Runtime.PYTHON_3_12.bundlingImage,
  platform: lambda.Architecture.X86_64.dockerPlatform,
  user: 'root',
};

and run cdk synth on my local M1 / aarch64 machine, I expect CDK to bundle using public.ecr.aws/sam/build-python3.12:latest-x86_64, or to invoke docker with the matching --platform flag. Current evidence suggests that neither happens.
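As an illustration of the second option, the sketch below builds the argument list for a `docker run` call that passes `--platform` through. It is not CDK's actual implementation: `BundlingSketch` and `dockerRunArgs` are hypothetical names, and volume mounts and working directory are omitted for brevity.

```typescript
// Illustrative only: this is NOT how CDK constructs its docker invocation.
interface BundlingSketch {
  image: string;      // e.g. 'public.ecr.aws/sam/build-python3.12:latest'
  platform?: string;  // e.g. 'linux/arm64' or 'linux/amd64'
  user?: string;      // e.g. 'root'
  command: string[];  // command to run inside the container
}

function dockerRunArgs(opts: BundlingSketch): string[] {
  const args: string[] = ['run', '--rm'];
  if (opts.user) args.push('-u', opts.user);
  // Forward the requested platform so the host architecture is not used implicitly.
  if (opts.platform) args.push('--platform', opts.platform);
  args.push(opts.image, ...opts.command);
  return args;
}
```

With `platform: 'linux/amd64'`, the resulting arguments contain `--platform linux/amd64` ahead of the image name, which is the behavior I would expect on an M1 host.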

Current Behavior

Regardless of the value of BundlingOptions.platform, bundling only uses the generic public.ecr.aws/sam/build-python3.12:latest Docker image, which matches the architecture of the machine running cdk synth. This causes the downstream problems described above.

Reproduction Steps

This issue seems to be related to the fundamental BundlingOptions itself rather than CodeBuild or other services mentioned in this ticket.

Step 1: Implement the following Lambda function in a simple TypeScript CDK app:

import * as lambda from 'aws-cdk-lib/aws-lambda';

const lambdaRuntime : lambda.Runtime = lambda.Runtime.PYTHON_3_12;
const lambdaArchitecture : lambda.Architecture = lambda.Architecture.ARM_64;
const lambdaFunction = new lambda.Function(this, 'Function', {
  handler: 'index.handler',
  runtime: lambdaRuntime,
  architecture: lambdaArchitecture,
  code: lambda.Code.fromAsset('./path/to/code', {
    bundling: {
      command: ['bash', '-c', 'echo "Architecture: $(uname -m)" && pip install --target /asset-output -r ./requirements.txt && cp -au . /asset-output'],
      image: lambdaRuntime.bundlingImage,
      platform: lambdaArchitecture.dockerPlatform,
      user: 'root',
    }
  })
});

Step 2: Run cdk synth locally and check the "Architecture:" line in the output to see which platform is used.

Step 3: Change the lambdaArchitecture constant to lambda.Architecture.X86_64.

Step 4: Run cdk synth locally again and observe whether the platform has changed.

Possible Solution

There are two potential workarounds for this issue:

1/ Resolve the issue at CodeBuild:

Given that I can run cdk synth locally without a problem and directly deploy using cdk deploy (because my local architecture matches the target architecture, ARM_64), I can solve this problem by switching the CodeBuild runtime image to be compatible with ARM_64.

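import * as codebuild from 'aws-cdk-lib/aws-codebuild';
import * as pipelines from 'aws-cdk-lib/pipelines';

const pipeline = new pipelines.CodePipeline(this, 'Pipeline', {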
  dockerEnabledForSynth: true,
  synth: new pipelines.ShellStep('Synth', {
    input: input,
    installCommands: ['node --version', 'npm --version', 'npx --version'],
    commands: ['npm ci', 'npx -y cdk@2 --version', 'npx -y cdk@2 synth'],
  }),
  codeBuildDefaults: {
    buildEnvironment: {
      buildImage: codebuild.LinuxBuildImage.AMAZON_LINUX_2_ARM_3
    }
  }
});

Note that here I use the aws/codebuild/amazonlinux2-aarch64-standard:3.0 image. However, this solution might not work well for a team of developers working on different platforms, as local bundling could create faulty Lambda code assets to deploy.
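One way to soften that risk is to fail fast when the host architecture differs from the target. The sketch below is an assumption-laden illustration: `NODE_ARCH_TO_DOCKER` and `hostMatchesTarget` are hypothetical names, and only the two Node.js `process.arch` values relevant to this issue (`arm64` and `x64`) are mapped.

```typescript
// Hypothetical guard: map Node.js process.arch values to Docker platform strings.
const NODE_ARCH_TO_DOCKER: Record<string, string> = {
  arm64: 'linux/arm64',
  x64: 'linux/amd64',
};

// True when a host with the given Node.js arch natively matches the target platform.
function hostMatchesTarget(nodeArch: string, targetPlatform: string): boolean {
  return NODE_ARCH_TO_DOCKER[nodeArch] === targetPlatform;
}

// Example use inside a CDK app (shown as a comment to keep the sketch self-contained):
// if (!hostMatchesTarget(process.arch, lambda.Architecture.ARM_64.dockerPlatform)) {
//   throw new Error('Host architecture does not match the Lambda target architecture');
// }
```

Throwing at synth time is a design choice: it turns a hidden runtime import error into an immediate, visible build failure.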

2/ Configure architecture-specific bundling image:

In this case, I can configure the Docker image name based on the runtime and architecture configurations:

const lambdaBundlingImageName = `${lambdaRuntime.bundlingImage.image}:latest-${lambdaArchitecture.toString()}`;

Then, pull it from the registry:

cdk.DockerImage.fromRegistry(lambdaBundlingImageName)

All the pieces together:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';

const lambdaRuntime : lambda.Runtime = lambda.Runtime.PYTHON_3_12;
const lambdaArchitecture : lambda.Architecture = lambda.Architecture.X86_64;
const lambdaBundlingImageName = `${lambdaRuntime.bundlingImage.image}:latest-${lambdaArchitecture.toString()}`;
const lambdaFunction = new lambda.Function(this, 'Function', {
  handler: 'index.handler',
  runtime: lambdaRuntime,
  architecture: lambdaArchitecture,
  code: lambda.Code.fromAsset('./path/to/code', {
    bundling: {
      command: ['bash', '-c', 'echo "Architecture: $(uname -m)" && pip install --target /asset-output -r ./requirements.txt && cp -au . /asset-output'],
      image: cdk.DockerImage.fromRegistry(lambdaBundlingImageName),
      user: 'root',
    }
  })
});

This, however, can cause other issues, as highlighted in this GitHub issue:

When executing the ARG and ENV commands from this Dockerfile, the RUN command fails (as expected if the platform is wrong):

exec /bin/sh: exec format error
The command '/bin/sh -c python -m venv /usr/app/venv &&     mkdir /tmp/pip-cache &&     chmod -R 777 /tmp/pip-cache &&     pip install --upgrade pip &&     mkdir /tmp/poetry-cache &&     chmod -R 777 /tmp/poetry-cache &&     pip install pipenv==2022.4.8 poetry==$POETRY_VERSION &&     rm -rf /tmp/pip-cache/* /tmp/poetry-cache/*' returned a non-zero code: 1

Additional Information/Context

No response

CDK CLI Version

2.147.1

Framework Version

No response

Node.js Version

v20.13.1

OS

Mac M1

Language

TypeScript, Python

Language Version

TypeScript (5.4.5) | Python (3.12)

Other information

This issue exhibits similar symptoms to those described in AWS CDK Issue #18696.

@martonjuhasz98 martonjuhasz98 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 4, 2024
@github-actions github-actions bot added the @aws-cdk/aws-lambda Related to AWS Lambda label Jul 4, 2024
@ashishdhingra
Contributor

@martonjuhasz98 Good afternoon. The image public.ecr.aws/sam/build-python3.12:latest points to a generic image. If you refer to this page, it supports OS/Arch: Linux, ARM 64, x86-64. My assumption is that, depending on the environment architecture, it should pull the relevant image for that architecture.

Thanks,
Ashish

@ashishdhingra ashishdhingra added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 5, 2024

github-actions bot commented Jul 8, 2024

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jul 8, 2024
@github-actions github-actions bot added closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Jul 13, 2024
@aws-cdk-automation
Collaborator

Comments on closed issues and PRs are hard for our team to see. If you need help, please open a new issue that references this one.

@aws aws locked as resolved and limited conversation to collaborators Jul 25, 2024