Monitoring AWS Lambda Functions

Elastic APM lets you monitor your AWS Lambda functions. The natural integration of {apm-guide-ref}/apm-distributed-tracing.html[distributed tracing] into your AWS Lambda functions provides insights into the functions' execution and runtime behavior as well as their relationships and dependencies to other services.

To get started with setting up Elastic APM for your Lambda functions, check out the language-specific guides:

  • {apm-node-ref}/lambda.html[Quick Start with APM on AWS Lambda - Node.js]

  • {apm-py-ref}/lambda-support.html[Quick Start with APM on AWS Lambda - Python]

  • {apm-java-ref}/aws-lambda.html[Quick Start with APM on AWS Lambda - Java]

Note that AWS Step Functions are currently not supported by any of the language agents.

Learn more about the architecture of Elastic APM for AWS Lambda.

APM Architecture for AWS Lambda

AWS Lambda uses a special execution model to provide a scalable, on-demand compute service for code execution. In particular, AWS freezes the execution environment of a Lambda function when no active requests are being processed. This execution model imposes additional requirements on APM in the context of AWS Lambda functions:

  1. To avoid data loss, APM data collected by APM agents needs to be flushed before the execution environment of a lambda function is frozen.

  2. Flushing APM data must be fast so as not to impact the response times of lambda function requests.

To accomplish the above, Elastic APM agents instrument AWS Lambda functions and dispatch APM data via an AWS Lambda extension.

Normally, during the execution of a Lambda function, there’s only a single language process running in the AWS Lambda execution environment. With an AWS Lambda extension, Lambda users run a second process alongside their main service/application process.

image showing data flow from lambda function

By using an AWS Lambda extension, Elastic APM agents can send data to a local Lambda extension process, and that process will forward data on to APM Server asynchronously. The Lambda extension ensures that any potential latency between the Lambda function and the APM Server instance will not cause latency in the request flow of the Lambda function itself.
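The decoupling described above can be illustrated with a toy model. This is only a sketch of the general pattern (a cheap local hand-off plus a separate worker that does the slow forwarding), not the extension's actual code; the function name `run_invocation` and the `forward` callback are hypothetical.

```python
import queue
import threading


def run_invocation(events, forward, buffer_size=100):
    """Toy model: the handler hands events to a local buffer,
    and a separate worker forwards them to the backend."""
    buffer = queue.Queue(maxsize=buffer_size)

    def worker():
        while True:
            item = buffer.get()
            if item is None:  # sentinel: no more events for this invocation
                break
            forward(item)  # stands in for the (potentially slow) call to APM Server

    thread = threading.Thread(target=worker)
    thread.start()
    for event in events:
        # Local hand-off; this only blocks if the buffer is full.
        buffer.put(event)
    buffer.put(None)
    # Wait for the worker so nothing is left unsent before the
    # environment could be frozen.
    thread.join()


sent = []
run_invocation(["transaction-1", "span-1"], sent.append)
```

In this model the producer never waits on `forward` directly, which is the property the extension architecture is designed to provide for the function's request flow.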

Performance impact and overhead

As described in APM Architecture for AWS Lambda, using Elastic APM with AWS Lambda requires adding both the Elastic APM AWS Lambda extension and a corresponding Elastic APM agent to the Lambda runtime. These components may introduce a small overhead on the size of your function’s deployment package as well as the execution duration of your function’s invocations.

Impact on the deployment package size

Both components add to the uncompressed deployment package size of your Lambda function. Overall, using Elastic APM increases the uncompressed deployment package size of your Lambda function by less than 30MB.

Performance impact

An advantage of the Elastic APM AWS Lambda extension architecture is that APM data dispatching is decoupled from your function’s request processing. The Elastic APM AWS Lambda extension flushes APM data to the Elastic backend after your function responds to the client’s request. Thus, it does not affect the latency of the client’s request. However, the extension’s flushing of APM data contributes to the overall execution time of the function invocation. The ELASTIC_APM_DATA_FORWARDER_TIMEOUT config option, together with the related exponential backoff algorithm, limits the impact the extension may have on the function’s overall execution time and lets you control it.

When your function experiences a cold start, the Elastic APM AWS Lambda extension needs to be initialized and, thus, slightly increases the cold start duration (in the range of tens of milliseconds) of your function.

APM agents enrich your application’s code with measurement code that collects APM data. This measurement code introduces a small performance overhead to your application, which is usually negligible. The same is true for Lambda functions. The concrete performance overhead introduced by APM agents depends heavily on the configuration of the agent and on the characteristics of your function’s code. The following agent-specific documentation pages provide insights and instructions on tuning the performance of the APM agents:

  • {apm-node-ref}/performance-tuning.html[Performance Tuning - Node.js]

  • {apm-py-ref}/tuning-and-overhead.html[Performance Tuning - Python]

  • {apm-java-ref}/tuning-and-overhead.html[Performance Tuning - Java]

Similar to the Elastic APM AWS Lambda extension, APM agents are initialized at cold start time. As a consequence, the APM agent’s overhead will be higher for cold starts as compared to their overhead on warm invocations. This effect is especially relevant for the Java APM agent on AWS Lambda. Learn more about corresponding tuning options in the {apm-java-ref}/aws-lambda.html#aws-lambda-caveats[Java Agent’s AWS Lambda documentation].

Configuration options

The recommended way of configuring the {apm-lambda-ext} and the APM agents on AWS Lambda is through the Lambda function’s environment variables.

The configuration options for the APM agents are documented in the corresponding language agents:

  • {apm-node-ref}/configuration.html[Configuration options - Node.js APM agent]

  • {apm-py-ref}/configuration.html[Configuration options - Python APM agent]

  • {apm-java-ref}/configuration.html[Configuration options - Java APM agent]

Note
Some APM agent configuration options don’t make sense when the APM agent is running in a Lambda environment. For example, instead of using the Python APM agent configuration variable, verify_server_cert, you must use the ELASTIC_APM_LAMBDA_VERIFY_SERVER_CERT variable described below.
Note
APM Central configuration is not supported when using the Elastic APM AWS Lambda extension.

Relevant configuration options

A list of relevant configuration options for the {apm-lambda-ext} is below.

ELASTIC_APM_LAMBDA_APM_SERVER

This required config option controls where the {apm-lambda-ext} will ship data. This should be the URL of the final APM Server destination for your telemetry.

ELASTIC_APM_LAMBDA_AGENT_DATA_BUFFER_SIZE

The size of the buffer that stores APM agent data to be forwarded to the APM server. The default is 100.

ELASTIC_APM_SECRET_TOKEN or ELASTIC_APM_API_KEY

One of these (or, alternatively, one of the corresponding AWS Secrets Manager options, ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID or ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID) needs to be set as the authentication method that the {apm-lambda-ext} uses when sending data to the URL configured via ELASTIC_APM_LAMBDA_APM_SERVER. If none of these options is set, sending data to the APM Server is still possible, but your APM Server must allow the APM agent to send data in anonymous mode.

ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID or ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID

Instead of specifying the ELASTIC_APM_SECRET_TOKEN or ELASTIC_APM_API_KEY as plain text in your Lambda environment variables, you can use the AWS Secrets Manager to securely store your APM authentication keys. The ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID or ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID config options allow you to specify the Secrets Manager’s secret ID of the stored APM API key or APM secret token, respectively, to be used by the {apm-lambda-ext} for authentication.

ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID takes precedence over ELASTIC_APM_SECRET_TOKEN, and ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID over ELASTIC_APM_API_KEY, respectively.
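The precedence rule can be written down as a small lookup. The following is a minimal sketch of that rule only; the helper name `effective_sources` is hypothetical, and the sketch deliberately says nothing about what happens when both a secret token and an API key are configured, because this page does not specify it.

```python
# For each credential kind: (Secrets Manager option, plain-text option).
# The Secrets Manager option wins when both are set.
PRECEDENCE = {
    "secret_token": (
        "ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID",
        "ELASTIC_APM_SECRET_TOKEN",
    ),
    "api_key": (
        "ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID",
        "ELASTIC_APM_API_KEY",
    ),
}


def effective_sources(env):
    """Return, per credential kind, the variable name that takes effect.

    Kinds with no configured variable are omitted, so an empty result
    corresponds to the anonymous-mode case.
    """
    result = {}
    for kind, (secrets_manager_var, plain_var) in PRECEDENCE.items():
        if env.get(secrets_manager_var):
            result[kind] = secrets_manager_var
        elif env.get(plain_var):
            result[kind] = plain_var
    return result
```

For example, passing a mapping that sets both ELASTIC_APM_SECRET_TOKEN and ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID yields the Secrets Manager variable for the `secret_token` kind.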

ELASTIC_APM_SERVICE_NAME

The configured name of your application or service. The APM agent will use this value when reporting data to the APM Server. If unset, the APM agent will automatically set the value based on the Lambda function name. Use this config option if you want to group multiple Lambda functions under a single service entity in APM.

ELASTIC_APM_DATA_RECEIVER_TIMEOUT

Added in: v1.2.0. Replaces ELASTIC_APM_DATA_RECEIVER_TIMEOUT_SECONDS.

The {apm-lambda-ext}'s timeout value for receiving data from the APM agent. The default is 15s.

ELASTIC_APM_DATA_RECEIVER_SERVER_PORT

The port on which the {apm-lambda-ext} listens to receive data from the APM agent. The default is 8200.

ELASTIC_APM_DATA_FORWARDER_TIMEOUT

Added in: v1.2.0. Replaces ELASTIC_APM_DATA_FORWARDER_TIMEOUT_SECONDS.

The timeout value for the {apm-lambda-ext}'s HTTP client sending data to the APM Server. The default is 3s. If the extension’s attempt to send APM data does not succeed within this time interval, the extension puts the data back in its queue. Further attempts at sending the data are governed by an exponential backoff algorithm: data will be sent after an increasingly large grace period of 0, then circa 1, 4, 9, 16, 25 and 36 seconds, provided that the Lambda function execution is ongoing.
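The grace periods quoted above coincide with the squares of the retry count. A minimal sketch that reproduces those nominal values, assuming that pattern holds: the function name `grace_periods` is hypothetical, and the "circa" in the description suggests the real delays vary somewhat, which this sketch ignores.

```python
def grace_periods(attempts=7):
    """Nominal grace period, in seconds, before each successive retry.

    Assumption: the listed values follow n squared for n = 0, 1, 2, ...
    This mirrors the numbers in the text, not the extension's source code.
    """
    return [n * n for n in range(attempts)]


schedule = grace_periods()
```

Because the gaps grow quickly, a short-lived invocation will typically see only the first few retries before the function execution ends.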

ELASTIC_APM_SEND_STRATEGY

Whether to synchronously flush APM agent data from the {apm-lambda-ext} to the APM Server at the end of the function invocation. The two accepted values are background and syncflush. The default is syncflush.

  • The background strategy indicates that the {apm-lambda-ext} will not flush when it receives a signal that the function invocation has completed. It will instead send any remaining buffered data on the next function invocation. As a result, if the function is not subsequently invoked in that Lambda environment, the buffered data will be lost. However, for Lambda functions with a steady, frequent load pattern, the extension can defer sending the data to the APM Server until the next request and send it in parallel with the processing of that request. This can improve both the function’s response time and its throughput.

  • The syncflush strategy synchronously flushes all remaining buffered APM agent data to the APM Server when the extension receives a signal that the function invocation has completed. This strategy blocks the Lambda function from receiving the next request until the extension has flushed all the data. This reduces the throughput of the function, but it ensures that all APM data is sent to the APM Server.
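The difference between the two strategies can be made concrete with a toy state machine. This is a simplified sketch, not the extension's implementation: the class `ToyExtension` and its method names are hypothetical, and "sending" is modelled as moving items from one list to another.

```python
class ToyExtension:
    """Toy model of when buffered data leaves the extension."""

    def __init__(self, strategy="syncflush"):
        if strategy not in ("background", "syncflush"):
            raise ValueError("strategy must be 'background' or 'syncflush'")
        self.strategy = strategy
        self.buffer = []  # data received from the agent, not yet sent
        self.sent = []    # data that has reached the (imaginary) APM Server

    def receive(self, event):
        self.buffer.append(event)

    def _flush(self):
        self.sent.extend(self.buffer)
        self.buffer.clear()

    def invocation_started(self):
        # Anything left over from the previous invocation goes out now.
        self._flush()

    def invocation_completed(self):
        # Only syncflush sends the remaining data at the end of the invocation.
        if self.strategy == "syncflush":
            self._flush()


ext = ToyExtension("background")
ext.invocation_started()
ext.receive("event-1")
ext.invocation_completed()
leftover = list(ext.buffer)  # still buffered; lost if no further invocation follows
```

With `"syncflush"`, the same sequence leaves the buffer empty after `invocation_completed()`, at the cost of holding up the next request until the flush finishes.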

ELASTIC_APM_LOG_LEVEL

The logging level to be used by both the APM Agent and the {apm-lambda-ext}. Supported values are trace, debug, info, warning, error, critical and off.

ELASTIC_APM_LAMBDA_CAPTURE_LOGS

preview:[] Starting in Elastic Stack version 8.5.0, the Elastic APM lambda extension supports the collection of log events by default. Log events can be viewed in {kib} in the APM UI. Disable log collection by setting this to false.

ELASTIC_APM_LAMBDA_VERIFY_SERVER_CERT

Added in: v1.3.0.

Whether to enable {apm-lambda-ext} to verify APM Server’s certificate chain and host name.

ELASTIC_APM_LAMBDA_SERVER_CA_CERT_PEM

Added in: v1.3.0.

The certificate, passed as an environment variable. It is used to verify APM Server’s certificate chain if server certificate verification is enabled.

ELASTIC_APM_SERVER_CA_CERT_FILE

Added in: v1.3.0.

The certificate, passed as the name of a file that is available to the extension. It is used to verify APM Server’s certificate chain if server certificate verification is enabled.

ELASTIC_APM_SERVER_CA_CERT_ACM_ID

Added in: v1.3.0.

The ARN of an Amazon-issued certificate. It is used to verify APM Server’s certificate chain if server certificate verification is enabled.

Note

You may see errors similar to the following in {stack} versions earlier than 8.5:

client error: response status code: 400
message: log: did not recognize object type

Users on older versions should disable log collection by setting ELASTIC_APM_LAMBDA_CAPTURE_LOGS to false.

Deprecated options

ELASTIC_APM_DATA_RECEIVER_TIMEOUT_SECONDS

Deprecated in: v1.2.0. Use ELASTIC_APM_DATA_RECEIVER_TIMEOUT instead.

The {apm-lambda-ext}'s timeout value, in seconds, for receiving data from the APM agent. The default is 15.

ELASTIC_APM_DATA_FORWARDER_TIMEOUT_SECONDS

Deprecated in: v1.2.0. Use ELASTIC_APM_DATA_FORWARDER_TIMEOUT instead.

The timeout value, in seconds, for the {apm-lambda-ext}'s HTTP client sending data to the APM Server. The default is 3. If the extension’s attempt to send APM data does not succeed within this time interval, the extension puts the data back in its queue. Further attempts at sending the data are governed by an exponential backoff algorithm: data will be sent after an increasingly large grace period of 0, then circa 1, 4, 9, 16, 25 and 36 seconds, provided that the Lambda function execution is ongoing.

Using AWS Secrets Manager to manage APM authentication keys

When using the config options ELASTIC_APM_SECRET_TOKEN or ELASTIC_APM_API_KEY for authentication of the {apm-lambda-ext}, the corresponding keys are specified in plain text in the environment variables of your Lambda function. If you prefer to securely store the authentication keys, you can use the AWS Secrets Manager and let the extension retrieve the actual keys from the AWS Secrets Manager. Follow the instructions below to set up the AWS Secrets Manager with the extension.

Step 1: Create a secret in the AWS Secrets Manager.

Create a secret in the AWS Secrets Manager for the {apm-guide-ref}/secret-token.html[APM Secret Token] or the {apm-guide-ref}/api-key.html[APM API key], depending on which one you prefer to use. Make sure to create the secret as a Plaintext typed secret and ensure it is created in the same AWS region as your target Lambda function that will use the secret.

We recommend using the AWS-managed encryption key aws/secretsmanager. However, you can optionally create and select a custom KMS key for encryption. Note that with a custom encryption key, you will need additional key permissions on your Lambda function (see Step 2).

Remember your chosen secret name. You will use the secret name as the value for the config options ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID or ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID when configuring your {apm-lambda-ext}.

Step 2: Add permissions to your AWS Lambda function

For your Lambda function to be able to retrieve the authentication key from the AWS Secrets Manager, you need to grant your Lambda function permission to read the secret and, if you use a custom KMS key for encryption, permission to decrypt with that key.
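As a rough sketch of what such permissions might look like as an IAM policy document: the example below assumes that `secretsmanager:GetSecretValue` is sufficient to read the secret and that `kms:Decrypt` is the additional permission needed for a custom KMS key. The helper `build_policy` is hypothetical, and `Your-Secret-ARN` and `Your-KMS-Key-ARN` are placeholders to replace with your own ARNs.

```python
import json


def build_policy(secret_arn, kms_key_arn=None):
    """Build an IAM policy document (as a dict) for reading one secret.

    The kms:Decrypt statement is added only when a custom KMS key is used.
    """
    statements = [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": [secret_arn],
        }
    ]
    if kms_key_arn is not None:
        statements.append(
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [kms_key_arn],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


# Placeholder ARNs; substitute the ARNs of your own secret and key.
policy = build_policy("Your-Secret-ARN", kms_key_arn="Your-KMS-Key-ARN")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the execution role of your Lambda function. With the AWS-managed key aws/secretsmanager, omit the `kms_key_arn` argument.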

Step 3: Configure the {apm-lambda-ext}

Finally, you will need to configure the {apm-lambda-ext} to use the secret from the Secrets Manager instead of the value provided through ELASTIC_APM_SECRET_TOKEN or ELASTIC_APM_API_KEY.

Provide the name of the secret you created in Step 1 as the value for the ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID or ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID config option, respectively, depending on whether you want to use the {apm-guide-ref}/secret-token.html[APM Secret Token] or the {apm-guide-ref}/api-key.html[APM API key].

The language-specific instructions describe how to set environment variables for configuring AWS Lambda for Elastic APM:

  • {apm-node-ref}/lambda.html#_step_3_configure_apm_on_aws_lambda[Configure APM on AWS Lambda - Node.js]

  • {apm-py-ref}/lambda-support.html#_step_3_configure_apm_on_aws_lambda[Configure APM on AWS Lambda - Python]

  • {apm-java-ref}/aws-lambda.html#_step_3_configure_apm_on_aws_lambda[Configure APM on AWS Lambda - Java]
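For orientation, the environment variables involved in this step can be assembled as follows. This is a minimal sketch: the helper `lambda_environment` is hypothetical, and the URL and secret name in the usage line are made-up example values.

```python
def lambda_environment(apm_server_url, secret_name, use_api_key=False):
    """Return the extension-related environment variables for a Lambda function.

    secret_name is the name of the secret created in Step 1.
    """
    if use_api_key:
        id_var = "ELASTIC_APM_SECRETS_MANAGER_API_KEY_ID"
    else:
        id_var = "ELASTIC_APM_SECRETS_MANAGER_SECRET_TOKEN_ID"
    return {
        "ELASTIC_APM_LAMBDA_APM_SERVER": apm_server_url,
        id_var: secret_name,
    }


# Example values only; replace them with your APM Server URL and secret name.
variables = lambda_environment("https://apm.example.com:8200", "my-apm-secret-token")
```

The resulting mapping corresponds to what you would enter as the function's environment variables, whichever deployment method you use.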

That’s it. With the first invocation (cold start) of your Lambda function you should see a log message from the {apm-lambda-ext} indicating that a secret from the secrets manager is used:

"Using the APM secret token retrieved from Secrets Manager."