Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PrivateCpuProfilerBindings.startProfiling is not a function #14561

Open
3 tasks done
jbreckel opened this issue Dec 4, 2024 · 17 comments
Open
3 tasks done

PrivateCpuProfilerBindings.startProfiling is not a function #14561

jbreckel opened this issue Dec 4, 2024 · 17 comments
Labels
Feature: Profiling Package: node Issues related to the Sentry Node SDK Package: profiling-node Issues related to the Sentry Profiling Node SDK

Comments

@jbreckel
Copy link

jbreckel commented Dec 4, 2024

Is there an existing issue for this?

How do you use Sentry?

Self-hosted/on-premise

Which SDK are you using?

@sentry/node

SDK Version

8.41.0

Framework Version

No response

Link to Sentry event

No response

Reproduction Example/SDK Setup

AWS Lambda Node 20

Sentry.init({
    dsn: SENTRY_DSN,
    environment: stage,
    integrations: [nodeProfilingIntegration()],
    release: 'not-defined',
    tracesSampleRate: 0.5,
    profilesSampleRate: 1.0,
    shutdownTimeout: 0.1,
});

Steps to Reproduce

  1. execute lambda
  2. wait for execution
  3. get error or expected result

Expected Result

get a reproducible result.
~ 4 out of 10 times the below described error gets thrown

Actual Result

sometimes a TypeError is thrown, most times not

{
    "errorType": "TypeError",
    "errorMessage": "PrivateCpuProfilerBindings.startProfiling is not a function",
    "stack": [
        "TypeError: PrivateCpuProfilerBindings.startProfiling is not a function",
        "    at Bindings.startProfiling (/var/task/index.js:67619:43)",
        "    at maybeProfileSpan (/var/task/index.js:67944:27)",
        "    at /var/task/index.js:67988:28",
        "    at /var/task/index.js:5588:43",
        "    at Array.forEach (<anonymous>)",
        "    at NodeClient.emit (/var/task/index.js:5588:21)",
        "    at /var/task/index.js:37635:41",
        "    at /var/task/index.js:37661:82",
        "    at _optionalChain$1 (/var/task/index.js:37635:19)",
        "    at onSpanStart (/var/task/index.js:37661:7)"
    ]
}
@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 4, 2024
@github-actions github-actions bot added the Package: node Issues related to the Sentry Node SDK label Dec 4, 2024
@Lms24
Copy link
Member

Lms24 commented Dec 4, 2024

Hey @jbreckel thanks for writing in! Are you running your lambda function in ESM or CJS? Can you confirm that all your @sentry/* packages are aligned to exactly the same version?

@jbreckel
Copy link
Author

jbreckel commented Dec 4, 2024

lambdas are running CJS, NodejsFunction definition:

new lambdaNodeJs.NodejsFunction(
    this,
    `SomeLambda`,
    {
        functionName: `${this.prefix}-some-lambda`,
        entry: `src/aws/lambda/some-lambda.ts`,
        runtime: lambda.Runtime.NODEJS_20_X,
        logRetention: logs.RetentionDays.ONE_WEEK,
        architecture: lambda.Architecture.ARM_64,
        bundling: {
            loader: {
                '.node': 'file',
            },
            commandHooks: {
                beforeBundling: () => [],
                beforeInstall: () => [],
                afterBundling: (_, outputDir) => [
                    `sentry-prune-profiler-binaries --target_platform=linux --target_arch=arm64 --target_node=20 --target_stdlib=glibc --target_dir_path=${outputDir}`,
                ],
            },
        },
        environment: {
            SENTRY_ENV: this.awsStage,
            SENTRY_RELEASE: `${this.prefix}-${this.awsStage}`,
            SENTRY_DSN: Context.tryGetContext<string>(
                this.node,
                'sentryDSN'
            ),
        },
    }
);

Yes, they are all pinned @8.41.0 in package-lock.json.

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 4, 2024
@mydea mydea added Feature: Profiling Package: profiling-node Issues related to the Sentry Profiling Node SDK labels Dec 5, 2024
@Lms24
Copy link
Member

Lms24 commented Dec 5, 2024

Correct me if I'm wrong but it looks like you're using some kind of wrapper around your lambda function handler? Is this a library or some framework? Any chance you could provide a minimal reproduction for this?

Also gonna cc @JonasBa - have you seen this error before by chance?

@jbreckel
Copy link
Author

jbreckel commented Dec 6, 2024

The only wrapper is the sentry wrapHandler.
I will try to get a minimal example setup

@jbreckel
Copy link
Author

jbreckel commented Dec 6, 2024

https://github.com/jbreckel/sentry-profiling-reproduction
deployed and executed the lambda a few times, still get the same issue.

I may be using the wrong .node file but I am unsure how to verify that

@Lms24
Copy link
Member

Lms24 commented Dec 6, 2024

We'll look at this, thanks for the reproduction! I have to admit I've never seen a lambda definition like this (though my AWS knowledge is limited as well) and the fact that it mentions "bundling" worries me a bit with opentelemetry instrumentation. This shouldn't cause the issue with profilling though. @JonasBa has the most context around profiling so hopefully we get to this soonish.

For the moment, you should be good if you remove the profiling integration.

@JonasBa
Copy link
Member

JonasBa commented Dec 10, 2024

@jbreckel @Lms24 I seem to have made a mistake in the docs. The correct method should be startProfiler. Let me do a quick search over our docs and update this!

@jbreckel
Copy link
Author

@jbreckel @Lms24 I seem to have made a mistake in the docs. The correct method should be startProfiler. Let me do a quick search over our docs and update this!

how is this related?
I am not calling startProfiling manually, but rely on the nodeProfilingIntegration, as you can see in the repro-repo.

Is there an update required as well?

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 12, 2024
@Lms24
Copy link
Member

Lms24 commented Dec 12, 2024

Hmm ok so digging a bit into this, I suspect it is indeed related to the loading of the .node binary. It seems like the SDK can load a binary but the startProfiling method is not available or somehow not a function? I'm not sure why this would happen but it does seem like it's possible, especially since we're just typecasting what we require. We could guard this but ultimately that's just applying a bandaid without knowing why it's broken. @JonasBa thoughts?

@jbreckel have you tried if removing the sentry-prune-profiler-binaries command changes anything?

@jbreckel
Copy link
Author

jbreckel commented Dec 12, 2024

does not change anything.
if it does not find the correct binary it should also throw another error, right?

I was also confused about that part, maybe there was an adjustment in the functionname and it is no longer called startProfiling but startProfiler as @JonasBa mentioned?

edit:
not sure if I understand it correctly, but shouldn't there be an error thrown around here:

napi_value start_profiling;
if (napi_create_function(env, "startProfiling", NAPI_AUTO_LENGTH,
StartProfiling, exports,
&start_profiling) != napi_ok) {
napi_throw_error(env, nullptr, "Failed to create startProfiling function.");
return NULL;
}

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 12, 2024
@Lms24
Copy link
Member

Lms24 commented Dec 12, 2024

I don't know how this works in detail tbh or if/how you can read the cpp debug logs. However, if I understand the linked code correctly, there actually is a chance that startProfiling is not a function but null? If so, we should really guard invoking it.

@Lms24
Copy link
Member

Lms24 commented Dec 12, 2024

I opened #14676 to guard the invocation of the functions with a typeof check. This only brings us halway there though, as I still don't understand why the native functions sometimes aren't loaded correctly.

@JonasBa
Copy link
Member

JonasBa commented Dec 12, 2024

@jbreckel the native method is startProfiling, because either of the two profiling modes uses the same native method, however the JS exposed one is startProfiler when using continuous profiling mode.

I think what ends up happening here is that we fail to expose the binding and you are correct that an error should be thrown. Are you seeing any errors being throw here that would correspond to the missing startProfiling call?

@jbreckel would you mind sharing what platform or provider are you running this on? I might take a look if there are similar issues that others have reported

@jbreckel
Copy link
Author

jbreckel commented Dec 12, 2024 via email

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 12, 2024
@JonasBa
Copy link
Member

JonasBa commented Dec 12, 2024

@jbreckel I was going to ask where do you deploy the infra? Is it AWS lambda? The reason I ask is that in some cases, the runtime can be backed by a different engine that might not be compatible with everything we require and I wanted to see if whatever provider you use has some documentation around native binaries

Lms24 added a commit that referenced this issue Dec 12, 2024
…4676)

Add a guard to check that the native `startProfiling` and
`stopProfiling` APIs are actually available and functions. This came up
in #14561 where it seems like they're not always loaded correctly. For
reasons unknown to me.

ref #14561
@Lms24
Copy link
Member

Lms24 commented Dec 12, 2024

I merged #14676 which will go out with version 8.45.0 soon. While this doesn't fix the root cause, it should keep your lambda from throwing the error as it will simply bail out if the binary wasn't loaded correctly.

@getsantry getsantry bot moved this to Waiting for: Community in GitHub Issues with 👀 3 Dec 12, 2024
@jbreckel
Copy link
Author

jbreckel commented Dec 16, 2024

@jbreckel I was going to ask where do you deploy the infra? Is it AWS lambda? The reason I ask is that in some cases, the runtime can be backed by a different engine that might not be compatible with everything we require and I wanted to see if whatever provider you use has some documentation around native binaries

we are using aws lambda just as the example repo shows. also without the pruning of binaries the error continues to randomly appear.
I tried to figure out if I could find any information around those binaries but I did not find much except for references of people using the pruning script

I merged #14676 which will go out with version 8.45.0 soon. While this doesn't fix the root cause, it should keep your lambda from throwing the error as it will simply bail out if the binary wasn't loaded correctly.

thanks!

@getsantry getsantry bot moved this from Waiting for: Community to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature: Profiling Package: node Issues related to the Sentry Node SDK Package: profiling-node Issues related to the Sentry Profiling Node SDK
Projects
Status: No status
Development

No branches or pull requests

4 participants