Log ingestion FetchError #128

Open
JannikWempe opened this issue Jun 28, 2023 · 16 comments

Comments

@JannikWempe

JannikWempe commented Jun 28, 2023

Hi folks 👋🏼

We are using next-axiom@0.17.0. Hosting on Vercel with the Axiom integration installed.

We are seeing a lot of FetchErrors (reason: write EPROTO 139882530146240:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:331) when sending to https://vercel-vitals.axiom.co/api/v1/send?configurationId=icfg_[OMITTED]&projectId=[OMITTED]&type=logs from getStaticProps (triggered by an on-demand revalidation call, not at build time).

image

What could be the cause of this? I am not even sure whether it is an issue on our side or on Axiom's. 🤔
I can't find anything for using Axiom in getStaticProps.

Any hints or tips?

PS: I have also asked this in Discord.

EDIT
This is the high level code if it helps:

export function withIsr<
  // ...
>(handler: IsrGetStaticProps<P, Q, D>, options: WithIsrOptions): GetStaticProps<P, Q, D> {
  return async (context) => {
    // ...
    const logger = new Logger(
      {
        // ...
      },
      undefined,
      false,
      'lambda',
    );

    // ...

    try {
      const result = await handler(extendedContext);
      await logger.flush();
      return result;
    } catch (error) {
      logger.error('Error in ISR getStaticProps', { error });
      await logger.flush();
      throw error;
    }
  };
}

@devj3ns

devj3ns commented Jul 3, 2023

I am having the same issue with next-axiom@0.17.0 (Techstack: Next.js, tRPC and Vercel).

FetchError: request to https://vercel-vitals.axiom.co/api/v1/send?configurationId=censored&projectId=censored&type=logs failed, reason: write EPROTO 140399046711232:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:331:
    at ClientRequest.<anonymous> (/var/task/node_modules/next/dist/compiled/node-fetch/index.js:1:65756)
    at ClientRequest.emit (node:events:513:28)
    at TLSSocket.socketErrorListener (node:_http_client:494:9)
    at TLSSocket.emit (node:events:513:28)
    at emitErrorNT (node:internal/streams/destroy:157:8)
    at emitErrorCloseNT (node:internal/streams/destroy:122:3)
    at processTicksAndRejections (node:internal/process/task_queues:83:21) {
  type: 'system',
  errno: 'EPROTO',
  code: 'EPROTO'
}

@dasfmi
Collaborator

dasfmi commented Jul 5, 2023

I am wondering if it's because the runtime is shut down while the HTTP client is trying to handshake with Axiom. Can you try to use await log.flush() before your function returns and see if that solves it?
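
A minimal sketch of the suggestion above, under stated assumptions: the `Flushable` interface and the `withFlush` helper are hypothetical names introduced only for illustration, and any logger with an async `flush()` method would fit. The idea is to always await the flush before the function settles, so buffered logs are sent before the serverless runtime can be suspended.

```typescript
// Hypothetical minimal logger shape, assumed here so the sketch is
// self-contained. It is not taken from the next-axiom source.
interface Flushable {
  flush(): Promise<void>;
}

// Runs `work`, then always awaits logger.flush() before settling,
// whether `work` resolved or threw.
async function withFlush<T>(
  logger: Flushable,
  work: () => Promise<T>,
): Promise<T> {
  try {
    return await work();
  } finally {
    try {
      await logger.flush();
    } catch (flushError) {
      // A failed flush should not mask the handler's own result or error.
      console.warn('log flush failed', flushError);
    }
  }
}
```

A handler would then wrap its body, for example `withFlush(logger, () => handler(context))`. Swallowing the flush error is a design choice for this sketch: it keeps a logging failure from failing the request, at the cost of hiding lost logs unless the warning is monitored.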

@ValentinH

ValentinH commented Jul 17, 2023

We had the same error when using Node 16. We upgraded to Node 18 and we are now seeing a new error (that is most probably due to Axiom):

TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11457:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: ConnectTimeoutError: Connect Timeout Error
      at onConnectTimeout (node:internal/deps/undici/undici:8422:28)
      at node:internal/deps/undici/undici:8380:50
      at Immediate._onImmediate (node:internal/deps/undici/undici:8411:13)
      at process.processImmediate (node:internal/timers:476:21)
      at process.topLevelDomainCallback (node:domain:161:15)
      at process.callbackTrampoline (node:internal/async_hooks:128:24) {
    code: 'UND_ERR_CONNECT_TIMEOUT'
  }
}

We are also hosted on Vercel

@JannikWempe
Author

> We had the same error when using Node 16. We upgraded to Node 18 and we are now seeing a new error (that is most probably due to Axiom):
>
> TypeError: fetch failed
>     at Object.fetch (node:internal/deps/undici/undici:11457:11)
>     at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
>   cause: ConnectTimeoutError: Connect Timeout Error
>       at onConnectTimeout (node:internal/deps/undici/undici:8422:28)
>       at node:internal/deps/undici/undici:8380:50
>       at Immediate._onImmediate (node:internal/deps/undici/undici:8411:13)
>       at process.processImmediate (node:internal/timers:476:21)
>       at process.topLevelDomainCallback (node:domain:161:15)
>       at process.callbackTrampoline (node:internal/async_hooks:128:24) {
>     code: 'UND_ERR_CONNECT_TIMEOUT'
>   }
> }
>
> We are also hosted on Vercel

Same here. We also just recently updated to Node 18 and now see the same error.

@dasfmi
Collaborator

dasfmi commented Jul 18, 2023

Can one of you provide a minimal reproducible example? Is this a one-time error or is it consistent?

@ValentinH

On our side, it's happening multiple times per day (multiple times per hour sometimes).
Our functions still return fine so I'm guessing that the error could happen when Axiom is trying to emit the logs while the function is shutting down.

Regarding the minimal reproducible example, it's pretty hard as the error seems to be random.

@trevorharwell

I've been experiencing this error for months now (it started the day I set up the Axiom integration). It seems entirely harmless, aside from my logs being riddled with these error messages. And yes, it does seem to be random.

I asked about it in discord back in March: https://discord.com/channels/1065957163161370664/1065957163933114411/1082348198959534170

It would be really great if axiom could find a fix for this.

@vajdagabor

vajdagabor commented Aug 28, 2023

I am seeing this error in my logs too. Reinitializing the integration didn't solve the problem. I have recently upgraded to Next.js 13 (currently v13.4.16) and next-axiom v0.18. My system works, but I am seeing these errors in the production log from time to time (5-10 times daily). They are all coming from my API routes. The system is deployed on Vercel.

TypeError: fetch failed at Object.fetch (node:internal/deps/undici/undici:11576:11)

Probably related:

@haydn
Contributor

haydn commented Aug 31, 2023

Probably also related:

There are some suggestions in there that the issue might be DNS related. That could explain its very inconsistent nature.

@ImLunaHey
Contributor

@vajdagabor what is the cause field showing for that error? It'll show the actual issue you're hitting.
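
For anyone unsure how to surface that field: a small sketch, assuming the error follows the standard `cause` convention seen in the stack traces in this thread. The helper name `describeCauseChain` is hypothetical. It walks the chain and returns one line per level, including the `code` property when present.

```typescript
// Walks the `cause` chain of an error and returns one summary line per level.
// `maxDepth` guards against accidental cycles in the chain.
function describeCauseChain(error: unknown, maxDepth = 5): string[] {
  const lines: string[] = [];
  let current: unknown = error;
  for (let depth = 0; depth < maxDepth && current != null; depth++) {
    if (!(current instanceof Error)) {
      // Non-Error causes (strings, plain objects) end the chain.
      lines.push(String(current));
      break;
    }
    // `code` and `cause` are read defensively; not every error has them.
    const err = current as Error & { code?: unknown; cause?: unknown };
    const suffix = typeof err.code === 'string' ? ` [${err.code}]` : '';
    lines.push(`${err.name}: ${err.message}${suffix}`);
    current = err.cause;
  }
  return lines;
}
```

Logging `describeCauseChain(error).join(' <- ')` in a `catch` block gives the underlying network error (for example a timeout code) instead of only the generic `fetch failed` message.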

@vajdagabor

@ImLunaHey this is the full message, including cause (IP address at Vercel redacted with ********):

TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11576:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: Error: connect ETIMEDOUT ********:443
      at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1495:16)
      at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
    errno: -110,
    code: 'ETIMEDOUT',
    syscall: 'connect',
    address: '********',
    port: 443
  }
}

@ImLunaHey
Contributor

Possibly related nodejs/undici#1531

Are any of you able to reproduce this on node 20?

@vajdagabor

@ImLunaHey Node v18.x is the highest version currently available on Vercel (and this is also the recommended setting, as v16.x and below are deprecated).

@ImLunaHey
Contributor

ImLunaHey commented Sep 14, 2023

@ValentinH

I'd like to share an update as we are still experiencing this error and yesterday, I noticed that some logs were missing even though I'm sure that the corresponding functions were executed correctly.

First, we are using Node 20.x on Vercel.
Here are the logs we can see:
image
and here's the proof that these logs are related to Axiom:
image

This error doesn't seem to only be affecting Axiom as per this issue: vercel/vercel#11692

However, I think that it would be great to have a retry strategy when this error happens. What do you think?

Not being able to trust our logs is a deal breaker. We look like amateurs when we have to tell our stakeholders "we don't know what happened" while debugging production issues with missing logs 😭
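
To make the retry idea concrete, here is a rough sketch under stated assumptions. `sendWithRetry` and its `send` parameter are hypothetical: `send` stands in for whatever performs the ingest request, and none of this is existing next-axiom behavior. It retries a failed send with exponential backoff and rethrows the last error once the attempts are exhausted.

```typescript
// Retries `send` with exponential backoff: baseDelayMs, 2x, 4x, ...
// `send` is a placeholder for the ingest request, not a next-axiom API.
async function sendWithRetry<T>(
  send: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await send();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Wait before the next attempt; no delay after the final failure.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Two caveats with this approach: the retries must finish before the function returns, so they add to its duration, and a request that timed out on the client may still have reached the server, so retried batches could be ingested twice.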

@ValentinH

I also just found these logs:
image
