Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Beacon sender causes flood of thousands of requests on page unload #2195

Closed
aKzenT opened this issue Nov 3, 2023 · 10 comments
Closed

Comments

@aKzenT
Copy link

aKzenT commented Nov 3, 2023

Description/Screenshot
image

Sometimes the AI SDK causes 10s of thousands of requests when the page is getting unloaded, which causes the page to become very unresponsive. Upon further investigation the problem seems to arise from the _beaconSender function in Sender.ts and specifically the fallback logic if the initial beacon send was not successful:
https://github.com/microsoft/ApplicationInsights-JS/blob/19e8131c7f979a0c4ec2069a56ea7e37a0d82c15/channels/applicationinsights-channel-js/src/Sender.ts#L865C22-L865C35

As I understand the function, it will try to split up the payload by using the payload.oriPayload array and sending the items of that array individually. However the for loop uses data.length instead of oriPayload.length as the index bound. data is however a string representation of the payload, so the loop basically iterates over each character of that payload, which in our case is multiples of 10k bytes. Also within the loop, the variable thePayload is initialized using payload[lp] instead of oriPayload[lp].

Steps to Reproduce
Create a payload that cannot be send completely using the beacon API.

  • OS/Browser: Chrome 118
  • SDK Version [e.g. 22]: 3.0.3

Expected behavior
Payload is split and send successfully.

Additional context
Add any other context about the problem here.

@MSNev MSNev added this to the 3.0.x milestone Nov 3, 2023
@MSNev
Copy link
Collaborator

MSNev commented Nov 3, 2023

yikes! good catch.

This scenario only occurs during unload and only when the full sendBeacon request returns false (it cannot batch the events to be sent). Because ApplicationInsights uses a SessionBuffer to try and remember unsent and no response received events between page loads (assumes the user is not closing the browser). You can "work around" this issue in the short term by disabling the usage of this API by setting the useSendBeacon: false in the config passed into the initialization (the config with your connectionString / instrumentationKey)

@MSNev
Copy link
Collaborator

MSNev commented Nov 3, 2023

At this point the next scheduled release (because of the upcoming holidays) will not be until early next year (Feb '24).
Would the above workaround suffice until that point in time?

@MSNev
Copy link
Collaborator

MSNev commented Nov 3, 2023

Hmm, however, looking at the code history, it appears that this bug was not introduced until v3.0.4 (so 3.0.3 doesn't have the same issue)...

for (let lp = 0; lp < payload.length; lp++) {

So this might just be a case that you have a backlog of events all of which cannot be sent (because the sendBeacon API has a 64kb limitation)

@aKzenT
Copy link
Author

aKzenT commented Nov 3, 2023

You are right. That's my mistake. I was looking at our package.json without realizing that our package-lock.json actually references version 3.0.4. So we are on 3.0.4.

As a workarround, we will downgrade to 3.0.3 until the bug is fixed.

Thanks!

@ReinisV
Copy link

ReinisV commented Nov 16, 2023

for people, who are integrating the sdk by injecting a script tag in the header of the page provided here:
https://learn.microsoft.com/en-us/azure/azure-monitor/app/javascript-sdk?tabs=javascriptwebsdkloaderscript

rather than using the npm version (because supposedly that nets better tracking, as the sdk would be loaded sooner)
The downgrade seems to be to specify a different version of the SDK by providing a different value for the src property.
https://github.com/microsoft/ApplicationInsights-JS#snippet-configuration-options

currently it is
https://js.monitor.azure.com/scripts/b/ai.3.gbl.min.js
but it is possible to include minor version in the url, so we can get
https://js.monitor.azure.com/scripts/b/ai.3.0.3.gbl.min.js
to work.

@MSNev would like to get a confirmation on if this is how this issue should be worked around until this bug is fixed.

@MSNev
Copy link
Collaborator

MSNev commented Nov 16, 2023

Correct, this is discussed (not quite so succinctly as above) here in the main readme https://github.com/Microsoft/ApplicationInsights-JS#module-formats

Additional details
From a timing perspective, when we released to npm we also fully publish the full version of that release to the CDN (so that you could upgrade to the new version immediately), and then we slowly deploy (promote / copy) the full version to the short names (ai.3.gbl.min.js and ai.3.0.gbl.min.js) over the next few weeks (we put the deployment plan in each Milestone).
The "short" versions of the files have a cache period of 30 mins, so that when we promote the next release everyone will pick up the new bits withing 30 mins, while the fully qualified version numbers have a 1yr cache period.

@Mohamed-Ahmed-Abdullah
Copy link

I downgraded to @microsoft/applicationinsights-web": "^3.0.3" & "3.0.2" but I still see excessive number of requests. Any other workaround?

@Dequim
Copy link

Dequim commented Nov 24, 2023

Hello,
It looks like the requests are spawning during flush. Calling the flush before unload would solve the problem?

@MSNev
Copy link
Collaborator

MSNev commented Dec 4, 2023

Calling the flush before unload would solve the problem?

Yes and No, if you call flush this will attempt to send the requests via a XMLHttpRequest, however, this can have several side-effects

  • If the SDK does managed to send the network request before any of the "unload" handlers (beforeunload, unload etc) are detected and the page successfully navigates away, then the network request may actually get cancelled with or without sending the requests. Because by default the SDK uses session storage to "track" events that do not have any response this (may) cause event duplication. And if the user actually closed the browser (not just refreshed the page), then the events will be lost
  • If however, one of the unload event has already been received (ie. you are calling flush within a beforeunload event handler), then is extremely likely that the events will still be sent using sendBeacon and you will still have the same issue.
  • Likewise, if you call flush telling it to send the request(s) as synchronously

@MSNev
Copy link
Collaborator

MSNev commented Dec 4, 2023

I downgraded to @microsoft/applicationinsights-web": "^3.0.3" & "3.0.2" but I still see excessive number of requests. Any other workaround?

Looking back at the history, it appears that the PR that introduced this change is #2113, which first appears in v3.0.4, so downgrading to either 3.0.3 or earlier should be fine.

This situation only occurs when the "final" sendBeacon call is rejected by the browser, so we then attempt to send as much as possible by splitting the batched events into individual requests. So it's not always deterministic to reproduce as it depends on the total outstanding event payload that the browser has not yet sent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants