Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix transaction performance collector oom #2505

Merged
merged 8 commits into from
Feb 3, 2023

Conversation

stefanosiano
Copy link
Member

📜 Description

PerformanceCollectionData now contains single values instead of lists
ICollectors now work on a single instance of PerformanceCollectionData instead than a list
TransactionPerformanceCollector:

  • is now set in Android options
  • a NoOpTransactionPerformanceCollector is set in Java options
  • deletes data after 30 seconds
  • collect data only if a profile is sampled

💡 Motivation and Context

We collect data about cpu and memory usage during transactions. This caused oom error in java backend.
As a workaround, the data collection was disabled.
We are now enabling the data collection only on the Android SDK, and only if a profile is sampled, since we send this data only in profiles, at the moment.
Several updates were put to reduce the memory footprint of the collected data

💚 How did you test it?

Unit test

📝 Checklist

  • I reviewed the submitted code
  • I added tests to verify the changes
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled
  • I updated the docs if needed
  • No breaking changes

🔮 Next steps

ICollectors now work on a single instance of PerformanceCollectionData instead than a list, to improve performance on lots of concurrent transactions
TransactionPerformanceCollector:
 is now set in Android options
 a NoOpTransactionPerformanceCollector is set in Java options
 deletes data after 30 seconds
 collect data only if a profile is sampled
ITransactionProfiler now works on a list of PerformanceCollectionData
@github-actions
Copy link
Contributor

github-actions bot commented Jan 31, 2023

Performance metrics 🚀

  Plain With Sentry Diff
Startup time 321.17 ms 329.25 ms 8.08 ms
Size 1.73 MiB 2.33 MiB 623.10 KiB

Baseline results on branch: main

Startup times

Revision Plain With Sentry Diff
c1399c1 345.06 ms 385.49 ms 40.43 ms
5fa24ec 326.29 ms 384.53 ms 58.24 ms
14c083a 350.82 ms 388.86 ms 38.04 ms

App size

Revision Plain With Sentry Diff
c1399c1 1.73 MiB 2.33 MiB 620.61 KiB
5fa24ec 1.73 MiB 2.33 MiB 620.61 KiB
14c083a 1.73 MiB 2.33 MiB 620.61 KiB

Previous results on branch: fix/transaction-performance-collector-oom-fix

Startup times

Revision Plain With Sentry Diff
20490da 253.57 ms 325.00 ms 71.43 ms
70f3d95 329.92 ms 346.80 ms 16.88 ms
7b28eb6 370.74 ms 427.53 ms 56.79 ms
ddb73f9 412.00 ms 422.18 ms 10.18 ms

App size

Revision Plain With Sentry Diff
20490da 1.73 MiB 2.33 MiB 623.07 KiB
70f3d95 1.73 MiB 2.33 MiB 623.06 KiB
7b28eb6 1.73 MiB 2.33 MiB 623.06 KiB
ddb73f9 1.73 MiB 2.33 MiB 623.07 KiB

@codecov
Copy link

codecov bot commented Jan 31, 2023

Codecov Report

Base: 80.18% // Head: 80.18% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (df61be9) compared to base (7da2a31).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #2505   +/-   ##
=========================================
  Coverage     80.18%   80.18%           
+ Complexity     3943     3942    -1     
=========================================
  Files           323      323           
  Lines         14896    14891    -5     
  Branches       1966     1964    -2     
=========================================
- Hits          11944    11940    -4     
  Misses         2178     2178           
+ Partials        774      773    -1     
Impacted Files Coverage Δ
...io/sentry/NoOpTransactionPerformanceCollector.java 100.00% <ø> (ø)
...c/main/java/io/sentry/NoOpTransactionProfiler.java 100.00% <ø> (ø)
...sentry/DefaultTransactionPerformanceCollector.java 92.72% <100.00%> (+2.16%) ⬆️
...y/src/main/java/io/sentry/JavaMemoryCollector.java 100.00% <100.00%> (ø)
...main/java/io/sentry/PerformanceCollectionData.java 100.00% <100.00%> (ø)
sentry/src/main/java/io/sentry/SentryOptions.java 79.46% <100.00%> (+0.09%) ⬆️
sentry/src/main/java/io/sentry/SentryTracer.java 91.95% <100.00%> (+0.06%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@stefanosiano stefanosiano marked this pull request as ready for review February 1, 2023 09:55
Copy link
Member

@markushi markushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, please have a look at my comments!

// This way we avoid issues caused by having multiple cpu or memory collectors.
tempData.commitData();

for (List<PerformanceCollectionData> data : performanceDataMap.values()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: This should probably be thread-safe right? As a transaction could be stopped and removed from the list, while collection is running.

Copy link
Member Author

@stefanosiano stefanosiano Feb 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performanceDataMap is a ConcurrentHashMap, so no concurrentModifications occur.
I previously put a synchronized block, but that caused problems when there are a lot of transactions (100k+). It's an edge case, and only for backend, which would have performance collection disabled anyway, but I removed that and relied on the ConcurrentHashMap.
Does it make sense?

@adinauer
Copy link
Member

adinauer commented Feb 2, 2023

Let me know if you need my help in testing if OOM is solved.

Copy link
Member

@markushi markushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@stefanosiano
Copy link
Member Author

Let me know if you need my help in testing if OOM is solved.

Thanks @adinauer! I tried with your commands, and i found that after forcing a gc the ram went back to normal.
Feel free to try, just remember that it's still disabled for the backend. So you'd need to set the DefaultTransactionPerformanceCollector in the options and remove the && Boolean.TRUE.equals(isProfileSampled()) from this line

@stefanosiano stefanosiano merged commit f6a135d into main Feb 3, 2023
@stefanosiano stefanosiano deleted the fix/transaction-performance-collector-oom-fix branch February 3, 2023 11:12
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants