Bump up test JVM max heap from 2GB to 2.5GB #17056

Merged
merged 1 commit into from
Sep 13, 2024

Conversation

abhishekrb19
Contributor

A few CI jobs have been consistently failing on master due to OOMs. Specifically, some tests in the Unit & Integration tests CI / unit tests (jdk17, sql-compat=true) / other_modules_test / other modules test (push) job are failing (see the error below).

The heap dump shows global references being retained that appear to trace back to Mockito. One candidate is the use of Mockito spy objects that somehow hold global references and never get garbage collected, but the root cause is not yet evident. The heap dump artifact can be found in any of the recent failed jobs.

In the meantime, increase the test JVM max heap from 2GB to 2.5GB to see if it alleviates the ongoing test failures.

Warning:  Tests run: 158, Failures: 0, Errors: 0, Skipped: 6, Time elapsed: 46.51 s -- in org.apache.druid.msq.test.CalciteArraysQueryMSQTest
[INFO] Running org.apache.druid.msq.test.CalciteSelectJoinQueryMSQTest$SortMergeTest
Warning:  Tests run: 586, Failures: 0, Errors: 0, Skipped: 85, Time elapsed: 131.9 s -- in org.apache.druid.msq.test.CalciteSelectJoinQueryMSQTest$SortMergeTest
[INFO] Running org.apache.druid.msq.test.CalciteSelectQueryMSQTest
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /home/runner/work/druid/druid/target/java_pid22365.hprof ...
Heap dump file created [3049991269 bytes in 9.662 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="chmod 644 /home/runner/work/druid/druid/target/*.hprof"
#   Executing /bin/sh -c "chmod 644 /home/runner/work/druid/druid/target/*.hprof"...
Terminating due to java.lang.OutOfMemoryError: Java heap space

A few failed jobs:

  1. https://github.com/apache/druid/actions/runs/10834038415/job/30068115361
  2. https://github.com/apache/druid/actions/runs/10824552673/job/30035411027
  3. https://github.com/apache/druid/actions/runs/10835792545/job/30073489146
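
To confirm which heap limit a test JVM actually runs with, a small check like the one below may help. It is only a sketch: the class name `HeapCheck` and the helper `parseHeapSize` are hypothetical and not part of Druid, and the sample values `2g` and `2560m` are assumed stand-ins for the old and new limits, not the exact flags changed in this PR. It relies only on `Runtime.getRuntime().maxMemory()` and the size suffixes accepted by the JVM's `-Xmx` option.

```java
// Hypothetical helper, not part of Druid: reports the heap limit the JVM is running with
// and converts -Xmx style values to bytes for comparison.
final class HeapCheck {

    /** Parses a size written like an -Xmx value: "2g", "2560m", "512k", or plain bytes. */
    static long parseHeapSize(String value) {
        String v = value.trim().toLowerCase();
        char unit = v.charAt(v.length() - 1);
        long multiplier;
        switch (unit) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: return Long.parseLong(v); // no suffix: value is already in bytes
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }

    public static void main(String[] args) {
        // maxMemory() is the most heap this JVM will attempt to use; it is usually
        // close to, but not always exactly equal to, the configured -Xmx value.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.printf("JVM max heap: %d MiB%n", maxHeap / (1024 * 1024));

        // Assumed example values for the old and new limits.
        long before = parseHeapSize("2g");
        long after = parseHeapSize("2560m");
        System.out.printf("Extra headroom: %d MiB%n", (after - before) / (1024 * 1024));
    }
}
```

Running it with the same JVM arguments as the test fork shows whether the new limit is being picked up. Note that a larger heap only buys headroom; if references are being leaked, usage will keep growing with the number of tests.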

Member

@kgyrtkirk kgyrtkirk left a comment

I see this as well in some PRs
+1

@abhishekagarwal87 abhishekagarwal87 merged commit 7a0d7d1 into apache:master Sep 13, 2024
91 checks passed
@kgyrtkirk
Member

I don't think this is normal:
there are 4503 ObjectMappers in one of the heap dumps, and plenty of HashMaps with ControllerImpl.
We have an issue pretty similar to this one: mockito/mockito#2503

@abhishekrb19 abhishekrb19 deleted the bump_up_test_jvm branch September 13, 2024 13:31
pranavbhole pushed a commit to pranavbhole/druid that referenced this pull request Sep 17, 2024
3 participants