-
-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added churn testsuite #5101
Added churn testsuite #5101
Conversation
Grinder with disabled compressed oops https://ci.adoptium.net/view/Test_grinder/job/Grinder/8948/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to put a request changes on this PR, so it doesn't get merged before all your questions get answered. I have added some review comments in-line, and will answer some other questions shortly.
My immediate question is what happens when this testcase is run against distributions that do not ship with certain GCs. This likely also relates to your question about running each GC as a separate testcase.
functional/churn/build.xml
Outdated
<property name="DEST" value="${BUILD_ROOT}/functional/churn" /> | ||
<property name="src" location="./churn" /> | ||
|
||
<target name="THC.check"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opportunity to generalize the name to testcodeDir.check and testCodeDir.exists (which could also be made into an ant macrodef and moved to TKG, but not required for this PR), no need to use same ant target name as the TestHeadlessComponents ant tasks
functional/churn/playlist.xml
Outdated
--> | ||
<playlist xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../../TKG/resources/playlist.xsd"> | ||
<test> | ||
<testCaseName>churn</testCaseName> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest calling it churnGC_5min and creating a 2nd testTargetName called churnGC_4hr and setting the DURATION accordingly for both.
functional/churn/playlist.xml
Outdated
</groups> | ||
</test> | ||
<test> | ||
<testCaseName>churn-nocoops</testCaseName> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a mechanism to automatically run all test targets twice, once with compressedrefs JVM option, and a second time as noncompressedrefs JVM option.
I have to look closer at what NOCOMP does in the churn code, is it used for other decision-making within the test code to best advise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is propagating it to the logs name on top level - https://github.com/rh-openjdk/churn/blob/master/run.sh#L155, and setting the appropriate flags on the low level - https://github.com/rh-openjdk/churn/blob/master/bin/nocoops.sh
TYVM! will elaborate tomorrow morning CET. |
It seems I'm handling resutls correctly. The only disadvantage of run in grinder is, that logs got packed second time. I will fix this in the usptream runner. |
Hello! Bit more bussy day then expected:(
|
re: #5101 (comment) - great, chat with you tomorrow about final steps |
pls do not merge (yet). The 050fb07 is missbehavg in my test setup. But maybe my testsetup is wrong |
re: #5101 (comment) - I will convert this to Draft state, until all reviews and testing is completed (that way it can not get merged prematurely) |
The 050fb07 , overwritten as 2249bc0 now works. Am now updating readme to reflect the state. Will lift draft (thanx for setting it up) once done Two new questions
|
hmm.. https://ci.adoptium.net/view/Test_grinder/job/Grinder/8994/tapResults/ they are here! thoughts? I'm not sure what I did... (search for churn.tap or for eg shenandoah) |
re: #5101 (comment) - you did not need to do anything, our test pipelines always generate TAP files that include all test targets ( all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review comments and suggestions are added inline
functional/churn/playlist.xml
Outdated
<command> | ||
export OTOOL_JDK_VERSION="$(JDK_VERSION)" ; \ | ||
export JREJDK="jdk" ; \ | ||
export OTOOL_garbageCollector="${CHURN_GCS}" ; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
export OTOOL_garbageCollector="${CHURN_GCS}" ; \ | |
export OTOOL_garbageCollector="${APPLICATION_OPTIONS}" ; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand this requirements:
- export OTOOL_garbageCollector="${CHURN_GCS}" ; \
+export OTOOL_garbageCollector="${APPLICATION_OPTIONS}" ; \
...
- export DURATION="${CHURN_DURATION}" ; \
+ export DURATION="300" ; \
The APPLICATION_OPTIONS is something what is custom in aqavit? Isn't it some more general setup?
The "zgc" or "shenandaoh" can be hardly considered as generic.
To remove custom duration s moreover killing the purpise of the changeset and the suite. If you are developing custom GC or porting JDK to new paltform, you want to run churn on each GC you care about (usually default or the single custom one) for several hours. 300s is useles. Generally any hardcoded value in the "custom" setupo is (IMO) bad.
Can we stay with originals pelase? Or maybe an APPLICATION_OPTIONS contain key=value pairs? (but as GC can be space separated list of GCs, it sounds bad). Or maybe APPLICATION_OPTIONS_GC and APPLICATION_OPTIONS_DURATION ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing your touch to readme - is there eal fear, that some ccrazy eprson will start 20days long churn run? If so, then I would lke to still keep the duration adjsutable, and only eg limit it from top.
if such logic would be desirable, then also the churn_1m_allGCs and churn_5h_allGCs may actually take GC as parameter. And fallback to "ALL" if nothing provided. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see that I have asked that this churn_custom target is disabled (using the <disables>
block I suggested), in that way it does not get run during our automated runs.
Also in our world, no one is allowed to go onto test machines and set CHURN_DURATION and CHURN_GCS environment variables, and we would not want them to do that (for lots of reasons, not the least of which is transparency of machine config).
This leaves me to the point to look at this test target as a useful thing when someone wants to run a Grinder to test a particular GC for a short duration, which is why I suggest leveraging the existing APPLICATION_OPTIONS mechanism for pinpointed testing of a particular GC failure.
All that to be said, feel free to not take those review comments and apply them.
if such logic would be desirable, then also the churn_1m_allGCs and churn_5h_allGCs may actually take GC as parameter. And fallback to "ALL" if nothing provided. wdyt?
Yes, the other option is to remove the churn_custom target altogether from the playlist.xml file in this PR. That would be the expedient way to get this PR through.
Could consider then leveraging APPLICATION_OPTIONS to pass a subset of GCs and if its blank, expect the churn test suite to default to all.
functional/churn/README.md
Outdated
|
||
## Running via aqavit | ||
|
||
The support for compressed ops is handled by aqavit itself. It is currently not sure if churn will be able to honor it. If not, churn will be fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The support for compressed ops is handled by aqavit itself. It is currently not sure if churn will be able to honor it. If not, churn will be fixed. | |
The support for compressed ops is handled by AQAvit itself through the use of the `<variation>` tag in playlist.xml. It is currently not sure if churn will be able to honor it. If not, churn will be fixed. |
functional/churn/playlist.xml
Outdated
</platformRequirementsList> | ||
<groups> | ||
<group>system</group> | ||
</groups> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's restrict to particular vendors for now, as I am not sure which vendors include the entire list of GCs that are in 'allGCs' in their distribution (in particular shenandoah) and we do not want to break others.
Add this bit into the <test>
block, can be right after the </groups>
tag
<vendors>
<vendor>eclipse</vendor>
<vendor>redhat</vendor>
</vendors>
functional/churn/playlist.xml
Outdated
</platformRequirementsList> | ||
<groups> | ||
<group>system</group> | ||
</groups> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<vendors>
<vendor>eclipse</vendor>
<vendor>redhat</vendor>
</vendors>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add this element, But I think the generic openjdk should be for sure here. I now recall, there may be issue with shenandoah on jdk8, but I would rather workaround it in CHURN itself, then exclude generic oepnjdk. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had made the listing of GCs dynamic: rh-openjdk/churn@17f1b83
it is still based on known values, but at least will not enforce not-existing GC. Any advice on enabling more vendors? (compelltyh up to you :) tyvm!!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can onboard other vendors once we see this running for a few weeks at the project, we are just bringing this in now cautiously, but then can broaden it later (and we will do that by testing it against other vendor distros before enabling them).
Yes, we will add this test target with a
Yes, it will get run against our EA builds which get triggered whenever a new upstream tag is available (so approximately weekly for each JDK version). Since we are adding this into dev level, and system group, the 2 enabled test targets in your playlist, churn_1m_allGCs and churn_5h_allGCs, will get run in our |
Well, yes. With all pros and cons. There is usually one tap file, which consider the whole testsuite as on test. Then find properly what really run/failed inside is quite a detective task(but most liekly it is lack of my insight). Where the one default generated tap file is awesome feature, the suites usually generates much better outputs, and should be allowed to publish it. In this case, the tap file churn generates is pretty good, and as the tkg toolchain is picking up all taps, it is pretty usable (feel free to correct me, or tell me to not generate it). |
Thanx!
IIRC, shenandoah is enabled everywhere except jdk8 by default, so they would need to manually disable it during build. But why would anybody disable best GC around? In case of shenandoah, especially because of its 'not-present' in jdk8, I will make it more dynamic. I can exclude GCs from default list based on |
Yes, my thought with all of my review comments was to get the PR in and then come back later to adjust after we have run it for a while at the project. |
The TAP file is produced to 'normalize' test results from all suites. Having TAP files that do something different moves us away from normalization. For more granular information, we would look to test suites producing some sort of JUnit style output and then also present that. |
I see. The churn generates both tap file and packed junit file. So Your recomndation is to stop generating tap file? (no problem on my side, only I liked it :) ) |
Sure. Thank you very much! Highly appreciated.
wdyt? Do you se enay more obstacles? |
The tests might need to move to system/ folder instead of functional/ folder if we'd like to trigger the tests automatically as part of jenkins test builds of dev.system. It does work in grinder under functional/ folder as in grinder we can manually set buildList=functional/churn, however in jenkins dev.system jobs buildList=system. |
functional/churn/playlist.xml
Outdated
--> | ||
<playlist xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../../TKG/resources/playlist.xsd"> | ||
<test> | ||
<testCaseName>churn_1m_allGCs</testCaseName> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
churn_1m_allGCs is for testing only. When the PR is ready for merge can it be disabled by adding tag ?
Good catch, we started off with this in functional group, but because it is a stress/load test, it is more appropriate to add it to system, so better to shift the churn folder into the system folder. +1 |
Does the test require MVN? The output says |
Default build system is maven. But if it is not found, javac would do. So that is this case, it is compiled directly by javac. |
I see. It's an issue with window path and cygwin path.
|
For cygwin |
hopefully all s/functional/system/g done properly
Right yiou are, rh-openjdk/churn#24 was all waht was needed. Rerunning windows. |
^ no longer necessary.
I guess the disable of |
The new failure was related with migration from functional to system, I'm hitting the same issue with reproducible comparing tests. I can take a look next Monday. |
Gosh. Rerunning windows here: https://ci.adoptium.net/view/Test_grinder/job/Grinder/9038/ green! |
@sophia-guo TY! |
<mkdir dir="${DEST}"/> | ||
</target> | ||
|
||
<target name="dist" depends="getChurn" description="generate the distribution"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add init
task as depends depends="getChurn, init"
unless the non-exist directory will be created when doing the <copy todir="${DEST}">
? Not sure about this looking at the grinder the $DEST was created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @judovana ! I may come back for an enhancement on how we run the churn_custom target, but this is good as is.
functional/churn/playlist.xml
Outdated
</platformRequirementsList> | ||
<groups> | ||
<group>system</group> | ||
</groups> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can onboard other vendors once we see this running for a few weeks at the project, we are just bringing this in now cautiously, but then can broaden it later (and we will do that by testing it against other vendor distros before enabling them).
churn testsuite is powerful suite to test individual GCs in OpenJDK
As it is added heere, it tests all GCs in given version of jdk.
Initial pass is https://ci.adoptium.net/view/Test_grinder/job/Grinder/8946/
As it is now - for review process, it runs for some 20 seconds - that means aprox 4 seconds per GC, which i moreover useless. In GH actions (https://github.com/rh-openjdk/churn/actions) I run it on all platforms for aprox 3 minutes per GC, whcih is still moreover useless.
Exept running as it is, it can also run in disabled compressed oops mode. Not sure if it is required here. To run both, should be easy. To make it configurable.. not sure...
Few questions: