-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change telemetry devices to rely on jvm.config instead of ES_JAVA_OPTS #711
Conversation
Move telemetry devices that currently rely on ES_JAVA_OPTS from the launcher to the provisioner and instead persist the necessary information in config/jvm.options
Depends on elastic/rally-teams#28 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks already promising. :) I left a few comments.
esrally/mechanic/provisioner.py
Outdated
@@ -38,11 +38,22 @@ def local_provisioner(cfg, car, plugins, cluster_settings, all_node_ips, target_ | |||
node_root_dir = "%s/%s" % (target_root, node_name) | |||
|
|||
_, java_home = java_resolver.java_home(car, cfg) | |||
|
|||
node_telemetry_dir = "%s/telemetry" % node_root_dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should replace this with os.path.join(node_root_dir, "telemetry")
.
esrally/mechanic/provisioner.py
Outdated
@@ -165,6 +177,9 @@ def prepare(self, binary): | |||
# determine after installation because some variables will depend on the install directory | |||
target_root_path = self.es_installer.es_home_path | |||
provisioner_vars = self._provisioner_variables() | |||
|
|||
# add java options for telemetry devices | |||
provisioner_vars.update({"additional_java_settings" : self.telemetry.instrument_candidate_env(self.es_installer.car, self.es_installer.node_name)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move this to prepare()
? Also, a simpler option would be to write it like:
provisioner_vars["additional_java_settings"] = self.telemetry.instrument_candidate_env(self.es_installer.car, self.es_installer.node_name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, not sure what you mean, this snippet is already in prepare()..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should i move it to _provisioner_variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant self._provisioner_variables()
, yes. :)
esrally/mechanic/telemetry.py
Outdated
@@ -47,16 +47,12 @@ def __init__(self, enabled_devices=None, devices=None): | |||
self.devices = devices | |||
|
|||
def instrument_candidate_env(self, car, candidate_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we should rename this method because we now provide only Java options.
esrally/mechanic/telemetry.py
Outdated
@@ -47,16 +47,12 @@ def __init__(self, enabled_devices=None, devices=None): | |||
self.devices = devices | |||
|
|||
def instrument_candidate_env(self, car, candidate_id): | |||
opts = {} | |||
opts = [] | |||
for device in self.devices: | |||
if self._enabled(device): | |||
additional_opts = device.instrument_env(car, candidate_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly we should rename instrument_env
of the individual telemetry devices.
esrally/mechanic/telemetry.py
Outdated
@@ -190,7 +186,7 @@ def instrument_env(self, car, candidate_id): | |||
java_opts = self.java_opts(log_file) | |||
|
|||
self.logger.info("jfr: Adding JVM arguments: [%s].", java_opts) | |||
return {"ES_JAVA_OPTS": java_opts} | |||
return java_opts.split() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we just need to return java_opts
here? A Python list does not provide a split
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielmitterdorfer For jfr version of this method, I kept java_opts as string and then split it into list by white space, the code seemed simpler that way. But I can change it to use list from the beginning. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
java_opts = self.java_opts(log_file) for FlightRecorder still returns string, and here I split it into list. Do you think I should change java_opts to return list instead to make it more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should always return lists.
esrally/mechanic/telemetry.py
Outdated
return {"ES_JAVA_OPTS": "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation " | ||
"-XX:LogFile=%s -XX:+PrintAssembly" % log_file} | ||
return ["-XX:+UnlockDiagnosticVMOptions", "-XX:+TraceClassLoading", "-XX:+LogCompilation", | ||
"-XX:LogFile=%s", "-XX:+PrintAssembly" % log_file ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be:
"-XX:LogFile={}".format(log_file), "-XX:+PrintAssembly"
esrally/mechanic/telemetry.py
Outdated
return {"ES_JAVA_OPTS": "-Xloggc:%s -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps " | ||
"-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime " | ||
"-XX:+PrintTenuringDistribution" % log_file} | ||
return ["-Xloggc:%s", "-XX:+PrintGCDetails", "-XX:+PrintGCDateStamps", "-XX:+PrintGCTimeStamps", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly "-Xloggc:{}".format(log_file)
esrally/mechanic/telemetry.py
Outdated
else: | ||
# see https://docs.oracle.com/javase/9/tools/java.htm#JSWOR-GUID-BE93ABDC-999C-4CB5-A88B-1994AAAC74D5 | ||
return {"ES_JAVA_OPTS": "-Xlog:gc*=info,safepoint=info,age*=trace:file=%s:utctime,uptimemillis,level,tags:filecount=0" % log_file} | ||
return ["-Xlog:gc*=info,safepoint=info,age*=trace:file=%s:utctime,uptimemillis,level,tags:filecount=0" % log_file] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For newer code we should use #format()
instead of %
unless it is a hot code path.
@danielmitterdorfer Should I also add support for telemetry java options in docker provisioner in this PR or a different PR? |
IMHO this change should be consistent so you'd need to adapt the Docker provisioner as well. As I will start my vacation at the end of the day it would be best if @dliappis could pick this up as a reviewer. |
When I wrote this originally I was under the impression that we would already support this for Docker but we don't. Therefore there is no need to add this new functionality in this PR and we can defer this to a later point in time if needed. |
@ebadyano I will do a review as well and test this a bit, so let's not merge yet please until I've done that too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good and worked on some scenarios that I tried. I left a few PEP-8 related styling comments.
esrally/mechanic/provisioner.py
Outdated
self.es_installer.install(binary["elasticsearch"]) | ||
# we need to immediately delete it as plugins may copy their configuration during installation. | ||
self.es_installer.delete_pre_bundled_configuration() | ||
|
||
# determine after installation because some variables will depend on the install directory | ||
target_root_path = self.es_installer.es_home_path | ||
provisioner_vars = self._provisioner_variables() | ||
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for extra space on this empty line
esrally/mechanic/provisioner.py
Outdated
if java_opts: | ||
provisioner_vars["additional_java_settings"] = java_opts | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just 1 empty line is enough here.
esrally/mechanic/telemetry.py
Outdated
return {"ES_JAVA_OPTS": "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation " | ||
"-XX:LogFile=%s -XX:+PrintAssembly" % log_file} | ||
return ["-XX:+UnlockDiagnosticVMOptions", "-XX:+TraceClassLoading", "-XX:+LogCompilation", | ||
"-XX:LogFile={}".format(log_file), "-XX:+PrintAssembly"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This over-indented, see https://lintlyci.github.io/Flake8Rules/rules/E127.html; would be great if we could align.
esrally/mechanic/telemetry.py
Outdated
"-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime " | ||
"-XX:+PrintTenuringDistribution" % log_file} | ||
return ["-Xloggc:{}".format(log_file), "-XX:+PrintGCDetails", "-XX:+PrintGCDateStamps", "-XX:+PrintGCTimeStamps", | ||
"-XX:+PrintGCApplicationStoppedTime", "-XX:+PrintGCApplicationConcurrentTime", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar PEP-8 style comment here: https://lintlyci.github.io/Flake8Rules/rules/E127.html
tests/mechanic/telemetry_test.py
Outdated
"-XX:FlightRecorderOptions=disk=true,maxage=0s,maxsize=0,dumponexit=true," | ||
"dumponexitpath=/var/log/test-recording.jfr -XX:StartFlightRecording=defaultrecording=true", java_opts) | ||
"dumponexitpath=/var/log/test-recording.jfr", "-XX:StartFlightRecording=defaultrecording=true"], java_opts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment for indentation here: https://lintlyci.github.io/Flake8Rules/rules/E127.html
You can consider a syntax like the following as well to make it a bit cleaner, if you prefer:
self.assertEqual(
["-XX:+UnlockDiagnosticVMOptions", "-XX:+DebugNonSafepoints", "-XX:+UnlockCommercialFeatures", "-XX:+FlightRecorder",
"-XX:FlightRecorderOptions=disk=true,maxage=0s,maxsize=0,dumponexit=true,"
"dumponexitpath=/var/log/test-recording.jfr", "-XX:StartFlightRecording=defaultrecording=true"], java_opts)
tests/mechanic/telemetry_test.py
Outdated
self.assertEqual("-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints " | ||
"-XX:StartFlightRecording=maxsize=0,maxage=0s,disk=true,dumponexit=true,filename=/var/log/test-recording.jfr", | ||
self.assertEqual(["-XX:+UnlockDiagnosticVMOptions", "-XX:+DebugNonSafepoints", | ||
"-XX:StartFlightRecording=maxsize=0,maxage=0s,disk=true,dumponexit=true,filename=/var/log/test-recording.jfr"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/mechanic/telemetry_test.py
Outdated
java_opts) | ||
|
||
def test_sets_options_for_pre_java_9_custom_recording_template(self): | ||
jfr = telemetry.FlightRecorder(telemetry_params={"recording-template": "profile"}, | ||
log_root="/var/log", | ||
java_major_version=random.randint(0, 8)) | ||
java_opts = jfr.java_opts("/var/log/test-recording.jfr") | ||
self.assertEqual("-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -XX:+UnlockCommercialFeatures -XX:+FlightRecorder " | ||
self.assertEqual(["-XX:+UnlockDiagnosticVMOptions", "-XX:+DebugNonSafepoints", "-XX:+UnlockCommercialFeatures", "-XX:+FlightRecorder", | ||
"-XX:FlightRecorderOptions=disk=true,maxage=0s,maxsize=0,dumponexit=true," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment: https://lintlyci.github.io/Flake8Rules/rules/E127.html
tests/mechanic/telemetry_test.py
Outdated
"-XX:StartFlightRecording=maxsize=0,maxage=0s,disk=true,dumponexit=true," | ||
"filename=/var/log/test-recording.jfr,settings=profile", | ||
"filename=/var/log/test-recording.jfr,settings=profile"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this section needs to align with https://lintlyci.github.io/Flake8Rules/rules/E127.html
tests/mechanic/telemetry_test.py
Outdated
java_opts) | ||
|
||
def test_sets_options_for_java_11_or_above_custom_recording_template(self): | ||
jfr = telemetry.FlightRecorder(telemetry_params={"recording-template": "profile"}, | ||
log_root="/var/log", | ||
java_major_version=random.randint(11, 999)) | ||
java_opts = jfr.java_opts("/var/log/test-recording.jfr") | ||
self.assertEqual("-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints " | ||
self.assertEqual(["-XX:+UnlockDiagnosticVMOptions", "-XX:+DebugNonSafepoints", | ||
"-XX:StartFlightRecording=maxsize=0,maxage=0s,disk=true,dumponexit=true," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this section needs to align with https://lintlyci.github.io/Flake8Rules/rules/E127.html
tests/mechanic/telemetry_test.py
Outdated
gc_java_opts = gc.java_opts("/var/log/defaults-node-0.gc.log") | ||
self.assertEqual(7, len(gc_java_opts)) | ||
self.assertEqual(["-Xloggc:/var/log/defaults-node-0.gc.log", "-XX:+PrintGCDetails", "-XX:+PrintGCDateStamps", "-XX:+PrintGCTimeStamps", | ||
"-XX:+PrintGCApplicationStoppedTime", "-XX:+PrintGCApplicationConcurrentTime", "-XX:+PrintTenuringDistribution"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this section needs to align with https://lintlyci.github.io/Flake8Rules/rules/E127.html
Thank you for the review @dliappis . I update the change according to your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you.
By using ES_JAVA_OPTS we can provision a node, run a benchmark, and then “dynamically” (i.e. without reprovisioning) start the node again with telemetry attached. Relates to elastic#697 Relates to elastic#711
By using ES_JAVA_OPTS we can provision a node, run a benchmark, and then “dynamically” (i.e. without reprovisioning) start the node again with telemetry attached. Relates to elastic#697 Relates to elastic#711
Move telemetry devices that currently rely on ES_JAVA_OPTS from the
launcher to the provisioner and instead persist the necessary
information in config/jvm.options
Relates #697