-
Notifications
You must be signed in to change notification settings - Fork 855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto tuning feature enhancements #379
Conversation
Support for spark
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to merge after addressing the minor comments!
Thanks for cleaning up the code as well.
* @param jobExecId String jobExecId of the execution to which penalty has to be applied | ||
*/ | ||
private void applyPenalty(String jobExecId) { | ||
Integer penaltyConstant = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment in the code on why 3 was chosen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
scripts/pso/pso_param_generation.py
Outdated
elif param_name[i] == PARAM_PIG_MAX_COMBINED_SPLIT_SIZE: | ||
max_combined_split_size_index = i | ||
pig_max_combined_split_size_index = i | ||
elif param_name[i] == PARAM_SPARK_EXECUTOR_MEMORY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned offline, it would make sense to send this together with the Spark changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
conf/evolutions/default/6.sql
Outdated
) ENGINE=InnoDB; | ||
|
||
|
||
# --- !Downs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write the downs section for the alter and insert statements as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
conf/evolutions/default/6.sql
Outdated
INSERT INTO tuning_parameter VALUES (11,'spark.memory.fraction',2, 0.6, 0.1, 0.9, 0.1, 0, current_timestamp(0), current_timestamp(0)); | ||
INSERT INTO tuning_parameter VALUES (12,'spark.memory.storageFraction', 2, 0.5, 0.1, 0.9, 0.1, 0, current_timestamp(0), current_timestamp(0)); | ||
INSERT INTO tuning_parameter VALUES (13,'spark.executor.cores', 2, 1 , 1, 1, 1, 0, current_timestamp(0), current_timestamp(0)); | ||
INSERT INTO tuning_parameter VALUES (14,'spark.yarn.executor.memoryOverhead', 2, 384, 384, 1024, 100, 0, current_timestamp(0), current_timestamp(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment on the background behind these constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually not needed now (it was needed for spark tuning) so I removed it altogether.
scripts/pso/pso_param_generation.py
Outdated
@@ -38,8 +39,14 @@ | |||
PARAM_MAPREDUCE_MAP_JAVA_OPTS = 'mapreduce.map.java.opts' | |||
PARAM_MAPREDUCE_REDUCE_JAVA_OPTS = 'mapreduce.reduce.java.opts' | |||
|
|||
PARAM_SPARK_EXECUTOR_MEMORY = "spark.executor.memory" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can include these with the spark changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
logger.info("Constraint violated: Sort memory > 60% of map memory"); | ||
violations++; | ||
} | ||
if (mrMapMemory - mrSortMemory < 768) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to define variables for these constants like 768.
protected void updateExecutionMetrics(List<TuningJobExecution> completedExecutions) { | ||
for (TuningJobExecution tuningJobExecution : completedExecutions) { | ||
private void updateExecutionMetrics(List<TuningJobExecution> completedExecutions) { | ||
Integer penaltyConstant = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.eq(TuningJobDefinition.TABLE.job + '.' + JobDefinition.TABLE.id, jobDefinition.id) | ||
.findUnique(); | ||
if (tuningJobDefinition.tuningEnabled == 1) { | ||
tuningJobDefinition.tuningEnabled = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to log this event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Done.
* @return true if the median gain is negative, else false | ||
*/ | ||
private boolean isMedianGainNegative(List<JobExecution> jobExecutions) { | ||
int num_fitness_for_median = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explanation added in java docs.
This reverts commit d3fb6ba.
This reverts commit 38c4d8f.
Auto tuning feature enhancements: Tuning auto switch off when: * parameter converges * the maximum number of tuning iterations is reached * no gain in cost function Remembers and returns the best parameter set when: * tuning is switched off * no new parameter suggestion exists for the job * execution has failed and a retry is attempted Parameters being tune now attain only discrete values defined by the step size Bug fixes
This reverts commit d3fb6ba.
This reverts commit 1b36348.
This reverts commit d3fb6ba.
Auto tuning feature enhancements:
File-wise summary of changes: