-
Notifications
You must be signed in to change notification settings - Fork 855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dr. Elephant Tez Support working patch #313
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good from Dr. elephant design perspective. Did not look at the accuracy of fetcher logic or heuristics code.
@akshayrai can you take a look as well ?
private List<Long> finishTimes = new ArrayList<Long>(); | ||
private List<Long> durations = new ArrayList<Long>(); | ||
|
||
private static final double MEMORY_BUFFER = 1.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make these configurable
@@ -29,6 +29,13 @@ | |||
</fetcher> | |||
--> | |||
<fetchers> | |||
<!-- | |||
REST based fetcher for Tez jobs which pulls job metrics and data from Timeline Server API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call this dependency in the docs and setup instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chinmayms, I just quickly scanned through the PR and overall it looks good. Couple of issues to discuss.
- Are all the Heuristics copied over from MR? If so, can we consider maintaining a single copy of the heuristics rather than so much of duplicate code?
- Please apply and use the Apache license header in all the files.
- Use 2 space formatting throughout.
- It would be great if someone with Tez background can also take a look at this PR.
@akshayrai , thanks for your review. Addressing your issues one by one. Let me know your thoughts.
|
|
||
public TezFetcher(FetcherConfigurationData fetcherConfData) throws IOException { | ||
this._fetcherConfigurationData = fetcherConfData; | ||
final String applicationHistoryAddr = new Configuration().get("yarn.timeline-service.webapp.address"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declare yarn.timeline-service.webapp.address as a private static final variable.
} | ||
} | ||
|
||
final class ThreadContextMR2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ThreadContextMR2 is also used by MapReduceFetcherHadoop2. Refactor code here so there is only one instance of ThreadContextMR2.
} | ||
} | ||
|
||
if(mapperListAggregate.isEmpty() && reducerListAggregate.isEmpty()){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not move this check to after line 113 with a something like
if (state.equals("FAILED") {
jobData.setSucceeded(false);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
LGTM.
Thanks for writing this change.
@chinmayms Our team at Comcast also has a Tez PR and we're also working with the Pepperdata folks to roll this functionality in. Would you be willing to have a short call at at a convenient time with myself and a couple of folks here to maybe combine forces? |
@ray-harrison |
@chinmayms Can we do it on Wednesday please. 1:30PM EST? I will send an invite accordingly. |
@ray-harrison, and @chrevanthreddy Thanks for initiating the conversation. I would also like to be part of this discussion, can you please also send me an invite too. |
@chrevanthreddy Sure. 1:30 EST on Wednesday works. Thanks |
@shkhrgpt Could you please let me know where I can send you the invite? @chinmayms I have sent the invite to your Gmail Id. |
@chrevanthreddy Please send it to sgupta@pepperdata.com |
+1 LGTM |
@akshayrai @shankar37 |
+1 for merging. |
This reverts commit a0470a3.
…tribution. This reverts commit e3fd598. Co-authored-by: Abhishek Das <abhishekdas99@users.noreply.github.com>
This reverts commit a0470a3.
…uding attribution. This reverts commit e3fd598. Co-authored-by: Abhishek Das <abhishekdas99@users.noreply.github.com>
* Revert "Dr. Elephant Tez Support working patch (#313)" This reverts commit a0470a3. * Rerevert "Dr. Elephant Tez Support working patch (#313)" including attribution. This reverts commit e3fd598. Co-authored-by: Abhishek Das <abhishekdas99@users.noreply.github.com> * Auto tuning: Support for parameter set multi-try (#386) * Changes in some of the Spark Heuristics * Adding test for changes executor gc heuristic and unified memory heuristic * Update ExecutorGcHeuristic.scala * Update UnifiedMemoryHeuristic.scala * Changed some hard coded values to variables * Due to strict inequality changing the other thereshold levels for executor and driver
…uding attribution." This reverts commit d5476c1.
This reverts commit 8d7b64d.
This reverts commit a0470a3.
…uding attribution. This reverts commit e3fd598. Co-authored-by: Abhishek Das <abhishekdas99@users.noreply.github.com>
* Revert "Dr. Elephant Tez Support working patch (linkedin#313)" This reverts commit a0470a3. * Rerevert "Dr. Elephant Tez Support working patch (linkedin#313)" including attribution. This reverts commit e3fd598. Co-authored-by: Abhishek Das <abhishekdas99@users.noreply.github.com> * Auto tuning: Support for parameter set multi-try (linkedin#386) * Changes in some of the Spark Heuristics * Adding test for changes executor gc heuristic and unified memory heuristic * Update ExecutorGcHeuristic.scala * Update UnifiedMemoryHeuristic.scala * Changed some hard coded values to variables * Due to strict inequality changing the other thereshold levels for executor and driver
* Revert "Dr. Elephant Tez Support working patch (#313)" This reverts commit a0470a3. * Rerevert "Dr. Elephant Tez Support working patch (#313)" including attribution. This reverts commit e3fd598. Co-authored-by: Abhishek Das <abhishekdas99@users.noreply.github.com> * Auto tuning: Support for parameter set multi-try (#386) * Changes in some of the Spark Heuristics * Adding test for changes executor gc heuristic and unified memory heuristic * Update ExecutorGcHeuristic.scala * Update UnifiedMemoryHeuristic.scala * Changed some hard coded values to variables * Due to strict inequality changing the other thereshold levels for executor and driver
Dr. Elephant Version with support for Tez based on patch (PR#278).
Summary:
Working Version with Tez Support currently running on Electronic Arts Production Data Infrastructure.
Future Work:
Fundamentally one Tez application when returned by RM may contain multiple DAGs, the original Dr. Elephant Interfaces do not allow for us to go any lower than application level as is the case for MapReduce. Need design changes to further allow Vertex, Edge level analysis.