Support for Spark 2.3/2.4 in Dr.Elephant #683

Open
ShubhamGupta29 opened this issue Apr 20, 2020 · 50 comments

@ShubhamGupta29
Contributor

Currently, Dr.Elephant supports Spark up to 2.2.3. We need to support the latest versions of Spark (at least 2.3 and 2.4). This requires several changes; I will update the issue as work proceeds.

@mareksimunek

@ShubhamGupta29 Just FYI:
I managed to run https://github.com/songgane/dr-elephant/tree/feature/support_spark_2.x
and it worked with Spark 2.3+.
There are a couple of tests that need to be fixed (I skipped them).

I have a question:

  1. It doesn't show executor memory used
    Screenshot 2020-04-20 at 17 27 21

  2. Same for GC statistics
    Screenshot 2020-04-20 at 17 23 57

Is that because these metrics are not available in SHS 2.3, or is some work needed to see them?

Anyway, glad to see progress on Spark 2.3+.

@ShubhamGupta29
Contributor Author

@mareksimunek I will surely go through your changes.
Just some questions:

  • Which Spark version did you use for compilation (was that change made in compile.conf)?

  • Are you fetching data from event logs or from the SHS REST API?

For your heuristics-related issues, I need to check how you are retrieving and transforming the data.

@ShubhamGupta29 ShubhamGupta29 changed the title Support for Spark 2.4 in Dr.Elephant Support for Spark 2.3/2.4 in Dr.Elephant Apr 20, 2020
@mareksimunek

  1. I used Hadoop 2.3.0 and Spark 2.1.2:
    https://github.com/songgane/dr-elephant/blob/feature/support_spark_2.x/compile.conf

I think I tried rebasing onto current master, and with higher versions there were more failing tests, so I stuck with 2.1 and skipped fewer tests :).

  2. I am fetching with:
  <fetcher>
    <applicationtype>spark</applicationtype>
    <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
    <params>
      <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
      <should_process_logs_locally>true</should_process_logs_locally>
      <event_log_location_uri>/spark2-history/</event_log_location_uri>
      <spark_log_ext>.snappy</spark_log_ext>
    </params>
  </fetcher>

@ShubhamGupta29
Contributor Author

@mareksimunek regarding the issue you mentioned ("It doesn't show executor memory used"): does it happen for every job, or is the value available for some jobs?

@mareksimunek

mareksimunek commented Apr 22, 2020

@ShubhamGupta29 every Spark job. I suspect it's because of this:
https://github.com/songgane/dr-elephant/blame/feature/support_spark_2.x/app/org/apache/spark/deploy/history/SparkDataCollection.scala#L178

That info.memUsed is only available while the job is running, but I am not sure whether Dr.Elephant fetches this information. Once the job has ended, the information is gone: when I check a completed job in SHS, it shows Peak memory: 0 everywhere.

Spark history version: 2.3.0.2.6.5.0-292 (it's 2.3 with some HDP patches)

@xglv1985

@mareksimunek Hello, I ran into the same problem as you: memory/executor/storage info cannot be fetched from SHS once the job ends. Has your problem been solved? Thanks.

@ShubhamGupta29
Contributor Author

@mareksimunek @xglv1985 I need your help debugging this issue: can you confirm whether the value of memUsed is non-zero in the response from the REST API endpoint [/executors]?

@mareksimunek

mareksimunek commented Apr 26, 2020

@ShubhamGupta29 yep, it's zero.
I checked http://someHost:18081/api/v1/applications/application_1587409317223_1104/1/executors

[ {
    "id" : "driver",
    "hostPort" : "someHost:37121",
    "isActive" : true,
    "rddBlocks" : 0,
    "memoryUsed" : 0,
    "diskUsed" : 0,
    "totalCores" : 0,
    "maxTasks" : 0,
    "activeTasks" : 0,
    "failedTasks" : 0,
    "completedTasks" : 0,
    "totalTasks" : 0,
    "totalDuration" : 0,
    "totalGCTime" : 0,
    "totalInputBytes" : 0,
    "totalShuffleRead" : 0,
    "totalShuffleWrite" : 0,
    "isBlacklisted" : false,
    "maxMemory" : 407057203,
    "addTime" : "2020-04-25T21:08:51.911GMT",
    "executorLogs" : {
      "stdout" : "http://someHost:8042/node/containerlogs/container_e54_1587409317223_1104_01_000001/fulltext/stdout?start=-4096",
      "stderr" : "http://someHost:8042/node/containerlogs/container_e54_1587409317223_1104_01_000001/fulltext/stderr?start=-4096"
    },
    "memoryMetrics" : {
      "usedOnHeapStorageMemory" : 0,
      "usedOffHeapStorageMemory" : 0,
      "totalOnHeapStorageMemory" : 407057203,
      "totalOffHeapStorageMemory" : 0
    }
  }, {
    "id" : "9",
    "hostPort" : "someHost2.dev.dszn.cz:33108",
    "isActive" : true,
    "rddBlocks" : 0,
    "memoryUsed" : 0,
    "diskUsed" : 0,
    "totalCores" : 3,
    "maxTasks" : 3,
    "activeTasks" : 0,
    "failedTasks" : 0,
    "completedTasks" : 56,
    "totalTasks" : 56,
    "totalDuration" : 846816,
    "totalGCTime" : 31893,
    "totalInputBytes" : 0,
    "totalShuffleRead" : 661719258,
    "totalShuffleWrite" : 747129542,
    "isBlacklisted" : false,
    "maxMemory" : 3032481792,
    "addTime" : "2020-04-25T21:09:08.100GMT",
    "executorLogs" : {
      "stdout" : "http://someHost2.dev.dszn.cz:8042/node/containerlogs/container_e54_1587409317223_1104_01_000011/fulltext/stdout?start=-4096",
      "stderr" : "http://someHost2.dev.dszn.cz:8042/node/containerlogs/container_e54_1587409317223_1104_01_000011/fulltext/stderr?start=-4096"
    },
    "memoryMetrics" : {
      "usedOnHeapStorageMemory" : 0,
      "usedOffHeapStorageMemory" : 0,
      "totalOnHeapStorageMemory" : 3032481792,
      "totalOffHeapStorageMemory" : 0
    }
  }.....

Correction: it even reports zero memoryUsed for a running job through the SHS REST API. Should I set something in spark.executor.extraJavaOptions to show these stats?

For MR, it gets memory stats from this setting. Am I right?
mapreduce.task.profile.params= -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s
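
The check requested above can be sketched in a few lines of Python. This is only an illustration, assuming the /executors response has already been saved as a string; the helper name `zero_memory_executors` is hypothetical and the sample payload is trimmed from the response pasted above.

```python
import json

# Trimmed sample of the /executors response pasted above (only the fields used here).
SAMPLE = """
[
  {"id": "driver", "memoryUsed": 0, "maxMemory": 407057203,
   "memoryMetrics": {"usedOnHeapStorageMemory": 0, "usedOffHeapStorageMemory": 0}},
  {"id": "9", "memoryUsed": 0, "maxMemory": 3032481792,
   "memoryMetrics": {"usedOnHeapStorageMemory": 0, "usedOffHeapStorageMemory": 0}}
]
"""


def zero_memory_executors(payload):
    """Return ids of executors whose reported storage memory usage is all zero."""
    ids = []
    for executor in json.loads(payload):
        metrics = executor.get("memoryMetrics", {})
        used = (
            executor.get("memoryUsed", 0)
            + metrics.get("usedOnHeapStorageMemory", 0)
            + metrics.get("usedOffHeapStorageMemory", 0)
        )
        if used == 0:
            ids.append(executor["id"])
    return ids


print(zero_memory_executors(SAMPLE))
```

If every executor id comes back, as it does for this sample, the zeros come from the history server response itself rather than from anything Dr.Elephant does afterwards.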

@xglv1985

@ShubhamGupta29 yes, I confirmed it too. The mem field is 0 in the response JSON.

@ShubhamGupta29
Contributor Author

@mareksimunek @xglv1985 thanks for the prompt response. I am able to support Spark 2.3 and will make the changes public soon. The memUsed = 0 problem is still there with Spark 2.3; I am debugging it and will stay in contact with you.
One more request: can you paste the values you are getting for these metrics in the /executors API response?
Metrics:
"memoryMetrics" : { "usedOnHeapStorageMemory" "usedOffHeapStorageMemory" "totalOnHeapStorageMemory" "totalOffHeapStorageMemory" }

@xglv1985

sure:

"memoryMetrics" : {
"usedOnHeapStorageMemory" : 0,
"usedOffHeapStorageMemory" : 0,
"totalOnHeapStorageMemory" : 1099746508,
"totalOffHeapStorageMemory" : 4000000000
}

@xglv1985

By the way, @ShubhamGupta29,
I used Dr.Elephant to analyze Spark 2.3 event logs, and every job analysis result looks like the following. Except for "Spark Configuration", every field is empty. Is this normal? Thanks!

Spark Configuration
Severity: Moderate [Explain]

spark.application.duration | -1587978750 Seconds
spark.driver.cores | 4
spark.driver.memory | 4 GB
spark.dynamicAllocation.enabled | false
spark.executor.cores | 4
spark.executor.instances | 20
spark.executor.memory | 4 GB
spark.shuffle.service.enabled | false
Spark shuffle service is not enabled.
spark.yarn.driver.memoryOverhead | 0 B
spark.yarn.executor.memoryOverhead | 0 B

Spark Executor Metrics
Severity: None

Executor input bytes distribution | min: 0 B, p25: 0 B, median: 0 B, p75: 0 B, max: 0 B
Executor shuffle read bytes distribution | min: 0 B, p25: 0 B, median: 0 B, p75: 0 B, max: 0 B
Executor shuffle write bytes distribution | min: 0 B, p25: 0 B, median: 0 B, p75: 0 B, max: 0 B
Executor storage memory used distribution | min: 0 B, p25: 0 B, median: 0 B, p75: 0 B, max: 0 B
Executor storage memory utilization rate | 0.000
Executor task time distribution | min: 0 sec, p25: 0 sec, median: 0 sec, p75: 0 sec, max: 0 sec
Executor task time sum | 0
Total executor storage memory allocated | 1.96 GB
Total executor storage memory used | 0 B

Spark Job Metrics
Severity: None

Spark completed jobs count | 0
Spark failed jobs count | 0
Spark failed jobs list |  
Spark job failure rate | 0.000
Spark jobs with high task failure rates

Spark Stage Metrics
Severity: None

Spark completed stages count | 0
Spark failed stages count | 0
Spark stage failure rate | 0.000
Spark stages with high task failure rates |  
Spark stages with long average executor runtimes

Executor GC
Severity: None

GC time to Executor Run time ratio | NaN
Total Executor Runtime | 0
Total GC time | 0

@ShubhamGupta29
Contributor Author

@xglv1985 no, this is not normal. Can you tell me which branch or source code you are using?

@xglv1985

@ShubhamGupta29
dr-elephant_987

@ShubhamGupta29
Contributor Author

Can you provide the link? linkedin/dr-elephant doesn't have any branch named dr-elephant_987.

@xglv1985

@ShubhamGupta29 I forked my own dr-elephant from linkedin/dr-elephant master. I only put "SparkFetcher" in my conf XML file, with <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
<should_process_logs_locally>true</should_process_logs_locally>. Is there any other configuration that could cause these empty fields? I will debug more deeply. Thanks

@mareksimunek

@xglv1985 if you are using current master, you can't see any metrics from Spark 2.3+.
More in:
#389
Check your logs; there will be some parsing errors. That's why I am using the fork mentioned above.

That's also why @ShubhamGupta29 is working on supporting this version.

> Metrics:
> "memoryMetrics" : { "usedOnHeapStorageMemory" "usedOffHeapStorageMemory" "totalOnHeapStorageMemory" "totalOffHeapStorageMemory" }

Thanks for the update, @ShubhamGupta29. They are already included in my post:
#683 (comment)

@xglv1985

@mareksimunek Thanks very much. I see the same problem described in the link you gave.
Then let's look forward to the updated Dr.Elephant from @ShubhamGupta29.

@ShubhamGupta29
Contributor Author

ShubhamGupta29 commented Apr 28, 2020

@mareksimunek @xglv1985, I have made the changes for Spark 2.3 (these are the foundation changes; I will fix the tests and do other cleanups later). If possible, can you try this personal branch? It has the changes for Spark 2.3.

@mareksimunek

@ShubhamGupta29 nice, ShubhamGupta29/test23 works like a charm. It now even shows GC stats.
Executor memory used is still not showing, but I suppose that if it's not available in SHS, it won't be visible in Dr.Elephant. (Do you have any news on whether something can be done to make it available in SHS?)

Screenshot 2020-04-29 at 21 33 57

@ShubhamGupta29
Contributor Author

@mareksimunek I am working on that. After going through Spark's code, I have some idea of why this metric is not getting populated. For now, I am testing the changes and will add them to the branch soon; I am also trying to support Spark 2.4.
@mareksimunek and @xglv1985, can you fill in the survey in #685? It would help us make Dr.Elephant friendlier to the open-source community.

@xglv1985

xglv1985 commented Apr 30, 2020 via email

@ShubhamGupta29
Contributor Author

@xglv1985 did you get a chance to try the changes made for Spark 2.3? Feedback on the changes will make it easier to start merging them into the master branch for other users.

@xglv1985

xglv1985 commented Apr 30, 2020 via email

@ShubhamGupta29
Contributor Author

@mareksimunek and @xglv1985 I have made some more changes for Spark 2.3 support; kindly try this branch whenever you have time. Also, the memory heuristics need a change in the Spark conf: add spark.eventLog.logBlockUpdates.enabled if it is not already there, and set it to true.
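
One way to apply that setting per job is to pass it to spark-submit with `--conf`. The sketch below is only an illustration: the helper name `with_block_update_logging` and the application file `my_job.py` are hypothetical, and it assumes the first element of the argument list is the launcher itself.

```python
def with_block_update_logging(submit_args):
    """Return a copy of spark-submit args with block-update event logging enabled."""
    key = "spark.eventLog.logBlockUpdates.enabled="
    # Leave the arguments alone if the setting is already passed explicitly.
    if any(arg.startswith(key) for arg in submit_args):
        return list(submit_args)
    # Options must precede the application file, so insert right after the launcher.
    return [submit_args[0], "--conf", key + "true"] + list(submit_args[1:])


print(with_block_update_logging(["spark-submit", "--master", "yarn", "my_job.py"]))
```

The same key can also be set cluster-wide in spark-defaults.conf instead of per job.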

@mareksimunek

@ShubhamGupta29
Hi, thanks for the update and sorry for the late response.

  • I needed to delete some tests to compile:
	deleted:    test/com/linkedin/drelephant/tony/fetchers/TonyFetcherTest.java
	deleted:    test/com/linkedin/drelephant/tuning/PSOParamGeneratorTest.java
  • I missed that it is storage memory, which shows cached RDDs. It seems to work fine after setting spark.eventLog.logBlockUpdates.enabled on the job :)
  • Is there also a way to add peak memory? That is what I am most interested in.
    I noticed this in the event log:
    {"ID":111,"Name":"internal.metrics.peakExecutionMemory","Update":96381057,"Value":96381057,"Internal":true,"Count Failed Values":true}

@ShubhamGupta29
Contributor Author

@mareksimunek thanks for the reply and for testing the provided version.

  • TonyFetcherTest doesn't work when compiling locally, so it makes sense to remove it. For PSOParamGeneratorTest, you can try running pip install inspyred; that fixed the test for me.

  • Glad the metric is getting populated for you after setting spark.eventLog.logBlockUpdates.enabled.

  • I am also looking for a way to provide this metric (Peak Memory Used). Can you tell me the name of the event from which you got this metric (internal.metrics.peakExecutionMemory)?

Also, let me know any other issue you are facing or any suggestion you have for Dr.Elephant. Hope Dr.Elephant is proving useful for you and your team.

@mareksimunek

@ShubhamGupta29

  • The event that carries internal.metrics.peakExecutionMemory is "Event":"SparkListenerTaskEnd".
    But I am not sure that's it; I am only judging by its name :). I attached the event log from the job.
    eventLogs-application_1587409317223_6508-1.zip
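
A rough way to pull that accumulator out of an event log is sketched below. It assumes an uncompressed log with one JSON event per line, and that the accumulables sit under "Task Info" next to "Executor ID" (an assumption about the log layout, not something confirmed in this thread). The helper name is hypothetical and the sample lines are built from the accumulator shown above.

```python
import json


def peak_execution_memory_by_executor(lines):
    """Map executor id -> largest peakExecutionMemory update seen in any finished task."""
    peaks = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("Event") != "SparkListenerTaskEnd":
            continue
        # Assumption: accumulables are nested under "Task Info".
        info = event.get("Task Info", {})
        executor_id = info.get("Executor ID")
        for acc in info.get("Accumulables", []):
            if acc.get("Name") == "internal.metrics.peakExecutionMemory":
                value = int(acc["Update"])
                peaks[executor_id] = max(peaks.get(executor_id, 0), value)
    return peaks


SAMPLE_LOG = [
    '{"Event":"SparkListenerTaskEnd","Task Info":{"Task ID":1,"Executor ID":"9","Accumulables":'
    '[{"ID":111,"Name":"internal.metrics.peakExecutionMemory","Update":96381057,"Value":96381057}]}}',
    '{"Event":"SparkListenerTaskEnd","Task Info":{"Task ID":2,"Executor ID":"9","Accumulables":'
    '[{"ID":111,"Name":"internal.metrics.peakExecutionMemory","Update":1024,"Value":96382081}]}}',
    '{"Event":"SparkListenerJobEnd","Job ID":0}',
]
print(peak_execution_memory_by_executor(SAMPLE_LOG))
```

Note that this is the largest per-task value, not a true executor-level peak: tasks running concurrently on one executor could together use more than any single task reports.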

So far it seems to be working like a charm. I am trying to push it through in our team (it is now running on a small testing cluster), and with working Spark metrics it will be much easier to get approval to work on that. Thanks for the progress.

Question: Are you using one Dr.Elephant installation per cluster, or do you have one Dr.Elephant analyzing multiple clusters?

@ShubhamGupta29
Contributor Author

Dr.Elephant currently allows analysis of jobs from only a single RM (a single cluster).

@xglv1985

@ShubhamGupta29 First, sorry for the late response. Thanks for your branch feature_spark2.3; I now have it up and running. This is my screen capture:
dr elephant
The good news is that it has more dimensions than past versions of Dr.Elephant. But the detail of each dimension has disappeared, so I will double-check the configuration.

@xglv1985

xglv1985 commented May 27, 2020

I have found the reason why the details disappeared. The "feature_2.3" branch uses "org.avaje.ebeanorm.avaje-ebeanorm-3.2.4.jar" and "org.avaje.ebeanorm.avaje-ebeanorm-agent-3.2.2.jar", which leads to the missing details. I replaced these two jars with "avaje-ebeanorm-3.2.2.jar" and "avaje-ebeanorm-agent-3.2.1.jar", which the old version of Dr.Elephant depends on, and then the details came back :)

@ShubhamGupta29
Contributor Author

@xglv1985 did you replace this in the ClassPath?

@xglv1985

Yes, I did.

@mareksimunek

@ShubhamGupta29 Hi, did you have time to look into PeakExecutionMemory ?

@ShubhamGupta29
Contributor Author

ShubhamGupta29 commented May 28, 2020

@mareksimunek I didn't get a chance to look into it, but I will surely do so over the weekend.

@mareksimunek

@ShubhamGupta29 I know you are probably busy, but I hope you will find a way to look at it :).

@ShubhamGupta29
Contributor Author

Hi @mareksimunek, sorry, I was caught up in some other tasks. I am working on an approximate value for PeakExecutionMemory because, after a lot of digging, I found that there is no way of getting this value without making changes to the Spark source code. I will possibly push the changes in the coming week. Also, let me know if Dr.Elephant's support for Spark 2.3 is working fine.

@mareksimunek

@ShubhamGupta29 Support for spark 2.3 works fine :)).

@RaphaelDucay

@ShubhamGupta29
First of all, thanks for all your valuable work helping to adapt Dr.Elephant to newer Spark versions.
I have a Hortonworks HDP 2.6.5 cluster (== Hadoop 2.7.3) running:

  • spark 1.6.3 on yarn
  • spark 2.3.2 on yarn

I have a few questions:

  • Do you have a fully working branch / tag / fork that I could use to make Dr.Elephant work with Spark 2.3.2 (which, as a reminder, uses Scala 2.11)?
  • If yes, could you tell me whether the config/packaging/deployment steps differ from those in the "quick setup guide" (from the master branch), other than adapting the env variables and conf to point to the correct Spark 2.3.2 folders and config?
  • Finally, is it possible to have one Dr.Elephant instance that handles both Spark 1.6.3 AND Spark 2.3.2 applications, and what would the procedure be?

Thanks a lot in advance for your time !

@ShubhamGupta29
Contributor Author

Hi @RaphaelDucay,
I will try to answer your queries in order:

  • We have a branch which compiles and works for Spark 2.3 (Scala 2.11 is the default for Spark 2.3); here is the link. Setting this branch up is no different from the Quick Setup Guide. Just update spark_version to 2.3.2 in the compile.conf file.
  • This branch is in beta and is currently used by a considerable number of customers. Kindly provide your feedback too.
  • To keep things simple, I would say that a single Dr.Elephant instance cannot support both Spark 1.6 and Spark 2.3. Kindly tell us your use case so we can assist you better.
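
The compile.conf change mentioned above can be made by hand. For anyone scripting it, here is a small sketch, assuming the file uses plain key=value lines; the helper name `set_spark_version` is hypothetical and the sample values are the Hadoop/Spark versions quoted earlier in this thread.

```python
import re


def set_spark_version(conf_text, version):
    """Return conf_text with the spark_version line set to the given version."""
    pattern = re.compile(r"^spark_version\s*=.*$", re.MULTILINE)
    line = "spark_version=" + version
    if pattern.search(conf_text):
        # A function replacement avoids any escape handling in the version string.
        return pattern.sub(lambda _match: line, conf_text)
    # No existing entry: append one, keeping a trailing newline.
    return conf_text.rstrip("\n") + "\n" + line + "\n"


SAMPLE_CONF = "hadoop_version=2.3.0\nspark_version=2.1.2\n"
print(set_spark_version(SAMPLE_CONF, "2.3.2"))
```

The result would then be written back to compile.conf before running the usual compile step.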

@RaphaelDucay

RaphaelDucay commented Aug 18, 2020

@ShubhamGupta29 Thanks a lot for the feedback!
We are doing a POC on both.

I will keep you updated !

@RaphaelDucay

RaphaelDucay commented Aug 25, 2020

@ShubhamGupta29 OK, so we got it working for Spark 1.6.3.
For Spark 2.x we are facing issues.
It turns out our Spark 2 version is 2.3.0.
We are using (as you suggested) this branch: https://github.com/ShubhamGupta29/dr-elephant/tree/feature_spark2.3
Here are the logs of our failing compilation attempt:

error_compile_dr-elephant_spark2 3

[warn] /dr-elephant-sources/app/org/apache/spark/deploy/history/SparkDataCollection.scala:124: abstract type pattern T is unchecked since it is eliminated by erasure
[warn] seq.foreach { case (item: T) => list.add(item)}
[warn] ^
[error] /dr-elephant-sources/app/org/apache/spark/status/CustomAppStatusListener.scala:628: value getPartitions is not a member of org.apache.spark.status.LiveRDD
[error] liveRDD.getPartitions().foreach { case (_, part) =>
[error] ^
[error] /dr-elephant-sources/app/org/apache/spark/status/CustomAppStatusListener.scala:629: value executors is not a member of Any
[error] part.executors.foreach { executorId =>
[error] ^
[error] /dr-elephant-sources/app/org/apache/spark/status/CustomAppStatusListener.scala:639: value getDistributions is not a member of org.apache.spark.status.LiveRDD
[error] liveRDD.getDistributions().foreach { case (executorId, rddDist) =>
[error] ^
[error] /dr-elephant-sources/app/org/apache/spark/status/CustomAppStatusListener.scala:640: type mismatch;
[error] found : Any
[error] required: String
[error] liveExecutors.get(executorId).foreach { exec =>
[error] ^
[warn] one warning found
[error] four errors found
[error] (compile:compileIncremental) Compilation failed

Do you have any idea how to fix this?

Thanks in advance !

@yanxiaole

> I have found the reason why the details disappeared. The "feature_2.3" branch use "org.avaje.ebeanorm.avaje-ebeanorm-3.2.4.jar" and "org.avaje.ebeanorm.avaje-ebeanorm-agent-3.2.2.jar", which will lead to details missing. I replaced these two jars with "avaje-ebeanorm-3.2.2.jar" and "avaje-ebeanorm-agent-3.2.1.jar", which the old version dr.elephant depends on, and then the details came back :)
>
> @xglv1985 did you replace this in the ClassPath?
>
> Yes, I did.

Hi @ShubhamGupta29, @xglv1985,
is there a more suitable way? Right now it seems the jars have to be renamed...

@shagneet330

Hi @ShubhamGupta29, @xglv1985,
I am facing the same issue: I am not able to double-click on these metric dimensions. I am using the feature_2.3 branch. How can this be corrected?

@ShubhamGupta29
Contributor Author

Hi @shagneet330, the fix is provided in the comment above:

> I have found the reason why the details disappeared. The "feature_2.3" branch use "org.avaje.ebeanorm.avaje-ebeanorm-3.2.4.jar" and "org.avaje.ebeanorm.avaje-ebeanorm-agent-3.2.2.jar", which will lead to details missing. I replaced these two jars with "avaje-ebeanorm-3.2.2.jar" and "avaje-ebeanorm-agent-3.2.1.jar", which the old version dr.elephant depends on, and then the details came back :)

But I would suggest using the latest Ember UI, which is available if your compilation went well. You can access the new UI by appending new# to the Dr.Elephant endpoint, e.g. http://hostname:8080/new#. If you can see a UI like the one below, there should be no issue.
Screenshot 2021-02-02 at 9 45 47 PM

@shagneet330

@ShubhamGupta29 Tried with http://hostname:8080/new# but this doesn't seem to load. Are these changes available in "feature_2.3" branch?

@ShubhamGupta29
Contributor Author

ShubhamGupta29 commented Feb 3, 2021

It is available in feature_2.3, but your compilation must succeed with Ember, and npm must be available on your system. You should see this kind of log during compilation:

`
"############################################################################"

"npm installation found, we'll compile with the new user interface"
"############################################################################"
`
You need to monitor the compilation, as you may be facing issues while compiling the new UI.
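
A quick pre-flight check before compiling could look like the sketch below. It only tests whether an npm executable is on PATH; that this is the same lookup the build performs is an assumption, and the helper name `npm_available` is hypothetical.

```python
import shutil


def npm_available():
    """Return True if an npm executable can be found on PATH."""
    return shutil.which("npm") is not None


if npm_available():
    print("npm found: the build should be able to compile the new UI")
else:
    print("npm not found: install Node.js/npm before compiling to get the new UI")
```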

@tcluzhe

tcluzhe commented Feb 5, 2021

@ShubhamGupta29 would you help add spark application name in the new UI?
image
image

@Ashnee1990

Ashnee1990 commented Jul 8, 2021

@ShubhamGupta29,
I am facing an issue while compiling the feature_spark2.3, master and finalspark23 branches. Could you please help?

Below is the error.

elephant_spark23/dr-elephant/app/org/apache/spark/status/CustomAppStatusListener.scala:628: value getPartitions is not a member of org.apache.spark.status.LiveRDD
[error] liveRDD.getPartitions().foreach { case (_, part) =>
[error] ^
[error] elephant_spark23/dr-elephant/app/org/apache/spark/status/CustomAppStatusListener.scala:629: value executors is not a member of Any
[error] part.executors.foreach { executorId =>
[error] ^
[error] elephant_spark23/dr-elephant/app/org/apache/spark/status/CustomAppStatusListener.scala:639: value getDistributions is not a member of org.apache.spark.status.LiveRDD
[error] liveRDD.getDistributions().foreach { case (executorId, rddDist) =>
[error] ^
[error] elephant_spark23/dr-elephant/app/org/apache/spark/status/CustomAppStatusListener.scala:640: type mismatch;
[error] found : Any
[error] required: String
[error] liveExecutors.get(executorId).foreach { exec =>
[error] ^
[warn] one warning found
[error] four errors found
[error] (compile:compileIncremental) Compilation failed

@Ashnee1990

Ashnee1990 commented Jul 13, 2021

@ShubhamGupta29 @xglv1985,

Has Peak Memory Used been resolved? I am still not able to see it.
image
