[Application Practice] The Battle to Integrate Big Data Compute Engines with Linkis (Part 1) #3364
Ritakang0451 started this conversation in Solicit Articles(征文)
Author: 贰贰贰贰
WeChat: disheng
Original article: https://mp.weixin.qq.com/s/o8FCiC2vjdb9fokLbyMi-w
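In the previous article we deployed Linkis, which provides several general-purpose modules that abstract and break down almost all of the problems at the big data platform layer. Now it is time to set out on the journey of integrating the compute engines!
The compute engine integration journey
Because the front-end pages were not installed this time, I used the CLI that Linkis provides to verify that each engine starts correctly. All of the scripts are in the bin directory.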
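1 Hive engine
Let's go straight to the pitfalls.
All right, the Java directory in the script needs to be changed.
Execution failed? What went wrong... Time to dig through the log linkis-cg-engineplugin.log.
Then it hit me: I had previously changed hive-1.2.1 to 1.1.0. So I updated the Hive version in the CLI script and ran it again.
It failed again, this time complaining about HADOOP_CONF_DIR. Checking the logs, the error turned out to be in linkis-cg-engineconnmanager.log. So the request had already passed engine routing, and the error was reported by the engine execution side itself.
I looked at linkis-env.sh: the variable was clearly configured there, so why was it not being read? Was it a file permission problem? I changed the file permissions to 777 and tried again. The previous error message disappeared, but the job still failed. Checking the logs again, it was still the HADOOP_CONF_DIR problem. Unwilling to give up, I restarted all the services, and the problem was still there. There was nothing left to do but look for the answer in the code.
This TODO made me a little nervous... The error is raised here, so where are the environment variables loaded from? I traced it all the way back to JavaProcessEngineConnLaunchBuilder and found: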
###HADOOP CONF DIR
export HADOOP_CONF_DIR=/etc/hadoop/conf
###HIVE CONF DIR
export HIVE_CONF_DIR=/etc/hive/conf
###SPARK CONF DIR
export SPARK_CONF_DIR=/opt/mobdata/spark/spark-2.4.3.mob1-bin-2.6.5/conf
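Exports like these only help if the shell that starts the services can actually see them. As a minimal sketch (the function name `check_conf_dirs` is my own and is not part of Linkis), the following checks that each of the three variables is set and points to an existing directory in the current shell:

```shell
# Hypothetical helper, not part of Linkis: report whether each configuration
# directory variable is set and points to an existing directory.
# Returns 0 only if all three checks pass.
check_conf_dirs() {
  rc=0
  for name in HADOOP_CONF_DIR HIVE_CONF_DIR SPARK_CONF_DIR; do
    # Read the value of the variable whose name is stored in $name.
    eval "dir=\${$name:-}"
    if [ -z "$dir" ]; then
      echo "$name is not set"
      rc=1
    elif [ ! -d "$dir" ]; then
      echo "$name points to a missing directory: $dir"
      rc=1
    else
      echo "$name ok: $dir"
    fi
  done
  return "$rc"
}
```

It is probably most informative to run `check_conf_dirs` as the same user, and in the same environment, that starts the Linkis services, since a variable visible in an interactive shell is not necessarily visible to the process that launches the engine.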
[codeweaver@bd15-21-32-217 bin]$ ./linkis-cli-hive -code "SELECT * from mob_bg_devops.servers_exps_weekly_with_wh;" -submitUser codeweaver -proxyUser codeweaver
[INFO] LogFile path: /home/codeweaver/linkis/logs/linkis-cli//linkis-client.codeweaver.log.20211228162640335698166
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] connecting to linkis gateway:http://127.0.0.1:9001
JobId:10
TaskId:10
ExecId:exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_hive_5
[INFO] Job is successfully submitted!
2021-12-28 16:26:42.026 INFO Program is substituting variables for you
2021-12-28 16:26:42.026 INFO Variables substitution ended successfully
2021-12-28 16:26:42.026 WARN You submitted a sql without limit, DSS will add limit 5000 to your sql
2021-12-28 16:26:42.026 INFO SQL code check has passed
job is scheduled.
2021-12-28 16:26:42.026 INFO Your job is Scheduled. Please wait it to run.
Your job is being scheduled by orchestrator.
Job with jobId : LINKISCLI_codeweaver_hive_5 and execID : LINKISCLI_codeweaver_hive_5 submitted
2021-12-28 16:26:42.026 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
SELECT * from mob_bg_devops.servers_exps_weekly_with_wh limit 5000
SCRIPT CODE
2021-12-28 16:26:42.026 INFO Your job is accepted, jobID is LINKISCLI_codeweaver_hive_5 and taskID is 10 in ServiceInstance(linkis-cg-entrance, bd15-21-32-217:9104). Please wait it to be scheduled
2021-12-28 16:26:42.026 INFO job is running.
2021-12-28 16:26:42.026 INFO Your job is Running now. Please wait it to complete.
Job with jobGroupId : 10 and subJobId : 10 was submitted to Orchestrator.
2021-12-28 16:26:42.026 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:26:43.026 INFO Retry---success to rebuild task node:astJob_5_codeExec_5, ready to execute new retry-task:astJob_5_retry_30, current age is 1
2021-12-28 16:26:53.026 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:26:54.026 INFO Retry---success to rebuild task node:astJob_5_retry_30, ready to execute new retry-task:astJob_5_retry_30, current age is 2
2021-12-28 16:27:04.027 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:27:04.027 INFO Retry---success to rebuild task node:astJob_5_retry_31, ready to execute new retry-task:astJob_5_retry_31, current age is 3
2021-12-28 16:27:14.027 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:27:15.027 INFO Retry---success to rebuild task node:astJob_5_retry_32, ready to execute new retry-task:astJob_5_retry_32, current age is 4
2021-12-28 16:27:25.027 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:27:26.027 INFO Retry---success to rebuild task node:astJob_5_retry_33, ready to execute new retry-task:astJob_5_retry_33, current age is 5
2021-12-28 16:27:36.027 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:27:37.027 INFO Retry---success to rebuild task node:astJob_5_retry_34, ready to execute new retry-task:astJob_5_retry_34, current age is 6
2021-12-28 16:27:47.027 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:27:47.027 INFO Retry---success to rebuild task node:astJob_5_retry_35, ready to execute new retry-task:astJob_5_retry_35, current age is 7
2021-12-28 16:27:57.027 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:27:58.027 INFO Retry---success to rebuild task node:astJob_5_retry_36, ready to execute new retry-task:astJob_5_retry_36, current age is 8
2021-12-28 16:28:08.028 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:28:09.028 INFO Retry---success to rebuild task node:astJob_5_retry_37, ready to execute new retry-task:astJob_5_retry_37, current age is 9
2021-12-28 16:28:19.028 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:28:20.028 INFO Retry---success to rebuild task node:astJob_5_retry_38, ready to execute new retry-task:astJob_5_retry_38, current age is 10
2021-12-28 16:28:30.028 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 16:28:30.028 ERROR Task is Failed,errorMsg: ask Engine failed + errCode: 12003 ,desc: bd15-21-32-217:9101_89 Failed to async get EngineNodeLinkisRetryException: errCode: 30002 ,desc: 资源不足,请重试: errCode: 11012 ,desc: CPU resources are insufficient, to reduce the number of driver cores(CPU资源不足,建议调小驱动核数) ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9104 ,serviceKind: linkis-cg-entrance
2021-12-28 16:28:31.028 INFO job is completed.
2021-12-28 16:28:31.028 INFO Task creation time(任务创建时间): 2021-12-28 16:26:41, Task scheduling time(任务调度时间): 2021-12-28 16:26:42, Task start time(任务开始时间): 2021-12-28 16:26:42, Mission end time(任务结束时间): 2021-12-28 16:28:31
2021-12-28 16:28:31.028 INFO Your mission(您的任务) 10 The total time spent is(总耗时时间为): 1.8 分钟
2021-12-28 16:28:31.028 INFO Sorry. Your job completed with a status Failed. You can view logs for the reason.
[INFO] Job failed! Will not try get execute result.
============Result:================
TaskId:10
ExecId: exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_hive_5
User:codeweaver
Current job status:FAILED
extraMsg:
errCode: 11012
errDesc: 远程服务器CPU资源不足
############Execute Error!!!########
2021-12-28 16:28:28,720 INFO LinkisJobLogPresenter(89) - Job is still running, status=RUNNING, progress=0.0%
2021-12-28 16:28:30,710 INFO LinkisSubmitExecutor(101) -
2021-12-28 16:28:32,743 INFO LinkisSubmitExecutor(101) -
2021-12-28 16:28:34,774 WARN SyncSubmission(154) - Exception thrown when trying to query final result. Status will change to FAILED
com.webank.wedatasphere.linkis.cli.core.exception.ExecutorException: EXE0021,Error occured during execution: Get ResultSet Failed: job Status is not "Succeed", .
at com.webank.wedatasphere.linkis.cli.application.driver.UjesClientDriver.queryResultSetPaths(UjesClientDriver.java:428) ~[linkis-cli-application-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.cli.application.interactor.execution.executor.LinkisSubmitExecutor.doGetFinalResult(LinkisSubmitExecutor.java:173) ~[linkis-cli-application-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.cli.core.interactor.execution.SyncSubmission.ExecWithAsyncBackend(SyncSubmission.java:152) [linkis-cli-core-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.cli.core.interactor.execution.SyncSubmission.execute(SyncSubmission.java:76) [linkis-cli-core-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.cli.application.LinkisClientApplication.exec(LinkisClientApplication.java:349) [linkis-cli-application-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.cli.application.LinkisClientApplication.main(LinkisClientApplication.java:381) [linkis-cli-application-1.0.2.jar:?]
2021-12-28 16:28:34,804 INFO LinkisSubmitExecutor(101) -
2021-12-28 16:28:35,285 INFO LinkisJobLogPresenter(89) - Job is still running, status=FAILED, progress=100.0%
2021-12-28 16:28:38,806 INFO LinkisJobResultPresenter(57) - Job status is not success but 'FAILED'. Will not try to retrieve any Result
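The failure line in the output above nests several error codes inside one message. A small sketch for listing them, assuming the `errCode: <digits>` format seen in this log (the function name `extract_err_codes` is hypothetical):

```shell
# Hypothetical helper: read log text on stdin and print each distinct
# error code once, in order of first appearance.
# Assumes codes are written as "errCode: <digits>", as in the output above.
extract_err_codes() {
  grep -oE 'errCode: [0-9]+' | awk '!seen[$2]++ { print $2 }'
}
```

Fed the "Task is Failed" line from this run, it prints 12003, 30002 and 11012, one per line. The last of these is the code that the CLI repeats in its final result block.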
54e1b9c0-d4dc-4be9-a49c-4b5f3597f9c8:sudo: sorry, you must have a tty to run sudo
vi /etc/sudoers (preferably use the visudo command)
Comment out the Defaults requiretty line
#Defaults requiretty
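To confirm that the change took effect, one option is to look for an uncommented `requiretty` default. This is a minimal sketch that takes a sudoers-style file as its first argument. The function name `has_requiretty` is hypothetical, and it only recognizes the plain form of the line, not per-user variants or comma-separated option lists:

```shell
# Hypothetical helper: exit 0 if the given sudoers-style file still contains
# an active (uncommented) "Defaults requiretty" line, non-zero otherwise.
# Only the plain form is matched; "Defaults:user requiretty" or
# "Defaults env_reset, requiretty" are not detected by this sketch.
has_requiretty() {
  grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}
```

For example, `has_requiretty /etc/sudoers && echo "requiretty is still active"` (reading /etc/sudoers normally requires root).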
[codeweaver@bd15-21-32-217 bin]$ ./linkis-cli-hive -code "SELECT * from mob_bg_devops.servers_exps_weekly_with_wh;" -submitUser codeweaver -proxyUser codeweaver
[INFO] LogFile path: /home/codeweaver/linkis/logs/linkis-cli//linkis-client.codeweaver.log.20211228183116651149061
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] connecting to linkis gateway:http://127.0.0.1:9001
JobId:22
TaskId:22
ExecId:exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_hive_0
[INFO] Job is successfully submitted!
2021-12-28 18:31:19.031 INFO Program is substituting variables for you
2021-12-28 18:31:19.031 INFO Variables substitution ended successfully
2021-12-28 18:31:20.031 WARN You submitted a sql without limit, DSS will add limit 5000 to your sql
2021-12-28 18:31:20.031 INFO SQL code check has passed
job is scheduled.
2021-12-28 18:31:21.031 INFO Your job is Scheduled. Please wait it to run.
Job with jobId : LINKISCLI_codeweaver_hive_0 and execID : LINKISCLI_codeweaver_hive_0 submitted
Your job is being scheduled by orchestrator.
2021-12-28 18:31:21.031 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
SELECT * from mob_bg_devops.servers_exps_weekly_with_wh limit 5000
SCRIPT CODE
2021-12-28 18:31:21.031 INFO Your job is accepted, jobID is LINKISCLI_codeweaver_hive_0 and taskID is 22 in ServiceInstance(linkis-cg-entrance, bd15-21-32-217:9104). Please wait it to be scheduled
2021-12-28 18:31:21.031 INFO job is running.
2021-12-28 18:31:21.031 INFO Your job is Running now. Please wait it to complete.
Job with jobGroupId : 22 and subJobId : 22 was submitted to Orchestrator.
2021-12-28 18:31:21.031 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:31:43.031 INFO EngineConn local log path: ServiceInstance(linkis-cg-engineconn, bd15-21-32-217:26052) /tmp/codeweaver/linkis_dev/codeweaver/workDir/1c3da121-8e1e-4b3f-bbb9-1e09876ae96c/logs
HiveEngineExecutor_0 >> SELECT * from mob_bg_devops.servers_exps_weekly_with_wh limit 5000
2021-12-28 18:31:44.383 ERROR [Linkis-Default-Scheduler-Thread-3] com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveEngineConnExecutor 200 com$webank$wedatasphere$linkis$engineplugin$hive$executor$HiveEngineConnExecutor$$executeHQL - query failed, reason : java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_181]
at scala.collection.immutable.Range.foreach(Range.scala:160) [scala-library-2.11.12.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.NoClassDefFoundError: org/apache/zookeeper/KeeperException$NoNodeException
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_181]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_181]
... 43 more
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_181]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_181]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_181]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_181]
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_181]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_181]
... 43 more
2021-12-28 18:31:44.410 ERROR [Linkis-Default-Scheduler-Thread-3] com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveEngineConnExecutor 57 error - execute code failed! java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_181]
at scala.collection.immutable.Range.foreach(Range.scala:160) [scala-library-2.11.12.jar:?]
at com.webank.wedatasphere.linkis.engineconn.acessible.executor.entity.AccessibleExecutor.ensureIdle(AccessibleExecutor.scala:54) [linkis-accessible-executor-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.acessible.executor.entity.AccessibleExecutor.ensureIdle(AccessibleExecutor.scala:48) [linkis-accessible-executor-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor.ensureOp(ComputationExecutor.scala:133) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor.execute(ComputationExecutor.scala:236) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl.com$webank$wedatasphere$linkis$engineconn$computation$executor$service$TaskExecutionServiceImpl$$executeTask(TaskExecutionServiceImpl.scala:239) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1$$anonfun$run$1.apply$mcV$sp(TaskExecutionServiceImpl.scala:172) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1$$anonfun$run$1.apply(TaskExecutionServiceImpl.scala:170) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1$$anonfun$run$1.apply(TaskExecutionServiceImpl.scala:170) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryCatch(Utils.scala:39) [linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryAndWarn(Utils.scala:68) [linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1.run(TaskExecutionServiceImpl.scala:170) [linkis-computation-engineconn-1.0.2.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.NoClassDefFoundError: org/apache/zookeeper/KeeperException$NoNodeException
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_181]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_181]
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2013) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1978) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.getLockManager(DummyTxnManager.java:70) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.acquireLocks(DummyTxnManager.java:101) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:984) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) ~[hive-exec-1.1.0.jar:1.1.0]
... 43 more
Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException$NoNodeException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_181]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_181]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_181]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_181]
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_181]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_181]
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2013) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1978) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.getLockManager(DummyTxnManager.java:70) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.acquireLocks(DummyTxnManager.java:101) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:984) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) ~[hive-exec-1.1.0.jar:1.1.0]
... 43 more
2021-12-28 18:31:44.428 ERROR [Linkis-Default-Scheduler-Thread-3] com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl 57 error - null java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
at com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveDriverProxy.run(HiveEngineConnExecutor.scala:456) ~[linkis-engineplugin-hive-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveEngineConnExecutor.com$webank$wedatasphere$linkis$engineplugin$hive$executor$HiveEngineConnExecutor$$executeHQL(HiveEngineConnExecutor.scala:163) ~[linkis-engineplugin-hive-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveEngineConnExecutor$$anon$1.run(HiveEngineConnExecutor.scala:127) ~[linkis-engineplugin-hive-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveEngineConnExecutor$$anon$1.run(HiveEngineConnExecutor.scala:120) ~[linkis-engineplugin-hive-1.0.2.jar:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_181]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) ~[hadoop-common-2.6.0.jar:?]
at com.webank.wedatasphere.linkis.engineplugin.hive.executor.HiveEngineConnExecutor.executeLine(HiveEngineConnExecutor.scala:120) ~[linkis-engineplugin-hive-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$toExecuteTask$2$$anonfun$apply$10$$anonfun$apply$11.apply(ComputationExecutor.scala:179) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$toExecuteTask$2$$anonfun$apply$10$$anonfun$apply$11.apply(ComputationExecutor.scala:178) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryCatch(Utils.scala:39) ~[linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$toExecuteTask$2$$anonfun$apply$10.apply(ComputationExecutor.scala:180) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$toExecuteTask$2$$anonfun$apply$10.apply(ComputationExecutor.scala:174) ~[linkis-computation-engineconn-1.0.2.jar:?]
at scala.collection.immutable.Range.foreach(Range.scala:160) ~[scala-library-2.11.12.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$toExecuteTask$2.apply(ComputationExecutor.scala:173) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$toExecuteTask$2.apply(ComputationExecutor.scala:149) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryFinally(Utils.scala:60) ~[linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor.toExecuteTask(ComputationExecutor.scala:222) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$3.apply(ComputationExecutor.scala:237) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor$$anonfun$3.apply(ComputationExecutor.scala:237) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryFinally(Utils.scala:60) ~[linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.acessible.executor.entity.AccessibleExecutor.ensureIdle(AccessibleExecutor.scala:54) ~[linkis-accessible-executor-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.acessible.executor.entity.AccessibleExecutor.ensureIdle(AccessibleExecutor.scala:48) ~[linkis-accessible-executor-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor.ensureOp(ComputationExecutor.scala:133) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.execute.ComputationExecutor.execute(ComputationExecutor.scala:236) ~[linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl.com$webank$wedatasphere$linkis$engineconn$computation$executor$service$TaskExecutionServiceImpl$$executeTask(TaskExecutionServiceImpl.scala:239) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1$$anonfun$run$1.apply$mcV$sp(TaskExecutionServiceImpl.scala:172) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1$$anonfun$run$1.apply(TaskExecutionServiceImpl.scala:170) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1$$anonfun$run$1.apply(TaskExecutionServiceImpl.scala:170) [linkis-computation-engineconn-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryCatch(Utils.scala:39) [linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.common.utils.Utils$.tryAndWarn(Utils.scala:68) [linkis-common-1.0.2.jar:?]
at com.webank.wedatasphere.linkis.engineconn.computation.executor.service.TaskExecutionServiceImpl$$anon$1.run(TaskExecutionServiceImpl.scala:170) [linkis-computation-engineconn-1.0.2.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.NoClassDefFoundError: org/apache/zookeeper/KeeperException$NoNodeException
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_181]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_181]
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2013) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1978) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.getLockManager(DummyTxnManager.java:70) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.acquireLocks(DummyTxnManager.java:101) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:984) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) ~[hive-exec-1.1.0.jar:1.1.0]
... 43 more
Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException$NoNodeException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_181]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_181]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_181]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_181]
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_181]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_181]
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2013) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1978) ~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.getLockManager(DummyTxnManager.java:70) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager.acquireLocks(DummyTxnManager.java:101) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:984) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) ~[hive-exec-1.1.0.jar:1.1.0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) ~[hive-exec-1.1.0.jar:1.1.0]
... 43 more
2021-12-28 18:31:44.031 ERROR Task is Failed,errorMsg: null
2021-12-28 18:31:44.031 INFO job is completed.
2021-12-28 18:31:44.031 INFO Task creation time(任务创建时间): 2021-12-28 18:31:19, Task scheduling time(任务调度时间): 2021-12-28 18:31:21, Task start time(任务开始时间): 2021-12-28 18:31:21, Mission end time(任务结束时间): 2021-12-28 18:31:44
2021-12-28 18:31:44.031 INFO Your mission(您的任务) 22 The total time spent is(总耗时时间为): 25.6 秒
2021-12-28 18:31:44.031 INFO Sorry. Your job completed with a status Failed. You can view logs for the reason.
[INFO] Job failed! Will not try get execute result.
============Result:================
TaskId:22
ExecId: exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_hive_0
User:codeweaver
Current job status:FAILED
extraMsg:
errDesc: 21304, Task is Failed,errorMsg: null
############Execute Error!!!########
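The stack traces above are long, but the root cause sits in the innermost `Caused by` line. A sketch that pulls the missing class names out of such a log (the function name `missing_classes` is hypothetical):

```shell
# Hypothetical helper: read log text on stdin and print each distinct class
# named in a "ClassNotFoundException: <class>" line.
missing_classes() {
  sed -n 's/.*ClassNotFoundException: \([^ ]*\).*/\1/p' | sort -u
}
```

For this run it prints `org.apache.zookeeper.KeeperException$NoNodeException`, which suggests that the ZooKeeper classes are not on the classpath of the Hive engine process.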
[codeweaver@bd15-21-32-217 bin]$ ./linkis-cli-spark-sql -code "SELECT * from mob_bg_devops.servers_exps_weekly_with_wh;" -submitUser codeweaver -proxyUser codeweaver
[INFO] LogFile path: /home/codeweaver/linkis/logs/linkis-cli//linkis-client.codeweaver.log.20211228174733617974682
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] connecting to linkis gateway:http://127.0.0.1:9001
JobId:3
TaskId:3
ExecId:exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_spark_1
[INFO] Job is successfully submitted!
2021-12-28 17:47:35.047 INFO Program is substituting variables for you
2021-12-28 17:47:35.047 INFO Variables substitution ended successfully
2021-12-28 17:47:35.047 WARN You submitted a sql without limit, DSS will add limit 5000 to your sql
2021-12-28 17:47:35.047 INFO SQL code check has passed
job is scheduled.
2021-12-28 17:47:36.047 INFO Your job is Scheduled. Please wait it to run.
Your job is being scheduled by orchestrator.
Job with jobId : LINKISCLI_codeweaver_spark_1 and execID : LINKISCLI_codeweaver_spark_1 submitted
2021-12-28 17:47:36.047 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
SELECT * from mob_bg_devops.servers_exps_weekly_with_wh limit 5000
SCRIPT CODE
2021-12-28 17:47:36.047 INFO Your job is accepted, jobID is LINKISCLI_codeweaver_spark_1 and taskID is 3 in ServiceInstance(linkis-cg-entrance, bd15-21-32-217:9104). Please wait it to be scheduled
2021-12-28 17:47:36.047 INFO job is running.
2021-12-28 17:47:36.047 INFO Your job is Running now. Please wait it to complete.
Job with jobGroupId : 3 and subJobId : 3 was submitted to Orchestrator.
2021-12-28 17:47:36.047 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 17:47:36.047 ERROR Task is Failed,errorMsg: errCode: 12003 ,desc: bd15-21-32-217:9101_2 Failed to async get EngineNode RMErrorException: errCode: 11006 ,desc: Failed to request external resourceRMWarnException: errCode: 11006 ,desc: queue ide is not exists in YARN. ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9104 ,serviceKind: linkis-cg-entrance
2021-12-28 17:47:37.047 INFO job is completed.
2021-12-28 17:47:37.047 INFO Task creation time(任务创建时间): 2021-12-28 17:47:35, Task scheduling time(任务调度时间): 2021-12-28 17:47:36, Task start time(任务开始时间): 2021-12-28 17:47:36, Mission end time(任务结束时间): 2021-12-28 17:47:37
2021-12-28 17:47:37.047 INFO Your mission(您的任务) 3 The total time spent is(总耗时时间为): 1.8 秒
2021-12-28 17:47:37.047 INFO Sorry. Your job completed with a status Failed. You can view logs for the reason.
[INFO] Job failed! Will not try get execute result.
============Result:================
TaskId:3
ExecId: exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_spark_1
User:codeweaver
Current job status:FAILED
extraMsg:
errCode: 10001
errDesc: 会话创建失败,ide队列不存在,请检查队列设置是否正确
############Execute Error!!!########
2021-12-28 17:47:36.610 [INFO ] [ForkJoinPool-1-worker-7 ] c.w.w.l.m.a.s.e.DefaultEngineAskEngineService (45) [info] - Failed to async(bd15-21-32-217:9101_2) createEngine com.webank.wedatasphere.linkis.resourcemanager.exception.RMErrorException: errCode: 11006 ,desc: Failed to request external resourceRMWarnException: errCode: 11006 ,desc: queue ide is not exists in YARN. ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager
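The failure message above is a nested chain. The outer 12003 (failed to get an EngineNode) wraps two 11006 errors, and the innermost one is the actual cause: the queue `ide` does not exist in YARN. The CLI summary reports the same thing under errCode 10001 (session creation failed, the ide queue does not exist, check the queue setting), which is why the next run passes `--queue default`. As a rough sketch for pulling that chain out of a log line, something like the following Python could be used. The name `error_chain` and the regular expression are assumptions based only on the message format seen in this log. They are not part of Linkis, and the sample line is abbreviated from the output above.

```python
import re

# Assumed format, taken from the log above: "errCode: <digits> ,desc: <text>".
# A description ends at the next "errCode:", at the first ",ip:" suffix,
# or at the end of the line.
ERR_PATTERN = re.compile(
    r"errCode:\s*(\d+)\s*,desc:\s*(.*?)(?=\s*errCode:|\s*,ip:|$)"
)

def error_chain(line):
    """Return [(code, description), ...] from outermost to innermost error."""
    return [(int(code), desc.strip()) for code, desc in ERR_PATTERN.findall(line)]

# Abbreviated copy of the ERROR line from the failed spark-sql run.
sample = (
    "Task is Failed,errorMsg: errCode: 12003 ,desc: bd15-21-32-217:9101_2 "
    "Failed to async get EngineNode RMErrorException: errCode: 11006 ,desc: "
    "Failed to request external resourceRMWarnException: errCode: 11006 ,desc: "
    "queue ide is not exists in YARN. ,ip: bd15-21-32-217 ,port: 9101"
)

for code, desc in error_chain(sample):
    print(code, desc)
```

The last entry of the returned list is usually the one worth reading first, since the outer codes only say that engine creation failed.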
[codeweaver@bd15-21-32-217 bin]$ ./linkis-cli-spark-sql -code "SELECT * from mob_bg_devops.servers_exps_weekly_with_wh;" -submitUser codeweaver -proxyUser codeweaver --queue default
[INFO] LogFile path: /home/codeweaver/linkis/logs/linkis-cli//linkis-client.codeweaver.log.20211228185636659565504
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] connecting to linkis gateway:http://127.0.0.1:9001
JobId:27
TaskId:27
ExecId:exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_spark_0
[INFO] Job is successfully submitted!
2021-12-28 18:56:38.056 INFO Program is substituting variables for you
2021-12-28 18:56:38.056 INFO Variables substitution ended successfully
2021-12-28 18:56:38.056 WARN You submitted a sql without limit, DSS will add limit 5000 to your sql
2021-12-28 18:56:38.056 INFO SQL code check has passed
job is scheduled.
2021-12-28 18:56:38.056 INFO Your job is Scheduled. Please wait it to run.
Your job is being scheduled by orchestrator.
Job with jobId : LINKISCLI_codeweaver_spark_0 and execID : LINKISCLI_codeweaver_spark_0 submitted
2021-12-28 18:56:38.056 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
SELECT * from mob_bg_devops.servers_exps_weekly_with_wh limit 5000
SCRIPT CODE
2021-12-28 18:56:38.056 INFO Your job is accepted, jobID is LINKISCLI_codeweaver_spark_0 and taskID is 27 in ServiceInstance(linkis-cg-entrance, bd15-21-32-217:9104). Please wait it to be scheduled
2021-12-28 18:56:38.056 INFO job is running.
2021-12-28 18:56:38.056 INFO Your job is Running now. Please wait it to complete.
Job with jobGroupId : 27 and subJobId : 27 was submitted to Orchestrator.
2021-12-28 18:56:38.056 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:56:40.056 INFO Retry---success to rebuild task node:astJob_2_codeExec_2, ready to execute new retry-task:astJob_2_retry_0, current age is 1
2021-12-28 18:56:50.056 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:56:50.056 INFO Retry---success to rebuild task node:astJob_2_retry_0, ready to execute new retry-task:astJob_2_retry_0, current age is 2
2021-12-28 18:57:00.057 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:57:01.057 INFO Retry---success to rebuild task node:astJob_2_retry_1, ready to execute new retry-task:astJob_2_retry_1, current age is 3
2021-12-28 18:57:11.057 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:57:12.057 INFO Retry---success to rebuild task node:astJob_2_retry_2, ready to execute new retry-task:astJob_2_retry_2, current age is 4
2021-12-28 18:57:22.057 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:57:22.057 INFO Retry---success to rebuild task node:astJob_2_retry_3, ready to execute new retry-task:astJob_2_retry_3, current age is 5
2021-12-28 18:57:32.057 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:57:33.057 INFO Retry---success to rebuild task node:astJob_2_retry_4, ready to execute new retry-task:astJob_2_retry_4, current age is 6
2021-12-28 18:57:43.057 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:57:44.057 INFO Retry---success to rebuild task node:astJob_2_retry_5, ready to execute new retry-task:astJob_2_retry_5, current age is 7
2021-12-28 18:57:54.057 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:57:55.057 INFO Retry---success to rebuild task node:astJob_2_retry_6, ready to execute new retry-task:astJob_2_retry_6, current age is 8
2021-12-28 18:58:05.058 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:58:05.058 INFO Retry---success to rebuild task node:astJob_2_retry_7, ready to execute new retry-task:astJob_2_retry_7, current age is 9
2021-12-28 18:58:15.058 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:58:16.058 INFO Retry---success to rebuild task node:astJob_2_retry_8, ready to execute new retry-task:astJob_2_retry_8, current age is 10
2021-12-28 18:58:26.058 INFO Background is starting a new engine for you, it may take several seconds, please wait
2021-12-28 18:58:27.058 ERROR Task is Failed,errorMsg: ask Engine failed + errCode: 12003 ,desc: bd15-21-32-217:9101_23 Failed to async get EngineNodeLinkisRetryException: errCode: 30002 ,desc: 资源不足,请重试: errCode: 11014 ,desc: Queue CPU resources are insufficient, reduce the number of executors.(队列CPU资源不足,建议调小执行器个数) ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: bd15-21-32-217 ,port: 9104 ,serviceKind: linkis-cg-entrance
2021-12-28 18:58:27.058 INFO job is completed.
2021-12-28 18:58:27.058 INFO Task creation time(任务创建时间): 2021-12-28 18:56:38, Task scheduling time(任务调度时间): 2021-12-28 18:56:38, Task start time(任务开始时间): 2021-12-28 18:56:38, Mission end time(任务结束时间): 2021-12-28 18:58:27
2021-12-28 18:58:27.058 INFO Your mission(您的任务) 27 The total time spent is(总耗时时间为): 1.8 分钟
2021-12-28 18:58:27.058 INFO Sorry. Your job completed with a status Failed. You can view logs for the reason.
[INFO] Job failed! Will not try get execute result.
============Result:================
TaskId:27
ExecId: exec_id018019linkis-cg-entrancebd15-21-32-217:9104LINKISCLI_codeweaver_spark_0
User:codeweaver
Current job status:FAILED
extraMsg:
errCode: 11014
errDesc: 队列CPU资源不足
############Execute Error!!!########
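With `--queue default` the queue check passes, but the engine still cannot get resources. The job is retried roughly every ten seconds until "current age" reaches 10, and the final error chain ends in 11014 (queue CPU resources are insufficient, and the message itself suggests reducing the number of executors). A small sketch for summarising such a retry sequence from the log is shown below. The function `retry_summary` and its regular expression are hypothetical helpers written against the line format in this output only. They are not a Linkis API.

```python
import re
from datetime import datetime

# Assumed line format, copied from the log above:
# "<date> <time>.<ms> INFO Retry---... current age is <n>"
RETRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ INFO Retry---.*current age is (\d+)"
)

def retry_summary(lines):
    """Return (last retry age, seconds between first and last retry)."""
    hits = []
    for line in lines:
        m = RETRY.match(line)
        if m:
            when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            hits.append((when, int(m.group(2))))
    if not hits:
        return 0, 0.0
    span = (hits[-1][0] - hits[0][0]).total_seconds()
    return hits[-1][1], span

# First and last retry lines from the run above; non-matching lines are skipped.
log_lines = [
    "2021-12-28 18:56:40.056 INFO Retry---success to rebuild task node:astJob_2_codeExec_2, ready to execute new retry-task:astJob_2_retry_0, current age is 1",
    "2021-12-28 18:57:00.057 INFO Background is starting a new engine for you, it may take several seconds, please wait",
    "2021-12-28 18:58:16.058 INFO Retry---success to rebuild task node:astJob_2_retry_8, ready to execute new retry-task:astJob_2_retry_8, current age is 10",
]

age, seconds = retry_summary(log_lines)
print(f"retries: {age}, spread over {seconds:.0f} s")
```

For this run the ten retries span 96 seconds, which fits the 1.8 minutes total reported between task creation (18:56:38) and the end of the task (18:58:27).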