
Failing count(*) for the 1st time under beeline and also using prior table length (FB) #65

Open
ramajob72 opened this issue Sep 3, 2019 · 21 comments


@ramajob72

We have been using this SerDe for the past few years under the Hive CLI, but we are in the process of switching to beeline. While testing under beeline, we are encountering two strange behaviors, which I think are related to the same problem. All our tables are FB (fixed-block) files, and the same JAR works perfectly fine under the CLI, where count(*) returns correctly.


  1. When we issue count(*) against a table (an FB file) for the very first time after ADD JAR, we get the exception below. If we re-issue count(*) against exactly the same table, it runs fine. And if we instead select column(s) from the table for the very first time, there is no issue.
    "diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Fixed record length 0 is invalid. It should be set to a value greater than zero".

  2. On any subsequent first count(*) against another table, the record length of the prior table is used; it doesn't match the total length of the second table's file, so an exception is thrown. Here too, if we select column(s) instead, the current table's FB length is used.
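
For context on what a workaround might look like: Hadoop's FixedLengthInputFormat takes its record length from the job configuration key fixedlengthinputformat.record.length, and "Fixed record length 0 is invalid" is exactly what it throws when that key is unset or zero. Assuming this SerDe normally populates that key from the table's fb.length property, a per-session sketch of an explicit workaround would be (the table name and length below are illustrative, not a confirmed fix):

```sql
-- Illustrative workaround sketch, not a confirmed fix:
-- force the record length that FixedLengthInputFormat will read,
-- matching this table's fb.length, before the first count(*).
SET fixedlengthinputformat.record.length=80;
SELECT COUNT(*) FROM default.MY_TABLE;
```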

Any help is appreciated.

Thanks,
Rama.

@rbheemana
Owner

Please share your exact command-line statements. Also, could you please let me know whether you are using serde2 or serde3?

@rbheemana
Owner

Please check https://community.cloudera.com/t5/Support-Questions/Adding-hive-auxiliary-jar-files/td-p/120245 and try placing the jar at an HDFS location.

@poongs84

Hi,
Below are the commands we are trying after logging in to beeline:
ADD JAR //CobolSerdeHive.jar;
create table default.MY_BINARY_TABLE
ROW FORMAT SERDE 'com.savy3.hadoop.hive.serde3.cobol.CobolSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.FixedLengthInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/user/myid/mybinarytable/'
TBLPROPERTIES ('cobol.layout.url'='/user/myid/mybinarytable.copybook','fb.length'='80');

We also tried placing the JAR in HDFS, but we get the same "Fixed record length 0 is invalid" issue. We have not yet tried the auxiliary-jars configuration.

It works fine in the Hive CLI.
It also works in beeline if we use MR as the Hive execution engine instead of Tez.

Can you please help us understand why we get this exception when running in beeline with the Tez engine?

Exception

Caused by: java.io.IOException: java.io.IOException: Fixed record length 0 is invalid. It should be set to a value greater than zero
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:258)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
... 25 more
Caused by: java.io.IOException: Fixed record length 0 is invalid. It should be set to a value greater than zero
at org.apache.hadoop.mapred.FixedLengthInputFormat.getRecordReader(FixedLengthInputFormat.java:84)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:255)
... 26 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Fixed record length 0 is invalid. It should be set to a value greater than zero
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

@rbheemana
Owner

ADD JAR won't work here. Please place your jar in the hive.aux.jars.path specified in hive-site.xml, or place it in HDFS and set hive.aux.jars.path=<hdfs_location> in beeline:

<property>
  <name>hive.aux.jars.path</name>
  <value>/usr/hdp/current/hive-server2/auxlib/</value>
</property>

Let me know if it still fails

@poongs84

poongs84 commented Jan 2, 2020

I tried setting hive.aux.jars.path=<hdfs_location> in beeline as shown below, but it still cannot find the
serde3.cobol.CobolSerDe class.

0: jdbc:hive2://myhost.bnymellon.net:38001> set hive.aux.jars.path=hdfs:///user/myid/jar/CobolSerdeHive.jar;
No rows affected (0.006 seconds)
0: jdbc:hive2://myhost.bnymellon.net:38001> set hive.aux.jars.path;
+------------------------------------------------------------------+--+
| set |
+------------------------------------------------------------------+--+
| hive.aux.jars.path=hdfs:///user/myid/jar/CobolSerdeHive.jar |
+------------------------------------------------------------------+--+
1 row selected (0.013 seconds)
0: jdbc:hive2://myhost.bnymellon.net:38001> SELECT COUNT(*) FROM DEFAULT.MYTABLE;
Error: Error while compiling statement: FAILED: RuntimeException MetaException(message:java.lang.ClassNotFoundException Class com.savy3.hadoop.hive.serde3.cobol.CobolSerDe not found) (state=42000,code=40000)
Am I doing anything wrong in the above statements?
I have yet to try the first option you suggested (placing the jar in the hive.aux.jars.path specified in hive-site.xml).

@poongs84

poongs84 commented Jan 6, 2020

Hi Ram - We added the cobolserde jar to the auxlib path and modified hive-site.xml, but it is still throwing the same error.
Can you please help us resolve this issue?
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1572986800142_128710_1_00, diagnostics=[Task failed, taskId=task_1572986800142_128710_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Fixed record length 0 is invalid. It should be set to a value greater than zero
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Fixed record length 0 is invalid. It should be set to a value greater than zero
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more

@poongs84

poongs84 commented Jan 7, 2020

Hi Ram - Can you please see if you can help us with this issue?

@rbheemana
Owner

rbheemana commented Jan 7, 2020

@poongs84 the custom SerDe is written for MapReduce. It won't work with Tez or Spark.

Try setting the execution engine to mr before running your query.
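
In a beeline session, that suggestion would look like this (hive.execution.engine is the standard Hive property; the table name is the one used earlier in this thread):

```sql
-- Switch this session from Tez to the MapReduce engine
-- before querying the fixed-length table:
SET hive.execution.engine=mr;
SELECT COUNT(*) FROM default.MY_BINARY_TABLE;
```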

@poongs84

poongs84 commented Jan 7, 2020

Hi Ram - Thanks for the suggestion. One question: why does it work in Tez mode in the Hive CLI but not in beeline?

@rbheemana
Owner

@poongs84 that’s something I need to debug further. I never intended it to work with Tez. If it works in the Hive CLI but not in beeline, there may be extra modifications we need to make to the code to support beeline and Tez. We need to look further into the beeline and Tez documentation.

@poongs84

Hi Ram - We are planning to move to HDP 3 soon; it supports only beeline, and it looks like the MR engine is not supported either. Is there any way to make this cobol serde jar work on the Tez engine with beeline?

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/hive-overview/content/hive_whats_new_in_this_release_hive.html

@poongs84

poongs84 commented Feb 4, 2020

Hi Ram - Can you please see if this JAR can be made to work on Tez with beeline?

@poongs84

Hi Ram - Can you please see if you can help us with this issue?

@rbheemana
Owner

rbheemana commented Feb 11, 2020

@poongs84 Please use org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat if you are using beeline.
I believe org.apache.hadoop.mapred.FixedLengthInputFormat is causing the problem with beeline and Tez.

@ramajob72
Author

Thanks for the input, Ram. We will give it a try. Can you please confirm what is needed: 1) is changing the import from org.apache.hadoop.mapred.FixedLengthInputFormat to org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat in the CobolSerDe (both instances) and CobolSerDeUtils classes sufficient, and 2) does the "STORED AS INPUTFORMAT" clause of the CREATE TABLE statement also need to change?

@rbheemana
Owner

Just changing it in the CREATE TABLE statement should do, I believe.

@poongs84

I am able to create the table, but when I execute a select statement it throws an error:

ADD JAR //CobolSerdeHive.jar;
create table default.MY_BINARY_TABLE
ROW FORMAT SERDE 'com.savy3.hadoop.hive.serde3.cobol.CobolSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/user/myid/mybinarytable/'
TBLPROPERTIES ('cobol.layout.url'='/user/myid/mybinarytable.copybook','fb.length'='80');

select count(*) from default.default.MY_BINARY_TABLE

Error: Error while compiling statement: FAILED: SemanticException 1:22 Input format must implement InputFormat. Error encountered near token 'MY_BINARY_TABLE'(state=42000,code=40000)

@rbheemana
Owner

@poongs84 the error suggests your table name is wrong... I see two instances of default in the table name; correct it and try again.

@poongs84

@rbheemana .. that was a typo when pasting into the comment. The query actually executed just has default.MY_BINARY_TABLE.

@rbheemana
Owner

rbheemana commented Feb 13, 2020

What is the word near offset 22 in your exact query, then?

@poongs84

Below is the query that was executed; it failed with this exception:
select count(*) from default.MY_BINARY_TABLE ;

Error: Error while compiling statement: FAILED: SemanticException 1:22 Input format must implement InputFormat. Error encountered near token 'MY_BINARY_TABLE ' (state=42000,code=40000)
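
For background (this is our reading of Hive's contract, not something confirmed in the thread): Hive requires the class named in STORED AS INPUTFORMAT to implement the old-API interface org.apache.hadoop.mapred.InputFormat, while org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat implements the new-API org.apache.hadoop.mapreduce.InputFormat, which would explain why the query fails at compile time with "Input format must implement InputFormat". Only the old-API class, as in the original DDL from earlier in this thread, passes that check:

```sql
-- Hive's compile-time check accepts only old mapred-API input formats;
-- the mapreduce.lib variant is rejected with this SemanticException.
CREATE TABLE default.MY_BINARY_TABLE
ROW FORMAT SERDE 'com.savy3.hadoop.hive.serde3.cobol.CobolSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.FixedLengthInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/user/myid/mybinarytable/'
TBLPROPERTIES ('cobol.layout.url'='/user/myid/mybinarytable.copybook','fb.length'='80');
```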
