Failed to create Marker file #7909
Comments
@koochiswathiTR this usually means the timeline server was at capacity and could not respond. Is this issue repeatable, or just a one-time occurrence?
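(If the embedded timeline server really is saturated, one blunt mitigation sometimes suggested is to stop routing writer requests through it at all. A minimal sketch, assuming the standard `hoodie.embed.timeline.server` flag; the trade-off is losing the server's cached file-system view, so treat this as an illustration rather than a recommendation from this thread.)

```scala
// Sketch: disable the embedded timeline server so writers do not route
// marker/file-system-view requests through it. "hoodie.embed.timeline.server"
// is the standard Hudi writer flag (default true); merge this into your
// existing write options.
val timelineServerOff: Map[String, String] = Map(
  "hoodie.embed.timeline.server" -> "false"
)
```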
@xushiyan: @koochiswathiTR this is likely caused by a concurrency bug in handling marker creation requests at the timeline server, which is fixed by #6383 since the 0.12.1 release. Are you able to try the new release? If the job remains on the 0.11.1 release, you may set
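(The exact setting the comment refers to is truncated above and not recoverable from this thread. A plausible candidate is Hudi's marker-type option, which makes marker creation bypass the timeline server entirely. A minimal sketch, assuming the standard `hoodie.write.markers.type` option; every table, field, and path name below is a hypothetical placeholder, not the reporter's actual job.)

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

// Sketch only: all names here are hypothetical placeholders.
val spark = SparkSession.builder().appName("hudi-direct-markers").getOrCreate()
val df: DataFrame = spark.read.json("s3://my-bucket/incoming/") // hypothetical input

df.write.format("hudi")
  .option("hoodie.table.name", "my_table")                  // hypothetical
  .option("hoodie.datasource.write.recordkey.field", "id")  // hypothetical
  .option("hoodie.datasource.write.precombine.field", "ts") // hypothetical
  // Write markers directly to storage instead of through the embedded
  // timeline server, sidestepping the marker-creation requests failing here.
  .option("hoodie.write.markers.type", "DIRECT")
  .mode(SaveMode.Append)
  .save("s3://my-bucket/tables/my_table")                   // hypothetical
```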
@koochiswathiTR: any updates on this end? If the issue got resolved, can you please close it?
Seems like this issue is still showing in 0.13.0. Any updates on it?
I have a similar issue while trying to run a compaction job. Are you all seeing this during compaction too?
There is a fix for the marker file creation: #9003
We're using Hudi
Any ideas how to fix this?
I am having the same issue... any ideas?
We eventually abandoned Hudi on AWS/S3 entirely, and we are streaming the changes into a Redshift serverless cluster instead. It might be possible to tweak some configuration options to avoid the throttling or make Spark handle it more gracefully, but we could not make Hudi work reliably with our data sets in the end, despite having help from AWS support engineers.
The version specified is quite old; we recommend upgrading to Hudi 0.14.0+.
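(If you manage the Hudi dependency yourself, the upgrade is a one-line bundle bump. A sketch in sbt, assuming Spark 3.2 with Scala 2.12; match the bundle to your own Spark line.)

```scala
// build.sbt sketch: pin the Hudi Spark bundle at 0.14.0. The artifact name
// encodes the Spark minor version and Scala binary version; adjust both to
// match your cluster.
libraryDependencies += "org.apache.hudi" % "hudi-spark3.2-bundle_2.12" % "0.14.0" % "provided"
```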
We see "Failed to create marker file" errors:
org.apache.hudi.exception.HoodieRemoteException: Failed to create marker file w_profilerRef_16/91b1a776-2a6b-4a1f-945a-c34084bfb1b7-0_7002-8506-3604370_20221208045421755.parquet.marker.MERGE
ip-100-67-243-210.8043.aws-int.thomsonreuters.com:38839 failed to respond
(full stack trace in the Stacktrace section below)
Our Hudi Configs:
Have you gone through our FAQs?
Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
If you have triaged this as a bug, then file an issue directly.
Describe the problem you faced
We see these exceptions in our Spark Streaming application while ingesting data into Hudi datasets. Can you please help us here?
Environment Description (AWS)

- Hudi version : 0.11.1
- Spark version : 3.2.1
- Hive version : NA
- Hadoop version :
- Storage (HDFS/S3/GCS..) : S3
- Running on Docker? (yes/no) : no
Stacktrace
org.apache.hudi.exception.HoodieRemoteException: Failed to create marker file w_profilerRef_16/91b1a776-2a6b-4a1f-945a-c34084bfb1b7-0_7002-8506-3604370_20221208045421755.parquet.marker.MERGE
ip-100-67-243-210.8043.aws-int.thomsonreuters.com:38839 failed to respond
at org.apache.hudi.table.marker.TimelineServerBasedWriteMarkers.create(TimelineServerBasedWriteMarkers.java:149)
at org.apache.hudi.table.marker.WriteMarkers.create(WriteMarkers.java:64)
at org.apache.hudi.io.HoodieWriteHandle.createMarkerFile(HoodieWriteHandle.java:187)
at org.apache.hudi.io.HoodieMergeHandle.init(HoodieMergeHandle.java:187)
at org.apache.hudi.io.HoodieMergeHandle.<init>(HoodieMergeHandle.java:139)
at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.getUpdateHandle(HoodieSparkCopyOnWriteTable.java:257)
at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.handleUpdate(HoodieSparkCopyOnWriteTable.java:220)
at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:227)
at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$57154431$1(HoodieCompactor.java:138)
at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:223)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:352)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1498)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1408)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1472)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1295)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:133)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1474)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.http.NoHttpResponseException: ip-100-67-243-210.8043.aws-int.thomsonreuters.com:38839 failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.http.client.fluent.Request.execute(Request.java:143)
at org.apache.hudi.table.marker.TimelineServerBasedWriteMarkers.executeRequestToTimelineServer(TimelineServerBasedWriteMarkers.java:177)
at org.apache.hudi.table.marker.TimelineServerBasedWriteMarkers.create(TimelineServerBasedWriteMarkers.java:145)
... 31 more