Flaky not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive in Kudu #20697

Closed
findinpath opened this issue Feb 14, 2024 · 1 comment · Fixed by #20990
Labels
bug Something isn't working test

Comments

@findinpath
Contributor

https://github.com/trinodb/trino/actions/runs/7895760334/job/21548695765?pr=20371

Error:  io.trino.plugin.kudu.TestKuduWithDisabledInferSchemaConnectorSmokeTest -- Time elapsed: 20.40 s <<< ERROR!
io.trino.testing.QueryFailedException: not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive
	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:133)
	at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:496)
	at io.trino.testing.DistributedQueryRunner.execute(DistributedQueryRunner.java:481)
	at io.trino.testing.QueryAssertions.copyTable(QueryAssertions.java:517)
	at io.trino.testing.QueryAssertions.copyTable(QueryAssertions.java:510)
	at io.trino.testing.QueryAssertions.copyTpchTables(QueryAssertions.java:503)
	at io.trino.plugin.kudu.KuduQueryRunnerFactory.createKuduQueryRunnerTpch(KuduQueryRunnerFactory.java:107)
	at io.trino.plugin.kudu.KuduQueryRunnerFactory.createKuduQueryRunnerTpch(KuduQueryRunnerFactory.java:83)
	at io.trino.plugin.kudu.BaseKuduConnectorSmokeTest.createQueryRunner(BaseKuduConnectorSmokeTest.java:43)
	at io.trino.testing.AbstractTestQueryFramework.init(AbstractTestQueryFramework.java:113)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
	Suppressed: java.lang.Exception: SQL: CREATE TABLE IF NOT EXISTS nation AS SELECT * FROM tpch.tiny.nation
		at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:499)
		... 15 more
Caused by: io.trino.spi.TrinoException: not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive
	at io.trino.plugin.kudu.KuduClientSession.createTable(KuduClientSession.java:316)
	at io.trino.plugin.kudu.KuduMetadata.beginCreateTable(KuduMetadata.java:363)
	at io.trino.spi.connector.ConnectorMetadata.beginCreateTable(ConnectorMetadata.java:805)
	at io.trino.tracing.TracingConnectorMetadata.beginCreateTable(TracingConnectorMetadata.java:646)
	at io.trino.metadata.MetadataManager.beginCreateTable(MetadataManager.java:1118)
	at io.trino.tracing.TracingMetadata.beginCreateTable(TracingMetadata.java:598)
	at io.trino.sql.planner.optimizations.BeginTableWrite$Rewriter.createWriterTarget(BeginTableWrite.java:231)
	at io.trino.sql.planner.optimizations.BeginTableWrite$Rewriter.visitTableFinish(BeginTableWrite.java:174)
	at io.trino.sql.planner.optimizations.BeginTableWrite$Rewriter.visitTableFinish(BeginTableWrite.java:92)
	at io.trino.sql.planner.plan.TableFinishNode.accept(TableFinishNode.java:105)
	at io.trino.sql.planner.plan.SimplePlanRewriter$RewriteContext.rewrite(SimplePlanRewriter.java:81)
	at io.trino.sql.planner.plan.SimplePlanRewriter$RewriteContext.lambda$defaultRewrite$0(SimplePlanRewriter.java:72)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:423)
	at io.trino.sql.planner.plan.SimplePlanRewriter$RewriteContext.defaultRewrite(SimplePlanRewriter.java:72)
	at io.trino.sql.planner.plan.SimplePlanRewriter.visitPlan(SimplePlanRewriter.java:37)
	at io.trino.sql.planner.plan.SimplePlanRewriter.visitPlan(SimplePlanRewriter.java:21)
	at io.trino.sql.planner.plan.PlanVisitor.visitOutput(PlanVisitor.java:49)
	at io.trino.sql.planner.plan.OutputNode.accept(OutputNode.java:82)
	at io.trino.sql.planner.plan.SimplePlanRewriter.rewriteWith(SimplePlanRewriter.java:31)
	at io.trino.sql.planner.optimizations.BeginTableWrite.optimize(BeginTableWrite.java:77)
	at io.trino.sql.planner.LogicalPlanner.runOptimizer(LogicalPlanner.java:309)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:270)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:239)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:234)
	at io.trino.execution.SqlQueryExecution.doPlanQuery(SqlQueryExecution.java:486)
	at io.trino.execution.SqlQueryExecution.planQuery(SqlQueryExecution.java:466)
	at io.trino.execution.SqlQueryExecution.start(SqlQueryExecution.java:404)
	at io.trino.execution.SqlQueryManager.createQuery(SqlQueryManager.java:264)
	at io.trino.dispatcher.LocalDispatchQuery.startExecution(LocalDispatchQuery.java:145)
	at io.trino.dispatcher.LocalDispatchQuery.lambda$waitForMinimumWorkers$2(LocalDispatchQuery.java:129)
	at io.airlift.concurrent.MoreFutures.lambda$addSuccessCallback$12(MoreFutures.java:568)
	at io.airlift.concurrent.MoreFutures$3.onSuccess(MoreFutures.java:543)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1135)
	at io.trino.$gen.Trino_testversion____20240214_025310_2767.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
	Suppressed: java.lang.Exception: Current plan:
                Output[columnNames = [rows]]
                │   Layout: [rows:bigint]
                └─ TableCommit[target = kudu.default.nation]
                   │   Layout: [rows:bigint]
                   └─ LocalExchange[partitioning = SINGLE]
                      │   Layout: [partialrows:bigint, fragment:varbinary]
                      └─ RemoteExchange[type = GATHER]
                         │   Layout: [partialrows:bigint, fragment:varbinary]
                         └─ TableWriter[]
                            │   Layout: [partialrows:bigint, fragment:varbinary]
                            │   nationkey := nationkey
                            │   name := name
                            │   regionkey := regionkey
                            │   comment := comment
                            └─ RemoteExchange[partitionCount = 100, type = REPARTITION]
                               │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
                               └─ TableScan[table = tpch:tiny:nation]
                                      Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
                                      nationkey := tpch:nationkey
                                      regionkey := tpch:regionkey
                                      name := tpch:name
                                      comment := tpch:comment

		at io.trino.sql.planner.optimizations.BeginTableWrite.optimize(BeginTableWrite.java:83)
		... 17 more
Caused by: org.apache.kudu.client.NonRecoverableException: not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive
	at org.apache.kudu.client.KuduException.transformException(KuduException.java:110)
	at org.apache.kudu.client.KuduClient.joinAndHandleException(KuduClient.java:470)
	at org.apache.kudu.client.KuduClient.createTable(KuduClient.java:138)
	at io.trino.plugin.kudu.ForwardingKuduClient.createTable(ForwardingKuduClient.java:40)
	at io.trino.plugin.kudu.KuduClientSession.createTable(KuduClientSession.java:313)
	... 36 more
	Suppressed: org.apache.kudu.client.KuduException.OriginalException: Original asynchronous stack trace
		at org.apache.kudu.client.RpcProxy.dispatchMasterError(RpcProxy.java:414)
		at org.apache.kudu.client.RpcProxy.responseReceived(RpcProxy.java:288)
		at org.apache.kudu.client.RpcProxy.access$000(RpcProxy.java:64)
		at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:158)
		at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:154)
		at org.apache.kudu.client.Connection.channelRead0(Connection.java:362)
		at org.apache.kudu.shaded.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at org.apache.kudu.shaded.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1534)
		at org.apache.kudu.shaded.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1283)
		at org.apache.kudu.shaded.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1330)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at org.apache.kudu.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at org.apache.kudu.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
		at org.apache.kudu.shaded.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
		at org.apache.kudu.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
		at org.apache.kudu.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
		... 3 more
@ebyhr
Member

ebyhr commented Feb 16, 2024

Error:  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 19.90 s <<< FAILURE! -- in io.trino.plugin.kudu.TestKuduIntegrationIntegerColumns
Error:  io.trino.plugin.kudu.TestKuduIntegrationIntegerColumns -- Time elapsed: 19.90 s <<< ERROR!
io.trino.testing.QueryFailedException: not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive
	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:133)
	at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:496)
	at org.apache.kudu.client.KuduException.transformException(KuduException.java:110)
	at org.apache.kudu.client.KuduClient.joinAndHandleException(KuduClient.java:564)
	at org.apache.kudu.client.KuduClient.createTable(KuduClient.java:140)
	at io.trino.plugin.kudu.ForwardingKuduClient.createTable(ForwardingKuduClient.java:40)
	at io.trino.plugin.kudu.schema.SchemaEmulationByTableNameConvention.createTableIfNotExists(SchemaEmulationByTableNameConvention.java:168)
	at io.trino.plugin.kudu.schema.SchemaEmulationByTableNameConvention.createAndFillSchemasTable(SchemaEmulationByTableNameConvention.java:154)
	at io.trino.plugin.kudu.schema.SchemaEmulationByTableNameConvention.listSchemaNames(SchemaEmulationByTableNameConvention.java:115)
	... 31 more
	Suppressed: org.apache.kudu.client.KuduException.OriginalException: Original asynchronous stack trace
		at org.apache.kudu.client.RpcProxy.dispatchMasterError(RpcProxy.java:414)
		at org.apache.kudu.client.RpcProxy.responseReceived(RpcProxy.java:288)
		at org.apache.kudu.client.RpcProxy.access$000(RpcProxy.java:64)
		at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:158)
		at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:154)
		at org.apache.kudu.client.Connection.channelRead0(Connection.java:377)
		at org.apache.kudu.shaded.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at org.apache.kudu.shaded.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383)
		at org.apache.kudu.shaded.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246)
		at org.apache.kudu.shaded.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
		at org.apache.kudu.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at org.apache.kudu.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
		at org.apache.kudu.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at org.apache.kudu.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
		at org.apache.kudu.shaded.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
		at org.apache.kudu.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
		at org.apache.kudu.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
		at org.apache.kudu.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
		... 3 more

https://github.com/trinodb/trino/actions/runs/7926313890/job/21641336652

@ebyhr changed the title from "TestKuduWithDisabledInferSchemaConnectorSmokeTest flaky test" to "not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive in Kudu" on Feb 16, 2024
@ebyhr changed the title from "not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive in Kudu" to "Flaky not enough live tablet servers to create a table with the requested replication factor 1; 0 tablet servers are alive in Kudu" on Feb 16, 2024