/CC @relud, there appears to be a persistent issue with telemetry live sink that will need to be addressed this week (or dropped and recovered from pbd, but we don't have an automated mechanism for that):
{
insertId: "ul5b6261edy7hygbk"
jsonPayload: {
endOfBatch: false
instant: {
epochSecond: 1590375717
nanoOfSecond: 429000000
}
level: "ERROR"
loggerFqcn: "org.apache.logging.slf4j.Log4jLogger"
loggerName: "com.mozilla.telemetry.ingestion.sink.io.Pubsub$Read"
message: "failed to deliver message"
thread: "Gax-2"
threadId: 25
threadPriority: 5
thrown: {
commonElementCount: 0
extendedStackTrace: "java.lang.NullPointerException: null
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.coerceToBqType(PubsubMessageToObjectNode.java:508) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.processField(PubsubMessageToObjectNode.java:395) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.transformForBqSchema(PubsubMessageToObjectNode.java:286) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.processField(PubsubMessageToObjectNode.java:374) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.transformForBqSchema(PubsubMessageToObjectNode.java:286) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.processField(PubsubMessageToObjectNode.java:374) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.transformForBqSchema(PubsubMessageToObjectNode.java:286) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.processField(PubsubMessageToObjectNode.java:342) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.transformForBqSchema(PubsubMessageToObjectNode.java:286) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.processField(PubsubMessageToObjectNode.java:342) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.transformForBqSchema(PubsubMessageToObjectNode.java:286) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload.apply(PubsubMessageToObjectNode.java:243) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.transform.PubsubMessageToObjectNode$Payload$WithOpenCensusMetrics.apply(PubsubMessageToObjectNode.java:114) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.io.Gcs$Write$Ndjson.encodeInput(Gcs.java:51) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.io.Gcs$Write$Ndjson.encodeInput(Gcs.java:34) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.util.BatchWrite.apply(BatchWrite.java:148) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.util.BatchWrite.apply(BatchWrite.java:26) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.config.SinkConfig$Output.apply(SinkConfig.java:128) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.mozilla.telemetry.ingestion.sink.config.SinkConfig$Output.apply(SinkConfig.java:109) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:995) ~[?:1.8.0_252]
at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2137) ~[?:1.8.0_252]
at com.mozilla.telemetry.ingestion.sink.io.Pubsub$Read.lambda$new$2(Pubsub.java:46) ~[ingestion-sink-0.1-SNAPSHOT.jar:?]
at com.google.cloud.pubsub.v1.MessageDispatcher$4.run(MessageDispatcher.java:379) [google-cloud-pubsub-1.102.1.jar:1.102.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_252]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_252]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
"
name: "java.lang.NullPointerException"
}
}
From my read, it appears `coerceToBqType` is receiving a `null` as its `JsonNode o` argument and `o.isNull()` is not having any of it. /CC @jklukas as well since he modified the code most recently.
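To make the failure mode concrete, here is a minimal sketch of the difference between a Java `null` reference and a JSON null node. It assumes a hypothetical `Node` interface standing in for Jackson's `JsonNode`, and `coerceToString` is an illustrative helper, not the project's `coerceToBqType`. Calling `isNull()` on a Java `null` reference throws before the method can answer, so the reference has to be checked first. One place such a reference can come from in Jackson, for example, is `get(fieldName)` on an object node that lacks the field, whereas `path(fieldName)` returns a missing-node placeholder; whether that is the cause here is not confirmed.

```java
import java.util.Optional;

final class NullSafeCoercion {
    /** Hypothetical stand-in for a JSON tree node; NOT the Jackson JsonNode API. */
    interface Node {
        boolean isNull();
        String asText();
    }

    /** Represents an explicit JSON null value, which is still a real Java object. */
    static final Node JSON_NULL = new Node() {
        @Override public boolean isNull() { return true; }
        @Override public String asText() { return "null"; }
    };

    /** Builds a text node for illustration. */
    static Node text(final String value) {
        return new Node() {
            @Override public boolean isNull() { return false; }
            @Override public String asText() { return value; }
        };
    }

    /**
     * Treats a Java null reference and a JSON null node the same way.
     * Without the `o == null` check, `o.isNull()` would throw a
     * NullPointerException for a Java null reference.
     */
    static Optional<String> coerceToString(Node o) {
        if (o == null || o.isNull()) {
            return Optional.empty();
        }
        return Optional.of(o.asText());
    }
}
```

Whether a missing value should be dropped, replaced, or rejected is a decision for the real transform; the sketch only shows where the guard has to sit.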
I don't think it would have been worth the effort, but in case it is ever needed, it's possible to recover from an undeliverable message by this process:
1. Stop the decoder.
2. Wait for the sink to deliver all deliverable messages, i.e. wait for unacked messages to stabilize for 5 minutes (sink) or 30 minutes (loader).
3. Stop the sink.
4. Download and remove outstanding message(s) from the sink subscription, e.g. `gcloud pubsub subscriptions pull SUBSCRIPTION --auto-ack --format=json > bad_messages.json`
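The "wait for unacked messages to stabilize" step can be sketched as a small check over periodic samples of the subscription's unacked count. This is only an illustration: `BacklogStability`, `Sample`, and `isStable` are hypothetical names, and how the samples are collected (for example from a monitoring metric) is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

final class BacklogStability {
    /** One observation of the unacked message count (hypothetical type). */
    static final class Sample {
        final Instant at;
        final long unacked;

        Sample(Instant at, long unacked) {
            this.at = at;
            this.unacked = unacked;
        }
    }

    /**
     * Returns true when the samples (ordered oldest first) reach back at least
     * `window` from the newest sample and every sample in that span reports
     * the same count as the newest one. Returns false when the count changed
     * or when there is not enough history to cover the window.
     */
    static boolean isStable(List<Sample> samples, Duration window) {
        if (samples.isEmpty()) {
            return false;
        }
        Sample last = samples.get(samples.size() - 1);
        Instant cutoff = last.at.minus(window);
        // Walk from newest to oldest until the window is covered.
        for (int i = samples.size() - 1; i >= 0; i--) {
            Sample s = samples.get(i);
            if (s.unacked != last.unacked) {
                return false; // count changed, so the sink may still be delivering
            }
            if (!s.at.isAfter(cutoff)) {
                return true; // unchanged all the way back to the cutoff
            }
        }
        return false; // history is shorter than the window
    }
}
```

With a 5-minute window for the sink (or 30 minutes for the loader), the sink would be stopped only once `isStable` returns true; the remaining unacked count is then the set of messages to pull.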
We could also use a dead letter queue if 100 delivery attempts is sufficient to consider a message undeliverable. In any case I don't plan on automating any mechanisms around handling undeliverable messages since I agree it's not worth the effort.