Unable to query large data set with scan-query via broker #4865
Oops... looks like I must specify a very long timeout, which is weird to me in a streaming case ;)
The error described above can be fixed by explicitly specifying a timeout in the query:

{
  ...
  "context": {
    "timeout": 36000000
  }
}

However, when I query a larger data set (e.g., tens of GB of merged and compressed segments, larger than the memory allocated), the query fails with:

java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Query[426a21c5-5420-46de-9fda-7f080fbf7e3b] url[http://master-c457d875.node.local:8083/druid/v2/] failed with exception msg [java.lang.OutOfMemoryError: GC overhead limit exceeded]
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.server.QueryResource$1.write(QueryResource.java:218) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:71) ~[jersey-core-1.19.3.jar:1.19.3]
at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:57) ~[jersey-core-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:302) ~[jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1510) ~[jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) ~[jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) ~[jersey-server-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) ~[jersey-servlet-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) ~[jersey-servlet-1.19.3.jar:1.19.3]
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) ~[jersey-servlet-1.19.3.jar:1.19.3]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]
at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) ~[guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) ~[guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) ~[guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) ~[guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) ~[guice-servlet-4.1.0.jar:?]
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) ~[guice-servlet-4.1.0.jar:?]
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) ~[jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.Server.handle(Server.java:534) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Query[426a21c5-5420-46de-9fda-7f080fbf7e3b] url[http://master-c457d875.node.local:8083/druid/v2/] failed with exception msg [java.lang.OutOfMemoryError: GC overhead limit exceeded]
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:139) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:800) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:642) ~[jackson-databind-2.4.6.jar:2.4.6]
at io.druid.server.QueryResource$1.write(QueryResource.java:210) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
... 39 more
Caused by: io.druid.java.util.common.RE: Query[426a21c5-5420-46de-9fda-7f080fbf7e3b] url[http://master-c457d875.node.local:8083/druid/v2/] failed with exception msg [java.lang.OutOfMemoryError: GC overhead limit exceeded]
at io.druid.client.DirectDruidClient$1$3.hasMoreElements(DirectDruidClient.java:286) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
at java.io.SequenceInputStream.nextStream(SequenceInputStream.java:109) ~[?:1.8.0_144]
at java.io.SequenceInputStream.close(SequenceInputStream.java:232) ~[?:1.8.0_144]
at com.fasterxml.jackson.dataformat.smile.SmileParser._closeInput(SmileParser.java:452) ~[jackson-dataformat-smile-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.core.base.ParserBase.close(ParserBase.java:334) ~[jackson-core-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.dataformat.smile.SmileParser.close(SmileParser.java:472) ~[jackson-dataformat-smile-2.4.6.jar:2.4.6]
at io.druid.client.DirectDruidClient$JsonParserIterator.close(DirectDruidClient.java:653) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.java.util.common.guava.CloseQuietly.close(CloseQuietly.java:39) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.client.DirectDruidClient$3.cleanup(DirectDruidClient.java:531) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.client.DirectDruidClient$3.cleanup(DirectDruidClient.java:521) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.java.util.common.guava.BaseSequence$2.close(BaseSequence.java:142) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.java.util.common.io.Closer.close(Closer.java:206) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.java.util.common.guava.MergeSequence$3.close(MergeSequence.java:158) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.java.util.common.guava.WrappingYielder.close(WrappingYielder.java:81) ~[java-util-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.jackson.DruidDefaultSerializersModule$4.serialize(DruidDefaultSerializersModule.java:132) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
at io.druid.jackson.DruidDefaultSerializersModule$4.serialize(DruidDefaultSerializersModule.java:118) ~[druid-processing-0.10.1-iap3.jar:0.10.1-iap3]
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:128) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:800) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:642) ~[jackson-databind-2.4.6.jar:2.4.6]
at io.druid.server.QueryResource$1.write(QueryResource.java:210) ~[druid-server-0.10.1-iap3.jar:0.10.1-iap3]
... 39 more
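For reference, the timeout workaround mentioned above is just a query-context key on the scan query. A minimal sketch of a full query body, where the datasource name and interval are hypothetical placeholders:

```python
import json

# Hypothetical scan query with an explicit long timeout in the context,
# as described in the comment above; dataSource and intervals are placeholders.
query = {
    "queryType": "scan",
    "dataSource": "my_datasource",
    "intervals": ["2017-01-01/2017-02-01"],
    "resultFormat": "compactedList",
    "context": {"timeout": 36000000},  # milliseconds (10 hours)
}
print(json.dumps(query, indent=2))
```

This body would then be POSTed to the broker's /druid/v2/ endpoint, as in the original report.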
Looks like there's no backpressure for scan-query at the broker. If I query directly through all the historicals, it works. It would be nice to have scan-query work via broker.
IIRC there is no real backpressure between broker and historicals. This was discussed in #4229, where maxScatterGatherBytes was introduced as a workaround to at least keep clusters stable (although it artificially limits response sizes). One thing you could do is query the historicals directly if you are really pulling a huge amount of data. That should work well, since there is some backpressure there (the historicals will pause scanning segments if you aren't reading the results fast enough). I think it would also be good to investigate ways to implement backpressure in the broker. If you are into helping there, @stevenchen3, that would be great :)
@gianm thanks for the explanation. If I query directly from the historicals, for example, to get all raw data, will the results include duplicate entries (segments)? If I understand correctly, segments are replicated among historicals.
@stevenchen3 Sorry for the late response, but if you query directly from historicals, you'd want to specify a specific set of segments to each historical (like the broker does) in order to prevent duplicates.
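To illustrate the per-historical segment pinning described above: a query sent directly to a historical can use a segment-based interval spec so that replicated segments aren't returned twice. This is only a sketch; the descriptor values are placeholders, and the field names assume Druid's SegmentDescriptor serialization ("itvl" = interval, "ver" = version, "part" = partition number):

```python
import json

# Hypothetical direct-to-historical scan query pinning specific segments.
# The segment descriptors are placeholders; in practice they would come
# from the cluster's segment metadata, and each historical would be sent
# only the segments it serves (as the broker does).
query = {
    "queryType": "scan",
    "dataSource": "my_datasource",
    "intervals": {
        "type": "segments",
        "segments": [
            {
                "itvl": "2017-01-01/2017-01-02",
                "ver": "2017-01-03T00:00:00.000Z",
                "part": 0,
            }
        ],
    },
    "resultFormat": "compactedList",
}
print(json.dumps(query, indent=2))
```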
A short-circuit path at the broker for scan queries might help, while there are multiple PRs trying to solve this by limiting the memory used in DirectDruidClient, which is great in general.
@himanshug But what happens if the results from historicals show up faster than they can be written to the client? Where are they buffered?
@gianm I think in that case they would be buffered at the broker, using the same code that the other PRs are adding. The short-circuit path would be more about avoiding everything else related to merging. But maybe no-op merging is not much overhead for scan queries, and in that case we wouldn't need it.
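The buffering vs. backpressure trade-off being discussed can be pictured with a toy bounded buffer: when the consumer (the client connection) falls behind, a bounded queue makes the producer (results arriving from historicals) block instead of accumulating results on the heap. A minimal sketch, not Druid code:

```python
import queue
import threading

# Toy model of backpressure: a bounded queue makes the producer (results
# arriving from historicals) block when the consumer (the HTTP client
# reading the response) falls behind, instead of buffering unboundedly.
buf = queue.Queue(maxsize=4)  # bounded buffer standing in for the broker

def producer():
    for i in range(10):
        buf.put(i)   # blocks when the buffer is full -> backpressure
    buf.put(None)    # sentinel: no more results

received = []

def consumer():
    while True:
        item = buf.get()
        if item is None:
            break
        received.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(received)  # all 10 items delivered despite the small buffer
```

Without the maxsize bound, a slow consumer would let the buffer grow without limit, which is essentially the OOM failure mode reported in this issue.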
Hope to fix this via #6313.
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time. |
By the way, this should be fixed now by #6313. You would need to enable it by setting |
Recently, I ran into an issue where I failed to query an interval with about 4 merged and compressed segments, each of which is about 500MB (about 1.9 million rows; extracted from the compressed segment, the size is about 7GB), using scan-query (version 0.10.1). It works for intervals with 2 such segments.

I ran the query using curl, writing the result directly to a local file, and observed that the memory consumption of the broker and historical processes grows quickly as the query runs, eventually reaching the allocated memory limit (memory pressure seems to occur). And I eventually got the following exception from the historical (similar errors on the broker), and the query failed:

If I understand correctly, scan-query should solve the memory pressure issue that select-query has. Any advice on what might go wrong? Thanks.