Query search fails with too many users searching in parallel #352

Closed
AjitPS opened this issue May 13, 2019 · 11 comments

@AjitPS
Collaborator

AjitPS commented May 13, 2019

In KnetMiner Wheat (and other instances), query search fails during training workshops when multiple attendees run the same query (e.g., Wheat example query 3) and press Search at the same time. The server code was designed to handle multi-threaded requests, but this doesn't work in practice. The issue was found in KnetMiner release 3.0: https://github.com/Rothamsted/knetminer/releases/tag/v3.0

Complete logs from the training workshop (KnetMiner ws.log and Tomcat's localhost_access log) are attached. The failure may also occur with different queries when too many are launched in parallel, so this needs more investigation by us.
ws_knet_workshop.log
localhost_access_log_knet_workshop.txt

Also, it may be useful to build some training-specific example queries, as our current example queries are for larger use cases (returning 5k+ results); they work better for conference presentations than for training workshops.

@AjitPS
Collaborator Author

AjitPS commented May 13, 2019

Full logs attached. We need to review ws.log, the server/dataSource code, and OndexServiceProvider to see why it fails to cope with concurrent multi-user access/queries.

@AjitPS AjitPS changed the title Query search fails with too many users searching at the same time Query search fails with too many users searching in parallel May 13, 2019
@marco-brandizi
Member

marco-brandizi commented May 14, 2019

I'm not sure the code is written entirely in a thread-safe way. For instance, there are maps like mapGene2HitConcept that are unsynchronised and written concurrently. Re-arranging OndexServiceProvider would be a preliminary step to address bugs like this better.
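
As a minimal sketch of the hazard described here (not KnetMiner code: the class, method, and key names are hypothetical, and the map only stands in for something like mapGene2HitConcept), several threads can update one shared map. With `ConcurrentHashMap`, each `merge` call is atomic, so the final count is consistent; passing a plain `HashMap` instead could lose updates.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of KnetMiner.
class SharedMapDemo {

    // Runs `threads` workers that each record `hitsPerThread` hits for the same key,
    // then returns the total stored in the shared map.
    static int countHits(Map<String, Integer> geneToHits, int threads, int hitsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    // Atomic on ConcurrentHashMap; on a plain HashMap the same call
                    // is a read-modify-write race between threads.
                    geneToHits.merge("geneA", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return geneToHits.getOrDefault("geneA", 0);
    }

    public static void main(String[] args) {
        int total = countHits(new ConcurrentHashMap<>(), 8, 10_000);
        System.out.println("hits recorded: " + total);
    }
}
```

Note that a concurrent map only protects individual operations. If the map holds per-query results, two simultaneous queries would still mix their entries in it.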

Moreover, we could take a bug like this as an opportunity to add unit tests: it's now possible to use the testing architecture already in place for aratiny-ws (i.e., add tests to that project). Features from JUnit 5 would be another tool worth considering.

@dicknetherlands
Contributor

Yes, this is a simple thread-safety issue. For some reason the original Knetminer code uses global static variables to store state during a query, rather than instance variables within an object. Changing these (and creating an object instance per query rather than a static global search method) will fix this.
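
A rough sketch of the suggested shape, assuming nothing about the real KnetMiner classes: `QuerySearch`, `run`, and the sample gene names are all made up for illustration. The point is only that per-query state sits in instance fields, and each request creates its own object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; this is not the KnetMiner API.
class QuerySearch {
    // Per-query state lives in instance fields, so each request has its own copy.
    private final String keyword;
    private final List<String> hits = new ArrayList<>();

    QuerySearch(String keyword) {
        this.keyword = keyword;
    }

    // Collects the genes whose name contains the keyword
    // (a stand-in for the real search logic).
    List<String> run(List<String> genes) {
        for (String gene : genes) {
            if (gene.contains(keyword)) {
                hits.add(gene);
            }
        }
        return hits;
    }
}

class QuerySearchDemo {
    public static void main(String[] args) {
        // Made-up sample data.
        List<String> genes = Arrays.asList("TaNAC1", "TaWRKY2", "TaNAC3");
        // One object per query: concurrent requests never share `hits`.
        System.out.println(new QuerySearch("NAC").run(genes));
        System.out.println(new QuerySearch("WRKY").run(genes));
    }
}
```

With a static `hits` list instead, two simultaneous searches would append to the same list and each caller could see the other's results.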

@AjitPS
Collaborator Author

AjitPS commented Jul 29, 2019

We have documented some of the API modes at:
https://docs.google.com/document/d/1KyZaBwq0uLnK9NIArIytRrI1CN6xZ5hkG21Nro1KyCo/edit?usp=sharing @dicknetherlands

The genepage mode renders the response network so is not the best one to test parallel user queries via unit tests or JMeter.

You could use the other modes: countHits (returns JSON with count/no. of matches for query), network (just NetworkView JSON-like response without rendering it) and genome (returns the GeneView .tab, EvidenceView .tab and MapView/gviewer .xml all wrapped as JSON), e.g.:

@mdonepudi @KeywanHP fyi, you could use similar queries for these 3 modes to test 1000s of parallel queries for countHits, network or genome modes, to ensure all work if launched in parallel.
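
One possible shape for such a parallel check, as a sketch only: `ParallelQueryCheck` and `countFailures` are hypothetical names, and the `queryCall` function is an assumed stand-in for whatever actually sends a countHits, network, or genome request and returns the response body.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical harness; not part of KnetMiner or its test suite.
class ParallelQueryCheck {

    // Submits the same keyword `requests` times across `threads` workers and
    // returns how many calls threw or came back with an empty response.
    static int countFailures(Function<String, String> queryCall, String keyword,
                             int requests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                Callable<String> task = () -> queryCall.apply(keyword);
                futures.add(pool.submit(task));
            }
            int failures = 0;
            for (Future<String> future : futures) {
                try {
                    String response = future.get();
                    if (response == null || response.isEmpty()) {
                        failures++;
                    }
                } catch (ExecutionException e) {
                    // The query itself threw an exception.
                    failures++;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
            return failures;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stub query for illustration; a real run would issue an HTTP request here.
        Function<String, String> stub = keyword -> "{\"keyword\": \"" + keyword + "\"}";
        System.out.println("failures: " + countFailures(stub, "example", 1000, 50));
    }
}
```

Keeping the request behind a plain function means the same harness could be pointed at each of the three modes in turn.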

@AjitPS
Collaborator Author

AjitPS commented Jul 30, 2019

Would be good to know what in https://github.com/Rothamsted/knetminer/blob/20190722_mdjuly2019/common/server-base/src/main/java/rres/knetminer/datasource/server/KnetminerServer.java breaks when multiple users query KnetMiner, as shown by @mdonepudi earlier today, given that it fails before invoking methods from OndexServiceProvider.

The genome mode query does the search and also uses the large HashMaps (now ConcurrentHashMaps, changed by @mdonepudi), which may explain why it happens in tests for genome mode and not the others (countHits, network).

Once we fix it, i.e., get valid responses back for parallel queries, it would be good to see if we can prevent a major performance degradation from ConcurrentHashMaps. @mdonepudi @dicknetherlands

@dicknetherlands
Contributor

This variable

and possibly others too are not thread safe. That is why it is failing. If you run two searches at once, they both update this variable at the same time. (Caveat: I have not seen @mdonepudi 's latest work so don't know if he's already fixed this one.)

@dicknetherlands
Contributor

would be good to know what in https://github.com/Rothamsted/knetminer/blob/20190722_mdjuly2019/common/server-base/src/main/java/rres/knetminer/datasource/server/KnetminerServer.java breaks when multiple users query KnetMiner as shown by @mdonepudi earlier today, that it fails before invoking methods from OndexServiceProvider.

could you post a stack trace?

@AjitPS
Collaborator Author

AjitPS commented Jul 30, 2019

thanks Richard, @mdonepudi showed us a demo earlier so maybe could share the stack trace here.

@mdonepudi
Contributor

Below is the stack trace. I am not ruling out OndexServiceProvider as the cause of this exception. I changed all HashMaps in OndexServiceProvider to ConcurrentHashMaps to isolate the cause, and it still throws this exception.

21:48:41.404 [qtp1799431661-83] ERROR rres.knetminer.datasource.server.KnetminerServer - Exception while running genome
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
    at rres.knetminer.datasource.server.KnetminerServer._handle(KnetminerServer.java:268) [classes/:?]
    at rres.knetminer.datasource.server.KnetminerServer.handle(KnetminerServer.java:206) [classes/:?]
    at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) [spring-web-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) [spring-web-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:116) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) [javax.servlet-api-3.1.0.jar:3.1.0]
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) [spring-webmvc-4.3.6.RELEASE.jar:4.3.6.RELEASE]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:867) [jetty-servlet-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623) [jetty-servlet-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.apache.logging.log4j.web.Log4jServletFilter.doFilter(Log4jServletFilter.java:71) [log4j-web-2.3.jar:2.3]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610) [jetty-servlet-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540) [jetty-servlet-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) [jetty-security-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480) [jetty-servlet-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.Server.handle(Server.java:502) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) [jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) [jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) [jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
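
The trace above stops at `InvocationTargetException`, which is the wrapper that `Method.invoke` throws when the invoked method itself fails; the underlying exception is available through `getCause()`. A small sketch of unwrapping it follows. The class and the `genome` stand-in method are hypothetical and only illustrate the reflection behaviour, not how KnetminerServer is written.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical demo class; not KnetMiner code.
class ReflectiveCallDemo {

    // Stand-in for a handler method that is looked up by name and fails internally.
    public static String genome(String keyword) {
        throw new IllegalStateException("failure inside genome for: " + keyword);
    }

    // Invokes the named method reflectively and returns the real failure,
    // or null if the call succeeded.
    static Throwable invokeAndUnwrap(String methodName, String keyword) {
        try {
            Method method = ReflectiveCallDemo.class.getMethod(methodName, String.class);
            method.invoke(null, keyword);
            return null;
        } catch (InvocationTargetException e) {
            // The wrapper carries the exception thrown by the target method.
            return e.getCause();
        } catch (ReflectiveOperationException e) {
            // Lookup or access problems (no such method, illegal access).
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable cause = invokeAndUnwrap("genome", "example");
        System.out.println("underlying cause: " + cause);
    }
}
```

Logging the cause rather than the wrapper would show which line inside the invoked method actually failed.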

@dicknetherlands
Contributor

dicknetherlands commented Jul 30, 2019 via email

AjitPS added a commit that referenced this issue Jul 31, 2019
#352 made variables (scoredCandidates and mapGene2Concepts) threadsafe
@AjitPS
Collaborator Author

AjitPS commented Jul 31, 2019

fyi, added by @mdonepudi in #390

AjitPS added a commit that referenced this issue Dec 15, 2019