Bug: min_threads / pixeldata.threads mismatch leads to hang #125

Open
joshmoore opened this issue Apr 21, 2021 · 1 comment
@joshmoore
Member

see: https://forum.image.sc/t/pixeldata-threads-and-pyramid-generation-issues/49794

Summary of results: In general, it seems that omero.sessions.timeout does matter, but only when omero.threads.min_threads does not exceed omero.pixeldata.threads, and only when both files are imported in a single import.

@chris-allan
Member

chris-allan commented Jan 24, 2023

With the research that is going into #154, specifically looking at the jstack output from @JulianHn on the image.sc thread, I believe at least part of the issue here is a bug with ome.services.pixeldata.PixelDataThread#doRun() when handling exceptional execution conditions. The current code looks like this:

    public void doRun() {
        if (performProcessing) {
            final ExecutorCompletionService<Object> ecs =
                new ExecutorCompletionService<Object>(executor.getService(),
                    new ArrayBlockingQueue<Future<Object>>(numThreads));
            @SuppressWarnings("unchecked")
            List<EventLog> eventLogs = (List<EventLog>)
                executor.execute(getPrincipal(), work);
            for (final EventLog log : eventLogs) {
                ecs.submit(new Callable<Object>(){
                    @Override
                    public Object call()
                        throws Exception
                    {
                        return go(log);
                    }
                });
            }
            int count = eventLogs.size();
            while (count > 0) {
                try {
                    Future<Object> future = ecs.poll(500, TimeUnit.MILLISECONDS);
                    if (future != null && future.get() != null) {
                        count--;
                    }
                } catch (ExecutionException ee) {
                    onExecutionException(ee);
                } catch (InterruptedException ie) {
                    log.debug("Interrupted; looping", ie);
                }
            }
        }
    }

You will notice that the while loop starting on line 256 keeps running as long as count > 0, and that on the line above it count is set to the number of event logs that were submitted for processing. If one of those event log processing tasks completes exceptionally, calling future.get() throws an ExecutionException [1], which is then caught and handled by onExecutionException(). Unfortunately, when this happens count is not decremented, so the while loop continues indefinitely and no new event logs are processed.

I think we can fix this easily enough while dealing with #154.

Edit: Similarly, if an InterruptedException is thrown and caught inside the loop, count is not decremented either.
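To make the failure mode concrete, here is a minimal, self-contained sketch of one way the wait loop could account for every finished task, whether it succeeded, failed, or returned null. It is not the actual fix for PixelDataThread: the class DrainSketch and the method runAll are hypothetical names, the default completion queue is used instead of the bounded ArrayBlockingQueue, and stopping on interruption (rather than logging and looping) is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the submit-and-wait part of doRun().
public class DrainSketch {

    /**
     * Submits every task, then waits until each one has finished.
     * Returns the number of tasks that completed exceptionally.
     */
    public static int runAll(ExecutorService pool, List<Callable<Object>> tasks) {
        ExecutorCompletionService<Object> ecs = new ExecutorCompletionService<>(pool);
        for (Callable<Object> task : tasks) {
            ecs.submit(task);
        }
        int remaining = tasks.size();
        int failed = 0;
        while (remaining > 0) {
            Future<Object> future;
            try {
                future = ecs.poll(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ie) {
                // Assumption: treat interruption as a request to stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            if (future == null) {
                continue; // poll timed out, no task has finished yet
            }
            // A task finished, so count it before inspecting its outcome.
            remaining--;
            try {
                future.get();
            } catch (ExecutionException ee) {
                failed++; // the real code would call onExecutionException(ee)
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Object> ok = () -> "ok";
            Callable<Object> failing = () -> { throw new IllegalStateException("boom"); };
            Callable<Object> returnsNull = () -> null;
            int failed = runAll(pool, Arrays.asList(ok, failing, returnsNull));
            System.out.println("failed tasks: " + failed); // prints "failed tasks: 1"
        } finally {
            pool.shutdown();
        }
    }
}
```

The key difference is that the counter is decremented as soon as poll() hands back a finished future, before get() has a chance to throw. Note that the original condition future.get() != null would also skip the decrement for a task that returns null; whether go() can actually return null is not established here.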

/cc @jburel, @erickmartins, @perlman

Footnotes

  1. https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/Future.html#get()
