WriteBatcher hangs in awaitCompletion #836

Closed
srinathgit opened this issue Oct 5, 2017 · 2 comments

Comments

@srinathgit
Contributor

A. The following test was run on a 3-node cluster with a database associated with 3 forests.
B. During the job, one of the hosts, "rh7v-intel64-90-test-9.marklogic.com", was stopped and restarted soon after (within 30 seconds, so that failover did not happen).
C. Most runs completed without issues, but in one instance the job hung in awaitCompletion.
D. The clientLog and a thread dump are attached.

	@Test
	public void testStopOneNode() throws Exception {
		Assert.assertTrue(dbClient.newServerEval().xquery(query1).eval().next().getNumber().intValue() == 0);
		final AtomicInteger successCount = new AtomicInteger(0);
		final AtomicBoolean failState = new AtomicBoolean(false);
		final AtomicInteger failCount = new AtomicInteger(0);
		try {
			WriteBatcher ihb2 = dmManager.newWriteBatcher();
			ihb2.withBatchSize(2);
			ihb2.withThreadCount(99);
			HostAvailabilityListener.getInstance(ihb2).withSuspendTimeForHostUnavailable(Duration.ofSeconds(50))
					.withMinHosts(2);
			NoResponseListener.getInstance(ihb2).withSuspendTimeForHostUnavailable(Duration.ofSeconds(50))
					.withMinHosts(2);
			ihb2.onBatchSuccess(batch -> {
				successCount.addAndGet(batch.getItems().length);
			}).onBatchFailure((batch, throwable) -> {
				throwable.printStackTrace();
				failState.set(true);
				failCount.addAndGet(batch.getItems().length);
			});

			writeTicket = dmManager.startJob(ihb2);
			AtomicBoolean isRunning = new AtomicBoolean(true);
			for (int j = 0; j < 50000; j++) {
				String uri = "/local/ABC-" + j;
				ihb2.add(uri, stringHandle);
			/*	if (dmManager.getJobReport(writeTicket).getSuccessEventsCount() > 200 && isRunning.get()) {
					isRunning.set(false);
					serverStartStop(hostNames[hostNames.length - 1], "stop");
				}*/
			}
			ihb2.flushAndWait();

		} catch (Exception e) {
			e.printStackTrace();
		}
		Thread.sleep(5000L);
		System.out.println("Fail : " + failCount.intValue());
		System.out.println("Success : " + successCount.intValue());
		System.out.println("Count : " + dbClient.newServerEval().xquery(query1).eval().next().getNumber().intValue());
		Assert.assertTrue(dbClient.newServerEval().xquery(query1).eval().next().getNumber().intValue() == 50000);
	}
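
For anyone reproducing this, a hang like the one described can block the whole test run. Below is a minimal sketch of bounding a blocking call (such as the `flushAndWait()` call in the test above) with a timeout, so that a hung run fails fast and a thread dump can be taken. It uses only `java.util.concurrent`; the class `BoundedWait`, the method `runWithTimeout`, and the stand-in tasks in `main` are hypothetical names for illustration and are not part of the Java Client API.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {
    /**
     * Runs a blocking task on a helper thread and waits at most the given timeout.
     * Returns true if the task finished in time, false if it timed out.
     */
    public static boolean runWithTimeout(Runnable blockingTask, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-wait");
            t.setDaemon(true); // do not keep the JVM alive if the task never returns
            return t;
        });
        Future<?> future = executor.submit(blockingTask);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the helper thread; the task may ignore it
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a blocking call that returns promptly.
        boolean fast = runWithTimeout(() -> { }, 1, TimeUnit.SECONDS);

        // Stand-in for a hung call: sleeps far longer than the timeout.
        boolean hung = runWithTimeout(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 200, TimeUnit.MILLISECONDS);

        System.out.println("fast completed: " + fast);
        System.out.println("hung completed: " + hung);
    }
}
```

In the test, the lambda would wrap the blocking call, and a `false` result could be turned into an assertion failure. This only detects the hang; it does not address its cause.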
@sammefford
Contributor

Let's address this in 4.0.4, since we're closing out this release and this doesn't appear to prevent the job from processing fully.

@srinathgit
Contributor Author

The issue hasn't surfaced in multiple test runs after the fix.
