Add client launch retries #218

Open
sewerynplazuk wants to merge 2 commits into master

Conversation

sewerynplazuk
Contributor

Add client-side launch retries. This helps overcome the instability of the IPC-based approach when running multiple tests in parallel.

}
}

NSLog(@"[SBTUITestTunnel] Tunnel ready after %fs", CFAbsoluteTimeGetCurrent() - self.launchStart);
Contributor

[question] Why remove this? Maybe it needs to be moved higher, next to the attemptLaunch call?

Contributor Author

Good catch, I will revert this back once we agree on the solution.

* @param options List of options to be passed on launch.
* Valid options:
* SBTUITunneledApplicationLaunchOptionResetFilesystem: delete app's filesystem sandbox
* SBTUITunneledApplicationLaunchOptionDisableUITextFieldAutocomplete disables UITextField's autocomplete functionality which can lead to unexpected results when typing text.
Contributor

I would not copy/paste the options here; it will become hard to maintain. Maybe you can just say something like "see method launchTunnelWithOptions for details about the options".

Contributor Author

Options are listed in the twin method. I don't want to introduce discrepancies on the interface.

@tcamin
Member

tcamin commented Nov 18, 2024

Could you elaborate and give some further context for the change? Does this happen on CI or locally on developers' machines? I took a look at the last 10k tests in CI and couldn't find a single case, and we're also running tests concurrently (5 sims per node on a dozen Mac minis). I'm unsure whether this change could introduce other types of instability, given that we're retrying on only one side of the bridge. Did you investigate the reasons for your launch connection issues in more depth?

Partially related: UI tests are inherently unstable, since many issues can cause something to break, and it is generally advisable to have an additional component that automatically retries tests. As an example, we built Mendoza, which automatically retries our tests and covers any type of instability, whether launch or test-implementation related.

@tcamin
Member

tcamin commented Nov 18, 2024

Did you try disabling IPC and using HTTP tunneling? Does that make any difference? https://github.com/Subito-it/SBTUITestTunnel/blob/master/Documentation/Setup.md#tunneling-mode

@sewerynplazuk
Contributor Author

sewerynplazuk commented Nov 19, 2024

@tcamin This came about as a fix for launch exceptions with the message "Failed getting IPC proxy". It happens on CI and is pretty rare (3 cases out of ~3000 tests on a single job).

@try {
	//Send a ping to check if the connection is still alive while waiting.
	[_connection.otherConnection.rootProxy _ping];
} @catch (NSException *exception) {
	if(_errorBlock)
	{
		_errorBlock([NSError errorWithDomain:DTXIPCErrorDomain code:1 userInfo:@{NSLocalizedDescriptionKey: exception.reason}]);
	}
	else
	{
		[exception raise];
	}
	somethingWentWrong = YES;
}

As I understand it, the ping above fails. Unfortunately, I wasn't able to narrow down the issue any further, so I suspect some sort of race condition (perhaps more than a single ping should be performed?).

Currently, there is no way to reliably recover from this error other than restarting the test, which has some overhead I'd like to avoid (a retry does help in our case). Using the client's delegate to capture the issue is too late.

> Did you try disabling IPC and using HTTP tunneling? Does that make any difference?

I saw issue #127, and switching to HTTP solves it, but that is not possible in our case.
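
For illustration only, the kind of bounded retry discussed here could be sketched as below. This is not the code from this PR: `RetryOnException`, `DemoFlakyLaunchAttempts`, and the `SimulatedIPCFailure` exception name are hypothetical and exist in neither SBTUITestTunnel nor the IPC library. The sketch assumes the failing step raises an `NSException`, as the ping above does when no error block is set.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of SBTUITestTunnel.
// Runs `attempt` up to `maxAttempts` times and retries whenever it raises
// an NSException. Returns YES as soon as one attempt completes without raising.
static BOOL RetryOnException(NSUInteger maxAttempts,
                             NSTimeInterval delay,
                             void (^attempt)(void)) {
    for (NSUInteger i = 1; i <= maxAttempts; i++) {
        @try {
            attempt();
            return YES;
        } @catch (NSException *exception) {
            NSLog(@"Attempt %lu/%lu failed: %@",
                  (unsigned long)i, (unsigned long)maxAttempts, exception.reason);
            if (i < maxAttempts) {
                // Give the other side of the connection some time before retrying.
                [NSThread sleepForTimeInterval:delay];
            }
        }
    }
    return NO;
}

// Hypothetical stand-in for a flaky launch: raises on the first two calls,
// succeeds on the third. Returns how many attempts were made.
static NSUInteger DemoFlakyLaunchAttempts(void) {
    __block NSUInteger calls = 0;
    RetryOnException(3, 0.0, ^{
        calls++;
        if (calls < 3) {
            [NSException raise:@"SimulatedIPCFailure"
                        format:@"Failed getting IPC proxy"];
        }
    });
    return calls;
}
```

A stub that fails a fixed number of times, like the one in `DemoFlakyLaunchAttempts`, makes the retry path deterministic. Note that the sketch retries on one side only and says nothing about the state of the other side of the bridge, and that ARC code is not exception-safe by default, so objects alive inside the failing attempt may leak.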

@tcamin
Member

tcamin commented Nov 21, 2024

I tried to manually replicate the exception path the first time -[_DTXIPCDistantObject forwardInvocation:] is invoked, but the retry logic did not seem to work. Moreover, when running the Debug build configuration, I see -[SBTUITestTunnelServer takeOffOnceIPCWithServiceIdentifier] fail at synchronousRemoteObjectProxyWithErrorHandler. Is there any way we could write a test that verifies the retry logic?

@sewerynplazuk
Contributor Author

Thanks. I'll have a look into this, as well as into unit testing the retries.

@tcamin
Member

tcamin commented Nov 21, 2024

> Thanks. I'll have a look into this, as well as into unit testing the retries.

Thanks! It would be great if you could add an integration test, similar to the ones that have already been written for the library.
