Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix threading issue in IPC connection #210

Closed

Conversation

alvar-bolt
Copy link
Contributor

This change addresses the possibility of
DTXIPCConnection._slaveDidConnectWithName being called from non-main thread by dispatching the work to main thread when it is being called from non main thread. This is necessary so that _otherConnection gets assigned on main thread.

SBTUITestTunnelClient.serverDidConnect is updated to wait for ipcConnection to become valid. This is needed because DTXIPCConnection._slaveDidConnectWithName and SBTUITestTunnelClient.serverDidConnect can be called from different threads upon which we dispatch both to main. But SBTUITestTunnelClient.serverDidConnect work can start first at which point it can lead to assertion of ipcConnection not being valid as ipcConnection._otherConnection has not yet been assigned.


These threading issues can result in following assertion:

Thread 1: "[SBTUITestTunnelClient] Failed getting IPC proxy, Error Domain=DTXIPCErrorDomain Code=1 \"Connection <DTXIPCConnection: 0x60000265b9c0> is invalid.\" UserInfo={NSLocalizedDescription=Connection <DTXIPCConnection: 0x60000265b9c0> is invalid.}"

In order to debug this issue, I added bunch of NSLogs to the codebase (alvar-bolt@e96bbc4#diff-313307a5a8b84223b5eb7add983e43794ab1c14ea722485ed889152e5519b725R326).

From there, I ran a test that called app.userDefaultsObject immediately after tunnel was launched successfully. This is important as it ends up calling -[_DTXIPCDistantObject forwardInvocation:] on main thread which in turn calls _connection.isValid which is:

- (BOOL)isValid
{
	return _connection.isValid && _otherConnection.isValid;
}

Important part of that code is access of _otherConnection. As current bug allows writing of it from different thread, then we cannot be sure that it is set.

In our codebase is was able to find a test case where _slaveDidConnectWithName was called on non main thread in ~25% of the cases. Luckily I was also able to observe that behaviour in SBTUITestTunnel_Tests.MiscellaneousTests.testCustomCommand(), although there it happens in ~1% of the cases.

This behaviour does not always lead to assertion as there is still possibility that _otherConnection gets stored before it is being accessed from main thread. But if you are unlucky enough, it is still in progress and so, the isValid check will fail.

Example of _slaveDidConnectWithName

Screenshot 2024-05-15 at 09 56 24

This change addresses the possibility of
`DTXIPCConnection._slaveDidConnectWithName` being called from non-main
thread by dispatching the work to main thread when it is being called
from non main thread. This is necessary so that `_otherConnection`
gets assigned on main thread.

`SBTUITestTunnelClient.serverDidConnect` is updated to wait for
`ipcConnection` to become valid. This is needed because
`DTXIPCConnection._slaveDidConnectWithName` and `SBTUITestTunnelClient.serverDidConnect`
can be called from different threads upon which we dispatch both to
main. But `SBTUITestTunnelClient.serverDidConnect` work can start
first at which point it can lead to assertion of `ipcConnection` not
being valid as `ipcConnection._otherConnection` has not yet been
assigned.
@tcamin
Copy link
Member

tcamin commented May 18, 2024

Nice! I'll take a better look at the changes next week.

@tcamin
Copy link
Member

tcamin commented Jun 4, 2024

Apologies for the long delay!

I have reviewed the state flow during the initialization process, and I agree that there are potential race conditions as you pointed out.

Below, I have summarized the initial phases of the connection just for convenience

  │                                                                                                                                                                                   
  │                                                                                     ║                                                                                             
  │   UITestRunner (TunnelClient)                                                       ║ App (TunnelServer)                                                                          
  │   IPC master                                                                        ║ IPC slave                                                                                   
  │                                                                                     ║                                                                                             
  ╞═════════════════════════════════════════════════════════════════════════════════════╬════════════════════════════════════════════════════════════════════════════════════════     
  │                                                                                     ║                                                                                             
  │  ┌────────────────────────────────────────────┐                                     ║                                                                                             
  │  │                                            │                                     ║                                                                                             
  │  │ - [DTXIPCConnection initWithServiceName:]  ├────▶  Becomes IPC master            ║                                                                                             
  │  │                                            │       _otherConnection = nil        ║                                                                                             
  │  └───────────────────┬────────────────────────┘                                     ║                                                                                             
  │                      │                                                              ║                                                                                             
  │                      │                                                              ║                                                                                             
  │  ┌───────────────────▼────────────────────────┐                                     ║                                                                                             
  │  │                                            │                                     ║                                                                                             
  │  │        - [DTXIPCConnection resume]         │                                     ║                                                                                             
  │  │                                            │                                     ║                                                                                             
  │  └───────────────────┬────────────────────────┘                                     ║                                                                                             
  │                      │                                                              ║                                                                                             
  │                      │                                                              ║                                                                                             
  │  ┌───────────────────▼────────────────────────┐                                     ║   ┌───────────────────────────────────────────────────────────────────┐                     
  │  │                                            │                                     ║   │                                                                   │                     
  │  │                app.launch()                ├─────────────────────────────────────╫───▶  - [SBTUITestTunnelServer takeOffOnceIPCWithServiceIdentifier:]   │                     
  │  │                                            │                                     ║   │                                                                   │                     
  │  └────────────────────────────────────────────┘                                     ║   └──────────────────────┬────────────────────────────────────────────┘                     
  │                                                                                     ║                          │                                                                  
  │                                                                                     ║                          │                                                                  
  │                                                ┌──────────────────────────┐         ║   ┌──────────────────────▼─────────────────────┐                                            
  │                                                │                          │         ║   │                                            │                                            
  │                                            ┌───┤ _slaveDidConnectWithName │◀────────╫───┤ - [DTXIPCConnection initWithServiceName:]  ├────▶  Becomes IPC slave                    
  │                                            │   │                          │  ASYNC  ║   │                                            │       _otherConnection != nil              
  │                                            │   └──────────────────────────┘         ║   └──────────────────────┬─────────────────────┘       This allows sending messages from    
  │                                            │                                        ║                          │                             slave to master.                     
  │        ┌───────────────────────────────────▼──────────────────────────────────┐     ║                          │                                                                  
  │        │                                                                      │     ║   ┌──────────────────────▼─────────────────────┐       IPC master _otherConnection still nil
  │        │ _otherConnection = [NSConnection connectionWithRegisteredName:host:] │     ║   │                                            │                                            
  │        │                                                                      │     ║   │        - [DTXIPCConnection resume]         │                                            
  │        └──────────────────────────────────────────────────────────────────────┘     ║   │                                            │                                            
  │                                                                                     ║   └──────────────────────┬─────────────────────┘                                            
  │                                                                                     ║                          │                                                                  
  │        ┌─────────────────────────────────────────────────────────────────────┐      ║                          │                                                                  
  │        │ Initialization completed, however as long as _otherConnection isn't │      ║                          │                                                                  
  │        │    initialized in _slaveDidConnectWithName we are unable to send    │      ║   ┌──────────────────────┴─────────────────────┐                                            
  │        │                       messaged to the slave.                        │      ║   │                                            │                                            
  │        │                                                                     ◀──────╫───┤    - [self.ipcProxy serverDidConnect:]     │                                            
  │        │The race condition is here and occurs if _slaveDidConnectWithName is │      ║   │                                            │                                            
  │        │             invoked after serverDidConnect is received.             │      ║   └────────────────────────────────────────────┘                                            
  │        │                                                                     │      ║                                                                                             
  │        └─────────────────────────────────────────────────────────────────────┘      ║                                                                                             
  │                                                                                     ║                                                                                             
  │                                                                                     ║                                                                                             
  │                                                                                     ║                                                                                             
  ▼                                                                                     ║                                                                                             
time                                                                                                                                                                                  

Instead of recursively calling checkConnectionAndProceed (which could potentially lead to infinite recursion) I'm proposing a different approach in this branch where I'm synchronizing the access to _otherConnection and synchronously waiting for the isValid property to be true. Wdyt, would that be a viable solution?

@alvar-bolt
Copy link
Contributor Author

Hey. I agree, synchronising it on separate queue is better approach. I tried your solution on our test suite and did not get any assertions from ~3000 runs. So I'm happy to continue with your solution.
I guess it makes more sense for you to open new PR and I'll close this with reference to that.

@tcamin tcamin mentioned this pull request Jun 6, 2024
@tcamin
Copy link
Member

tcamin commented Jun 6, 2024

Thanks a lot for giving it a try! I opened #212 and will try to merge it soon.

Apologies again for the slow reaction time, but it's a particularly busy period right now!

@alvar-bolt
Copy link
Contributor Author

No need to apologise. You are offering a free software and it's not an easy task to maintain open source project. So, I'm very thankful to you for even taking time and looking into it.

@tcamin
Copy link
Member

tcamin commented Jun 7, 2024

I've merged #212, gonna close this one. Thanks again for bringing up this issue 👍

@tcamin tcamin closed this Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants