-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crashes in 1.1.1 #158
Comments
I believe you need to add deinit {
outputStream?.close()
inputStream?.close()
} |
Closing the streams seems to work most of the time, but there's still the occasional crash |
Actually it seems like if there's a lot of activity going on with the socket when it's being deinit it's more likely to crash |
I just noticed this when upgrading my project to 1.1.1. If you run with zombies enabled, you get: I'm not sure about the close() calls you mention @nuclearace, but I think the delegate for each of the streams need to be cleared. NSStream's delegate is |
@mattlilek even setting the delegates to nil seems to cause a crash occasionally. |
yeah this is a weird one (and I didn't see it when I ran my Autobahn test project, really strange). I ran the profiler and it looks like it is still getting delegate messages from the CFStreams which obviously is a problem if the object has been deallocated. Let me know if the fixes works and I will do a new version. |
Still seeing some crashes |
hmmm, ok I checked in another possible fix. I am not able to reproduce the issue with the same node server and example app, let me know if that resolve it. |
It's much less common than the other ones though, and certainly a race condition |
ok good to know. We will probably have to add some protection before scheduling the to queue if the WebSocket has been deallocated. I will try to do that tomorrow if time permits. |
I am trying to reproduce that last crash with the test app + node server, is there anything I can do to modify the example to trigger the crash more frequently? So far I haven't been able to reproduce it after multiple tries. |
Let me see if there's a way to boost the probability of it happening. |
@daltoniam I updated the test project. It's not the exact same crash that gets triggered, but it's the same type of crash--trying to form a closure while the object is in the process of being released. |
Ack, on second thought, I keep finding places where it wants to crash because of trying to retain the websocket while it's being released |
But adding that released property has definitely made it seem less likely to crash. |
@nuclearace ok I think I found a way around the issue, just not sure I love it though. If you add a mutex protected boolean before all of the dispatch_async calls, the issue will be resolved (NSLock + bool property). I'm not sure I love that solution though and I'm reconsidering if switching over to the dispatch queue based code is really worth it at this point. I might just switch over to a single pthread that does that same thing had has its own run loop since I don't think we had these issues with the older code. |
Yeah, I didn't like my pseudo-solution and it made me wonder if this threading approach was really worth it, since it seems to be fairly prone to bugginess. |
Ok, so I did some more research and testing on this. I implemented the pthread and runloop solution, but it runs into the same problem. This is kinda of encouraging though, as it confirms the race condition will happen regardless of our setup. The big issue is because the WebSocket object is being deallocated from the main thread (which makes sense), but it maintains its own queue of background work inside which can cause the race condition. I don't love the idea of adding a mutex lock but it is the only way to ensure that if the object is randomly deallocated while in the middle of work it won't crash. I don't necessarily want to switch back to the old system as it "abusing" GCD by blocking the global queue it is on, hence the issue when a lot of WebSockets are created it keeps all the background work from being able to execute. I know this restates a lot of the obvious but I wanted to document all of this incase questions come up later about it. I think the "safest" solution at this point is to add the mutex protection. Ideally in most cases it won't be needed as the WebSockets will be properly disconnected before being deallocated, but I want to protect against so Starscream isn't blamed or causing a crash in the case when a WebSocket is deallocated before disconnect. Any questions or input, please add 😄 |
+1 |
I just checked in the fix discussed. Let me know if that seems to resolve it and I will do another release. |
Seems to have fixed things |
|
In the latest version I've noticed some crashes after the delegate is released. It doesn't happen all the time, but it happens pretty often. Not sure if I'm doing something crazy.
Test project: https://github.com/nuclearace/WebSocketCrash
The text was updated successfully, but these errors were encountered: