Crashes in 1.1.1 #158

nuclearace · 2016-01-20T18:07:25Z

In the latest version I've noticed some crashes after the delegate is released. It doesn't happen all the time, but it happens pretty often. Not sure if I'm doing something crazy.

Test project: https://github.com/nuclearace/WebSocketCrash

nuclearace · 2016-01-20T18:22:01Z

I believe you need to add

deinit {
    outputStream?.close()
    inputStream?.close()
}

nuclearace · 2016-01-20T18:54:45Z

Closing the streams seems to work most of the time, but there's still the occasional crash

nuclearace · 2016-01-20T19:45:22Z

Actually, it seems that if there's a lot of activity going on with the socket while it's being deinitialized, it's more likely to crash.

mattlilek · 2016-01-20T23:43:58Z

I just noticed this when upgrading my project to 1.1.1. If you run with zombies enabled, you get:
*** -[Starscream.WebSocket respondsToSelector:]: message sent to deallocated instance 0x1ff86bdc0

I'm not sure about the close() calls you mention @nuclearace, but I think the delegate for each of the streams needs to be cleared. NSStream's delegate is unowned(unsafe), not weak, so the references are left dangling, just like in the Obj-C days. At least that consistently fixed the crashes for my project.
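A minimal sketch of the delegate-clearing idea described above, written in current Swift naming (`InputStream`/`OutputStream` rather than `NSInputStream`/`NSOutputStream`). `StreamOwner` is a hypothetical class used only for illustration; it is not Starscream's actual code.

```swift
import Foundation

/// Hypothetical owner of a stream pair (an assumption for illustration).
final class StreamOwner: NSObject, StreamDelegate {
    private var inputStream: InputStream?
    private var outputStream: OutputStream?

    init(input: InputStream, output: OutputStream) {
        inputStream = input
        outputStream = output
        super.init()
        input.delegate = self
        output.delegate = self
    }

    func stream(_ aStream: Stream, handle eventCode: Stream.Event) {
        // Stream events would be handled here.
    }

    deinit {
        // The stream's delegate reference is not zeroed automatically,
        // so clear it before this object goes away.
        inputStream?.delegate = nil
        outputStream?.delegate = nil
        inputStream?.close()
        outputStream?.close()
    }
}
```

As the following comments note, clearing the delegates alone did not remove every crash, since work already queued in the background can still touch the object.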

nuclearace · 2016-01-21T01:05:00Z

@mattlilek even setting the delegates to nil seems to cause a crash occasionally.

daltoniam · 2016-01-21T21:51:41Z

Yeah, this is a weird one (and I didn't see it when I ran my Autobahn test project, really strange). I ran the profiler and it looks like it is still getting delegate messages from the CFStreams, which obviously is a problem if the object has been deallocated. Let me know if the fix works and I will do a new version.

nuclearace · 2016-01-21T23:08:57Z

Still seeing some crashes

daltoniam · 2016-01-21T23:45:55Z

Hmmm, ok, I checked in another possible fix. I am not able to reproduce the issue with the same node server and example app, so let me know if that resolves it.

nuclearace · 2016-01-21T23:52:07Z

That fixes most of the crashes, but I still get that occasional crash that looks like this

nuclearace · 2016-01-21T23:53:08Z

It's much less common than the other ones though, and certainly a race condition

daltoniam · 2016-01-22T04:37:22Z

Ok, good to know. We will probably have to add some protection before scheduling to the queue in case the WebSocket has been deallocated. I will try to do that tomorrow if time permits.

daltoniam · 2016-01-23T03:31:05Z

I am trying to reproduce that last crash with the test app + node server, is there anything I can do to modify the example to trigger the crash more frequently? So far I haven't been able to reproduce it after multiple tries.

nuclearace · 2016-01-23T13:52:52Z

Let me see if there's a way to boost the probability of it happening.

nuclearace · 2016-01-23T14:18:23Z

@daltoniam I updated the test project. It's not the exact same crash that gets triggered, but it's the same type of crash--trying to form a closure while the object is in the process of being released.

nuclearace · 2016-01-23T14:46:02Z

One fix I've come up with is to have a private property that tracks whether the object is being released, set it to true in deinit, and check it before every place where dispatch_async is called. But I'm still getting a weird crash that happens once in a while.

nuclearace · 2016-01-23T14:58:17Z

Ack, on second thought, I keep finding places where it wants to crash because of trying to retain the websocket while it's being released

nuclearace · 2016-01-23T14:59:54Z

But adding that released property has definitely made it seem less likely to crash.

daltoniam · 2016-01-24T00:42:34Z

@nuclearace ok I think I found a way around the issue, just not sure I love it though. If you add a mutex protected boolean before all of the dispatch_async calls, the issue will be resolved (NSLock + bool property).

I'm not sure I love that solution though, and I'm reconsidering whether switching over to the dispatch queue based code is really worth it at this point. I might just switch over to a single pthread that does the same thing and has its own run loop, since I don't think we had these issues with the older code.
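A rough sketch of the "NSLock + bool property" guard described above, in current Swift (so `DispatchQueue.async` stands in for `dispatch_async`). `TeardownFlag`, `Worker`, `received`, and `onMessage` are hypothetical names chosen for illustration, and this is not the code that was checked into Starscream.

```swift
import Foundation

/// Hypothetical lock-guarded flag: an NSLock protecting a Bool.
final class TeardownFlag {
    private let lock = NSLock()
    private var released = false

    /// Marks the owner as going away; intended to be called from deinit.
    func markReleased() {
        lock.lock()
        released = true
        lock.unlock()
    }

    /// Runs `body` only if the owner has not been marked released.
    /// The lock is held while `body` runs, so markReleased() waits for it.
    /// NSLock is not reentrant: `body` must not call back into this flag.
    @discardableResult
    func ifAlive(_ body: () -> Void) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        guard !released else { return false }
        body()
        return true
    }
}

/// Hypothetical object that schedules background work behind the flag.
final class Worker {
    private let flag = TeardownFlag()
    private let queue = DispatchQueue(label: "example.worker")
    var onMessage: ((String) -> Void)?

    func received(_ text: String) {
        flag.ifAlive {
            // Copy what the block needs so it does not capture self.
            let callback = onMessage
            queue.async { callback?(text) }
        }
    }

    deinit { flag.markReleased() }
}
```

The design trade-off is the one raised in the thread: every scheduling point pays for a lock, but a deallocation that starts mid-work has to wait for the guarded section to finish instead of racing with it.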

nuclearace · 2016-01-24T14:14:37Z

Yeah, I didn't like my pseudo-solution and it made me wonder if this threading approach was really worth it, since it seems to be fairly prone to bugginess.

daltoniam · 2016-01-26T02:23:55Z

Ok, so I did some more research and testing on this. I implemented the pthread and run loop solution, but it runs into the same problem. This is kind of encouraging though, as it confirms the race condition will happen regardless of our setup. The big issue is that the WebSocket object is being deallocated from the main thread (which makes sense), but it maintains its own queue of background work, which can cause the race condition. I don't love the idea of adding a mutex lock, but it is the only way to ensure that if the object is randomly deallocated while in the middle of work, it won't crash. I don't necessarily want to switch back to the old system, as it was "abusing" GCD by blocking the global queue it was on; that is why, when a lot of WebSockets are created, it keeps all the background work from being able to execute.

I know this restates a lot of the obvious, but I wanted to document all of this in case questions come up later about it. I think the "safest" solution at this point is to add the mutex protection. Ideally, in most cases it won't be needed, as the WebSockets will be properly disconnected before being deallocated, but I want to protect against it so that Starscream isn't blamed for, or causing, a crash when a WebSocket is deallocated before disconnect.

Any questions or input, please add 😄

GuyKahlon · 2016-01-26T15:22:16Z

+1

daltoniam · 2016-01-28T05:40:45Z

I just checked in the fix discussed. Let me know if that seems to resolve it and I will do another release.

nuclearace · 2016-01-28T14:05:43Z

Seems to have fixed things

daltoniam · 2016-02-09T03:08:00Z

1.1.2 released!

nuclearace mentioned this issue Jan 20, 2016

Crashes on deinit since 5.2.0 socketio/socket.io-client-swift#283

Closed

daltoniam added a commit that referenced this issue Jan 21, 2016

possible fix for #158

f189e7c

daltoniam added a commit that referenced this issue Jan 21, 2016

another possible fix for #158

6c34076

daltoniam mentioned this issue Jan 24, 2016

Hangs on 401 response #161

Closed

nuclearace mentioned this issue Jan 27, 2016

Starscream hangs when NSStream closes due to connectivity issues #164

Closed

daltoniam mentioned this issue Jan 27, 2016

Close stream after error #165

Closed

daltoniam added a commit that referenced this issue Jan 28, 2016

fixes for #158, #159, #161, #164

afefdff

daltoniam closed this as completed Feb 9, 2016
