SignalR, No Connection with that ID #9917

rezanid · 2019-05-02T13:45:40Z

I have an ASP.NET Core app that uses SignalR to report metrics that are visualized in the browser. It has been working fine for several weeks, but today I noticed that SignalR very often fails to connect, and even when it connects, it disconnects shortly after. I'm using LongPolling and I can see in the Chrome network logs that the first request, to "negotiate", is always successful.

{"connectionId":"Vg4Mieg0ya3aD17-f8kMxw","availableTransports":[{"transport":"LongPolling","transferFormats":["Text","Binary"]}]}

But the second request, a POST to "notify", fails 49 times out of 50 and responds with 404 Not Found.

No Connection with that ID

I sometimes get the following as well.

{"error":"Handshake was canceled."}

The app works flawlessly when hosted locally (using localhost). It only happens when the app is hosted (IT/FT/..) in Azure as a Web App Service and all the resources of the application are downloaded successfully. I have the above issues only for SignalR requests from browser and logs on the server side don't show anything wrong.
Target framework: .NET Core 2.2

BrennanConroy · 2019-05-02T15:23:19Z

Are you using a multi-instance web app? If so do you have ARR Affinity (sticky sessions) enabled?

rezanid · 2019-05-03T07:55:38Z

My colleague just checked. We have 2 instances and ARR Affinity is on. I'm going to deploy to a local IIS (as opposed to the IIS Express that I use for debugging) with a production build and a similar URL (virtual folders) to see if I can reproduce the issue on my dev box.

BrennanConroy · 2019-05-06T17:26:41Z

Could you collect a network trace of the failing connections? https://docs.microsoft.com/en-us/aspnet/core/signalr/diagnostics?view=aspnetcore-2.2#network-traces

rezanid · 2019-05-08T08:59:45Z

Thanks for the suggestion. I did and to be sure I also asked my colleague to reduce the instance count to 1 to reduce the number of things that can go wrong. After reducing the instances, it's now working fine apart from random Gateway timeout issues that can happen a few times during the day. If it happens I think SignalR stops polling permanently until I refresh the page. Is there any retry-policy + circuit breaker in it that I can configure?

BrennanConroy · 2019-05-08T16:11:34Z

> After reducing the instances, it's now working fine

This strongly indicates that sticky sessions are either not enabled or not working correctly. The network traces (with 2 instances) would be useful to see what's going on.

> Is there any retry-policy + circuit breaker in it that I can configure?

In 3.0 we're adding support for automatic reconnect, but for now you can do something like https://docs.microsoft.com/en-us/aspnet/core/signalr/javascript-client?view=aspnetcore-2.2#reconnect-clients
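
As a rough illustration of that manual reconnect pattern, here is a minimal TypeScript sketch. It assumes only that the connection object exposes `start()` and `onclose()`, as the SignalR JavaScript client's `HubConnection` does. The names `RestartableConnection`, `retryDelayMs`, `startWithRetry`, and `keepAlive`, and the backoff numbers, are hypothetical and chosen for this example; the attempt limit stands in for a very simple give-up policy, not a full circuit breaker.

```typescript
// Minimal shape this sketch relies on. A SignalR HubConnection is assumed to
// satisfy it, since it exposes start() and onclose() with compatible signatures.
interface RestartableConnection {
  start(): Promise<void>;
  onclose(callback: (error?: Error) => void): void;
}

// Hypothetical helper: exponential backoff (1s, 2s, 4s, ...) capped at maxMs.
function retryDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Hypothetical helper: call start() until it succeeds or maxAttempts is reached.
// Resolves to true once connected, false if every attempt failed.
async function startWithRetry(
  connection: RestartableConnection,
  maxAttempts = 10
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await connection.start();
      return true;
    } catch (err) {
      // Wait before the next attempt so a struggling server is not hammered.
      await new Promise<void>((resolve) =>
        setTimeout(resolve, retryDelayMs(attempt))
      );
    }
  }
  return false;
}

// Hypothetical helper: restart the connection whenever it closes.
function keepAlive(connection: RestartableConnection): void {
  connection.onclose(() => {
    void startWithRetry(connection);
  });
}
```

In an app this would be wired up once after building the connection: call `keepAlive(connection)` and then `startWithRetry(connection)`.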

rezanid · 2019-05-08T17:02:47Z

I will do the trace with multiple instances tomorrow to prove your theory.

Thanks for the suggestion. I thought I had read all the documentation, but obviously I missed at least this one.

analogrelay · 2019-05-21T21:36:43Z

Closing this as we haven't heard from you and generally close issues with no response after ~7 days. Please feel free to comment if you're able to get the information we're looking for and we can reopen the issue to investigate further!

jagathprasanga · 2019-07-12T09:58:31Z

I'm also experiencing the same issue. My configuration is:
ASP.NET Core 2.2, load balancing between two instances (using Redis).
Hosted on a Kestrel server on CentOS behind HAProxy.
Please help me with this matter.
Thanks.

analogrelay · 2019-07-12T15:47:17Z

> Asp.Net Core 2.2 Load balancing between two instances (Redis is using)
> Hosted on kestrel server on CentOS behind the HAProxy.

Do you have session persistence (also called session affinity or "sticky sessions") enabled to ensure that all requests for the same user always go to the same server? SignalR requires that all HTTP requests for a single connection go to the same physical server and if any request ends up getting routed to a different server, you'll get a 404.
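
To make the failure mode concrete, here is a toy TypeScript model. It is not SignalR code. It only assumes what is stated above: each server keeps its connections in its own memory, so a request that lands on a different server finds no connection with that ID. `ToyServer`, `roundRobin`, `sticky`, and `simulate` are hypothetical names, and the sticky router uses a trivial hash where a real load balancer would typically use a cookie or source address.

```typescript
// Toy model: each server only knows the connections it negotiated itself.
class ToyServer {
  private connections: Record<string, boolean> = {};
  private count = 0;
  constructor(readonly name: string) {}

  // Creates a connection ID that exists only in this server's memory.
  negotiate(): string {
    this.count++;
    const id = this.name + "-" + this.count;
    this.connections[id] = true;
    return id;
  }

  // Returns 200 if this server owns the connection, else 404.
  poll(connectionId: string): number {
    return this.connections[connectionId] === true ? 200 : 404;
  }
}

type Router = (clientKey: string, requestNumber: number) => ToyServer;

const servers = [new ToyServer("A"), new ToyServer("B")];

// Round robin: successive requests alternate between servers.
const roundRobin: Router = (_clientKey, requestNumber) =>
  servers[requestNumber % servers.length];

// Sticky: the same client key always maps to the same server.
const sticky: Router = (clientKey, _requestNumber) => {
  let sum = 0;
  for (let i = 0; i < clientKey.length; i++) sum += clientKey.charCodeAt(i);
  return servers[sum % servers.length];
};

// Negotiates once (request 0), then polls and records each status code.
function simulate(route: Router, clientKey: string, polls: number): number[] {
  const connectionId = route(clientKey, 0).negotiate();
  const statuses: number[] = [];
  for (let i = 1; i <= polls; i++) {
    statuses.push(route(clientKey, i).poll(connectionId));
  }
  return statuses;
}

console.log(simulate(roundRobin, "client-1", 3)); // [ 404, 200, 404 ]
console.log(simulate(sticky, "client-1", 3)); // [ 200, 200, 200 ]
```

With round robin, every poll that lands on the server that did not handle "negotiate" fails with 404. With sticky routing, all polls reach the owning server.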

davidfowl · 2019-07-12T15:50:31Z

Let's improve the error message to include that information both in the logs and in the response.

analogrelay · 2019-07-12T16:44:44Z

cough #5350 cough

jagathprasanga · 2019-07-13T01:18:11Z

Yes, the 404 error disappears when sticky sessions are enabled in HAProxy. But why can't SignalR share connection information via Redis? What, then, is the purpose of the Redis backplane as described here (https://docs.microsoft.com/en-us/aspnet/core/signalr/redis-backplane?view=aspnetcore-2.2)?

davidfowl · 2019-07-13T01:23:05Z

> Yes, 404 error disappears when sticky sessions enabled in HAProxy, but why SignalR can't share connection information via Redis,

Because it's unreliable to expect a persistent connection to exist across multiple server instances.

> then what is the purpose of Redis backplane as described here (https://docs.microsoft.com/en-us/aspnet/core/signalr/redis-backplane?view=aspnetcore-2.2)

Message propagation across multiple servers (broadcast, group send, sending to another connection on another server).
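
The distinction can be sketched with a toy TypeScript model. This is not the real Redis integration; it is an in-memory stand-in that assumes only what is described above: the backplane carries messages between servers, while each connection still lives on exactly one server. `ToyBackplane` and `ToyHubServer` are hypothetical names.

```typescript
// Toy stand-in for a pub/sub backplane: it forwards messages, nothing else.
class ToyBackplane {
  private subscribers: Array<(message: string) => void> = [];

  subscribe(handler: (message: string) => void): void {
    this.subscribers.push(handler);
  }

  publish(message: string): void {
    for (const handler of this.subscribers) handler(message);
  }
}

class ToyHubServer {
  // Messages received by each connection held by THIS server only.
  readonly inbox: Record<string, string[]> = {};

  constructor(private backplane: ToyBackplane) {
    backplane.subscribe((message) => this.deliverLocally(message));
  }

  // The connection is registered in this server's memory, not in the backplane.
  connect(connectionId: string): void {
    this.inbox[connectionId] = [];
  }

  // Broadcast: publish once, and every server delivers to its own connections.
  broadcast(message: string): void {
    this.backplane.publish(message);
  }

  private deliverLocally(message: string): void {
    for (const id of Object.keys(this.inbox)) this.inbox[id].push(message);
  }
}

const backplane = new ToyBackplane();
const serverA = new ToyHubServer(backplane);
const serverB = new ToyHubServer(backplane);
serverA.connect("alice");
serverB.connect("bob");
serverA.broadcast("cpu=42");

console.log(serverB.inbox["bob"]); // [ 'cpu=42' ]
console.log(serverB.inbox["alice"]); // undefined
```

The broadcast sent from server A reaches "bob" on server B, but server B still has no record of "alice". In this model, a request for that connection routed to server B would have nothing to find, which is why the backplane does not remove the need for sticky sessions.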

jagathprasanga · 2019-07-13T01:38:48Z

Ok, I got the answer, Thank you very much.

scabana · 2019-07-15T17:01:12Z

> Yes, 404 error disappears when sticky sessions enabled in HAProxy, but why SignalR can't share connection information via Redis,

> Because it's unreliable to expect a persistent connection to exist across multiple server instances.

@davidfowl

I've seen a few presentations on the subject. I do understand that this reduces the complexity of the SignalR codebase. But with cloud workloads becoming more and more common, and with scaling out and load balancing replacing scaling up a single instance (be it on App Services or Kubernetes), round-robin load balancing is being used more and more. Now add geo-localized deployments behind an Azure Traffic Manager and the problem shows up even more. This limitation (keeping a single server endpoint for one SignalR connection) increases the complexity of deploying and maintaining systems: for this specific usage we need a sticky session/server pair, whereas in a microservice pattern most if not all other services will not require sticky sessions. Move from App Services to Kubernetes and this becomes a bigger problem, since containers are expected to be rebalanced over time. Is there a plan to break this link between server and session? If there's no plan, is there a workaround?

Thanks a lot!

davidfowl · 2019-07-15T17:14:30Z

There is no plan. This is why we built the Azure SignalR Service: to hide the complexity of scaling persistent connections. It’s not about making the code base cleaner, it’s about making it easier to scale persistent connections. This isn’t your typical stateless web tier. It’s stateful, and you need to be aware of that when you use persistent connections. If for some reason you can’t use the service, then split your SignalR traffic from your web tier and scale them differently (that’s effectively what the service gives you for free).

scabana · 2019-07-15T18:10:37Z

Thank you for the quick reply.

zeroregard · 2019-07-19T08:21:31Z

Hi, I'm getting this issue after having my server running for a while. As far as I know, I have not scaled out my SignalR server to multiple instances, I assume this is something you have to configure actively on Azure somewhere (?) If it happens on a single instance, what could be the reason behind it? The exact error I get is

Request Finished Successfully, but the server sent an error. Status Code: 404-Not Found Message: No Connection with that ID

davidfowl · 2019-07-19T16:32:45Z

If it happens on a single server then something else is wrong. What server is this, and does it happen randomly or can you reproduce it?

zeroregard · 2019-07-22T08:05:23Z

It's an Azure Web App running ASP.NET Core with the newest version of SignalR. I increased the timeout intervals, but it didn't seem to make a difference. It happens randomly, although there seem to be periods where it happens several times in a row.

analogrelay · 2019-07-22T15:11:15Z

@mathiassiig I'd strongly advise you to create a new issue with your specific scenario. Since this is a closed issue, it doesn't come up in our regular tracking so you're relying on David and I alone paying attention to our GitHub notifications (which we generally do, but can easily get behind on ;)).

BrennanConroy added the area-signalr Includes: SignalR clients and servers label May 2, 2019

BrennanConroy added the waiting label May 6, 2019

rezanid closed this as completed May 8, 2019

rezanid reopened this May 8, 2019

analogrelay closed this as completed May 21, 2019

ryancyq mentioned this issue Oct 13, 2019

Abp signalR random get the 404 error response aspnetboilerplate/aspnetboilerplate#4919

Closed

ghost locked as resolved and limited conversation to collaborators Dec 3, 2019
