better blazor scale-out documentation #17986

Closed
schmitch opened this issue Dec 20, 2019 · 33 comments
Labels
affected-all This issue impacts all the customers area-blazor Includes: Blazor, Razor Components Docs This issue tracks updating documentation feature-blazor-server Pillar: Technical Debt Priority:1 Work that is critical for the release, but we could probably ship without severity-major This label is used by an internal tool

Comments

@schmitch

Currently our company wants to scale out Blazor a little bit; however, the documentation is really lacking, especially about what happens if Redis or Azure SignalR Service is used as a backend.

I mean, currently SignalR itself can be configured with Redis and without sticky sessions.
However, there is almost no documentation on whether something like that can be done with Blazor.

Also, what happens in case of a server failure? Can a second server take over the work when SignalR runs with a Redis backing store, or does Blazor still keep everything in memory?

@pranavkm
Contributor

@schmitch there's a fair bit of documentation about deploying Blazor with Azure SignalR Service here: https://docs.microsoft.com/en-us/aspnet/core/host-and-deploy/blazor/server?view=aspnetcore-3.1#signalr-configuration which states that sticky sessions are required. https://docs.microsoft.com/en-us/aspnet/core/blazor/state-management?view=aspnetcore-3.1 also covers some details about how to effectively manage state with multiple servers.

Do these documents cover your questions?
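
For readers following along, the sticky-session requirement for Azure SignalR Service can be expressed in service registration roughly as below. This is a minimal sketch, assuming the Microsoft.Azure.SignalR package is referenced and the service connection string is supplied through configuration; the class and method names `AzureSignalRServices` and `AddBlazorWithAzureSignalR` are hypothetical wrappers, not part of any library.

```csharp
using Microsoft.Azure.SignalR;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper; only the calls inside the method come from the
// ASP.NET Core and Azure SignalR SDK packages.
public static class AzureSignalRServices
{
    public static IServiceCollection AddBlazorWithAzureSignalR(this IServiceCollection services)
    {
        services.AddServerSideBlazor();

        // ServerStickyMode.Required asks the service to keep routing a client
        // to the same app server, which is where its circuit lives.
        services.AddSignalR().AddAzureSignalR(options =>
        {
            options.ServerStickyMode = ServerStickyMode.Required;
        });

        return services;
    }
}
```

This only keeps a client on its server while that server is alive; it does not move a circuit to another server.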

@pranavkm pranavkm added area-blazor Includes: Blazor, Razor Components Docs This issue tracks updating documentation labels Dec 20, 2019
@pranavkm pranavkm added this to the Discussions milestone Dec 20, 2019
@nullpainter

I'm in the same boat as @schmitch. I have read that sticky sessions are not required when using Redis and if the only transport is WebSockets, but this appears to require client-side configuration. I can't see any documentation about whether this is possible to configure with Blazor.

I've implemented custom state management for ephemeral state, but this isn't really the issue.

What I'm looking for is a practical guide on how to redeploy or upgrade nodes in a multi-node cluster without the "Could not reconnect to the server. Reload the page to restore functionality." error being displayed to end users.

I was assuming - perhaps naively, having not played with SignalR before - that I could use a Redis backplane, somehow disable the need for sticky sessions, and all state is replicated across servers.

I modified ConfigureServices to add Redis support to SignalR:

services.AddServerSideBlazor();
services.AddSignalR().AddStackExchangeRedis(...);

I can see pub/sub channels created, but I can't see any messages being sent. Should there be?

tl;dr - how is one supposed to scale out Blazor applications, not use Azure SignalR Service, and maintain zero-downtime for end users? The state management page is only part of the puzzle, and that seems a fairly straightforward one to address.
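
For context, the registration described above can be sketched as follows. It assumes the Microsoft.AspNetCore.SignalR.StackExchangeRedis package; the helper name `AddBlazorWithRedisBackplane` and its parameter are hypothetical. Going by later comments in this thread, a backplane like this relays hub messages between servers but does not replicate Blazor circuit state, so it does not by itself remove the need for sticky sessions.

```csharp
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper that wraps the two registrations from the comment above.
public static class ScaleOutServices
{
    public static IServiceCollection AddBlazorWithRedisBackplane(
        this IServiceCollection services,
        string redisConnectionString) // e.g. "localhost:6379" (assumed value)
    {
        // Circuits (component state, render tree) stay in this server's memory.
        services.AddServerSideBlazor();

        // The Redis backplane forwards hub messages (groups, broadcasts)
        // between servers; it is not a store for circuit state.
        services.AddSignalR().AddStackExchangeRedis(redisConnectionString);

        return services;
    }
}
```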

@nullpainter

This is a similar question to #9734, with the same state management page discussed, however this doesn't address that poster's concerns either.

It's straightforward to persist application state (view models etc.) via an ad-hoc state manager, but what is resulting in the 'could not reconnect' message? Is there some way to implement a non ad-hoc manager - i.e. some sort of state manager registered with Blazor - so Blazor automatically reconnects?

I assumed this message was relating to the SignalR connection being broken, so I'm confused why the state management page keeps being referenced.

@nullpainter

Or is the message that SignalR terminations are unavoidable, and that the Blazor approach is for the developer to maintain state as per ASP.NET Core Blazor state management?

@physix99

physix99 commented Sep 4, 2020

I also have come across the same issue. It's impossible to update a blazor server side app without all connected clients receiving the connection disconnected message - which I understand why, but I'd really like a workaround.

The same thing also happens when you use Azure deployment slots together with Azure SignalR Service, as ultimately the SignalR circuit is being lost.

Your idea of using Redis for the signalr connections would have been great if that worked. Is it possible for that to be implemented?

@danroth27 can the client circuit object be serialized?

@danroth27
Member

danroth27 commented Sep 4, 2020

@physix99 There are a number of different ways you can handle persisting the app state for a given client: https://docs.microsoft.com/aspnet/core/blazor/state-management?pivots=server In .NET 5, the support for using protected browser storage will be built in to the framework.
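
As an illustration of the protected browser storage option mentioned here, the following is a minimal sketch for .NET 5 or later. `CounterStateStore` and the `"counter"` key are hypothetical names; `ProtectedLocalStorage` is the framework type, which, as far as I know, Blazor Server registers for injection. The idea is that state saved this way can be reloaded after the client lands on a new circuit, although it does not prevent the reconnect prompt itself.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components.Server.ProtectedBrowserStorage;

// Hypothetical scoped service that keeps one value in the browser's
// localStorage, encrypted via ASP.NET Core Data Protection.
public sealed class CounterStateStore
{
    private const string Key = "counter"; // assumed storage key

    private readonly ProtectedLocalStorage _storage;

    public CounterStateStore(ProtectedLocalStorage storage) => _storage = storage;

    // Returns the saved value, or 0 when nothing was stored
    // or the stored value could not be read.
    public async Task<int> LoadAsync()
    {
        var result = await _storage.GetAsync<int>(Key);
        return result.Success ? result.Value : 0;
    }

    public async Task SaveAsync(int value) => await _storage.SetAsync(Key, value);
}
```

Because these calls go through JavaScript interop, they are typically made from `OnAfterRenderAsync` rather than during prerendering. The store would be registered with something like `services.AddScoped<CounterStateStore>()`.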

@physix99

physix99 commented Sep 8, 2020

@danroth27 Thanks Dan for your reply.

What I was actually referring to was serializing the SignalR client object for Blazor Server, which is what I believe contains all the information about the client (e.g. information needed to render the DOM).

What I'm after is being able to have two back-end Blazor servers for load balancing, with Azure SignalR Service in front of them and sticky sessions turned off.

Then if we need to update the Blazor Server application, we can update one server at a time. All the SignalR connections to that server will be dropped, but Azure will automatically join them to the other server. Because the Blazor Server circuit has been serialised between the two (using whatever mechanism is needed), the client will join the remaining online server with a very minimal reconnecting screen.

Obviously if there have been big changes the client will need to refresh the page to remake the connection, but for all major changes it should be good enough?

Thanks

@javiercn javiercn added affected-all This issue impacts all the customers enhancement This issue represents an ask for new feature or an enhancement to an existing one severity-blocking This label is used by an internal tool labels Oct 9, 2020 — with ASP.NET Core Issue Ranking
@SteveSandersonMS SteveSandersonMS added severity-major This label is used by an internal tool and removed severity-blocking This label is used by an internal tool labels Oct 9, 2020
@ghost

ghost commented Dec 8, 2020

Thank you for contacting us. Due to a lack of activity on this discussion issue we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework, which hasn't been addressed yet, please file a new issue.

This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after then, please create a new issue!

@ghost ghost closed this as completed Dec 8, 2020
@ghost

ghost commented Dec 8, 2020

Thanks for contacting us.
We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@mkArtakMSFT
Member

Reopening to document this.

@mkArtakMSFT mkArtakMSFT reopened this Dec 8, 2020
@leonkosak

To summarize this: If there is Redis server and clients connect with SignalR only via WebSockets, Blazor Server-Side application is properly scaled-out (and therefore no need for Sticky Sessions), right?

@mkArtakMSFT mkArtakMSFT removed the enhancement This issue represents an ask for new feature or an enhancement to an existing one label Sep 16, 2021
@mkArtakMSFT mkArtakMSFT modified the milestones: 6.0.0, 6.0-docs-infra Oct 19, 2021
@ghost

ghost commented Oct 6, 2023

Thanks for contacting us.

We're moving this issue to the .NET 9 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@maxsargentdev

Just trying to work out some issues using blazor in my organization.

Is there any reason why blazor server doesn't support using redis as a backplane, when a vanilla signalr connection does support it?

Seems like it would be a nice feature.

@ghost

ghost commented Dec 12, 2023

Thanks for contacting us.

We're moving this issue to the .NET 9 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@mkArtakMSFT mkArtakMSFT added the Priority:1 Work that is critical for the release, but we could probably ship without label Jan 11, 2024
@n2029-ndensan

Currently, Blazor Server does not support cloud native. To support autoscaling, the architecture must be able to scale out and scale in. We recognize that we cannot safely scale in. I would like you to respond as soon as possible.

@garrettlondon1

@danroth27 If I remember, you mentioned at Build during our conversations that Azure SignalR service is no longer being recommended for Blazor Server scale-out. Is that still true?

In addition, Azure Front Door doesn't even support WebSockets, even though that has been on the roadmap for 4+ years.

A lot of the questions on this thread about using Redis as a backplane have also gone unanswered...

InteractiveServer developers are, I believe, genuinely confused with the odds stacked against them if they want to develop Enterprise grade applications :(.

@garrettlondon1

garrettlondon1 commented Aug 16, 2024

https://learn.microsoft.com/en-us/aspnet/core/signalr/redis-backplane?view=aspnetcore-8.0

Using Redis as a SignalR backplane, I have a few thoughts.

Currently, Blazor Server has the default ComponentHub, which handles the interactive WebSocket SignalR connections containing the application state. Then, if you want to expose additional real-time functionality, you have to do that through an additional hub.

When scaling beyond one instance, obviously that additional hub will just be sending messages to people on the same instance.

My understanding about using Redis as a backplane for Blazor Server is that it will not store application state at all: that still requires sticky sessions to the same instance, and the application state is still stored in memory in the ASP.NET Core process.

Redis will, though, correctly operate as a distributed backplane between all instances for real-time functionality on the additional hubs.

Using Redis as a backplane is not like using Azure SignalR Service because, as I understand it, with Azure SignalR Service the WebSocket connection does not live on the web server; rather, the client makes the WebSocket handshake with the Azure SignalR Service.

While doing some testing to validate this, I noticed that if services.AddSignalR().AddStackExchangeRedis is given a bad connection string, it renders all InteractiveServer components non-functional.

The question is: if the Redis cluster does not contain any application state and the WebSocket lives on the server, why would InteractiveServer components not function?

@misiek08

misiek08 commented Sep 3, 2024

Reading this topic it looks like Blazor is still a toy. It’s not an Enterprise grade feature to have simple code deployments, simple scaling and HA (graceful failover of client between backends). 5 years ticking and not a single clear answer here…

Am I missing something, or could simple state serialization to storage like Redis solve the issue?

@garrettlondon1

garrettlondon1 commented Sep 5, 2024

> Reading this topic it looks like Blazor is still a toy. It’s not an Enterprise grade feature to have simple code deployments, simple scaling and HA (graceful failover of client between backends). 5 years ticking and not a single clear answer here…
>
> Am I missing something or simple state serialization in storage like Redis could solve the issue?

Simple code deployments, scaling, and HA are all possible with Blazor.

Your user will lose WebSocket state if the handshake is broken or the circuit breaks, for sure, but I don't know if storing application state in Redis is a good idea for Blazor Server.

Blazor Server has realtime DOM updates fly over websocket so quickly that if application state is not stored on the server, I imagine the latency between the user <> server <> redis and back is too significant.

Redis as a normal SignalR backplane that does not communicate DOM updates, but realtime functionality, can survive this delay.. DOM updates at the speed of Blazor Server, I don't think so. Azure SignalR doesn't have this problem because the users connect directly to the SignalR service and not the application, so there is less latency.. although Azure SignalR is still a poor solution.

Also: when it comes to "sticky sessions" for websockets and HTTP requests being mixed in Blazor Web App..

  • I am confused as to why the ARRAffinity token is needed for Blazor Server. There is no session state, there is just a websocket connection and a circuit. If that circuit is broken, what's the problem if the user opens a wss connection to another server?
  • What's the problem if the Blazor Web App makes an HTTP call to server A, but WebSocket traffic is going through server B?

I think the lack of documentation/answers here leaves a lot to be wondered about how Blazor and InteractiveServer should be used in production and enterprise grade workflows

@garrettlondon1

garrettlondon1 commented Sep 5, 2024

> A Blazor app prerenders in response to the first client request, which creates UI state on the server. When the client attempts to create a SignalR connection, the client must reconnect to the same server.

https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/signalr?view=aspnetcore-8.0#use-session-affinity-sticky-sessions-for-server-side-webfarm-hosting

@guardrex , so if Prerendering in Blazor is disabled, sticky sessions are not needed?

EDIT:

  • Sticky sessions are always required for Blazor Server due to disconnection/reconnection to same circuit

@guardrex
Contributor

guardrex commented Sep 5, 2024

I doubt it, but this is for the product unit to address. I'm 👂 for their further remarks that I can place into the session affinity guidance.

@garrettlondon1

garrettlondon1 commented Sep 10, 2024

I did manage to try this out on App Service with multiple instances with session affinity

Using only InteractiveServer with pre-rendering disabled. It did not work: I received the errors below about 50% of the time. Enhanced navigation worked fine once the circuit was established, but not when hard refreshing.

@rendermode @(new InteractiveServerRenderMode(false)) , with static pages and static router

blazor.web.js:1  WebSocket connection to 'wss://app.com/_blazor?id=8upHQjrBq7LzcNkAmI4r3Q' failed: 

blazor.web.js:1 [2024-09-10T04:35:02.483Z] Information: (WebSockets transport) There was an error with the transport.
blazor.web.js:1  [2024-09-10T04:35:02.483Z] Error: Failed to start the transport 'WebSockets': Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled.
blazor.web.js:1  [2024-09-10T04:35:02.514Z] Error: Failed to start the connection: Error: Unable to connect to the server with any of the available transports. Error: WebSockets failed: Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. Error: LongPolling failed: Error: No Connection with that ID: Status code '404'
blazor.web.js:1  [2024-09-10T04:35:02.514Z] Error: Error: Unable to connect to the server with any of the available transports. Error: WebSockets failed: Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled. ServerSentEvents failed: Error: 'ServerSentEvents' does not support Binary. Error: LongPolling failed: Error: No Connection with that ID: Status code '404'
(anonymous) @ blazor.web.js:1
setTimeout
rootComponentsMayRequireRefresh @ blazor.web.js:1
onDocumentUpdated @ blazor.web.js:1
Ki @ blazor.web.js:1
blazor.web.js:1  [2024-09-10T04:35:02.514Z] Error: Failed to start the circuit.

@garrettlondon1

garrettlondon1 commented Sep 25, 2024

Related #58078

and #58079

@misiek08

> Simple code deployments, scaling, and HA are all possible with Blazor..
>
> Your user will lose websocket state if the handshake is broken or circuit breaks for sure..

I'll try to write a simple page and get back with results. It sounds like the whole logic will break here.

> but I don't know if storing application state in Redis is a good idea for Blazor Server

I'm not sure about all the state of the DOM. Maybe some of it can be rebuilt from "the" state. Keeping the whole DOM in Redis is stupid, I agree.

> Blazor Server has realtime DOM updates fly over websocket so quickly that if application state is not stored on the server, I imagine the latency between the user <> server <> redis and back is too significant.

I'm not sure it would be so bad, because I could implement "sharded circuit storage" and just scale Redis horizontally. All I need here is a way to redeploy or just scale Blazor without hard session affinity. The user stays connected to the same server for some longer period, but if I need to redeploy or scale up, the client can easily reconnect elsewhere; the server will rebuild the state by getting it from the storage and just continue serving the user.

> Redis as a normal SignalR backplane that does not communicate DOM updates, but realtime functionality, can survive this delay.. DOM updates at the speed of Blazor Server, I don't think so. Azure SignalR doesn't have this problem because the users connect directly to the SignalR service and not the application, so there is less latency.. although Azure SignalR is still a poor solution.

Can SignalR be used as a complete solution, or are we talking about a message bus only?

> I think the lack of documentation/answers here leaves a lot to be wondered about how Blazor and InteractiveServer should be used in production and enterprise grade workflows

That's why I wrote that it looks like a toy. With all those sealed classes around circuits it looks like (after just reading the code on GitHub, without even cloning) it's impossible to really use it in enterprise, long-running UIs. I have a case with 500k people online on the website and Blazor performance is okayish for the case, but all the HA and resiliency topics look bad. Currently I use a Go backend and a React frontend, and I'm looking into C# for migration.

> The connection could not be found on the server

That's what I want to serialize and save in Redis (we can do this on server shutdown and/or when the connection is lost or closed) and read back when the client connects to restore state.

@garrettlondon1

garrettlondon1 commented Oct 29, 2024

> Simple code deployments, scaling, and HA are all possible with Blazor..

> All I need here is a way to redeploy or just scale Blazor without hard session affinity.

See #58079

Session affinity is always required for Blazor Server because the reconnection process has to connect to the same server, since that's where the circuit and the pre-rendered view are stored, according to Javier.

@mkArtakMSFT
Member

This issue is quite old now.
We have existing documentation for the topics covered in this issue.
If there is anything specific you think we're missing, please file a separate issue for it.

@mkArtakMSFT mkArtakMSFT closed this as not planned Won't fix, can't repro, duplicate, stale Nov 14, 2024
@misiek08

@mkArtakMSFT can you just post a link to these docs? If I understand correctly Blazor is not ready for any scaling and all classes that could help - are sealed. First tests showed me this, I'm trying to make a smallest reproducible package and wanted to share it here, but issue is now closed...

@garrettlondon1

garrettlondon1 commented Nov 15, 2024

@misiek08 what concerns do you have about scaling Blazor Server?

The only thing that comes to mind for me is that Azure Front Door has failed to support Websockets for 5 years after thousands of upvotes

Scaling Blazor Server really depends on how much additional memory you allow each circuit to consume. If you have a lot of scoped services and a lot of circuits, multiply the approximate amount of memory needed to hold those scoped services by the number of circuits.

Other than that, it's the same as any other stateful web app using session affinity, besides hoping your users' latency to the socket isn't too high.
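
The memory rule of thumb above can be written out as a back-of-the-envelope calculation. This is only a sketch: the circuit count and the per-circuit figure are hypothetical inputs that you would replace with measurements from your own app, and `CircuitMemoryEstimate` is an illustrative name.

```csharp
using System;

public static class CircuitMemoryEstimate
{
    // Rough estimate: circuits x memory per circuit, converted from KiB to GiB.
    public static double EstimateGiB(int circuits, double kibPerCircuit)
        => circuits * kibPerCircuit / (1024.0 * 1024.0);

    public static void Main()
    {
        int circuits = 5_000;        // hypothetical number of concurrent circuits
        double kibPerCircuit = 300;  // hypothetical per-circuit cost, measure your own app

        Console.WriteLine($"Estimated circuit memory: {EstimateGiB(circuits, kibPerCircuit):F2} GiB");
    }
}
```

The estimate covers circuit state only; the process baseline and any caches come on top of it.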

@github-actions github-actions bot locked and limited conversation to collaborators Dec 15, 2024