High Availability #520

calebmeyer · 2015-08-20T16:51:51Z

Can this app be set up for high availability? If you have a cluster, can you have a load balancer, at least two application nodes, and several mongo nodes?

Sing-Li · 2015-08-20T19:17:52Z

We need to try it in a lab environment. Configure a mongo replica set with the oplog enabled. The load balancer needs to support sticky sessions if you have old browsers and/or older mobile clients. @calebmeyer -- do you have access to test facilities for HA configurations? If issues are found in such configurations, we can fix them. (Note: some optional packages, such as file-upload to local filesystem, are by design not HA compatible.)

calebmeyer · 2015-08-20T19:37:30Z

I have access to some testing facilities. I will be trying this out. Can you recommend a load balancer?

Sing-Li · 2015-08-20T19:47:19Z

haproxy? Other community members may have more experience with other software load balancers. Please keep us posted on how it goes.

geekgonecrazy · 2015-08-22T06:51:20Z

I enjoy playing with these things. I'll give this a go.

So, what do we want tested?

cluster mongodb (couple nodes)
rocket.chat (couple instances. Pointing to the master of the mongo cluster)
load balancer (haproxy or nginx)

Anything else? I guess rattle it a bit and see what bolts fall out? :D

Sing-Li · 2015-08-22T12:47:23Z

My 2c

variety of client browser/devices (especially IE versions)
force-fail mongo instance; force-fail meteor instance; observe interruption in user experience - expected to see some UI anomalies due to delayed session recovery/fail-over, but no data lost
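One way to make "observe interruption in user experience" measurable during the force-fail tests is to poll the load balancer's address while nodes are being killed and then collapse the results into outage windows. The following is only a minimal sketch: the `probe` and `find_outages` helpers are hypothetical names, and a plain HTTP check says nothing about websocket session recovery, which would need a separate test.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Outage:
    start: float  # timestamp of the first failed probe
    end: float    # timestamp of the next successful probe (or last sample)

    @property
    def duration(self) -> float:
        return self.end - self.start


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the front end answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False


def find_outages(samples: Iterable[Tuple[float, bool]]) -> List[Outage]:
    """Collapse time-ordered (timestamp, ok) samples into outage windows."""
    outages: List[Outage] = []
    down_since = None
    last_ts = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts  # outage begins
        elif ok and down_since is not None:
            outages.append(Outage(down_since, ts))  # outage ends
            down_since = None
        last_ts = ts
    if down_since is not None and last_ts is not None:
        # Still down when the run ended.
        outages.append(Outage(down_since, last_ts))
    return outages
```

In a test run you would call `probe` once a second against the balancer, record `(time.time(), result)` pairs, kill a mongo or meteor instance, and feed the recorded pairs to `find_outages` afterwards.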

Sing-Li · 2015-09-07T02:02:09Z

@calebmeyer any update on experiments with clusters of more than 4 RC instances? Thanks.

calebmeyer · 2015-09-07T05:38:07Z

I have logged a JIRA for access to my organization's openstack cluster, and I'm just waiting on a response. I'll keep you posted.

Sing-Li · 2015-09-07T07:12:21Z

Thanks, @calebmeyer ! 👍

Sing-Li · 2015-09-11T22:15:22Z

#769 <---- where are the HA guys when we really need them?! 😀

geekgonecrazy · 2015-09-11T22:17:19Z

We need to do some testing this weekend

leefaus · 2015-09-15T22:18:16Z

@geekgonecrazy @Sing-Li

I have an HA solution up and running at Google Cloud. Here is the private Gist that describes what I did. I still need to tweak some of the docs, but it is running famously at http://130.211.152.251/.

geekgonecrazy · 2015-09-15T22:19:42Z

@leefaus sweet! Come across any issues?

Also did you use anything in front of your nodes?

leefaus · 2015-09-15T22:21:22Z

@geekgonecrazy

The only issue is trying to automate setting up the MongoDB Replica Set. I am digging more into that. And on Google Cloud, the HTTP Load Balancing doesn't support websockets so you need to use Network Load Balancing. Other than that, everything went according to plan.

Sing-Li · 2015-09-15T22:30:10Z

Thanks for the update! Automated provisioning of failed mongo would be cool ... k8s can do that for sure. But a websocket load-balancing front end is going to take some testing and tinkering, I think. We need to come up with some way to blast it from the front, then introduce different failures at the back and see how the thing behaves 😸

What is NLB? Is it round robin?

Please keep us in the 'loop' :-)

calebmeyer · 2015-09-21T15:38:10Z

@leefaus Thanks for the awesome starting point!
@Sing-Li I got access to our internal openstack. I'll be looking into setting up a cluster with:
1 HAProxy node
2 Rocket.Chat meteor nodes
3 MongoDB nodes

I'm new to setting all this up, so it will likely take some time while I learn the technologies (or at least how to configure them).

Can anyone recommend testing tools for hitting the frontend, or should I just script something?

Sing-Li · 2015-09-21T15:48:10Z

@calebmeyer thanks for the update! Two suggestions:

Can you please 'up' the number of Rocket.Chat meteor nodes from 2 to either 6 or 10? The reason being the 24 x 7 demo chat that we have running now is already 4 nodes - and we see no problem at that tier so far
For front-end loading, I'd suggest scripting something yourself. We will have a Rocket.Chat specific scalable 'load test tool' in the future.

geekgonecrazy · 2015-09-21T16:10:26Z

@calebmeyer for load I'd recommend taking a look at Asteroid

Could use multiple instances of a script using this and hit the server with as many messages as you want.
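As a rough illustration of what such a script sends: Asteroid is a JavaScript DDP client, and DDP messages are JSON frames over a websocket. The Python sketch below only builds the frames and does not open a connection. The `connect` and `method` frame shapes follow the DDP protocol; the method name `sendMessage` and the message payload fields are assumptions about Rocket.Chat that should be checked against the server you are testing.

```python
import json
import uuid
from typing import Iterator, List

# Assumption: the server exposes a Meteor method with this name.
SEND_METHOD = "sendMessage"


def connect_frame() -> str:
    """First frame a DDP client sends after the websocket opens."""
    return json.dumps({"msg": "connect", "version": "1", "support": ["1"]})


def method_frame(call_id: str, method: str, params: List[object]) -> str:
    """A DDP remote method call; the server replies with a matching id."""
    return json.dumps(
        {"msg": "method", "method": method, "params": params, "id": call_id}
    )


def message_burst(room_id: str, n: int) -> Iterator[str]:
    """Yield n method frames, each posting one chat message to room_id."""
    for i in range(1, n + 1):
        # Hypothetical payload shape: message id, room id, message text.
        payload = {
            "_id": uuid.uuid4().hex,
            "rid": room_id,
            "msg": f"load test message {i}",
        }
        yield method_frame(str(i), SEND_METHOD, [payload])
```

Running several processes that each log in, send `connect_frame()`, and then stream `message_burst(...)` over their own websocket would approximate the "multiple instances of a script" idea.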

calebmeyer · 2015-09-28T17:57:03Z

@Sing-Li will I need to set up sticky sessions?

leefaus · 2015-09-29T16:33:17Z

@calebmeyer

Yes, you need sticky sessions.

calebmeyer · 2015-11-13T20:50:56Z

Hey all, sorry for the long downtime. I got one of the openstack admins to help me. Didn't get my 6-10 app nodes, but I have it running behind a proxy on our corporate network.

I used @leefaus's gist to help with the mongo replica set (thank you very much!), and I have my haproxy balancing via source.

However, I noticed that avatars are stored on the filesystem in the administration/accounts section. Is that intended? How did we get around that limitation for the demo?
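For readers unfamiliar with "balancing via source": the balancer hashes the client's source address and uses the result to pick a backend, so a given client keeps landing on the same app node, which gives sticky behaviour without cookies. The sketch below illustrates the idea only. It is not HAProxy's actual hash function, and the backend names are placeholders.

```python
import hashlib
from typing import Sequence


def pick_backend(client_ip: str, backends: Sequence[str]) -> str:
    """Map a client address to one backend, deterministically."""
    if not backends:
        raise ValueError("at least one backend is required")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around
    # the number of backends.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Placeholder node names for a two-app-node cluster.
APP_NODES = ["rocketchat-1:3000", "rocketchat-2:3000"]

if __name__ == "__main__":
    for ip in ("10.0.0.5", "10.0.0.6", "10.0.0.5"):
        print(ip, "->", pick_backend(ip, APP_NODES))
```

One consequence worth keeping in mind during failure tests: with a simple modulo scheme like this, removing a node changes the mapping for many clients, so some sessions move to a different app node and have to reconnect.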

jonbaer · 2015-11-13T21:05:20Z

I am interested in this topic as well. I have haproxy pointing to nodes managed by PM2, with only a single Mongo cluster at the moment. What I am mainly interested in, though, is how to stress test and what should be tested (as far as Meteor itself goes). It's my understanding that the DDP channel would be the bottleneck (do you test with something like https://github.com/observing/thor in that case?). I'd like to get a report from which I can determine what AWS/Rackspace resources are needed for N active users and connections with Rocket.Chat. I would also be interested in documentation of what a good haproxy configuration would look like (including SSL setup) ...

Sing-Li · 2015-11-14T03:53:43Z

Guys. Please open additional issues for capacity planning and load testing discussions.

There is enough complexity in just getting HA testing going alone.

@calebmeyer - so how many nodes do you have running? In Administration-> API, there is a setting to change avatarStore_type to GridFS. This will store the avatars in mongo for your test scenarios.

calebmeyer · 2015-11-16T16:42:32Z

@Sing-Li I have 6 nodes: 1 proxy, 2 app, 3 mongo. I see the setting for Avatar Storage Type; do I also need to clear the Avatar Storage Path? Currently it says /var/www/rocket.chat/uploads/avatar/.

You're right about HA being different from load testing, which is why I was surprised we'd need so many nodes for it. I figured that, in a high availability scenario, taking down one node should leave the app running for everyone, and taking down both would take it down.

Sing-Li · 2015-11-16T16:51:43Z

@calebmeyer I don't think you need to clear the path - but might as well.

And yes. It is just that our demo-server production is already at 4 app-nodes, and it is basically stable. However, it does not usually encounter the barrage of anomalies that you will be subjecting your cluster to 😄

So now, we'll just sit back and await your 'breakage reports' 👍

calebmeyer · 2015-11-16T17:01:38Z

Time to bring in the chaos monkey :) Thanks for the quick response. Turns out the avatar issue I was seeing went away when I cleared my cache, so I can leave the GridFS settings as they are.

richardwlu · 2016-03-03T21:45:21Z

What should we set the MONGO_URL and MONGO_OPLOG_URL environment variables to on a secondary instance of the app, with a mongo replica set of 3 members? I noticed that this gist, https://gist.github.com/leefaus/fd55eee32f1dc5918220, has a list for MONGO_URL but nothing for MONGO_OPLOG_URL.

Is the gist for MONGO_URL accurate, and should the second instance's MONGO_OPLOG_URL point to the primary mongodb or to mongodb://localhost:27017/local?

We're currently running on CentOS with one instance of the app, and are installing and bringing up another Rocket.Chat instance on another server. We will be adding 3 members to the replica set: one on the primary Rocket.Chat instance, one on the secondary Rocket.Chat instance, and another on a separate mongo server.
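The question above is not answered in this thread itself. As a hedged illustration only: a common Meteor convention is to give every app instance the same two URLs, both listing all replica set members, with MONGO_URL naming the application database and MONGO_OPLOG_URL naming the `local` database. The sketch below builds such a pair. The host names, the replica set name `rs0`, and the database name `rocketchat` are placeholders, and authentication options are left out.

```python
from typing import Dict, Sequence


def replica_set_urls(
    hosts: Sequence[str],
    replica_set: str,
    app_db: str = "rocketchat",  # placeholder database name
) -> Dict[str, str]:
    """Build MONGO_URL and MONGO_OPLOG_URL for a Meteor app on a replica set."""
    if not hosts:
        raise ValueError("at least one replica set member is required")
    host_list = ",".join(hosts)
    return {
        # Application data lives in app_db.
        "MONGO_URL": f"mongodb://{host_list}/{app_db}?replicaSet={replica_set}",
        # The oplog is read from the 'local' database of the same set.
        "MONGO_OPLOG_URL": f"mongodb://{host_list}/local?replicaSet={replica_set}",
    }


if __name__ == "__main__":
    # Placeholder hosts matching the three-member layout described above.
    members = ["rc-app-1:27017", "rc-app-2:27017", "mongo-3:27017"]
    for name, value in replica_set_urls(members, "rs0").items():
        print(f"export {name}='{value}'")
```

Listing every member, rather than only the current primary or localhost, lets the driver find the new primary after a failover.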

engelgabriel · 2016-03-04T20:43:39Z

Please see #1867

engelgabriel · 2016-08-09T19:13:00Z

Hi, please follow https://github.com/RocketChat/Rocket.Chat.Docs/issues/68

benyanke · 2018-04-15T03:38:29Z

Not sure of the status of this, but I'm willing to do whatever's needed to test on my docker cluster at home.

geekgonecrazy · 2018-04-16T03:35:21Z

See the referenced issue on the docs repo right above, as well as our documentation. Read through the various docker installs and you'll get a variety of ways to do your setup.

marceloschmidt added contrib: experts needed type: discussion labels Aug 20, 2015

geekgonecrazy mentioned this issue Aug 31, 2015

Federation Protocol #601

Closed

geekgonecrazy mentioned this issue Sep 15, 2015

Support multiple orgs on the same instance #658

Open

marceloschmidt added this to the Roadmap milestone Sep 21, 2015

guarilha mentioned this issue Sep 22, 2015

Create Documentation about scalability #847

Closed

rodrigok modified the milestones: Roadmap, Important Feb 23, 2016

engelgabriel closed this as completed Mar 4, 2016

simonclausen mentioned this issue Jun 4, 2016

High Availability Documentation #2964

Closed

engelgabriel modified the milestone: Important Dec 6, 2016

engelgabriel mentioned this issue Apr 26, 2022

High Availability deployment RocketChat/docs-old#2037

Closed
