[QUESTION] How to get Cherami to Spread the Load #284
For reference my current configuration is:
The environment peaks at maybe 4000 messages a second and averages a couple hundred. We see Cherami back up under the load at every peak time, and usually when this happens one of the storehosts will have hit its 4GB cap, while the others will be sitting at around 100MB. It's the same with the output hosts: one hits its 2GB cap while the others sit firmly at 20MB. Any help from the active devs @datoug / @kobeyang / @kirg would be most appreciated, as this is the final stumbling block for us with Cherami.
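To put a number on the imbalance described above, here is a minimal Python sketch. It is not part of Cherami or its tooling. The host names and the host count are hypothetical, and the values only approximate the figures quoted in this comment (caps of 4GB and 2GB, with the other hosts at roughly 100MB and 20MB).

```python
from statistics import mean


def memory_skew(usage_mb):
    """Return (hottest_host, ratio of its usage to the mean of the other hosts)."""
    if len(usage_mb) < 2:
        raise ValueError("need at least two hosts to compare")
    hottest = max(usage_mb, key=usage_mb.get)
    others = [mb for host, mb in usage_mb.items() if host != hottest]
    return hottest, usage_mb[hottest] / mean(others)


# Hypothetical host names; memory values (MB) approximate the numbers reported above.
storehosts = {"store-0": 4096, "store-1": 100, "store-2": 100}
outputhosts = {"output-0": 2048, "output-1": 20, "output-2": 20}

for role, usage in (("storehost", storehosts), ("outputhost", outputhosts)):
    host, ratio = memory_skew(usage)
    print(f"{role}: {host} uses {ratio:.1f}x the mean of its peers")
```

A ratio close to 1 would indicate an even spread. Ratios in the tens or hundreds, as here, suggest that nearly all of the load is landing on a single host.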
Hi @datoug, thanks for getting back to me quickly. Now no data is flowing through Cherami; however, I am wondering if this is due to other updates since I last pulled the image.
I see this error on the controllers. I cannot list consumer groups via the CLI tool, only destinations.
Actually, you can ignore that; it seems the upgrade wasn't very clean. Will run it this evening and let you know how it goes.
Yeah, from the error, it seems your cherami-thrift is not up to date.
Ok, so now that I've redeployed, the resource profile looks much more sensible: each host is using about the same amount of memory. The system is quiet overnight, so the real test will be tomorrow morning when it comes under load. Will let you know then so you can close this. Thanks again for all the help @datoug!
Hi @datoug, sorry for being slow to get back to you. The fix seems to have worked well; the load is far more evenly balanced across the cluster. Would you guys be open to pull requests around things like example k8s configs or documentation additions?
@datoug is this fix being recreated or dropped? I just noticed it's been reverted out of master.
@danudell-trustnetworks Yes, PRs are welcome. My patch had some test issues. I'm investigating that and will re-merge it after it's solved.
@datoug Cheers, we will stick on the branch for now until you can sort the test failures.
@danudell-trustnetworks on it now, will update you today or tomorrow.
@danudell-trustnetworks I have landed #294, could you try the latest master branch?
Yeah that all works - cheers! You can close this on |
I'm running a Cherami cluster using docker containers. I have:
And Cassandra with 5 nodes, and an RF of 3.
However, during times of heavy load across the system, most of the boxes sit idle while one storage host climbs until it has used all its memory and is then killed (OOM).
Is there a way to tell Cherami to spread the load to the other storage hosts? It seems all the extents are placed on the one host, and even when that host starts slowing down and falling behind, the load is never shared.
Cassandra is aware of all nodes, and each node seems to be working.
Many thanks!