Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hibernate_after listener opt with 0 default value. #1821

Merged
merged 1 commit into from
May 10, 2018

Conversation

kzemek
Copy link
Contributor

@kzemek kzemek commented Apr 24, 2018

This PR introduces a new configuration variable for listeners: hibernate_after.
This variable configures the timeout (in milliseconds) after which a c2s and receiver process will hibernate if they're idle. It replaces old in-code constants that hibernated these processes after 90 seconds of idle time.

If hibernate_after is 0 (the default), the processes will hibernate as soon as they get a chance. Load tests have shown that this greatly reduces MongooseIM's memory consumption while producing no noticeable slowdown or increase in CPU load under common scenarios.

@kzemek kzemek force-pushed the exml-rapidxml-hibernate branch 2 times, most recently from 8a18db6 to bfd32e4 Compare April 30, 2018 12:22
@kzemek kzemek changed the title Exml rapidxml hibernate Add hibernate_after listener opt with 0 default value. Apr 30, 2018
@kzemek kzemek requested review from fenek and michalwski April 30, 2018 13:50
@kzemek kzemek force-pushed the exml-rapidxml-hibernate branch from bfd32e4 to 5f4a249 Compare April 30, 2018 13:52
@esl esl deleted a comment from codecov bot Apr 30, 2018
@codecov
Copy link

codecov bot commented Apr 30, 2018

Codecov Report

Merging #1821 into master will increase coverage by 0.01%.
The diff coverage is 73.58%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1821      +/-   ##
==========================================
+ Coverage   74.67%   74.69%   +0.01%     
==========================================
  Files         298      298              
  Lines       27764    27784      +20     
==========================================
+ Hits        20734    20752      +18     
- Misses       7030     7032       +2
Impacted Files Coverage Δ
src/ejabberd_socket.erl 70% <100%> (-1.24%) ⬇️
src/ejabberd_c2s.erl 84.66% <66.66%> (+0.3%) ⬆️
src/ejabberd_receiver.erl 80.85% <75.75%> (+1.48%) ⬆️
src/mod_mam_odbc_prefs.erl 95.6% <0%> (-3.3%) ⬇️
src/rdbms/mongoose_rdbms.erl 71.22% <0%> (-1.44%) ⬇️
src/mod_bosh_socket.erl 80.66% <0%> (-1%) ⬇️
src/p1_fsm_old.erl 36.6% <0%> (-0.76%) ⬇️
src/mod_mam_utils.erl 87.45% <0%> (-0.67%) ⬇️
src/mod_roster.erl 79.16% <0%> (+0.24%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0a452aa...2a1db2a. Read the comment docs.

@arcusfelis
Copy link
Contributor

Do we exchange a lot of presences in our tests?
I would like to see how the change behaves with more presence probes or with MUC presences workload.

@kzemek
Copy link
Contributor Author

kzemek commented May 2, 2018

@arcusfelis These are presence-based load tests: without hibernate and with hibernate.

@kzemek
Copy link
Contributor Author

kzemek commented May 2, 2018

And here are load tests with users sending messages rapidly: without hibernate and with hibernate.

@arcusfelis
Copy link
Contributor

arcusfelis commented May 2, 2018

one_to_one test is more about message sending.

Presence based should have:

  • a lot of login/logouts
  • a lot of presence updates

The roster size also plays a role.

I am fine with the change, but still it would be nice to have presence-oriented tests as part of our testing on tide.

We don't wake up processes often in one_to_one test.

@kzemek
Copy link
Contributor Author

kzemek commented May 2, 2018

@arcusfelis Have you looked at the scenarios tested? The tests I linked as "presence tests" only test sending presence updates. Each user has 8-user roster and updates presence every time when in a normal test he would send a message to his 8 neighbors.

In the "rapid send" test, each user sends 5 messages/second and receives 40 messages/second.

@kzemek kzemek force-pushed the exml-rapidxml-hibernate branch from 5f4a249 to b9d5113 Compare May 7, 2018 09:42
@@ -421,3 +423,15 @@ gen_server_call_or_noproc(Pid, Message) ->
end.

gen_fsm() -> p1_fsm.

hibernate(Result) ->
case process_info(self(), message_queue_len) of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With p1_fsm some messages can be inside an internal message queue.
This means that you should check '$internal_queue_len' process dictionary entry too.

I would rather put hibernate logic inside p1_fsm_old.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for moving the logic to p1_fsm_old.

@@ -421,3 +423,15 @@ gen_server_call_or_noproc(Pid, Message) ->
end.

gen_fsm() -> p1_fsm.

hibernate(Result) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both hibernate/1 and maybe_hibernate/1 needs specs.
I am not fan of element/2 call to get state, because it's not dialyzer verifiable.
Just case can be a better option.

Copy link
Contributor

@arcusfelis arcusfelis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs specs.

@arcusfelis
Copy link
Contributor

So, if we send a message each minute, than in the old test we never hibernate at all => can lead to higher memory usage.

@kzemek kzemek force-pushed the exml-rapidxml-hibernate branch 3 times, most recently from 31c2942 to 8ca7050 Compare May 7, 2018 16:35
@kzemek kzemek force-pushed the exml-rapidxml-hibernate branch from 8ca7050 to 8346606 Compare May 8, 2018 15:19
The default of 0 means that the processes hibernate on every
opportunity.
@kzemek kzemek force-pushed the exml-rapidxml-hibernate branch from 8346606 to 2a1db2a Compare May 9, 2018 10:07
@arcusfelis arcusfelis merged commit 0c5ffa0 into master May 10, 2018
@arcusfelis arcusfelis deleted the exml-rapidxml-hibernate branch May 10, 2018 11:17
@fenek fenek added this to the 3.0.0 milestone May 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants