
enhancement: use Arc + Mutex for DbPool instance instead of clone() #5037

Open · wants to merge 7 commits into base: main
Conversation

@sillent commented Oct 3, 2024

Hi, this is a PR with a minor change: use Arc::clone instead of calling .clone() on the DbPool instance.

@BlackDex (Collaborator) commented Oct 4, 2024

Isn't it already an Arc<Mutex>, at least the connection?

conn: Arc::new(Mutex::new(Some(DbConnInner::$name(c)))),

Wrapping another one around it doesn't seem like a good option I think, but I might be wrong. @dani-garcia?

@sillent (Author) commented Oct 4, 2024

Isn't it already a Arc<Mutex>, the connection at least?

conn: Arc::new(Mutex::new(Some(DbConnInner::$name(c)))),

It's the DbConnInner that is wrapped, not the DbPool.

@BlackDex (Collaborator) commented Oct 4, 2024

Still, I think this is a bad idea. I'll do some tests.

@BlackDex (Collaborator) commented Oct 4, 2024

And it is.

I tested this by calling the /api/two-factor/get-duo endpoint using hey with 10 concurrent calls and a total of 100 requests.
See the results below.

Arc + Mutex:

hey -c 10 -n 100 -m POST \
  -H 'authorization: Bearer eyJ0eX....Cw' \
  -d '{"masterPasswordHash":"N30....7s4="}' \
  http://127.0.0.1/api/two-factor/get-duo

Summary:
  Total:	88.4836 secs
  Slowest:	10.6611 secs
  Fastest:	0.8882 secs
  Average:	8.4133 secs
  Requests/sec:	1.1302
  
Response time histogram:
  0.888 [1]	|■
  1.866 [1]	|■
  2.843 [1]	|■
  3.820 [1]	|■
  4.797 [1]	|■
  5.775 [1]	|■
  6.752 [2]	|■
  7.729 [2]	|■
  8.707 [16]	|■■■■■■■■■■■
  9.684 [60]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  10.661 [14]	|■■■■■■■■■

Latency distribution:
  10% in 7.9186 secs
  25% in 8.0089 secs
  50% in 8.8373 secs
  75% in 8.8653 secs
  90% in 9.7137 secs
  95% in 9.7540 secs
  99% in 10.6611 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.8882 secs, 10.6611 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0005 secs
  resp wait:	8.4131 secs, 0.8870 secs, 10.6609 secs
  resp read:	0.0001 secs, 0.0001 secs, 0.0005 secs

Status code distribution:
  [200]	100 responses

Current main branch:

hey -c 10 -n 100 -m POST \
  -H 'authorization: Bearer eyJ0eX....Cw' \
  -d '{"masterPasswordHash":"N30....7s4="}' \
  http://127.0.0.1/api/two-factor/get-duo

Summary:
  Total:	13.1764 secs
  Slowest:	1.6208 secs
  Fastest:	0.9380 secs
  Average:	1.2475 secs
  Requests/sec:	7.5893
  
Response time histogram:
  0.938 [1]	|■■
  1.006 [0]	|
  1.075 [6]	|■■■■■■■■■■■■
  1.143 [18]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  1.211 [20]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  1.279 [19]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  1.348 [14]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  1.416 [11]	|■■■■■■■■■■■■■■■■■■■■■■
  1.484 [4]	|■■■■■■■■
  1.553 [5]	|■■■■■■■■■■
  1.621 [2]	|■■■■

Latency distribution:
  10% in 1.1011 secs
  25% in 1.1493 secs
  50% in 1.2363 secs
  75% in 1.3284 secs
  90% in 1.4567 secs
  95% in 1.5047 secs
  99% in 1.6208 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.9380 secs, 1.6208 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0003 secs
  resp wait:	1.2472 secs, 0.9378 secs, 1.6207 secs
  resp read:	0.0002 secs, 0.0000 secs, 0.0018 secs

Status code distribution:
  [200]	100 responses

@BlackDex (Collaborator) commented Oct 4, 2024

The problem is that concurrency doesn't work that well anymore because of the lock you added.
Now every call needs to wait for the lock to be released, and that causes a huge delay.

This kind of defeats the purpose of async as well.
So I'm not in favor of adding this.

@sillent (Author) commented Oct 4, 2024

Hm.. you're right! lock() for a read operation is not good! Refactored to use RwLock.

Performance is the same

@BlackDex (Collaborator) commented Oct 4, 2024

But why add another layer around this?
If I'm correct, though I'm not sure, off the top of my head this is all handled by r2d2 already.

Why should we add another layer of Arc with a RwLock?
I'm not seeing the benefits here.

Could you elaborate a bit more on this?

@BlackDex (Collaborator) commented Oct 4, 2024

It might help a bit in limiting allocations, maybe. But the effect is probably minimal.

@sillent (Author) commented Oct 4, 2024

But why add another layer around this? If I'm correct, though I'm not sure, off the top of my head this is all handled by r2d2 already.

Why should we add another layer of Arc with a RwLock? I'm not seeing the benefits here.

Could you elaborate a bit more on this?

First of all, yes, to reduce the number of allocations.
Yes, right now this happens once at startup, but in the future the number of functions requiring the DbPool instance may increase.

And secondly, it may become necessary to change the parameters of this pool from different places (yeah, right now this is not needed anywhere, but maybe in the future... who knows).

@BlackDex (Collaborator) commented Oct 6, 2024

I think this should work. While I do not think we will add something that needs modifications to the pool, having a few less allocations is good too.

If you could rebase on the current main branch, that would be cool :).

(review comment on src/main.rs: outdated, resolved)
@sillent (Author) commented Oct 6, 2024

If you could rebase on the current main branch that would be cool :).

done

(review comment on src/api/core/sends.rs: outdated, resolved)
@williamdes (Contributor) commented

Can you post a comparative perf benchmark again?

@sillent (Author) commented Oct 8, 2024

Can you post a comparative perf benchmark again?

hey usage

hey -c 10 -n 100 -m POST \
  -H "Authorization: Bearer eyJ0eXAiO....." \
  -d '{"masterPasswordHash": "6Usmwe8e....."}' \
  "http://127.0.0.1:8000/api/two-factor/get-duo"

results for vaultwarden build from main branch:

Summary:
  Total:	1.2014 secs
  Slowest:	0.2152 secs
  Fastest:	0.0567 secs
  Average:	0.1088 secs
  Requests/sec:	83.2364

  Total data:	4100 bytes
  Size/request:	41 bytes

Response time histogram:
  0.057 [1]	|■
  0.073 [6]	|■■■■■■■■■
  0.088 [19]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.104 [27]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.120 [24]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.136 [6]	|■■■■■■■■■
  0.152 [4]	|■■■■■■
  0.168 [4]	|■■■■■■
  0.183 [5]	|■■■■■■■
  0.199 [1]	|■
  0.215 [3]	|■■■■


Latency distribution:
  10% in 0.0773 secs
  25% in 0.0872 secs
  50% in 0.1011 secs
  75% in 0.1150 secs
  90% in 0.1651 secs
  95% in 0.1825 secs
  99% in 0.2152 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0001 secs, 0.0567 secs, 0.2152 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0002 secs
  resp wait:	0.1086 secs, 0.0566 secs, 0.2151 secs
  resp read:	0.0001 secs, 0.0000 secs, 0.0010 secs

Status code distribution:
  [200]	100 responses

results for vaultwarden build from changed branch with RwLock:

Summary:
  Total:	1.1531 secs
  Slowest:	0.2729 secs
  Fastest:	0.0620 secs
  Average:	0.1072 secs
  Requests/sec:	86.7242

  Total data:	4100 bytes
  Size/request:	41 bytes

Response time histogram:
  0.062 [1]	|■
  0.083 [25]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.104 [29]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.125 [15]	|■■■■■■■■■■■■■■■■■■■■■
  0.146 [21]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.167 [5]	|■■■■■■■
  0.189 [2]	|■■■
  0.210 [1]	|■
  0.231 [0]	|
  0.252 [0]	|
  0.273 [1]	|■


Latency distribution:
  10% in 0.0722 secs
  25% in 0.0821 secs
  50% in 0.1005 secs
  75% in 0.1287 secs
  90% in 0.1414 secs
  95% in 0.1667 secs
  99% in 0.2729 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0001 secs, 0.0620 secs, 0.2729 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0002 secs
  resp wait:	0.1070 secs, 0.0619 secs, 0.2727 secs
  resp read:	0.0001 secs, 0.0000 secs, 0.0011 secs

Status code distribution:
  [200]	100 responses

Run on my local Mac Pro M1.

@BlackDex (Collaborator) commented Oct 8, 2024

I can confirm those results with the tests I did.

@Gerardv514 commented

> (quotes the full benchmark comparison above)

Is this considered a significant improvement?

@sillent (Author) commented Oct 9, 2024

Is this considered a significant improvement?

The changes are not aimed at performance improvement.

@BlackDex (Collaborator) commented Oct 9, 2024

Well, maybe there is a very, very small performance gain, but not a measurable one. Since fewer allocations need to be made, less memory is used and fewer CPU cycles are spent.

@dani-garcia (Owner) commented

The Pool type inside DbPoolInner is already wrapped in an Arc, it's from the r2d2 crate: https://github.com/sfackler/r2d2/blob/c1b0d9f976e12e97f92554063a7c6bd295eee471/src/lib.rs#L317

Also I don't think the RwLock is needed, we're only using the DbPool as a read-only reference, so if we wanted to do this, Arc<DbPool> should be enough.

That said, this seems like premature optimization to me.

@BlackDex (Collaborator) commented

I have the same thoughts.

I tested it to see if it would lower the number of allocations and the memory use, but it isn't detectable as far as I can see. Sure, a few less allocations, but not really an improvement in memory or speed, mainly because it's all loaded during startup.

5 participants