Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Enhancements #69

Merged
merged 6 commits into from
Jul 5, 2024

Conversation

wolfendale
Copy link
Contributor

@wolfendale wolfendale commented Jul 5, 2024

Changes:

Since using the 2.0.3 release in production for the friends service, we noticed a bug which caused a leak in goroutines. The fix for this has already been applied in production but this backports that fix.

When observing the behaviour of the friends server, we noticed it was allocating a lot of memory. Currently we allocate a 64kb buffer every time a packet is sent to the server, that packet is only supposed to be held temporarily but it seems they're not being properly collected. This change means that the fixed pool of goroutines which respond to incoming packets will each have their own 64kb buffer that will be reused for each incoming packet instead, which should greatly reduce memory allocations.

@wolfendale wolfendale marked this pull request as ready for review July 5, 2024 15:46
@jonbarrow
Copy link
Member

jonbarrow commented Jul 5, 2024

The live server also hot patched these methods to no longer spawn goroutines. This change should also probably be back ported

Also we can see that DeriveKerberosKey is also producing a lot of allocations and using a lot of memory. According to pprof DeriveKerberosKey accounts for almost 40% of the applications entire allocations:

File: friends
Type: alloc_space
Time: Jul 5, 2024 at 11:52am (EDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 945832.13MB, 97.45% of 970564.26MB total
Dropped 758 nodes (cum <= 4852.82MB)
Showing top 10 nodes out of 19
      flat     flat%    sum%        cum    cum%
588871.18MB    60.67%   60.67% 592720.55MB 61.07%  github.com/PretendoNetwork/nex-go/v2.(*PRUDPServer).listenDatagram
356758.44MB    36.76%   97.43% 356758.44MB 36.76%  github.com/PretendoNetwork/nex-go/v2.DeriveKerberosKey
      137MB    0.014%   97.45% 371559.24MB 38.28%  github.com/PretendoNetwork/nex-go/v2.(*PRUDPEndPoint).processPacket
    40.51MB    0.0042%  97.45% 286847.14MB 29.55%  github.com/PretendoNetwork/nex-go/v2.(*PRUDPEndPoint).handleReliable
       18MB    0.0019%  97.45%   4935.45MB  0.51%  database/sql.(*DB).query
        5MB    0.00052% 97.45% 280457.63MB 28.90%  github.com/PretendoNetwork/nex-protocols-common-go/v2/ticket-granting.generateTicket
     1.50MB    0.00015% 97.45%  78875.56MB  8.13%  github.com/PretendoNetwork/nex-go/v2.(*PRUDPEndPoint).handleConnect
     0.50MB    5.2e-05% 97.45%  77077.36MB  7.94%  github.com/PretendoNetwork/nex-go/v2.(*PRUDPEndPoint).readKerberosTicket
          0    0%       97.45%   4935.45MB  0.51%  database/sql.(*DB).QueryContext
          0    0%       97.45%   4935.45MB  0.51%  database/sql.(*DB).QueryContext.func1

We can definitely do better, and this would be a good PR to address it

Using an implementation like this results in significantly fewer allocations:

func DeriveKerberosKey(pid int, password []byte) []byte {
	iterationCount := 65000 + pid%1024
	key := make([]byte, md5.Size)
	copy(key, password)

	for i := 0; i < iterationCount; i++ {
		hash := md5.Sum(key)
		copy(key, hash[:])
	}

	return key
}

though this has not been tested on a live server, I only ran benchmarks:

goos: linux
goarch: amd64
pkg: github.com/PretendoNetwork/nex-go/v2/kerb
cpu: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
BenchmarkOriginalDeriveKerberosKey-8         	    100	 10546657 ns/op	1040917 B/op	  65057 allocs/op
BenchmarkImprovedDeriveKerberosKey-8         	    174	  6740704 ns/op	     16 B/op	      1 allocs/op
BenchmarkOriginalDeriveKerberosKeyMemory-8   	    100	 10092520 ns/op	1040915 B/op	  65057 allocs/op
BenchmarkImprovedDeriveKerberosKeyMemory-8   	    172	  6807064 ns/op	     16 B/op	      1 allocs/op
PASS
ok  	github.com/PretendoNetwork/nex-go/v2/kerb	6.959s

This doesn't address the md5 CPU usage issue, but it drastically drops the memory usage

@wolfendale wolfendale force-pushed the performance-tweaks branch from 63173c6 to ad488be Compare July 5, 2024 16:39
@jonbarrow
Copy link
Member

LGTM 👍

timeout_manager.go Outdated Show resolved Hide resolved
Co-authored-by: Daniel López Guimaraes <112760654+DaniElectra@users.noreply.github.com>
Copy link
Member

@DaniElectra DaniElectra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jonbarrow jonbarrow merged commit d633dca into PretendoNetwork:master Jul 5, 2024
@wolfendale wolfendale deleted the performance-tweaks branch July 5, 2024 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants