Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sudden drop in throughput #470

Closed
setevoy2 opened this issue Feb 21, 2020 · 10 comments
Closed

Sudden drop in throughput #470

setevoy2 opened this issue Feb 21, 2020 · 10 comments
Assignees

Comments

@setevoy2
Copy link

Hi.
We just started testing the Gorush, and faced with a weird issue: sometimes it works well and sends about 5.000 pushes.
Then it drops the speed to the about 50-80 messages/sec.
Any ideas why this can happen? Logs are clear (only the BadDevideToken happens, but that's from Apple and seems to be a "valid error").
The same history using any redis or memory backends.

Screenshot_20200221_162705

@yaronius
Copy link
Contributor

yaronius commented Feb 24, 2020

Might be related to sideshow/apns2#163

@yaronius
Copy link
Contributor

Ok, got the same issue, will provide just a bit more details. Maybe I am doing something wrong?

I installed gorush with Homebrew as stated in docs. Then I use the following config to run it locally (only for iOS pushes):

core:
  enabled: true
  mode: "release"
  port: "8088"
  max_notification: 100
  sync: false
api:
  push_uri: "/api/push"
  stat_go_uri: "/api/stat/go"
  stat_app_uri: "/api/stat/app"
  config_uri: "/api/config"
  sys_stat_uri: "/sys/stats"
  metric_uri: "/metrics"
  health_uri: "/healthz"
android:
  enabled: false
  apikey: "YOUR_API_KEY"
ios:
  enabled: true
  key_path: "apns-cert.p12"
  password: "my-secret-pass"
  production: false
log:
  format: "string" # string or json
  access_log: "stdout" # stdout: output to console, or define log path like "log/access_log"
  access_level: "debug"
  error_log: "stderr" # stderr: output to console, or define log path like "log/error_log"
  error_level: "debug"
  hide_token: false
stat:
  engine: "memory" # support memory, redis, boltdb, buntdb or leveldb

I have a script that sends notifications locally to my iPhone, I use it to benchmark throughput. Here it is:

#!/bin/sh

NUM_TOKENS=999

token="my-iphone-token-here"
tokens="\"$token\""
for i in $(seq 1 $NUM_TOKENS); do
    tokens+=",\"$token\""
done

NUM_NOTIFS=10
for i in $(seq 1 $NUM_NOTIFS); do
    curl --location --request POST 'http://localhost:8088/api/push' \
--header 'Content-Type: application/json' \
--data-raw '{
  "notifications": [
    {
      "topic": "my.app.bundleID",
      "tokens": ['$tokens'],
      "platform": 1,
      "message": "Hello, iOS!",
      "development": true
    }
  ]
}';
done

I get successful response when running it. Like the following:

{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}
{"counts":1000,"logs":[],"success":"ok"}

And when tracking the sending progress with both the logs and the stats API (http://localhost:8088/api/stat/app) I see that I have total_count equal to 10000 and push_success increases very slowly, somewhere about 50 per second. I have a fast network connection and also tried this on EC2 - results are pretty much the same.

Am I doing it wrong? Am I expecting too much of a throughput? What are the best practices for using Gorush for higher throughput?

@yaronius
Copy link
Contributor

yaronius commented Feb 25, 2020

And I got it. The reason for slow performance is very simple and obvious. The problem is in this code:

	for _, token := range req.Tokens {
		notification.DeviceToken = token

		// send ios notification
		res, err := client.Push(notification)

Actually, client.Push() is a time-consuming operation that runs synchronously. When I extract this call into a goroutine, I have an amazing performance boost (thousands pushes per second).
What really helped me to find the issue is this example from sideshow/apns2 library. Probably, a fix for this should be considered.

@appleboy appleboy self-assigned this Feb 29, 2020
@appleboy
Copy link
Owner

I will take it.

@appleboy
Copy link
Owner

Original code

var wg sync.WaitGroup
for i := 0; i < int(100000); i++ {
  client := <-clients // grab a client from the pool
  clients <- client // add the client back to the pool
  wg.Add(1)
  go func() {
    // grab a notification from your channel filled with notifications
    notification := <-notifications
    res, err := client.Push(notification)
    // handle response or error
    wg.Done()
  }()
}
wg.Wait()

should be changed to

var wg sync.WaitGroup
for i := 0; i < int(100000); i++ {
  client := <-clients // grab a client from the pool
- clients <- client // add the client back to the pool
  wg.Add(1)
  go func() {
    // grab a notification from your channel filled with notifications
    notification := <-notifications
    res, err := client.Push(notification)
    // handle response or error
    wg.Done()
+	clients <- client // add the client back to the pool
  }()
}
wg.Wait()

@McPo
Copy link
Contributor

McPo commented Apr 20, 2020

Has this been resolved? As we're seeing very slow throughput in some instances (3 per second, while providing a multiple tokens at once)

@yaronius
Copy link
Contributor

@McPo AFAIK no, because that code is not trivial to fix. I forked the repo and rewrote that part for my needs, but sacrificed some functionality (in particular, retrying on failure).

@McPo
Copy link
Contributor

McPo commented Apr 21, 2020

@yaronius Thanks for the response. Ive made a PR, its quite straightforward though, which has me suspicious that Im missing something (Im not a golang developer).

Ive tested it in sync mode (and it correctly returns logs) and have also tested retries, it all seems to work fine (albeit I have a very limited set of iOS device to test this with, so Im sending many multiples to the same one)

Would appreciate any feedback, or highlighting anything that Ive missed.

Cheers,
Emmet

@yaronius
Copy link
Contributor

@McPo thanks for the initiative. You are right, your PR looks quite straightforward 👍I was pretty much obsessed with trying to refactor the complexity of that loop and goto statement, which definitely made me blind to a simple solution 😄

@appleboy
Copy link
Owner

closed via #497

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants