This repository has been archived by the owner on Aug 21, 2023. It is now read-only.

Dumpling hangs on Aurora backup #181

Closed
mightyguava opened this issue Nov 2, 2020 · 1 comment · Fixed by #190
Labels
type/bug This issue is a bug

Comments

@mightyguava

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.

Ran a dumpling backup against AWS RDS Aurora.

  2. What did you expect to see?

The dumpling backup exits successfully.

  3. What did you see instead?

From the dumpling side, there is 0 CPU utilization and no disk activity, yet dumpling does not terminate.

From the Aurora side, it has finished sending data. From the metrics, it looks like no data had been sent from Aurora to dumpling for several hours.

mysql> show full processlist;
+------+----------------+---------------------+------------+---------+-------+----------------------+-----------------------+
| Id   | User           | Host                | db         | Command | Time  | State                | Info                  |
+------+----------------+---------------------+------------+---------+-------+----------------------+-----------------------+
|    5 | rdsadmin       | localhost           | NULL       | Sleep   |     1 | delayed send ok done | NULL                  |
|    6 | rdsadmin       | localhost           | NULL       | Sleep   |     2 | cleaned up           | NULL                  |
|   59 | rdsadmin       | localhost           | NULL       | Sleep   |   403 | delayed send ok done | NULL                  |
| 3318 | root           | 10.137.3.129:34252  | NULL       | Sleep   | 19416 | delayed send ok done | NULL                  |
| 3319 | root           | 10.137.3.129:34276  | NULL       | Sleep   |  2885 | cleaned up           | NULL                  |
| 3320 | root           | 10.137.3.129:34290  | NULL       | Sleep   | 16441 | cleaned up           | NULL                  |
| 3321 | root           | 10.137.3.129:34298  | NULL       | Sleep   |  2740 | cleaned up           | NULL                  |
| 3322 | root           | 10.137.3.129:34300  | NULL       | Sleep   | 17061 | cleaned up           | NULL                  |
| 3323 | root           | 10.137.3.129:34308  | NULL       | Sleep   | 16443 | cleaned up           | NULL                  |
| 3324 | root           | 10.137.3.129:34310  | NULL       | Sleep   | 16345 | cleaned up           | NULL                  |
| 3325 | root           | 10.137.3.129:34312  | NULL       | Sleep   | 16452 | cleaned up           | NULL                  |
| 3326 | root           | 10.137.3.129:34316  | NULL       | Sleep   |  2743 | cleaned up           | NULL                  |
| 3327 | root           | 10.137.3.129:34318  | NULL       | Sleep   | 16397 | cleaned up           | NULL                  |
| 3328 | root           | 10.137.3.129:34320  | NULL       | Sleep   | 16431 | cleaned up           | NULL                  |
| 3329 | root           | 10.137.3.129:34324  | NULL       | Sleep   | 16432 | cleaned up           | NULL                  |
| 3330 | root           | 10.137.3.129:34326  | NULL       | Sleep   | 16456 | cleaned up           | NULL                  |
| 3331 | root           | 10.137.3.129:34328  | NULL       | Sleep   | 16455 | cleaned up           | NULL                  |
| 3332 | root           | 10.137.3.129:34330  | NULL       | Sleep   | 16441 | cleaned up           | NULL                  |
| 3333 | root           | 10.137.3.129:34332  | NULL       | Sleep   |  2726 | cleaned up           | NULL                  |
| 3334 | root           | 10.137.3.129:34336  | NULL       | Sleep   | 17065 | cleaned up           | NULL                  |
| 3335 | root           | 10.137.3.129:34338  | NULL       | Sleep   |  2788 | cleaned up           | NULL                  |
| 3336 | root           | 10.137.3.129:34344  | NULL       | Sleep   | 16396 | cleaned up           | NULL                  |
| 3337 | root           | 10.137.3.129:34346  | NULL       | Sleep   | 16352 | cleaned up           | NULL                  |
| 3338 | root           | 10.137.3.129:34348  | NULL       | Sleep   |  2822 | cleaned up           | NULL                  |
| 3339 | root           | 10.137.3.129:34350  | NULL       | Sleep   |  2885 | cleaned up           | NULL                  |
| 3340 | root           | 10.137.3.129:34352  | NULL       | Sleep   |  2702 | cleaned up           | NULL                  |
| 3341 | root           | 10.137.3.129:34354  | NULL       | Sleep   | 16351 | cleaned up           | NULL                  |
| 3342 | root           | 10.137.3.129:34356  | NULL       | Sleep   |  2884 | cleaned up           | NULL                  |
| 3343 | root           | 10.137.3.129:34358  | NULL       | Sleep   | 16351 | cleaned up           | NULL                  |
| 3344 | root           | 10.137.3.129:34362  | NULL       | Sleep   | 16460 | cleaned up           | NULL                  |
| 3345 | root           | 10.137.3.129:34364  | NULL       | Sleep   | 16414 | cleaned up           | NULL                  |
| 3346 | root           | 10.137.3.129:34368  | NULL       | Sleep   | 16442 | cleaned up           | NULL                  |
| 3347 | root           | 10.137.3.129:34372  | NULL       | Sleep   |  4628 | cleaned up           | NULL                  |
| 3348 | root           | 10.137.3.129:34374  | NULL       | Sleep   |  2885 | cleaned up           | NULL                  |
| 3349 | root           | 10.137.3.129:34376  | NULL       | Sleep   |  2618 | cleaned up           | NULL                  |
| 3350 | root           | 10.137.3.129:34378  | NULL       | Sleep   |  2681 | cleaned up           | NULL                  |
| 3351 | root           | 10.137.3.129:34380  | NULL       | Sleep   |  2607 | cleaned up           | NULL                  |
| 3352 | root           | 10.137.3.129:34382  | NULL       | Sleep   | 16450 | cleaned up           | NULL                  |
| 3353 | root           | 10.137.3.129:34384  | NULL       | Sleep   | 16447 | cleaned up           | NULL                  |
| 3354 | root           | 10.137.3.129:34386  | NULL       | Sleep   | 17072 | cleaned up           | NULL                  |
| 3355 | root           | 10.137.3.129:34390  | NULL       | Sleep   |  2885 | cleaned up           | NULL                  |
| 3356 | root           | 10.137.3.129:34392  | NULL       | Sleep   | 16358 | cleaned up           | NULL                  |
| 3357 | root           | 10.137.3.129:34394  | NULL       | Sleep   | 16340 | cleaned up           | NULL                  |
| 3358 | root           | 10.137.3.129:34398  | NULL       | Sleep   |  2884 | cleaned up           | NULL                  |
| 3359 | root           | 10.137.3.129:34402  | NULL       | Sleep   |  2701 | cleaned up           | NULL                  |
| 3360 | root           | 10.137.3.129:34404  | NULL       | Sleep   | 16396 | cleaned up           | NULL                  |
| 3361 | root           | 10.137.3.129:34406  | NULL       | Sleep   |  2684 | cleaned up           | NULL                  |
| 3362 | root           | 10.137.3.129:34408  | NULL       | Sleep   | 16449 | cleaned up           | NULL                  |
| 3363 | root           | 10.137.3.129:34410  | NULL       | Sleep   |  2706 | cleaned up           | NULL                  |
| 3364 | root           | 10.137.3.129:34414  | NULL       | Sleep   |  2608 | cleaned up           | NULL                  |
| 3365 | root           | 10.137.3.129:34416  | NULL       | Sleep   |  2717 | cleaned up           | NULL                  |
| 3366 | root           | 10.137.3.129:34418  | NULL       | Sleep   | 16340 | cleaned up           | NULL                  |
| 3367 | root           | 10.137.3.129:34420  | NULL       | Sleep   | 16429 | cleaned up           | NULL                  |
| 3368 | root           | 10.137.3.129:34422  | NULL       | Sleep   | 16413 | cleaned up           | NULL                  |
| 3369 | root           | 10.137.3.129:34426  | NULL       | Sleep   |  2709 | cleaned up           | NULL                  |
| 3370 | root           | 10.137.3.129:34430  | NULL       | Sleep   |  2614 | cleaned up           | NULL                  |
| 3371 | root           | 10.137.3.129:34432  | NULL       | Sleep   | 16411 | cleaned up           | NULL                  |
| 3372 | root           | 10.137.3.129:34434  | NULL       | Sleep   | 16446 | cleaned up           | NULL                  |
| 3373 | root           | 10.137.3.129:34436  | NULL       | Sleep   | 16409 | cleaned up           | NULL                  |
| 3374 | root           | 10.137.3.129:34440  | NULL       | Sleep   | 16397 | cleaned up           | NULL                  |
| 3375 | root           | 10.137.3.129:34442  | NULL       | Sleep   | 16396 | cleaned up           | NULL                  |
| 3376 | root           | 10.137.3.129:34446  | NULL       | Sleep   | 16418 | cleaned up           | NULL                  |
| 3377 | root           | 10.137.3.129:34448  | NULL       | Sleep   | 16421 | cleaned up           | NULL                  |
| 3378 | root           | 10.137.3.129:34450  | NULL       | Sleep   | 16397 | cleaned up           | NULL                  |
| 3379 | root           | 10.137.3.129:34452  | NULL       | Sleep   |  2727 | cleaned up           | NULL                  |
| 3380 | root           | 10.137.3.129:34454  | NULL       | Sleep   | 16341 | cleaned up           | NULL                  |
| 3381 | root           | 10.137.3.129:34456  | NULL       | Sleep   |  2600 | cleaned up           | NULL                  |
| 3382 | root           | 10.137.3.129:34458  | NULL       | Sleep   |  2881 | cleaned up           | NULL                  |
| 3706 | rdsadmin       | localhost           | NULL       | Sleep   |     0 | cleaned up           | NULL                  |
| 3707 | newswriter_adm | 10.138.66.152:41488 | newswriter | Query   |     0 | starting             | show full processlist |
+------+----------------+---------------------+------------+---------+-------+----------------------+-----------------------+
  4. Versions of the cluster
  • Dumpling version (run dumpling -V):
Release version: v4.0.7-9-gb84f64f
Git commit hash: b84f64ff362cedcb795aa23fa1188ba7b7c9a7d7
Git branch:      master
Build timestamp: 2020-10-27 04:21:05Z
Go version:      go version go1.15.3 linux/amd64
  • Source database version (execute SELECT version(); in a MySQL client):
mysql> SELECT version();
+------------+
| version()  |
+------------+
| 5.7.12-log |
+------------+
1 row in set (0.00 sec)
  5. Operation logs

Dumpling logs

Release version: v4.0.7-9-gb84f64f
Git commit hash: b84f64ff362cedcb795aa23fa1188ba7b7c9a7d7
Git branch: master
Build timestamp: 2020-10-27 04:21:05Z
Go version: go version go1.15.3 linux/amd64

[2020/10/30 15:43:05.700 +00:00] [INFO] [config.go:180] ["detect server type"] [type=MySQL]
[2020/10/30 15:43:05.700 +00:00] [INFO] [config.go:198] ["detect server version"] [version=5.7.12-log]
[2020/10/30 15:43:07.988 +00:00] [INFO] [ir_impl.go:208] ["get estimated rows count"] [estimateCount=2918441068]
[2020/10/30 20:17:36.882 +00:00] [INFO] [ir_impl.go:208] ["get estimated rows count"] [estimateCount=61042457]
[2020/10/30 20:18:37.658 +00:00] [INFO] [ir_impl.go:208] ["get estimated rows count"] [estimateCount=425612999]

Dumpling goroutines goroutine.txt
Dumpling goroutines (binary dump) goroutine.bin.zip

  6. Configuration of the cluster and the task
            - "--ca=/etc/rds-tls/rds-ca-2019-root.pem"
            - "--password=${PASSWORD}"
            - "--allow-cleartext-passwords"
            # FLUSH TABLES WITH READ LOCK is not allowed on Aurora
            - "--consistency=lock"
            - "-o=/backup/${JOB}"
            - "--database=${DATABASE}"
            - "--tables-list=${TABLES}"
            - "--rows=200000"
            - "--threads=64"
            - "--user=${DB_USER}"
            - "--host=${DB_HOST}"
            - "--port=3306"
            - "--filetype=csv"
@lichunzhu
Contributor

lichunzhu commented Nov 6, 2020

@mightyguava dumpling could get blocked here because it did not previously set readTimeout in the DSN. #190 should fix this problem. I tested it locally by randomly killing dumpling's MySQL socket connections, and dumpling no longer gets blocked.
