AOF rewrite should call fdatasync per second to avoid blocking the main thread's write(2) #1019
Thanks for your report. Do you mean calling fdatasync in the child process that writes the AOF, so that it avoids accumulating buffers that are too big to flush to disk all at once? We did something like that in replication during the RDB transfer, with good results, so this probably makes sense for AOF as well.
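(In case a sketch helps: the idea would look roughly like the snippet below, in the rewrite child's write path. This is a minimal illustration under my own assumptions; the 32 MB interval and the aof_write_chunk helper are made up for the example, not actual Redis code.)

```c
#include <stdio.h>
#include <unistd.h>

#define REWRITE_AUTOSYNC_BYTES (32*1024*1024) /* assumed 32 MB interval */

/* Sketch: in the rewrite child, fdatasync() every fixed number of bytes
 * so the kernel never accumulates gigabytes of dirty pages to flush at once. */
static size_t written_since_sync = 0;

static int aof_write_chunk(FILE *fp, const void *buf, size_t len) {
    if (fwrite(buf, len, 1, fp) != 1) return -1; /* short write: error out */
    written_since_sync += len;
    if (written_since_sync >= REWRITE_AUTOSYNC_BYTES) {
        fflush(fp);             /* push stdio's buffer into the kernel */
        fdatasync(fileno(fp));  /* force the kernel to write to disk */
        written_since_sync = 0;
    }
    return 0;
}
```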
Yes, and when I change the parameter to 32M there is no busy log and no blocking any more :) 100.00% <= 317 milliseconds. If fdatasync is only called once per second, about 100M accumulates on my test machine, and that still causes 2-4 seconds of blocking.
That's cool. However, did you check whether the time to produce the AOF file changed in a measurable way? Thanks.
I am running "redis-benchmark -r 100000008 -n 2500000 -d 256 -t SET" when AOF rewrite Use 1 minutes to write 8.6G AOF file, when vm.dirty_bytes=33554432 (32M) vm.dirty_background_bytes=33554432(32M) [3106] 21 Mar 17:38:01.059 * Background append only file rewriting started by pid 3292 After back to the default value of RedHat Enterprise 6, vm.dirty_ratio = 20, vm.dirty_background_ratio = 10. (6.4G in my 64G machine), use 1.5 minutes to write the same file. [3106] 21 Mar 17:42:02.811 * Background append only file rewriting started by pid 3323 and redis-benchmark report: 100.00% <= 37420 milliseconds |
Very interesting, thanks! I'll provide a branch with an explicit fdatasync() call to see if the results are comparable to what you get by modifying kernel parameters. Thanks for your help, it is appreciated.
Looking forward to it, I will test it for you. Thanks for your extremely quick reply.
This prevents the kernel from putting too much stuff in the output buffers, doing too heavy I/O all at once. So the goal of this commit is to split the disk pressure due to the AOF rewrite process into smaller spikes. Please see issue #1019 for more information.
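(Conceptually, the change this commit message describes pushes the periodic sync down into the rio file backend. Below is my reconstruction from the commit message alone, with an `autosync` byte threshold and a `buffered` counter; the real struct and function have more fields and details.)

```c
#include <stdio.h>
#include <unistd.h>

typedef struct rio {
    union {
        struct {
            FILE *fp;
            size_t buffered;  /* bytes written since the last sync */
            size_t autosync;  /* fsync after this many bytes; 0 = never */
        } file;
    } io;
} rio;

/* Once `autosync` bytes are buffered, flush and fdatasync, turning one
 * huge end-of-rewrite flush into many small, cheap ones. */
static size_t rioFileWrite(rio *r, const void *buf, size_t len) {
    if (fwrite(buf, len, 1, r->io.file.fp) != 1) return 0;
    r->io.file.buffered += len;
    if (r->io.file.autosync && r->io.file.buffered >= r->io.file.autosync) {
        fflush(r->io.file.fp);
        fdatasync(fileno(r->io.file.fp));
        r->io.file.buffered = 0;
    }
    return 1;
}
```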
Dear @springside, I just implemented what was discussed here in the issue-1090 branch. Could you please verify that the good effects are still there now that we moved the idea inside Redis instead of modifying the kernel parameters? :-) The branch is a fork of [...] Thank you in advance!
Sorry I was wrong... cherry-pick is not possible so I'm creating a 2.6-issue-1090 branch right now.
done :-)
Thank you, but I am enjoying my vacation, and the test machine can only be reached from the office network.
No rush! 7 April is perfect :-) Have a great vacation!
Hi @antirez, the problem is gone.
Hi @antirez, when will the code be merged into the 2.6/2.8 branches?
Right now :-)
Merged everywhere, now adding support to enable/disable it in redis.conf. By default this will be enabled. Thank you for your help!
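(For later readers: if I remember correctly, the knob that eventually shipped in redis.conf is the one below; please verify the exact name and default against your Redis version's redis.conf.)

```
# When a child rewrites the AOF file, if the option below is enabled the
# file is fsync-ed every 32 MB of data generated, instead of everything
# being committed to disk at once at the end of the rewrite.
aof-rewrite-incremental-fsync yes
```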
Feature made configurable and committed, so now we are set for a release... but I have another pending thing in Redis Sentinel that I need to address before committing.
Looking at the fix it seems like when writing more than 32MB (for example a 100MB key) we'll only get a single fsync. Am I missing something?
@yoav-steinberg Hi. You should add bytes_written to r->processed_bytes before returning from rioWrite.
@charsyam, processed_bytes is updated in the rio layer (not specific to files). This is another change in the GarantiaData fork. You can find it here: https://github.com/GarantiaData/redis/commit/3aab06bcc7460b46c96fbb8d3b704ab1227c6be4
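(To illustrate that point with a schematic reconstruction, not the exact fork code: the generic rioWrite() wrapper is where byte accounting happens, which is why processed_bytes is not touched inside rioFileWrite() itself.)

```c
#include <stddef.h>

/* Schematic rio with a backend write function pointer and the
 * layer-level byte counter discussed above. */
typedef struct _rio rio;
struct _rio {
    size_t (*write)(rio *, const void *, size_t);
    size_t processed_bytes; /* accounted here, once, for every backend */
};

/* The generic wrapper delegates to the backend (e.g. the file backend's
 * rioFileWrite) and then updates processed_bytes at the rio layer. */
static inline size_t rioWrite(rio *r, const void *buf, size_t len) {
    if (r->write(r, buf, len) == 0) return 0;
    r->processed_bytes += len;
    return 1;
}
```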
@yoav-steinberg Hi, I think I found another bug in your code (a trivial one): if bytes_written becomes larger than io.file.fsync_interval and len still remains, fsync is called more than once per interval:

```c
static size_t rioFileWrite(rio *r, const void *buf, size_t len) {
    size_t bytes_written = 0;
    while (len) {
        size_t bytes_to_write = (r->io.file.fsync_interval && r->io.file.fsync_interval < len) ? r->io.file.fsync_interval : len;
        if (fwrite((char*)buf + bytes_written,bytes_to_write,1,r->io.file.fp) != 1)
            return 0;
        bytes_written += bytes_to_write;
        len -= bytes_to_write;
        // There is a possibility of calling fsync many times (not intended):
        // processed_bytes is not updated inside this loop, so once
        // bytes_written crosses one fsync_interval boundary this comparison
        // stays true and fsync fires on every remaining iteration.
        if (r->io.file.fsync_interval && r->processed_bytes/r->io.file.fsync_interval < (r->processed_bytes + bytes_written)/r->io.file.fsync_interval)
            fsync(fileno(r->io.file.fp));
    }
    return 1;
}
```
@charsyam, keen eye! And sorry to keep posting these links :) Anyway, I fixed that bug shortly after, see: https://github.com/GarantiaData/redis/commit/785cf9297733e32110daed8ca8b16169ea4288b5
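(For readers who don't want to chase the link, the fix amounts to something like the sketch below: count the bytes written since the last fsync instead of re-deriving the boundary from a processed_bytes value that never changes inside the loop. The `unsynced` field name is mine, for illustration; the linked commit may structure it differently.)

```c
// Sketch of the corrected loop: exactly one fsync per fsync_interval
// bytes, no matter how large a single rioFileWrite() call is.
static size_t rioFileWrite(rio *r, const void *buf, size_t len) {
    size_t bytes_written = 0;
    while (len) {
        size_t chunk = (r->io.file.fsync_interval && r->io.file.fsync_interval < len)
                        ? r->io.file.fsync_interval : len;
        if (fwrite((char*)buf + bytes_written, chunk, 1, r->io.file.fp) != 1)
            return 0;
        bytes_written += chunk;
        len -= chunk;
        r->io.file.unsynced += chunk;          // hypothetical running counter
        if (r->io.file.fsync_interval &&
            r->io.file.unsynced >= r->io.file.fsync_interval) {
            fflush(r->io.file.fp);             // drain the stdio buffer first
            fsync(fileno(r->io.file.fp));
            r->io.file.unsynced = 0;           // start the next interval clean
        }
    }
    return 1;
}
```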
On my test machine with 64G of memory and a 100MB/s hard disk, when a 10G AOF file is rewritten, the main thread's write(2) is blocked for 7 to 14 seconds, and the whole Redis instance is blocked with it.

The log prints "Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis." time and time again.

I changed the system parameter to make the kernel flush to disk more often, avoiding flushing too much data at one time:

sysctl vm.dirty_bytes=33554432 (32M)

I think it would be good to modify the AOF rewrite code to call fdatasync directly.