realloc calling free #447

Closed
rgowrishankar opened this issue Jul 20, 2016 · 14 comments

@rgowrishankar

rgowrishankar commented Jul 20, 2016

In my program, I am seeing an intermittent crash:

#0  0x00007fc34859b5f7 in raise () from /lib64/libc.so.6
#1  0x00007fc34859cce8 in abort () from /lib64/libc.so.6
#2  0x00007fc3485db317 in __libc_message () from /lib64/libc.so.6
#3  0x00007fc3485e3023 in _int_free () from /lib64/libc.so.6
#4  0x00007fc3485e51b0 in _int_realloc () from /lib64/libc.so.6
#5  0x00007fc3485e5f22 in realloc () from /lib64/libc.so.6
#6  0x00007fc34a56d79d in sdsMakeRoomFor () from /lib64/libhiredis.so.0.12
#7  0x00007fc34a56d90a in sdscatlen () from /lib64/libhiredis.so.0.12
#8  0x00007fc34a56d2a0 in __redisAppendCommand () from /lib64/libhiredis.so.0.12
#9  0x00007fc34a56fac2 in __redisAsyncCommand () from /lib64/libhiredis.so.0.12
#10 0x00007fc34a570498 in redisvAsyncCommand () from /lib64/libhiredis.so.0.12
#11 0x00007fc34a57055a in redisAsyncCommand () from /lib64/libhiredis.so.0.12

From the realloc documentation:
if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr)
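For illustration, a minimal standalone snippet of the quoted behavior (nothing hiredis-specific, just the documented allocator semantics):

    #include <stdlib.h>

    int main(void) {
        char *p = malloc(32);
        /* Per the documentation quoted above: with size == 0 and a non-NULL
         * pointer, this call is equivalent to free(p). glibc returns NULL
         * here; other implementations may return a minimal allocation. */
        void *q = realloc(p, 0);
        free(q);   /* free(NULL) is a no-op, so this is safe either way */
        return 0;
    }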

I am sending in a string like "hmset aa bb cc" (though longer). I will try to write a smaller program that can reproduce it, but I am not sure if I will be able to reproduce it consistently. I am running with version 0.12 on the CentOS 7 platform. I could not find a debug package for CentOS. If you can give me instructions on creating debug symbols, I can load the coredump into GDB and gather more information.
Thanks,

Ramkumar

@michael-grunder
Collaborator

Hey,

I'm not sure where you would get a debug package for hiredis on CentOS but it should be really easy to just build from source.

Just download the source and build it with a make noopt command. Once you've got it built, you should have the .so and .a files, and you can link directly against them from within your program:

cc [your .c files] /full/path/to/libhiredis.a
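Roughly (assuming you grab the source from the GitHub repo):

git clone https://github.com/redis/hiredis.git
cd hiredis
make noopt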

Or if you can create a small program where you can consistently reproduce the problem, just post it here and I can run it in a CentOS VM.

Cheers
Mike

@rgowrishankar
Author

Is there a way to load the symbols in gdb to look at an existing coredump?

@michael-grunder
Collaborator

I've never really tried to do this, but it looks like you can, yes:

First build hiredis (I think you'll need to keep the same optimization setting for the build).

Then use objcopy to get just the symbols:

objcopy --only-keep-symbols libhiredis.so hiredis.symbols

Finally start gdb like so:

gdb --core <your.core.file> --symbols=hiredis.symbols

This worked for me in a CentOS VM, but I only tried it with a small test program, not against libhiredis itself.

@badboy
Contributor

badboy commented Jul 21, 2016

I'm not sure what exactly is failing there. realloc will also call free on the given pointer if it has to re-allocate and move the data (see the sketch below the list).

I only see two ways this can lead to a failure:

  1. An already-broken pointer is passed, making it fail in completely unpredictable ways.
  2. The pointer is concurrently modified from somewhere else, making it a data race.
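As a standalone illustration of the first point (nothing hiredis-specific here), realloc ends up inside free whenever the block has to move, so a broken pointer can abort with a backtrace much like the one above:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        char *buf = malloc(16);
        if (!buf) return 1;
        strcpy(buf, "hello");

        /* Growing the block may force realloc to allocate elsewhere, copy
         * the data over, and free(buf) internally. */
        char *grown = realloc(buf, 1 << 20);
        if (!grown) { free(buf); return 1; }

        if (grown != buf)
            printf("block moved: the old pointer was freed by realloc\n");

        /* Using (or freeing) buf after a move is undefined behavior; if buf
         * were already corrupted, the internal free would abort. */
        free(grown);
        return 0;
    }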

@rgowrishankar
Author

rgowrishankar commented Jul 22, 2016

I allocate a new char*, set it to the command that needs to be sent to redis, and pass it to redisAsyncCommand as both the privdata and the format string. For example:
char *message; asprintf(&message, "command"); redisAsyncCommand(ac, process_callback, message, message);
and I free message in process_callback.
What is the preferred way of calling redisAsyncCommand?
As a side note, if I use a char message[100] array instead of a dynamically allocated char* to pass the message to redisAsyncCommand, it should still work, right?
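For reference, a compilable sketch of the pattern I described above (process_callback and send_command are simplified placeholder names, error handling is trimmed, and the context still has to be attached to an event loop before anything is sent):

    #define _GNU_SOURCE           /* for asprintf */
    #include <stdio.h>
    #include <stdlib.h>
    #include <hiredis/async.h>

    /* Reply callback: privdata is the same heap string that was used as the
     * command text, so it is freed here, exactly once, after the reply
     * arrives. The reply itself is owned and freed by hiredis. */
    static void process_callback(redisAsyncContext *ac, void *reply, void *privdata) {
        (void)ac;
        (void)reply;
        free(privdata);
    }

    int send_command(redisAsyncContext *ac, const char *field, const char *value) {
        char *message = NULL;
        if (asprintf(&message, "HMSET aa %s %s", field, value) < 0)
            return REDIS_ERR;
        /* The same pointer is passed as privdata and as the format string.
         * Caveat: hiredis treats the format argument printf-style, so the
         * data must not contain stray '%' characters. */
        return redisAsyncCommand(ac, process_callback, message, message);
    }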

Looking at the code, in __redisAppendCommand(redisContext *c, const char *cmd, size_t len):
the redisContext is passed to redisAsyncCommand by my code, cmd is built inside redisvAsyncCommand via a call to redisvFormatCommand, and len is also returned by redisvFormatCommand. I don't think the message I pass into redisAsyncCommand is causing any corruption. I will try to write a small program that does redis transactions to try to reproduce the issue.

@michael-grunder
Collaborator

Also run the process through valgrind if you haven't yet; that might give you the hints you need.
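Something like this, adjusting the binary path:

valgrind --leak-check=full --track-origins=yes ./your_program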

And if you can post a small program that reproduces the problem, I'm happy to help you track it down.

@rgowrishankar
Author

I ran my original program with valgrind and I see the following with version 0.12:
==00:00:31:47.008 4575== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:31:47.008 4575== by 0x4E3A647: sdsnewlen (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.008 4575== by 0x4E3A0C4: redisBufferWrite (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.008 4575== by 0x4E3D3E5: redisAsyncHandleWrite (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.008 4575== by 0x591A48: redis_source_dispatch (glib.h:95)
==00:00:31:47.008 4575== by 0x508B999: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4000.0)

==00:00:31:47.020 4575== 82 bytes in 3 blocks are possibly lost in loss record 1,357 of 2,306
==00:00:31:47.020 4575== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:31:47.020 4575== by 0x4E3A647: sdsnewlen (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.020 4575== by 0x4E3CBB4: ??? (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.020 4575== by 0x4E3D497: redisvAsyncCommand (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.020 4575== by 0x4E3D559: redisAsyncCommand (in /usr/lib64/libhiredis.so.0.12)

==00:00:31:47.061 4575== 2,390 bytes in 2 blocks are possibly lost in loss record 2,179 of 2,306
==00:00:31:47.061 4575== at 0x4C2BB4A: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:31:47.061 4575== by 0x4E3A79C: sdsMakeRoomFor (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.061 4575== by 0x4E3A909: sdscatlen (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.061 4575== by 0x4E3DA3B: redisReaderFeed (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.061 4575== by 0x4E39F4C: redisBufferRead (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.061 4575== by 0x4E3D336: redisAsyncHandleRead (in /usr/lib64/libhiredis.so.0.12)
==00:00:31:47.061 4575== by 0x591A60: redis_source_dispatch (glib.h:100)

@rgowrishankar
Author

A different crash that I am seeing has the following backtrace
, "stacktrace":
[ { "crash_thread": true
, "frames":
[ { "address": 140439883885772
, "build_id": "e36c6455b21c1cb68020709a8d5466dcfd2d47f2"
, "build_id_offset": 523468
, "function_name": "__libc_free"
, "file_name": "/lib64/libc.so.6"
}
, { "address": 140439916953504
, "build_id": "cdaee6b30ac02a9c7e4538e32d46094836a90e61"
, "build_id_offset": 28576
, "function_name": "_dictClear"
, "file_name": "/lib64/libhiredis.so.0.12"
}
, { "address": 140439916954293
, "build_id": "cdaee6b30ac02a9c7e4538e32d46094836a90e61"
, "build_id_offset": 29365
, "function_name": "__redisAsyncFree"
, "file_name": "/lib64/libhiredis.so.0.12"
}
, { "address": 5840993
, "build_id": "905f073aa6729f53933b1d05d2e6ecc74f934c22"
, "build_id_offset": 1646689
, "function_name": "redis_source_dispatch"
, "file_name": "/usr/sbin/ovs-vswitchd"
}
, { "address": 140439913871770
, "build_id": "659986b68d0e9d920a9e693546262250b895dca8"
, "build_id_offset": 301466
, "function_name": "g_main_context_dispatch"
, "file_name": "/lib64/libglib-2.0.so.0"

@michael-grunder
Collaborator

Hey,

The second error looks like maybe you are double freeing the redisAsyncContext. Are you perhaps disconnecting, and then making an explicit call to redisAsyncFree? Hiredis will free your context for you when you call redisAsyncDisconnect.

I'm just guessing at this point because there isn't really enough to go on without an example program to replicate the error.
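To illustrate (a rough sketch; setup and shutdown_connection are placeholder names): pick one teardown path, not both.

    #include <stdio.h>
    #include <hiredis/async.h>

    /* Called by hiredis right before it frees the context itself. */
    static void on_disconnect(const redisAsyncContext *ac, int status) {
        if (status != REDIS_OK)
            fprintf(stderr, "disconnected with error: %s\n", ac->errstr);
        /* Don't free ac here; just drop any stored pointer to it. */
    }

    void setup(redisAsyncContext *ac) {
        redisAsyncSetDisconnectCallback(ac, on_disconnect);
    }

    void shutdown_connection(redisAsyncContext *ac) {
        /* Graceful path: pending replies are delivered, then hiredis frees
         * the context on its own. */
        redisAsyncDisconnect(ac);

        /* The other path would be redisAsyncFree(ac), which frees the
         * context immediately. Calling both is a double free. */
    }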

@rgowrishankar
Author

Hi Michael,

Thanks for the reply. The new crash is not happening during cleanup; it happens during normal operation. I am running a standalone program to try to reproduce the crash.

I do have a question about using hiredis in a multi-threaded program. I have multiple threads generating data that needs to go into redis, and they all call redisAsyncCommand on the same global redisAsyncContext. As long as the context is protected by a global lock, do you see any issues? Or should only one thread use a given redisAsyncContext?

Ram

@rgowrishankar
Author

memleak_hiredis.txt

I have attached a sample program that shows the memory leak.
To run it:
gcc memleak_hiredis.c -I /usr/include/hiredis -I /usr/include/glib-2.0/ -I /usr/lib64/glib-2.0/include/ -l hiredis -l glib-2.0 -l pthread -g -O0
valgrind --leak-check=yes ~/a.out

There are two glib contexts, one for doing async writes and the other for subscription.
The child thread opens two connections to redis and subscribes on one of them.
The main thread uses the other connection to send the data via async commands.
==5648==
==5648== HEAP SUMMARY:
==5648== in use at exit: 5,285 bytes in 58 blocks
==5648== total heap usage: 669 allocs, 611 frees, 26,361 bytes allocated
==5648==
==5648== 9 bytes in 1 blocks are possibly lost in loss record 7 of 58
==5648== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==5648== by 0x4E3AF2B: sdsnewlen (sds.c:55)
==5648== by 0x4E3AFC6: sdsempty (sds.c:71)
==5648== by 0x4E3A8E0: redisBufferWrite (hiredis.c:808)
==5648== by 0x4E3EC7D: redisAsyncHandleWrite (async.c:548)
==5648== by 0x40131D: redis_source_dispatch (glib.h:95)
==5648== by 0x508D7A9: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DAF7: ??? (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DBAB: g_main_context_iteration (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x4014F5: process_pending_events (memleak_hiredis.c:63)
==5648== by 0x401A82: redis_notifications_handler (memleak_hiredis.c:267)
==5648== by 0x5382DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==5648==
==5648== 9 bytes in 1 blocks are possibly lost in loss record 8 of 58
==5648== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==5648== by 0x4E3AF2B: sdsnewlen (sds.c:55)
==5648== by 0x4E3AFC6: sdsempty (sds.c:71)
==5648== by 0x4E3A8E0: redisBufferWrite (hiredis.c:808)
==5648== by 0x4E3EC7D: redisAsyncHandleWrite (async.c:548)
==5648== by 0x40131D: redis_source_dispatch (glib.h:95)
==5648== by 0x508D7A9: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DAF7: ??? (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DBAB: g_main_context_iteration (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x4014E1: process_pending_events (memleak_hiredis.c:62)
==5648== by 0x401A82: redis_notifications_handler (memleak_hiredis.c:267)
==5648== by 0x5382DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==5648==
==5648== 21 bytes in 1 blocks are possibly lost in loss record 17 of 58
==5648== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==5648== by 0x4E3AF2B: sdsnewlen (sds.c:55)
==5648== by 0x4E3EF2E: __redisAsyncCommand (async.c:611)
==5648== by 0x4E3F13C: redisvAsyncCommand (async.c:658)
==5648== by 0x4E3F204: redisAsyncCommand (async.c:667)
==5648== by 0x4016FD: init_redis_subscribe (memleak_hiredis.c:119)
==5648== by 0x4017FA: init_everything (memleak_hiredis.c:149)
==5648== by 0x401A64: redis_notifications_handler (memleak_hiredis.c:263)
==5648== by 0x5382DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==5648== by 0x568DCEC: clone (in /usr/lib64/libc-2.17.so)
==5648==
==5648== 135 bytes in 1 blocks are possibly lost in loss record 51 of 58
==5648== at 0x4C2BB4A: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==5648== by 0x4E3B170: sdsMakeRoomFor (sds.c:142)
==5648== by 0x4E3B3DB: sdscatlen (sds.c:241)
==5648== by 0x4E40545: redisReaderFeed (read.c:462)
==5648== by 0x4E3A794: redisBufferRead (hiredis.c:772)
==5648== by 0x4E3EBC7: redisAsyncHandleRead (async.c:526)
==5648== by 0x401354: redis_source_dispatch (glib.h:100)
==5648== by 0x508D7A9: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DAF7: ??? (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DBAB: g_main_context_iteration (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x4014E1: process_pending_events (memleak_hiredis.c:62)
==5648== by 0x401A82: redis_notifications_handler (memleak_hiredis.c:267)
==5648==
==5648== 1,223 bytes in 1 blocks are possibly lost in loss record 58 of 58
==5648== at 0x4C2BB4A: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==5648== by 0x4E3B170: sdsMakeRoomFor (sds.c:142)
==5648== by 0x4E3B3DB: sdscatlen (sds.c:241)
==5648== by 0x4E40545: redisReaderFeed (read.c:462)
==5648== by 0x4E3A794: redisBufferRead (hiredis.c:772)
==5648== by 0x4E3EBC7: redisAsyncHandleRead (async.c:526)
==5648== by 0x401354: redis_source_dispatch (glib.h:100)
==5648== by 0x508D7A9: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DAF7: ??? (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x508DBAB: g_main_context_iteration (in /usr/lib64/libglib-2.0.so.0.4200.2)
==5648== by 0x4014F5: process_pending_events (memleak_hiredis.c:63)
==5648== by 0x401A82: redis_notifications_handler (memleak_hiredis.c:267)
==5648==
==5648== LEAK SUMMARY:
==5648== definitely lost: 0 bytes in 0 blocks
==5648== indirectly lost: 0 bytes in 0 blocks
==5648== possibly lost: 1,397 bytes in 5 blocks
==5648== still reachable: 3,888 bytes in 53 blocks
==5648== suppressed: 0 bytes in 0 blocks
==5648== Reachable blocks (those to which a pointer was found) are not shown.
==5648== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5648==
==5648== For counts of detected and suppressed errors, rerun with: -v
==5648== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 1 from 1)

@badboy
Contributor

badboy commented Aug 27, 2016

All those "possibly lost" messages are likely false positives. The used string library, sds, does some fancy pointer shuffling to make it look like normal char pointers by pointing right after the actual header of the data, which means it's pointing somewhere in the middle of an allocated block. valgrind can't detect this.

As the example does not lead to a crash, I still have no idea what's causing the crash in your application.

Regarding a single globally-locked context used from multiple threads: I don't know glib's internals. It might be that it's not too happy to get write events from multiple threads, even if those events are mutually exclusive.

@jkarneges

jkarneges commented Dec 10, 2016

(EDIT: This may be a false alarm. I see in the code that hiredis grows a read buffer to 1024 bytes, and so in our small tests of sending a few commands it looked like a leak but maybe isn't. Leaving this comment here in case it helps another person.)

We're seeing similar "possibly lost" indications from valgrind, with a process that continues to grow in size (to hundreds of MB, until our process manager restarts it). It is a relatively small program so I think there is possibly a real leak here.

It's only possible to see the "possibly lost" blocks if we abort the process uncleanly. Normally our program cleans up on termination, and if we terminate cleanly then there are no lost blocks reported (not even possibly lost).

Any chance the realloc is repeatedly extending a buffer when it shouldn't need to? This could explain how a buffer could get longer and longer while running, yet still get freed on clean termination since there's only one pointer to manage it, as opposed to losing many pointers that still point to things.

==31462== 4,818 bytes in 16 blocks are possibly lost in loss record 419 of 438
==31462==    at 0x4C2FD5F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31462==    by 0x50A57FD: sdsMakeRoomFor (in /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13)
==31462==    by 0x50A5939: sdscatlen (in /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13)
==31462==    by 0x50A89AB: redisReaderFeed (in /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13)
==31462==    by 0x50A4FAC: redisBufferRead (in /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13)
==31462==    by 0x50A8296: redisAsyncHandleRead (in /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13)

I should note this is with hiredis 0.13.3, from the Ubuntu 16.04 package.

@dagostinelli

FYI: This is the same issue as #493
