memcmp may be miscompiled by GCC #823

real-or-random · 2020-09-23T08:51:57Z

What about the other memcmp's we have in actual production code? (bip340 tag, tweak add check, scratch impl, sha256 selftest) does this bug affect those too?

Originally posted by @elichai in #822 (comment)

context: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95189

real-or-random · 2020-09-23T09:02:35Z

AFAIU (and I'm not 100% sure) this affects comparisons with fixed byte arrays which include zero bytes. The check may then early terminate at the first zero byte.

BIP340 tag: Strictly speaking yes, here we compare with a fixed string including zero bytes. Even though it will be weird if someone passed BIP0340/nonce\0AB or something like this. (and see Nonce functions #757 , maybe that's not a great API anyway).
Tweak add check: probably not, no constants here.
scratch: here the byte array is scratch so this should be fine.
sha256 selftest: no zero byte in the outpur either.

In the tests we have many many memcmp calls, some even with the all-zero arrays.

elichai · 2020-09-23T12:21:31Z

BIP340 tag: Strictly speaking yes, here we compare with a fixed string including zero bytes. Even though it will be weird if someone passed BIP0340/nonce\0AB or something like this. (and see Nonce functions #757 , maybe that's not a great API anyway).

I don't think it's explicit anywhere that the tag even needs to be a string, someone could use the timestamp of their product launch datetime, or their birthday as the tag, and that will contain zeros. (I have 0 idea if this is realistic or not, I do not know of products that use tagged hashes so don't know if they used string/int/bytes as tag)

real-or-random · 2020-09-23T12:44:31Z

@elichai
Indeed but I believe the bug only occurs with the zeros in the constant. We may want to verify/test this.

roconnor-blockstream · 2020-09-23T15:42:41Z

Is it sufficient to add a -fno-builtin-memcmp flag if it exists, (and maybe disable it if some autotools stuff has run and was unable to find the bug)?

(I have verified that my tests in #822 pass when -fno-builtin-memcmp is used.)

real-or-random · 2020-09-23T16:22:42Z

If "-fno-builtin-memcmp" is sufficient that sounds good. I think it would still be advisable to include an explicit memcmp test if no-builtin-memcmp is the solution

We could even include a self-test but I'm not sure about this.

And I'm still not sure if shipping our function is just better, even though it's "extremely dumb" as to your adequate summary. Pragmatically, the implementation is trivial and will just work everywhere without any compiler detection magic etc.
edit: Moreover, if the fix is in the code and not in the compiler flags, it's possible to use check the GCC version in the preprocessor, if we want to do this.

roconnor-blockstream · 2020-09-23T17:52:59Z

If we do that will we be able to detect indirect calls to memcmp (I'm thinking via other libc calls)? In that sense -fno-builtin-memcmp is more robust.

roconnor-blockstream · 2020-09-23T18:00:50Z

how does a secp256k1_memcmp help change behaviour in other compilation units?

sipa · 2020-09-23T18:01:49Z

(I'm thinking via other libc calls)

Well we're not going to be able to re-build libc. If there is an indirect bug due to libc being compiled with gcc and its memcmp calls working incorrectly, that would be a bug in the resulting libc library, and nothing we can do about that (except minimizing how much we rely on libc).

The compiler can in some cases "emit" memcmp calls automatically, though. I don't know if that's the case in our codebase, and I don't know if that's strictly for situations where a builtin wouldn't/can't be used - but if that is somehow subject to the same bug, having a custom memcmp function may not be enough.

sipa · 2020-09-23T18:04:12Z

Parallel Bitcoin Core issue: bitcoin/bitcoin#20005

roconnor-blockstream · 2020-09-23T18:15:20Z

Oh sorry I didn't mean fixing libc. I mean that the compiler hypothetically inlining some call to libc that in turn calls memcmp, that then gets optimized. I'm not familiar enough with C to have anything in particular in mind, so maybe it just isn't a thing to worry about. TBH I was really thinking of something analogous to std::lexicographical_compare.

sipa · 2020-09-23T18:20:08Z

The compiler can't inline calls to libc, as it doesn't know what is in the called functions.

What is possible is that some functionality is implemented in libc headers, through inline functions or other builtins, that directly call memcmp/__builtin_memcmp. I can't find any instances of this in my system C headers (except the definition of memcmp itself), though there are a few in STL C++ headers.

roconnor-blockstream · 2020-09-23T19:05:40Z

That is good to hear.

Fixes bitcoin-core#823.

elichai · 2020-09-25T12:23:44Z

The compiler can't inline calls to libc, as it doesn't know what is in the called functions.

I don't believe this is true, a compiler target also encodes the exact libc variant that is used, and the compiler can use that knowledge to its advantage (see #/776 for example) a few examples of how without headers the compiler can remove calls to libc and replace them with equivalent instructions while assuming those calls don't have side effects. (and are well known)
https://godbolt.org/z/EKe4rx

EDIT:
I see now that @sipa probably commented on this sentence by @roconnor-blockstream

I mean that the compiler hypothetically inlining some call to libc that in turn calls memcmp

I still believe this could happen, just like gcc turns this code into a "return 0":

#include <stddef.h>
void *calloc(size_t nmemb, size_t size);
int zeroed_alloc(int num) {
    int* p = calloc(num, sizeof(int));
    int ret = *p;
    free(p);
    return ret;
}

real-or-random · 2020-09-25T18:22:24Z

Given that we have some evidence that this does not happen when the GCC knows that the return value is compared with != 0 or == 0, I tend towards ignoring the issue. This bug looked very bad but in the end the scope is very limited. As long as our code is not affected, this is not much different from hundreds of other compiler bugs (e.g. #585).

Of course #825 is a simple fix but it's somewhat arbitrary then.

See also bitcoin/bitcoin#20005 (comment), which does not show any potential issues in secp256k1.

sipa · 2020-09-25T18:30:35Z

@elichai Sure, the compiler may know things about how C standard library functions behave (because they're specified by the standard, or because it knows additional promises the specifically used C standard library used makes). But (in general) it cannot actually look at the compiled library object code and inline it (in theory LTO could change that, but there is no LTO done for glibc IIRC). So just the fact that a particular function inside glibc is written using memcmp isn't relevant - except to the extent that it may be miscompiled inside glibc itself - and there is nothing we can do about that.

sipa · 2020-09-26T03:34:20Z

@real-or-random I don't know - even with evidence that the current codebase is unaffected, it's still scary - evidenced by the fact that we hit it randomly in PR #822 (thankfully in test-only code, but it could have been elsewhere).

real-or-random · 2020-09-26T09:17:11Z

If people feel we should do #825, then I'm not against this. It certainly won't hurt.

elichai · 2020-09-26T09:23:56Z

@roconnor-blockstream

I rebuilt bitcoin-0.20.1 (including libsecp256k1) using emit_diagnostic, and I also did not get any miscompiled memcmp messages.

Can you also try this on libsecp with all the features on + tests?

roconnor-blockstream · 2020-09-26T12:52:43Z

We can make secp256k1_memcmp, but is solving the issue for libsecp256k1 meaningful if we don't solve it for bitcoin?

roconnor-blockstream · 2020-10-14T18:15:56Z

Still nothing prevents reintroduction of memcmp other than diligence.

sipa · 2020-10-14T18:37:23Z

Let's keep this open to discuss how an accidental memcmp can be prevent.

CI could do a simple grep for the word memcmp in the source code?

roconnor-blockstream · 2020-10-14T18:41:25Z

For reference real-or-random posted a clang-query command at #825 (comment).

real-or-random · 2020-10-15T12:43:41Z

If there is some CI test it could check if any function outside of the library is called, except for whitelisted ones (memset, memcpy, malloc, and free, I believe), no? Maybe also make sure that malloc and free are only called via the wrapper functions. That might be useful independently of the memcmp concerns.

Hm, do we really want to restrict ourselves to a small list of standard library functions? I don't think the standard library is bad per se. Compiler bugs can happen everywhere, not only in calls to the standard library. Moreover, new calls are easily spotted in code review. I think memcmp is simply different because one needs to remember that memcmp is special.

If you want to give it a try:
clang-query src/secp256k1.c -c 'match callExpr(allOf(unless(isExpansionInSystemHeader()), callee(functionDecl(isExpansionInSystemHeader()))))'
This matches all calls to functions declared in system headers, unless the call itself happens in a system header.

roconnor-blockstream · 2020-10-15T18:45:00Z

If we are going to whitelist standard library functions, and I'm not arguing here that we should or shouldn't, one possible solution is to write our own header of standard library prototypes from our whitelist and disallow all system include files. That said I don't know how to enforce that system includes are disallowed.

elichai · 2020-10-15T19:04:22Z

That said I don't know how to enforce that system includes are disallowed.

-nostdinc :), the main problem is stdint.h, which we want to manage for us all the int types in different targets

real-or-random · 2020-10-15T19:39:19Z

See also #833

This add a simple static checker based on clang-query, which is a tool that could be described as a "clever grep" for abstract syntax trees (ASTs). As an initial proof of usefulness, this commit adds these checks: - No uses of floating point types. - No use of certain reserved identifiers (e.g., "mem(...)", bitcoin-core#829). - No use of memcmp (bitcoin-core#823). The checks are easily extensible. The main purpose is to run the checker on CI, and this commit adds the checker to the Travis CI script. This currently requires clang-query version at least 10. (However, it's not required not compile with clang version 10, or with clang at all. Just the compiler flags must be compatible with clang.) Clang-query simply uses the clang compiler as a backend for generating the AST. In order to determine the compile in which the code is supposed to be compiled (e.g., compiler flags such as -D defines and include paths), it reads a compilation database in JSON format. There are multiple ways to generate this database. The easiest way to obtain such a database is to use a tool that intercepts the make process and build the database. On Travis CI, we currently use "bear" for this purpose. It's a natural choice because there is an Ubuntu package for it. If you want to run this locally, bear is a good choice but other tools such as compiledb (Python) are available.

roconnor-blockstream · 2022-08-04T13:46:34Z

As fanquake noted in bitcoin/bitcoin#20005 (comment), this is fixed in GCC 10.3 and above.

benma · 2024-10-11T09:45:46Z

I just lost an afternoon trying to debug a valgrind false positive.

https://github.com/bitcoin-core/secp256k1/pull/1140/files#diff-3fe8f8fa0b765ad49f70d6c32f6a865be48faeb3c8d6dd5f8c274ca546ef5b61R1111 (click 'Load diff' on tests_impl.h).

The line

           CHECK(memcmp(output, input, sizeof(output)) == 0);

resulted in this CI failure:

https://github.com/bitcoin-core/secp256k1/actions/runs/11285462388/job/31388294841

==9263== Memcheck, a memory error detector
==9263== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==9263== Using Valgrind-3.24.0.GIT-lbmacos and LibVEX; rerun with -h for copyright info
==9263== Command: ./tests
==9263== 
test count = 2
random seed = ae250b1a3ee1ceb09ad0acfab0d10ecd
Skipping run_sha256_known_output_tests 1000000 (iteration count too low)
Skipping test_ecmult_constants_sha 2048 (iteration count too low)
Skipping test_ecmult_constants_2bit (iteration count too low)
==9263== Conditional jump or move depends on uninitialised value(s)
==9263==    at 0x7FF81B8D5D07: ???
==9263==    by 0x10001A8C4: ??? (in ./tests)
==9263==    by 0x10001CA1D: ??? (in ./tests)
==9263==    by 0x10003DA27: ??? (in ./tests)
==9263==    by 0x10026952D: (below main) (in /usr/lib/dyld)
==9263== 
==9263== Use of uninitialised value of size 8
==9263==    at 0x7FF81B8D5D23: ???
==9263==    by 0x10001A8C4: ??? (in ./tests)
==9263==    by 0x10001CA1D: ??? (in ./tests)
==9263==    by 0x10003DA27: ??? (in ./tests)
==9263==    by 0x10026952D: (below main) (in /usr/lib/dyld)
==9263== 
==9263== Use of uninitialised value of size 8
==9263==    at 0x7FF81B8D5D28: ???
==9263==    by 0x10001A8C4: ??? (in ./tests)
==9263==    by 0x10001CA1D: ??? (in ./tests)
==9263==    by 0x10003DA27: ??? (in ./tests)
==9263==    by 0x10026952D: (below main) (in /usr/lib/dyld)
==9263== 
==9263== Conditional jump or move depends on uninitialised value(s)
==9263==    at 0x10001A8C7: ??? (in ./tests)
==9263==    by 0x10001CA1D: ??? (in ./tests)
==9263==    by 0x10003DA27: ??? (in ./tests)
==9263==    by 0x10026952D: (below main) (in /usr/lib/dyld)
==9263== 
random run = ae7d7478f2fc4fde34e2625d2948ce83
no problems found
==9263== 
==9263== HEAP SUMMARY:
==9263==     in use at exit: 8,586 bytes in 169 blocks
==9263==   total heap usage: 28,425 allocs, 28,256 frees, 98,898,807,326 bytes allocated
==9263== 
==9263== LEAK SUMMARY:
==9263==    definitely lost: 4,160 bytes in 130 blocks
==9263==    indirectly lost: 0 bytes in 0 blocks
==9263==      possibly lost: 600 bytes in 3 blocks
==9263==    still reachable: 3,826 bytes in 36 blocks
==9263==         suppressed: 0 bytes in 0 blocks
==9263== Rerun with --leak-check=full to see details of leaked memory
==9263== 
==9263== Use --track-origins=yes to see where uninitialised values come from
==9263== For lists of detected and suppressed errors, rerun with: -s
==9263== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 123 from 19)
FAIL tests (exit status: 42)

I could not get valgrind to point me to the right line, so bisecting the line took a very long time. I don't think anything is actually wrong with the code there.

In the end changing memcmp to secp256k1_memcmp_var fixed it.

This happened on macOS x86 for both gcc and clang. The valgrind check on linux worked fine.

Consider adding a CI check that forbids the use uf memcmp(), so no one else has to go through the pain of debugging something like this again 🙈

edit:

@real-or-random

above you mention

Moreover, new calls are easily spotted in code review. I think memcmp is simply different because one needs to remember that memcmp is special.

This turned out to be a wrong assumption - in my case, no one pointed it out to me in review, and it was very hard for me to figure this one out. If the CI check had been added as discussed above, I would not have lost so much time on this.

real-or-random · 2024-11-05T10:09:23Z

@benma I see that a CI check could also be helpful to avoid a Valgrind false positive, but in these cases, you should probably report to https://github.com/LouisBrunner/valgrind-macos. Valgrind ships with a bunch of standard suppressions, but these need to be updated from time to time with new macOS versions. And we're probably one of the main users of this Valgrind macOS fork, so last time @hebasto updated the suppressions, see LouisBrunner/valgrind-macos#114.

@hebasto Do you think this one should also be submitted to upstream, even though we strictly speaking won't need it due to secp256k1_memcmp_var?

real-or-random · 2024-11-05T10:24:02Z

By the way, here's a godbolt link to check which GCC versions are affected by the original bug:

It would be nice to get rid of secp256k1_memcmp_var completely, but GCC 10.2 is affected, which is for example still in debian bullseye with has long-term support until August 31st, 2026...

gmaxwell mentioned this issue Sep 23, 2020

Increase precision of g1 and g2. #822

Closed

sipa mentioned this issue Sep 23, 2020

memcmp with constants that contain zero bytes are broken in GCC bitcoin/bitcoin#20005

Closed

1 task

real-or-random added a commit to real-or-random/secp256k1 that referenced this issue Sep 24, 2020

Switch to our own memcmp function

f5f1202

Fixes bitcoin-core#823.

real-or-random mentioned this issue Sep 24, 2020

Switch to our own memcmp function #825

Closed

real-or-random changed the title ~~memcmp~~ memcmp may be miscompiled by GCC Oct 7, 2020

real-or-random mentioned this issue Oct 13, 2020

Rip out non-endomorphism code + dependencies #830

Merged

luke-jr mentioned this issue Oct 13, 2020

Workaround for GCC bug 95189 using -fno-builtin-memcmp #832

Closed

sipa closed this as completed in 6173839 Oct 14, 2020

sipa reopened this Oct 14, 2020

real-or-random mentioned this issue Oct 19, 2020

Add simple static checker based on clang-query #833

Closed

real-or-random mentioned this issue Jan 21, 2023

Update musig module to BIP MuSig2 v1.0.0-rc.3 BlockstreamResearch/secp256k1-zkp#211

Merged

real-or-random added the assurance label Nov 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

memcmp may be miscompiled by GCC #823

memcmp may be miscompiled by GCC #823

real-or-random commented Sep 23, 2020 •

edited

Loading

real-or-random commented Sep 23, 2020 •

edited

Loading

elichai commented Sep 23, 2020 •

edited

Loading

real-or-random commented Sep 23, 2020

roconnor-blockstream commented Sep 23, 2020

real-or-random commented Sep 23, 2020 •

edited

Loading

roconnor-blockstream commented Sep 23, 2020 •

edited

Loading

roconnor-blockstream commented Sep 23, 2020

sipa commented Sep 23, 2020

sipa commented Sep 23, 2020

roconnor-blockstream commented Sep 23, 2020

sipa commented Sep 23, 2020

roconnor-blockstream commented Sep 23, 2020

elichai commented Sep 25, 2020 •

edited

Loading

real-or-random commented Sep 25, 2020 •

edited

Loading

sipa commented Sep 25, 2020 •

edited

Loading

sipa commented Sep 26, 2020

real-or-random commented Sep 26, 2020

elichai commented Sep 26, 2020

roconnor-blockstream commented Sep 26, 2020

roconnor-blockstream commented Oct 14, 2020

sipa commented Oct 14, 2020

roconnor-blockstream commented Oct 14, 2020

real-or-random commented Oct 15, 2020

roconnor-blockstream commented Oct 15, 2020

elichai commented Oct 15, 2020

real-or-random commented Oct 15, 2020

roconnor-blockstream commented Aug 4, 2022

benma commented Oct 11, 2024 •

edited

Loading

real-or-random commented Nov 5, 2024

real-or-random commented Nov 5, 2024

memcmp may be miscompiled by GCC #823

memcmp may be miscompiled by GCC #823

Comments

real-or-random commented Sep 23, 2020 • edited Loading

real-or-random commented Sep 23, 2020 • edited Loading

elichai commented Sep 23, 2020 • edited Loading

real-or-random commented Sep 23, 2020

roconnor-blockstream commented Sep 23, 2020

real-or-random commented Sep 23, 2020 • edited Loading

roconnor-blockstream commented Sep 23, 2020 • edited Loading

roconnor-blockstream commented Sep 23, 2020

sipa commented Sep 23, 2020

sipa commented Sep 23, 2020

roconnor-blockstream commented Sep 23, 2020

sipa commented Sep 23, 2020

roconnor-blockstream commented Sep 23, 2020

elichai commented Sep 25, 2020 • edited Loading

real-or-random commented Sep 25, 2020 • edited Loading

sipa commented Sep 25, 2020 • edited Loading

sipa commented Sep 26, 2020

real-or-random commented Sep 26, 2020

elichai commented Sep 26, 2020

roconnor-blockstream commented Sep 26, 2020

roconnor-blockstream commented Oct 14, 2020

sipa commented Oct 14, 2020

roconnor-blockstream commented Oct 14, 2020

real-or-random commented Oct 15, 2020

roconnor-blockstream commented Oct 15, 2020

elichai commented Oct 15, 2020

real-or-random commented Oct 15, 2020

roconnor-blockstream commented Aug 4, 2022

benma commented Oct 11, 2024 • edited Loading

real-or-random commented Nov 5, 2024

real-or-random commented Nov 5, 2024

real-or-random commented Sep 23, 2020 •

edited

Loading

real-or-random commented Sep 23, 2020 •

edited

Loading

elichai commented Sep 23, 2020 •

edited

Loading

real-or-random commented Sep 23, 2020 •

edited

Loading

roconnor-blockstream commented Sep 23, 2020 •

edited

Loading

elichai commented Sep 25, 2020 •

edited

Loading

real-or-random commented Sep 25, 2020 •

edited

Loading

sipa commented Sep 25, 2020 •

edited

Loading

benma commented Oct 11, 2024 •

edited

Loading