Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASan: Enable for globals #251

Merged
merged 2 commits into from
Dec 2, 2021
Merged

ASan: Enable for globals #251

merged 2 commits into from
Dec 2, 2021

Conversation

pwmarcz
Copy link
Contributor

@pwmarcz pwmarcz commented Nov 26, 2021

Description of the changes

This enables ASan for globally allocated memory, such as global variables or string literals.

The main addition is handling of .init_array, but this is also relatively simple.

Unfortunately it did not find any new bugs, but since the complexity cost is not big, I think it's worth keeping

How to test this PR?

There is a test for this, but you can try to also trigger it yourself. Try writing past a global variable, or reading after then end of a string literal.


This change is Reviewable

@pwmarcz pwmarcz force-pushed the pawel/asan-globals branch 2 times, most recently from 845ad2c to 08ed80a Compare November 30, 2021 14:17
@pwmarcz pwmarcz changed the title WIP: ASan globals ASan: Enable for globals Nov 30, 2021
@pwmarcz pwmarcz marked this pull request as ready for review November 30, 2021 14:23
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 35 files at r1, 23 of 23 files at r2, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @pwmarcz)

a discussion (no related file):
Another awesome PR!



common/include/asan.h, line 235 at r2 (raw file):

struct __asan_global {
    /* The address of the global. */
    uintptr_t beg;

Does it need to have this name? Can we rename to normal addr and put in a comment that it is called beg in ASan sources?


common/include/asan.h, line 250 at r2 (raw file):

    void* location;
    /* The address of the ODR (One Definition Rule) indicator symbol. */
    uintptr_t odr_indicator;

Do we really need all these fields? I guess you list them all because this is the object ASan gives to you? Maybe then add a suffix to them _unused?


common/src/asan.c, line 353 at r2 (raw file):

    for (size_t i = 0; i < n; i++) {
        /* Enable the below code for debugging */
#if 0

Do we want this? Historically, such unused code becomes stale after a while, and nobody fixes it...


common/src/init.c, line 20 at r2 (raw file):

void call_init_array(void) {
    void (**func)(void);
    for (func = &__init_array_start; func < &__init_array_end; func++) {

Do you know how it works with constructor priorities? I don't see anything in the linker scripts or here that would guarantee some ordering. This is not relevant right now (we only have ASan constructors with the same priority 1), but in the future we may want to add more constructors, and priorities will suddenly not work as expected...


LibOS/shim/src/shim_checkpoint.c, line 172 at r2 (raw file):

    size_t off = ADD_CP_OFFSET(len);
    /* Use `_real_memcpy` to bypass ASan. */
    _real_memcpy((char*)base + off, &__migratable[0], len);

Could you expand a bit? What happens if these migratable globals are copied via memcpy? I guess we'll step into red zones of globals, and ASan will complain?


Pal/src/host/Linux-SGX/sgx_main.c, line 1085 at r2 (raw file):

 * cases, we invoke the constructors from `.init_array` directly, so we have full control over
 * initialization.
 */

Why did this suddenly change? Is it because you now run with asan-globals, and now you have ASan-generated constructors (with priority 1), so before these constructors, you need to setup ASan (with priority 0), otherwise ASan-generated constructors fail?

I wonder if we need all these hacks in the untrusted (sgx_) code. On the other hand, this may catch some bugs. Ah, ignore my comment, I'm just thinking aloud.

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 35 files at r1, 5 of 23 files at r2, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv and @pwmarcz)


common/include/init.h, line 10 at r2 (raw file):

/*
 * Call the constructors specified in `.init_array` for the current object. Should be called during

What is the current object?


common/src/asan.c, line 353 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do we want this? Historically, such unused code becomes stale after a while, and nobody fixes it...

+1 to not keeping it.


common/src/init.c, line 15 at r2 (raw file):

 * referring to `__init_array_start` and `__init_array_end` in that object.
 */
extern void (*__init_array_start)(void);

I wonder what is the correct type there, this or void (**__init_array_start)(void)?
I would expect the type void (*init_array[])(void), then __init_array_start would be init_array[0], which doesn't really make sense - you cannot express that in C, I guess this works because this is not a real C thing, just an ELF symbol. If it was void (**__init_array_start)(void) then it would just be __init_array_start = &init_array[0]
I guess tldr is that you cannot have a symbol in the middle of an array in C alone.


common/src/init.c, line 20 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do you know how it works with constructor priorities? I don't see anything in the linker scripts or here that would guarantee some ordering. This is not relevant right now (we only have ASan constructors with the same priority 1), but in the future we may want to add more constructors, and priorities will suddenly not work as expected...

Isn't the order in init_array defined by constructor priorities?


LibOS/shim/src/shim_checkpoint.c, line 172 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Could you expand a bit? What happens if these migratable globals are copied via memcpy? I guess we'll step into red zones of globals, and ASan will complain?

Hm, but why is that? This symbol is created in linker scrip, so ASAN cannot posion it, could it?


LibOS/shim/src/shim_init.c, line 387 at r2 (raw file):

    log_setprefix(shim_get_tcb());

    call_init_array();

I don't like this. Does ASAN use memory allocations, like ever? Because they won't work at this point, if any function in init_array uses them. This also applies basically to anything, except simple assignments.


Pal/src/host/Linux/db_main.c, line 190 at r2 (raw file):

        INIT_FAIL(-ret, "_DkSystemTimeQuery() failed");

    call_init_array();

ditto (as in LibOS)


Pal/src/host/Linux-SGX/db_main.c, line 672 at r2 (raw file):

    }

    call_init_array();

ditto

Copy link
Contributor Author

@pwmarcz pwmarcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 27 of 35 files reviewed, 11 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @pwmarcz)


common/include/asan.h, line 235 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Does it need to have this name? Can we rename to normal addr and put in a comment that it is called beg in ASan sources?

Sure, let's do it. Renamed to addr.


common/include/asan.h, line 250 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do we really need all these fields? I guess you list them all because this is the object ASan gives to you? Maybe then add a suffix to them _unused?

Yes, this is the object ASan gives to me (in an array), so even if I read only a few fields, the layout/structure size has to be the same.

I added a comment explaining what we actually use, and removed the comments (superfluous, or irrelevant to us). But I don't like the idea of prefixing names with_unused. AFAIK, we don't do that for any other data structures that are part of ABI (e.g. kernel data structures).


common/include/init.h, line 10 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

What is the current object?

Changed to "ELF object", if you want a different phrasing, please advise.


common/src/asan.c, line 353 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

+1 to not keeping it.

OK. Deleted.


common/src/init.c, line 15 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I wonder what is the correct type there, this or void (**__init_array_start)(void)?
I would expect the type void (*init_array[])(void), then __init_array_start would be init_array[0], which doesn't really make sense - you cannot express that in C, I guess this works because this is not a real C thing, just an ELF symbol. If it was void (**__init_array_start)(void) then it would just be __init_array_start = &init_array[0]
I guess tldr is that you cannot have a symbol in the middle of an array in C alone.

Musl:

hidden void (*const __init_array_start)(void)=0, (*const __fini_array_start)(void)=0;

Glibc:

extern void (*__init_array_start []) (int, char **, char **);
extern void (*__init_array_end []) (int, char **, char **);

I think any of these (pointer, or array of pointers) will work. In my mind, the first one is simpler, but I can change. I'm not sure which one is more "correct": while you could treat _init_array_start as a global array, it's harder to make the case that _init_array_end is one.

(In LibOS we sort of take the middle road, with extern char __migratable[] and extern char __migratable_end. But I'm not sure if I want to be more clever than musl/glibc rather than just copy their solution).

However, void (**init_array_start)(void) (pointer to pointer) will not work: __init_array_start is supposed to be a variable located in the ._init_array section (not a pointer to that section), so treating that variable as a pointer-to-pointer will dereference it twice instead of once.


common/src/init.c, line 20 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Isn't the order in init_array defined by constructor priorities?

Good question. I assumed .init_array.* in the linker script does that, but there's also SORT_BY_INIT_PRIORITY(). I added that too.

(And KEEP(): as far as I understand, this prevents removing these sections when using --gc-sections, and it's also what the default script does).


LibOS/shim/src/shim_checkpoint.c, line 172 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Hm, but why is that? This symbol is created in linker scrip, so ASAN cannot posion it, could it?

@dimakuv is right. This is not about the __migratable symbol, but global variables in that section. I expanded the comment.


LibOS/shim/src/shim_init.c, line 387 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I don't like this. Does ASAN use memory allocations, like ever? Because they won't work at this point, if any function in init_array uses them. This also applies basically to anything, except simple assignments.

These ASan initializers do not allocate memory.

The only place in ASan that allocates memory is the ASan crash handler, when it calls describe_location. However, if that fails, it will proceed anyway. And the crash handler might run regardless of this call_init_array(), for instance for a stack buffer overflow.

So I'd rather not move this call later if it's not necessary: if it happens later, it will catch less bugs in Gramine initialization.


Pal/src/host/Linux-SGX/sgx_main.c, line 1085 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why did this suddenly change? Is it because you now run with asan-globals, and now you have ASan-generated constructors (with priority 1), so before these constructors, you need to setup ASan (with priority 0), otherwise ASan-generated constructors fail?

I wonder if we need all these hacks in the untrusted (sgx_) code. On the other hand, this may catch some bugs. Ah, ignore my comment, I'm just thinking aloud.

Yes. Previously we had no ASan-generated constructors (or rather, they were there, but they did nothing).

I regret this hack, but I think the source of trouble is linking against Glibc in untrusted code (which we do because we want to link against protobuf, IIRC), I wouldn't be surprised if it wasn't the last time it causes problems. And I think it's worth it to have ASan there.

dimakuv
dimakuv previously approved these changes Dec 1, 2021
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r3, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @pwmarcz)


Pal/src/host/Linux-SGX/sgx_main.c, line 1085 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Yes. Previously we had no ASan-generated constructors (or rather, they were there, but they did nothing).

I regret this hack, but I think the source of trouble is linking against Glibc in untrusted code (which we do because we want to link against protobuf, IIRC), I wouldn't be surprised if it wasn't the last time it causes problems. And I think it's worth it to have ASan there.

OK. Yes, Glibc sometimes gets in our way in the untrusted code, but it's not too bad.

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)


common/include/init.h, line 10 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Changed to "ELF object", if you want a different phrasing, please advise.

I just have a problem with the word current - it kind of implies that it changes (over time). Maybe just delete for the current ELF object?


common/src/init.c, line 15 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Musl:

hidden void (*const __init_array_start)(void)=0, (*const __fini_array_start)(void)=0;

Glibc:

extern void (*__init_array_start []) (int, char **, char **);
extern void (*__init_array_end []) (int, char **, char **);

I think any of these (pointer, or array of pointers) will work. In my mind, the first one is simpler, but I can change. I'm not sure which one is more "correct": while you could treat _init_array_start as a global array, it's harder to make the case that _init_array_end is one.

(In LibOS we sort of take the middle road, with extern char __migratable[] and extern char __migratable_end. But I'm not sure if I want to be more clever than musl/glibc rather than just copy their solution).

However, void (**init_array_start)(void) (pointer to pointer) will not work: __init_array_start is supposed to be a variable located in the ._init_array section (not a pointer to that section), so treating that variable as a pointer-to-pointer will dereference it twice instead of once.

Okay, you are right that this is just a variable in the middle of an array - we have to take its address anyway.


LibOS/shim/src/shim_checkpoint.c, line 172 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

@dimakuv is right. This is not about the __migratable symbol, but global variables in that section. I expanded the comment.

Thanks, makes sense now.


LibOS/shim/src/shim_init.c, line 387 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

These ASan initializers do not allocate memory.

The only place in ASan that allocates memory is the ASan crash handler, when it calls describe_location. However, if that fails, it will proceed anyway. And the crash handler might run regardless of this call_init_array(), for instance for a stack buffer overflow.

So I'd rather not move this call later if it's not necessary: if it happens later, it will catch less bugs in Gramine initialization.

Hopefully these initializers don't depend on anything else...

Copy link
Contributor Author

@pwmarcz pwmarcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski)


common/include/init.h, line 10 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I just have a problem with the word current - it kind of implies that it changes (over time). Maybe just delete for the current ELF object?

OK. Done.

boryspoplawski
boryspoplawski previously approved these changes Dec 1, 2021
Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners

dimakuv
dimakuv previously approved these changes Dec 2, 2021
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, "fixup! " found in commit messages' one-liners

This commit makes sure that constructors in `.init_array` section are
being run. These sections are normally empty for PAL and LibOS, except
when compiling with AddressSanitizer.

Signed-off-by: Paweł Marczewski <pawel@invisiblethingslab.com>
This enables ASan for globally allocated memory, such as global
variables or string literals.

Signed-off-by: Paweł Marczewski <pawel@invisiblethingslab.com>
@dimakuv dimakuv dismissed stale reviews from boryspoplawski and themself via 76eefd1 December 2, 2021 14:07
@dimakuv dimakuv force-pushed the pawel/asan-globals branch from 89ebd22 to 76eefd1 Compare December 2, 2021 14:07
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@dimakuv dimakuv merged commit 76eefd1 into master Dec 2, 2021
@dimakuv dimakuv deleted the pawel/asan-globals branch December 2, 2021 14:45
@pwmarcz pwmarcz mentioned this pull request Dec 2, 2021
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants