Skip to content
This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

LibOS: Return -EFAULT on name = NULL for sethostname/setdomainname (UBSAN) #2102

Closed
wants to merge 3 commits into from
Closed

LibOS: Return -EFAULT on name = NULL for sethostname/setdomainname (UBSAN) #2102

wants to merge 3 commits into from

Conversation

stefanberger
Copy link
Contributor

@stefanberger stefanberger commented Jan 19, 2021

Description of the changes

This patch fixes an error detected with UBSAN where we have to return -EFAULT on name = NULL for sethostname and setdomainname syscalls.


This change is Reviewable

@stefanberger
Copy link
Contributor Author

Finding those LTP-level bugs went a lot faster than anticipated since the pattern was always the same... there are still some left, though.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

a discussion (no related file):
This is actually a bug in test_user_memory, but this function is so hacky that I'm fine with this temporary solution (but it's not really correct, users can have memory mapped at zero sometimes and syscalls work correctly on it if it's actually mapped).


Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required)

@dimakuv
Copy link

dimakuv commented Jan 20, 2021

Jenkins, test this please

Copy link
Contributor

@yamahata yamahata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @stefanberger)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

This is actually a bug in test_user_memory, but this function is so hacky that I'm fine with this temporary solution (but it's not really correct, users can have memory mapped at zero sometimes and syscalls work correctly on it if it's actually mapped).

As far as test_user_memory is broken, it does handle NULL well (maybe not well, but as good as any other address). I'm not sure what went wrong exactly, but this is not the fix.


Copy link
Contributor Author

@stefanberger stefanberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @boryspoplawski)

a discussion (no related file):

Previously, boryspoplawski (Borys Popławski) wrote…

As far as test_user_memory is broken, it does handle NULL well (maybe not well, but as good as any other address). I'm not sure what went wrong exactly, but this is not the fix.

There are tons of locations that look like this here:

long shim_do_read(int fd, void* buf, size_t count) {
    if (!buf || test_user_memory(buf, count, true))
        return -EFAULT;

long shim_do_write(int fd, const void* buf, size_t count) {
    if (!buf || test_user_memory((void*)buf, count, false))
        return -EFAULT;

long shim_do_pwrite64(int fd, char* buf, size_t count, loff_t pos) {
    if (!buf || test_user_memory(buf, count, false))
        return -EFAULT;


Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @stefanberger)

a discussion (no related file):

Previously, stefanberger (Stefan Berger) wrote…

There are tons of locations that look like this here:

long shim_do_read(int fd, void* buf, size_t count) {
    if (!buf || test_user_memory(buf, count, true))
        return -EFAULT;

long shim_do_write(int fd, const void* buf, size_t count) {
    if (!buf || test_user_memory((void*)buf, count, false))
        return -EFAULT;

long shim_do_pwrite64(int fd, char* buf, size_t count, loff_t pos) {
    if (!buf || test_user_memory(buf, count, false))
        return -EFAULT;

That doesn't make it correct. Graphene is full of places that need to go away ...


Copy link
Contributor Author

@stefanberger stefanberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @boryspoplawski)

a discussion (no related file):

Previously, boryspoplawski (Borys Popławski) wrote…

That doesn't make it correct. Graphene is full of places that need to go away ...

Then at least it would be justifiable that it follows an established pattern that can be cleaned up once test_user_memory() handles the NULL pointer correctly. Of course it should handle address 0x8 correctly as well.
I suppose the fix would be to walk the VMas there? It seems to actually test memory by writing to it, which can end up re-writing bytes in Graphene code itself... ?


Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @stefanberger)

a discussion (no related file):

Previously, stefanberger (Stefan Berger) wrote…

Then at least it would be justifiable that it follows an established pattern that can be cleaned up once test_user_memory() handles the NULL pointer correctly. Of course it should handle address 0x8 correctly as well.
I suppose the fix would be to walk the VMas there? It seems to actually test memory by writing to it, which can end up re-writing bytes in Graphene code itself... ?

test_user_memory handles the NULL pointer correctly right now, it's just that this function is one big undefined behavior.
I have the rewrite of it in my pipeline, but I'm waiting for the signals PR to land before it.


Copy link
Contributor Author

@stefanberger stefanberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @boryspoplawski)

a discussion (no related file):

Previously, boryspoplawski (Borys Popławski) wrote…

test_user_memory handles the NULL pointer correctly right now, it's just that this function is one big undefined behavior.
I have the rewrite of it in my pipeline, but I'm waiting for the signals PR to land before it.

So we won't merge this but have to wait until the signal stuff, which I saw is using the same pattern as below in some cases, is merged in?


Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @boryspoplawski)

a discussion (no related file):

Previously, stefanberger (Stefan Berger) wrote…

So we won't merge this but have to wait until the signal stuff, which I saw is using the same pattern as below in some cases, is merged in?

What Borys means is that we don't understand why your patch would change anything, as test_user_memory seems to handle NULL pointers already. The fact that it's very hacky is a separate issue, but it seems that it should work in this case without the additional ifs you added. Quoting him: "As far as test_user_memory is broken, it does handle NULL well (maybe not well, but as good as any other address). I'm not sure what went wrong exactly, but this is not the fix.".
So, what was broken exactly before this patch? And why did it change?


Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @mkow)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

What Borys means is that we don't understand why your patch would change anything, as test_user_memory seems to handle NULL pointers already. The fact that it's very hacky is a separate issue, but it seems that it should work in this case without the additional ifs you added. Quoting him: "As far as test_user_memory is broken, it does handle NULL well (maybe not well, but as good as any other address). I'm not sure what went wrong exactly, but this is not the fix.".
So, what was broken exactly before this patch? And why did it change?

  1. We don't have to wait for signals PR, it's completely orthogonal.
  2. The PR you mentioned is using a similar but a bit different pattern: NULL argument is valid there and means that this parameter is not used, hence checks like if (ss && test_user_memory(ss, ...)

Copy link
Contributor Author

@stefanberger stefanberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @boryspoplawski and @mkow)

a discussion (no related file):

Previously, boryspoplawski (Borys Popławski) wrote…
  1. We don't have to wait for signals PR, it's completely orthogonal.
  2. The PR you mentioned is using a similar but a bit different pattern: NULL argument is valid there and means that this parameter is not used, hence checks like if (ss && test_user_memory(ss, ...)

My current guess is that UBSAN jumps on writing to the NULL pointer:

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector. UBSan modifies the program at compile-time to catch various kinds of undefined behavior during program execution, for example:

    Using misaligned or null pointer
    Signed integer overflow
    Conversion to, from, or between floating-point types which would overflow the destination

https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html


Copy link
Contributor Author

@stefanberger stefanberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @boryspoplawski and @mkow)

a discussion (no related file):

Previously, stefanberger (Stefan Berger) wrote…

My current guess is that UBSAN jumps on writing to the NULL pointer:

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector. UBSan modifies the program at compile-time to catch various kinds of undefined behavior during program execution, for example:

    Using misaligned or null pointer
    Signed integer overflow
    Conversion to, from, or between floating-point types which would overflow the destination

https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html

So then the correct fix is to disable UBSAN in the test_user_memory() function using __attribute__((no_sanitize("undefined"))) until this part is rewritten (based on VMA walk I suppose).


Copy link
Contributor Author

@stefanberger stefanberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @boryspoplawski and @mkow)

a discussion (no related file):

Previously, stefanberger (Stefan Berger) wrote…

So then the correct fix is to disable UBSAN in the test_user_memory() function using __attribute__((no_sanitize("undefined"))) until this part is rewritten (based on VMA walk I suppose).

Let me close this PR (wrong title) and send another one just with this fix.


@stefanberger stefanberger deleted the stefanberger/fix_setdomainname branch January 26, 2021 18:29
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants