Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that an out-of-bounds pointer shall trap #536

Merged
merged 1 commit into from
Jul 28, 2023

Conversation

yamt
Copy link
Contributor

@yamt yamt commented May 10, 2023

discussion: #505

@sunfishcode
Copy link
Member

Looks good to me. I'll leave this open for a while in case there are any other comments.

@pchickey
Copy link
Contributor

This is a change of behavior for wasi-common but I agree this the correct behavior to implement.

Should we specify the behavior for passing pointers with an incorrect alignment?

Do we need to clarify this spec test to include behaviors for chasing pointers that are out-of-bounds?

Do we need to trap before any other observable side effects, e.g. do we need to validate that all memory referred to in a iovec is in bounds before writing any outputs?

@sunfishcode
Copy link
Member

Should we specify the behavior for passing pointers with an incorrect alignment?

The Canonical ABI is currently proposed to trap on misaligned pointers, so yes, I'd say we should ideally do that here too.

Do we need to clarify this spec test to include behaviors for chasing pointers that are out-of-bounds?

Which spec test are you referring to?

Do we need to trap before any other observable side effects, e.g. do we need to validate that all memory referred to in a iovec is in bounds before writing any outputs?

If we're going to trap, we should trap before any observable side effects.

Should we trap for an iovec with any buffer out of bounds? The component model doesn't yet have an iovec, so it doesn't have an opinion here yet. POSIX doesn't appear to say anything. Linux's readv appears to eagerly report an EFAULT if any pointers are invalid (though it appears to ignore NULL pointers). I think we should trap, because there doesn't seem to be any reason to depend on invalid pointers being ignored.

@yamt
Copy link
Contributor Author

yamt commented May 10, 2023

Do we need to trap before any other observable side effects, e.g. do we need to validate that all memory referred to in a iovec is in bounds before writing any outputs?

If we're going to trap, we should trap before any observable side effects.

Should we trap for an iovec with any buffer out of bounds? The component model doesn't yet have an iovec, so it doesn't have an opinion here yet. POSIX doesn't appear to say anything. Linux's readv appears to eagerly report an EFAULT if any pointers are invalid (though it appears to ignore NULL pointers).

i don't think it's a good idea to require eager checks.
that's why i wrote "and the function needs to dereference it" in the text.

I think we should trap, because there doesn't seem to be any reason to depend on invalid pointers being ignored.

some implementations naturally ignore unused iovecs eg. on a short read.

@pchickey
Copy link
Contributor

Do we need to clarify this spec test to include behaviors for chasing pointers that are out-of-bounds?

Which spec test are you referring to?

Typo, i meant text.

It sounds like Linux is eager about returning a fault on invalid iovec pointers. @yamt are you aware of other operating systems which are lazier about this behavior, or is it just other wasi preview 1 implementations?

I am in favor of being eager about that fault in order to align with the component model - in preview 2 it will be eager no matter what.

@yamt
Copy link
Contributor Author

yamt commented May 11, 2023

@yamt are you aware of other operating systems which are lazier about this behavior, or is it just other wasi preview 1 implementations?

eg. netbsd

and as far as i tested with the following test code, linux doesn't seem eager either.
https://github.com/yamt/garbage/blob/master/c/iov_badaddr/a.c
(or probably i don't understand what @sunfishcode meant by "eagerly report an EFAULT if any pointers are invalid".)

@@ -77,6 +77,10 @@ For example, the `poll_oneoff` function has these arguments:
Pointer values are expected to be aligned, to the alignment of their pointee
type. If a misaligned pointer is passed to a function, the function shall trap.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pchickey misaligned pointer is documented here

@sunfishcode
Copy link
Member

eg. netbsd

and as far as i tested with the following test code, linux doesn't seem eager either.
https://github.com/yamt/garbage/blob/master/c/iov_badaddr/a.c
(or probably i don't understand what @sunfishcode meant by "eagerly report an EFAULT if any pointers are invalid".)

Oh, interesting. If I modify your testcase to use -1 instead of 1 as the invalid address, I do get an EFAULT on Linux:

With (void *)1:

readv(3, [{iov_base="X", iov_len=1}, {iov_base="", iov_len=1}], 2) = 1

With (void *)-1):

readv(3, [{iov_base=0x7ffe8fe44817, iov_len=1}, {iov_base=0xffffffffffffffff, iov_len=1}], 2) = -1 EFAULT (Bad address)

So I don't know what logic Linux is using to decide what gets an EFAULT, but whatever it's doing, it's doing it eagerly.

I think WASI should be eager here.

  • It's a bug in portable code if it's passing invalid pointers to readv, even if some OS's in some situations allow it to work. And it's not a harmless bug. NULL is a dereferenceble address in wasm, so if applications are relying on passing in NULL and having it be ignored, wasm is already breaking that. And if they're passing in a dangling address, then the only thing between them and memory corruption is luck. Eager checking protects programs from both of these hazards.

  • Lazy checking here wouldn't save engines the need to walk the iovec list eagerly. They'll need to do that anyway, to translate wasm addresses to host OS addresses before the readv call.

@yamt
Copy link
Contributor Author

yamt commented May 12, 2023

eg. netbsd

and as far as i tested with the following test code, linux doesn't seem eager either.
https://github.com/yamt/garbage/blob/master/c/iov_badaddr/a.c
(or probably i don't understand what @sunfishcode meant by "eagerly report an EFAULT if any pointers are invalid".)

Oh, interesting. If I modify your testcase to use -1 instead of 1 as the invalid address, I do get an EFAULT on Linux:

With (void *)1:

readv(3, [{iov_base="X", iov_len=1}, {iov_base="", iov_len=1}], 2) = 1

With (void *)-1):

readv(3, [{iov_base=0x7ffe8fe44817, iov_len=1}, {iov_base=0xffffffffffffffff, iov_len=1}], 2) = -1 EFAULT (Bad address)

So I don't know what logic Linux is using to decide what gets an EFAULT, but whatever it's doing, it's doing it eagerly.

interesting.

I think WASI should be eager here.

* It's a bug in portable code if it's passing invalid pointers to `readv`, even if some OS's in some situations allow it to work. And it's not a harmless bug. NULL is a dereferenceble address in wasm, so if applications are relying on passing in NULL and having it be ignored, wasm is already breaking that. And if they're passing in a dangling address, then the only thing between them and memory corruption is luck. Eager checking protects programs from both of these hazards.

* Lazy checking here wouldn't save engines the need to walk the iovec list eagerly. They'll need to do that anyway, to translate wasm addresses to host OS addresses before the `readv` call.

i'm not sure about this "lazy checking wouldn't save" point.

  • wasi is not necessarily implemented as host functions.
  • even when it is, fd_read is not necessarily implemented with a host readv call.

@sunfishcode
Copy link
Member

eg. netbsd

and as far as i tested with the following test code, linux doesn't seem eager either.
https://github.com/yamt/garbage/blob/master/c/iov_badaddr/a.c
(or probably i don't understand what @sunfishcode meant by "eagerly report an EFAULT if any pointers are invalid".)

Oh, interesting. If I modify your testcase to use -1 instead of 1 as the invalid address, I do get an EFAULT on Linux:
With (void *)1:

readv(3, [{iov_base="X", iov_len=1}, {iov_base="", iov_len=1}], 2) = 1

With (void *)-1):

readv(3, [{iov_base=0x7ffe8fe44817, iov_len=1}, {iov_base=0xffffffffffffffff, iov_len=1}], 2) = -1 EFAULT (Bad address)

So I don't know what logic Linux is using to decide what gets an EFAULT, but whatever it's doing, it's doing it eagerly.

interesting.

I think WASI should be eager here.

* It's a bug in portable code if it's passing invalid pointers to `readv`, even if some OS's in some situations allow it to work. And it's not a harmless bug. NULL is a dereferenceble address in wasm, so if applications are relying on passing in NULL and having it be ignored, wasm is already breaking that. And if they're passing in a dangling address, then the only thing between them and memory corruption is luck. Eager checking protects programs from both of these hazards.

* Lazy checking here wouldn't save engines the need to walk the iovec list eagerly. They'll need to do that anyway, to translate wasm addresses to host OS addresses before the `readv` call.

i'm not sure about this "lazy checking wouldn't save" point.

* wasi is not necessarily implemented as host functions.

* even when it is, `fd_read` is not necessarily implemented with a host `readv` call.

That's a good point. Some implementations could have some extra cost to this. Fortunately, MAX_IOV on many platforms is only 1024, and in my experience the number of buffers is often much less than that, so the cost shouldn't be high in practice. And when people get to the point of writing such implementations and benchmarking them, if the bounds checks turn out to be significant, we can add new interfaces that don't require them.

So I'd still say that the security and portable angles justify eager trapping here, especially this time. It will catch some cases where programs might be implicitly depending on readv implementations and data sources reading under a certain number of bytes in order to avoid hitting invalid buffers.

@pchickey
Copy link
Contributor

This has been open for a while (unfortuantely I lost track of it) and we haven't seen any objections, so I am merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants