Support defining C-compatible variadic functions in Rust #2137

Merged Sep 29, 2017 (25 commits; showing changes from 24 commits)

Commits
019dff4
Support defining C-compatible variadic functions in Rust
joshtriplett Sep 3, 2017
3039f17
Add VaArg for size_t and ssize_t; sort and group the VaArg impls better
joshtriplett Sep 3, 2017
12eb69f
Allow calling extern "C" functions that accept a `va_list`
joshtriplett Sep 3, 2017
e6b5cfb
Drop variadic closures
joshtriplett Sep 3, 2017
e9fe291
Fix typo in sample code
joshtriplett Sep 3, 2017
247991e
Drop impls for libc, mention that type aliases cover this
joshtriplett Sep 3, 2017
0cca2ce
Make VaArg an unsafe trait
joshtriplett Sep 3, 2017
3161c45
Fix typo in sample code
joshtriplett Sep 3, 2017
7b7b3ff
Drop VaList::start; name the VaList argument instead, to improve life…
joshtriplett Sep 3, 2017
9d060c4
Declare args as `mut`
joshtriplett Sep 3, 2017
3769670
Rework the alternatives; the new syntax isn't the "alternative" anymore
joshtriplett Sep 3, 2017
63b5545
Fix another reference to VaList::start
joshtriplett Sep 3, 2017
5437ab2
Clarify the description of argument promotion
joshtriplett Sep 5, 2017
7bd8d7d
Get rid of the VaArg trait
joshtriplett Sep 5, 2017
96f80ac
Don't impl `Drop` directly; just talk about the necessary drop semantics
joshtriplett Sep 5, 2017
18a8b36
Fix `extern "C"` function declarations
joshtriplett Sep 6, 2017
7e5698e
Allow `VaList::arg` to return any type usable in an `extern "C" fn` s…
joshtriplett Sep 6, 2017
105f764
Use `extern type` to declare `VaList` (see RFC 1861)
joshtriplett Sep 7, 2017
6bdb95c
Move the mention of language items to the reference-level explanation
joshtriplett Sep 7, 2017
1546c82
Use the rustdoc-style `/* fields omitted */` syntax to declare VaList
joshtriplett Sep 7, 2017
eca3eae
Require that the function have at least one non-variadic argument
joshtriplett Sep 8, 2017
e3d1f5c
Add some clarifications on drop handling, and non-standard behavior
joshtriplett Sep 8, 2017
b2acc33
Stop using Clone; implement copy safely via a closure
joshtriplett Sep 13, 2017
4042834
Add an unresolved question on support for non-native ABIs
joshtriplett Sep 15, 2017
eb9c392
RFC 2137: Support defining C-compatible variadic functions in Rust
aturon Sep 29, 2017
265 changes: 265 additions & 0 deletions text/0000-variadic.md
@@ -0,0 +1,265 @@
- Feature Name: variadic
- Start Date: 2017-08-21
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Support defining C-compatible variadic functions in Rust, via new intrinsics.
Rust currently supports declaring external variadic functions and calling them
from unsafe code, but does not support writing such functions directly in Rust.
Adding such support will allow Rust to replace a larger variety of C libraries,
avoid requiring C stubs and error-prone reimplementation of platform-specific
code, improve incremental translation of C codebases to Rust, and allow
implementation of variadic callbacks.

# Motivation
[motivation]: #motivation

Rust can currently call any possible C interface, and export *almost* any
interface for C to call. Variadic functions represent one of the last remaining
gaps in the latter. Currently, providing a variadic function callable from C
requires writing a stub function in C, linking that function into the Rust
program, and arranging for that stub to subsequently call into Rust.
Furthermore, even with the arguments packaged into a `va_list` structure by C
code, extracting arguments from that structure requires exceptionally
error-prone, platform-specific code, for which the crates.io ecosystem provides
only partial solutions for a few target architectures.

This RFC does not propose an interface intended for native Rust code to pass
variable numbers of arguments to a native Rust function, nor an interface that
provides any kind of type safety. This proposal exists primarily to allow Rust
to provide interfaces callable from C code.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

C code allows declaring a function callable with a variable number of
arguments, using an ellipsis (`...`) at the end of the argument list. For
compatibility, unsafe Rust code may export a function compatible with this
mechanism.

Such a declaration looks like this:

```rust
pub unsafe extern "C" fn func(arg: T, arg2: T2, mut args: ...) {
    // implementation
}
```

The use of `...` as the type of `args` at the end of the argument list declares
the function as variadic. This must appear as the last argument of the
function, and the function must have at least one argument before it. The
function must use `extern "C"`, and must use `unsafe`. To expose such a
Contributor

@jethrogb jethrogb Sep 14, 2017

I don't think the requirement should be extern "C". Rather, I'd limit it to something like “non-Rust calling conventions that support varargs on your platform.”

Notably, GCC lets you mix win64 and sysv64 varargs as long as you know which calling convention your va_list is. See cross-stdarg.h. For Rust, the calling convention should probably be encoded in the type.

Member Author

@jethrogb This is an excellent point, and I can easily change that.

Copy link

This is an interesting point. So far the design assumes a single VaList type.
It seems like we may need one per calling convention available on your platform.

function as a symbol for C code to call directly, the function may want to use
`#[no_mangle]` as well; however, Rust code may also pass the function to C code
expecting a function pointer to a variadic function.

The `args` named in the function declaration has the type
`core::intrinsics::VaList<'a>`, where the compiler supplies a lifetime `'a`
Contributor

@KamilaBorowska KamilaBorowska Sep 5, 2017

Should a feature with its own syntax use a type provided by core::intrinsics? I do understand the intent to create a proper Rust interface later for this feature (instead of an internal one, which is what the intrinsics module means), but this also means that any uses of this ... syntax will need to be updated when that Rust interface is created (it's too neat of a syntax to not use for that purpose).

Member Author

@xfix We can always define a different meaning for the ... syntax in a function signature that isn't extern "C", and that different meaning need not use VaList.

that prevents the arguments from outliving the variadic function.

To access the arguments, Rust provides the following public interfaces in
`core::intrinsics` (also available via `std::intrinsics`):

```rust
/// The argument list of a C-compatible variadic function, corresponding to the
/// underlying C `va_list`. Opaque.
pub struct VaList<'a> { /* fields omitted */ }

// Note: the lifetime on VaList is invariant
impl<'a> VaList<'a> {
    /// Extract the next argument from the argument list. T must have a type
    /// usable in an FFI interface.
    pub unsafe fn arg<T>(&mut self) -> T;

    /// Copy the argument list. Destroys the copy after the closure returns.
    pub fn copy<'ret, F, T>(&self, F) -> T
    where
        F: for<'copy> FnOnce(VaList<'copy>) -> T, T: 'ret;
}
```

The type returned from `VaList::arg` must be usable in an `extern "C"` FFI
interface; the compiler allows `VaList::arg` to return the same types that it
allows in the function signature of an `extern "C"` function.

All of the corresponding C integer and float types defined in the `libc` crate
consist of aliases for the underlying Rust types, so `VaList::arg` can also
extract those types.
Contributor

This sentence doesn't appear to be necessary with 'usable in extern "C" requirement'.

Member Author

It's not necessary; I thought it was useful as documentation, though.


Note that extracting an argument from a `VaList` follows the C rules for
argument passing and promotion. In particular, C code will promote any argument
smaller than a C `int` to an `int`, and promote `float` to `double`. Thus,
Rust's argument extractions for the corresponding types will extract an `int`
or `double` as appropriate, and convert appropriately.
Member

@kennytm kennytm Sep 5, 2017

Currently, when passing arguments to external variadic functions, Rust disallows f32, i8, i16, u8, u16 and bool (E0617). Should VaArg also follow this rule instead of performing implicit conversion?

Member Author

C does this kind of promotion already; it seems less error-prone to allow Rust code to extract the corresponding type of argument (e.g. args.next::<u16>()) rather than extracting a c_int and casting.
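
To make the promotion rules in the RFC text above concrete, here is a minimal sketch in plain Rust that models both sides of a variadic call. It does not use `VaList` (which needs the `c_variadic` feature gate), and the helper names `promote_*` and `extract_*` are hypothetical, introduced only for illustration.

```rust
use std::os::raw::{c_double, c_int};

/// Caller side: C widens any integer narrower than `int` to `int`.
pub fn promote_u8(x: u8) -> c_int {
    x as c_int
}

/// Caller side: same widening for a 16-bit integer.
pub fn promote_u16(x: u16) -> c_int {
    x as c_int
}

/// Caller side: C widens `float` to `double`.
pub fn promote_f32(x: f32) -> c_double {
    x as c_double
}

/// Callee side: read the promoted `int`, then narrow to the requested type.
pub fn extract_u8(raw: c_int) -> u8 {
    raw as u8
}

/// Callee side: narrow a promoted `int` back to 16 bits.
pub fn extract_u16(raw: c_int) -> u16 {
    raw as u16
}

/// Callee side: read the promoted `double`, then narrow to `f32`.
pub fn extract_f32(raw: c_double) -> f32 {
    raw as f32
}
```

Under this model, `let x: u8 = args.arg();` in the RFC's sample would behave like `extract_u8` applied to the `int` that the C caller actually passed.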


Like the underlying platform `va_list` structure in C, `VaList` has an opaque,
platform-specific representation.

A variadic function may pass the `VaList` to another function. However, the
lifetime attached to the `VaList` will prevent the variadic function from
returning the `VaList` or otherwise allowing it to outlive that call to the
variadic function. Similarly, the closure called by `copy` cannot return the
`VaList` passed to it or otherwise allow it to outlive the closure.
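
The closure restriction can be sketched in stable Rust with a stand-in type. `MockVaList` below is hypothetical: it walks a borrowed slice of `i64` values rather than a platform `va_list`, and only mirrors the shape of the proposed `copy` signature. Because the closure must accept every lifetime `'copy`, its result type `T` cannot name that lifetime, so the copy cannot be returned from the closure.

```rust
/// Hypothetical stand-in for the RFC's `VaList<'a>`: a cursor over a borrowed
/// slice instead of a platform `va_list`.
pub struct MockVaList<'a> {
    args: &'a [i64],
    pos: usize,
}

impl<'a> MockVaList<'a> {
    pub fn new(args: &'a [i64]) -> Self {
        MockVaList { args, pos: 0 }
    }

    /// Return the next argument, if any, and advance the cursor.
    pub fn arg(&mut self) -> Option<i64> {
        let value = self.args.get(self.pos).copied();
        if value.is_some() {
            self.pos += 1;
        }
        value
    }

    /// Run `f` on a copy of the cursor. The higher-ranked bound means `T`
    /// cannot mention `'copy`, so the copy cannot escape the closure.
    pub fn copy<F, T>(&self, f: F) -> T
    where
        F: for<'copy> FnOnce(MockVaList<'copy>) -> T,
    {
        f(MockVaList { args: self.args, pos: self.pos })
    }
}

/// Sum the remaining arguments without consuming them from `list`.
pub fn peek_sum(list: &MockVaList) -> i64 {
    list.copy(|mut copy| {
        let mut total = 0;
        while let Some(v) = copy.arg() {
            total += v;
        }
        total
    })
}
```

With this sketch, `list.copy(|copy| copy)` is rejected by the compiler, while `peek_sum` leaves the original cursor where it was.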

A function declared with `extern "C"` may accept a `VaList` parameter,
corresponding to a `va_list` parameter in the corresponding C function. For
instance, the `libc` crate could define the `va_list` variants of `printf` as
follows:

```rust
extern "C" {
    // Functions declared in an `extern` block are implicitly `unsafe` to call.
    pub fn vprintf(format: *const c_char, ap: VaList) -> c_int;
    pub fn vfprintf(stream: *mut FILE, format: *const c_char, ap: VaList) -> c_int;
    pub fn vsprintf(s: *mut c_char, format: *const c_char, ap: VaList) -> c_int;
    pub fn vsnprintf(s: *mut c_char, n: size_t, format: *const c_char, ap: VaList) -> c_int;
}
```

Note that, per the C semantics, after passing `VaList` to these functions, the
caller can no longer use it, hence the use of the `VaList` type to take
ownership of the object. To continue using the object after a call to these
functions, use `VaList::copy` to pass a copy of it instead.

Conversely, an `unsafe extern "C"` function written in Rust may accept a
`VaList` parameter, to allow implementing the `v` variants of such functions in
Rust. Such a function must not specify the lifetime.

Defining a variadic function, or calling any of these new functions, requires a
feature-gate, `c_variadic`.

Sample Rust code exposing a variadic function:

```rust
#![feature(c_variadic)]

#[no_mangle]
pub unsafe extern "C" fn func(fixed: u32, mut args: ...) {
    let x: u8 = args.arg();
    let y: u16 = args.arg();
    let z: u32 = args.arg();
    println!("{} {} {} {}", fixed, x, y, z);
}
```

Sample C code calling that function:

```c
#include <stdint.h>

void func(uint32_t fixed, ...);

int main(void)
{
    uint8_t x = 10;
    uint16_t y = 15;
    uint32_t z = 20;
    func(5, x, y, z);
    return 0;
}
```

Compiling and linking these two together will produce a program that prints:

```text
5 10 15 20
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

LLVM already provides a set of intrinsics, implementing `va_start`, `va_arg`,
`va_end`, and `va_copy`. The compiler will insert a call to the `va_start`
intrinsic at the start of the function to provide the `VaList` argument (if
used), and a matching call to the `va_end` intrinsic on any exit from the
function. The implementation of `VaList::arg` will call `va_arg`. The
implementation of `VaList::copy` will call `va_copy`, and then `va_end` after
the closure exits.
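
The pairing of `va_start` with a `va_end` on every exit path can be modeled with a scope guard. This is only a sketch of the intended semantics, not the proposed mechanism: the RFC has the compiler insert these calls itself, and the `Log`, `EndGuard`, and `variadic_body` names are hypothetical stand-ins that record events instead of calling the LLVM intrinsics.

```rust
use std::cell::RefCell;

/// Hypothetical event log standing in for calls to the real intrinsics.
pub type Log = RefCell<Vec<&'static str>>;

/// Guard that records `va_end` when it goes out of scope.
struct EndGuard<'a> {
    log: &'a Log,
}

impl<'a> Drop for EndGuard<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("va_end");
    }
}

/// Model of a variadic function body: `va_start` on entry, `va_end` on any exit.
pub fn variadic_body(log: &Log, bail_early: bool) -> u32 {
    log.borrow_mut().push("va_start");
    let _guard = EndGuard { log };
    if bail_early {
        // The guard is still dropped on this early return.
        return 0;
    }
    log.borrow_mut().push("va_arg");
    1
}
```

Both the early return and the normal return end with a `va_end` entry in the log, which is the property the compiler-inserted calls are meant to guarantee.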

`VaList` may become a language item (`#[lang="VaList"]`) to attach the
appropriate compiler handling.

The compiler may need to handle the type `VaList` specially, in order to
provide the desired parameter-passing semantics at FFI boundaries. In
particular, some platforms define `va_list` as a single-element array, such
that declaring a `va_list` allocates storage, but passing a `va_list` as a
function parameter occurs by pointer. The compiler must arrange to handle both
receiving and passing `VaList` parameters in a manner compatible with the C
ABI.
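
As an illustration of the single-element array case, the x86-64 System V ABI defines `va_list` this way. The sketch below mirrors that layout in Rust to show the difference between declaring storage and passing a parameter. The names `VaListTag`, `VaListStorage`, and `VaListParam` are hypothetical and not part of this RFC, and other platforms use different layouts.

```rust
use std::mem::size_of;
use std::os::raw::{c_uint, c_void};

/// One `va_list` element as laid out by the x86-64 System V ABI.
#[repr(C)]
pub struct VaListTag {
    pub gp_offset: c_uint,
    pub fp_offset: c_uint,
    pub overflow_arg_area: *mut c_void,
    pub reg_save_area: *mut c_void,
}

/// Declaring a `va_list` in C allocates a one-element array of the struct.
pub type VaListStorage = [VaListTag; 1];

/// Passing a `va_list` to a C function passes a pointer to that element,
/// because array parameters decay to pointers.
pub type VaListParam = *mut VaListTag;

/// Bytes reserved by a `va_list` declaration.
pub fn storage_size() -> usize {
    size_of::<VaListStorage>()
}

/// Bytes occupied by a `va_list` function parameter.
pub fn param_size() -> usize {
    size_of::<VaListParam>()
}
```

On an x86-64 System V target, `storage_size()` is 24 bytes while `param_size()` is 8, which is why the compiler cannot treat `VaList` like an ordinary by-value struct at FFI boundaries.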

The C standard requires that the call to `va_end` for a `va_list` occur in the
same function as the matching `va_start` or `va_copy` for that `va_list`. Some
@fstirlitz fstirlitz Sep 8, 2017

The standard only says

The va_end macro facilitates a normal return from the function whose variable argument list was referred to by the expansion of the va_start macro, or the function containing the expansion of the va_copy macro, that initialized the va_list ap. [...]

[I]f the va_end macro is not invoked before the return, the behavior is undefined.

which I take to mean that you're allowed to pass a va_list *pap to a function which will call va_end(*pap) for you. Although I don't imagine this ever happening at cross-module boundary, so you might get away with ignoring this possibility.

Forget what I said.

Each invocation of the va_start and va_copy macros shall be matched by a corresponding invocation of the va_end macro in the same function.

C implementations do not enforce this requirement, allowing for functions that
call `va_end` on a passed-in `va_list` that they did not create. This RFC does
not define a means of implementing or calling non-standard functions like these.

Note that on some platforms, these LLVM intrinsics do not fully implement the
necessary functionality, expecting the invoker of the intrinsic to provide
additional LLVM IR code. On such platforms, rustc will need to provide the
appropriate additional code, just as clang does.

This RFC intentionally does not specify or expose the mechanism used to limit
the use of `VaList::arg` only to specific types. The compiler should provide
errors similar to those associated with passing types through FFI function
calls.

# Drawbacks
[drawbacks]: #drawbacks

This feature is highly unsafe, and requires carefully written code to extract
the appropriate argument types provided by the caller, based on whatever
arbitrary runtime information determines those types. However, in this regard,
this feature provides no more unsafety than the equivalent C code, and in fact
provides several additional safety mechanisms, such as automatic handling of
type promotions, lifetimes, copies, and cleanup.

# Rationale and Alternatives
[alternatives]: #alternatives

This represents one of the few C-compatible interfaces that Rust does not
provide. Currently, Rust code wishing to interoperate with C has no alternative
to this mechanism, other than hand-written C stubs. This also limits the
ability to incrementally translate C to Rust, or to bind to C interfaces that
expect variadic callbacks.

Rather than having the compiler invent an appropriate lifetime parameter, we
could simply require the unsafe code implementing a variadic function to avoid
ever allowing the `VaList` structure to outlive it. However, if we can provide
an appropriate compile-time lifetime check, doing so would make it easier to
correctly write the appropriate unsafe code.

Rather than naming the argument in the variadic function signature, we could
provide a `VaList::start` function to return one. This would also allow calling
`start` more than once. However, this would complicate the lifetime handling
required to ensure that the `VaList` does not outlive the call to the variadic
function.

We could use several alternative syntaxes to declare the argument in the
signature, including `...args`, or listing the `VaList` or `VaList<'a>` type
explicitly. The latter, however, would require care to ensure that code could

As someone who is not expecting to write many implementations of variadic functions, my main interest is in minimising the changes to the 'global' parts of the Rust language that this (indisputably necessary) feature involves.

I, personally, would therefore prefer taking a VaList explicitly to adding this new ... syntax.

If we take a VaList by value, there's no worry about lifetimes. Rather, there is a worry about moving that VaList to somewhere it shouldn't go. Is there something we can do to make it unmovable, so it can't escape the function which receives it? Or is that barking up completely the wrong tree?

Member Author

@tomwhoiscontrary We'd still need a syntax that distinguishes between passing around a VaList (as in functions like vprintf) and passing variable arguments interpreted as a VaList (as in functions like printf). Given the need for both syntaxes, I feel like mut args: ... is as good a syntax as any, and I'd like to avoid bikeshedding unless there's a major motivation to change it.

Also, we do want the ability to pass around the VaList. Ideally, I'd say that it could use the 'fn lifetime, if that gets adopted. But in the absence of that, I'd like to leave it unspecified and let the compiler fill it in with an equivalent.

not reference or alias the lifetime.

# Unresolved questions
[unresolved]: #unresolved-questions

When implementing this feature, we will need to determine whether the compiler
can provide an appropriate lifetime that prevents a `VaList` from outliving its
corresponding variadic function.

Currently, Rust does not allow passing a closure to C code expecting a pointer
to an `extern "C"` function. If this becomes possible in the future, then
variadic closures would become useful, and we should add them at that time.

This RFC only supports the platform's native `"C"` ABI, not any other ABI. Code
may wish to define variadic functions for another ABI, and potentially more
than one such ABI in the same program. However, such support should not
complicate the common case. LLVM has extremely limited support for this, for
only a specific pair of platforms (supporting the Windows ABI on platforms that
use the System V ABI), with no generalized support in the underlying
intrinsics. The LLVM intrinsics only support using the ABI of the containing
function. Given the current state of the ecosystem, this RFC only proposes
supporting the native `"C"` ABI for now. Doing so will not prevent the
introduction of support for non-native ABIs in the future.