Support defining C-compatible variadic functions in Rust #2137

Merged
merged 25 commits into from
Sep 29, 2017
Changes from 3 commits
Commits
019dff4
Support defining C-compatible variadic functions in Rust
joshtriplett Sep 3, 2017
3039f17
Add VaArg for size_t and ssize_t; sort and group the VaArg impls better
joshtriplett Sep 3, 2017
12eb69f
Allow calling extern "C" functions that accept a `va_list`
joshtriplett Sep 3, 2017
e6b5cfb
Drop variadic closures
joshtriplett Sep 3, 2017
e9fe291
Fix typo in sample code
joshtriplett Sep 3, 2017
247991e
Drop impls for libc, mention that type aliases cover this
joshtriplett Sep 3, 2017
0cca2ce
Make VaArg an unsafe trait
joshtriplett Sep 3, 2017
3161c45
Fix typo in sample code
joshtriplett Sep 3, 2017
7b7b3ff
Drop VaList::start; name the VaList argument instead, to improve life…
joshtriplett Sep 3, 2017
9d060c4
Declare args as `mut`
joshtriplett Sep 3, 2017
3769670
Rework the alternatives; the new syntax isn't the "alternative" anymore
joshtriplett Sep 3, 2017
63b5545
Fix another reference to VaList::start
joshtriplett Sep 3, 2017
5437ab2
Clarify the description of argument promotion
joshtriplett Sep 5, 2017
7bd8d7d
Get rid of the VaArg trait
joshtriplett Sep 5, 2017
96f80ac
Don't impl `Drop` directly; just talk about the necessary drop semantics
joshtriplett Sep 5, 2017
18a8b36
Fix `extern "C"` function declarations
joshtriplett Sep 6, 2017
7e5698e
Allow `VaList::arg` to return any type usable in an `extern "C" fn` s…
joshtriplett Sep 6, 2017
105f764
Use `extern type` to declare `VaList` (see RFC 1861)
joshtriplett Sep 7, 2017
6bdb95c
Move the mention of language items to the reference-level explanation
joshtriplett Sep 7, 2017
1546c82
Use the rustdoc-style `/* fields omitted */` syntax to declare VaList
joshtriplett Sep 7, 2017
eca3eae
Require that the function have at least one non-variadic argument
joshtriplett Sep 8, 2017
e3d1f5c
Add some clarifications on drop handling, and non-standard behavior
joshtriplett Sep 8, 2017
b2acc33
Stop using Clone; implement copy safely via a closure
joshtriplett Sep 13, 2017
4042834
Add an unresolved question on support for non-native ABIs
joshtriplett Sep 15, 2017
eb9c392
RFC 2137: Support defining C-compatible variadic functions in Rust
aturon Sep 29, 2017
287 changes: 287 additions & 0 deletions text/0000-variadic.md
@@ -0,0 +1,287 @@
- Feature Name: variadic
- Start Date: 2017-08-21
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Support defining C-compatible variadic functions in Rust, via new intrinsics.
Rust currently supports declaring external variadic functions and calling them
from unsafe code, but does not support writing such functions directly in Rust.
Adding such support will allow Rust to replace a larger variety of C libraries,
avoid requiring C stubs and error-prone reimplementation of platform-specific
code, improve incremental translation of C codebases to Rust, and allow
implementation of variadic callbacks.
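
For context, the existing capability mentioned above (declaring an external variadic function and calling it from unsafe code) might look like the following minimal sketch. It assumes a platform whose C library exports `snprintf`; the helper name `format_pair` is made up for illustration and is not part of this RFC.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

// Declaration of the C library's variadic `snprintf`.
// (`unsafe extern` is accepted since Rust 1.82; older toolchains write `extern "C"`.)
unsafe extern "C" {
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: formats two integers through C's `snprintf`.
pub fn format_pair(a: c_int, b: c_int) -> String {
    let mut buf = [0 as c_char; 64];
    let fmt = b"%d and %d\0";
    // SAFETY: the buffer length is passed along, the format string is
    // NUL-terminated, and the two variadic arguments match the two `%d`s.
    unsafe {
        snprintf(buf.as_mut_ptr(), buf.len(), fmt.as_ptr() as *const c_char, a, b);
        CStr::from_ptr(buf.as_ptr()).to_string_lossy().into_owned()
    }
}
```

Only the calling side is shown; defining such a function in Rust is what this RFC proposes to add.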

# Motivation
[motivation]: #motivation

Rust can currently call any possible C interface, and export *almost* any
interface for C to call. Variadic functions represent one of the last remaining
gaps in the latter. Currently, providing a variadic function callable from C
requires writing a stub function in C, linking that function into the Rust
program, and arranging for that stub to subsequently call into Rust.
Furthermore, even with the arguments packaged into a `va_list` structure by C
code, extracting arguments from that structure requires exceptionally
error-prone, platform-specific code, for which the crates.io ecosystem provides
only partial solutions for a few target architectures.

This RFC does not propose an interface intended for native Rust code to pass
variable numbers of arguments to a native Rust function, nor an interface that
provides any kind of type safety. This proposal exists primarily to allow Rust
to provide interfaces callable from C code.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

C code allows declaring a function callable with a variable number of
arguments, using an ellipsis (`...`) at the end of the argument list. For
compatibility, unsafe Rust code may export a function compatible with this
mechanism.

Such a declaration looks like this:

```rust
pub unsafe extern "C" fn func(arg: T, arg2: T2, ...) {
// implementation
}
```

The `...` at the end of the argument list declares the function as variadic.
The function must use `extern "C"`, and must use `unsafe`. To expose
such a function as a symbol for C code to call directly, the function may want
to use `#[no_mangle]` as well; however, Rust code may also pass the function to
C code expecting a function pointer to a variadic function.
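
As a rough illustration of what a pointer to a variadic function looks like on the Rust side, the sketch below stores the C library's `snprintf` in a variadic function-pointer type and calls through it. It uses a C function because defining a variadic function in Rust requires the feature proposed here; the names `VariadicFormatter`, `format_through`, and `demo` are assumptions made for this example.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

// (`unsafe extern` is accepted since Rust 1.82; older toolchains write `extern "C"`.)
unsafe extern "C" {
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Pointer type for a variadic function with `snprintf`'s shape.
/// The alias name is made up for this sketch.
type VariadicFormatter =
    unsafe extern "C" fn(*mut c_char, usize, *const c_char, ...) -> c_int;

/// Hypothetical helper: calls whatever variadic function it is handed.
pub fn format_through(formatter: VariadicFormatter, value: c_int) -> String {
    let mut buf = [0 as c_char; 32];
    let fmt = b"value=%d\0";
    // SAFETY: the buffer length is passed along, the format string is
    // NUL-terminated, and the single variadic argument matches `%d`.
    unsafe {
        formatter(buf.as_mut_ptr(), buf.len(), fmt.as_ptr() as *const c_char, value);
        CStr::from_ptr(buf.as_ptr()).to_string_lossy().into_owned()
    }
}

/// Hypothetical wrapper: the function item `snprintf` coerces to the pointer type.
pub fn demo() -> String {
    format_through(snprintf, 7)
}
```

A variadic function defined in Rust under this proposal would presumably coerce to the same kind of pointer type.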
Contributor

Should there be a requirement of at least one parameter before ...? Such a requirement exists in C (not C++ however), and considering this is a FFI feature, it sounds reasonable to copy this requirement.

Member Author

@joshtriplett joshtriplett Sep 7, 2017

The LLVM intrinsics don't require this, but it seems reasonable as a precaution, sure. Done.


Unsafe Rust code can also define a variadic closure:

```rust
let closure = |arg, arg2, ...| {
// implementation
};
Member

You can't cast a closure to extern "C" fn, so I don't think supporting variadic closure is necessary in this RFC.

fn main() {
    unsafe {
        let _x = (|| {}) as extern fn(); //~ ERROR E0605
    }
}

Contributor

@jethrogb jethrogb Sep 3, 2017

Plus, I think extern "C" variadic functions should always be unsafe functions, and closures currently can't be unsafe.

Member Author

I didn't realize this; I'd assumed that RFC 1558 made this possible, but digging through it, it doesn't allow coercing to extern fn, only to fn. I filed rust-lang/rust#44291 to allow coercing to extern fn, but in the meantime, I'll drop this for now and move it to an open question for the future. Would have been nice for more convenient callbacks.

```

Rust code may pass a variadic closure to C code expecting a pointer to a
variadic function.

To access the arguments, Rust provides the following public interfaces in
`core::intrinsics` (also available via `std::intrinsics`):

```rust
/// The argument list of a C-compatible variadic function, corresponding to the
/// underlying C `va_list`. Opaque.
pub struct VaList<'a>;

impl<'a> VaList<'a> {
/// Obtain the variable arguments of the current function. Produces a
/// compile-time error if called from a non-variadic function. The compiler
/// will supply the appropriate lifetime when called, and prevent that
/// lifetime from outliving the variadic function.
pub unsafe fn start() -> VaList<'a>;

/// Extract the next argument from the argument list.
pub unsafe fn arg<T: VaArg>(&mut self) -> T;
}

impl<'a> Clone for VaList<'a>;
impl<'a> Drop for VaList<'a>;

/// The type of arguments extractable from VaList
trait VaArg;
Member

This could be an unsafe trait?

Member Author

Done.


impl VaArg for i8;
impl VaArg for i16;
impl VaArg for i32;
impl VaArg for i64;
impl VaArg for isize;

impl VaArg for u8;
impl VaArg for u16;
impl VaArg for u32;
impl VaArg for u64;
impl VaArg for usize;

impl VaArg for f32;
impl VaArg for f64;

impl<T> VaArg for *const T;
impl<T> VaArg for *mut T;
```

The `libc` crate additionally provides implementations of the `VaArg` trait for
the raw C types corresponding to the Rust integer and float types above:

```rust
impl VaArg for c_char;
Member

Unless these type aliases became concrete types, I don't think the std and libc crates need to provide any additional impls.

Member Author

Fixed.

impl VaArg for c_schar;
impl VaArg for c_uchar;

impl VaArg for c_short;
impl VaArg for c_ushort;

impl VaArg for c_int;
impl VaArg for c_uint;

impl VaArg for c_long;
impl VaArg for c_ulong;

impl VaArg for c_longlong;
impl VaArg for c_ulonglong;

impl VaArg for c_float;
impl VaArg for c_double;

impl VaArg for int8_t;
impl VaArg for int16_t;
impl VaArg for int32_t;
impl VaArg for int64_t;

impl VaArg for uint8_t;
impl VaArg for uint16_t;
impl VaArg for uint32_t;
impl VaArg for uint64_t;

impl VaArg for size_t;
impl VaArg for ssize_t;
```

Note that extracting an argument from a `VaList` follows the platform-specific
rules for argument passing and promotion. In particular, many platforms promote
any argument smaller than a C `int` to an `int`. On such platforms, extracting

This is true for all platforms, it's required by the C standard.

If the argument has integral or enumeration type that is subject to the integral promotions, or a floating-point type that is subject to the floating-point promotion, the value of the argument is converted to the promoted type before the call.

Basically, any integral type smaller than an int is converted to int, and any floating-point type smaller than a double (basically float, or short float for implementations that provide the latter) is converted to double.

Side note, this paragraph points out something important:

After these conversions, if the argument does not have arithmetic, enumeration, pointer, pointer to member, or class type, the program is ill-formed.

emphasis on class type and enumeration - this means that it should be valid for programs to impl VaArg for their #[repr(C|uN|iN)] structure and simple enumeration types.

I'd also argue that VaArg should be implemented for &T, &mut T, Option<&T>, and Option<&mut T>. This is all unsafe anyways :)

Member Author

I'll phrase the bit about promotions more carefully, and put some thought into the structure and enum cases.

I'd prefer not to implement VaArg for references; there are plenty of ways to convert a pointer to a reference in unsafe code, if you really want a reference, and those ways allow you to inject the appropriate lifetime of that reference more easily.

@strega-nil strega-nil Sep 4, 2017

I prefer not treating references specially - they are just pointers after all. And it just makes sense sometimes - if you're implementing printf:

#[no_mangle]
unsafe extern "C" fn printf(fmt: &CStr, args: ...) -> c_int {
  for ch in fmt {
    // ...
    // found a %s:
    {
      print!("{}", args.next::<&CStr>());
    }
  }
}

Contributor

Except &CStr is a fat pointer.

@strega-nil strega-nil Sep 4, 2017

@jethrogb for now ;)

Member Author

@ubsan If, at some point in the future, there's a reasonable possibility of extracting a reference directly from a variadic function argument and getting a reasonable result, then we can add such a mechanism at that time. Until then, though, I'd like to stick with only allowing raw pointers.

Member Author

@joshtriplett joshtriplett Sep 5, 2017

@ubsan I clarified the RFC's details about promotion to take your explanation into account, and to avoid suggesting "platform-specific".

@strega-nil strega-nil Sep 6, 2017

@joshtriplett I disagree with the reasoning there. There is no reason not to allow references. You almost never actually want raw pointers when dealing with C; you want Option<&T> (and given that references are valid in C functions...)

the corresponding type will extract an `int` and convert appropriately.
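
The promotion rules are already visible on the calling side today. Rust performs no implicit promotion when calling a variadic function, so the caller writes the promoted types out by hand. The following is a minimal sketch, assuming a C library that exports `snprintf`; the helper name `format_promoted` is hypothetical.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_double, c_int};

// (`unsafe extern` is accepted since Rust 1.82; older toolchains write `extern "C"`.)
unsafe extern "C" {
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: passes a `u8` and an `f32` through C varargs.
pub fn format_promoted(small: u8, single: f32) -> String {
    let mut buf = [0 as c_char; 64];
    let fmt = b"%d %.1f\0";
    // SAFETY: the buffer length is passed along, the format string is
    // NUL-terminated, and the arguments match `%d` and `%f`.
    unsafe {
        snprintf(
            buf.as_mut_ptr(),
            buf.len(),
            fmt.as_ptr() as *const c_char,
            small as c_int,     // C would promote an 8-bit integer to `int`
            single as c_double, // C would promote `float` to `double`
        );
        CStr::from_ptr(buf.as_ptr()).to_string_lossy().into_owned()
    }
}
```

The callee then reads an `int` and a `double`, which is the promoted form that argument extraction has to account for.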

Like the underlying platform `va_list` structure in C, `VaList` has an opaque,
platform-specific representation.

A variadic function may call `VaList::start` more than once, and traverse the
argument list more than once.

A variadic function may pass the `VaList` to another function. However, it may
not return the `VaList` or otherwise allow it to outlive that call to the
variadic function.

A function declared with `extern "C"` may accept a `VaList` parameter,
corresponding to a `va_list` parameter in the corresponding C function. For
instance, the `libc` crate could define the `va_list` variants of `printf` as
follows:

```rust
pub unsafe extern "C" fn vprintf(format: *const c_char, ap: VaList) -> c_int;
pub unsafe extern "C" fn vfprintf(stream: *mut FILE, format: *const c_char, ap: VaList) -> c_int;
pub unsafe extern "C" fn vsprintf(s: *mut c_char, format: *const c_char, ap: VaList) -> c_int;
pub unsafe extern "C" fn vsnprintf(s: *mut c_char, n: size_t, format: *const c_char, ap: VaList) -> c_int;
```

Defining a variadic function, or calling any of these new functions, requires a
feature-gate, `c_variadic`.

Sample Rust code exposing a variadic function:

```rust
#![feature(c_variadic)]
use std::intrinsics::{VaArg, VaList};

#[no_mangle]
pub unsafe extern "C" fn func(fixed: u32, ...) {
let args = VaList::start();
let x: u8 = args::arg();
let y: u16 = args::arg();
let z: u32 = args::arg();
Member

You mean

let mut args = VaList::start();
let x: u8 = args.arg();
let y: u16 = args.arg();
let z: u32 = args.arg();

?

Member Author

Good catch, thank you.

println!("{} {} {} {}", fixed, x, y, z);
}
```

Sample C code calling that function:

```c
#include <stdint.h>

void func(uint32_t fixed, ...);

int main(void)
{
uint8_t x = 10;
uint16_t y = 15;
uint32_t z = 20;
func(5, x, y, z);
return 0;
}
```

Compiling and linking these two together will produce a program that prints:

```text
5 10 15 20
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

LLVM already provides a set of intrinsics, implementing `va_start`, `va_arg`,
`va_end`, and `va_copy`. The implementation of `VaList::start` will call the
Member

Since you've changed the RFC to args: ..., the VaList::start function should no longer be mentioned. Just say va_start will be automatically inserted at the start of the function.

`va_start` intrinsic. The implementation of `VaList::arg` will call `va_arg`.
The implementation of `Clone` for `VaList` will call `va_copy`. The
implementation of `Drop` for `VaList` will call `va_end`.

This RFC intentionally does not specify the mechanism used to implement the
`VaArg` trait, as the compiler may need to natively implement `VaList::arg`
with appropriate understanding of platform-specific conventions. Code outside
of `core`, `std`, and `libc` may not implement this trait for any other type.

Note that on some platforms, these LLVM intrinsics do not fully implement the
necessary functionality, expecting the invoker of the intrinsic to provide
additional LLVM IR code. On such platforms, rustc will need to provide the
appropriate additional code, just as clang does.

# Drawbacks
[drawbacks]: #drawbacks

This feature is highly unsafe, and requires carefully written code to extract
the appropriate argument types provided by the caller, based on whatever
arbitrary runtime information determines those types. However, in this regard,
this feature provides no more unsafety than the equivalent C code, and in fact
provides several additional safety mechanisms, such as automatic handling of
type promotions, lifetimes, copies, and destruction.

# Rationale and Alternatives
[alternatives]: #alternatives

This represents one of the few C-compatible interfaces that Rust does not
provide. Currently, Rust code wishing to interoperate with C has no alternative
to this mechanism, other than hand-written C stubs. This also limits the
ability to incrementally translate C to Rust, or to bind to C interfaces that
expect variadic callbacks.

Rather than having the compiler invent an appropriate lifetime parameter, we
could simply require the unsafe code implementing a variadic function to avoid
ever allowing the `VaList` structure to outlive it. However, if we can provide
an appropriate compile-time lifetime check, doing so would make it easier to
correctly write the appropriate unsafe code.

Rather than defining a `VaList::start` function, we could require specifying a
name along with the `...`:

```rust
pub unsafe extern "C" fn func(fixed: u32, ...args) {
Member

VaList::start should now be an alternative itself. This is at best an "alternative syntax".

// implementation
}
```

This might simplify the provision of an appropriate lifetime, and would avoid
the need to provide a `VaList::start` function and only allow calling it from
within a variadic function.

However, such an approach would not expose any means of calling `va_start`
multiple times in the same variadic function. Note that doing so has a
different semantic than calling `va_copy`, as calling `va_start` again iterates
over the arguments from the beginning rather than the current point. Given that
this mechanism exists for the sole purpose of interoperability with C, more
closely matching the underlying C interface seems appropriate.

# Unresolved questions
[unresolved]: #unresolved-questions

When implementing this feature, we will need to determine whether the compiler
can provide an appropriate lifetime that prevents a `VaList` from outliving its
corresponding variadic function.