Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise the FFI chapters #190

Merged

Conversation

Michael-F-Bryan
Copy link
Contributor

This is a continuation of #106 which addresses some of the items pointed out in the review.

Fixes #171.

@simonsan simonsan added the C-fix Category: Fixing broken or erroneous content label Jan 7, 2021
@Michael-F-Bryan
Copy link
Contributor Author

Michael-F-Bryan commented Jan 7, 2021

@jhwgh1968 how do you think I should approach the CString example in the "accepting strings" chapter?

I feel like the example is too contrived to show how you might use CString. It essentially reduces down to

fn mylib_log(msg: &str) {
  let msg: String = msg.to_owned();
  crate::log(&msg, ...);
}

... which isn't really code that people would write.

From my experience you don't really use CString when accepting null-terminated strings. Normally you'd keep them around as a &CStr or convert to a String with something like CStr::from_ptr(...).to_str().unwrap().to_owned().

The only example I can think of is when you need to store the string somewhere and keep it around after the function ends, but even then I'd use CStr::from_raw(...).to_owned() instead of copying to a Vec and calling CString::new(...).unwrap().

The to_owned() route also neatly side-steps the issue of not copying the null-terminator across.

@Michael-F-Bryan
Copy link
Contributor Author

Michael-F-Bryan commented Jan 7, 2021

There are actually two orthogonal concepts that we'll want to address in the error handling chapter. One is returning the equivalent of Rust's Result<T, E>, and the other is the myriad of different conventions that C uses. The current chapter only handles Result<(), E> and doesn't talk about how you return both an error code and a value.

For example, how would write FFI bindings for something like this?

fn parse_elf(filename: *const c_str) -> Result<ElfBinary, ParseError>;

struct ElfBinary { ... }
enum ParseError { ... }

Some possible solutions are:

typedef struct ElfBinary ElfBinary;

// null on parse failed
ElfBinary *parse_elf(const char *filename);

// out parameters and result enum
typedef enum ParseResult {
  OK = 0,
  ...
} ParseResult;
ParseResult parse_elf(const char *filename, ElfBinary **parsed);

// out parameters and boolean return
boolean parse_elf(const char *filename, ElfBinary **parsed);

// errno-style and null
extern int errno;
const char *last_error_msg();
ElfBinary *parse_elf(const char *filename);

// everything is out parameters
typedef enum ParseResult {
  OK = 0,
  ...
} ParseResult;
void parse_elf(const char *filename, ElfBinary **parsed, ParseResult *result);

@jhwgh1968
Copy link
Contributor

how do you think I should approach the CString example in the "accepting strings" chapter?

I feel like the example is too contrived to show how you might use CString.

It is a little contrived, but I could not do a good translation of what I had.

My goal with that alternative was to illustrate a thought process I actually had in the beginning: always copy all pointers to a Rust-owned value ASAP, and then never touch the pointers again, no matter what type they were. That is what I wanted to discourage.

The only example I can think of is when you need to store the string somewhere and keep it around after the function ends, but even then I'd use CStr::from_raw(...).to_owned() instead of copying to a Vec and calling CString::new(...).unwrap().

I had code that did a bunch of pointer arithmetic, so I put it in. But that was because I had C code which did that, and it stayed around for quite some time before I completely rewrote it. I agree that most Rust programmers wouldn't think of something that backwards, and it might not be worth putting in.

In fact, your snippet seems like a good thing to replace my alternative with: "if you really want an owned one, use this. Don't do pointer arithmetic or null byte termination yourself, because it's error prone, and this takes care of it."

There are actually two orthogonal concepts that we'll want to address in the error handling chapter. One is returning the equivalent of Rust's Result<T, E>, and the other is the myriad of different conventions that C uses. The current chapter only handles Result<(), E> and doesn't talk about how you return both an error code and a value.

I tried to suggest (and wasn't very clear) that the "value" was what was returned by db_error_description. I forgot to put in a db_parse_error_details function to match it, which would return a parse_error struct by value in order to match it.

For example, how would write FFI bindings for something like this?

fn parse_elf(filename: *const c_str) -> Result<ElfBinary, ParseError>;

struct ElfBinary { ... }
enum ParseError { ... }

The pattern I wrote would say one of two things: if it's opaque, return it as a pointer. If it's transparent, use an out parameter and an integer code (either non/zero, or an error code number).

/* elfbin.h */
// if transparent
int elf_bin_new(const char* filename, **ElfBinary out_elf);

// if opaque, returns NULL pointer on error
*ElfBinary elf_bin_new(const char *filename);

// either way, you can get more information about the error message with this
//
// pass a pointer to an ElfBinary that was constructed to get the last error from its call.
// pass NULL pointer in order to get the last error from a global call, like "new."
const char *elf_bin_error(*ElfBinary e);

Suffice it to say, I'm trying to avoid doing what some libraries do, which is to overwrite errno, or interact with any C global state.

@simonsan simonsan added A-anti_pattern Area: Content about Anti-Patterns A-pattern Area: Content about Patterns A-idiom Area: Idioms and removed A-anti_pattern Area: Content about Anti-Patterns labels Jan 23, 2021
@MarcoIeni
Copy link
Collaborator

always copy all pointers to a Rust-owned value ASAP, and then never touch the pointers again, no matter what type they were. That is what I wanted to discourage.

This is quite nice, we want to explain it in the book?

Anyway here you are talking about another part of the book (error handling), which is not really related to this PR, right?
Because if you want once we solve the comments I left we could merge this and address other problems in separate PRs/issues.

@jhwgh1968
Copy link
Contributor

always copy all pointers to a Rust-owned value ASAP, and then never touch the pointers again, no matter what type they were. That is what I wanted to discourage.

Anyway here you are talking about another part of the book (error handling), which is not really related to this PR, right?

I was talking about what motivated a previous example in Accepting Strings, which in retrospect was very convoluted.

It is a general idea, but Accepting Strings was where I put it, because that's where it was most notable in my early code. Strings come in as const c_char *, and I was copying them into Rust owned Strings. I knew it was a lot of work, but I thought it was "safer," when I was worried about splitting ownership of the old C code. That is what I wanted to discourage.

I do not know whether it is worth a mention in Accepting Strings or not, if it is moved to a general principle. I would approve this PR either way.

@simonsan
Copy link
Collaborator

@jhwgh1968 @Michael-F-Bryan Hey you both, any updates on this? Asking because we're probably changing the patterns subfolder structure so ffi has it's own subfolder, this would make a rebase needed probably with some conflicts to solve. Just for letting you know. ;)

@Michael-F-Bryan
Copy link
Contributor Author

I'm going to close this PR because I don't have the time to complete it and didn't really want to invest much more than that initial review.

@MarcoIeni
Copy link
Collaborator

Hi Michael, I reopen this and I assign it to me. I will continue the work you have started :)

@MarcoIeni MarcoIeni self-assigned this Feb 22, 2021
@MarcoIeni MarcoIeni marked this pull request as ready for review February 24, 2021 21:49
@MarcoIeni MarcoIeni merged commit 96fa89b into rust-unofficial:master Feb 24, 2021
@MarcoIeni
Copy link
Collaborator

Merged this! If you think something valuable from the above comments needs to be discussed further, please open an issue or another PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-idiom Area: Idioms A-pattern Area: Content about Patterns C-fix Category: Fixing broken or erroneous content
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FFI Best Practices Examples Contain Unsoundness and Other Not Great Things
4 participants