-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
type safety: ffi pointer manipulation API #207
Conversation
3d8c76f
to
8d0c437
Compare
8d0c437
to
edd60e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, looks good.
Rebased on master |
Previously `CassDataType` was just an enum, held inside `Arc`. User was given a pointer to `CassDataType` using `Arc::as_ptr` or `Arc::into_ptr`. There are however some functions that mutate the data - and they were given the very same pointers. Current code was most likely sound - but I'm not completely sure, Rust reference is very confusing in this aspect. It was however very confusing - when a programmer reads or writes a function that that *mut CassDataType it is not obivious that this data lies inside Arc and so has shared ownership. To make this more explicit this commit puts `CassDataType` inside UnsafeCell. Now each access needs to use `.get_unchecked()` and `.get_mut_unchecked()` methods and an unsafe block / function, so it will be easier to spot aliasing ^ mutability problems in the future. In the future we can use `Arc::get_mut_unchecked()` for this purpose, but it's not yet stabilised.
Implementation of `PartialEq` for `CassDataType` hides a possibly unsafe operation. Let's make sure that we do not depend on it in the code - only use it for test purposes.
The reasons behind the API (and this PR) are following: - reduce the lifetime of reference obtained from the pointer Current pointer-to-ref conversion API (ptr_to_ref_[mut]) returns a reference with a static lifetime. New API is parameterized by a lifetime, which is not 'static. - Ensure that IF the memory associated with the pointer was allocated a certain way, the raw pointer will be converted to the corresponding Rust pointer primitive i.e. Box or Arc (or reference if the pointer was not obtained from an explicit allocation). However, this does not give us any guarantees about the origin of the pointer. Consider following example: // Rust impl ArcFFI for Foo {} fn extern "C" f1() -> *const Foo { // a pointer to stack variable // Also applies to some valid pointer obtained from the reference // to the field of some already heap-allocated object. // Decided to go with a stack variable to keep the example simple. let foo = Foo; &foo } fn extern "C" f2(foo: *const Foo) { let foo = ArcFFI::cloned_from_ptr(foo); } // C Foo *foo = f1(); f2(foo); // Segfault. // Even if f1() returned a valid pointer, that points to some // heap-allocated memory. The pointer was not obtained from an Arc allocation. // I.e., it was not obtained via Arc::into_raw(). To guarantee this, we need to introduce a special type for pointer that would represent the pointer's properties. This will be done in a follow-up PR.
Implemented new traits for the types shared between C and Rust. Adjusted all places where ptr-to-ref (and vice-versa) conversions appear to use the new traits API.
v2: addressed @wprzytula comments |
pub unsafe extern "C" fn cass_data_type_set_type_name( | ||
data_type: *mut CassDataType, | ||
data_type: *const CassDataType, | ||
type_name: *const c_char, | ||
) -> CassError { | ||
cass_data_type_set_type_name_n(data_type, type_name, strlen(type_name)) | ||
} | ||
|
||
#[no_mangle] | ||
pub unsafe extern "C" fn cass_data_type_set_type_name_n( | ||
data_type_raw: *mut CassDataType, | ||
data_type_raw: *const CassDataType, | ||
type_name: *const c_char, | ||
type_name_length: size_t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Can you remind me / cite some source on what are the semantics of *mut
vs *const
? CassDataType* data_type
is used on C side, which is not a const object, right? If that is the case, why do we use *const
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This applied to other places where *mut
was changed to *const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantics of *const
means that no one is going to modify the pointee through the pointer. Here, this is not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does modify mean? Let's say we get a *const Mutex<something>
. Are we allowed to lock the mutex and modify its contents? It only requires creating &
from *const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modify on structs is defined by induction, by modifying one of its fields. And modyfing a value of a primitive type involves changing bits of their memory representation.
However, interior mutability is what makes an exception: if a type allows for internal mutability (in Rust, it basically means that it holds an UnsafeCell
), then mutation inside that type are not considered mutation from outside. So answering your question, we are allowed to lock the Mutex through *const Mutex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that Wojciech answered your question. Notice, that all these *const
and *mut
will not matter after a follow-up PR, since raw pointers are going to be replaced by our new type representing a pointer.
CassDataType interior mutability
CassDataType
is the only type that is both shared, and mutable. We use of interior mutability pattern for this type. Even if the current code is sound in this matter, we should wrap the underlying data inUnsafeCell
, to make (possibly)invalid accesses to it more verbose and easier to detect by forcing the user to use an explicit
unsafe
block.[Box/Arc/Ref]FFI API
The main motivation behind this PR is to:
reduce the lifetime of reference obtained from the pointer
Current pointer-to-ref conversion API (ptr_to_ref_[mut]) returns a
reference with a static lifetime. New API is parameterized by a lifetime,
which is not 'static.
ensure that IF the memory associated with the pointer was allocated a certain way,
the raw pointer will be converted to the corresponding Rust pointer primitive i.e. Box or Arc
(or reference if the pointer was not obtained from an explicit allocation).
This is achieved, by (in addition, to the new API) disallowing the usage of
[Box/Arc]
API that operates on raw pointers.However, this does not give us any guarantees about the origin of the pointer.
Consider following example:
// Rust
// C
To guarantee this, we need to introduce a special type for pointer that
would represent the pointer's properties. This will be done in a follow-up PR.
Pre-review checklist
[ ] I have implemented Rust unit tests for the features/changes introduced.[ ] I have enabled appropriate tests in.github/workflows/build.yml
ingtest_filter
.[ ] I have enabled appropriate tests in.github/workflows/cassandra.yml
ingtest_filter
.