Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use RefCell to enforce aliasing rules #342

Closed
konstin opened this issue Feb 7, 2019 · 5 comments · Fixed by #770
Closed

Use RefCell to enforce aliasing rules #342

konstin opened this issue Feb 7, 2019 · 5 comments · Fixed by #770

Comments

@konstin
Copy link
Member

konstin commented Feb 7, 2019

Currently pyo3 allows to break rust's aliasing guarantees, i.e. you can get two mutable references to the same object. This is not a big issue because you can only have those two references on the same thread, but this still needs to be fixed eventually. See this capter in wasm bindgen's guide for a good write-up on the why and how.

@athre0z
Copy link
Contributor

athre0z commented Apr 16, 2019

Another issue closely related to this are mutable Python native types where PyO3 returns references to the internal buffer, e.g.:

extern crate pyo3;
use pyo3::{prelude::*, wrap_pyfunction};

#[pyfunction]
fn sum_as_string() -> PyResult<()> {
    let gil = Python::acquire_gil();
    let py = gil.python();

    let byte_arr = pyo3::types::PyByteArray::new(py, &[1, 2, 3]);
    let byte_arr_data = byte_arr.data();
    println!("Pre mut: {:#?}", byte_arr_data);

    use pyo3::types::IntoPyDict;
    let d = [("byte_arr", byte_arr)].into_py_dict(py);
    py.run("byte_arr.clear()", None, Some(&d)).unwrap();

    println!("oh shit ...");
    println!("Post mut: {:#?}", byte_arr_data);

    Ok(())
}

/// This module is a python module implemented in Rust.
#[pymodule]
fn pyo3bug(py: Python, m: &PyModule) -> PyResult<()> {
    m.add_wrapped(wrap_pyfunction!(sum_as_string))?;

    Ok(())
}
Python 3.7.3 (default, Mar 30 2019, 03:37:43)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pyo3bug

In [2]: pyo3bug.sum_as_string()
Pre mut: [
    1,
    2,
    3,
]
oh shit ...
Post mut: [
    216,
    67,
    13,
]

I don't think this could be fixed using just a RefCell, could it? We would have to proxy the actual native type through something in our control to achieve this, which sounds like an unreasonable amount of complexity. Perhaps methods returning references into mutable Python objects should just be scrapped or replaced by visitor style APIs. Not really idiomatic, perhaps there is a better way.

@ExpHP
Copy link

ExpHP commented Jun 27, 2019

Here's what rust_cpython does:

/// Gets the buffer memory as a slice.
///
/// This function succeeds if:
/// * the buffer format is compatible with `T`
/// * alignment and size of buffer elements is matching the expectations for type `T`
/// * the buffer is C-style contiguous
///
/// The returned slice uses type `Cell<T>` because it's theoretically possible for any call into the Python runtime
/// to modify the values in the slice.
pub fn as_slice<'a, T: Element>(
    &'a self,
    _py: Python<'a>,
) -> Option<&'a [ReadOnlyCell<T>]>;

/// Gets the buffer memory as a slice.
///
/// This function succeeds if:
/// * the buffer is not read-only
/// * the buffer format is compatible with `T`
/// * alignment and size of buffer elements is matching the expectations for type `T`
/// * the buffer is C-style contiguous
///
/// The returned slice uses type `Cell<T>` because it's theoretically possible for any call into the Python runtime
/// to modify the values in the slice.
pub fn as_mut_slice<'a, T: Element>(
    &'a self,
    _py: Python<'a>,
) -> Option<&'a [cell::Cell<T>]>

@nagisa
Copy link
Contributor

nagisa commented Jul 27, 2019

Given that all python objects happen to have a reference counter in them already, this should be fairly easy to fix in the glue that exists between Python and Rust. Any attempt to downcast PyObject into &mut T should fail with some PyO3-specific exception if the reference count is not equal to 1.

@ExpHP
Copy link

ExpHP commented Jul 27, 2019

Possibly. I suspect that it would be impractical to use in practice. A refcount of 1 is nigh unattainable in any sort of realistic scenario, especially in the context of calling Rust from Python. Even something like:

def f0(x): return f1(x)
def f1(x): return f2(x)
def f2(x): return x

needs to produces additional references to x at each call level.

That said, I'm not certain what happens precisely when the interpreter calls methods and functions defined in extension modules. I know that the self object given to these methods is a borrowed reference, as are objects returned by PyArg_ParseTuple; so it is theoretically possible that the compiler does not increment refcounts at these callsites. (for methods, you'd also have to be careful to make sure a "bound method" object isn't created; e.g. f = x.method; f())

@andersk
Copy link
Contributor

andersk commented Aug 30, 2019

If we replaced the Python<'p> GIL marker with a reference &'p mut Python to a non-Cloneable type, we could safely allow operations like

impl PyByteArray {
    fn data<'p>(&'p self, py: &'p Python) -> &'p [u8];
    fn data<'p>(&'p self, py: &'p mut Python) -> &mut 'p [u8];
}

In other words,

  • Holding a shared reference to the GIL marker means that the state of the Python world is frozen and you can safely have shared references to data in the Python world.
  • Holding a unique reference to the GIL marker means that you can safely borrow data mutably from the Python world, but only once at a time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants