Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document design decision for functional API #160

Open
ion-elgreco opened this issue Jan 20, 2025 · 1 comment
Open

Document design decision for functional API #160

ion-elgreco opened this issue Jan 20, 2025 · 1 comment

Comments

@ion-elgreco
Copy link

It felt a bit unintuitive to have to do provide the store each time to execute a method:

import obstore as obs
from obstore.store import AzureStore
store = AzureStore()
obs.put(store, ....)

Instead of having the methods on the store directly akin to the rust implementation:

import obstore as obs
from obstore.store import AzureStore
store = AzureStore()
store.put(....)
@kylebarron kylebarron changed the title Move api methods onto the store class Document design decision for functional API Jan 22, 2025
@kylebarron
Copy link
Member

Yes, I agree that it would be nice to have the methods directly on the class, however this is a very intentional design decision,

Simpler Rust code

On the Rust side, each Python class is a separate struct. A Pyclass can't implement a trait, so the only way to implement the same methods on multiple Rust structs without copy-pasting is by having a macro. That isn't out of the question, however it does hamper extensibility, and having one and only one way to call commands is simpler to maintain.

Simpler Middlewares

In #117 we added a binding for PrefixStore. Because we use object store classes functionally, we only needed 20 lines of Rust code:

/// A Python-facing wrapper around a [`PrefixStore`].
#[pyclass(name = "PrefixStore", frozen)]
pub struct PyPrefixStore(Arc<PrefixStore<Arc<dyn ObjectStore>>>);
impl AsRef<Arc<PrefixStore<Arc<dyn ObjectStore>>>> for PyPrefixStore {
fn as_ref(&self) -> &Arc<PrefixStore<Arc<dyn ObjectStore>>> {
&self.0
}
}
#[pymethods]
impl PyPrefixStore {
#[new]
fn new(store: PyObjectStore, prefix: String) -> Self {
Self(Arc::new(PrefixStore::new(store.into_inner(), prefix)))
}

If we exposed methods on an S3Store, then those methods would be lost whenever you apply a middleware around it, such as PrefixStore(S3Store(...)). So we'd have to ensure those same methods are also installed onto every middleware or other wrapper.

External FFI for ObjectStore

There was recently discussion on Discord about the merits of having a stable FFI for ObjectStore. If this comes to fruition in the future, then by having a functional API we could seamlessly use third party ObjectStore implementations or middlewares, with no Python overhead.

I use a similar functional API in other Python bindings, especially in cases with zero-copy FFI, such as https://kylebarron.dev/geo-index/latest/api/rtree/#geoindex_rs.rtree.search (where the spatial index is passed in as the first argument instead) and https://kylebarron.dev/arro3/latest/api/compute/#arro3.compute.cast where the cast is not a method on the Arrow Array.

Smaller core for third-party Rust bindings

This repo has twin goals:

  1. Provide bindings to object_store for Python users who want a Python API.
  2. Make it easier for other Rust developers who are making Python bindings, who are using object_store on the Rust side already, and who want to expose ObjectStore bindings to Python in their own projects.

The first goal is served by the obstore Python package and the second is served by the pyo3-object_store Rust crate. The latter provides builders for S3Store, AzureStore, GCSStore, which means that those third party Rust-Python bindings can have code as simple as:

#[pyfunction]
fn use_object_store(store: PyObjectStore) {
    let store: Arc<dyn ObjectStore> = store.into_inner();
}

Those third party bindings don't need the Python bindings to perform arbitrary get, list, put from Python. Instead, they use this to access a raw Arc<dyn ObjectStore> from the Rust side.

You'll notice that S3Store, GCSStore, and AzureStore aren't in the obstore library; they're in pyo3-object_store. We can't add methods to a pyclass from an external crate, so we couldn't leave those builders in pyo3_object_store while having the Python-facing operations live in obstore. Instead we'd have to put the entire content of the Python bindings in the pyo3-object_store crate. Then this would expose whatever class methods from the obstore Python API onto any external Rust-Python library that uses pyo3-object_store. I don't want to leak this abstraction nor make that public to other Rust consumers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants