Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/211 bool8 changelog #214

Merged
merged 2 commits into from
Aug 8, 2024
Merged

Feature/211 bool8 changelog #214

merged 2 commits into from
Aug 8, 2024

Conversation

chmp
Copy link
Owner

@chmp chmp commented Aug 8, 2024

No description provided.

@chmp chmp merged commit 9162036 into main Aug 8, 2024
1 check passed
@chmp chmp deleted the feature/211-bool8-changelog branch August 8, 2024 12:37
@chmp chmp mentioned this pull request Aug 8, 2024
@v1gnesh
Copy link

v1gnesh commented Aug 17, 2024

Hey, a qq - how do I use this via the derive attr, i.e., what do I define my field as?

EDIT:
From the PRs, I see it's like this

/// ```rust
/// # use serde_json::json;
/// # use serde_arrow::{Result, schema::{SerdeArrowSchema, SchemaLike, TracingOptions, ext::Bool8Field}};
/// # use serde::Deserialize;
/// # fn main() -> Result<()> {
/// ##[derive(Deserialize)]
/// struct Record {
///     int_field: i32,
///     nested: Nested,
/// }
///
/// ##[derive(Deserialize)]
/// struct Nested {
///     bool_field: bool,
/// }
///
/// let tracing_options = TracingOptions::default()
///     .overwrite("nested.bool_field", Bool8Field::new("bool_field"))?;
///
/// let schema = SerdeArrowSchema::from_type::<Record>(tracing_options)?;

In order for this to be usable "transparently", I think arrow-rs needs to add a DataType impl for Bool8, right?
Only then, serde_arrow will be able to support something like below?

#[derive(Debug, Serialize)]
struct Ye {
    a: Bool8,
}

@chmp
Copy link
Owner Author

chmp commented Aug 17, 2024

The issue is the Serde data model and that there is, afaik, no way to add additional metadata.

You could write your own serialize, deserialize logic that goes over int8 (that would be, what your Bool8 type would do), but you still need to annotate the field with the Bool8 metadata to be correctly interpreted by other tools (eg. pyarrow).

I do have an idea around to address this issue overall, but it will take some time to implement. In the mean time, manually setting the type is the easiest way.

One idea for an optional serde arrow feature could also be to add a tracing option to convert all bool fields to Bool8.

@v1gnesh
Copy link

v1gnesh commented Aug 17, 2024

One idea for an optional serde arrow feature could also be to add a tracing option to convert all bool fields to Bool8.

Or at least for a newtype over [u8; T], where all items in this array are needed as Bool8.

pub struct Bool8Wrap<const T: usize>(pub [u8; T]);

initialized as:

pub struct BigOne {
    // The following 5 bytes need to be 5 Bool8Arrays
    a: Bool8Wrap<5>, 
}

@chmp
Copy link
Owner Author

chmp commented Aug 22, 2024

@v1gnesh Thanks for sticking with it. The newtype idea is very promising. I haven't thought it through completely, but I can see the outlines of a solution. I started a discussion here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants