Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP and RFC: Initial support for Serde serialization #61

Closed
jmcnamara opened this issue Dec 3, 2023 · 5 comments
Closed

WIP and RFC: Initial support for Serde serialization #61

jmcnamara opened this issue Dec 3, 2023 · 5 comments

Comments

@jmcnamara
Copy link
Owner

jmcnamara commented Dec 3, 2023

I've added initial support for serialization of Serde structures to rust_xlsxwriter main under the serde feature flag.

The support works like this:

  1. Prepare a standard Serde #[derive(Serialize)] struct.
  2. Serialize the header to a location in the worksheet. The fields of the struct will become headers.
  3. Call worksheet.serialize() repeatedly to write data, without having to specify the row/col position.

For example:

use rust_xlsxwriter::{Format, Workbook, XlsxError};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Add a simple format for the headers.
    let format = Format::new().set_bold();

    // Create a serializable test struct.
    #[derive(Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: &'static str,
        cost: f64,
    }

    // Create some data instances.
    let item1 = Produce {
        fruit: "Peach",
        cost: 1.05,
    };
    let item2 = Produce {
        fruit: "Plum",
        cost: 0.15,
    };
    let item3 = Produce {
        fruit: "Pear",
        cost: 0.75,
    };

    // Set up the start location and headers of the data to be serialized using
    // any temporary or valid instance.
    worksheet.serialize_headers_with_format(0, 0, &item1, &format)?;

    // Serialize the data.
    worksheet.serialize(&item1)?;
    worksheet.serialize(&item2)?;
    worksheet.serialize(&item3)?;

    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

Which would give this output:
worksheet_serialize

The output can be positioned anywhere in the worksheet. For example if we change this line in the previous example:

    worksheet.serialize_headers_with_format(1, 3, &item1, &format)?;

We get this output:

screenshot

It will also serialize vectors in the struct:

    // Create a serializable test struct with vector fields.
    #[derive(Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: Vec<&'static str>,
        cost: Vec<f64>,
    }

    // Create some data instances.
    let item = Produce {
        fruit: vec!["Peach", "Plum", "Pear"],
        cost: vec![1.05, 0.15, 0.75],
    };

    // Set up the start location and headers of the data to be serialized using
    // any temporary or valid instance.
    worksheet.serialize_headers_with_format(0, 0, &item, &format)?;

    // Serialize the data.
    worksheet.serialize(&item)?;

This gives the same output as the first example.

From the docs:

This method can be used, with some limitations, to serialize (i.e., convert automatically) structs that are serializable by Serde into cells on a worksheet.

The limitations are that the primary data type to be serialized must be a struct and its fields must be either primitive types (strings, chars, numbers, booleans) or vector/array types. Compound types such as enums, tuples or maps aren't supported. The reason for this is that the output data must fit in the 2D cell format of an Excel worksheet. (Note: this limitation is open for debate if it makes sense, see below.)

In order to serialize an instance of a data structure you must first define the fields/headers and worksheet location that the serialization will refer to. You can do this with the Worksheet::serialize_headers() or Worksheet::serialize_headers_with_format() methods. Any subsequent call to serialize() will write the serialized data below the headers and below any previously serialized data.

I am looking for feedback on the workability of this technique for any serialization use case that people may have. In particular I'd appreciate feedback from trying to make it work with existing serialize structures (within reason of what could be applied to a worksheet). Please leave comments below.

The code is on main. You will need to enable it in a local project with the following or similar:

cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git -F serde

There is some ongoing work to add support for ExcelDateTime and Chrono date/times. I will also be adding a method for adding formatting to each field value, and options to ignore fields apart from Serde #[serde(skip_serializing)], and also to reorder the fields. I may also change the error handling to just ignore unknown structs/fields.

@Xydez
Copy link

Xydez commented Dec 4, 2023

Wouldn't it be nicer to have serialize_vec(&items) where items is a Vec<Produce>? That is, instead of having the vecs inside the struct.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Dec 4, 2023

Wouldn't it be nicer to have serialize_vec(&items) where items is a Vec<Produce>? That is, instead of having the vecs inside the struct.

Thanks for the input. That is probably a common use case in JSON style responses.

The good news is that this would work with the current API once the serialize_headers() step is completed.

For example:

use rust_xlsxwriter::{Format, Workbook, XlsxError};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Add a simple format for the headers.
    let format = Format::new().set_bold();

    // Create a serializable test struct.
    #[derive(Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: &'static str,
        cost: f64,
    }

    // Create some data instances.
    let items = vec![
        Produce {
            fruit: "Peach",
            cost: 1.05,
        },
        Produce {
            fruit: "Plum",
            cost: 0.15,
        },
        Produce {
            fruit: "Pear",
            cost: 0.75,
        },
    ];

    // Set up the start location and headers of the data to be serialized.
    worksheet.serialize_headers_with_format(0, 0, &items.get(0).unwrap(), &format)?;

    // Serialize the data.
    worksheet.serialize(&items)?;


    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

This would also work for a tuple of structs:

    // Create some data instances.
    let items = (
        Produce {
            fruit: "Peach",
            cost: 1.05,
        },
        Produce {
            fruit: "Plum",
            cost: 0.15,
        },
        Produce {
            fruit: "Pear",
            cost: 0.75,
        },
    );

    // Set up the start location and headers of the data to be serialized.
    worksheet.serialize_headers_with_format(0, 0, &items.1, &format)?;

    // Serialize the data.
    worksheet.serialize(&items)?;

Both examples produces the same output as the first example above. Would that work for you?

@Mingun
Copy link

Mingun commented Dec 4, 2023

Because serialize_headers does not require temporary item for its work, I think, it is better to provide only the type:

worksheet.serialize_headers::<Produce>(0, 0)?;
worksheet.serialize_headers_with_format::<Produce>(0, 0, &format)?;

@jmcnamara
Copy link
Owner Author

jmcnamara commented Dec 4, 2023

I think, it is better to provide only the type:

@Mingun That would be nice.

However, I think an instance is required, and not just the type, since it needs to be serialized to find the fields to convert them to worksheet headers. Or am I missing something?

For reference the code of the method is here:

pub fn serialize_headers_with_format<T>(
&mut self,
row: RowNum,
col: ColNum,
data_structure: &T,
format: &Format,
) -> Result<&mut Worksheet, XlsxError>
where
T: Serialize,
{
// Check row and columns are in the allowed range.
if !self.check_dimensions_only(row, col) {
return Err(XlsxError::RowColumnLimitError);
}
let mut headers = SerializerHeader {
struct_name: String::new(),
field_names: vec![],
};
data_structure.serialize(&mut headers)?;

@jmcnamara
Copy link
Owner Author

I have added support for Serde serialization in v0.57.0. See Working with Serde in the rust_xlsxwriter docs.

Some additional serialisation features and helpers will be added in upcoming releases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants