Add documentation for JLD2 I/O #3365

Open
wants to merge 7 commits into
base: main

Conversation

alex-s-gardner

Addresses #3364

@bkamins bkamins added the doc label Jul 29, 2023
@bkamins bkamins added this to the 1.7 milestone Jul 29, 2023
@bkamins
Member

bkamins commented Aug 1, 2023

I have left some stylistic comments.

Additionally, since I am not using the JLD2 format myself, the question is how it handles saving a SubDataFrame or a GroupedDataFrame (I assume that it saves them correctly by storing both the view object and the underlying DataFrame?).

@alex-s-gardner
Author

The question is how it handles saving a SubDataFrame or a GroupedDataFrame (I assume that it saves them correctly by storing both the view object and the underlying DataFrame?).

I tested saving and loading both SubDataFrame and GroupedDataFrame and they seem to work just fine... is this something that we should mention?

maybe here:

.... , JLD2 preserves custom Types including SubDataFrame and GroupedDataFrame.
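
A minimal sketch of the kind of round-trip check described above, assuming DataFrames.jl and JLD2.jl are installed. The column names, the file name `views.jld2`, and the use of `jldsave`/`jldopen` are illustrative assumptions, not the exact test that was run:

```julia
using DataFrames
using JLD2

# Illustrative data; the column names are assumptions, not taken from the PR.
df = DataFrame(x = [1, 2, 3], g = ["a", "b", "a"])
sdf = view(df, 1:2, :)    # SubDataFrame: a view into df
gdf = groupby(df, :g)     # GroupedDataFrame: df grouped by column g

# Write both objects to a temporary JLD2 file under the names "sdf" and "gdf".
path = joinpath(mktempdir(), "views.jld2")
jldsave(path; sdf = sdf, gdf = gdf)

# Read both objects back from the file.
sdf2, gdf2 = jldopen(path, "r") do file
    file["sdf"], file["gdf"]
end

# Compare the loaded objects with the originals.
println(sdf2 isa SubDataFrame)
println(sdf2 == sdf)
println(gdf2 isa GroupedDataFrame)
println(length(gdf2) == length(gdf))
```

If the round trip behaves as reported in this thread, each check should print `true`; this has only been sketched here, not verified against a specific JLD2 version.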

@bkamins
Member

bkamins commented Sep 25, 2023

Sorry for the delay. I think it is OK the way you proposed. Could you just please move JLD2 after CSV (as CSV is for sure more popular)?

Also @nalimilan - could you please have a look at wording of the PR?

@bkamins bkamins requested a review from nalimilan September 25, 2023 21:17
Member

@nalimilan nalimilan left a comment

Sorry for the delay!


If you have not used the FileIO and JLD2 packages before then you may need to install it first:
```julia
using Pkg;

Suggested change
using Pkg;
using Pkg

Comment on lines +123 to +124
A data frame can be saved as a JLD2 file output.jld2 using


Suggested change
A data frame can be saved as a JLD2 file output.jld2 using

using JLD2
```

We can now create a simple data frame and save it as a jld2 file using `save`. `save`

Suggested change
We can now create a simple data frame and save it as a jld2 file using `save`. `save`
We can now create a simple data frame and save it as a JLD2 file using `save`. `save`

accepts an AbstractDict yielding the key/value pairs, where the key is a string representing

Suggested change
accepts an AbstractDict yielding the key/value pairs, where the key is a string representing
accepts an `AbstractDict` yielding the key/value pairs, where the key is a string representing

Comment on lines +128 to +129
Pkg.add("FileIO")
Pkg.add("JLD2")

This will typically be faster:

Suggested change
Pkg.add("FileIO")
Pkg.add("JLD2")
Pkg.add(["FileIO", "JLD2"])

A jld2 file can be read in using `load`. If `load` is called with a single dataset name,
load returns the contents of that dataset from the file:
```julia
df = load("output.jld2", "df")
```

Suggested change
df = load("output.jld2", "df")
julia> df = load("output.jld2", "df")

Comment on lines +161 to +163
df = load("output.jld2")
Dict{String, Any} with 1 entry:
"df" => 1×2 DataFrame…

Suggested change
df = load("output.jld2")
Dict{String, Any} with 1 entry:
"df" => 1×2 DataFrame…
julia> dict = load("output.jld2")
Dict{String, Any} with 1 entry:
"df" => 1×2 DataFrame…
julia> df = dict["df"]
1×2 DataFrame
Row │ x y
│ Int64 Int64
─────┼──────────────
1 │ 1 2

JLD2 is a HDF5-compatible file format that allows reading and writing data frames.
A valuable feature of JLD2 format is that it preserves custom column types of the stored data frame.

The `save` and `load` functions, provided by FileIO.jl, allow to read/write a data frame from/to a JLD2 file.

Suggested change
The `save` and `load` functions, provided by FileIO.jl, allow to read/write a data frame from/to a JLD2 file.
The `save` and `load` functions, provided by [FileIO.jl](https://github.com/JuliaIO/FileIO.jl/),
are convenience functions to read/write objects from/to a JLD2 file.

## JLD2 Files

JLD2 is a HDF5-compatible file format that allows reading and writing data frames.
A valuable feature of JLD2 format is that it preserves custom column types of the stored data frame.

Suggested change
A valuable feature of JLD2 format is that it preserves custom column types of the stored data frame.
A valuable feature of the JLD2 format is that it preserves custom column types of the stored data frame.

@@ -113,6 +113,56 @@ As you can see the code required to transform `iris` into a proper input to the
format is not easy. Therefore CSV.jl is the preferred package to write CSV files
for data stored in data frames.

## JLD2 Files

JLD2 is a HDF5-compatible file format that allows reading and writing data frames.

Suggested change
JLD2 is a HDF5-compatible file format that allows reading and writing data frames.
[JLD2](https://github.com/JuliaIO/JLD2.jl) is a HDF5-compatible file format that allows
reading and writing data frames (and any Julia object).

@bkamins bkamins modified the milestones: 1.7, 1.x Sep 14, 2024

3 participants