
HDF5DataLayer source now takes list of filenames, loads one at a time. #203

Merged
merged 6 commits into from
Mar 18, 2014
Conversation

@sergeyk (Contributor) commented Mar 11, 2014

Also test for loading GZIP-compressed HDF5 files. With sparse data, this flies!

@sergeyk sergeyk added this to the 0.99 milestone Mar 13, 2014
LOG(INFO) << "Number of files: " << num_files;

// Load the first HDF5 file and initialize the counter.
// TODO: make this a function, since we also call it in Forward (in .cu also)

Good point; it will be cleaner to have a function that loads an HDF5 file and reads the data and label from it (lines 51 to 58).

@sguada (Contributor) commented Mar 15, 2014

In other parts of the code, (*top)[0]->mutable_cpu_data() is first assigned, as in Dtype* top_data = (*top)[0]->mutable_cpu_data();, which makes it a bit clearer what is going on.
https://github.com/sergeyk/caffe/blob/hdf5_data/src/caffe/layers/hdf5_data_layer.cpp#L92

Similarly for top_label:
https://github.com/sergeyk/caffe/blob/hdf5_data/src/caffe/layers/hdf5_data_layer.cpp#L96

Also, using top_data + (*top)[0]->offset(i) instead of &(*top)[0]->mutable_gpu_data()[i * data_dims[1]] could help clarity: https://github.com/sergeyk/caffe/blob/hdf5_data/src/caffe/layers/hdf5_data_layer.cu#L47

@sergeyk I like the layer, and the current tests pass. An example of a simple multi-class dataset would be great.
Let me know what you think about my comments.

@sguada (Contributor) commented Mar 15, 2014

@sergeyk I think you could easily expand hdf5_data_layer to take 3-D or 4-D matrices. According to http://www.hdfgroup.org/HDF5/doc/HL/RM_H5LT.html#H5LTread_dataset_double, datasets can have up to 7 dimensions.

It also seems to me that overwriting status in io.cpp's load_2d_dataset could hide an HDF5 error:
https://github.com/sergeyk/caffe/blob/hdf5_data/src/caffe/util/io.cpp#L108
https://github.com/sergeyk/caffe/blob/hdf5_data/src/caffe/util/io.cpp#L112
https://github.com/sergeyk/caffe/blob/hdf5_data/src/caffe/util/io.cpp#L117

@sergeyk (Contributor, Author) commented Mar 17, 2014

@sergio multi-dim data was enabled by #217, which I merged into dev; I'll rebase this PR on dev now. As for overwriting status, that's the way the HDF5 examples do it, e.g. http://www.hdfgroup.org/ftp/HDF5/examples/examples-by-api/hdf5-examples/1_8/C/H5D/h5ex_d_shuffle.c

The other comments are good; I will incorporate them all in the next commit.

@sergeyk (Contributor, Author) commented Mar 17, 2014

@sergio good for review again

@sguada (Contributor) commented Mar 17, 2014

@sergeyk Thanks for taking care of the comments. However, looking at the code now, the function load_hdf5_file seems magic, since it uses global variables to pass the data and labels around.
I think it would be better to define it this way:

virtual void load_hdf5_file(const char* filename, Blob<Dtype>* data, Blob<Dtype>* label) {
  LOG(INFO) << "Loading HDF5 file " << filename;
  hid_t file_id = H5Fopen(filename, H5F_ACC_RDONLY, H5P_DEFAULT);
  if (file_id < 0) {
    LOG(ERROR) << "Failed opening HDF5 file " << filename;
    return;
  }
  const int MIN_DATA_DIM = 2;
  const int MAX_DATA_DIM = 4;
  const int MIN_LABEL_DIM = 1;
  const int MAX_LABEL_DIM = 2;
  hdf5_load_nd_dataset(file_id, "data",  MIN_DATA_DIM, MAX_DATA_DIM, data);
  hdf5_load_nd_dataset(file_id, "label", MIN_LABEL_DIM, MAX_LABEL_DIM, label);

  herr_t status = H5Fclose(file_id);
  CHECK_GE(status, 0) << "Failed closing HDF5 file " << filename;
  CHECK_EQ(data->num(), label->num());
  LOG(INFO) << "Successfully loaded " << data->num() << " samples";
}

And change hdf5_load_nd_dataset to return a Blob instead

@sergeyk (Contributor, Author) commented Mar 17, 2014

At first I agreed with you about hdf5_load_nd_dataset returning a Blob, and started making the requisite changes. However, I realized that in GPU mode this would mean HDF5 data is loaded directly into GPU memory. Is that correct, or is there a way to specify that a Blob is to live in CPU memory, no matter what mode Caffe is set to? Loading HDF5 data into GPU memory is not desired behavior: we want that to be controlled by the batch size. We also don't want to re-open the HDF5 file on every Forward, which is why we first load all the data into memory and then copy it to Blobs on Forward as necessary.

Without the switch to Blobs, I don't agree that load_hdf5_file should take output arguments, as it would need four of them. It seems better to keep it as a class method and load data into class members (hardly magic; simple OOP).

Unless you can further justify why hdf5_load_nd_dataset should return a Blob, I think this should be merged.

@sguada (Contributor) commented Mar 17, 2014

Using Blobs will not automatically mean that the data resides on the GPU, even in GPU mode.
Thanks to syncmem, the data moves from CPU to GPU only when it is requested via gpu_data or mutable_gpu_data. This is how DataLayer is implemented: while reading Datum it copies into a prefetch Blob that always lives on the CPU.

Here is where the data moves from CPU to GPU:
https://github.com/BVLC/caffe/blob/dev/src/caffe/layers/data_layer.cu#L24

So it is safe for you to use a Blob to store the data read by hdf5_load_nd_dataset; you could even keep it as a member of the HDF5 layer, and then just copy batch_size samples to the top Blob in Forward_cpu or Forward_gpu.

Regarding simple OOP: I think in some cases one can twist classes to behave like imperative code with global variables. It would be different if load_hdf5_file were part of another class and one needed to use getters and setters to get the data and label.

The other reason to make hdf5_load_nd_dataset return Blobs is consistency with the other DataLayers, so that all sources return Blobs that can later be pre-processed before being passed to top. But this could wait until #148.

@sergeyk (Contributor, Author) commented Mar 17, 2014

Okay, your proposed refactor makes sense then. Stand by for update.

@sergeyk (Contributor, Author) commented Mar 17, 2014

@sguada check it out.

current_row_ = 0;
}

  CUDA_CHECK(cudaMemcpy(
      &(*top)[0]->mutable_gpu_data()[i * data_count],
-     &(data_.get()[current_row_ * data_count]),
+     &data_blob_.mutable_cpu_data()[current_row_ * data_count],

data_blob_ shouldn't be mutable here; you are just reading from it. Just use
&data_blob_.cpu_data()[current_row_ * data_count]

@sguada (Contributor) commented Mar 17, 2014

@sergeyk Nice job! All tests pass, but make lint complains about test_hdf5data_layer.cpp, io.cpp, vision_layers.hpp, and io.hpp.

Once you address my latest comments and fix the lint complaints in the files you have changed, it will be ready to merge.

You may want to look at #120 and include a shuffle_files call after you reach the end of the list, but it is not necessary.

@sguada (Contributor) commented Mar 17, 2014

@sergeyk There are still several lint errors:

src/caffe/test/test_hdf5data_layer.cpp:81:  Missing spaces around =  [whitespace/operators] [4]
src/caffe/test/test_hdf5data_layer.cpp:81:  Missing spaces around <  [whitespace/operators] [3]
src/caffe/test/test_hdf5data_layer.cpp:112:  Lines should be <= 80 characters long  [whitespace/line_length] [2]
src/caffe/util/io.cpp:106:  Lines should be <= 80 characters long  [whitespace/line_length] [2]
src/caffe/util/io.cpp:106:  Is this a non-const reference? If so, make const or use a pointer: Blob<Dtype>& blob  [runtime/references] [2]
src/caffe/util/io.cpp:126:  Closing ) should be moved to the previous line  [whitespace/parens] [2]
src/caffe/util/io.cpp:131:  Is this a non-const reference? If so, make const or use a pointer: Blob<float>& blob  [runtime/references] [2]
src/caffe/util/io.cpp:139:  Is this a non-const reference? If so, make const or use a pointer: Blob<double>& blob  [runtime/references] [2]
include/caffe/vision_layers.hpp:392:  Do not leave a blank line after "protected:"  [whitespace/blank_line] [3]
include/caffe/vision_layers.hpp:404:  Add #include <string> for string  [build/include_what_you_use] [4]
include/caffe/util/io.hpp:56:  Is this a non-const reference? If so, make const or use a pointer: Blob<Dtype>& blob  [runtime/references] [2]
include/caffe/util/io.hpp:61:  Is this a non-const reference? If so, make const or use a pointer: Blob<Dtype>& blob  [runtime/references] [2]

@sguada (Contributor) commented Mar 17, 2014

Do you really need to read the same files 100 times to test HDF5DataLayerTest? It produces so many logs that it becomes hard to read the test status.

@sergeyk (Contributor, Author) commented Mar 17, 2014

Now good.

@sguada (Contributor) commented Mar 18, 2014

@sergeyk good job!!

sguada added a commit that referenced this pull request Mar 18, 2014
HDF5DataLayer source now takes list of filenames, loads one at a time.
@sguada sguada merged commit 33acedb into BVLC:dev Mar 18, 2014
@shelhamer shelhamer mentioned this pull request Mar 18, 2014
@ashafaei ashafaei mentioned this pull request Aug 10, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
HDF5DataLayer source now takes list of filenames, loads one at a time.