Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VFD plugins #602

Merged
merged 6 commits into from
Sep 29, 2021
Merged

VFD plugins #602

merged 6 commits into from
Sep 29, 2021

Conversation

jhendersonHDF
Copy link
Collaborator

@jhendersonHDF jhendersonHDF commented Apr 28, 2021

This PR should be ready to review now. The only piece missing from this feature is the support for configuration strings. The API routines to set a VFD configuration string are included here, but parsing of the strings isn't implemented yet and is being developed in a separate branch.

The main piece that might be controversial here is all of the changes to testing code to deal with avoiding running certain tests with specific VFDs. We already do this in a subset of HDF5's tests, but reimplementing the HDF5_DRIVER environment variable to be interpreted by the library rather than the testing code means that more tests need to be aware of the environment variable. A lot of these checks can go away once HDF5's test suite can be re-engineered to support checking what features a VFD is compatible with, but there are currently a lot of edge cases. For example, the Direct VFD has alignment requirements, which can be problematic for tests that check the resulting file size.

src/H5E.c Outdated Show resolved Hide resolved
src/H5FDint.c Outdated Show resolved Hide resolved
src/H5PLplugin_cache.c Outdated Show resolved Hide resolved
if (new_num_errors > num_errors) {
new_num_errors -= num_errors;
if (H5Epop(H5E_DEFAULT, (size_t)new_num_errors) < 0)
HGOTO_ERROR(H5E_ERROR, H5E_CANTRELEASE, H5_ITER_ERROR, "can't sanitize error stack")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example of using H5Epop to remove expected error messages generated from the H5E_BEGIN/END_TRY block above so they don't get included in the eventual error stack printing.

@@ -1456,40 +1488,10 @@ main(int argc, const char *argv[])
/* Initialize indexing options */
h5trav_set_index(sort_by, sort_order);

if (driver_name_g != NULL) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whole block is duplicated by the call to h5tools_get_fapl below.

@@ -2747,9 +2757,6 @@ main(int argc, const char *argv[])
else if (!HDstrcmp(argv[argno], "--string")) {
string_g = TRUE;
}
else if (!HDstrncmp(argv[argno], "--vfd=", (size_t)6)) {
preferred_driver = argv[argno] + 6;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old "drivername" parameter is retained, but looped into the new way of loading VFDs in the tools below.

error_msg("failed to setup file access property list (fapl) for file\n");
leave(EXIT_FAILURE);
}
}

if (preferred_driver) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another block duplicated by h5tools_get_fapl.

@byrnHDF byrnHDF added the WIP Work in progress (please also convert PRs to draft PRs) label May 6, 2021
@jhendersonHDF jhendersonHDF force-pushed the H5FD_dynamic branch 2 times, most recently from 1bacb6f to 1799052 Compare June 11, 2021 20:32
@jhendersonHDF jhendersonHDF force-pushed the H5FD_dynamic branch 3 times, most recently from baf7d6e to 4c95bd3 Compare July 18, 2021 04:34
if (search_params->key->id == H5PL_cache_g[u].key.id)
/* Determine if the plugin types match */
if (search_params->type != H5PL_cache_g[u].type)
continue;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file fix some caching issues with VOL/VFD plugins.

@@ -523,6 +553,28 @@ H5P__init_package(void)
HDassert(tot_init == NELMTS(init_class));

done:
if (ret_value < 0 && tot_init > 0) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to correctly bring down the H5P interface if it failed to initialize correctly.

./test/family_v16_00000.h5
./test/family_v16_00001.h5
./test/family_v16_00002.h5
./test/family_v16_00003.h5
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed these files to match the family VFD's new default printf config, which is to append a %06d suffix to the given filename.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are five-digit filename extensions a problem by default now? If so, that change seems likely to anger existing family driver users. If not, we should keep some five-digit files around for backward compatibility testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an issue, no. It's just that the old testing code was using a 5-digit filename extension and I use a 6-digit filename extension as a new default in the family driver to allow for a larger family size. The family driver never used to have a default, so this shouldn't affect user code in any way since users usually specify their own printf-style filename. Having this default just makes it easier to generate testing code for VFDs because the filename doesn't need to have a printf-style format conditionally added on. When it's missing, the family driver will tack on a "-%06d" extension so unique filenames get generated for each family member.

*-------------------------------------------------------------------------
*/
herr_t
h5_get_vfd_fapl(hid_t fapl)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functionality is now part of the library itself rather than test code. Setting HDF5_DRIVER will replace the default VFD on the default FAPL, so h5_get_vfd_fapl isn't needed for tests.

@jhendersonHDF jhendersonHDF changed the title DO NOT MERGE: VFD plugins VFD plugins Jul 19, 2021
@jhendersonHDF jhendersonHDF removed the WIP Work in progress (please also convert PRs to draft PRs) label Jul 19, 2021
@qkoziol
Copy link
Contributor

qkoziol commented Jul 19, 2021

@jhendersonHDF - Some of these sound useful now, would you consider making some smaller PRs out of them?

@jhendersonHDF
Copy link
Collaborator Author

@jhendersonHDF - Some of these sound useful now, would you consider making some smaller PRs out of them?

I'm guessing you're referring to changes like the H5P package termination stuff? I can definitely make some small PRs out of changes like that.

@qkoziol
Copy link
Contributor

qkoziol commented Jul 19, 2021

@jhendersonHDF - Some of these sound useful now, would you consider making some smaller PRs out of them?

I'm guessing you're referring to changes like the H5P package termination stuff? I can definitely make some small PRs out of changes like that.

Yes, super, thanks!

@@ -24,11 +24,29 @@ set (VFD_LIST
split
multi
family
splitter
#log - log VFD currently has file space allocation bugs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the issues with the log VFD? Perhaps more to the point, is there are Jira issue on this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe there are any JIRA issues for the problems I saw; I should create those. Out of all of the VFDs, the log VFD seemed to be the one that was most problematic in terms of testing. The testhdf5, stab, farray, earray, btree2, dsets, links, unlink, freespace, mf and fheap tests all failed in ways that seemed related to file space allocation and/or checking of file size. The ohdr test failed because it failed to expunge a cache entry that was pinned. Note that these testing failures occurred whether the VFD's logging capabilities were actually enabled or not.

Since the log VFD appears to just try to emulate the sec2 VFD with the logging capabilities on top (but separate from the normal I/O), it seems like these are bugs in that the log VFD does not pass the same tests as the sec2 VFD.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log VFD should be a clone of the sec2 VFD, but with log messages. I could see it failing tests that check stdout (like in the tools, etc.), but it's weird that it's failing with respect to file space allocation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I have a suspicion that there are some sec2-specific things in the library because the log VFD is indeed just a clone of the sec2 VFD with the logging capability added in, but separate from the regular I/O done. We definitely have a problem somewhere here. Nothing obvious, such as missing VFD feature flags, stuck out to me.

src/H5FDdirect.c Outdated Show resolved Hide resolved
src/H5Pfapl.c Show resolved Hide resolved
src/H5FDcore.c Outdated
static inline const H5FD_core_fapl_t *
H5FD__core_get_default_config(void)
{
char *driver = HDgetenv("HDF5_DRIVER");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it would be better to use a macro instead of "HDF5_DRIVER" everywhere. Just in case that environment variable has to be renamed one day or someone messes up the environment variable name, this way the compiler would be able to detect it for us.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few macros to H5public.h for this and some other related environment variables.

env_h5_drvr = HDgetenv("HDF5_DRIVER");
if (env_h5_drvr == NULL)
env_h5_drvr = "nomatch";

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here and in the tests in general, I would avoid using strings for const names, e.g. "multi" etc

endif ()
if (H5_HAVE_WINDOWS)
set (VFD_LIST ${VFD_LIST} windows)
endif ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think expanding this list is a great idea. It's for the CMake equivalent of check-vfd, IIRC and that is going to have all sorts of problems with the VFDs listed here. Have you run the CMake VFD tests? Do they pass? I would be really surprised if they did for, say, S3 since we often check for very "native file format" things that will not be true for object store files.

Also, the Windows VFD doesn't really exist, so that just runs the sec2 tests twice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of this PR was to go through all of the tests run under "check vfd" and the cmake equivalent and work around native file-specific things by checking if a VFD is compatible with these types of tests. The last I remember, the only VFDs I didn't test with were the S3 and HDFS VFDs, since I didn't currently have a platform to test on. For the Windows VFD, I don't see any reason NOT to also test it for the sake of completeness. Yes, it might just be the sec2 VFD, but it's still worth making sure it works the way we expect it to.

endif
if HDFS_VFD_CONDITIONAL
VFD_LIST += hdfs
endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the CMake comments above.

done:
FUNC_LEAVE_API(ret_value)
} /* end H5FDis_driver_registered_by_value() */

Copy link
Member

@derobins derobins Sep 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really wish we'd set the VOL connector calls that these mirror up as herr_t calls that set an out parameter instead of returning htri_t...

@lrknox lrknox merged commit 3da0802 into HDFGroup:develop Sep 29, 2021
@jhendersonHDF jhendersonHDF deleted the H5FD_dynamic branch January 12, 2022 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants