-
Notifications
You must be signed in to change notification settings - Fork 397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Image initialization speedup #684
Conversation
were missing in PR 671
…forceskip_reset_missed
opening and closing files again and again - Runs successfully without MPI specified - Error with MPI
…it_speedup Conflicts: vic/drivers/image/src/vic_force.c
Some speed tests (times shown below are all walltime): Test 1 (a very small domain): Stehekin (16 grid cells), 10 day, hourly timestep; hydra master node, 1 processor:
Test 2 (a regional-scale domain): Arkansas Red (3999 grid cells), 10 years, 3 hourly timestep; hyak, 1 node (16 processors total):
Test 2 (continental-scale domain): whole CONUS (333579 grid cells), 1 year, 3 hourly timestep; hyak, 2 nodes (32 processors total):
It seems like the initialization time is improved quite a lot; the run time is also improved since now we don't open and close the forcing file for each variable at each timestep. The speedup is more significant for smaller basins. |
@jhamman @bartnijssen @dgergel This PR can use a code review now. The CESM test is failing right now since corresponding changes in CESM driver are needed. Other tests have passed, so I think I can incorporate your comments/suggestions first, and then @dgergel can work on the CESM driver update. |
char result_dir[MAXSTRING]; /**< directory where results will be written */ | ||
char statefile[MAXSTRING]; /**< name of file in which to store model state */ | ||
char log_path[MAXSTRING]; /**< Location to write log file to */ | ||
} filenames_struct; | ||
|
||
void add_nveg_to_global_domain(char *nc_name, domain_struct *global_domain); | ||
void add_nveg_to_global_domain(nameid_struct nc_nameid, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should pass pointers to structures rather than structures. Passing structures means a copy which is more expensive. If you are concerned that the values are changed you can use const
s.
So
void add_nveg_to_global_domain(nameid_struct *nc_nameid,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. I did see from somewhere else that we should always pass pointer of a structure, but didn't do it in the initial attempt. Will make the change.
void free_force(force_data_struct *force); | ||
void free_veg_hist(veg_hist_struct *veg_hist); | ||
void get_domain_type(char *cmdstr); | ||
size_t get_global_domain(char *domain_nc_name, char *param_nc_name, | ||
size_t get_global_domain(nameid_struct domain_nc_nameid, | ||
nameid_struct param_nc_nameid, | ||
domain_struct *global_domain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these should be nameid_struct *
. Within the function you would then use a ->
rather than a .
* * @brief This structure stores netCDF file name and corresponding nc_id | ||
* *****************************************************************************/ | ||
typedef struct { | ||
char nc_file[MAXSTRING]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is a nameid
struct, I would call use nc_filename
rather than nc_file
. The latter is ambiguous, since I would have expect nc_file
to be of type FILE *
rather than char *
* @brief Open a netCDF file | ||
*****************************************************************************/ | ||
int | ||
open_nc(char *nc_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change function names - this is too close to nc_open
and nc_close
. Also I don't quite understand why we need the wrapper functions, since not much happens here other than a call to nc_open
. So
a) motivate why we need these functions
b) if we need them, change their names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't think we really need these functions - will delete.
if (mpi_rank == VIC_MPI_ROOT) { | ||
// Close domain file | ||
close_nc(filenames.domain); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unclear why we are doing this here rather than in vic_start()
. Can we not do this inside the VIC_MPI_ROOT
block in vic_start
: https://github.com/UW-Hydro/VIC/blob/develop/vic/drivers/shared_image/src/vic_start.c#L69
I really want to avoid loading this information here if it is not needed at this level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we can do this opening and closing domain file thing entirely inside vic_start
. Do you think we should do the same thing for opening and closing parameter file as well (although we do need to open the parameter file in vic_start
, and close it in vic_init
, if that's what we want to do, since the parameter file variables are used in both functions)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically if the information is not needed at a higher level in the code then it is better to hide it (put it "lower" in the code) and put the code where it is used. We want to keep the high level functions as clean as possible. Also keep in mind that opening the parameter file once or twice is not all that expensive. It's expensive when we have to open it for every read.
This PR should be merged after PR #685 |
Conflicts: vic/drivers/image/src/get_global_param.c
including: - pass the pointer of the new nameid_struct variables into functions (instead of passing in the structure itself) - rename the filename element in the nameid_struct from "nc_file" to "nc_filename" - remove the "open_nc" and "close_nc" wrapping functions - move the opening and closing of the domain and parameter nc files into either "vic_start" or "vic_init"
Have addressed @bartnijssen 's comments |
@@ -89,7 +104,7 @@ vic_force(void) | |||
for (j = 0; j < NF; j++) { | |||
d3start[0] = global_param.forceskip[0] + global_param.forceoffset[0] + | |||
j; | |||
get_scatter_nc_field_double(filenames.forcing[0], | |||
get_scatter_nc_field_double(&(filenames.forcing[0]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you doing &(filenames.forcing[0])
? Isn't filenames.forcing[0]
already a string (char *
)? I think that you can revert all these to filenames.forcing[0]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. filenames.forcing[0]
is of type nameid_struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are right!
char nc_filename[MAXSTRING]; | ||
int nc_id; | ||
} nameid_struct; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this needs to be moved to vic_driver_shared_image.h
where all the other netcdf related structures are defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I put this here is that vic_mpi.h
is lower level than vic_driver_shared_image.h
(i.e., vic_driver_shared_image.h
includes vic_mpi.h
, but not vice versa); but the get_scatter_nc_field_XX
functions in vic_mpi.h
needs to use the nameid_struct structure. I agree that it's not super clean - do you have any suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well - the nameid_struct
is not specific to the use of mpi
so it does not really belong there. It's specific to the use of netcdf
. Let me think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - I see what you are saying - I will accept the PR and then quickly create a new one in which all the netcdf function definitions, etc are collected in a separate vic_netcdf.h header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a few things but nothing works out nicely. I think this is a bit of a design flaw here, but I don't see an easy fix. I still don't quite think it belongs where we put it, but we'll leave it be for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - thanks for looking into this.
New speed tests with time for Test 1 (a very small domain): Stehekin (16 grid cells), 10 day, hourly timestep; hydra master node, 1 processor:
Test 2 (a regional-scale domain): Arkansas Red (3999 grid cells), 10 years, 3 hourly timestep; hyak, 1 node (16 processors total):
Test 2 (continental-scale domain): whole CONUS (333579 grid cells), 1 year, 3 hourly timestep; hyak, 2 nodes (32 processors total):
The times in this set of tests are somewhat different from last test because of variation of different VIC run realizations (even if we use the same specs and machines). But note that the "before" time for the 10-year Arkansas-Red run in our last test showed a much longer running time - I might have made some mistakes last time. |
This PR speeds up image driver initialization by only opening and closing each input netCDF file once (instead of opening and closing when reading every variable). This PR is related to issue #538 , issue #380 and PR #564 .
Specifically:
filenames
structure, all input file information is saved as filenames only. Now in this PR, for those input files in netCDF format (domain, parameter, initial state and forcings), both filename and nc_id are saved in a newnameid_struct
in thefilenames
structure. This change allows nc_id to be stored alongside with filename once an input nc file is open, and be directly used later.get_nc_field_<TYPE>
,get_nc_var_attr
,get_nc_var_type
,get_nc_vardimensions
. Instead, these functions assume that the nc file is already open beforehand, and takenameid_struct
(which includes both filename and nc_id) directly as input argument. Created separate functions for opening and closing nc files.get_global_param
, and kept open; then invic_force
, close the previous-year forcing file and open a new-year file if going into a new year; the last-year file is closed after the last simulation timestep.Further things to do:
Note: