-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
0.12 #13
0.12 #13
Conversation
WalkthroughThe Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 5
Configuration used: CodeRabbit UI
Files selected for processing (3)
- basininflow/init.py (1 hunks)
- basininflow/inflow.py (5 hunks)
- tests/tests.py (4 hunks)
Files skipped from review due to trivial changes (1)
- basininflow/init.py
Additional comments: 4
tests/tests.py (3)
24-24: The import statement has been updated to import the entire
basininflow
package instead of specific functions. This change could improve modularity and make it easier to use other parts of the package in the future.48-48: Ensure that the
epsg_code
attribute is consistently set in thecrs
variable across all netCDF datasets to avoid false negatives in the assertion.Verification successful
The attempt to verify the presence of the
epsg_code
attribute in thecrs
variable for netCDF datasets using thencdump
command failed because thencdump
command is not found in the environment where the script was executed. This means we cannot confirm whether theepsg_code
attribute is consistently set across all netCDF datasets using this method.Given the limitations, we should attempt a different strategy to verify the presence of the
epsg_code
attribute in thecrs
variable within the codebase. We can search for any references toepsg_code
within the Python files, as this might indicate how the attribute is being set or used.
The search for references to
epsg_code
within Python files has yielded two results:
- In
tests/tests.py
, there is an assertion that compares theepsg_code
of thecrs
variable betweenoutput_ds
andvalidation_ds
.- In
basininflow/inflow.py
, theepsg_code
is explicitly set to'EPSG:4326'
for acrs_var
variable.This indicates that the
epsg_code
attribute is indeed being used within the codebase and is set at least once to a known value ('EPSG:4326'). The assertion in the test file is checking for consistency between two datasets, which suggests that theepsg_code
should be present and correctly set in thecrs
variables being compared.Based on this information, it can be concluded that the review comment is consistent with the codebase, as the
epsg_code
attribute is being set and checked in the test suite.* 50-51: Moving the assertion for `m3_riv` values below the CRS check is a minor change and should not affect the test outcomes.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify that the epsg_code attribute is set in the crs variable for netCDF datasets. fd '.*\.nc$' --exec bash -c 'ncdump -h {} | grep "epsg_code"'Length of output: 663
Script:
#!/bin/bash # Search for references to 'epsg_code' within Python files to understand its usage. rg "epsg_code" --type pyLength of output: 210
basininflow/inflow.py (1)
- 219-219: Logging the maximum inflow value is a useful feature for debugging and monitoring. However, ensure that this does not result in the logging of sensitive data or significantly impact performance.
tests/tests.py
Outdated
# # TEST 2: Forecast inputs, auto timestep | ||
# create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc', | ||
# 'tests/test_vpu/345', | ||
# 'tests/test_results/', | ||
# cumulative=True) | ||
# | ||
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_345_*.nc')[0], 'r') | ||
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r') | ||
# | ||
# check_function(val_ds, out_ds, 'TEST 2: Forecast inputs, auto timestep') | ||
# | ||
# # TEST 3: Forecast inputs, 1 hour timestep | ||
# create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc', | ||
# 'tests/test_vpu/345', | ||
# 'tests/test_results/', | ||
# vpu_name='custom_vpu', | ||
# cumulative=True, | ||
# timestep=datetime.timedelta(hours=3), | ||
# file_label='file_label') | ||
# | ||
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_custom_vpu_*_file_label.nc')[0], 'r') | ||
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r') | ||
# | ||
# check_function(val_ds, out_ds, 'TEST 3: Forecast inputs, auto timestep') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests for TEST 2 and TEST 3 have been commented out. Please provide context for this change. Will these tests be updated or replaced to accommodate the new features?
basininflow/inflow.py
Outdated
@@ -38,7 +52,8 @@ | |||
comid_lat_lon_z: str = None, | |||
timestep: datetime.timedelta = None, | |||
cumulative: bool = False, | |||
file_label: str = None, ) -> None: | |||
file_label: str = None, | |||
enforce_positive_runoff: bool = False, ) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enforce_positive_runoff
parameter has been added to the create_inflow_file
function. This parameter should be documented in the function's docstring to explain its purpose and usage.
basininflow/inflow.py
Outdated
inflow_df = inflow_df.T.groupby(by=stream_ids).sum().T | ||
inflow_df = inflow_df[sorted_rivid_array] | ||
|
||
def _cumulative_to_incremental(df) -> pd.DataFrame: | ||
return pd.DataFrame( | ||
np.vstack([df.values[0, :], np.diff(df.values, axis=0)]), | ||
index=df.index, | ||
columns=df.columns | ||
) | ||
|
||
def _incremental_to_cumulative(df) -> pd.DataFrame: | ||
return df.cumsum() | ||
|
||
if cumulative: | ||
logging.info('Converting to cumulative values') | ||
inflow_df = _cumulative_to_incremental(inflow_df) | ||
|
||
if enforce_positive_runoff: | ||
inflow_df = inflow_df.clip(lower=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of the enforce_positive_runoff
feature uses the clip
method to ensure non-negative runoff values. This is a clean and efficient way to enforce the constraint. However, it would be beneficial to log a warning if any values are being clipped, to inform the user of potential data issues.
if enforce_positive_runoff:
+ if (inflow_df < 0).any().any():
+ logging.warning('Negative runoff values detected and will be set to zero.')
inflow_df = inflow_df.clip(lower=0)
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
inflow_df = inflow_df.T.groupby(by=stream_ids).sum().T | |
inflow_df = inflow_df[sorted_rivid_array] | |
def _cumulative_to_incremental(df) -> pd.DataFrame: | |
return pd.DataFrame( | |
np.vstack([df.values[0, :], np.diff(df.values, axis=0)]), | |
index=df.index, | |
columns=df.columns | |
) | |
def _incremental_to_cumulative(df) -> pd.DataFrame: | |
return df.cumsum() | |
if cumulative: | |
logging.info('Converting to cumulative values') | |
inflow_df = _cumulative_to_incremental(inflow_df) | |
if enforce_positive_runoff: | |
inflow_df = inflow_df.clip(lower=0) | |
inflow_df = inflow_df.T.groupby(by=stream_ids).sum().T | |
inflow_df = inflow_df[sorted_rivid_array] | |
if cumulative: | |
inflow_df = _cumulative_to_incremental(inflow_df) | |
if enforce_positive_runoff: | |
if (inflow_df < 0).any().any(): | |
logging.warning('Negative runoff values detected and will be set to zero.') | |
inflow_df = inflow_df.clip(lower=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 4
Configuration used: CodeRabbit UI
Files selected for processing (1)
- basininflow/inflow.py (7 hunks)
Additional comments: 4
basininflow/inflow.py (4)
- 33-39: The
_cumulative_to_incremental
function has been added. Ensure that this function is thoroughly tested, especially edge cases where the first row of the DataFrame might contain special values like NaNs or infinities.
Past comment still valid.- 42-44: The
_incremental_to_cumulative
function has been added. Similar to the previous function, it is important to test this function extensively, including edge cases.
Past comment still valid.- 56-56: The
enforce_positive_runoff
parameter has been added to thecreate_inflow_file
function. This parameter should be documented in the function's docstring to explain its purpose and usage.
Past comment still valid.- 192-199: The implementation of the
enforce_positive_runoff
feature uses theclip
method to ensure non-negative runoff values. This is a clean and efficient way to enforce the constraint. However, it would be beneficial to log a warning if any values are being clipped, to inform the user of potential data issues.
Past comment still valid.
@@ -77,7 +94,7 @@ def create_inflow_file(lsm_data: str, | |||
vpu_name = os.path.basename(input_dir) | |||
|
|||
# open all the ncs and select only the area within the weight table | |||
if type(lsm_data) == list: | |||
if type(lsm_data) is list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conditional checks for lsm_data
are incomplete. The ellipsis (...
) suggests that there is code that should be present but is not. Ensure that the logic for handling different types of lsm_data
inputs is fully implemented.
elif os.path.isdir(lsm_data):
lsm_data = os.path.join(lsm_data, '*.nc*')
elif os.path.isfile(lsm_data):
+ # Implement logic for handling a single file
elif '*' in lsm_data:
+ # Implement logic for handling glob pattern
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
if type(lsm_data) is list: | |
if type(lsm_data) is list: | |
elif os.path.isdir(lsm_data): | |
lsm_data = os.path.join(lsm_data, '*.nc*') | |
elif os.path.isfile(lsm_data): | |
# Implement logic for handling a single file | |
elif '*' in lsm_data: | |
# Implement logic for handling glob pattern |
basininflow/inflow.py
Outdated
if enforce_positive_runoff: | ||
inflow_df = inflow_df.clip(lower=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enforce_positive_runoff
feature is implemented by clipping the DataFrame to ensure non-negative values. Consider adding a check to log a warning if any values are being clipped. This would inform the user of potential data issues.
if enforce_positive_runoff:
+ if (inflow_df < 0).any().any():
+ logging.warning('Negative runoff values detected and will be set to zero.')
inflow_df = inflow_df.clip(lower=0)
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
if enforce_positive_runoff: | |
inflow_df = inflow_df.clip(lower=0) | |
if enforce_positive_runoff: | |
if (inflow_df < 0).any().any(): | |
logging.warning('Negative runoff values detected and will be set to zero.') | |
inflow_df = inflow_df.clip(lower=0) |
elif isinstance(timestep, datetime.timedelta): | ||
# Convert datetime timedelta to timedelta64[ns] | ||
timestep = np.timedelta64(timestep,'ns') | ||
timestep = np.timedelta64(timestep, 'ns') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for handling non-uniform timesteps seems to be incomplete. The timestep
variable is checked for being an instance of datetime.timedelta
but there is no corresponding action if it is not. Ensure that the logic is complete and handles all expected cases.
elif isinstance(timestep, datetime.timedelta):
+ # Implement logic for handling timestep when it is not a datetime.timedelta instance
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
elif isinstance(timestep, datetime.timedelta): | |
# Convert datetime timedelta to timedelta64[ns] | |
timestep = np.timedelta64(timestep,'ns') | |
timestep = np.timedelta64(timestep, 'ns') | |
elif isinstance(timestep, datetime.timedelta): | |
# Convert datetime timedelta to timedelta64[ns] | |
timestep = np.timedelta64(timestep, 'ns') | |
# Implement logic for handling timestep when it is not a datetime.timedelta instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 1
Configuration used: CodeRabbit UI
Files selected for processing (2)
- basininflow/inflow.py (7 hunks)
- tests/tests.py (4 hunks)
Files skipped from review as they are similar to previous changes (1)
- basininflow/inflow.py
Additional comments: 3
tests/tests.py (3)
- 25-25: The import statement has been changed from a specific function import to a module-level import. This change is consistent with the rest of the file and should not cause any issues if the module
basininflow
is structured correctly.- 65-69: The
create_inflow_file
function call has been updated with two new parameters:cumulative
andenforce_positive_runoff
. Ensure that the default behavior of the function remains unchanged when these parameters are not provided, to maintain backward compatibility.Verification successful
The function
create_inflow_file
has default values for the parameterscumulative
andforce_positive_runoff
set toFalse
. This indicates that the default behavior of the function remains unchanged when these parameters are not provided, thus maintaining backward compatibility.* 74-74: The description for TEST 1 has been updated to reflect the new parameters. This is a good practice as it provides clarity on what the test is verifying.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify that the default behavior of create_inflow_file remains unchanged. ast-grep --lang python --pattern $'def create_inflow_file($_, $_, $_, cumulative=$_, enforce_positive_runoff=$_)'Length of output: 459
Script:
#!/bin/bash # Search for the definition of create_inflow_file to check for default parameter values ast-grep --lang python --pattern $'def create_inflow_file($$$)'Length of output: 409
Script:
#!/bin/bash # Use ripgrep to search for the definition of create_inflow_file in .py files rg "def create_inflow_file" -t pyLength of output: 95
Script:
#!/bin/bash # Extract the full function signature of create_inflow_file from inflow.py rg "def create_inflow_file" -A 3 basininflow/inflow.pyLength of output: 218
Script:
#!/bin/bash # Extract more lines following the function signature of create_inflow_file to see the default values for new parameters rg "def create_inflow_file" -A 10 basininflow/inflow.pyLength of output: 555
# # TEST 2: Forecast inputs, auto timestep | ||
# bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc', | ||
# 'tests/test_vpu/345', | ||
# 'tests/test_results/', | ||
# cumulative=True) | ||
# | ||
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_345_*.nc')[0], 'r') | ||
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r') | ||
# | ||
# check_function(val_ds, out_ds, 'TEST 2: Forecast inputs, auto timestep') | ||
# | ||
# # TEST 3: Forecast inputs, 1 hour timestep | ||
# bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc', | ||
# 'tests/test_vpu/345', | ||
# 'tests/test_results/', | ||
# vpu_name='custom_vpu', | ||
# cumulative=True, | ||
# timestep=datetime.timedelta(hours=3), | ||
# file_label='file_label') | ||
# | ||
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_custom_vpu_*_file_label.nc')[0], 'r') | ||
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r') | ||
# | ||
# check_function(val_ds, out_ds, 'TEST 3: Forecast inputs, auto timestep') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TEST 2 and TEST 3 have been commented out. If these tests are no longer relevant due to the new changes, they should be removed instead of being commented out to keep the codebase clean. If they are temporarily disabled, there should be a TODO or FIXME comment explaining why and what needs to be done to re-enable them.
- # # TEST 2: Forecast inputs, auto timestep
- # ...
- # # TEST 3: Forecast inputs, 1 hour timestep
- # ...
+ # TODO: Re-enable or remove these tests after addressing the changes related to the new parameters.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
# # TEST 2: Forecast inputs, auto timestep | |
# bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc', | |
# 'tests/test_vpu/345', | |
# 'tests/test_results/', | |
# cumulative=True) | |
# | |
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_345_*.nc')[0], 'r') | |
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r') | |
# | |
# check_function(val_ds, out_ds, 'TEST 2: Forecast inputs, auto timestep') | |
# | |
# # TEST 3: Forecast inputs, 1 hour timestep | |
# bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc', | |
# 'tests/test_vpu/345', | |
# 'tests/test_results/', | |
# vpu_name='custom_vpu', | |
# cumulative=True, | |
# timestep=datetime.timedelta(hours=3), | |
# file_label='file_label') | |
# | |
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_custom_vpu_*_file_label.nc')[0], 'r') | |
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r') | |
# | |
# check_function(val_ds, out_ds, 'TEST 3: Forecast inputs, auto timestep') | |
# TODO: Re-enable or remove these tests after addressing the changes related to the new parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (3)
- basininflow/cli.py (1 hunks)
- basininflow/inflow.py (7 hunks)
- tests/tests.py (4 hunks)
Files skipped from review due to trivial changes (1)
- basininflow/cli.py
Files skipped from review as they are similar to previous changes (2)
- basininflow/inflow.py
- tests/tests.py
Toggle forcing non-negative runoff volumes
Summary by CodeRabbit
New Features
Enhancements
Tests
m3_riv
value assertions in test cases.Documentation