0.12 #13

Merged: rileyhales merged 7 commits into main from 0.12 on Jan 19, 2024

rileyhales (Collaborator) commented Jan 9, 2024

Toggle forcing non-negative runoff volumes

Summary by CodeRabbit

  • New Features

    • Added new data conversion functions for hydrological calculations.
    • Included a parameter to ensure positive runoff values in inflow file creation.
  • Enhancements

    • Improved handling and logging of maximum inflow values.
  • Tests

    • Updated tests to reflect changes in the inflow module.
    • Adjusted CRS checks and m3_riv value assertions in test cases.
    • Temporarily disabled TEST 2 and TEST 3 for further review.
  • Documentation

    • Updated version number to 0.12.0.

@rileyhales requested a review from RickytheGuy January 9, 2024 19:56
@rileyhales self-assigned this Jan 9, 2024

coderabbitai bot commented Jan 9, 2024

Walkthrough

The basininflow package has been updated to version 0.12.0, introducing functions to convert between cumulative and incremental inflow data. A new parameter enhances the create_inflow_file function, improving runoff data handling. Test adjustments reflect these updates, including a shift to a simplified import statement and refined CRS validation, while two tests are temporarily disabled.

Changes

File Path | Change Summary
__init__.py | Updated the package version from 0.11.0 to 0.12.0.
inflow.py | Added new functions for inflow data conversion and a parameter to the create_inflow_file function; updated cumulative value handling and logging.
tests/tests.py | Simplified import statement; updated CRS check and m3_riv value assertion; commented out TEST 2 and TEST 3.

Poem

To version point twelve, we hop and leap,
With functions new, and changes deep.
Runoff's positive, tests take a nap,
🐇 CodeRabbit's work, now on the map!

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 5

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between ccf499b and d768e91.
Files selected for processing (3)
  • basininflow/__init__.py (1 hunks)
  • basininflow/inflow.py (5 hunks)
  • tests/tests.py (4 hunks)
Files skipped from review due to trivial changes (1)
  • basininflow/__init__.py
Additional comments: 4
tests/tests.py (3)
  • 24-24: The import statement has been updated to import the entire basininflow package instead of specific functions. This change could improve modularity and make it easier to use other parts of the package in the future.

  • 48-48: Ensure that the epsg_code attribute is consistently set in the crs variable across all netCDF datasets to avoid false negatives in the assertion.

Verification successful

The attempt to verify the presence of the epsg_code attribute in the crs variable for netCDF datasets using the ncdump command failed because the ncdump command is not found in the environment where the script was executed. This means we cannot confirm whether the epsg_code attribute is consistently set across all netCDF datasets using this method.

Given the limitations, we should attempt a different strategy to verify the presence of the epsg_code attribute in the crs variable within the codebase. We can search for any references to epsg_code within the Python files, as this might indicate how the attribute is being set or used.


The search for references to epsg_code within Python files has yielded two results:

  1. In tests/tests.py, there is an assertion that compares the epsg_code of the crs variable between output_ds and validation_ds.
  2. In basininflow/inflow.py, the epsg_code is explicitly set to 'EPSG:4326' for a crs_var variable.

This indicates that the epsg_code attribute is indeed being used within the codebase and is set at least once to a known value ('EPSG:4326'). The assertion in the test file is checking for consistency between two datasets, which suggests that the epsg_code should be present and correctly set in the crs variables being compared.

Based on this information, it can be concluded that the review comment is consistent with the codebase, as the epsg_code attribute is being set and checked in the test suite.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify that the epsg_code attribute is set in the crs variable for netCDF datasets.
fd '.*\.nc$' --exec bash -c 'ncdump -h {} | grep "epsg_code"'

Length of output: 663



Script:

#!/bin/bash
# Search for references to 'epsg_code' within Python files to understand its usage.
rg "epsg_code" --type py

Length of output: 210

  • 50-51: Moving the assertion for m3_riv values below the CRS check is a minor change and should not affect the test outcomes.
basininflow/inflow.py (1)
  • 219-219: Logging the maximum inflow value is a useful feature for debugging and monitoring. However, ensure that this does not result in the logging of sensitive data or significantly impact performance.

tests/tests.py Outdated
Comment on lines 74 to 97
# # TEST 2: Forecast inputs, auto timestep
# create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc',
# 'tests/test_vpu/345',
# 'tests/test_results/',
# cumulative=True)
#
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_345_*.nc')[0], 'r')
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r')
#
# check_function(val_ds, out_ds, 'TEST 2: Forecast inputs, auto timestep')
#
# # TEST 3: Forecast inputs, 1 hour timestep
# create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc',
# 'tests/test_vpu/345',
# 'tests/test_results/',
# vpu_name='custom_vpu',
# cumulative=True,
# timestep=datetime.timedelta(hours=3),
# file_label='file_label')
#
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_custom_vpu_*_file_label.nc')[0], 'r')
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r')
#
# check_function(val_ds, out_ds, 'TEST 3: Forecast inputs, auto timestep')

The tests for TEST 2 and TEST 3 have been commented out. Please provide context for this change. Will these tests be updated or replaced to accommodate the new features?

basininflow/inflow.py
@@ -38,7 +52,8 @@
  comid_lat_lon_z: str = None,
  timestep: datetime.timedelta = None,
  cumulative: bool = False,
- file_label: str = None, ) -> None:
+ file_label: str = None,
+ enforce_positive_runoff: bool = False, ) -> None:

The enforce_positive_runoff parameter has been added to the create_inflow_file function. This parameter should be documented in the function's docstring to explain its purpose and usage.
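
A docstring entry along the following lines would address this request. It is only a sketch: the function name `create_inflow_file_sketch` is a hypothetical stand-in, the signature is trimmed to two of the parameters visible in the diff, and the wording of each description is an assumption based on how the diff uses the flags.

```python
def create_inflow_file_sketch(cumulative: bool = False,
                              enforce_positive_runoff: bool = False) -> None:
    """Hypothetical stand-in for create_inflow_file (other parameters omitted).

    Args:
        cumulative: If True, treat the input runoff as cumulative and
            convert it to incremental values before writing.
        enforce_positive_runoff: If True, clip negative runoff values to
            zero before writing. Defaults to False, which leaves the
            values unchanged.
    """
    # The real function body is not reproduced here; this stub only
    # illustrates where the parameter documentation would live.
    return None
```

Stating the default and its effect in the docstring makes it clear that callers who never pass the flag see no change in behavior.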

Comment on lines 190 to 197
inflow_df = inflow_df.T.groupby(by=stream_ids).sum().T
inflow_df = inflow_df[sorted_rivid_array]

def _cumulative_to_incremental(df) -> pd.DataFrame:
    return pd.DataFrame(
        np.vstack([df.values[0, :], np.diff(df.values, axis=0)]),
        index=df.index,
        columns=df.columns
    )

def _incremental_to_cumulative(df) -> pd.DataFrame:
    return df.cumsum()

if cumulative:
    logging.info('Converting to cumulative values')
    inflow_df = _cumulative_to_incremental(inflow_df)

if enforce_positive_runoff:
    inflow_df = inflow_df.clip(lower=0)

The implementation of the enforce_positive_runoff feature uses the clip method to ensure non-negative runoff values. This is a clean and efficient way to enforce the constraint. However, it would be beneficial to log a warning if any values are being clipped, to inform the user of potential data issues.

if enforce_positive_runoff:
+   if (inflow_df < 0).any().any():
+       logging.warning('Negative runoff values detected and will be set to zero.')
    inflow_df = inflow_df.clip(lower=0)
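
A slightly more informative variant of the suggestion above could report how many cells were clipped. The sketch below assumes pandas is available; `clip_negative_runoff` is a hypothetical helper name, not a function in basininflow.

```python
import logging

import pandas as pd


def clip_negative_runoff(inflow_df: pd.DataFrame) -> pd.DataFrame:
    """Clip negative values to zero, warning with the number of cells affected."""
    # Count the negative cells across the whole frame before clipping.
    n_negative = int((inflow_df < 0).sum().sum())
    if n_negative:
        logging.warning('%d negative runoff values detected and set to zero.', n_negative)
    # clip returns a new DataFrame, so the caller's input is left untouched.
    return inflow_df.clip(lower=0)


df = pd.DataFrame({'a': [1.0, -0.5, 2.0], 'b': [-1.0, 0.0, 3.0]})
clipped = clip_negative_runoff(df)
```

Counting before clipping costs one extra pass over the data, which is the same order of work as the `.any().any()` check in the suggestion.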

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
- inflow_df = inflow_df.T.groupby(by=stream_ids).sum().T
- inflow_df = inflow_df[sorted_rivid_array]
- def _cumulative_to_incremental(df) -> pd.DataFrame:
-     return pd.DataFrame(
-         np.vstack([df.values[0, :], np.diff(df.values, axis=0)]),
-         index=df.index,
-         columns=df.columns
-     )
- def _incremental_to_cumulative(df) -> pd.DataFrame:
-     return df.cumsum()
- if cumulative:
-     logging.info('Converting to cumulative values')
-     inflow_df = _cumulative_to_incremental(inflow_df)
- if enforce_positive_runoff:
-     inflow_df = inflow_df.clip(lower=0)
+ inflow_df = inflow_df.T.groupby(by=stream_ids).sum().T
+ inflow_df = inflow_df[sorted_rivid_array]
+ if cumulative:
+     inflow_df = _cumulative_to_incremental(inflow_df)
+ if enforce_positive_runoff:
+     if (inflow_df < 0).any().any():
+         logging.warning('Negative runoff values detected and will be set to zero.')
+     inflow_df = inflow_df.clip(lower=0)

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 4

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between d768e91 and 9821433.
Files selected for processing (1)
  • basininflow/inflow.py (7 hunks)
Additional comments: 4
basininflow/inflow.py (4)
  • 33-39: The _cumulative_to_incremental function has been added. Ensure that this function is thoroughly tested, especially edge cases where the first row of the DataFrame might contain special values like NaNs or infinities.
    Past comment still valid.
  • 42-44: The _incremental_to_cumulative function has been added. Similar to the previous function, it is important to test this function extensively, including edge cases.
    Past comment still valid.
  • 56-56: The enforce_positive_runoff parameter has been added to the create_inflow_file function. This parameter should be documented in the function's docstring to explain its purpose and usage.
    Past comment still valid.
  • 192-199: The implementation of the enforce_positive_runoff feature uses the clip method to ensure non-negative runoff values. This is a clean and efficient way to enforce the constraint. However, it would be beneficial to log a warning if any values are being clipped, to inform the user of potential data issues.
    Past comment still valid.
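
The edge cases raised in the comments above can be probed with a short check like the one below. It is a sketch that assumes numpy and pandas are installed; the two helpers are re-declared here without the leading underscore so the block runs on its own, with bodies copied from the diff.

```python
import numpy as np
import pandas as pd


def cumulative_to_incremental(df: pd.DataFrame) -> pd.DataFrame:
    # Keep the first row as-is, then take row-to-row differences.
    return pd.DataFrame(
        np.vstack([df.values[0, :], np.diff(df.values, axis=0)]),
        index=df.index,
        columns=df.columns,
    )


def incremental_to_cumulative(df: pd.DataFrame) -> pd.DataFrame:
    # Running sum down each column restores the cumulative series.
    return df.cumsum()


cumulative = pd.DataFrame({'r1': [1.0, 3.0, 6.0], 'r2': [0.0, 2.0, 2.0]})
incremental = cumulative_to_incremental(cumulative)
round_trip = incremental_to_cumulative(incremental)

# A single NaN in the cumulative input affects the difference on both sides of it.
with_nan = pd.DataFrame({'r1': [1.0, np.nan, 6.0]})
nan_result = cumulative_to_incremental(with_nan)
```

The NaN case shows that one missing cumulative value produces two missing incremental values, which is worth knowing before deciding how gaps in the input should be handled.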

@@ -77,7 +94,7 @@ def create_inflow_file(lsm_data: str,
vpu_name = os.path.basename(input_dir)

# open all the ncs and select only the area within the weight table
- if type(lsm_data) == list:
+ if type(lsm_data) is list:

The conditional checks for lsm_data are incomplete. The ellipsis (...) suggests that there is code that should be present but is not. Ensure that the logic for handling different types of lsm_data inputs is fully implemented.

elif os.path.isdir(lsm_data):
    lsm_data = os.path.join(lsm_data, '*.nc*')
elif os.path.isfile(lsm_data):
+   # Implement logic for handling a single file
elif '*' in lsm_data:
+   # Implement logic for handling glob pattern
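
One way to make every branch explicit is to normalize the input into a list of paths up front. The sketch below uses only the standard library; `resolve_lsm_paths` is a hypothetical helper, and the choice to raise `FileNotFoundError` for unrecognized input is an assumption rather than the package's documented behavior.

```python
import glob
import os
import tempfile


def resolve_lsm_paths(lsm_data) -> list:
    """Normalize a list, directory, single file, or glob pattern into a sorted list of paths."""
    if isinstance(lsm_data, list):
        return sorted(lsm_data)
    if os.path.isdir(lsm_data):
        # Same pattern as the diff: every .nc* file in the directory.
        return sorted(glob.glob(os.path.join(lsm_data, '*.nc*')))
    if os.path.isfile(lsm_data):
        return [lsm_data]
    if '*' in lsm_data:
        return sorted(glob.glob(lsm_data))
    raise FileNotFoundError(f'{lsm_data} is not a list, directory, file, or glob pattern')


# Exercise each string branch against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b.nc', 'a.nc4', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    from_dir = [os.path.basename(p) for p in resolve_lsm_paths(tmp)]
    from_file = resolve_lsm_paths(os.path.join(tmp, 'b.nc'))
    from_glob = [os.path.basename(p) for p in resolve_lsm_paths(os.path.join(tmp, '*.nc'))]
```

Returning a list in every case means the code that opens the datasets needs only one path, and the final `raise` guarantees that no input type falls through silently.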

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
- if type(lsm_data) is list:
+ if type(lsm_data) is list:
+ elif os.path.isdir(lsm_data):
+     lsm_data = os.path.join(lsm_data, '*.nc*')
+ elif os.path.isfile(lsm_data):
+     # Implement logic for handling a single file
+ elif '*' in lsm_data:
+     # Implement logic for handling glob pattern

Comment on lines 198 to 199
if enforce_positive_runoff:
inflow_df = inflow_df.clip(lower=0)

The enforce_positive_runoff feature is implemented by clipping the DataFrame to ensure non-negative values. Consider adding a check to log a warning if any values are being clipped. This would inform the user of potential data issues.

if enforce_positive_runoff:
+   if (inflow_df < 0).any().any():
+       logging.warning('Negative runoff values detected and will be set to zero.')
    inflow_df = inflow_df.clip(lower=0)

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
- if enforce_positive_runoff:
-     inflow_df = inflow_df.clip(lower=0)
+ if enforce_positive_runoff:
+     if (inflow_df < 0).any().any():
+         logging.warning('Negative runoff values detected and will be set to zero.')
+     inflow_df = inflow_df.clip(lower=0)

Comment on lines +209 to +211
  elif isinstance(timestep, datetime.timedelta):
      # Convert datetime timedelta to timedelta64[ns]
-     timestep = np.timedelta64(timestep,'ns')
+     timestep = np.timedelta64(timestep, 'ns')

The logic for handling non-uniform timesteps seems to be incomplete. The timestep variable is checked for being an instance of datetime.timedelta but there is no corresponding action if it is not. Ensure that the logic is complete and handles all expected cases.

elif isinstance(timestep, datetime.timedelta):
+   # Implement logic for handling timestep when it is not a datetime.timedelta instance
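
A complete set of branches for the timestep argument might look like the sketch below. It assumes numpy is available; `normalize_timestep` is a hypothetical helper, and treating `None` as "infer the timestep later" and rejecting other types with `TypeError` are assumptions, since the surrounding code is not shown in this diff.

```python
import datetime

import numpy as np


def normalize_timestep(timestep):
    """Return the timestep as np.timedelta64[ns], or None when it should be inferred."""
    if timestep is None:
        # Assumed meaning: the caller wants the timestep inferred from the data.
        return None
    if isinstance(timestep, np.timedelta64):
        return timestep.astype('timedelta64[ns]')
    if isinstance(timestep, datetime.timedelta):
        # Same conversion as the diff above.
        return np.timedelta64(timestep, 'ns')
    raise TypeError(f'Unsupported timestep type: {type(timestep).__name__}')


three_hours = normalize_timestep(datetime.timedelta(hours=3))
```

Failing loudly on an unexpected type avoids a later, harder-to-read error when the value is compared against the time axis of the dataset.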

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
- elif isinstance(timestep, datetime.timedelta):
-     # Convert datetime timedelta to timedelta64[ns]
-     timestep = np.timedelta64(timestep,'ns')
-     timestep = np.timedelta64(timestep, 'ns')
+ elif isinstance(timestep, datetime.timedelta):
+     # Convert datetime timedelta to timedelta64[ns]
+     timestep = np.timedelta64(timestep, 'ns')
+     # Implement logic for handling timestep when it is not a datetime.timedelta instance

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between 9821433 and 86c3605.
Files selected for processing (2)
  • basininflow/inflow.py (7 hunks)
  • tests/tests.py (4 hunks)
Files skipped from review as they are similar to previous changes (1)
  • basininflow/inflow.py
Additional comments: 3
tests/tests.py (3)
  • 25-25: The import statement has been changed from a specific function import to a module-level import. This change is consistent with the rest of the file and should not cause any issues if the module basininflow is structured correctly.
  • 65-69: The create_inflow_file function call has been updated with two new parameters: cumulative and enforce_positive_runoff. Ensure that the default behavior of the function remains unchanged when these parameters are not provided, to maintain backward compatibility.
Verification successful

The function create_inflow_file has default values for the parameters cumulative and force_positive_runoff set to False. This indicates that the default behavior of the function remains unchanged when these parameters are not provided, thus maintaining backward compatibility.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify that the default behavior of create_inflow_file remains unchanged.
ast-grep --lang python --pattern $'def create_inflow_file($_, $_, $_, cumulative=$_, enforce_positive_runoff=$_)'

Length of output: 459



Script:

#!/bin/bash
# Search for the definition of create_inflow_file to check for default parameter values
ast-grep --lang python --pattern $'def create_inflow_file($$$)'

Length of output: 409



Script:

#!/bin/bash
# Use ripgrep to search for the definition of create_inflow_file in .py files
rg "def create_inflow_file" -t py

Length of output: 95



Script:

#!/bin/bash
# Extract the full function signature of create_inflow_file from inflow.py
rg "def create_inflow_file" -A 3 basininflow/inflow.py

Length of output: 218



Script:

#!/bin/bash
# Extract more lines following the function signature of create_inflow_file to see the default values for new parameters
rg "def create_inflow_file" -A 10 basininflow/inflow.py

Length of output: 555

  • 74-74: The description for TEST 1 has been updated to reflect the new parameters. This is a good practice as it provides clarity on what the test is verifying.
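
The backward-compatibility check on default values described in the comments above can also be done from Python rather than with grep-style tools. The sketch below uses only the standard library; `create_inflow_file_stub` is a hypothetical stand-in whose positional parameter names are partly assumed, and `optional_defaults` is an illustrative helper.

```python
import inspect


def create_inflow_file_stub(lsm_data, input_dir, inflow_dir,
                            cumulative: bool = False,
                            enforce_positive_runoff: bool = False) -> None:
    """Stand-in carrying the two keyword parameters under discussion (hypothetical)."""


def optional_defaults(func) -> dict:
    """Map each parameter that has a default to that default value."""
    return {
        name: param.default
        for name, param in inspect.signature(func).parameters.items()
        if param.default is not inspect.Parameter.empty
    }


defaults = optional_defaults(create_inflow_file_stub)
```

Pointing `optional_defaults` at the real function in a test would catch an accidental change to either default in a future release.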

Comment on lines +76 to +99
# # TEST 2: Forecast inputs, auto timestep
# bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc',
# 'tests/test_vpu/345',
# 'tests/test_results/',
# cumulative=True)
#
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_345_*.nc')[0], 'r')
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r')
#
# check_function(val_ds, out_ds, 'TEST 2: Forecast inputs, auto timestep')
#
# # TEST 3: Forecast inputs, 1 hour timestep
# bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc',
# 'tests/test_vpu/345',
# 'tests/test_results/',
# vpu_name='custom_vpu',
# cumulative=True,
# timestep=datetime.timedelta(hours=3),
# file_label='file_label')
#
# out_ds = nc.Dataset(glob.glob('./tests/test_results/*_custom_vpu_*_file_label.nc')[0], 'r')
# val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r')
#
# check_function(val_ds, out_ds, 'TEST 3: Forecast inputs, auto timestep')

TEST 2 and TEST 3 have been commented out. If these tests are no longer relevant due to the new changes, they should be removed instead of being commented out to keep the codebase clean. If they are temporarily disabled, there should be a TODO or FIXME comment explaining why and what needs to be done to re-enable them.

- # # TEST 2: Forecast inputs, auto timestep
- # ...
- # # TEST 3: Forecast inputs, 1 hour timestep
- # ...
+ # TODO: Re-enable or remove these tests after addressing the changes related to the new parameters.
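
An alternative to commenting tests out is to keep them in the file and mark them as skipped with a reason, so they still appear in test reports. The sketch below uses the standard library `unittest` module; it assumes the tests were restructured as test cases, which is not how tests/tests.py is organized in this PR, and the class and method names are hypothetical.

```python
import unittest


class ForecastInflowTests(unittest.TestCase):
    @unittest.skip('TODO: re-enable or remove after addressing the new parameters')
    def test_forecast_auto_timestep(self):
        # The body would call create_inflow_file with cumulative=True
        # and compare the output against the validation dataset.
        self.fail('not reached while the test is skipped')

    def test_placeholder_runs(self):
        # Stands in for TEST 1, which remains enabled.
        self.assertTrue(True)


# Run the suite programmatically and collect the results.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ForecastInflowTests)
result = unittest.TestResult()
suite.run(result)
```

After the run, `result.skipped` holds the skipped test together with its reason string, which keeps the pending work visible instead of hidden in comments.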

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
- # # TEST 2: Forecast inputs, auto timestep
- # bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc',
- # 'tests/test_vpu/345',
- # 'tests/test_results/',
- # cumulative=True)
- #
- # out_ds = nc.Dataset(glob.glob('./tests/test_results/*_345_*.nc')[0], 'r')
- # val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r')
- #
- # check_function(val_ds, out_ds, 'TEST 2: Forecast inputs, auto timestep')
- #
- # # TEST 3: Forecast inputs, 1 hour timestep
- # bi.create_inflow_file('tests/inputs/era5_2560x5120_sample_data/forecast_data.nc',
- # 'tests/test_vpu/345',
- # 'tests/test_results/',
- # vpu_name='custom_vpu',
- # cumulative=True,
- # timestep=datetime.timedelta(hours=3),
- # file_label='file_label')
- #
- # out_ds = nc.Dataset(glob.glob('./tests/test_results/*_custom_vpu_*_file_label.nc')[0], 'r')
- # val_ds = nc.Dataset('tests/validation/forecast_3_to_6_hour.nc', 'r')
- #
- # check_function(val_ds, out_ds, 'TEST 3: Forecast inputs, auto timestep')
+ # TODO: Re-enable or remove these tests after addressing the changes related to the new parameters.

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between 86c3605 and a2f4ba2.
Files selected for processing (3)
  • basininflow/cli.py (1 hunks)
  • basininflow/inflow.py (7 hunks)
  • tests/tests.py (4 hunks)
Files skipped from review due to trivial changes (1)
  • basininflow/cli.py
Files skipped from review as they are similar to previous changes (2)
  • basininflow/inflow.py
  • tests/tests.py

@rileyhales merged commit 3a084ff into main Jan 19, 2024
1 check passed
@rileyhales deleted the 0.12 branch January 19, 2024 19:01