Commit

Renaming to hbc (#26)
rileyhales authored Dec 1, 2021
1 parent 52b1497 commit 7ee8d79
Showing 20 changed files with 106 additions and 103 deletions.
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
The Clear BSD License
https://choosealicense.com/licenses/bsd-3-clause-clear/

Copyright (c) 2020 Riley Chad Hales
Copyright (c) 2021 Riley Chad Hales
All rights reserved.

Redistribution and use in source and binary forms, with or without
69 changes: 35 additions & 34 deletions README.md
@@ -1,4 +1,4 @@
# Regional Bias Correction of Large Hydrological Models
# Hydrological Bias Correction on Large Models
This repository contains Python code which can be used to calibrate biased, non-gridded hydrologic models. Most of the
code in this repository will work on any model's results. The data preprocessing and automated calibration functions
are programmed to expect data following the GEOGloWS ECMWF Streamflow Service's structure and format.
@@ -17,6 +17,7 @@ to tweak. We do this in an attempt to avoid requiring the source model data or t
Neither of those conditions is always available or practical when dealing with large-scale models or datasets.

## Python environment
See requirements.txt
- python >= 3.7
- numpy
- pandas
@@ -50,10 +51,10 @@ file formats are acceptable
### 1 Create a Working Directory

```python
import rbc
import hbc

path_to_working_directory = '/my/file/path'
rbc.prep.scaffold_workdir(path_to_working_directory)
hbc.prep.scaffold_workdir(path_to_working_directory)
```

Your working directory should look exactly like this.
@@ -169,9 +170,9 @@ unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_
... | ... | ... | ... | ...

```python
import rbc
import hbc
workdir = '/path/to/project/directory/'
rbc.prep.gen_assignments_table(workdir)
hbc.prep.gen_assignments_table(workdir)
```

Your project's working directory now looks like
@@ -233,22 +234,22 @@ month | model_id_1 | model_id_2 | model_id_3
1 | 60 | 60 | 60
2 | 30 | 30 | 30
3 | 70 | 70 | 70
... | ... | ... | ...
... | ... | ... | ...
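
A table like this one can be derived from daily simulated flows with pandas. The sketch below uses synthetic data and hypothetical column names, not the package's own preprocessing:

```python
import numpy as np
import pandas as pd

# Hypothetical daily simulated flows, one column per model ID
dates = pd.date_range('2000-01-01', '2009-12-31', freq='D')
rng = np.random.default_rng(0)
flows = pd.DataFrame(
    rng.gamma(2.0, 30.0, size=(len(dates), 3)),
    index=dates,
    columns=['model_id_1', 'model_id_2', 'model_id_3'],
)

# Group by calendar month and average to get the month x model_id table
monthly_avg = flows.groupby(flows.index.month).mean()
monthly_avg.index.name = 'month'
print(monthly_avg.round(1))
```

The same grouping works regardless of how many model IDs (columns) the simulation contains.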

```python
import rbc
import hbc

workdir = '/path/to/working/directory'

rbc.prep.historical_simulation(
hbc.prep.historical_simulation(
workdir,
'/path/to/historical/simulation/netcdf.nc' # optional - if nc not stored in data_inputs folder
)
rbc.prep.hist_sim_table(
hbc.prep.hist_sim_table(
workdir,
'/path/to/historical/simulation/netcdf.nc' # optional - if nc not stored in data_inputs folder
)
rbc.prep.observed_data(
hbc.prep.observed_data(
workdir,
'/path/to/obs/csv/directory' # optional - if csvs not stored in workdir/data_inputs/obs_csvs
)
@@ -295,10 +296,10 @@ For each of the following, generate and store clusters for many group sizes- bet
Use this code:

```python
import rbc
import hbc

workdir = '/path/to/project/directory/'
rbc.cluster.generate(workdir)
hbc.cluster.generate(workdir)
```

This function creates trained kmeans models saved as pickle files, plots (from matplotlib) of what each of the clusters
@@ -353,16 +354,16 @@ The justification for this is obvious. The observations are the actual streamflo
- The reason listed for this assignment is "gauged"

```python
import rbc
import hbc

# assign_table = pandas DataFrame (see rbc.table module)
# assign_table = pandas DataFrame (see hbc.table module)
workdir = '/path/to/project/directory/'
assign_table = rbc.table.read(workdir)
rbc.assign.gauged(assign_table)
assign_table = hbc.table.read(workdir)
hbc.assign.gauged(assign_table)
```

### 7 Assign basins by Propagation (hydraulically connected to a gauge)
This step involves editing the `assign_table.csv` and but does not change the file structure of the project.
This step involves editing the `assign_table.csv` and does not change the file structure of the project.

Theory: being up/downstream of the gauge but on the same stream order likely means the seasonality of the flow is
the same (same FDC), but the monthly average may change depending on how many streams connect with or diverge from the stream.
@@ -374,12 +375,12 @@ be less sensitive to changes in flows up stream, may connect basins with differe
i is the number of stream segments up/down from the gauge the river is.

```python
import rbc
import hbc

# assign_table = pandas DataFrame (see rbc.table module)
# assign_table = pandas DataFrame (see hbc.table module)
workdir = '/path/to/project/directory/'
assign_table = rbc.table.read(workdir)
rbc.assign.propagation(assign_table)
assign_table = hbc.table.read(workdir)
hbc.assign.propagation(assign_table)
```
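
For intuition, the downstream half of the propagation rule described above might look roughly like this sketch on a toy river chain. The column names, reason labels, and helper function are all hypothetical, not the hbc implementation:

```python
import numpy as np
import pandas as pd

# Toy river chain: each segment drains to the next one downstream (-1 = outlet)
df = pd.DataFrame({
    'model_id':     [1, 2, 3, 4, 5],
    'downstream':   [2, 3, 4, 5, -1],
    'stream_order': [3, 3, 3, 3, 3],
    'gauge_id':     [np.nan, 'G-1', np.nan, np.nan, np.nan],
})

def propagate_downstream(df, max_steps=3):
    """Walk downstream from each gauged segment, tagging ungauged segments
    of the same stream order with that gauge and the step count i."""
    df = df.copy()
    df['assigned_gauge'] = df['gauge_id']
    df['reason'] = np.where(df['gauge_id'].notna(), 'gauged', None)
    for _, row in df[df['gauge_id'].notna()].iterrows():
        current, order = row['model_id'], row['stream_order']
        for step in range(1, max_steps + 1):
            nxt = df.loc[df['model_id'] == current, 'downstream'].iloc[0]
            match = df['model_id'] == nxt
            # stop at the outlet or when the stream order changes
            if nxt == -1 or df.loc[match, 'stream_order'].iloc[0] != order:
                break
            if df.loc[match, 'assigned_gauge'].isna().all():
                df.loc[match, 'assigned_gauge'] = row['gauge_id']
                df.loc[match, 'reason'] = f'propagation-downstream-{step}'
            current = nxt
    return df

print(propagate_downstream(df)[['model_id', 'assigned_gauge', 'reason']])
```

A symmetric walk in the upstream direction would complete the rule; it is omitted here for brevity.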

### 8 Assign basins by Clusters (hydrologically similar basins)
@@ -390,12 +391,12 @@ Using the results of the optimal clusters
- Review assignments spatially. Run tests and view improvements. Adjust clusters and reassign as necessary.

```python
import rbc
import hbc

# assign_table = pandas DataFrame (see rbc.table module)
# assign_table = pandas DataFrame (see hbc.table module)
workdir = '/path/to/project/directory/'
assign_table = rbc.table.read(workdir)
rbc.assign.clusters_by_dist(assign_table)
assign_table = hbc.table.read(workdir)
hbc.assign.clusters_by_dist(assign_table)
```
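
For intuition only, clustering flow-duration curves with k-means and measuring the distance to the nearest cluster center could look like this sketch. The data are synthetic, scikit-learn is assumed, and none of this is the hbc code:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical flow-duration curves: rows = basins, columns = exceedance quantiles
rng = np.random.default_rng(1)
fdcs = np.sort(rng.gamma(2.0, 30.0, size=(50, 20)), axis=1)[:, ::-1]

km = KMeans(n_clusters=4, n_init=10, random_state=1).fit(fdcs)

# An ungauged basin gets the label of the nearest cluster center; a gauge
# from that same cluster can then supply the correction factors
new_fdc = np.sort(rng.gamma(2.0, 30.0, size=(1, 20)), axis=1)[:, ::-1]
label = km.predict(new_fdc)[0]
dist = np.linalg.norm(new_fdc - km.cluster_centers_[label])
print(label, round(float(dist), 2))
```

Repeating the fit for several values of `n_clusters` and comparing inertia or silhouette scores is one common way to pick the group sizes mentioned earlier.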

### 9 Generate GIS files of the assignments
@@ -404,18 +405,18 @@ use to visualize the results of this process. These GIS files help you investiga
used at each step. Use this to monitor the results.

```python
import rbc
import hbc

workdir = '/path/to/project/directory/'
assign_table = rbc.table.read(workdir)
assign_table = hbc.table.read(workdir)
drain_shape = '/my/file/path/'
rbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
rbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
rbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)
hbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
hbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
hbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)

# or if you have a specific set of ID's to check on
list_of_model_ids = [123, 456, 789]
rbc.gis.clip_by_ids(workdir, list_of_model_ids, drain_shape)
hbc.gis.clip_by_ids(workdir, list_of_model_ids, drain_shape)
```

After this step, your project directory should look like this:
@@ -508,13 +509,13 @@ excluded each time. The code provided will help you partition your gauge table i
against the observed data which was withheld from the bias correction process.

```python
import rbc
import hbc
workdir = '/path/to/project/directory'
drain_shape = '/path/to/drainageline/gis/file.shp'
obs_data_dir = '/path/to/obs/data/directory' # optional - if data not in workdir/data_inputs/obs_csvs

rbc.validate.sample_gauges(workdir)
rbc.validate.run_series(workdir, drain_shape, obs_data_dir)
hbc.validate.sample_gauges(workdir)
hbc.validate.run_series(workdir, drain_shape, obs_data_dir)
```
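
The jackknife-style partitioning that `sample_gauges` performs can be pictured with a sketch like this (a hypothetical helper, not the package's implementation):

```python
import numpy as np

def partition_gauges(gauge_ids, n_splits=5, seed=42):
    """Shuffle gauges and split them into n roughly equal groups; each run
    withholds one group for validation and corrects with the rest."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(gauge_ids)
    folds = np.array_split(shuffled, n_splits)
    runs = []
    for i, withheld in enumerate(folds):
        kept = np.concatenate([f for j, f in enumerate(folds) if j != i])
        runs.append({'run': i, 'withheld': withheld.tolist(), 'kept': kept.tolist()})
    return runs

runs = partition_gauges(list(range(100, 120)), n_splits=4)
for r in runs:
    print(r['run'], len(r['withheld']), len(r['kept']))
```

Every gauge appears in exactly one withheld set, so the validation metrics cover the whole gauge network across the runs.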

After this step your working directory should look like this:
46 changes: 23 additions & 23 deletions examples/colombia-magdalena/magdalena_example.py
@@ -2,7 +2,7 @@

import numpy as np

import rbc
import hbc


np.seterr(all="ignore")
@@ -13,47 +13,47 @@
obs_data_dir = os.path.join(workdir, 'data_inputs', 'obs_csvs')

# Only need to do this step 1x ever
# rbc.prep.scaffold_working_directory(workdir)
# hbc.prep.scaffold_working_directory(workdir)

# Create the gauge_table and drain_table.csv
# Scripts not provided, check readme for instructions

# Generate the assignments table
# assign_table = rbc.table.gen(workdir)
# rbc.table.cache(workdir, assign_table)
# assign_table = hbc.table.gen(workdir)
# hbc.table.cache(workdir, assign_table)
# Or read the existing table
# assign_table = rbc.table.read(workdir)
# assign_table = hbc.table.read(workdir)

# Prepare the observation and simulation data
# Only need to do this step 1x ever
# rbc.prep.historical_simulation(os.path.join(workdir, 'data_simulated', 'south_america_era5_qout.nc'), workdir)
# rbc.prep.observation_data(workdir)
# hbc.prep.historical_simulation(os.path.join(workdir, 'data_simulated', 'south_america_era5_qout.nc'), workdir)
# hbc.prep.observation_data(workdir)

# Generate the clusters using the historical simulation data
# rbc.cluster.generate(workdir)
# assign_table = rbc.cluster.summarize(workdir, assign_table)
# rbc.table.cache(workdir, assign_table)
# hbc.cluster.generate(workdir)
# assign_table = hbc.cluster.summarize(workdir, assign_table)
# hbc.table.cache(workdir, assign_table)

# Assign basins which are gauged and propagate those gauges
# assign_table = rbc.assign.gauged(assign_table)
# assign_table = rbc.assign.propagation(assign_table)
# assign_table = rbc.assign.clusters_by_dist(assign_table)
# todo assign_table = rbc.assign.clusters_by_monavg(assign_table)
# assign_table = hbc.assign.gauged(assign_table)
# assign_table = hbc.assign.propagation(assign_table)
# assign_table = hbc.assign.clusters_by_dist(assign_table)
# todo assign_table = hbc.assign.clusters_by_monavg(assign_table)

# Cache the assignments table with the updates
# rbc.table.cache(workdir, assign_table)
# hbc.table.cache(workdir, assign_table)

# Generate GIS files so you can go explore your progress graphically
# rbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
# rbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
# rbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)
# hbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
# hbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
# hbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)

# Compute the corrected simulation data
# assign_table = rbc.table.read(workdir)
# rbc.calibrate_region(workdir, assign_table)
# vtab = rbc.validate.gen_val_table(workdir)
rbc.gis.validation_maps(workdir, gauge_shape)
rbc.analysis.plot(workdir, obs_data_dir, 9007721)
# assign_table = hbc.table.read(workdir)
# hbc.calibrate_region(workdir, assign_table)
# vtab = hbc.validate.gen_val_table(workdir)
hbc.gis.validation_maps(workdir, gauge_shape)
hbc.analysis.plot(workdir, obs_data_dir, 9007721)


# import pandas as pd
40 changes: 20 additions & 20 deletions examples/example_script.py
@@ -2,7 +2,7 @@

import numpy as np

import rbc
import hbc


np.seterr(all="ignore")
@@ -14,49 +14,49 @@
hist_sim_nc = ''

# Prepare the working directory - only need to do this step 1x ever
# rbc.prep.scaffold_working_directory(workdir)
# hbc.prep.scaffold_working_directory(workdir)
# Scripts not provided. Consult README.md for instructions
# Create the gauge_table.csv and drain_table.csv
# Put the historical simulation netCDF in the right folder
# Put the observed data csv files in the data_inputs/obs_csvs folder

# Prepare the observation and simulation data - Only need to do this step 1x ever
print('Preparing data')
rbc.prep.historical_simulation(workdir)
hbc.prep.historical_simulation(workdir)

# Generate the assignments table
print('Generate Assignment Table')
assign_table = rbc.table.gen(workdir)
rbc.table.cache(workdir, assign_table)
assign_table = hbc.table.gen(workdir)
hbc.table.cache(workdir, assign_table)

# Generate the clusters using the historical simulation data
print('Generate Clusters')
rbc.cluster.generate(workdir)
assign_table = rbc.cluster.summarize(workdir, assign_table)
rbc.table.cache(workdir, assign_table)
hbc.cluster.generate(workdir)
assign_table = hbc.cluster.summarize(workdir, assign_table)
hbc.table.cache(workdir, assign_table)

# Assign basins which are gauged and propagate those gauges
print('Making Assignments')
assign_table = rbc.assign.gauged(assign_table)
assign_table = rbc.assign.propagation(assign_table)
assign_table = rbc.assign.clusters_by_dist(assign_table)
assign_table = hbc.assign.gauged(assign_table)
assign_table = hbc.assign.propagation(assign_table)
assign_table = hbc.assign.clusters_by_dist(assign_table)

# Cache the assignments table with the updates
rbc.table.cache(workdir, assign_table)
hbc.table.cache(workdir, assign_table)

# Generate GIS files so you can go explore your progress graphically
print('Generate GIS files')
rbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
rbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
rbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)
hbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
hbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
hbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)

# Compute the corrected simulation data
print('Starting Calibration')
rbc.calibrate_region(workdir, assign_table)
hbc.calibrate_region(workdir, assign_table)

# run the validation study
print('Performing Validation')
rbc.validate.sample_gauges(workdir, overwrite=True)
rbc.validate.run_series(workdir, drain_shape, obs_data_dir)
vtab = rbc.validate.gen_val_table(workdir)
rbc.gis.validation_maps(workdir, gauge_shape, vtab)
hbc.validate.sample_gauges(workdir, overwrite=True)
hbc.validate.run_series(workdir, drain_shape, obs_data_dir)
vtab = hbc.validate.gen_val_table(workdir)
hbc.gis.validation_maps(workdir, gauge_shape, vtab)
19 changes: 19 additions & 0 deletions hbc/__init__.py
@@ -0,0 +1,19 @@
from hbc._workflow import prep_region, analyze_region
from hbc._calibrate import calibrate_stream, calibrate_region

import hbc.table
import hbc.prep
import hbc.cluster
import hbc.assign
import hbc.gis
import hbc.utils
import hbc.validate
import hbc.analysis


__all__ = ['prep_region', 'analyze_region',
'calibrate_stream', 'calibrate_region',
'table', 'prep', 'assign', 'gis', 'cluster', 'utils', 'validate', 'analysis']
__author__ = 'Riley Hales'
__version__ = '0.2.0'
__license__ = 'BSD 3 Clause Clear'
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion rbc/gis.py → hbc/gis.py
@@ -156,7 +156,7 @@ def validation_maps(workdir: str, gauge_shape: str, val_table: pd.DataFrame = No
Args:
workdir: path to the project directory
val_table: the validation table produced by rbc.validate
val_table: the validation table produced by hbc.validate
gauge_shape: path to the gauge locations shapefile
prefix: optional, a prefix to prepend to each created file's name
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion rbc/validate.py → hbc/validate.py
@@ -72,7 +72,7 @@ def sample_gauges(workdir: str, overwrite: bool = False) -> None:

def run_series(workdir: str, drain_shape: str, obs_data_dir: str = None) -> None:
"""
Runs rbc.analyze_region on each project in the validation_runs directory
Runs hbc.analyze_region on each project in the validation_runs directory
Args:
workdir: the project working directory