Feature 240 reformat tcdiag #287

Merged
merged 50 commits into from
Apr 2, 2024
50 commits
43ba750
issue #240 added specific constants for TCDiag reformatting from TC-P…
bikegeek Mar 26, 2024
91c7cbe
Issue #240 support to reformat the TCDiag linetype output from TC-Pai…
bikegeek Mar 26, 2024
c9a5097
added test for TCDIAG reformatting
bikegeek Mar 27, 2024
61e2c52
Data for testing TCDiag reformatting
bikegeek Mar 27, 2024
337fbae
config file for testing tcdiag reformatting
bikegeek Mar 27, 2024
1314e88
Removed hard-coded paths
bikegeek Mar 28, 2024
25522df
Update unit_tests.yml
bikegeek Mar 29, 2024
fcbffc8
Added test for TCDiag reformatting. Added benchmarking for ECNT and …
bikegeek Mar 29, 2024
681bdeb
Merge branch 'develop' of https://github.com/dtcenter/METdataio into …
bikegeek Mar 29, 2024
7797ac9
Merge branch 'feature_240_reformat_tcdiag' of https://github.com/dtce…
bikegeek Mar 29, 2024
9ac13f0
add the -v -s options to view the benchmark results
bikegeek Mar 29, 2024
13268da
Remove print statement and writing temporary output to file
bikegeek Mar 29, 2024
fc17165
Experimenting with saving benchmark results with autosave and setting…
bikegeek Mar 29, 2024
67795a8
Explicitly setting location of benchmark in artifact directive
bikegeek Mar 29, 2024
57f2947
Change artifact syntax to check for subdirectories
bikegeek Mar 29, 2024
9af3449
Fixed type with path name in artifact instructions
bikegeek Mar 29, 2024
147c531
set output directory value
bikegeek Mar 29, 2024
9163dba
Additional test with pandas 2.2
bikegeek Mar 29, 2024
a394003
Remove auto save option
bikegeek Mar 30, 2024
eb3bc56
Check for artifacts under the test directory
bikegeek Mar 30, 2024
1951048
add pip upgrade command before checking out source
bikegeek Mar 30, 2024
f8f8b96
try different syntax for path in artifacts commands
bikegeek Mar 30, 2024
f60ed40
add coverage command and modify path to search for any json
bikegeek Mar 30, 2024
7b7aeb4
add pandas 2.2 to description
bikegeek Mar 30, 2024
bf8274f
add autosave to benchmark command
bikegeek Mar 30, 2024
07fd9ee
Search only under the test path for artifact saving
bikegeek Mar 30, 2024
31e3500
Code coverage and benchmark results not saved to artifacts. Comment …
bikegeek Mar 31, 2024
742a21b
move benchmarking code to test_benchmark.py
bikegeek Mar 31, 2024
799f9bb
initial checkin of benchmarking for ECNT and TCDiag linetypes
bikegeek Mar 31, 2024
d5b0a1b
Remove benchmark instructions
bikegeek Mar 31, 2024
748224c
Updated comments
bikegeek Mar 31, 2024
97d3087
Updated comments
bikegeek Mar 31, 2024
f301ad2
Fix coverage commands
bikegeek Mar 31, 2024
c95836b
Delete .github/workflows/unit_tests_pandas.yml
bikegeek Mar 31, 2024
b7e80fc
change syntax for pytest command from py.test to pytest
bikegeek Mar 31, 2024
642f149
replace cut-and-paste from pandas2_2 versions with 1.5x
bikegeek Mar 31, 2024
6a70280
Moved test_benchmark.py to the test/benchmarks directory
bikegeek Mar 31, 2024
a50ccd9
Files for running benchmarking
bikegeek Mar 31, 2024
aff4dae
Delete METreformat/test/test_benchmark.py
bikegeek Mar 31, 2024
b0e04a8
replaces test_benchmark.py
bikegeek Mar 31, 2024
0973cf9
update filename of benchmark test file from test_benchmark to run_ben…
bikegeek Mar 31, 2024
676531e
Merge branch 'feature_240_reformat_tcdiag' of https://github.com/dtce…
bikegeek Mar 31, 2024
ac0f0c5
Delete METreformat/test/benchmarks/ECNT_for_agg.yaml
bikegeek Mar 31, 2024
0364f70
Delete METreformat/test/benchmarks/test_benchmark.py
bikegeek Mar 31, 2024
b1623c4
Delete METreformat/test/benchmarks/test_reformat_tcdiag.yaml
bikegeek Mar 31, 2024
dda9f5a
Removed the benchmarks directory from instructions
bikegeek Mar 31, 2024
3c50549
Merge branch 'feature_240_reformat_tcdiag' of https://github.com/dtce…
bikegeek Mar 31, 2024
4b3438a
change coverage command
bikegeek Mar 31, 2024
59ca0a4
Added information about TCDIAG linetype from TC-Pairs output
bikegeek Mar 31, 2024
565a82c
Added sentence about .tcst files
bikegeek Mar 31, 2024
83 changes: 83 additions & 0 deletions .github/workflows/benchmark_pandas1_5.yml
@@ -0,0 +1,83 @@
# This workflow will install Python dependencies and run tests with the specified Python version
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: Benchmark for pandas 1.5x

on:
  push:
    branches:
      - develop
      - develop-ref
      - feature_*
      - main_*
      - bugfix_*
      - issue_*
      - gha_*
    paths-ignore:
      - 'docs/**'
      - '.github/pull_request_template.md'
      - '.github/ISSUE_TEMPLATE/**'
      - '**/README.md'
      - '**/LICENSE.md'


  pull_request:
    types: [ opened, reopened, synchronize ]


jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: [ "3.10" ]

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }} for pandas1.5x benchmarking
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Retrieve METcalcpy repository develop branch
        run: |
          /usr/bin/git clone https://github.com/dtcenter/METcalcpy
          cd METcalcpy
          /usr/bin/git checkout develop
          python -m pip install -e .


      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install "pytest>=7.1.1"
          python -m pip install pytest_benchmark
          python -m pip install netcdf4==1.6.2
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi


      # Checking the branch name, not necessary but useful when setting things up.
      # - name: Extract branch name
      #   shell: bash
      #   run: echo "##[set-output name=branch;]$(echo ${GITHUB_REF#refs/heads/})"
      #   id: extract_branch


      - name: Test with pytest with pandas 1.5x
        run: |
          echo "GITHUB wspace: $GITHUB_WORKSPACE"
          export PYTHONPATH=$GITHUB_WORKSPACE/:$GITHUB_WORKSPACE/METdbLoad:$GITHUB_WORKSPACE/METdbLoad/ush:$GITHUB_WORKSPACE/METreformat:$GITHUB_WORKSPACE/METreadnc
          echo "PYTHONPATH is $PYTHONPATH"
          cd $GITHUB_WORKSPACE/METreformat
          cd test
          py.test run_benchmark.py -v -s --benchmark-autosave

          echo "Finished benchmarking for pandas 1.5x "

      # - name: Archive benchmark results
      #   uses: actions/upload-artifact@v4
      #   with:
      #     name: benchmark-report
      #     path: /home/runner/work/METdataio/METdataio/METreformat/test/**
85 changes: 85 additions & 0 deletions .github/workflows/benchmark_pandas2_2.yml
@@ -0,0 +1,85 @@
# This workflow will install Python dependencies and run tests with the specified Python version
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: Benchmark for pandas 2.2

on:
  push:
    branches:
      - develop
      - develop-ref
      - feature_*
      - main_*
      - bugfix_*
      - issue_*
      - gha_*
    paths-ignore:
      - 'docs/**'
      - '.github/pull_request_template.md'
      - '.github/ISSUE_TEMPLATE/**'
      - '**/README.md'
      - '**/LICENSE.md'


  pull_request:
    types: [ opened, reopened, synchronize ]


jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: [ "3.10" ]

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }} for pandas2.2 testing
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Retrieve METcalcpy repository develop branch
        run: |
          /usr/bin/git clone https://github.com/dtcenter/METcalcpy
          cd METcalcpy
          /usr/bin/git checkout develop
          python -m pip install -e .


      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install "pytest>=7.1.1"
          python -m pip install pytest_benchmark
          python -m pip install netcdf4==1.6.2
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          python -m pip install pandas==2.2


      # Checking the branch name, not necessary but useful when setting things up.
      # - name: Extract branch name
      #   shell: bash
      #   run: echo "##[set-output name=branch;]$(echo ${GITHUB_REF#refs/heads/})"
      #   id: extract_branch


      - name: Test with pytest with pandas 2.2
        run: |
          echo "GITHUB wspace: $GITHUB_WORKSPACE"
          export PYTHONPATH=$GITHUB_WORKSPACE/:$GITHUB_WORKSPACE/METdbLoad:$GITHUB_WORKSPACE/METdbLoad/ush:$GITHUB_WORKSPACE/METreformat:$GITHUB_WORKSPACE/METreadnc
          echo "PYTHONPATH is $PYTHONPATH"
          cd $GITHUB_WORKSPACE/METreformat
          cd test
          py.test run_benchmark.py -v -s --benchmark-autosave


          echo "Finished benchmark for pandas 2.2"

      # - name: Archive benchmark results
      #   uses: actions/upload-artifact@v4
      #   with:
      #     name: benchmark-report
      #     path: /home/runner/work/METdataio/METdataio/METreformat/test/**
79 changes: 43 additions & 36 deletions .github/workflows/unit_tests.yml
@@ -22,7 +22,7 @@ on:


pull_request:
types: [opened, reopened, synchronize]
types: [ opened, reopened, synchronize ]


jobs:
@@ -32,49 +32,56 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.10"]
python-version: [ "3.10" ]

steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

- name: Retrieve METcalcpy repository develop branch
run: |
/usr/bin/git clone https://github.com/dtcenter/METcalcpy
cd METcalcpy
/usr/bin/git checkout develop
python -m pip install -e .


- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install pytest>=7.1.1
python -m pip install netcdf4==1.6.2
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi


# Checking the branch name, not necessary but useful when setting things up.
# - name: Extract branch name
# shell: bash
# run: echo "##[set-output name=branch;]$(echo ${GITHUB_REF#refs/heads/})"
# id: extract_branch


- name: Test with pytest
run: |
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

- name: Retrieve METcalcpy repository develop branch
run: |
/usr/bin/git clone https://github.com/dtcenter/METcalcpy
python -m pip install --upgrade pip
cd METcalcpy
/usr/bin/git checkout develop
python -m pip install -e .


- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install "pytest>=7.1.1"
python -m pip install netcdf4==1.6.2
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
python -m pip install pytest-cov

# Checking the branch name, not necessary but useful when setting things up.
# - name: Extract branch name
# shell: bash
# run: echo "##[set-output name=branch;]$(echo ${GITHUB_REF#refs/heads/})"
# id: extract_branch


- name: Test with pytest
run: |
echo "GITHUB wspace: $GITHUB_WORKSPACE"
export PYTHONPATH=$GITHUB_WORKSPACE/:$GITHUB_WORKSPACE/METdbLoad:$GITHUB_WORKSPACE/METdbLoad/ush:$GITHUB_WORKSPACE/METreformat:$GITHUB_WORKSPACE/METreadnc
echo "PYTHONPATH is $PYTHONPATH"

# test reformatter
cd $GITHUB_WORKSPACE/METreformat
cd test

pytest test_reformatting.py
pytest test_reformatting.py
pytest --cov

# test NetCDF reader
cd $GITHUB_WORKSPACE/METreadnc
cd test
pytest test_readnc.py
pytest --cov

echo "Finished unit tests and coverage"

echo "Finished unit tests"
25 changes: 23 additions & 2 deletions METdbLoad/ush/constants.py
@@ -111,7 +111,7 @@
PJC = "PJC" # Joint and Conditional Factorization for Probabilistic Forecasts
PRC = "PRC" # Receiver Operating Characteristic for Probabilistic Forecasts
ECLV = "ECLV" # Economic Cost/Loss Value derived from CTC and PCT lines
MPR = "MPR" # Matched Pair Data
MPR = "MPR" # Matched Pair TCDiag_Data
NBRCTC = "NBRCTC" # Neighborhood Contingency Table Counts
NBRCTS = "NBRCTS" # Neighborhood Contingency Table Statistics
NBRCNT = "NBRCNT" # Neighborhood Continuous Statistics
@@ -130,7 +130,7 @@
RPS = "RPS" # Ranked Probability Score
SSIDX = "SSIDX" # SKILL SCORE INDEX
SEEPS = "SEEPS" # Stable Equitable Error in Probability Space (SEEPS) score
SEEPS_MPR = "SEEPS_MPR" # SEEPS Matched Pair Data
SEEPS_MPR = "SEEPS_MPR" # SEEPS Matched Pair TCDiag_Data

# VSDB line types
BSS = "BSS" # same as PSTD
@@ -1337,6 +1337,12 @@
# N_THRESH value.
NUM_STATIC_PCT_COLS = 26

# Number of columns BEFORE the "variable" fields (i.e. VERSION, MODEL, ..., LINETYPE, TOTAL,
# INDEX, which precede the variable columns DIAG_i and VALUE_i).
# The "variable" fields are DIAG_i and VALUE_i; the number of these is dictated by the N_DIAG
# value.
NUM_STATIC_TCDIAG_COLS = 20

# Number of columns before the RANK_i variable columns (i.e. VERSION, MODEL,...,LINETYPE, TOTAL, N_RANK), including
# the FCST_INIT_BEG and TOTAL columns.
NUM_STATIC_RHIST_COLS = 26
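
The comment on `NUM_STATIC_TCDIAG_COLS` describes a row layout of fixed leading columns followed by repeating `DIAG_i`/`VALUE_i` pairs. A minimal sketch of splitting a row on that boundary follows. `split_tcdiag_fields` is a hypothetical helper for illustration, not a function from METdbLoad or METreformat, and it assumes everything after the static columns alternates strictly between `DIAG_i` and `VALUE_i`.

```python
NUM_STATIC_TCDIAG_COLS = 20  # value taken from the constants.py diff


def split_tcdiag_fields(fields, num_static=NUM_STATIC_TCDIAG_COLS):
    """Split one row into its static fields and (DIAG_i, VALUE_i) pairs.

    Assumption: the fields after the static columns alternate
    DIAG_1, VALUE_1, DIAG_2, VALUE_2, ...
    """
    static = fields[:num_static]
    variable = fields[num_static:]
    if len(variable) % 2 != 0:
        raise ValueError("expected DIAG_i/VALUE_i pairs after the static columns")
    # Even positions hold the diagnostic names, odd positions their values.
    pairs = list(zip(variable[0::2], variable[1::2]))
    return static, pairs


# Placeholder static fields (S0..S19) followed by two diagnostic pairs.
row = [f"S{i}" for i in range(20)] + ["SHR_MAG", "12", "DTL", "340"]
static, pairs = split_tcdiag_fields(row)
```

Inferring the pair count from the row length, rather than reading `N_DIAG` from a fixed position, keeps the sketch independent of exactly which column holds `N_DIAG`.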
@@ -1642,5 +1648,20 @@
#### RHIST line type ####
LC_RHIST_VARIABLE_HEADERS = ['rank']

#### TCDIAG line type ####
# These are the header names assigned by the METdbLoad
# read_data_files module
TC_DIAG_TOTAL_COLNAME = '0'
TC_DIAG_INDEX_COLNAME = '1'
TCDIAG_DIAG_SOURCE_COLNAME = '2'
TCDIAG_TRACK_SOURCE_COLNAME = '3'
TCDIAG_FIELD_SOURCE_COLNAME = '4'
TCDIAG_N_DIAG_COLNAME = LINE_VAR_COUNTER[TCDIAG]
# Common name for identical fields with different identifiers (due to different DIAG_SOURCE)
# Used for cleaning up reformatted output
SHMAG = 'SHEAR_MAGNITUDE'
DLAND = 'DIST_TO_LAND'
SSPEED = 'STORM_SPEED'
TCDIAG_COMMON_NAMES = {'SHR_MAG': SHMAG, 'SHRD': SHMAG, 'LAND': DLAND, 'DTL': DLAND, 'STM_SPD': SSPEED}
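
The `TCDIAG_COMMON_NAMES` mapping gives one common name to diagnostics that appear under different identifiers depending on `DIAG_SOURCE`. A minimal sketch of applying it is shown here. The constants are copied from the diff, while `common_diag_name` is a hypothetical helper and may not match how the reformatter actually uses the mapping.

```python
# Constants copied from the constants.py diff.
SHMAG = 'SHEAR_MAGNITUDE'
DLAND = 'DIST_TO_LAND'
SSPEED = 'STORM_SPEED'
TCDIAG_COMMON_NAMES = {'SHR_MAG': SHMAG, 'SHRD': SHMAG, 'LAND': DLAND,
                       'DTL': DLAND, 'STM_SPD': SSPEED}


def common_diag_name(name):
    """Return the shared name for a diagnostic, or the name unchanged.

    Hypothetical helper: names without an entry in the mapping pass through.
    """
    return TCDIAG_COMMON_NAMES.get(name, name)


# 'SHR_MAG' and 'SHRD' both map to SHEAR_MAGNITUDE; 'TPW' has no entry.
names = [common_diag_name(n) for n in ['SHR_MAG', 'SHRD', 'DTL', 'TPW']]
```

Falling back to the original name means diagnostics that have no alias are kept rather than dropped.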


2 changes: 1 addition & 1 deletion METreformat/test/RHIST.yaml
@@ -6,7 +6,7 @@ output_filename: reformatted_rhist_output.txt
input_data_dir: ./data/rhist_phist_relp_orank

# Set log_filename to STDOUT (or stdout, case-insensitive) if no log file is to be saved
log_directory: /path/to/log-dir
log_directory: ./output
log_filename: stdout
# most verbose is info, less verbose is error
log_level: info
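
The comment on `log_filename` describes a case-insensitive `stdout` switch. A minimal sketch of honoring that switch with the standard `logging` module is below. `make_log_handler` is a hypothetical helper written for illustration and is not METreformat's actual logging code.

```python
import logging
import os
import sys


def make_log_handler(log_directory, log_filename):
    """Return a stdout handler for 'stdout' (any case), else a file handler.

    Hypothetical helper: mirrors the behavior described in the config comment.
    """
    if log_filename.strip().lower() == "stdout":
        return logging.StreamHandler(sys.stdout)
    # Only touch the file system when a real log file is requested.
    os.makedirs(log_directory, exist_ok=True)
    return logging.FileHandler(os.path.join(log_directory, log_filename))


# With the values from RHIST.yaml, no log file is created.
handler = make_log_handler("./output", "stdout")
```

With `log_filename: stdout`, the `log_directory` value is unused in this sketch, which is why a relative path such as `./output` is harmless in the test configuration.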