Add stage-level classes to default.vw_pin_history #338

Merged
5 changes: 4 additions & 1 deletion aws-athena/views/default-vw_pin_history.sql
@@ -2,15 +2,18 @@
  SELECT
  vwpv.pin,
  vwpv.year,
- par.class,
+ REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
@wrridgeway (Member Author), Mar 15, 2024:

There are classes in pardat with hyphens in them. This new syntax should avoid stripping out A & B suffixes from our class codes.

Member:

question: Are other class fields in other views treated the same way? If not, we need to standardize this across views.

Member Author:

Yes, we will need to update class code cleaning in other views. I've already made an issue for it.
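
For readers following the thread, here is a minimal Python sketch of the cleaning rule discussed above. It is an illustration, not code from this PR: `clean_class` is a hypothetical helper, the ASCII range `[^0-9A-Za-z]` stands in for the view's `[^[:alnum:]]` pattern (Python's `re` module does not support POSIX bracket classes), and every sample value except `2-99` is made up.

```python
import re

# ASCII stand-in for the view's '[^[:alnum:]]' pattern. This is an
# approximation chosen for the sketch, not the pattern Athena evaluates.
NON_ALNUM = re.compile(r"[^0-9A-Za-z]")


def clean_class(raw):
    """Drop every non-alphanumeric character; pass NULL (None) through."""
    if raw is None:
        return None
    return NON_ALNUM.sub("", raw)


if __name__ == "__main__":
    # "2-99" appears in the dbt test in this PR; the other values are hypothetical.
    for value in ["2-99", "2-12A", "299", None]:
        print(repr(value), "->", repr(clean_class(value)))
```

Letter suffixes survive because they are alphanumeric; only the hyphen is dropped.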

  leg.user1 AS township_code,
  town.township_name,
+ vwpv.mailed_class,
  vwpv.mailed_bldg,
  vwpv.mailed_land,
  vwpv.mailed_tot,
+ vwpv.certified_class,
  vwpv.certified_bldg,
  vwpv.certified_land,
  vwpv.certified_tot,
+ vwpv.board_class,
  vwpv.board_bldg,
  vwpv.board_land,
  vwpv.board_tot,
39 changes: 30 additions & 9 deletions aws-athena/views/default-vw_pin_value.sql
@@ -11,57 +11,78 @@ WITH stage_values AS (
  parid AS pin,
  taxyr AS year,
  -- Mailed values
- MAX(
+ ARBITRARY(
+ CASE
+ WHEN
+ procname = 'CCAOVALUE'
+ THEN REGEXP_REPLACE(class, '[^[:alnum:]]', '')
+ END
+ ) AS mailed_class,
+ ARBITRARY(
  CASE
  WHEN procname = 'CCAOVALUE' AND taxyr < '2020' THEN ovrvalasm2
  WHEN procname = 'CCAOVALUE' AND taxyr >= '2020' THEN valasm2
  END
  ) AS mailed_bldg,
- MAX(
+ ARBITRARY(
Member Author:

Thanks @dfsnow for the suggestion.

Member:

TIL about all the other crazy Trino aggregate functions, like max_by().
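
To make the `MAX` to `ARBITRARY` swap concrete, here is a small Python sketch of the same conditional-aggregation pattern. It rests on assumptions that are not confirmed by the PR: the rows in `ROWS` are invented, each stage is assumed to contribute at most one row per PIN and year, and the helper `arbitrary` simply returns the first non-null value, whereas Trino's `arbitrary()` may return any non-null value. The year-dependent choice between `ovrvalasm*` and `valasm*` columns is left out for brevity.

```python
from collections import defaultdict

# Hypothetical long-format rows, one per (pin, year, procname), standing in
# for the table the view aggregates. All values are made up.
ROWS = [
    {"pin": "0001", "year": "2023", "procname": "CCAOVALUE", "class": "211", "tot": 100},
    {"pin": "0001", "year": "2023", "procname": "CCAOFINAL", "class": "211", "tot": 90},
    {"pin": "0001", "year": "2023", "procname": "BORVALUE", "class": "212", "tot": 80},
    {"pin": "0002", "year": "2023", "procname": "CCAOVALUE", "class": "299", "tot": 50},
]

# Output column prefix for each stage, matching the names used in the view.
STAGES = {"CCAOVALUE": "mailed", "CCAOFINAL": "certified", "BORVALUE": "board"}


def arbitrary(values):
    """Stand-in for arbitrary(): some non-null value, or None if none exists."""
    return next((v for v in values if v is not None), None)


def pivot_stage_values(rows):
    """Collapse stage rows into one wide record per (pin, year)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["pin"], row["year"])].append(row)

    wide = {}
    for key, members in groups.items():
        record = {}
        for procname, prefix in STAGES.items():
            for field in ("class", "tot"):
                # Equivalent of CASE WHEN procname = ... THEN field END:
                # rows from other stages contribute None.
                candidates = [
                    r[field] if r["procname"] == procname else None
                    for r in members
                ]
                record[f"{prefix}_{field}"] = arbitrary(candidates)
        wide[key] = record
    return wide
```

Under the one-row-per-stage assumption, each `CASE` expression yields at most one non-null value per group, so `MAX` and `ARBITRARY` would return the same result. If a stage could have several rows, the two aggregates could disagree.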

  CASE
  WHEN procname = 'CCAOVALUE' AND taxyr < '2020' THEN ovrvalasm1
  WHEN procname = 'CCAOVALUE' AND taxyr >= '2020' THEN valasm1
  END
  ) AS mailed_land,
- MAX(
+ ARBITRARY(
  CASE
  WHEN procname = 'CCAOVALUE' AND taxyr < '2020' THEN ovrvalasm3
  WHEN procname = 'CCAOVALUE' AND taxyr >= '2020' THEN valasm3
  END
  ) AS mailed_tot,
  -- Assessor certified values
- MAX(
+ ARBITRARY(
+ CASE
+ WHEN
+ procname = 'CCAOFINAL'
+ THEN REGEXP_REPLACE(class, '[^[:alnum:]]', '')
+ END
+ ) AS certified_class,
+ ARBITRARY(
  CASE
  WHEN procname = 'CCAOFINAL' AND taxyr < '2020' THEN ovrvalasm2
  WHEN procname = 'CCAOFINAL' AND taxyr >= '2020' THEN valasm2
  END
  ) AS certified_bldg,
- MAX(
+ ARBITRARY(
  CASE
  WHEN procname = 'CCAOFINAL' AND taxyr < '2020' THEN ovrvalasm1
  WHEN procname = 'CCAOFINAL' AND taxyr >= '2020' THEN valasm1
  END
  ) AS certified_land,
- MAX(
+ ARBITRARY(
  CASE
  WHEN procname = 'CCAOFINAL' AND taxyr < '2020' THEN ovrvalasm3
  WHEN procname = 'CCAOFINAL' AND taxyr >= '2020' THEN valasm3
  END
  ) AS certified_tot,
  -- Board certified values
- MAX(
+ ARBITRARY(
+ CASE
+ WHEN
+ procname = 'BORVALUE'
+ THEN REGEXP_REPLACE(class, '[^[:alnum:]]', '')
+ END
+ ) AS board_class,
+ ARBITRARY(
  CASE
  WHEN procname = 'BORVALUE' AND taxyr < '2020' THEN ovrvalasm2
  WHEN procname = 'BORVALUE' AND taxyr >= '2020' THEN valasm2
  END
  ) AS board_bldg,
- MAX(
+ ARBITRARY(
  CASE
  WHEN procname = 'BORVALUE' AND taxyr < '2020' THEN ovrvalasm1
  WHEN procname = 'BORVALUE' AND taxyr >= '2020' THEN valasm1
  END
  ) AS board_land,
- MAX(
+ ARBITRARY(
  CASE
  WHEN procname = 'BORVALUE' AND taxyr < '2020' THEN ovrvalasm3
  WHEN procname = 'BORVALUE' AND taxyr >= '2020' THEN valasm3
6 changes: 6 additions & 0 deletions dbt/models/default/schema/default.vw_pin_history.yml
@@ -5,12 +5,16 @@ models:
  columns:
  - name: board_bldg
  description: '{{ doc("shared_column_board_bldg") }}'
+ - name: board_class
+ description: '{{ doc("shared_column_board_class") }}'
  - name: board_land
  description: '{{ doc("shared_column_board_land") }}'
  - name: board_tot
  description: '{{ doc("shared_column_board_tot") }}'
  - name: certified_bldg
  description: '{{ doc("shared_column_certified_bldg") }}'
+ - name: certified_class
+ description: '{{ doc("shared_column_certified_class") }}'
  - name: certified_land
  description: '{{ doc("shared_column_certified_land") }}'
  - name: certified_tot
@@ -21,6 +25,8 @@ models:
  description: '{{ doc("shared_column_class") }}'
  - name: mailed_bldg
  description: '{{ doc("shared_column_mailed_bldg") }}'
+ - name: mailed_class
+ description: '{{ doc("shared_column_mailed_class") }}'
  - name: mailed_land
  description: '{{ doc("shared_column_mailed_land") }}'
  - name: mailed_tot
46 changes: 46 additions & 0 deletions dbt/models/default/schema/default.vw_pin_value.yml
@@ -5,12 +5,16 @@ models:
  columns:
  - name: board_bldg
  description: '{{ doc("shared_column_board_bldg") }}'
+ - name: board_class
+ description: '{{ doc("shared_column_board_class") }}'
  - name: board_land
  description: '{{ doc("shared_column_board_land") }}'
  - name: board_tot
  description: '{{ doc("shared_column_board_tot") }}'
  - name: certified_bldg
  description: '{{ doc("shared_column_certified_bldg") }}'
+ - name: certified_class
+ description: '{{ doc("shared_column_certified_class") }}'
  - name: certified_land
  description: '{{ doc("shared_column_certified_land") }}'
  - name: certified_tot
@@ -19,6 +23,8 @@ models:
  description: '{{ doc("shared_column_change_reason") }}'
  - name: mailed_bldg
  description: '{{ doc("shared_column_mailed_bldg") }}'
+ - name: mailed_class
+ description: '{{ doc("shared_column_mailed_class") }}'
  - name: mailed_land
  description: '{{ doc("shared_column_mailed_land") }}'
  - name: mailed_tot
@@ -43,8 +49,48 @@ models:
  description: '{{ doc("shared_column_year") }}'

  tests:
+ - not_accepted_values:
+ name: default_vw_pin_value_mailed_class_no_hyphens
+ column_name: mailed_class
+ values: "2-99"
Member Author:

I'm not entirely sure if wrapping this in double quotes is correct. The test ran without erring with no quotes as well...

Member:

If there's a dash in it then it's going to be converted into a string anyways, so this should be fine.

  - unique_combination_of_columns:
  name: default_vw_pin_value_unique_by_14_digit_pin_and_year
  combination_of_columns:
  - pin
  - year
+ - not_null:
+ name: default_vw_pin_value_mailed_class_not_null
+ column_name: mailed_class
+ config:
+ where: CAST(year AS int) < {{ var('test_qc_year_start') }}
+ error_if: ">289" # as of 2024-03-15
Member:

I don't love this system of ignoring nulls. Is there a better way to handle them? Should we include them in the QC sheets? @jeancochrane do you have thoughts here?

Contributor:

@dfsnow If the failure is caused by errors in an underlying iasWorld table, I do think it's worth adding a test to that table so that it gets added to the worksheet and fixed. It's a little bit harder to think about how to handle tests that are intended to test for incorrect logic in our views, but that can be led to fail based on issues in the underlying data; ideally we would be able to fully separate logic tests from data tests, probably by running our logic tests using a set of fixtures that populate tables with test data, but that's a much bigger lift and likely not possible in some cases. My first thought at an 80/20 solution would be to restrict the date range for the test to a time period where we know there are no errors, but we can discuss more during our infrastructure discussion if that option doesn't work.

Member:

Alright, let's discuss more during Infrastructure Week.

Member:

@wrridgeway On second thought, let's split this out into a separate issue.
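
As a rough illustration of the thresholded tests debated in this thread: dbt counts the rows that fail a test and raises an error only when that count satisfies the `error_if` condition, after the `where` filter has narrowed the rows in scope. The Python sketch below mimics that behaviour under stated assumptions. `not_null_test_errors` and `SAMPLE` are hypothetical, and the cutoff years passed in are invented stand-ins for `test_qc_year_start`, whose real value is not shown in this PR.

```python
def not_null_test_errors(rows, column, qc_year_start, max_failures):
    """Return (failures, errored) for a thresholded not-null check.

    Only rows with year < qc_year_start are in scope, mirroring the test's
    `where` config. The check errors when failures exceed max_failures,
    mirroring `error_if: ">N"`.
    """
    in_scope = [r for r in rows if int(r["year"]) < qc_year_start]
    failures = sum(1 for r in in_scope if r[column] is None)
    return failures, failures > max_failures


# Hypothetical data: two NULL classes before a 2024 cutoff, one after it.
SAMPLE = [
    {"year": "2021", "mailed_class": None},
    {"year": "2022", "mailed_class": None},
    {"year": "2022", "mailed_class": "211"},
    {"year": "2024", "mailed_class": None},
]
```

With a cutoff of 2024 and a tolerance of 2, the check passes; widening the year range to include 2024 adds a third NULL and tips it into an error, which is the trade-off behind restricting the date range to a period with known-good data.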

+ - not_null:
+ name: default_vw_pin_value_certified_class_not_null
+ column_name: certified_class
+ config:
+ where: CAST(year AS int) < {{ var('test_qc_year_start') }}
+ error_if: ">13" # as of 2024-03-15
+ - not_null:
+ name: default_vw_pin_value_board_class_not_null
+ column_name: board_class
+ config:
+ where: CAST(year AS int) < {{ var('test_qc_year_start') }}
+ error_if: ">1260" # as of 2024-03-15
+ - not_null:
+ name: default_vw_pin_value_mailed_tot_not_null
+ column_name: mailed_tot
+ config:
+ where: CAST(year AS int) < {{ var('test_qc_year_start') }}
+ error_if: ">310" # as of 2024-03-15
+ - not_null:
+ name: default_vw_pin_value_certified_tot_not_null
+ column_name: certified_tot
+ config:
+ where: CAST(year AS int) < {{ var('test_qc_year_start') }}
+ error_if: ">13" # as of 2024-03-15
+ - not_null:
+ name: default_vw_pin_value_board_tot_not_null
+ column_name: board_tot
+ config:
+ where: CAST(year AS int) < {{ var('test_qc_year_start') }}
+ error_if: ">1260" # as of 2024-03-15
79 changes: 56 additions & 23 deletions dbt/models/shared_columns.md
@@ -193,6 +193,17 @@ Board of Review assessed value of building from year specified by column
  prefix (or year of observation if not prefixed)
  {% enddocs %}

+ ## board_class
+
+ {% docs shared_column_board_class %}
+ Stage-level property type and/or use at the time of BOR certification.
+
+ Designates the property type, such as vacant, residential, multi-family,
+ agricultural, commercial or industrial. The classification determines the
+ percentage of fair cash value at which a property is assessed for taxing
+ purposes. See `ccao.class_dict` for more information
+ {% enddocs %}
+
  ## board_land

  {% docs shared_column_board_land %}
@@ -218,6 +229,40 @@ Calculation parameter.
  If present, must be `E`.
  {% enddocs %}

+ ## certified_bldg
+
+ {% docs shared_column_certified_bldg %}
+ Certified assessed value of building from year specified by column
+ prefix (or year of observation if not prefixed)
+ {% enddocs %}
+
+ ## certified_class
+
+ {% docs shared_column_certified_class %}
+ Stage-level property type and/or use at the time of CCAO certification.
+
+ Designates the property type, such as vacant, residential, multi-family,
+ agricultural, commercial or industrial. The classification determines the
+ percentage of fair cash value at which a property is assessed for taxing
+ purposes. See `ccao.class_dict` for more information
+ {% enddocs %}
+
+ ## certified_land
+
+ {% docs shared_column_certified_land %}
+ Certified assessed value of land from year specified by column
+ prefix (or year of observation if not prefixed)
+ {% enddocs %}
+
+ ## certified_tot
+
+ {% docs shared_column_certified_tot %}
+ Certified total assessed value from year specified by column
+ prefix (or year of observation if not prefixed).
+
+ This is the value after the first round of appeals at the Assessor's Office.
+ {% enddocs %}
+
  ## change_reason

  {% docs shared_column_change_reason %}
@@ -279,29 +324,6 @@ Reason for change in assessed value. Possible values for this variable are:
  - `92` = Flood Debasement
  {% enddocs %}

- ## certified_bldg
-
- {% docs shared_column_certified_bldg %}
- Certified assessed value of building from year specified by column
- prefix (or year of observation if not prefixed)
- {% enddocs %}
-
- ## certified_land
-
- {% docs shared_column_certified_land %}
- Certified assessed value of land from year specified by column
- prefix (or year of observation if not prefixed)
- {% enddocs %}
-
- ## certified_tot
-
- {% docs shared_column_certified_tot %}
- Certified total assessed value from year specified by column
- prefix (or year of observation if not prefixed).
-
- This is the value after the first round of appeals at the Assessor's Office.
- {% enddocs %}
-
  ## external_calc_rcnld

  {% docs shared_column_external_calc_rcnld %}
@@ -324,6 +346,17 @@ Mailed assessed value of building from year specified by column
  prefix (or year of observation if not prefixed)
  {% enddocs %}

+ ## mailed_class
+
+ {% docs shared_column_mailed_class %}
+ Stage-level property type and/or use at the time of CCAO mailing.
+
+ Designates the property type, such as vacant, residential, multi-family,
+ agricultural, commercial or industrial. The classification determines the
+ percentage of fair cash value at which a property is assessed for taxing
+ purposes. See `ccao.class_dict` for more information
Member:

mega-nitpick: Multi-line doc strings should have a period ending:

Suggested change
- purposes. See `ccao.class_dict` for more information
+ purposes. See `ccao.class_dict` for more information.

+ {% enddocs %}
+
  ## mailed_land

  {% docs shared_column_mailed_land %}