Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update class code cleaning to allow new omitted assessment class codes #353

Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
be6eb26
Update old class cleaning syntax
wrridgeway Mar 20, 2024
0d7ed11
Add class cleaning dbt test for pin_universe
wrridgeway Mar 20, 2024
2886755
Update raw instances of pardat.class with cleaning syntax
wrridgeway Mar 21, 2024
269d2da
Correct vw_pin_exempt typo
wrridgeway Mar 21, 2024
61dc0a6
Remove need to reference incorrect class '2-99' in view syntax
wrridgeway Mar 21, 2024
a2b6b36
Use correct cleaning syntax
wrridgeway Mar 21, 2024
28039b6
Add class hyphen tests
wrridgeway Mar 21, 2024
1a80875
Adjust test failure limit
wrridgeway Mar 21, 2024
487a884
Revert some test changes
wrridgeway Mar 21, 2024
052b7fc
Add tests to detect nulls for case whens that don't have else statements
wrridgeway Apr 16, 2024
e7643c0
Merge branch 'master' into 341-update-class-code-cleaning-to-allow-ne…
wrridgeway Apr 16, 2024
2619a1f
Remove white space
wrridgeway Apr 16, 2024
041646c
Correct yaml indenation
wrridgeway Apr 16, 2024
2523b3c
Update card logic in vw_pin_condo_char
wrridgeway Apr 16, 2024
08d6620
Correct table ref
wrridgeway Apr 16, 2024
9142abe
Make card and card_proration columns as inclusive as possible to mini…
wrridgeway Apr 16, 2024
5fe28ec
Merge branch 'master' into 341-update-class-code-cleaning-to-allow-ne…
wrridgeway Apr 17, 2024
6671671
Merge branch 'master' into 341-update-class-code-cleaning-to-allow-ne…
wrridgeway Apr 18, 2024
30bce22
Remove card_proration_rate dbt test for condos since it's always erro…
wrridgeway Apr 18, 2024
56a7f0a
Merge branch 'master' into 341-update-class-code-cleaning-to-allow-ne…
wrridgeway Apr 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions aws-athena/views/default-vw_pin_appeal.sql
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
-- View containing appeals by PIN
SELECT
htpar.parid AS pin,
pardat.class AS class,
REGEXP_REPLACE(pardat.class, '[^[:alnum:]]', '') AS class,
dfsnow marked this conversation as resolved.
Show resolved Hide resolved
legdat.user1 AS township_code,
htpar.taxyr AS year,
vwpv.mailed_bldg,
Expand All @@ -15,7 +15,9 @@ SELECT
WHEN htpar.user38 = 'CC'
OR (
htpar.user38 IN ('RS', 'IC')
AND pardat.class IN ('213', '297', '299', '399', '599')
AND REGEXP_REPLACE(pardat.class, '[^[:alnum:]]', '') IN (
'213', '297', '299', '399', '599'
)
) THEN 'condo/coop'
WHEN htpar.user38 = 'CE' THEN 'c of e - exempt'
WHEN htpar.user38 = 'CI' THEN 'c of e - incentive'
Expand Down
34 changes: 20 additions & 14 deletions aws-athena/views/default-vw_pin_condo_char.sql
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ oby_filtered AS (
-- We don't include DEACTIVAT IS NULL here since it can disagree with
-- DEACTIVAT in iasworld.pardat and we'll defer to that table
WHERE cur = 'Y'
AND class IN ('299', '2-99', '399')
AND class IN ('299', '399')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'2-99' is only an issue in pardat.

),

comdat_filtered AS (
Expand All @@ -50,7 +50,7 @@ comdat_filtered AS (
-- We don't include DEACTIVAT IS NULL here since it can disagree with
-- DEACTIVAT in iasworld.pardat and we'll defer to that table
WHERE cur = 'Y'
AND class IN ('299', '2-99', '399')
AND class IN ('299', '399')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'2-99' is only an issue in pardat.

),

-- All characteristics associated with condos in
Expand All @@ -60,7 +60,12 @@ chars AS (
SELECT
par.parid AS pin,
CASE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (blocking): For anything like this, where there's a CASE WHEN without an ELSE, we should add a dbt test for not null.

WHEN par.class IN ('299', '2-99') THEN oby.card
WHEN
(oby.card IS NULL OR com.card IS NULL)
THEN COALESCE(oby.card, com.card)
WHEN
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') = '299'
THEN oby.card
WHEN par.class = '399' THEN com.card
END AS card,
-- Proration related fields from PARDAT
Expand All @@ -75,8 +80,10 @@ chars AS (
ELSE 1.0
END AS tieback_proration_rate,
CASE
WHEN (oby.user20 IS NULL OR com.user24 IS NULL)
THEN CAST(COALESCE(oby.user20, com.user24) AS DOUBLE) / 100.0
WHEN
par.class IN ('299', '2-99')
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') = '299'
THEN CAST(oby.user20 AS DOUBLE) / 100.0
WHEN
par.class = '399'
Expand All @@ -85,12 +92,12 @@ chars AS (
oby.lline,
COALESCE(oby.num_lines, com.num_lines) AS num_lines,
SUBSTR(par.parid, 1, 10) AS pin10,
par.class,
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
par.taxyr AS year,
leg.user1 AS township_code,
CASE
WHEN
par.class IN ('299', '2-99')
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') = '299'
THEN oby.user16
WHEN
par.class = '399' AND nonlivable.flag != '399 GR'
Expand All @@ -102,7 +109,7 @@ chars AS (
-- Very rarely use 'effyr' rather than 'yrblt' when 'yrblt' is NULL
CASE
WHEN
par.class IN ('299', '2-99')
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') = '299'
THEN COALESCE(
oby.yrblt, oby.effyr, com.yrblt, com.effyr
)
Expand All @@ -115,7 +122,7 @@ chars AS (
MAX(
CASE
WHEN
par.class IN ('299', '2-99')
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') = '299'
THEN COALESCE(
oby.yrblt, oby.effyr, com.yrblt, com.effyr
)
Expand All @@ -141,7 +148,9 @@ chars AS (
COALESCE(SUM(
CASE
WHEN
par.class NOT IN ('299', '2-99', '399')
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') NOT IN (
'299', '399'
)
THEN 1
ELSE 0
END
Expand Down Expand Up @@ -219,7 +228,7 @@ filled AS (
OVER (PARTITION BY pin10, year)
AS building_pins
FROM chars
WHERE class IN ('299', '2-99', '399')
WHERE class IN ('299', '399')
)

SELECT DISTINCT
Expand All @@ -228,10 +237,7 @@ SELECT DISTINCT
filled.card,
filled.lline,
filled.year,
CASE WHEN filled.class = '2-99'
THEN '299'
ELSE filled.class
END AS class,
filled.class,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed in the CTE above.

filled.township_code,
filled.num_lines > 1 AS pin_is_multilline,
filled.num_lines AS pin_num_lline,
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/default-vw_pin_exempt.sql
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ SELECT
leg.user1 AS township_code,
own.own1 AS owner_name,
own.ownnum AS owner_num,
par.class,
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
vpa.prop_address_full AS property_address,
vpa.prop_address_city_name AS property_city,
--- Forward fill lat and long
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/default-vw_pin_sale.sql
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
WITH town_class AS (
SELECT
par.parid,
par.class,
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
par.taxyr,
leg.user1 AS township_code,
CONCAT(
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/default-vw_pin_universe.sql
dfsnow marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to future selves that this is just a line ending change

Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ SELECT
par.parid AS pin,
SUBSTR(par.parid, 1, 10) AS pin10,
par.taxyr AS year,
REGEXP_REPLACE(par.class, '([^0-9EXR])', '') AS class,
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
twn.triad_name,
twn.triad_code,
twn.township_name,
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/model-vw_pin_shared_input.sql
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ WITH uni AS (
par.parid AS pin,
SUBSTR(par.parid, 1, 10) AS pin10,
par.taxyr AS year,
REGEXP_REPLACE(par.class, '([^0-9EXR])', '') AS class,
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
twn.triad_name,
twn.triad_code,
twn.township_name,
Expand Down
6 changes: 4 additions & 2 deletions aws-athena/views/reporting-vw_ratio_stats.sql
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,10 @@ WITH classes AS (
SELECT
par.parid,
par.taxyr,
par.class,
CASE WHEN par.class IN ('299', '399') THEN 'CONDO'
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') AS class,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: Likewise on adding a not null test here.

Copy link
Member Author

@wrridgeway wrridgeway Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property_group is actually supposed to come back NULL for classes not in our regression classes.

CASE WHEN
REGEXP_REPLACE(par.class, '[^[:alnum:]]', '') IN ('299', '399')
THEN 'CONDO'
WHEN par.class IN ('211', '212') THEN 'MF'
WHEN
par.class IN (
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/reporting-vw_top_5.sql
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ classes AS (
SELECT
parid,
taxyr,
class,
REGEXP_REPLACE(class, '[^[:alnum:]]', '') AS class,
NULLIF(CONCAT_WS(
' ',
adrpre, CAST(adrno AS VARCHAR),
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/rpie-vw_code_retrieval.sql
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ SELECT
pin_codes.pin,
pin_codes.year,
pin_codes.rpie_code,
pardat.class,
REGEXP_REPLACE(pardat.class, '[^[:alnum:]]', '') AS class,
owndat.own1 AS mailing_name,
owndat.addr1 AS mailing_addr1,
CONCAT_WS(
Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/rpie-vw_pin_class_flatfile.sql
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ WITH classes AS (
SELECT
parid AS pin,
CAST(CAST(taxyr AS INT) + 1 AS VARCHAR) AS year,
class
REGEXP_REPLACE(class, '[^[:alnum:]]', '') AS class
FROM {{ source('iasworld', 'pardat') }}
)

Expand Down
2 changes: 1 addition & 1 deletion aws-athena/views/rpie-vw_pin_mailers.sql
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ WITH parcel_addressess AS (
NULLIF(zip1, '00000'),
NULLIF(zip2, '0000')
) AS property_zip,
class,
REGEXP_REPLACE(class, '[^[:alnum:]]', '') AS class,
SUBSTR(class, 1, 1) AS major_class,
CASE
WHEN SUBSTR(
Expand Down
4 changes: 4 additions & 0 deletions dbt/models/default/schema/default.vw_pin_appeal.yml
Original file line number Diff line number Diff line change
Expand Up @@ -107,3 +107,7 @@ models:
- row_count:
name: default_vw_pin_appeal_row_count
above: 8407667 # as of 2023-11-22
- not_accepted_values:
name: default_vw_pin_appeal_class_no_hyphens
column_name: class
values: "2-99"
14 changes: 14 additions & 0 deletions dbt/models/default/schema/default.vw_pin_condo_char.yml
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,20 @@ models:
- year
config:
error_if: ">70297"
- not_null:
name: default_vw_pin_condo_char_card_not_null
column_name: card
dfsnow marked this conversation as resolved.
Show resolved Hide resolved
- not_null:
name: default_vw_pin_condo_char_card_proration_rate_not_null
column_name: card_proration_rate
config:
error_if: ">49999" # as of 2024-04-16
- not_null:
name: default_vw_pin_condo_char_char_yrblt_not_null
column_name: char_yrblt
- not_null:
name: default_vw_pin_condo_char_bldg_is_mixed_use_not_null
column_name: bldg_is_mixed_use
# TODO: Non-liveable unit heuristics
# TODO: Row count matches PARDAT for condo classes
# TODO: Non-unit and unit parcel count add up to total parcels per
Expand Down
4 changes: 4 additions & 0 deletions dbt/models/default/schema/default.vw_pin_exempt.yml
Original file line number Diff line number Diff line change
Expand Up @@ -35,3 +35,7 @@ models:
expression: REGEXP_COUNT(pin, '[0-9]') = 14 AND LENGTH(pin) = 14
additional_select_columns:
- pin
- not_accepted_values:
name: default_vw_pin_exempt_class_no_hyphens
column_name: class
values: "2-99"
4 changes: 4 additions & 0 deletions dbt/models/default/schema/default.vw_pin_sale.yml
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,10 @@ models:
additional_select_columns:
- num_parcels_sale
- is_multisale
- not_accepted_values:
name: default_vw_pin_sale_class_no_hyphens
column_name: class
values: "2-99"
- expression_is_true:
name: default_vw_pin_sale_sale_filter_deed_type
expression: NOT sale_filter_deed_type OR deed_type IN ('03', '04', '06')
Expand Down
4 changes: 4 additions & 0 deletions dbt/models/default/schema/default.vw_pin_universe.yml
Original file line number Diff line number Diff line change
Expand Up @@ -259,5 +259,9 @@ models:
expression: REGEXP_COUNT(pin, '[0-9]') = 14 AND LENGTH(pin) = 14
additional_select_columns:
- pin
- not_accepted_values:
name: default_vw_pin_universe_class_no_hyphens
column_name: class
values: "2-99"
# TODO: Data completeness correlates with availability of spatial data
# by year
4 changes: 4 additions & 0 deletions dbt/models/reporting/schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,10 @@ models:
AND rank IS NOT NULL
AND total_av IS NOT NULL
AND stage_used IS NOT NULL
- not_accepted_values:
name: reporting_vw_top_5_class_no_hyphens
column_name: class
values: "2-99"

- name: reporting.vw_town_sale_history
description: '{{ doc("view_vw_town_sale_history") }}'
Expand Down
Loading