Add YTD total to fct_monthly_active_students #242

Merged: 8 commits, Nov 27, 2024

Conversation

jordan-springer (Collaborator) commented Nov 25, 2024

Description

Add a school-year-to-date (SYTD) total of unique active students to each row in fct_monthly_active_students:

  1. calculate first activity month for each student
  2. count distinct students for each first activity month
  3. sum the total unique first activities over the window for school year
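
The three steps above can be sketched roughly as follows. This is only an illustration run against an in-memory SQLite database, not the model's actual SQL: the table `student_activity`, the CTE names, and the sample rows are all hypothetical, and only the column names `num_active_students` and `sytd_active_students` are taken from the discussion in this PR.

```python
import sqlite3

# Hypothetical activity log: one row per student per active month.
rows = [
    ("s1", "US", "2024-25", "2024-07-01"),
    ("s2", "US", "2024-25", "2024-07-01"),
    ("s1", "US", "2024-25", "2024-08-01"),  # returning student
    ("s3", "US", "2024-25", "2024-08-01"),
    ("s2", "US", "2024-25", "2024-09-01"),  # returning student
    ("s4", "US", "2024-25", "2024-09-01"),
    ("s5", "US", "2024-25", "2024-09-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table student_activity "
    "(student_id text, country text, school_year text, activity_month text)"
)
conn.executemany("insert into student_activity values (?, ?, ?, ?)", rows)

QUERY = """
with first_activity as (
    -- step 1: first active month per student within each school year
    select student_id, country, school_year,
           min(activity_month) as first_month
    from student_activity
    group by student_id, country, school_year
),
new_students as (
    -- step 2: distinct students per first-activity month
    select country, school_year, first_month as activity_month,
           count(distinct student_id) as num_new_students
    from first_activity
    group by country, school_year, first_month
),
monthly as (
    -- plain monthly actives, for comparison with the SYTD column
    select country, school_year, activity_month,
           count(distinct student_id) as num_active_students
    from student_activity
    group by country, school_year, activity_month
)
-- step 3: running sum of first-time students within the school year
select m.activity_month,
       m.num_active_students,
       sum(coalesce(n.num_new_students, 0)) over (
           partition by m.country, m.school_year
           order by m.activity_month
           rows between unbounded preceding and current row
       ) as sytd_active_students
from monthly as m
left join new_students as n
  on  n.country = m.country
  and n.school_year = m.school_year
  and n.activity_month = m.activity_month
order by m.activity_month
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Because each student is counted only in the month of their first activity for the school year, the running sum cannot count a returning student twice.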

Links

Jira ticket(s): DATAOPS-1068

Testing story

Example for this school_year, US:
image

cc @coryamanda

  • Does your change include appropriate tests on key columns?
    e.g.
    - not_null
    - unique
    - `dbt_utils.unique_combination_of_columns: ["value", "value", "value", ...]`

Note: when submitting a new model for review please make sure the following have been tested:

  1. The model compiles (dbt build -m 'your_model')
    or: has the dbt Cloud job succeeded?
  2. The model runs (dbt run -m 'your_model')
  3. The model produces accessible data in the DW (select 1 from 'your_model')

Privacy

  • 1. Does this change involve the collection, use, or sharing of new Personal Data?
  • 2. Do these data exist in the appropriate schema(s)?
  • 3. Does this change involve a new or changed use or sharing of existing Personal Data?
  • 4. Consider: will this data be visible on Tableau? will this data be surfaced in a report exported from Trevor?
  • 5. If yes to any of the above, please list the models, columns, and justification below:
    i.
    ii.
    iii.

PR Checklist:

--> Note: if these are not all checked, the PR will be sent back.

  • Tests provide adequate coverage
  • Privacy and Security impacts have been assessed
  • Code adheres to style guide 👀 and is DRY
  • Code is well-commented (**please do not leave extraneous commentary in model code; if it is for the purpose of documentation, please relocate accordingly**)
  • Appropriate documentation has been provided (see .yml; did dbt docs generate succeed?)
  • New features are translatable or updates will not break up/downstream models
  • Relevant documentation has been added or updated (i.e. dbt docs has been updated successfully on GitHub Pages)
  • Pull Request is labeled appropriately (e.g. chore/, feature/, fix/)
  • Follow-up work items (including potential tech debt) are tracked and linked (if applicable)

jordan-springer changed the title from "stashing" to "Add YTD totals to active student fact models" on Nov 25, 2024
jordan-springer self-assigned this on Nov 25, 2024
jordan-springer changed the title from "Add YTD totals to active student fact models" to "Add YTD total to fct_monthly_active_students" on Nov 25, 2024
jordan-springer marked this pull request as ready for review on November 25, 2024 19:43
coryamanda (Collaborator) commented:

Just for the sake of comprehensiveness: this is double-counting students who use the platform in consecutive months. I need the school-year-to-date numbers for unique students.

So something like:
Month | Country | num_active_students | sytd_active_students
7/1/2024 | US | 1M | 1M <- In July they're equal because it's the beginning of the school year
8/1/2024 | US | 2M | 2.75M <- In August the number of YTD will be lower than the sum of July + August because there will be some duplication
9/1/2024 | US | 4M | 6.25M
...
6/1/2025 | US | 2M | 9M <- The June school year to date number will be equivalent to the 2024-25 school year-end count of unique students
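
To make the difference between the two columns concrete, here is a small sketch contrasting a naive running sum of monthly actives with a running count of distinct students. The student ids and months are made up for illustration, and the variable names are hypothetical.

```python
# Hypothetical student ids active in each month of one school year.
active_by_month = {
    "2024-07": {"s1", "s2"},
    "2024-08": {"s1", "s3"},
    "2024-09": {"s2", "s4", "s5"},
}

seen = set()       # every student seen so far this school year
running_sum = 0    # naive total: adds monthly counts, so repeats are re-counted
report = []

for month in sorted(active_by_month):
    active = active_by_month[month]
    running_sum += len(active)
    seen |= active
    # (month, monthly actives, naive running sum, unique school-year-to-date)
    report.append((month, len(active), running_sum, len(seen)))

for line in report:
    print(line)
```

In the first month the two running figures agree; after that the unique count falls behind the naive sum by the number of returning students, which is the pattern the table describes.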

This is answering this question:
"We're flat October 2024 compared to October 2023. We were 3% down September 2024 compared to September 2023. How are we doing YOY overall?"

coryamanda (Collaborator) commented:

I found two issues in the previous version:

  • It was still double counting students (the sum over partition was on num_active_students, not the num_active_students_ytd field)
  • First active month didn't include the school year, so it was taking each student's first active month ever in the account rather than their first active month of that school year.

I pushed a commit to fix those two things. It is a breaking change, as my version does not run: there's an error in the last CTE, I think with the partition formatting. The logic in the CTEs above it (up through final) should now be working, so hopefully the issue in rolling_final is an easy fix. Can you look at this again and see if you can figure out the partition logic in the last step?
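
The second issue above (first active month ever versus first active month within the school year) can be illustrated with a short sketch. It assumes a July to June school year, as the July row in the earlier example suggests; the helper `school_year_start` and the sample data are hypothetical and not part of the model.

```python
from datetime import date


def school_year_start(d: date) -> int:
    """Starting calendar year of the July-June school year containing d (assumed convention)."""
    return d.year if d.month >= 7 else d.year - 1


# Hypothetical (student, active month) pairs spanning two school years.
activity = [
    ("s1", date(2023, 9, 1)),
    ("s1", date(2024, 8, 1)),
    ("s1", date(2024, 10, 1)),
    ("s2", date(2024, 7, 1)),
    ("s2", date(2025, 6, 1)),
]

first_ever = {}     # keyed by student only: the behaviour described as a bug
first_in_year = {}  # keyed by (student, school year): the intended behaviour

for student, month in activity:
    first_ever[student] = min(month, first_ever.get(student, month))
    key = (student, school_year_start(month))
    first_in_year[key] = min(month, first_in_year.get(key, month))

print(first_ever["s1"])             # 2023-09-01
print(first_in_year[("s1", 2024)])  # 2024-08-01
```

Keyed by student alone, s1 would never count as a first-time student in 2024-25, so the school-year-to-date total for that year would undercount them.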

coryamanda (Collaborator) left a comment:


great - thanks!

coryamanda merged commit be87b1f into main on Nov 27, 2024
1 check passed
coryamanda deleted the feature/add_ytd_to_fct_active_students branch on November 27, 2024 20:00