
Historical Trend Speed up #3665

Merged — 5 commits merged into main on Jul 28, 2023

Conversation

@seanmarcia (Member) commented Jun 17, 2023

This is a somewhat substantial PR that resolves #3592

It attempts to resolve some of the current problems with the historical data pages as well as speed them up through caching.

Things this PR does:

  • Makes the chart cover a full year relative to the current month, rather than just the period since the start of the year, for all banks.
  • Shows only the items that are active for a bank, rather than all of its items regardless of whether they are used.
  • Updates the query to look at only the last year rather than all time -- this should speed things up when an ad hoc query needs to be run.
  • Creates a rake task cache_historical_data that will cache the historical data for all banks. This should be run once a day.
  • Created a background job that the rake task will run so that it can run in the background.
  • Updated the buttons on the historical data page to be more consistent with the rest of the site.
  • Made the charts all deselected by default.
  • Adds an is_active scope to the organization model to determine if an organization is active. It does this by querying if any of the org users have logged on in the past 4 months.
  • Most importantly it makes the loading of the charts really, really fast. 👍
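The read-through caching flow described above can be sketched as follows. This is an illustrative plain-Ruby stand-in: the `Cache` class is a hypothetical Hash-backed substitute for `Rails.cache`, and only the cache-key format (`#{short_name}-historical-#{type}-data`, which appears in the PR's specs) comes from the PR itself.

```ruby
# Illustrative sketch of the caching pattern: the nightly job writes,
# the historical-trends page reads. Cache is a hypothetical stand-in
# for Rails.cache.
class Cache
  def initialize
    @store = {}
  end

  # Mirrors Rails.cache.write(key, value)
  def write(key, value)
    @store[key] = value
  end

  # Mirrors Rails.cache.fetch(key) { ... }: return the cached value,
  # or compute, store, and return it on a miss.
  def fetch(key)
    @store.fetch(key) { @store[key] = yield }
  end
end

CACHE = Cache.new
DATA_TYPES = ['Distribution', 'Purchase', 'Donation']

# The nightly job precomputes a chart series per org and data type.
def cache_historical_data(short_name, type, series)
  CACHE.write("#{short_name}-historical-#{type}-data", series)
end

# The page then reads from the cache instead of running the slow query.
def historical_series(short_name, type)
  CACHE.fetch("#{short_name}-historical-#{type}-data") { [] }
end
```

The helper names (`cache_historical_data`, `historical_series`) are assumptions, not the PR's actual method names; the point is only that the expensive query runs once per day rather than on every page load.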

Things to consider:

  • The rake task will need to be scheduled.
  • There wasn't a caching strategy defined in production.rb, so I used :redis_cache_store, which is the most common choice, and we already have Redis installed.
  • I saw we use url: ENV['REDIS_URL'] elsewhere on the site (for coverband) so I assumed that was the correct environment variable to use for redis.
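A minimal sketch of the cache-store configuration the two points above describe; the PR only states that `:redis_cache_store` was used with `ENV['REDIS_URL']`, so any options beyond `url` would be assumptions and are omitted here.

```ruby
# config/environments/production.rb (sketch, not the PR's exact diff)
config.cache_store = :redis_cache_store, { url: ENV['REDIS_URL'] }
```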

Screenshots

Before:

[screenshot]

After:

[screenshots]

@seanmarcia seanmarcia force-pushed the historical_trends_caching branch 2 times, most recently from 43ebc9c to 9616a9a Compare June 17, 2023 16:22
@cielf (Collaborator) commented Jun 22, 2023

"Updates the query to only look at the last year rather than all time -- this should speed up when an ad hoc query needs to be run." -- Ultimately I'd like to see this going further back in time, so they can see trends. But I think we need more thought, and some people actually using it, before that becomes an issue.

@dorner (Collaborator) left a comment

The overall technical approach looks good to me! I had some suggestions to firm up the safety and performance around this.

@@ -0,0 +1,16 @@
desc "This task is run by a scheduling tool nightly to cache the processor intensive queries"
task :cache_historical_data => :environment do
puts "Caching historical data"
Collaborator

Should we be using Rails.logger here instead of puts?

Member Author

Switched to logger with info

puts "Caching historical data"
DATA_TYPES = ['Distribution', 'Purchase', 'Donation']

orgs = Organization.all
Collaborator

Is there some way for us to filter out old / inactive organizations from this list?

Collaborator

Have we got a definition of what an inactive organization is? (Have we marked them as such somehow?)

Collaborator

Nope, but we could define one as (say) one where no one has logged in for over 3 months.

Collaborator

I'd say 4, just in case there's someone working on a quarterly basis. I doubt it, but I haven't ruled it out.

Member Author

Done. Created an is_active scope that checks whether the organization has any users who have logged in within the past 4 months.
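A plain-Ruby sketch of the 4-month cutoff logic behind such a scope. The `User`/`Organization` structs and the `active?` helper are illustrative stand-ins; in the actual app this is an ActiveRecord scope joining users, and the sign-in timestamp column name is not stated in the thread.

```ruby
require 'date'

# Illustrative stand-ins; the real models are ActiveRecord classes.
User = Struct.new(:last_sign_in_at)
Organization = Struct.new(:name, :users)

# An org counts as active if any of its users signed in within the
# last 4 months (in Rails this would be a scope with a join and a
# date comparison rather than an in-memory filter).
def active?(org, today: Date.today)
  cutoff = today << 4  # Date#<< steps back by whole months
  org.users.any? { |u| u.last_sign_in_at && u.last_sign_in_at > cutoff }
end

today = Date.new(2023, 6, 17)
fresh = Organization.new('fresh', [User.new(Date.new(2023, 5, 1))])
stale = Organization.new('stale', [User.new(Date.new(2022, 12, 1))])
```

The 4-month window follows the discussion above (3 months was proposed, widened to 4 to cover anyone working on a quarterly basis).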

end

it "caches the historical data" do
expect(Rails.cache).to receive(:write).with("#{organization.short_name}-historical-#{type}-data", anything)
Collaborator

Can we get a more thorough test that shows the actual data being written? Even if it means stubbing out the HistoricalTrendService.

Member Author

Fixed. Stubbed out the response.

{name: "Item 2", data: [0, 0, 0, 0, 0, 0, 0, 0, 30, 0, 0, 0], visible: false}
]
expect(service.series).to eq(expected_result)
end
Collaborator

Can we add a couple more tests here where we time travel to different months? Just to make sure nothing weird happens around the year end.

Member Author

Done. For the first item I created a donation line item for each of the 12 months in the series, with the line item from 12 months ago having 120, the one from 11 months ago having 110, and so on.
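The fixture pattern described above can be generated like this; the oldest-month-first ordering of the series is an assumption based on the chart covering the trailing 12 months.

```ruby
# Donation totals per month, oldest first: 12 months ago => 120,
# 11 months ago => 110, ..., 1 month ago => 10.
monthly_totals = 12.downto(1).map { |months_ago| months_ago * 10 }
```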

…gned in in the past 4 months. Fleshed out tests, updated rake task to use logger.
@dorner (Collaborator) left a comment

Looks good to me!

@seanmarcia @cielf is there anything we need to do from a migration perspective here? Or are we good to merge it as is?

@cielf (Collaborator) commented Jun 30, 2023

@dorner Well, we have to set up the job to run once a day, right? How do we make that happen on staging, and then, of course, remember that it also has to be set up in prod?

@dorner (Collaborator) commented Jun 30, 2023

We can do something super dumb with e.g. https://github.com/jmettraux/rufus-scheduler. Or we can use https://github.com/sidekiq-cron/sidekiq-cron.

@cielf (Collaborator) commented Jun 30, 2023

We have other periodic jobs, right (like the backups)? It makes sense to me to use the same method for all of them.

@cielf (Collaborator) commented Jul 22, 2023

@dorner Let's talk on Sunday about the actual procedure for getting this in, along with the periodic job.

@scooter-dangle (Collaborator) commented Jul 23, 2023

@seanmarcia, per Sunday planning, would this be better to use a delayed job or the cron capability in Cloud66? (And can you do this plumbing? 🪠)

@seanmarcia (Member Author) commented Jul 23, 2023

Happy to help with whatever. I was thinking of using the delayed_job_recurring gem to handle this (and other similar) tasks.

Went with rufus-scheduler since it is active and maintained.
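A minimal initializer sketch of how rufus-scheduler might run the nightly caching work. Only the choice of gem comes from the thread; the file path, the cron time, and the `HistoricalDataCacheJob` class name are all assumptions (the thread mentions a background job but never names it).

```ruby
# config/initializers/scheduler.rb (sketch; names and schedule are
# illustrative, not taken from the PR)
require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

# Run the caching job once a day, here at 02:00 server time.
scheduler.cron '0 2 * * *' do
  HistoricalDataCacheJob.perform_later
end
```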

@awwaiid awwaiid merged commit 0971ee0 into main Jul 28, 2023
12 checks passed
@awwaiid awwaiid deleted the historical_trends_caching branch July 28, 2023 19:58
@awwaiid (Collaborator) commented Jul 28, 2023

Thank you @seanmarcia !

@cielf (Collaborator) commented Jul 28, 2023

Cool. Thanks, @seanmarcia !

@awwaiid awwaiid added the Ruby for Good 2023 DC Created for Ruby for Good 2023 DC label Jul 30, 2023
@awwaiid (Collaborator) commented Jul 30, 2023

Tagged as RFG only because we pushed the last bit while we were together!

@cielf (Collaborator) commented Jul 30, 2023

I'm certainly for counting it!

Labels: Ruby for Good 2023 DC
Successfully merging this pull request may close these issues.

Forecasting -- Improve initial load speed
5 participants