Output request queue time for Django #55
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master      #55      +/-   ##
==========================================
+ Coverage   54.63%   55.66%   +1.02%
==========================================
  Files          11       11
  Lines         410      424      +14
==========================================
+ Hits          224      236      +12
- Misses        186      188       +2
Continue to review full report at Codecov.
HireFire includes a strategy for scaling web dynos based on Heroku request queue times. This change adds a second Django middleware with support for queue time scaling per the official docs: https://help.hirefire.io/article/49-logplex-queue-time. I've verified that it works correctly end-to-end as a dependency in a WIP branch of a production application (closed source, so I can't link to it).
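For context, here's a minimal sketch of the general approach (not the exact code in this PR; the class name is a placeholder and the reporting step is elided). The Heroku router sets the X-Request-Start header to the request's arrival time in milliseconds since the epoch, and the middleware compares that against its own clock:

import time


class QueueTimeSketchMiddleware:
    # Hypothetical middleware, for illustration only.
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        start_header = request.META.get("HTTP_X_REQUEST_START")
        if start_header:
            now_ms = int(time.time() * 1000)
            try:
                request_start_ms = int(start_header)
            except ValueError:
                request_start_ms = None
            if request_start_ms is not None:
                # Clamp at zero so clock drift between the router and the dyno
                # cannot produce a negative queue time (see the discussion below).
                queue_time_ms = max(now_ms - request_start_ms, 0)
                # The value would then be buffered/reported for HireFire as
                # described in the linked docs; that step is omitted here.
        return self.get_response(request)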
This is neat. I don't use the HireFire service anymore for my projects (though I do still use this package to get metrics), but this seems like a cool way to scale the web dynos.
Are you up for maintaining the code in this PR for this package? I don't plan on using it, so I won't easily be able to tell if something's not working with it.
    now_timestamp_ms - request_start_timestamp_ms
    if now_timestamp_ms >= request_start_timestamp_ms
    else 0
)
Does HireFire not understand clock drift? It seems pretty apparent to me that this is throwing away data, and I hate to see that happen.
No, I believe that, unfortunately, the HireFire service will happily ingest a negative queue time value, which affects the aggregate. I think that's less of an issue when scaling on a percentile (e.g. we use the 95th percentile queue time in production), but with an average, it could drag the reported average queue time below zero. Since HireFire has a minimum value of 1ms for the up/down threshold settings, that seems a bit problematic.
Arguably, some smarts could be added here: track the smallest negative value encountered and add its absolute value as an offset to all future timestamps from the current dyno. I had hoped to keep things simple, though, and I don't think there's a perfect solution in any case, since the system clock on a dyno can always drift relative to the clock on the Heroku router. Anecdotally, however, I've seen the difference between the Heroku router and any given dyno stay generally within +/-10ms.
All that said, I'm open to suggestions.
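To make the trade-off concrete, here's a rough sketch of that offset idea (a hypothetical helper, not part of this PR), applied to the computed difference rather than the raw timestamps, which comes to the same thing. It keeps reported values non-negative, but every later measurement from that dyno gets inflated by the worst drift seen so far:

class DriftOffsetTracker:
    # Hypothetical helper, for illustration only.
    def __init__(self):
        self.offset_ms = 0

    def adjust(self, raw_queue_time_ms):
        # Grow the offset whenever a value more negative than anything seen
        # before shows up, so that value maps to zero instead of being dropped.
        if raw_queue_time_ms + self.offset_ms < 0:
            self.offset_ms = -raw_queue_time_ms
        # Every subsequent measurement is shifted by the same amount, which is
        # where the inflation comes from.
        return raw_queue_time_ms + self.offset_ms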
My suggestion is that, by default, the times be passed along exactly as calculated. That said, I'd be fine with an option to round negative values off to zero, as long as there's a note in the documentation explaining what it does and the trade-offs involved.
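Something along these lines, perhaps (the setting name and helper are made up here, purely to illustrate the shape of the option): pass the raw difference through by default, and only clamp when the option is explicitly enabled:

from django.conf import settings


def queue_time_ms(now_ms, request_start_ms):
    # Hypothetical helper; the setting name below is illustrative only.
    raw = now_ms - request_start_ms
    if getattr(settings, "HIREFIRE_CLAMP_NEGATIVE_QUEUE_TIMES", False):
        # Opt-in behaviour: treat clock-drift artifacts as zero queue time.
        return max(raw, 0)
    # Default: report exactly what was measured, even if it is negative.
    return raw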
Does that sound alright to you? I just hate to have a default of losing data, even if the data is less than ideal.
Sure, I can look after this code. I don't expect it will evolve much, if at all, but I will be using it in production, so I don't mind.