-
Notifications
You must be signed in to change notification settings - Fork 762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add max speed & speed histogram to drive stats #4253
feat: add max speed & speed histogram to drive stats #4253
Conversation
✅ Deploy Preview for teslamate ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Hi @js94x - thanks for this suggestions. to ensure dashboards & panels are loading quick could you add an |
Hi @swiffer , sure, I just added max speed queryteslamate=# explain analyze SELECT max(speed) AS speed_kmh
FROM positions
WHERE car_id = '1' AND date BETWEEN '2023-10-09T20:20:50.849Z' AND '2024-10-09T20:20:50.849Z';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=39555.95..39555.96 rows=1 width=2) (actual time=119.718..123.784 rows=1 loops=1)
-> Gather (cost=39555.73..39555.94 rows=2 width=2) (actual time=119.612..123.776 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=38555.73..38555.74 rows=1 width=2) (actual time=109.068..109.069 rows=1 loops=3)
-> Parallel Seq Scan on positions (cost=0.00..36665.64 rows=756037 width=2) (actual time=0.008..87.262 rows=604294 loops=3)
Filter: ((date >= '2023-10-09 20:20:50.849'::timestamp without time zone) AND (date <= '2024-10-09 20:20:50.849'::timestamp without time zone) AND (car_id = '1'::smallint))
Planning Time: 0.323 ms
Execution Time: 123.813 ms
(9 rows) histogram query--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=207701.97..207702.47 rows=200 width=96) (actual time=3389.331..3389.333 rows=20 loops=1)
Sort Key: drivedata.speed_section_km
Sort Method: quicksort Memory: 26kB
-> HashAggregate (cost=207687.83..207694.33 rows=200 width=96) (actual time=3389.295..3389.320 rows=20 loops=1)
Group Key: drivedata.speed_section_km
Batches: 1 Memory Usage: 48kB
-> Subquery Scan on drivedata (cost=0.72..203151.61 rows=604829 width=96) (actual time=10.814..3013.453 rows=1697374 loops=1)
Filter: (drivedata.speed_section_km > '0'::numeric)
Rows Removed by Filter: 85841
-> WindowAgg (cost=0.72..180470.51 rows=1814488 width=104) (actual time=10.806..2883.487 rows=1783215 loops=1)
-> Nested Loop (cost=0.72..116963.43 rows=1814488 width=26) (actual time=10.784..1254.866 rows=1783215 loops=1)
-> Index Scan using positions_date_index on positions p (cost=0.43..71381.04 rows=1814488 width=14) (actual time=0.014..611.027 rows=1812882 loops=1)
-> Memoize (cost=0.29..0.31 rows=1 width=20) (actual time=0.000..0.000 rows=1 loops=1812882)
Cache Key: p.drive_id
Cache Mode: logical
Hits: 1811893 Misses: 989 Evictions: 0 Overflows: 0 Memory Usage: 116kB
-> Index Scan using trips_pkey on drives d (cost=0.28..0.30 rows=1 width=20) (actual time=0.002..0.002 rows=1 loops=989)
Index Cond: (id = p.drive_id)
Filter: ((start_date >= '2023-10-09 20:16:27.717'::timestamp without time zone) AND (start_date <= '2024-10-09 20:16:27.717'::timestamp without time zone) AND (car_id = '1'::smallint))
Planning Time: 0.388 ms
JIT:
Functions: 24
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 0.995 ms, Inlining 0.000 ms, Optimization 0.439 ms, Emission 10.316 ms, Total 11.751 ms
Execution Time: 3390.428 ms
(25 rows)
|
I made a fix to the query to display real percentage values and changed color to a static one. New query: queryWITH TotalDuration AS ( SELECT SUM(seconds_elapsed) AS total_elapsed_time FROM ( SELECT ROUND(convert_km(p.speed::numeric, '$length_unit') / 10, 0) * 10 AS speed_section_$length_unit, EXTRACT(EPOCH FROM (LEAD(p."date") OVER (ORDER BY p."date") - p."date")) AS seconds_elapsed FROM drives d INNER JOIN positions p ON p.drive_id = d.id WHERE d.car_id = $car_id AND $__timeFilter(d.start_date) ) AS drivedata WHERE speed_section_$length_unit > 0 ) SELECT speed_section_$length_unit AS "Speed", SUM(seconds_elapsed) * 100 / (SELECT total_elapsed_time FROM TotalDuration) AS "Elapsed", -- Normalized percentage TO_CHAR((SUM(seconds_elapsed) || ' second')::interval, 'HH24:MI:SS') AS "Time" FROM ( SELECT ROUND(convert_km(p.speed::numeric, '$length_unit') / 10, 0) * 10 AS speed_section_$length_unit, EXTRACT(EPOCH FROM (LEAD(p."date") OVER (ORDER BY p."date") - p."date")) AS seconds_elapsed, EXTRACT(EPOCH FROM (end_date - start_date)) AS duration FROM drives d INNER JOIN positions p ON p.drive_id = d.id WHERE d.car_id = $car_id AND $__timeFilter(d.start_date) ) AS drivedata WHERE speed_section_$length_unit > 0 GROUP BY 1 ORDER BY 1 Unfortunately the overall query time increased a bit (checked with explain analyze). @swiffer I saw you tuned some queries here #4252, maybe you can give me a hint how to optimize my speed histogram query? |
Just changed the query again (after asked chatgpt to optimize it); now it is much shorter and results in better performance. WITH drivedata AS (
SELECT
ROUND(convert_km(p.speed::numeric, '$length_unit') / 10, 0) * 10 AS speed_section_$length_unit,
EXTRACT(EPOCH FROM (LEAD(p."date") OVER (ORDER BY p."date") - p."date")) AS seconds_elapsed
FROM drives d
INNER JOIN positions p ON p.drive_id = d.id
WHERE d.car_id = $car_id AND $__timeFilter(d.start_date)
),
TotalDuration AS (
SELECT SUM(seconds_elapsed) AS total_elapsed_time
FROM drivedata
WHERE speed_section_$length_unit > 0
)
SELECT
speed_section_$length_unit AS "Speed",
SUM(seconds_elapsed) * 100 / (SELECT total_elapsed_time FROM TotalDuration) AS "Elapsed",
TO_CHAR((SUM(seconds_elapsed) || ' second')::interval, 'HH24:MI:SS') AS "Time"
FROM drivedata
WHERE speed_section_$length_unit > 0
GROUP BY speed_section_$length_unit
ORDER BY speed_section_$length_unit; explain analyzeQUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=257600.42..257600.92 rows=200 width=96) (actual time=3958.840..3958.844 rows=20 loops=1) Sort Key: drivedata.speed_section_km Sort Method: quicksort Memory: 26kB CTE drivedata -> WindowAgg (cost=0.72..171398.07 rows=1814488 width=72) (actual time=11.939..2646.071 rows=1783215 loops=1) -> Nested Loop (cost=0.72..116963.43 rows=1814488 width=10) (actual time=11.921..1248.612 rows=1783215 loops=1) -> Index Scan using positions_date_index on positions p (cost=0.43..71381.04 rows=1814488 width=14) (actual time=0.019..616.673 rows=1812882 loops=1) -> Memoize (cost=0.29..0.31 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1812882) Cache Key: p.drive_id Cache Mode: logical Hits: 1811893 Misses: 989 Evictions: 0 Overflows: 0 Memory Usage: 101kB -> Index Scan using trips_pkey on drives d (cost=0.28..0.30 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=989) Index Cond: (id = p.drive_id) Filter: ((start_date >= '2023-10-10 05:09:50.766'::timestamp without time zone) AND (start_date <= '2024-10-10 05:09:50.766'::timestamp without time zone) AND (car_id = '1'::smallint)) InitPlan 2 (returns $2) -> Aggregate (cost=42338.06..42338.07 rows=1 width=32) (actual time=349.595..349.595 rows=1 loops=1) -> CTE Scan on drivedata drivedata_1 (cost=0.00..40825.98 rows=604829 width=32) (actual time=0.018..230.080 rows=1697374 loops=1) Filter: (speed_section_km > '0'::numeric) Rows Removed by Filter: 85841 -> HashAggregate (cost=43850.12..43856.62 rows=200 width=96) (actual time=3958.802..3958.830 rows=20 loops=1) Group Key: drivedata.speed_section_km Batches: 1 Memory Usage: 48kB -> CTE Scan on drivedata (cost=0.00..40825.98 rows=604829 width=64) (actual time=11.950..3226.430 rows=1697374 loops=1) Filter: (speed_section_km > '0'::numeric) Rows Removed by Filter: 85841 Planning Time: 0.385 ms JIT: Functions: 32 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 1.010 ms, Inlining 0.000 ms, Optimization 0.453 ms, Emission 11.505 ms, Total 12.969 ms Execution Time: 3964.892 ms (31 rows) |
@js94x - yes, the query provided by chatgpt is better cause within yours the positions are queried twice. Basically, right now there are some rules to follow
I opened a discussion to evaluate TimescaleDB instead of PostgreSQL - this hopefully results in better performance and adds additional possibilities (continous aggregates / new time series related functions / ...). if excluding streaming data - query exec time is ~36 times faster in my case WITH drivedata AS (
SELECT
ROUND(convert_km(p.speed::numeric, 'km') / 10, 0) * 10 AS speed_section_km,
EXTRACT(EPOCH FROM (LEAD(p."date") OVER (ORDER BY p."date") - p."date")) AS seconds_elapsed
from positions p
WHERE p.car_id = '2' AND p.date > '2023-10-12' and p.ideal_battery_range_km is not null
),
TotalDuration AS (
SELECT SUM(seconds_elapsed) AS total_elapsed_time
FROM drivedata
WHERE speed_section_km > 0
)
SELECT
speed_section_km AS "Speed",
SUM(seconds_elapsed) * 100 / (SELECT total_elapsed_time FROM TotalDuration) AS "Elapsed",
TO_CHAR((SUM(seconds_elapsed) || ' second')::interval, 'HH24:MI:SS') AS "Time"
FROM drivedata
WHERE speed_section_km > 0
GROUP BY speed_section_km
ORDER BY speed_section_km; |
Thanks. I understand your concerns / rules.
Sounds like a good compromise. I currently do not plan to clone my database, switch all docker images to master branch and test the new column. So, would it possible for you to run both queries in psql with your data? One query with |
you can validate using your dataset (simply add / remove the additional query condition in your drivedata CTE you can expect the query incl. the condition to be ~ 30x faster once the next release is out. |
@swiffer what do you think about this PR after your changes? |
values do differ (incl. vs excl. streaming data) - i would vote for dashboard loading speed over accuracy in this case.
|
thanks @swiffer for the feedback. I agree and changed both little things. |
👍 looks good to me, approved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much!
Missing convert_km()s for mph units and speed_mih should be speed_mph imo Fixed with #4323 |
I have implemented long term statistics about maximum speed and a speed histogram (already known from "Drive Details" dashboard).
Imho a good place for that kind of statistics is the "Drive Stats" dashboard.
For speed histogram I added a "hack" to hide standing times (speed == 0) at traffic lights or while parking -> see last WHERE condition. Otherwise the bar/spike at the left side of histogram would be even bigger.
See screenshot for more details: