Question: Are there certain query patterns that improve arquero's performance (or, conversely, make it worse, in which case we should avoid them)?
Example: Let's say I want to compute some basic stats about 2 metrics:
Would arquero perform better if I did each metric separately such as:
dt.rollup({
min: d => op.min(d.metric_1),
max: d => op.max(d.metric_1),
sum: d => op.sum(d.metric_1),
avg: d => op.average(d.metric_1),
med: d => op.median(d.metric_1)
})
dt.rollup({
min: d => op.min(d.metric_2),
etc...
})
Or would arquero perform better if I combined them both in the same rollup?
dt.rollup({
metric_1_min: d => op.min(d.metric_1),
metric_2_min: d => op.min(d.metric_2),
etc...
})
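One way to check this empirically is to time both patterns on the actual data. Below is a minimal sketch of such a comparison, not an official arquero benchmark: `medianTime` and `compare` are hypothetical helper names introduced here, and the commented usage assumes `dt` (an arquero table) and `op` are already in scope. It relies on the global `performance.now()` available in browsers and recent Node.js versions.

```javascript
// Hypothetical helper (not part of arquero): run fn several times and
// return the median elapsed time in milliseconds. The median is less
// sensitive to one-off pauses (GC, JIT compilation) than the mean.
function medianTime(fn, runs = 20) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    times.push(performance.now() - start);
  }
  times.sort((a, b) => a - b);
  const mid = Math.floor(times.length / 2);
  return times.length % 2 ? times[mid] : (times[mid - 1] + times[mid]) / 2;
}

// Hypothetical helper: time each named candidate and return an object
// of the form { name: medianMilliseconds }.
function compare(candidates, runs = 20) {
  const result = {};
  for (const [name, fn] of Object.entries(candidates)) {
    fn(); // warm-up run, not timed
    result[name] = medianTime(fn, runs);
  }
  return result;
}

// Usage with the two patterns from the question
// (assumes `dt` and `op` are in scope; only min/max shown for brevity):
//
// const timings = compare({
//   separate: () => {
//     dt.rollup({ min: d => op.min(d.metric_1), max: d => op.max(d.metric_1) });
//     dt.rollup({ min: d => op.min(d.metric_2), max: d => op.max(d.metric_2) });
//   },
//   combined: () => dt.rollup({
//     metric_1_min: d => op.min(d.metric_1),
//     metric_1_max: d => op.max(d.metric_1),
//     metric_2_min: d => op.min(d.metric_2),
//     metric_2_max: d => op.max(d.metric_2)
//   })
// });
// console.log(timings);
```

The numbers will depend on table size, column types, and the JavaScript engine, so it is worth running this on data that resembles the real workload rather than on a toy table.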
Background research: I searched for performance and optimiz... in the forum/docs and found nothing about query optimization, only these mentions:
"to run performance benchmarks"
https://uwdata.github.io/arquero/api/: "Arquero unpacks columns with null entries or containing multiple record batches to optimize query performance"
About performance benchmarks: to me these serve a different purpose. They are meant to show how fast arquero performs certain tasks, not how to optimize performance for particular queries.
Any tips would be much appreciated: not criticizing arquero nor implying it's slow, just trying to get the most out of it 😃 Many thanks!