Benchmark normalization speed #7741

ChristopheDuong · 2021-11-08T16:27:13Z

Tell us about the problem you're trying to solve

Following up on #4286, we also need a test that runs normalization on a larger volume of data than the current integration tests use, so that we can study how the generated models scale (in both full refresh and incremental modes).

See https://airbytehq.slack.com/archives/C01MFR03D5W/p1636199069220000

Describe the solution you’d like

Generate X new rows and replicate them to all destinations.
Measure the time and the amount of data processed to normalize those X new rows.

Ideally, this avoids reprocessing the Y old rows already in the history, which may be large, unnecessary, and expensive to process again.
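A minimal sketch of what such a benchmark could look like, using an in-memory SQLite database as a stand-in for a real destination. The table names (`raw_users`, `users`), the `emitted_at` cursor column, and the two `normalize_*` functions are hypothetical placeholders for illustration; they are not the SQL that normalization actually generates.

```python
import sqlite3
import time


def load_raw_rows(conn, start_id, count, emitted_at):
    """Append `count` synthetic raw rows, all stamped with `emitted_at`."""
    rows = [(i, emitted_at, f"name-{i}") for i in range(start_id, start_id + count)]
    conn.executemany("INSERT INTO raw_users VALUES (?, ?, ?)", rows)


def normalize_full_refresh(conn):
    """Hypothetical full refresh: rebuild the normalized table from every raw row."""
    conn.execute("DELETE FROM users")
    conn.execute("INSERT INTO users SELECT id, emitted_at, upper(name) FROM raw_users")


def normalize_incremental(conn):
    """Hypothetical incremental run: only process rows newer than the last normalized one."""
    conn.execute(
        "INSERT INTO users SELECT id, emitted_at, upper(name) FROM raw_users "
        "WHERE emitted_at > (SELECT coalesce(max(emitted_at), -1) FROM users)"
    )


def timed(fn, conn):
    """Run one normalization step and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    fn(conn)
    conn.commit()
    return time.perf_counter() - start


def run_benchmark(old_rows, new_rows):
    """Load Y old rows, normalize them, then time the processing of X new rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_users (id INTEGER, emitted_at INTEGER, name TEXT)")
    conn.execute("CREATE TABLE users (id INTEGER, emitted_at INTEGER, name TEXT)")
    conn.execute("CREATE INDEX raw_users_emitted_at ON raw_users (emitted_at)")

    # Y old rows that are already normalized before the measurement starts.
    load_raw_rows(conn, 0, old_rows, emitted_at=1)
    normalize_full_refresh(conn)

    # X new rows, then time both strategies on the same data.
    load_raw_rows(conn, old_rows, new_rows, emitted_at=2)
    incremental_seconds = timed(normalize_incremental, conn)
    rows_after_incremental = conn.execute("SELECT count(*) FROM users").fetchone()[0]
    full_refresh_seconds = timed(normalize_full_refresh, conn)
    rows_after_full_refresh = conn.execute("SELECT count(*) FROM users").fetchone()[0]
    conn.close()

    return {
        "incremental_seconds": incremental_seconds,
        "full_refresh_seconds": full_refresh_seconds,
        "rows_after_incremental": rows_after_incremental,
        "rows_after_full_refresh": rows_after_full_refresh,
    }


if __name__ == "__main__":
    print(run_benchmark(old_rows=200_000, new_rows=1_000))
```

Both strategies should end with the same row count; the interesting number is how the incremental timing behaves as Y grows while X stays fixed.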

ChristopheDuong · 2021-11-19T10:26:36Z

As part of the normalization integration tests, we should track the running time of each model/query and record the results in a persistent destination.

By comparing each test run to the moving average of the last few runs, we should be able to "statistically" detect large variations and at least observe the overall progression of normalization running times for each query.
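One way this could be sketched, assuming timings are appended to a local JSON Lines file as the persistent destination. The file layout, the function names, the window of 5 runs, and the 1.5x tolerance are all illustrative assumptions, not values taken from the existing tests.

```python
import json
from pathlib import Path
from statistics import mean


def record_run(history_path, model, seconds):
    """Append one timing measurement for `model` as a JSON line."""
    with Path(history_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps({"model": model, "seconds": seconds}) + "\n")


def load_history(history_path, model):
    """Return the recorded timings for `model`, oldest first."""
    path = Path(history_path)
    if not path.exists():
        return []
    timings = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if entry["model"] == model:
                timings.append(entry["seconds"])
    return timings


def is_regression(previous, latest, window=5, tolerance=1.5):
    """Flag `latest` when it exceeds `tolerance` times the mean of the last `window` runs."""
    recent = previous[-window:]
    if not recent:
        # No history yet, so there is nothing to compare against.
        return False
    return latest > tolerance * mean(recent)


if __name__ == "__main__":
    history_file = "normalization_timings.jsonl"  # hypothetical location
    latest_seconds = 12.3  # would come from timing the model in the test run
    previous = load_history(history_file, "users")
    if is_regression(previous, latest_seconds):
        print("users: running time is well above the recent moving average")
    record_run(history_file, "users", latest_seconds)
```

Checking before recording keeps the latest run out of its own baseline. A fixed multiplier is the simplest possible rule; a threshold based on the standard deviation of recent runs would be a natural refinement once enough history exists.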

ChristopheDuong added type/enhancement New feature or request normalization testing labels Nov 8, 2021

ChristopheDuong mentioned this issue Nov 8, 2021

Minor fixes to incremental normalization and nesting #7669

Merged

sherifnada added the area/connectors Connector related issues label Nov 15, 2021

ChristopheDuong mentioned this issue Nov 19, 2021

🐌 Optimize incremental normalization runtime with snowflake #8088

Merged

jamakase mentioned this issue Dec 13, 2021

Jamkase/workspace slug #8734

Merged

bleonard added autoteam team/databases labels Apr 26, 2022

grishick added team/connectors-java and removed team/databases area/connectors Connector related issues labels Jun 30, 2022

grishick added the from/connector-ops label Sep 27, 2022

evantahler added the team/destinations Destinations team's backlog label Sep 29, 2022

evantahler closed this as completed Oct 3, 2022
