Bulk commit to DB and Cache IDs #162
Conversation
Codecov Report

Attention: additional details and impacted files

@@ Coverage Diff @@
## master #162 +/- ##
========================================
- Coverage 79.9% 79.0% -0.9%
========================================
Files 11 11
Lines 926 956 +30
Branches 145 155 +10
========================================
+ Hits 740 756 +16
- Misses 142 149 +7
- Partials 44 51 +7
for more information, see https://pre-commit.ci
Quality Gate passed. Kudos, no new issues were introduced! 0 new issues
Inserting data into the database was quite slow, especially because we were committing each individual
value of a diff-correlation metric in its own transaction. For every value we were also querying the
database for a dataset_id and a metric_id.

To speed up the process we now use a single session.commit() for each of the DataFrame metrics.
We also use cachetools on the dataset_id/metric_id lookup functions. It's much faster locally, and the
speedup is likely even larger when the database is remote.
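The two changes can be sketched roughly as follows. This is a minimal illustrative stand-in, not the PR's actual code: it uses sqlite3 and functools.lru_cache from the standard library in place of the project's SQLAlchemy session and cachetools, and the table/function names (`datasets`, `metric_values`, `dataset_id`, `bulk_insert_values`) are hypothetical. The point is the pattern: cache the id lookup so it hits the database once, and commit the whole batch in a single transaction instead of once per value.

```python
import sqlite3
from functools import lru_cache

# Hypothetical schema standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE metric_values (id INTEGER PRIMARY KEY, dataset_id INTEGER, value REAL)"
)
conn.execute("INSERT INTO datasets (name) VALUES ('demo')")
conn.commit()


@lru_cache(maxsize=None)  # stand-in for cachetools: query the id once, not per value
def dataset_id(name: str) -> int:
    row = conn.execute("SELECT id FROM datasets WHERE name = ?", (name,)).fetchone()
    return row[0]


def bulk_insert_values(name: str, values: list[float]) -> None:
    ds_id = dataset_id(name)  # cached after the first call
    conn.executemany(
        "INSERT INTO metric_values (dataset_id, value) VALUES (?, ?)",
        [(ds_id, v) for v in values],
    )
    conn.commit()  # one commit for the whole batch, not one per value


bulk_insert_values("demo", [0.1, 0.2, 0.3])
count = conn.execute("SELECT COUNT(*) FROM metric_values").fetchone()[0]
```

The per-value version would call `conn.commit()` (and run the id lookup query) inside the loop body; batching both is what removes the per-row round trips, which is why the win should be even larger against a remote database.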