Bulk commit to DB and Cache IDs #162

Merged
merged 11 commits into from
Feb 5, 2024
Conversation

@simonhkswan simonhkswan commented Feb 5, 2024

Inserting data into the database was quite slow, especially in a diff-correlation metric, where
we were committing each individual value to the database separately. For every value we were
also querying the database for a dataset_id and a metric_id.

To speed this up, we now use a single session.commit() for each of the DataFrame metrics.
We also use cachetools on the database/metric id functions. It's much faster locally, and the
speedup is likely even better when the database is remote.
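
A minimal sketch of the two ideas above, not the code from this PR. To keep it self-contained it uses the standard library's `sqlite3` and `functools.lru_cache` as stand-ins for the SQLAlchemy session and cachetools; the table layout and the names `get_metric_id`, `save_values`, and `lookups` are hypothetical.

```python
import sqlite3
from functools import lru_cache

# In-memory database with a made-up schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE metric (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE result (metric_id INTEGER, column_name TEXT, value REAL);
    """
)

lookups = 0  # counts how often the database is actually queried for an ID


@lru_cache(maxsize=128)
def get_metric_id(name: str) -> int:
    """Return the metric's ID, creating the row on first use (hypothetical helper)."""
    global lookups
    lookups += 1
    conn.execute("INSERT OR IGNORE INTO metric (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM metric WHERE name = ?", (name,)).fetchone()
    return row[0]


def save_values(metric_name: str, values: dict) -> None:
    """Insert every value for one metric, then commit once."""
    # The ID lookup is requested per value, but only the first call hits the database.
    rows = [(get_metric_id(metric_name), col, val) for col, val in values.items()]
    conn.executemany("INSERT INTO result VALUES (?, ?, ?)", rows)
    conn.commit()  # one commit for the whole batch, not one per value


save_values("diff_correlation", {"a|b": 0.1, "a|c": 0.3, "b|c": -0.2})
stored = conn.execute("SELECT COUNT(*) FROM result").fetchone()[0]
```

Here three values are stored with one ID query and one commit. Caching the IDs this way assumes they do not change while the process is running.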

@simonhkswan simonhkswan self-assigned this Feb 5, 2024
@simonhkswan simonhkswan added the pr:enhancement Improvement to existing features label Feb 5, 2024

codecov bot commented Feb 5, 2024

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (68fbadb) 79.9% compared to head (edb0d2e) 79.0%.

❗ Current head edb0d2e differs from pull request most recent head 0be0676. Consider uploading reports for the commit 0be0676 to get more accurate results

Additional details and impacted files


@@           Coverage Diff            @@
##           master    #162     +/-   ##
========================================
- Coverage    79.9%   79.0%   -0.9%     
========================================
  Files          11      11             
  Lines         926     956     +30     
  Branches      145     155     +10     
========================================
+ Hits          740     756     +16     
- Misses        142     149      +7     
- Partials       44      51      +7     
Files                                   Coverage Δ
src/insight/metrics/metrics_usage.py    94.1% <100.0%> (ø)
src/insight/metrics/base.py             72.1% <93.7%> (-2.7%) ⬇️
src/insight/database/utils.py           73.1% <71.4%> (-3.2%) ⬇️


sonarqubecloud bot commented Feb 5, 2024

@simonhkswan simonhkswan merged commit ae2bc77 into master Feb 5, 2024
3 checks passed
@simonhkswan simonhkswan deleted the db-connection-speedup branch February 5, 2024 14:17
Labels
pr:enhancement Improvement to existing features
2 participants