This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

Make add handle duplicate articles #591

Draft
wants to merge 5 commits into
base: master

Conversation

EmilieDel
Contributor

@EmilieDel EmilieDel commented Mar 7, 2022

Fixes #576.

Description

bbs_database add now filters the incoming articles and adds only those not already present in the database, using uid to detect duplicates.
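The filtering step described above could look roughly like the sketch below. It is not the code from this PR: the table name articles, its columns, and the helper filter_new_articles are hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


def filter_new_articles(conn, articles):
    """Return only the articles whose uid is not yet in the database."""
    # Hypothetical schema: a table `articles` with a text column `uid`.
    existing = {row[0] for row in conn.execute("SELECT uid FROM articles")}
    new_articles = []
    for article in articles:
        if article["uid"] not in existing:
            new_articles.append(article)
            # Remember the uid so duplicates inside the input are dropped too.
            existing.add(article["uid"])
    return new_articles


# Tiny demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (uid TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO articles VALUES ('a1', 'Already stored')")

parsed = [
    {"uid": "a1", "title": "Already stored"},
    {"uid": "b2", "title": "New article"},
]
new = filter_new_articles(conn, parsed)
print([a["uid"] for a in new])
```

Loading all existing uids into a set keeps the check simple; for a very large table, querying only the uids of the incoming articles may be preferable.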

Discussion

  • If there are no new articles (i.e. every uid already exists in the database), should we
    • raise an error,
    • or log that no articles were added because all of them are already present in the database, and return 0?
  • Currently, the database built in conftest.py is modeled on the covid-19 database schema. Should we create a new test database that represents our future database?
  • test_sql contains a test checking that there are no SQL queries outside of the sql module. We never updated this test after adding new modules. Should we do something about this?
  • Should we save the articles more often? Currently we parse everything and then save all articles at once. Should we save in batches, or save articles one by one?
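To make two of the options above concrete, here is a minimal sketch that saves articles in batches and, when nothing new was added, logs a message and returns 0 instead of raising. It is an illustration under assumptions, not this PR's implementation: the articles table, the add_in_batches helper, and the batch_size parameter are hypothetical, and duplicates are skipped with SQLite's INSERT OR IGNORE on a unique uid column rather than by pre-filtering.

```python
import logging
import sqlite3
from itertools import islice

logger = logging.getLogger(__name__)


def add_in_batches(conn, articles, batch_size=100):
    """Insert unseen articles batch by batch and return how many were added."""
    added = 0
    iterator = iter(articles)
    # Take up to `batch_size` articles at a time until the input is exhausted.
    while batch := list(islice(iterator, batch_size)):
        before = conn.total_changes
        # Rows whose uid violates the unique constraint are silently skipped.
        conn.executemany(
            "INSERT OR IGNORE INTO articles (uid, title) VALUES (?, ?)",
            [(a["uid"], a["title"]) for a in batch],
        )
        conn.commit()  # each batch is persisted before the next one starts
        added += conn.total_changes - before
    if added == 0:
        logger.warning("No articles added: all uids already exist in the database.")
    return added


# Tiny demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (uid TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO articles VALUES ('a1', 'Already stored')")

parsed = [
    {"uid": "a1", "title": "Already stored"},
    {"uid": "b2", "title": "New article"},
    {"uid": "c3", "title": "Another new article"},
]
first = add_in_batches(conn, parsed, batch_size=2)
second = add_in_batches(conn, parsed, batch_size=2)
```

Committing per batch means a failure late in a long run loses at most one batch of work, at the cost of more commits than a single save at the end.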

Checklist

  • This PR refers to an issue from the issue tracker.
    (if it is not the case, please create an issue first).
  • Unit tests added.
    (if needed)
  • Documentation and whatsnew.rst updated.
    (if needed)
  • setup.py and requirements.txt updated with new dependencies.
    (if needed)
  • Type annotations added.
    (if a function is added or modified)
  • All CI tests pass.

Development

Successfully merging this pull request may close these issues.

Make add handle duplicate articles
1 participant