Skip to content
This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

feature/add-abstract-to-paragraphs-table #639

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

drsantos89
Copy link
Contributor

@drsantos89 drsantos89 commented Sep 28, 2022

Fixes #637.

Description

Adds the abstract to the paragraphs table so it can be searchable using semantic search.

How to test?

Abstract field added to the tests/unit/entrypoint/database/test_add_es.py

Checklist

  • This PR refers to an issue from the issue tracker.
    (if it is not the case, please create an issue first).
  • Unit tests added.
    (if needed)
  • Documentation and whatsnew.rst updated.
    (if needed)
  • setup.py and requirements.txt updated with new dependencies.
    (if needed)
  • Type annotations added.
    (if a function is added or modified)
  • All CI tests pass.

@drsantos89 drsantos89 marked this pull request as ready for review September 28, 2022 15:09
@drsantos89 drsantos89 changed the title first commit feature/add-abstract-to-paragraphs-table Sep 28, 2022
for ppos, (section, text) in enumerate(article.section_paragraphs):
doc = {
"_index": "paragraphs",
"_source": {
"article_id": article.uid,
"section_name": section,
"section": section,
"text": text,
"paragraph_id": ppos,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, the goal of this paragraph_id is to hold a unique number (just an increment integer) for each paragraph. However, if I am not mistaken after introduction of the abstract paragraphs into the paragraphs table we might have collisions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, maybe we could put "paragraph_id": ppos + i + 1 to avoid any collisions. What do you think @drsantos89 @jankrepl ?

Copy link
Contributor

@EmilieDel EmilieDel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unify abstract and section paragraphs
3 participants