[CI Refactoring] Refactor Document fixtures in tests #2577
…-ai/haystack into refactor_test_documents_fixtures
LGTM! There just seems to be an unresolved merge conflict in `crawler.py`.
@@ -942,7 +955,7 @@ def adaptive_model_qa(num_processes):
 logging.error(f"Not all the subprocesses are closed! {len(children)} are still running.")


-@pytest.fixture(scope="function")
+@pytest.fixture
 def bert_base_squad2(request):
This is unrelated to this PR, but not sure about the naming of this fixture, given that the model used is not bert-base but minilm... 🤔
OMG 😄 Let's remember about this one!
test/nodes/test_label_generator.py
@@ -86,9 +89,9 @@ def test_pseudo_label_generator_using_question_document_pairs(
 @pytest.mark.parametrize("document_store", ["memory"], indirect=True)
 @pytest.mark.parametrize("retriever", ["embedding_sbert"], indirect=True)
 def test_pseudo_label_generator_using_question_document_pairs_batch(
-document_store: BaseDocumentStore, retriever: EmbeddingRetriever, tmp_path: Path
+document_store: BaseDocumentStore, retriever: EmbeddingRetriever,docs_with_true_emb: List[Document]
-document_store: BaseDocumentStore, retriever: EmbeddingRetriever,docs_with_true_emb: List[Document]
+document_store: BaseDocumentStore, retriever: EmbeddingRetriever, docs_with_true_emb: List[Document]
Problem:
- […] `conftest.py`.
- `conftest.py` included the entire definition of two sentence embeddings in the code, for about 1400 lines of "code" added to the file for no reason.
- `DOCS_WITH_EMBEDDINGS` […] `Document` class, while you could do the same in the original fixture.

Solution:
- […] `test/samples/embeddings`
- […] `DOCS_WITH_EMBEDDINGS` into the fixture `docs_with_true_emb` and refactor the related tests
- […] `test_docs_xs` into simply `docs` and ensure that all items of the returned list are `Document` objects
- […] `docs_all_formats` fixture for documents of legacy types (dictionaries mostly)
- […] `docs_with_random_emb` for tests that require documents with embeddings, but not exact embeddings
- Remove `scope="function"` fixture parameters (that's the default)
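
For readers who want a picture of the fixture layout described above, here is a minimal sketch. It is not the code from this PR. The `Document` dataclass is a hypothetical stand-in for the haystack class so the example runs on its own. `RAW_DOCS`, `build_docs` and `add_random_embeddings` are names invented for illustration. Only the fixture names `docs`, `docs_all_formats` and `docs_with_random_emb` come from the description. The `docs_with_true_emb` fixture is left out because the file layout under `test/samples/embeddings` is not described here.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

import pytest


# Hypothetical stand-in for haystack's Document class, so the sketch runs on its own.
@dataclass
class Document:
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    embedding: Optional[List[float]] = None


# Illustrative legacy-style inputs (plain dictionaries); not the PR's actual test data.
RAW_DOCS = [
    {"content": "My name is Carla and I live in Berlin", "meta": {"name": "filename1"}},
    {"content": "My name is Paul and I live in New York", "meta": {"name": "filename2"}},
]


def build_docs(raw: List[Dict[str, Any]]) -> List[Document]:
    """Convert every raw dictionary so the result only holds Document objects."""
    return [Document(content=d["content"], meta=dict(d.get("meta", {}))) for d in raw]


def add_random_embeddings(docs: List[Document], dim: int = 8, seed: int = 0) -> List[Document]:
    """Return copies with seeded random vectors, for tests that need some embedding but not an exact one."""
    rng = random.Random(seed)
    return [
        Document(content=d.content, meta=dict(d.meta), embedding=[rng.random() for _ in range(dim)])
        for d in docs
    ]


@pytest.fixture  # no scope argument: "function" is pytest's default scope
def docs_all_formats() -> List[Any]:
    # Assumed shape: a mix of legacy dictionaries and Document objects.
    return RAW_DOCS + build_docs(RAW_DOCS)


@pytest.fixture
def docs() -> List[Document]:
    return build_docs(RAW_DOCS)


@pytest.fixture
def docs_with_random_emb(docs: List[Document]) -> List[Document]:
    return add_random_embeddings(docs)
```

Keeping the conversion logic in plain helper functions means it can be checked without going through pytest's fixture machinery, while a test simply requests `docs` or `docs_with_random_emb` as an argument.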