
bug: fix MRR and MAP calculations #7841

Merged: 7 commits into main from fix-mrr-map-evaluators on Jun 25, 2024

Conversation

Amnah199
Contributor

Related Issues

Proposed Changes:

Fixed errors in MRREvaluator:

  • Corrected the MRR calculation logic (see the sketch after this list)
  • Added a break to exit the loop once the condition is fulfilled
  • Optimized the loops
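
For reference, the corrected MRR computation boils down to something like the sketch below. It is illustrative only: it works on plain strings instead of Haystack Document objects, and the function name is made up for the example.

def mean_reciprocal_rank(ground_truth_contents, retrieved_contents):
    """One list of document contents per question; returns the mean reciprocal rank."""
    individual_scores = []
    for ground_truth, retrieved in zip(ground_truth_contents, retrieved_contents):
        score = 0.0
        for rank, content in enumerate(retrieved):
            if content in ground_truth:
                score = 1 / (rank + 1)
                break  # stop at the first relevant document
        individual_scores.append(score)
    return sum(individual_scores) / len(ground_truth_contents)

# gold ["A"], retrieved ["B", "A", "C"] -> first hit at rank 2 -> 1/2
print(mean_reciprocal_rank([["A"]], [["B", "A", "C"]]))  # 0.5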

Fixed errors in MAPEvaluator:

  • Corrected the MAP calculation logic (see the sketch after this list)
  • Optimized the loops
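
Similarly, the corrected MAP computation follows this sketch (again on plain strings, illustrative only; it mirrors the per-query average-precision logic discussed in the review below):

def mean_average_precision(ground_truth_contents, retrieved_contents):
    """One list of document contents per question; returns the mean average precision."""
    individual_scores = []
    for ground_truth, retrieved in zip(ground_truth_contents, retrieved_contents):
        relevant_documents = 0
        average_precision_numerator = 0.0
        for rank, content in enumerate(retrieved):
            if content in ground_truth:
                relevant_documents += 1
                # precision@k for this hit, where k = rank + 1
                average_precision_numerator += relevant_documents / (rank + 1)
        average_precision = 0.0
        if relevant_documents > 0:
            average_precision = average_precision_numerator / relevant_documents
        individual_scores.append(average_precision)
    return sum(individual_scores) / len(ground_truth_contents)

# gold ["A", "C"], retrieved ["A", "B", "C"] -> (1/1 + 2/3) / 2 = 0.8333...
print(mean_average_precision([["A", "C"]], [["A", "B", "C"]]))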

Fixed errors in the pytest file test_document_map.py:

  • Corrected the expected values in test_run_with_complex_data()

How did you test it?

Ran unit tests for both evaluators

Notes for the reviewer

Verify the correctness of the calculated scores for an example.
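
For instance, a spot check along these lines should reproduce a hand-computed value (the module path, run() signature, and output shape are assumed from the files and variable names touched in this PR, so treat it as a sketch rather than guaranteed API):

from haystack import Document
from haystack.components.evaluators.document_mrr import DocumentMRREvaluator

evaluator = DocumentMRREvaluator()
result = evaluator.run(
    ground_truth_documents=[[Document(content="Paris")]],
    retrieved_documents=[[Document(content="Berlin"), Document(content="Paris")]],
)
# The first relevant document sits at rank 2, so the expected MRR is 1/2 = 0.5.
print(result)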

Checklist

@Amnah199 Amnah199 requested a review from a team as a code owner June 11, 2024 14:06
@Amnah199 Amnah199 requested review from masci and removed request for a team June 11, 2024 14:06
@Amnah199 Amnah199 added the ignore-for-release-notes label (PRs with this flag won't be included in the release notes) Jun 11, 2024
@coveralls
Collaborator

coveralls commented Jun 11, 2024

Pull Request Test Coverage Report for Build 9467114407

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 5 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.002%) to 89.801%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/evaluators/document_map.py | 2 | 93.33% |
| components/evaluators/document_mrr.py | 3 | 90.0% |
Totals Coverage Status
Change from base Build 9451690597: -0.002%
Covered Lines: 6850
Relevant Lines: 7628

💛 - Coveralls

@Amnah199 Amnah199 requested a review from a team as a code owner June 11, 2024 15:26
@Amnah199 Amnah199 requested review from dfokina and removed request for a team June 11, 2024 15:26
@github-actions github-actions bot added the type:documentation label (Improvements on the docs) Jun 11, 2024
@Amnah199 Amnah199 removed the ignore-for-release-notes label Jun 11, 2024
@coveralls
Collaborator

coveralls commented Jun 11, 2024

Pull Request Test Coverage Report for Build 9468451282

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 56 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.01%) to 89.813%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/evaluators/document_map.py | 2 | 93.33% |
| components/evaluators/document_mrr.py | 3 | 90.0% |
| core/pipeline/pipeline.py | 51 | 65.48% |
Totals Coverage Status
Change from base Build 9451690597: 0.01%
Covered Lines: 6859
Relevant Lines: 7637

💛 - Coveralls

@ju-gu
Member

ju-gu commented Jun 12, 2024

Thanks for taking care of this! I just double-checked some examples against the results from v1 and it looks fine :)

@coveralls
Collaborator

coveralls commented Jun 14, 2024

Pull Request Test Coverage Report for Build 9515678439

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 55 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.2%) to 89.645%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/evaluators/document_map.py | 2 | 93.33% |
| components/evaluators/document_mrr.py | 2 | 92.31% |
| core/pipeline/pipeline.py | 51 | 65.48% |
Totals Coverage Status
Change from base Build 9451690597: -0.2%
Covered Lines: 6900
Relevant Lines: 7697

💛 - Coveralls

masci
masci previously requested changes Jun 14, 2024
Contributor

@masci masci left a comment

You can simplify the code a bit by using list comprehensions.
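
For illustration, this is the kind of simplification meant here, using a minimal stand-in for the Document dataclass; the comprehension form is what the evaluators end up using:

from dataclasses import dataclass
from typing import Optional

@dataclass
class Document:  # minimal stand-in for haystack's Document
    content: Optional[str] = None

ground_truth = [Document("A"), Document(None), Document("C")]

# Explicit loop:
contents = []
for doc in ground_truth:
    if doc.content is not None:
        contents.append(doc.content)

# Equivalent list comprehension:
contents = [doc.content for doc in ground_truth if doc.content is not None]
print(contents)  # ['A', 'C']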

haystack/components/evaluators/document_map.py
haystack/components/evaluators/document_mrr.py
@masci masci requested a review from anakin87 June 21, 2024 11:23
@masci masci dismissed their stale review June 21, 2024 11:24

handing over to @anakin87

@coveralls
Collaborator

coveralls commented Jun 21, 2024

Pull Request Test Coverage Report for Build 9615764716

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 298 unchanged lines in 44 files lost coverage.
  • Overall coverage increased (+0.2%) to 89.968%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/builders/answer_builder.py | 1 | 98.21% |
| components/builders/chat_prompt_builder.py | 1 | 98.41% |
| components/converters/utils.py | 1 | 95.24% |
| components/evaluators/document_map.py | 1 | 96.15% |
| components/evaluators/document_mrr.py | 1 | 95.45% |
| components/preprocessors/document_cleaner.py | 1 | 98.82% |
| components/preprocessors/document_splitter.py | 1 | 98.63% |
| components/routers/metadata_router.py | 1 | 95.65% |
| components/websearch/searchapi.py | 1 | 96.3% |
| components/converters/tika.py | 2 | 91.18% |
Totals Coverage Status
Change from base Build 9451690597: 0.2%
Covered Lines: 6717
Relevant Lines: 7466

💛 - Coveralls

Member

@anakin87 anakin87 left a comment

I still found opportunities for improvement that could make our lives easier in the future (I hope)

-if ground_document.content in retrieved_document.content:
-    score = 1 / (rank + 1)
-    break
+ground_truth_content = [doc.content for doc in ground_truth if doc.content is not None]
Member

Suggested change
-ground_truth_content = [doc.content for doc in ground_truth if doc.content is not None]
+ground_truth_contents = [doc.content for doc in ground_truth if doc.content is not None]

Comment on lines 72 to 87

             average_precision = 0.0
             relevant_documents = 0

-            for rank, retrieved_document in enumerate(retrieved):
-                if retrieved_document.content is None:
-                    continue
-
-                if ground_document.content in retrieved_document.content:
-                    relevant_documents += 1
-                    average_precision += relevant_documents / (rank + 1)
-            if relevant_documents > 0:
-                score = average_precision / relevant_documents
+            ground_truth_content = [doc.content for doc in ground_truth if doc.content is not None]
+            for rank, retrieved_document in enumerate(retrieved):
+                if retrieved_document.content is None:
+                    continue
+
+                if retrieved_document.content in ground_truth_content:
+                    relevant_documents += 1
+                    average_precision += relevant_documents / (rank + 1)
+            if relevant_documents > 0:
+                score = average_precision / relevant_documents
             individual_scores.append(score)

-        score = sum(individual_scores) / len(retrieved_documents)
+        score = sum(individual_scores) / len(ground_truth_documents)
Member

Not your fault but I had a hard time understanding the algorithm.

  • I think we could benefit from a better choice of variable names.
  • Adding a comment in the code with the formula/link from which we drew inspiration could also help

Resources I took inspiration from: #1, #2

Something like this:

            average_precision_numerator = 0.0
            relevant_documents = 0

            ground_truth_contents = [doc.content for doc in ground_truth if doc.content is not None]
            for rank, retrieved_document in enumerate(retrieved):
                if retrieved_document.content is None:
                    continue

                if retrieved_document.content in ground_truth_contents:
                    relevant_documents += 1
                    precision_at_k = relevant_documents / (rank + 1)
                    average_precision_numerator += precision_at_k

            average_precision = 0
            if relevant_documents > 0:
                average_precision = average_precision_numerator / relevant_documents
            individual_scores.append(average_precision)

        score = sum(individual_scores) / len(ground_truth_documents)

(untested)

WDYT?

Contributor Author

@Amnah199 Amnah199 Jun 24, 2024

Agreed, I also struggled to understand the function initially. More descriptive variable names would help.

Contributor Author

Also, I think precision_at_k is unnecessary and would just be extra storage?

Member

I agree that it is not necessary, but in my opinion it makes the code more readable.

Feel free to make any changes you think appropriate.
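
For context, the two styles differ only in a temporary float, so the choice is purely about readability (illustrative values, not code from the PR):

relevant_documents, rank = 3, 4  # example values for one loop iteration

# With the intermediate variable (arguably more readable):
precision_at_k = relevant_documents / (rank + 1)
print(precision_at_k)  # 0.6

# Without it (same result; the temporary costs a single float, which is negligible):
print(relevant_documents / (rank + 1))  # 0.6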

@coveralls
Collaborator

coveralls commented Jun 24, 2024

Pull Request Test Coverage Report for Build 9648193977

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 298 unchanged lines in 44 files lost coverage.
  • Overall coverage increased (+0.2%) to 89.968%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/builders/answer_builder.py | 1 | 98.21% |
| components/builders/chat_prompt_builder.py | 1 | 98.41% |
| components/converters/utils.py | 1 | 95.24% |
| components/evaluators/document_map.py | 1 | 96.15% |
| components/evaluators/document_mrr.py | 1 | 95.45% |
| components/preprocessors/document_cleaner.py | 1 | 98.82% |
| components/preprocessors/document_splitter.py | 1 | 98.63% |
| components/routers/metadata_router.py | 1 | 95.65% |
| components/websearch/searchapi.py | 1 | 96.3% |
| components/converters/tika.py | 2 | 91.18% |
Totals Coverage Status
Change from base Build 9451690597: 0.2%
Covered Lines: 6717
Relevant Lines: 7466

💛 - Coveralls

Member

@anakin87 anakin87 left a comment

Great!

Please incorporate these small suggestions and then merge.

@@ -10,7 +10,7 @@
 @component
 class DocumentMAPEvaluator:
     """
-    A Mean Average Precision (MAP) evaluator for documents.
+    A Mean Average Precision (MAP) evaluator for documents. For details, please refer to the [resource](https://www.pinecone.io/learn/offline-evaluation/).
Member

Suggested change
-    A Mean Average Precision (MAP) evaluator for documents. For details, please refer to the [resource](https://www.pinecone.io/learn/offline-evaluation/).
+    A Mean Average Precision (MAP) evaluator for documents.

I would not put this link in the docstring, but I would put it as a comment at the beginning of the run method code to help those who will be working with the code in the future.
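
For illustration, the suggested placement would look roughly like this (signature abbreviated and body elided; a sketch, not the actual file contents):

from haystack import component


@component
class DocumentMAPEvaluator:
    """
    A Mean Average Precision (MAP) evaluator for documents.
    """

    def run(self, ground_truth_documents, retrieved_documents):
        # Metric definition and terminology: https://www.pinecone.io/learn/offline-evaluation/
        ...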

@@ -10,7 +10,7 @@
 @component
 class DocumentMRREvaluator:
     """
-    Evaluator that calculates the mean reciprocal rank of the retrieved documents.
+    Evaluator that calculates the mean reciprocal rank of the retrieved documents. For details, please refer to the [resource](https://www.pinecone.io/learn/offline-evaluation/).
Member

Suggested change
-    Evaluator that calculates the mean reciprocal rank of the retrieved documents. For details, please refer to the [resource](https://www.pinecone.io/learn/offline-evaluation/).
+    Evaluator that calculates the mean reciprocal rank of the retrieved documents.

Same as above

@coveralls
Collaborator

coveralls commented Jun 25, 2024

Pull Request Test Coverage Report for Build 9659678165

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 298 unchanged lines in 44 files lost coverage.
  • Overall coverage increased (+0.2%) to 89.968%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/builders/answer_builder.py | 1 | 98.21% |
| components/builders/chat_prompt_builder.py | 1 | 98.41% |
| components/converters/utils.py | 1 | 95.24% |
| components/evaluators/document_map.py | 1 | 96.15% |
| components/evaluators/document_mrr.py | 1 | 95.45% |
| components/preprocessors/document_cleaner.py | 1 | 98.82% |
| components/preprocessors/document_splitter.py | 1 | 98.63% |
| components/routers/metadata_router.py | 1 | 95.65% |
| components/websearch/searchapi.py | 1 | 96.3% |
| components/converters/tika.py | 2 | 91.18% |
Totals Coverage Status
Change from base Build 9451690597: 0.2%
Covered Lines: 6717
Relevant Lines: 7466

💛 - Coveralls

@coveralls
Collaborator

coveralls commented Jun 25, 2024

Pull Request Test Coverage Report for Build 9659996000

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 297 unchanged lines in 43 files lost coverage.
  • Overall coverage increased (+0.2%) to 89.968%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/builders/answer_builder.py | 1 | 98.21% |
| components/builders/chat_prompt_builder.py | 1 | 98.41% |
| components/converters/utils.py | 1 | 95.24% |
| components/evaluators/document_map.py | 1 | 96.15% |
| components/preprocessors/document_cleaner.py | 1 | 98.82% |
| components/preprocessors/document_splitter.py | 1 | 98.63% |
| components/routers/metadata_router.py | 1 | 95.65% |
| components/websearch/searchapi.py | 1 | 96.3% |
| components/converters/tika.py | 2 | 91.18% |
| components/converters/txt.py | 2 | 90.0% |
Totals Coverage Status
Change from base Build 9451690597: 0.2%
Covered Lines: 6717
Relevant Lines: 7466

💛 - Coveralls

@Amnah199 Amnah199 merged commit fc011d7 into main Jun 25, 2024
17 checks passed
@Amnah199 Amnah199 deleted the fix-mrr-map-evaluators branch June 25, 2024 10:07
Labels
topic:tests, type:documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MAP and MRR wrong for multiple gold documents
5 participants