Subsume should succeed even when the tuple is not present #487

Merged
saulshanabrook merged 5 commits into main from yihozhang-fix-subsume2
Feb 3, 2025

Conversation

yihozhang
Collaborator

Subsumes #478 and fixes #462

@yihozhang yihozhang requested a review from a team as a code owner December 10, 2024 04:32
@yihozhang yihozhang requested review from FTRobbin and saulshanabrook and removed request for a team December 10, 2024 04:32

codspeed-hq bot commented Dec 10, 2024

CodSpeed Performance Report

Merging #487 will not alter performance

Comparing yihozhang-fix-subsume2 (3ad8d7d) with yihozhang-fix-subsume2 (2f3a6c3)

Summary

✅ 9 untouched benchmarks
🆕 1 new benchmark

Benchmarks breakdown

Benchmark          BASE    HEAD      Change
🆕 rat-pow-eval    N/A     2.5 ms    N/A

tests/integration_test.rs
@@ -430,7 +430,7 @@ fn test_serialize_subsume_status() {
             None,
             egglog::SerializedNode::Function {
                 name: ("a").into(),
-                offset: 0,
+                offset: 1,
Member

Why are there two values for the a function now? They are equivalent, right?

Collaborator Author

The new implementation conceptually makes a new entry to the database with the new timestamp and marks the old entry as stale.

The old implementation modified the old entry in place, which is bad because the database is supposed to be append-only, I think.

Member

I think I tried to model it on delete, if that makes sense, which does modify a row. I'm not saying that's right, I'm just curious what you think.

Collaborator Author

@yihozhang yihozhang Feb 2, 2025

I think remove is doing the following:

  • Delete the entry from the offset table
  • Mark the actual tuple in the (append-only) log as stale (self.vals[*off].0.stale_at = ts)

insert_and_merge (which is the safe way of updating the table) does the following:

  • Mark the old tuple in the log as stale
  • Append the new tuple to the log
  • Update the offset table entry to point to the new tuple

The old subsume implementation, by contrast, does the following:

  • Find the old tuple according to the offset table
  • Update its subsumed flag

So I think the old subsume implementation is unsafe, while remove and insert_and_merge are safe.
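To make the comparison concrete, here is a minimal sketch of the three behaviours described in this comment. It is a toy model, not egglog's actual table code: the Row and Table types, the offsets field, and the subsume_in_place and subsume_append names are all hypothetical, and only vals, stale_at, remove, and insert_and_merge echo names used in this thread. It also does not model the case from the PR title where the tuple is absent.

```rust
use std::collections::HashMap;

/// One tuple in the log. Hypothetical, simplified layout.
#[derive(Debug)]
struct Row {
    key: Vec<u32>,
    value: u32,
    subsumed: bool,
    live_at: u32,
    stale_at: u32, // u32::MAX means "still live"
}

/// Toy table: an append-only log plus an offset table (both names assumed).
#[derive(Default)]
struct Table {
    vals: Vec<Row>,                    // append-only log
    offsets: HashMap<Vec<u32>, usize>, // key -> offset of the live row
}

impl Table {
    /// Mark the old row stale, append the new row, repoint the offset entry.
    fn insert_and_merge(&mut self, key: Vec<u32>, value: u32, subsumed: bool, ts: u32) -> usize {
        if let Some(&off) = self.offsets.get(&key) {
            self.vals[off].stale_at = ts;
        }
        let new_off = self.vals.len();
        self.vals.push(Row {
            key: key.clone(),
            value,
            subsumed,
            live_at: ts,
            stale_at: u32::MAX,
        });
        self.offsets.insert(key, new_off);
        new_off
    }

    /// Drop the offset entry and mark the logged row stale.
    fn remove(&mut self, key: &[u32], ts: u32) -> bool {
        match self.offsets.remove(key) {
            Some(off) => {
                self.vals[off].stale_at = ts;
                true
            }
            None => false,
        }
    }

    /// Old-style subsume: flips the flag on the existing row in place,
    /// so the log is mutated without any timestamp recording the change.
    fn subsume_in_place(&mut self, key: &[u32]) -> bool {
        match self.offsets.get(key) {
            Some(&off) => {
                self.vals[off].subsumed = true;
                true
            }
            None => false,
        }
    }

    /// Append-style subsume: re-insert the same value with the flag set.
    /// Returns None for an absent key; that case is deliberately not modeled.
    fn subsume_append(&mut self, key: &[u32], ts: u32) -> Option<usize> {
        let off = *self.offsets.get(key)?;
        let value = self.vals[off].value;
        Some(self.insert_and_merge(key.to_vec(), value, true, ts))
    }
}
```

In this model, subsuming a row that sits at offset 0 through subsume_append leaves the original row in the log (now stale) and puts the subsumed copy at offset 1, which is consistent with the offset change in the test diff above. subsume_in_place keeps a single row at offset 0 and rewrites it.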

Member

Thank you for fixing this and looking into it so closely!!

@saulshanabrook
Member

Whoops, sorry @yihozhang, I just realized my review comments had been sitting here pending and unpublished for two weeks! Didn't mean to leave you hanging.

@yihozhang yihozhang marked this pull request as draft January 29, 2025 20:21
@yihozhang yihozhang removed the request for review from FTRobbin January 31, 2025 01:13
@yihozhang yihozhang marked this pull request as ready for review January 31, 2025 01:14
@saulshanabrook saulshanabrook merged commit 9e6ecb6 into main Feb 3, 2025
9 checks passed
Development

Successfully merging this pull request may close these issues.

Subsumption confusing semantics
2 participants