-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Legacy Indexing Overhaul #42
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass review
It's unfortunate that this duplicates some work from #40 and has similar problems that PR addresses. This PR seems less complex and less complexity would be my preference though (apologies @SNeedlewoods :/)
@@ -72,14 +72,15 @@ static bool try_view_scan_legacy_enote_v1(const rct::key &legacy_base_spend_pubk | |||
const std::uint64_t block_index, | |||
const std::uint64_t block_timestamp, | |||
const rct::key &transaction_id, | |||
const std::uint64_t total_enotes_before_tx, | |||
const std::function<std::uint64_t(rct::xmr_amount)> &total_enotes_before_tx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be simpler to just pass the enote_ledger_index
here and get rid of this std::function
, and also get rid of enote_tx_index_by_amount_inout
. This function param is going to be a little annoying to re-implement in the scanner.
Using enote_tx_index_by_amount_inout
like this also conflicts with #43 since the same enote can be re-scanned with different tx pub keys. Seems simpler to just get rid of that by going with a simple enote_ledger_index
passed into the function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total_enotes_before_tx
doesn't just return one result for the whole tx, it will return different results based on which amounts you pass to it. We could pass a map, maybe? That would require we pre-scan the tx to look for the ledger indexing amounts in the enotes before calling this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could pass a map, maybe?
Was suggesting pass a vector of enote_ledger_index
's into try_find_legacy_enotes_in_tx
over here
Then when iterating over enotes, reference the specific enote_ledger_index
to pass into try_view_scan_legacy_enote_v1
But a map is fine too
require we pre-scan the tx to look for the ledger indexing amounts in the enotes before calling this function
The daemon currently returns a vector of global output indices per tx (source); the respective amount is inferred from the tx output. So technically "pre-scanning" is already done on the server-side today
contextual_record_out.origin_context = | ||
SpEnoteOriginContextV1{ | ||
.block_index = block_index, | ||
.block_timestamp = block_timestamp, | ||
.transaction_id = transaction_id, | ||
.enote_tx_index = enote_index, | ||
.enote_ledger_index = total_enotes_before_tx + enote_index, | ||
.enote_ledger_index = enote_ledger_index, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This messes up the is_older_than
implementation when comparing SpEnoteOriginContextV1
's enote_ledger_index
's
enotes with different transparent amounts can't be compared using this to determine age
Same as this comment: #40 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal for is_older_than
: add tx_index_in_block
to SpEnoteOriginContextV1
With this, we can then determine is_older_than
for SpEnoteOriginContextV1
using the block_index
, tx_index_in_block
, and enote_tx_index
Would be useful not only to guarantee correctness of is_older_than
, but also to guarantee a consistent order between the enote store and a wallet2 m_transfers
container. The latter is marginally useful for populating a wallet2 instance from the enote store.
Potentially worth a separate PR
@@ -784,7 +821,7 @@ void MockLedgerContext::get_onchain_chunk_legacy(const std::uint64_t chunk_start | |||
total_output_count_before_tx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A vector of the enote ledger indexes would be much appreciated here instead of total_output_count_before_tx
. This function just seems like a pain
if (variant.is_type<LegacyEnoteV1>()) | ||
return variant.unwrap<LegacyEnoteV1>().amount; | ||
else | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An unfortunate caveat here: LegacyEnoteV4
can either be pre-RCT or coinbase. This needs to return the amount for the former and 0 for the latter
EDIT: so we don't forget in this PR, I forgot to mention here LegacyEnoteV1
can also either be pre-RCT or post-RCT pre-view tag coinbase. Also needs amount for the former and 0 for the latter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I updated the PR with changes shared from PR #40, and together they tackle the problem nicely.
/// on-chain indices of the proof's ring members (serializable as index offsets) | ||
std::vector<std::uint64_t> reference_set_COMPACT; | ||
/// on-chain indices of the proof's ring members | ||
std::vector<std::pair<std::uint64_t, std::uint64_t>> reference_set; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: the reason you got rid of compaction is because this LegacyRingSignatureV4
could have {pre-RCT, RCT} enotes in the same ring, right? If instead each ring was tied to a single ledger_indexing_amount
, then it would be trivial to just keep the same reference_set_COMPACT
as is and add a field for the ledger_indexing_amount
IIUC I think this is fine as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly I got rid of compaction because I'm overhauling serialization in #39 anyways. Here's how I would do compaction:
- Order pairs by increasing amounts
- Order each pair for a given amount by increasing index number
- Decumulate the list of unique amounts
- Decumulate the list of index numbers for a given amount
- Store number of unique amounts, then entries of amount offset and a list of index number offets
For example, if your reference set entries (a_j, i_k)
is {(0, 0}, (0, 5), (0, 8), (1, 4), (1, 5), (3, 0), (3, 10)}
, the encoding could look like so:
3 // number of unique amounts
0 // base amount
3 // number of indexes for amount '0'
0 // base index for amount '0'
5 // index offset
3 // index offset
1 // amount offset
2 // number of indexes for amount '1'
4 // base index for amount '1'
1 // index offset
2 // amount offset
2 // number of indexes for amount '3'
0 // base index for amount '3'
10 // index offset
You can see how this can get pretty compact if there are not many unique ledger indexing amounts. For example, if we want to specify the first 20 RingCT enotes for our reference set (ledger indexing amount is 0), then our encoding would be: 1 0 20 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
.
Edit: It's in this PR now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely important to include this, with good tests. For follow-up PR.
1fe2d30
to
5dd9885
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very glad to see how much cleaner this is.
Apart from the few minor things I commented on, this looks good to me and I approve this PR.
Thanks @SNeedlewoods ! |
104d91b
to
8bd201b
Compare
Rebased and added compact legacy reference set serialization. See seraphis_serialization.h for the changes |
8bd201b
to
f26ce71
Compare
Can you get it to pass CI? |
8c13298
to
e9b159a
Compare
Rebased to integrate with the async wallet scanner changes |
Windows build and MacOS builds are broken because of wrong Boost versions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, one comment then merge!
This PR removes "universal"-style indexing for legacy CLSAG rings, and replaces it with a reference set scheme that uses (amount, index in amount) indexing pairs to reference on-chain enotes. This is the same method that Cryptonote txs use, and is how the current Monero Core LMDB database is referenced. Doing things this way means that the database will not have to be re-indexed, saving at a very minimum 1.6 GB (100M on-chain enotes * (16 bytes for extra table keys)) of storage space, and an expensive database migration involving moving all existing enote data to a new table. We change the MockLedgerContext to support this indexing scheme. In practice, serialized txs under this method shouldn't take up much more space than pre-PR if compressed clever-ly, and assuming most ring members will RingCT enotes. We also add LegacyEnoteOriginContext for contextualized enote records so we can better keep tracked of scanned legacy enotes under the legacy indexing scheme. Co-authored-by: SNeedlewoods <sneedlewoods_1@protonmail.com>
e9b159a
to
9e9aa1b
Compare
This PR removes "universal"-style indexing for legacy CLSAG rings, and replaces it with a reference set scheme that uses (amount, index in amount) indexing pairs to reference on-chain enotes. This is the same method that Cryptonote txs use, and is how the current Monero Core LMDB database is referenced. Doing things this way means that the database will not have to be re-indexed, saving at a very minimum 1.6 GB (100M on-chain enotes * (16 bytes for extra table keys)) of storage space, and an expensive database migration involving moving all existing enote data to a new table. We change the MockLedgerContext to support this indexing scheme. In practice, serialized txs under this method shouldn't take up much more space than pre-PR if compressed clever-ly, and assuming most ring members will RingCT enotes. We also add LegacyEnoteOriginContext for contextualized enote records so we can better keep tracked of scanned legacy enotes under the legacy indexing scheme. Co-authored-by: SNeedlewoods <sneedlewoods_1@protonmail.com>
This PR removes "universal"-style indexing for legacy CLSAG rings, and replaces it with a reference set scheme that uses
(amount, index in amount) indexing pairs to reference on-chain enotes. This is the same method that Cryptonote txs use, and is how the current Monero Core LMDB database is referenced. Doing things this way means that the database will not have to be re-indexed, saving at a very minimum 1.6 GB (100M on-chain enotes * (16 bytes for extra table keys)) of storage space, and an expensive database migration involving moving all existing enote data to a new table. We change the MockLedgerContext to support this indexing scheme.
In practice, serialized txs under this method shouldn't take up much more space than pre-PR if compressed clever-ly, and assuming most ring members will RingCT enotes.
We also add LegacyEnoteOriginContext for contextualized enote records so we can better keep tracked of scanned legacy enotes under the legacy indexing scheme.