
Initial chain access implementation #9

Closed

Conversation


@tnull tnull commented Sep 1, 2022

Based on #8.

This PR adds an initial implementation for LdkLiteChainAccess, which provides the necessary interfaces for accessing chain data.

@TheBlueMatt TheBlueMatt left a comment

Mostly questions.

src/access.rs Outdated
let mut unconfirmed_registered_txs = Vec::new();

for txid in registered_txs {
	if let Some(tx_status) = client.get_tx_status(&txid)? {

What are each of these calls - are they hitting a server somewhere, or are they hitting a local cache? When does that cache get updated? How do we add new entries to the cache?

Collaborator Author

Yep, they are all just requests directly hitting the Esplora server for now. While these calls are fairly specific to getting the current status of a transaction, we may want to look into using the BDK wallet cache to retrieve some data when we tackle #3.

Oof, so this is gonna be a really slow function? There's a lot of serial requests. Is there some way to speed that up?

Collaborator Author

@tnull tnull Sep 15, 2022

Well, Esplora doesn't really support any kind of batching, but we could at least make some of the functionality run concurrently. As it stands, I grouped the method into three scopes anyway (essentially corresponding to best_block_updated, transactions_confirmed, and transaction_unconfirmed). Could these three run in separate threads without an issue? The Confirm::transactions_confirmed docs say: "However, in the event of a chain reorganization, it must not be called with a header that is no longer in the chain as of the last call to best_block_updated." But how bad would it be to let all three run concurrently rather than in order, especially if we check again for updates at the end of the method?

Also, I could take the road of BDK's wallet sync and spawn a number of threads to allow for concurrent client requests, e.g., to get_tx_status.

I understood that we were doing things async, does BDK not support rust-async esplora requests? Spinning up more threads probably isn't worth it, but if we can do parallel requests by just polling multiple futures together that likely is.

More generally, I wonder seriously about what happens if we get a reorg in the middle of this function, or between calls for a single transaction. I didn't carefully review it, but at first glance I'm a bit dubious that we meet the LDK requirements of topological sorting of transactions (and block-disconnections) in such a case.
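To illustrate the "polling multiple futures together" idea, here is a minimal sketch (not the PR's code; the client type and the Option-returning get_tx_status signature are assumptions based on the snippet quoted at the top of this thread):

use bitcoin::Txid;
use esplora_client::{AsyncClient, Error, TxStatus};
use futures::future::join_all;

// Issue one status lookup per registered txid and poll them together so the
// HTTP requests overlap instead of running back-to-back.
async fn fetch_statuses(
	client: &AsyncClient,
	registered_txs: &[Txid],
) -> Result<Vec<(Txid, Option<TxStatus>)>, Error> {
	let lookups = registered_txs.iter().map(|txid| async move {
		// Assumed to mirror the blocking call above, i.e. `None` means the
		// server does not know the transaction (yet).
		let status = client.get_tx_status(txid).await?;
		Ok::<_, Error>((*txid, status))
	});
	// `join_all` drives all lookups concurrently; collecting into a `Result`
	// surfaces the first error, if any.
	join_all(lookups).await.into_iter().collect()
}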

Collaborator Author

> I understood that we were doing things async, does BDK not support rust-async esplora requests? Spinning up more threads probably isn't worth it, but if we can do parallel requests by just polling multiple futures together that likely is.

No, currently all is blocking. rust-esplora-client supports async, but using this would complicate things quite a bit, IIUC. For example, we'd need to run LdkLiteChainAccess in its dedicated thread and then start communicating with it, e.g., via message passing. Happy to explore this further in a follow-up, but to keep things a bit more simple, I'd like to keep it blocking for now.

> More generally, I wonder seriously about what happens if we get a reorg in the middle of this function, or between calls for a single transaction. I didn't carefully review it, but at first glance I'm a bit dubious that we meet the LDK requirements of topological sorting of transactions (and block-disconnections) in such a case.

Mh, are there any (undocumented) requirements that I'm not aware of? I tried to consider the following:

  1. Notify of new blocks; if there are intermediary blocks, feeding just the newest one is fine.
  2. Notify of newly confirmed transactions, ensuring inter- and intra-block ordering.
  3. Notify of newly unconfirmed transactions.

> No, currently all is blocking. rust-esplora-client supports async, but using this would complicate things quite a bit, IIUC. For example, we'd need to run LdkLiteChainAccess in its dedicated thread and then start communicating with it, e.g., via message passing. Happy to explore this further in a follow-up, but to keep things a bit more simple, I'd like to keep it blocking for now.

Yea, we can do that as a follow-up, but I don't see why we're building blocking things to begin with here. If we want to end up async, ISTM we should just start there, since converting everything later is a pain. The nice thing with async is we don't have to spawn a new thread for everything, which ends up with a lot of overhead. I'm not sure why we'd have more message passing with an async implementation in LDKLite than with a sync one - most calls would be correctly done in a sync context.

Collaborator Author

Alright, now went async with 35bad47.

@tnull tnull force-pushed the 2022-09-initial-chain-access branch 4 times, most recently from 7182aca to 553d966 on September 19, 2022 16:50

tnull commented Sep 19, 2022

Rebased on main.

src/access.rs Outdated
}
}

// TODO: check whether new outputs have been registered by now and process them
Contributor

Is this required?
Won't it just sync in the next sync call?

Collaborator Author

Well, if I understand @TheBlueMatt's concerns correctly, double-checking if something changed while we were running this is the least we can do here?

Contributor

IMO, this shouldn't be required.
If we called sync from outside this chain-access interface, we synced according to the txs and outputs available at that point in time.
When we release the lock, another set of txs and outputs can be added, but that should be handled in the next call. We cannot make sure to sync everything at this place.

Ok(())
}

pub(crate) async fn sync(&self, confirmables: Vec<&(dyn Confirm + Sync)>) -> Result<(), Error> {
Contributor

Just for my understanding: the top-level function will eventually block on this, right?
And that happens before starting the next sync call?

Collaborator Author

No, we spawn a new background task via the tokio runtime. This task does run in a loop, however, currently set to sleep 5 seconds between sync attempts.

Contributor

Can this go wrong?
For example, if we start this sync every 1 second but each sync takes 10 seconds to complete, multiple syncs would be active at the same instant.
Normally that would be fine, but since each sync acquires resource locks, multiple instances of this sync cannot run in parallel or independently of each other.
In effect, they run serially, each waiting for the previous one to complete first. This can create an ever-growing queue of waiting tasks if completionTime > syncTriggerInterval.

Is this understanding correct?

Collaborator Author

No, no, it's just a single loop, so no concurrent sync attempts should be spawned; it would always wait 5 seconds between finishing the last attempt and starting the next one:

https://github.com/lightningdevkit/ldk-lite/blob/389f960d7533fcc3c57071d4dc15549d14de538b/src/lib.rs#L488-L524
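For illustration, a minimal sketch of that loop shape (placeholder names, not the linked lib.rs code): one task, one loop, and a fixed pause between the end of one attempt and the start of the next, so attempts never overlap.

use std::future::Future;
use std::time::Duration;

// Runs sync attempts strictly one after another; a failed attempt is simply
// retried after the usual pause, so no backlog of concurrent syncs can build up.
async fn background_sync<F, Fut>(mut sync_once: F)
where
	F: FnMut() -> Fut,
	Fut: Future<Output = Result<(), ()>>,
{
	loop {
		let _ = sync_once().await;
		tokio::time::sleep(Duration::from_secs(5)).await;
	}
}

Spawned once via the tokio runtime, this gives exactly the single background task described above.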

Collaborator Author

(The chosen parameters are however entirely up for debate)

src/access.rs Outdated
if tx_status.confirmed {
	if let Some(tx) = client.get_tx(&txid).await? {
		if let Some(block_height) = tx_status.block_height {
			let block_header = client.get_header(block_height).await?;

I think this is race-y - if there's a reorg while we're processing this transaction we'll get the header at height X, but that may no longer be the block in which this transaction was confirmed.

Collaborator Author

@tnull tnull Oct 6, 2022

Yes it is, looking at these two lines only. However, if the transaction had been reorged out, the next call to get_merkle_proof would fail. I now added an explicit check that the height didn't change in between.

Mh, but this likely won't catch the case in which the transaction is also included in the new tip. In that case we could end up reporting the old header. To fix that we probably want to query the block by hash here; I just opened an upstream PR that would allow us to do so (bitcoindevkit/rust-esplora-client#17).

Contributor

@G8XSU G8XSU Oct 7, 2022

  • I think the check block_height == merkle_proof.block_height doesn't help and is misleading, because the reorged-in block could also have the same height?

  • And even with get_merkle_proof(), the block could always get reorged after that. So we mainly rely on the next sync to fix this for us.

  • So the fix is to get the header from the block_hash inside tx_status, right?
    For now, what we could do is compare the block_hash from tx_status with the one from the header (see the sketch below)? That would remove any unintended confirms with a mismatched header.
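A hedged sketch of that cross-check (the BlockHeader name follows the pre-0.30 rust-bitcoin naming; the Option-typed block hash mirrors the tx_status fields discussed above):

use bitcoin::{BlockHash, BlockHeader};

// Only treat `block_header` (fetched by height) as the confirming block if its
// hash matches the block hash Esplora reported in the transaction's status.
fn header_matches_status(block_header: &BlockHeader, status_block_hash: Option<BlockHash>) -> bool {
	match status_block_hash {
		// Re-hashing the fetched header catches the case where a reorg swapped
		// in a different block at the same height.
		Some(expected) => block_header.block_hash() == expected,
		// No block hash in the status means the tx is not (or no longer) confirmed.
		None => false,
	}
}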

Right, yea, we'll need upstream to fix this. A simpler fix, if it's available, would be to extract the block header from the merkle proof. That would also reduce the number of requests we have to send to the server... I think we could then replace the whole block with a single request to get a merkle proof (and understand that the tx is not confirmed if the request fails?).

Collaborator Author

> Right, yea, we'll need upstream to fix this. A simpler fix, if it's available, would be to extract the block header from the merkle proof.

Unfortunately, it is not available. While there is an alternative (so far unimplemented) API method that would return the merkle proof in Bitcoind's merkleblock format, which would include the header, that in turn wouldn't give us the pos, IIUC.

@TheBlueMatt

Note: also need to carefully review for compliance with https://docs.rs/lightning/latest/lightning/chain/trait.Confirm.html#order

@TheBlueMatt

Will review this more carefully later today.


@TheBlueMatt TheBlueMatt left a comment

I don't think this approach is going to work with the LDK ordering requirements. We'll need to somewhat rethink what we're doing here. (a) we need to connect transactions in topological order. That should at least be somewhat easy, I think. (b) we need to do disconnections first, and then refuse to do any connections if the chain reorgs out from under us while we're running, which may imply somehow ensuring any headers we're connecting a tx from are in the chain when we started?


tnull commented Oct 24, 2022

> (a) we need to connect transactions in topological order. That should at least be somewhat easy, I think.

I agree that should be straightforward, if we apply the change to Confirm::get_relevant_txids (see #9 (comment)).

> (b) we need to do disconnections first, and then refuse to do any connections if the chain reorgs out from under us while we're running, which may imply somehow ensuring any headers we're connecting a tx from are in the chain when we started?

For the time being, could we track all block hashes we get from the get_tx_status/get_merkle_proof calls and then have a dedicated section where we only mark the transactions confirmed once all of them are still in the best chain and the tip is still equal to the one we started with? This is still somewhat race-y, but possibly less so, and hopefully it would only add 1-2 extra queries in the normal case, as syncs should happen more frequently than new blocks arrive.
Moreover, I then don't quite see a big difference from the (unavoidable) case in which we get a reorg after we're done syncing: in both cases we'd have stale information about the confirmation status until the next sync fixes it.
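As a rough sketch of that guard (not the PR's code; get_tip_hash is assumed to be the esplora-client call returning the current tip hash):

use bitcoin::{BlockHash, Txid};
use esplora_client::{AsyncClient, Error};

// Collect would-be confirmations during the sync pass and only hand them on if
// the chain tip is still the one we started from; otherwise drop them and let
// the next sync round retry from scratch.
async fn apply_if_tip_unchanged(
	client: &AsyncClient,
	tip_at_start: BlockHash,
	confirmed: Vec<(Txid, u32)>, // (txid, confirmation height) gathered earlier
	apply: impl FnOnce(Vec<(Txid, u32)>),
) -> Result<bool, Error> {
	let tip_now = client.get_tip_hash().await?;
	if tip_now == tip_at_start {
		apply(confirmed);
		Ok(true)
	} else {
		// A reorg happened mid-sync; report nothing rather than possibly stale data.
		Ok(false)
	}
}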

@tnull tnull requested a review from jkczyz October 24, 2022 10:14
@TheBlueMatt

> I agree that should be straightforward, if we apply the change to Confirm::get_relevant_txids (see #9 (comment)).

I don't see how this helps with topological ordering? We need to sort tx connection into the order they appear on the chain, basically, but I think that's doable here.
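For the sorting part, a small illustration (plain sorting, no LDK calls; the tuple layout is just for the example): ordering newly confirmed transactions by (height, position-in-block) gives chain order across blocks, and position within a valid block already respects parent-before-child order.

use bitcoin::Transaction;

// Sort newly confirmed transactions so they can be fed to
// `Confirm::transactions_confirmed` block by block, in chain order, with the
// in-block (topological) order preserved inside each block.
fn sort_for_confirmation(
	mut confirmed: Vec<(u32 /* block height */, usize /* position in block */, Transaction)>,
) -> Vec<(u32, usize, Transaction)> {
	confirmed.sort_by_key(|(height, pos, _)| (*height, *pos));
	confirmed
}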


@jkczyz jkczyz left a comment

What's the plan for testing this?

-bdk = { git = "https://github.com/tnull/bdk", branch="feat/use-external-esplora-client", features = ["use-esplora-ureq", "key-value-db"]}
+bdk = { git = "https://github.com/tnull/bdk", branch="feat/use-external-esplora-client", features = ["use-esplora-reqwest", "key-value-db"]}
Contributor

How comfortable are we with taking on the reqwest dependency? @TheBlueMatt I believe you had a concern a while back: lightningdevkit/rust-lightning#763 (comment)

I think that's a bit out of date these days. I'd really love to avoid taking an HTTP dependency (I think for something like this it's generally overkill), but we probably do need to take a TLS dependency, and at the end of the day it's not our stuff anyway: we can't pick what BDK offers for their async HTTP client.

@TheBlueMatt

What's the plan wrt ensuring the code is safe? Are we intending to merge it before we get to that point and fix it later (if so I think we need a Big Fat Warning both in the file and in the README) or do we intend to leave this as a PR until we get there?


tnull commented Nov 3, 2022

> What's the plan wrt ensuring the code is safe? Are we intending to merge it before we get to that point and fix it later (if so I think we need a Big Fat Warning both in the file and in the README) or do we intend to leave this as a PR until we get there?

I'd prefer to merge it and revisit it if we still find it to be lacking and have a clear path forward. However, before we think about merging, I want to address two main things to bring this as close as possible to being safe:

  • Build in additional checks for detecting inconsistencies/reorgs, as I think we can mostly get there by doing consistency checks between all data returned by API calls.
  • Based on that, find a better loop construction that allows us to restart the sync as soon as we detect any inconsistencies (see the sketch after this list). This should also allow us to process any leftover/dependent transactions queued up in the previous iteration, ensuring the right ordering. It should also help discern permanent vs. temporary failures (i.e., do we ever want to ? out of the function, or even panic?)
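A sketch of that permanent-vs.-temporary split (the error variants are hypothetical, not the PR's types):

#[derive(Debug)]
enum SyncFailure {
	// The chain moved under us mid-sync; simply restarting the pass is safe.
	Inconsistency,
	// The Esplora server was unreachable or returned an error; retry later.
	Transient(String),
	// Something retrying will not fix (e.g. persistently malformed data); surface it.
	Permanent(String),
}

fn should_retry(failure: &SyncFailure) -> bool {
	match failure {
		SyncFailure::Inconsistency | SyncFailure::Transient(_) => true,
		SyncFailure::Permanent(_) => false,
	}
}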

In parallel I'm exploring whether a push-based approach based on Electrum is feasible and preferable.


tnull commented Nov 3, 2022

> What's the plan for testing this?

I imagine unit tests for individual parts of the chain access could be a bit tricky (and would likely amount to replicating coverage from upstream projects), but I'll implement end-to-end tests using the electrsd/bitcoind crates, likely reusing the approach from the Esplora client (cf. https://github.com/bitcoindevkit/rust-esplora-client/blob/e24663b6bc8536fef60a77681f78792d770ea315/src/lib.rs#L218-L298 ff.)


tnull commented Nov 4, 2022

Now making use of lightningdevkit/rust-lightning#1796 and also implemented yet another API method upstream (bitcoindevkit/rust-esplora-client#28) that allows us to retrieve a MerkleBlock, which reduces the number of needed calls. Also DRYed up the code a bit, will have another go at it though.

@TheBlueMatt

Let me know when you want another review pass here.

@tnull tnull force-pushed the 2022-09-initial-chain-access branch from 18e7fe6 to d5df624 on November 7, 2022 15:33

tnull commented Nov 7, 2022

Alright, I iterated once more on the newer approach and added additional checks. I now squashed the fixups and split out the wallet functionality into its own commit. I'll probably split the wallet out entirely into its own dedicated module in the future, but avoided doing so here since I'd otherwise need to replicate a good part of #26 here.

Should be ready for another round.

@benthecarman

benthecarman commented Nov 22, 2022

I had copy-pasted a lot of this in here. We ran into an issue where a channel would never become ready because we never mark it confirmed: watched_transactions would never have anything added to it, so they would never be checked even if they were queued. Not sure if I was using it wrong or what, but we added *self.watched_transactions.lock().unwrap() = to_watch; (diff here) to the end of sync_unconfirmed_transactions, which fixed the issue.
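Condensed, the fix amounts to the following (simplified sketch; to_watch and sync_unconfirmed_transactions come from the comment above, the standalone helper shape is hypothetical):

use std::sync::Mutex;
use bitcoin::Txid;

// After deciding which previously watched transactions are still unconfirmed,
// write that set back so the next sync round keeps checking them. Without this
// write-back the watch list stays empty and, e.g., a channel funding
// transaction is never re-checked and never reported as confirmed.
fn finish_unconfirmed_sync(watched_transactions: &Mutex<Vec<Txid>>, to_watch: Vec<Txid>) {
	*watched_transactions.lock().unwrap() = to_watch;
}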


tnull commented Nov 22, 2022

> I had copy-pasted a lot of this in here. We ran into an issue where a channel would never become ready because we never mark it confirmed: watched_transactions would never have anything added to it, so they would never be checked even if they were queued. Not sure if I was using it wrong or what, but we added *self.watched_transactions.lock().unwrap() = to_watch; (diff here) to the end of sync_unconfirmed_transactions, which fixed the issue.

Right, the need to re-register transactions (or to just keep monitoring everything returned by Filter) has been discussed above (see #9 (comment)).


tnull commented Nov 24, 2022

Closing this in favor of the upstreamed version: lightningdevkit/rust-lightning#1870.

Will open another PR for the wallet parts ASAP.

@tnull tnull closed this Nov 24, 2022