Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate and fix token transfers #4521

Closed
2 of 6 tasks
aalan3 opened this issue Oct 4, 2023 · 42 comments
Closed
2 of 6 tasks

Migrate and fix token transfers #4521

aalan3 opened this issue Oct 4, 2023 · 42 comments
Assignees
Labels
balances WIP work in progress

Comments

@aalan3
Copy link
Contributor

aalan3 commented Oct 4, 2023

Problem

We have a bunch of issues with erc20.transfers at the moment and we have been neglecting the fixes due to the migration. Right now the transfers tables are a bit all over the place with how they work and types, specially around amount_raw. We want to clean these up so that we can fix balances.

This issue exists to fix and migrate the spells and discuss the way forward.

Goal

The goal is for all evms to have a working transfers table that includes both ERC20 and native tokens (naming of table TBD) using a macro, so that implementation is the same (as much as possible) across chains.

We want to fix these so that we can then start to fix token balances and bring it back.

How

We want to base our implementation on the fungible_transfers which include native and ERC20 transfers.

The main difference is that we want to build a base model which is materialized without metadata (prices and erc20.tokens), which we then enrich with metadata via a view. We still also want to have a main model that has all the chains as well.

For each chain: base_model (materialized) -> metadata enrichment view (prices, erc20)

JOIN on the transactions table

Right now we are forced to join on the transactions table to include tx_from/to and tx_index since these are not available on the decoded tables. These columns are planned in the future and will be added for future blockchains. For example for base we already have tx_from/to available on the events and for the next chain that's released tx_index will also be available from the start.

Related

Some bug issues:
#3431 - "from" is show literally as from instead of the actual address
#3108 - Types are wrong but I think this one might be fixed?

There are also more than a few PRs open that addresses the transfers tables.

Plan

Might add more things to this.

@jeff-dude jeff-dude added WIP work in progress balances labels Oct 4, 2023
@jeff-dude
Copy link
Member

fyi @henrystats @tomfutago @primatepngs @lajarre
i believe you all have open PRs related to transfers and/or balances

we are looking to standardize these spells, with a consistent approach across evm chains leveraging macros. first step, as noted in the issue, fix transfers. then we can look into balances.

@jeff-dude
Copy link
Member

for reference @0xRobin @Hosuke

@aalan3 aalan3 mentioned this issue Oct 4, 2023
3 tasks
@aalan3 aalan3 self-assigned this Oct 4, 2023
@0xRobin
Copy link
Collaborator

0xRobin commented Oct 4, 2023

I've always been a bit confused as to what the purpose is for these transfers models.
Their main goal seems to be breaking up 1 record with "from" and "to" to 2 records with one being a "send" and the other being a "receive". Is it just to more easily build out the balances tables?

@henrystats
Copy link
Contributor

henrystats commented Oct 4, 2023

yeah @0xRobin I believe that's the main goal, plus that's how folks build balances on dune itself.

We have an aggregate table to get the sum value of transactions within an hour/day for a wallet_address, so that can be negative or positive, and since there are two columns we need to look at (both "from" & "to") we need to dedupe the aggregation.

It seems easier to do that at the transfers level before doing the sum.

We never used the transfers.erc20 table for anything other than building the balances spell...

Also, I have a couple of pr's open for other chains (all of them actually except celo) lmk if I should close them. I was going to edit and use macros for them.

@aalan3
Copy link
Contributor Author

aalan3 commented Oct 4, 2023

@henrystats yeah I noticed that shortly after writing this. I think unit256 is the way to go but will investigate.

@aalan3
Copy link
Contributor Author

aalan3 commented Oct 4, 2023

regarding the purpose, yes this is a intermediary table originally but happens to be exposed now. I suspect we don’t need as many intermediary tables/views for balances (the agg + rolling, maybe we don’t need both) anymore since we will materialize and performance is much better

@henrystats
Copy link
Contributor

oke then...

@aalan3
Copy link
Contributor Author

aalan3 commented Oct 5, 2023

Had a look at the types for Polygon to use int256. We can use try_cast which results in null if there's an overflow. I checked and there would currently be about 4800 rows (out of a total 4.7bn rows) that overflow when going from uint256 -> int256. I think that's fine since these value would likely overflow or loose precision in other places either way.

@0xRobin
Copy link
Collaborator

0xRobin commented Oct 6, 2023

Could we make this less ambiguous and find a more suitable name instead of transfers?
This should make the distinction more clear between these 2 use cases.

transfers:

  • timestamp
  • asset (token address)
  • sender (from)
  • receiver (to)
  • amount (uint256, always positive)

something else but not transfers:

  • timestamp
  • asset (token address)
  • address (wallet address)
  • amount (int256, positive or negative depending on the transfer direction)

maybe something in the line of balance_changes/balance_delta/balance_updates?

@aalan3
Copy link
Contributor Author

aalan3 commented Oct 6, 2023

@0xRobin this is a very good point. erc20 transfers could then just be a view of evt_transfer + the wrapped token transfers. Makes me reconsider if we should even have the table in the first place. I'll experiment a bit on the other end with balances and see what's needed to get that to work.

@henrystats
Copy link
Contributor

ouuuu awesome!

@pngwerks
Copy link

pngwerks commented Oct 6, 2023

maybe something in the line of balance_changes/balance_delta/balance_updates?

balance_updates seems quite clear for table name.

@lajarre
Copy link

lajarre commented Oct 9, 2023

fyi @henrystats @tomfutago @primatepngs @lajarre i believe you all have open PRs related to transfers and/or balances

we are looking to standardize these spells, with a consistent approach across evm chains leveraging macros. first step, as noted in the issue, fix transfers. then we can look into balances.

Looks good!

Part of my PR #4074 is already suggesting some changes to macros:

  • filter out rebase tokens (this is necessary).
  • optionally re-add unique_transfer_id via a macro parameter: this is a suggestion that I would happily remove. Deleting this column would break the balances tables behavior for users. This is a tradeoff to be made between code complexity and backwards compatibility.

But finally it seems we are awaiting #4522 to be done first. Shall I rather suggest these changes to be added to #4522 @aalan3 ?

Also, I realized my PR and #4420 from @henrystats are dupes. My PR suggests fixes to tests already (thanks @Hosuke for the help!). But I'd happily contribute to whichever PR moves the needle faster.

Had a look at the types for Polygon to use int256. We can use try_cast which results in null if there's an overflow. I checked and there would currently be about 4800 rows (out of a total 4.7bn rows) that overflow when going from uint256 -> int256. I think that's fine since these value would likely overflow or loose precision in other places either way.

Interesting analysis but trying to have in mind the expectations that a Dune user would have about data, it sounds like casting uint256 as int256 is breaking something. There might be relatively few overflowing addresses, but these might be the most important addresses (think, a treasury contract). I've been confronted to the problem before and I felt that loss of precision due to doubles is painful but not a deal breaker.
But perhaps am I missing part of the picture?

@aalan3
Copy link
Contributor Author

aalan3 commented Oct 10, 2023

@lajarre thank you for the input.

We've received a ton of feedback in the past about precision in balances it becomes worse when you sum things together.

Interesting analysis but trying to have in mind the expectations that a Dune user would have about data, it sounds like casting uint256 as int256 is breaking something. There might be relatively few overflowing addresses, but these might be the most important addresses (think, a treasury contract). I've been confronted to the problem before and I felt that loss of precision due to doubles is painful but not a deal breaker.

This assumes that token contracts mostly use a total supply close to 2^256 which I don't think is the case. I looked at the 4k rows and it was mostly spam tokens. Either way I will forward this feedback internally to see if there's something we can do to fix these values as well.

@lajarre
Copy link

lajarre commented Oct 10, 2023

This assumes that token contracts mostly use a total supply close to 2^256 which I don't think is the case. I looked at the 4k rows and it was mostly spam tokens.

Ok, makes sense. I guess then that properly documenting the limitation would already help mitigate the potential issues from the users point of view.

@aalan3 aalan3 changed the title Migrate and fix erc20.transfers Migrate and fix token transfers Oct 18, 2023
@aalan3
Copy link
Contributor Author

aalan3 commented Oct 19, 2023

I've updated this a bit based on the changes for the balances issue.

@aalan3
Copy link
Contributor Author

aalan3 commented Oct 19, 2023

@hildobby this is based on fungible, I'll have a PR up shortly for ethereum so we can see what it looks like. One question is, should we add WETH-like deposits and withdraws to transfers as well? Did you plan on adding it and do you think it makes sense? It's needed to get balances right eventually.

@hildobby
Copy link
Collaborator

@hildobby this is based on fungible, I'll have a PR up shortly for ethereum so we can see what it looks like. One question is, should we add WETH-like deposits and withdraws to transfers as well? Did you plan on adding it and do you think it makes sense? It's needed to get balances right eventually.

What do you mean? Like ETH <> WETH mints & burns? Those should be included afaik, right?

@0xRobin
Copy link
Collaborator

0xRobin commented Oct 19, 2023

What do you mean? Like ETH <> WETH mints & burns? Those should be included afaik, right?

WETH doesn't emit a Transfer event on deposit and withdraw, so they aren't in the erc20_ethereum.evt_Transfer table.
They instead emit a Deposit or Withdrawal event. Don't ask me why.
image

@hildobby
Copy link
Collaborator

What do you mean? Like ETH <> WETH mints & burns? Those should be included afaik, right?

WETH doesn't emit a Transfer event on deposit and withdraw, so they aren't in the erc20_ethereum.evt_Transfer table. They instead emit a Deposit or Withdrawal event. Don't ask me why. image

Oh TIL something new, then it should definitely be added to fungible.transfers imo. I'm kind of lost on all the PRs around it rn though, maybe it's worth bundling that addition with the rework to the new structure (base and enhanced tables) that you guys mentioned you wanted to do for that sector

@grkhr
Copy link
Contributor

grkhr commented Nov 7, 2023

Btw there are a lot of tokens which supports staking option and there are no easy way to properly determine balance of them.

For example stETH (which is huge) balanceOf implementation is custom: https://github.com/lidofinance/lido-dao/blob/master/contracts/0.4.24/StETH.sol#L166

E.g. for 0x41C59bf14D1407311AAa1B91EEE6D2d441a3Ec52:
Dune balance based on evt_Transfer: 9,963 stETH ($18,701,257)
True balance on etherscan: 10,430 stETH ($19,577,849)
The diff is huge 😬 And I have no idea how to deal with that (and should we?)

jfyi @aalan3 @hildobby @0xRobin @henrystats

@aalan3
Copy link
Contributor Author

aalan3 commented Nov 7, 2023

@grkhr I'm not sure how we can support those in balances given their implementation (which deviates from the ERC20 standard). Maybe we could do a staked_balances table in the future where we calculate those on the fly, otherwise I'm not sure how we could get the correct values using the data currently available for us?

@grkhr
Copy link
Contributor

grkhr commented Nov 7, 2023

I have no idea too, just adding some context :)

Also it'd be nearly impossible to handle staking logic for top 10-20 tokens as it can be too sophisticated (1inch for example)

@henrystats
Copy link
Contributor

I believe stETH is a rebase token and those were filtered from the balances spell originally, and still are, stETH just wasn't added to the rebase tokens list hence why it's still showing up.

Getting stETH balances is a very long process, even on dune itself, I've not seen any query that gets the balances completely accurate (i.e the correct exact balance at the time), there has been some workaround that the stETH team created themselves, Hildobby also worked on one and that seems to be the approach to getting stETH balances for non contract addresses.

I'd advise we filter them for now.. cc @grkhr @aalan3.

I've also been inactive for a few weeks now as I was trying to deal with some personal issue.... I'm back working now, please do let me know how I can be of help here.

@aalan3
Copy link
Contributor Author

aalan3 commented Nov 7, 2023

@henrystats @grkhr I opened an issue to continue the discussion and organize the work #4735 since I think this is potentially separate from transfers (because there are not always transfers as mentioned above).

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 1, 2023

We have transfers for all chains! The tables are queryable but not yet exposed in the data explorer. There's still some work left in getting gas fees in and wrapped tokens might be missing for some chains.

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 18, 2023

With #4987 we now have a transfers table tokens.transfers where you can query all chains at the same time. If you find any problems or have feedback please report them in this issue so we can fix them

@henrystats
Copy link
Contributor

woohoo @aalan3 niceeeee... Thank you very much ser. These tables and the balances tables are going to very very important once they're live.

I've been off spellbook for a while now and only just recently resumed, pls let me know if there's any way I can be of help. I'll check out the pr nowww!

@hildobby
Copy link
Collaborator

With #4987 we now have a transfers table tokens.transfers where you can query all chains at the same time. If you find any problems or have feedback please report them in this issue so we can fix them

Nice, ty! Do those include nft transfers too? tokens.transfers is the name @MSilb7 and I wanted to use for a spell joining all fungible and non-fungible token transfers, else it might be confusing to have a table named tokens.transfers that only includes one type of tokens, especially since there's already some tables for nft tokens in that schema

Screenshot 2023-12-18 at 14 57 02

I can't seem to pull the table on Dune, are native token transfers included in there too? Joining ERC20s with native transfers is the most annoying thing to do in every query revolving around fungible tokens rn

@jeff-dude
Copy link
Member

With #4987 we now have a transfers table tokens.transfers where you can query all chains at the same time. If you find any problems or have feedback please report them in this issue so we can fix them

unfortunately, we had to revert the cross-chain token transfers :(
here is the revert: #5004

the prod cluster had an issue trying to build it. it's been escalated to the engine team.

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 18, 2023

Ah sorry I missed that the table was reverted 🤦‍♂️

Nice, ty! Do those include nft transfers too? tokens.transfers is the name @MSilb7 and I wanted to use for a spell joining all fungible and non-fungible token transfers, else it might be confusing to have a table named tokens.transfers that only includes one type of tokens, especially since there's already some tables for nft tokens in that schema

@hildobby it's not included atm we just wanted to first finish the non nfts first. But we could add nft transfers pretty easily because it would basically just be reading what's in nft.transfers. I can create a new issue for this.

@grkhr
Copy link
Contributor

grkhr commented Dec 18, 2023

looked through the reverted PR. it tries to join prices for all chains and materialize to one table as tokens_<blockchain>_transfers are just non-mat views. i can suggest to make tokens_<blockchain>_transfers materialized and release it chain-by-chain, then union all to non-mat-view tokens.transfers.

i'm not surprised that the prod cluster failed 😄

@jeff-dude @aalan3

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 18, 2023

Also note that #4911 is still open, if someone has any input on how to include gas fees please provide it in the PR.

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 18, 2023

@grkhr yeah we could also union the *_base_transfers tables instead and then enrich.

@hildobby
Copy link
Collaborator

Ah sorry I missed that the table was reverted 🤦‍♂️

Nice, ty! Do those include nft transfers too? tokens.transfers is the name @MSilb7 and I wanted to use for a spell joining all fungible and non-fungible token transfers, else it might be confusing to have a table named tokens.transfers that only includes one type of tokens, especially since there's already some tables for nft tokens in that schema

@hildobby it's not included atm we just wanted to first finish the non nfts first. But we could add nft transfers pretty easily because it would basically just be reading what's in nft.transfers. I can create a new issue for this.

hmm sounds good, but ideally the end result is nft.transfers, fungible.transfers and tokens.transfers (which joins the former two) so that there's a specific table for each use-case (sometimes I want just fungibles/nfts and sometimes both), that's why I originally created the fungible transfers under fungible.transfers with the intent of joining the two into tokens.transfers afterwards. idk if fungible is the best name but it's the best I could come up with for clarity & naming simplicity, having only fungible in tokens.transfers is confusing imo, would love to hear other ppl's thoughts

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 18, 2023

hmm sounds good, but ideally the end result is nft.transfers, fungible.transfers and tokens.transfers (which joins the former two) so that there's a specific table for each use-case (sometimes I want just fungibles/nfts and sometimes both), that's why I originally created the fungible transfers under fungible.transfers with the intent of joining the two into tokens.transfers afterwards. idk if fungible is the best name but it's the best I could come up with for clarity & naming simplicity, having only fungible in tokens.transfers is confusing imo, would love to hear other ppl's thoughts

That's a good point. What I was thinking was that we can partition the table by token_standard (and blockchain), so if you'd like to query only nfts or skip native transfers for example you should still get almost the same performance as if it was a separate table.

@jeff-dude
Copy link
Member

looked through the reverted PR. it tries to join prices for all chains and materialize to one table as tokens_<blockchain>_transfers are just non-mat views. i can suggest to make tokens_<blockchain>_transfers materialized and release it chain-by-chain, then union all to non-mat-view tokens.transfers.

i'm not surprised that the prod cluster failed 😄

@jeff-dude @aalan3

great point!

i was naively trying to push that out quick & went against the new sector design 🤦‍♂️ i will try to take the correct approach this afternoon and see how that runs

if it runs fine, we can revisit the points here on naming standards

@grkhr
Copy link
Contributor

grkhr commented Dec 18, 2023

@jeff-dude already tried with BNB as the biggest one, CI seems fine
#5021 - PR for testing purpose

you can try to merge it and see if it'll run well on prod. if yes, so releasing by chain in 3-4 PRs will pass easily.

@hildobby
Copy link
Collaborator

hmm sounds good, but ideally the end result is nft.transfers, fungible.transfers and tokens.transfers (which joins the former two) so that there's a specific table for each use-case (sometimes I want just fungibles/nfts and sometimes both), that's why I originally created the fungible transfers under fungible.transfers with the intent of joining the two into tokens.transfers afterwards. idk if fungible is the best name but it's the best I could come up with for clarity & naming simplicity, having only fungible in tokens.transfers is confusing imo, would love to hear other ppl's thoughts

That's a good point. What I was thinking was that we can partition the table by token_standard (and blockchain), so if you'd like to query only nfts or skip native transfers for example you should still get almost the same performance as if it was a separate table.

For nfts, I've already been using nft.transfers for a long time now, so I thought having a table for a tokens but none for just fungibles would be weird. All these things can already be done in queries but the fewer lines I need for a specific queries the better the wizard UX imo. If you prefer having all in tokens schema I'll make a view for just fungibles on top of it but that seems more inefficient than having both joined, no?

Also, how would the two joined work here? prices aren't really obtained at a transfer level for nfts currently afaik

@aalan3
Copy link
Contributor Author

aalan3 commented Dec 19, 2023

Also, how would the two joined work here? prices aren't really obtained at a transfer level for nfts currently afaik

Prices would not be available for nfts just like they are not available for a lot of erc20 tokens. The table would be normalized and unioned together with existing transfers

@jeff-dude
Copy link
Member

considering this done for now, as 13 chains are built and live

we can start new issues to track enhancements and/or bug fixes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
balances WIP work in progress
Projects
None yet
Development

No branches or pull requests

8 participants