Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redeem implementation for dso-token #46

Merged
merged 6 commits into from
Aug 26, 2021
Merged

Redeem implementation for dso-token #46

merged 6 commits into from
Aug 26, 2021

Conversation

hashedone
Copy link
Collaborator

Still draft, as test are to be written (working on this right now, just wanted to expose what is done already).

@hashedone hashedone marked this pull request as draft August 18, 2021 13:27
Copy link
Contributor

@ethanfrey ethanfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice design and thanks for saving redeems to query them.

it would be nice to add some way to clean them up.

I will review the contract.rs code later

@@ -5,3 +5,123 @@ This is a cw20-based token with whitelisting, to be used in the context of a DSO
It can be used to provide one side of a trading pair in an AMM setting.

Only whitelisted users will be able to add liquidity to the token, and trade it.

## Differences between standard cw20 and dso-token
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding the documentation


```json
"reedem": {
"amount": 1000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amount should be a string (Uint128 serialises that way). Otherwise, good.

"reedem": {
"amount": 1000,
"code": "reedem-code",
"memo": "Meta information"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we want to add optional info on original-sender, so an AMM can say who was requesting the redeem? (As proposed in the issue)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand Reedem should be executed by holder, so by the one requesting reedem (even if not a DSO member). Those I don't see a reason why to add it in message, it is just info.sender. It is included in reedem info (both in event, and stored in chain).
`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the AMM does this on the user's behalf.

But you are right, that is #14. Be agile, we add it later if needed.

}
```

To finalize off-chain reedem operation, token provider might either subscribe on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, these are then stored on-chain.
There should be someway to delete them / clean up, especially for the list queries.

We can assume the Minter is the issuer and can delete these (and no one else can delete it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I am just thinking on how to perform it. I imagine that it would be preferred to perfom bulk-removal instead of one by one, probably just taking a vector of codes to be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector of codes make sense. That is the pk, right?

It costs ~2000 gas for a delete, something like 20.000 gas overhead to submit any tx. 40.000 gas to execute a contract (if not pinned). it makes sense to do 10-50 deletes in one tx gas wise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I added vec based delete + full cleanup (I can imagine scenario when provider handles all pending redeems, then just cleans it up).

pub const REEDEMS: Map<String, Reedem> = Map::new("reedems");

/// Entry about reedem which had place
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, JsonSchema)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good


REEDEMS.save(
deps.storage,
code.clone(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use &str in the map, and then just &code here.

Copy link
Contributor

@maurolacy maurolacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks for adding the detailed docs.

Please s/\([rR]\)eedem/\1edeem/g everywhere. I don't want to be annoying, but if you don't mind I can proof read the docs, etc. and suggest edits.

@hashedone
Copy link
Collaborator Author

@maurolacy you are not annoying, I know I am communicative in English, but not fluent. As I write more docs here I would probably look for some spell checking plugin for vim (but it would need to be one which checks only comments to be reasonable).

@maurolacy
Copy link
Contributor

As I write more docs here I would probably look for some spell checking plugin for vim (but it would need to be one which checks only comments to be reasonable).

Intellij Idea does a good job with spelling IMO. Spelling mistakes are highlighted, you can press F2 to see next error / suggestion. And you can add misspelled words / variable names to an allowed list.

Copy link
Contributor

@ethanfrey ethanfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few more comments.

let me know when you push all changes and I will go through for last approval

{
"code": "reedem-code",
"sender": "addr-performed-reedem",
"amount": 1000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, a string here

.map(|reedem| {
let (code, reedem) = reedem?;
Ok(ReedemInfo {
code: String::from_utf8(code)?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we had a discussion about this form, or the unsafe unchecked version.

Mauro convinced me we just use the unsafe one, as we wrote the data ourselves.

Maybe you two can discuss this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is - we are not. I had deep think about it. The problem with unsafe string conversion is it does not look scary, but it factually is unsound.

The data we read in this point are at this point not necessary the same we wrote. When they are written they are going through storage, wasmd, operation system, hdd, network, and then the same way reversed. There are somehow taken from storage, and were somehow formatted. It is storage a place to take care about delivering proper structures. If storage provided us flat bytes, we got flat bytes and we cannot claim at this point that we exactly know they structure - anything could happen, starting from corrupted blockchain input (on verifier hdd), different version of wasm interpreter or wasm storage used to store data (it is possible that it was stored with some preprocessing, which changed during versions, and respective libraries are responsible for handling it - in smart contract there are no rights to assume that binary data are valid unless we validated them).

The problem is that using unsafe version doesn't open code for crashing. It opens it for being vulnerable. Basically if for any strange reason (either random or actually exploit) data which came are not of the format we expected, then all Rust guarantees about memory safety are invalidated - because everything including serde and printing utilities is allowed to assume, that this string is properly formatted utf8 string, and can perform any optimization and unsafe string tricks.

Long story short - I strongly believe that assuming that data are valid just because we think we stored it before is not only unsafe, it is actually unsound. This is the reason why from_utf8 doesn't just assume utf is proper, as such case wont just result in improper printing, it would result in memory safety violation (that is what unsafe mean). refusing to perform linear check of data validation (this is what basically happens, it is O(code.len()) when memory allocation is involved is a huge mistake.

Copy link
Contributor

@maurolacy maurolacy Aug 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My analysis of the convenience of using from_utf8_unchecked comes from the assumption that the data we read is the same data we wrote.

If that assumption is broken, you are of course right: invalid utf8 strings produced from from_utf8_unchecked result in undefined behaviour.

I think the decision here is, do we want to "protect" against those situations? If a hard disk fails, or some blockchain input is corrupted somehow, the situation is already extremely serious. Does it make sense to check against invalid input, when that input can only be the result of some serious failure at a lower / more fundamental level?

In my opinion, it makes sense to protect against invalid input when it can be the result of a higher, less fundamental level (typically, user-provided input).

from_utf8_unchecked is there for a reason. That reason is, precisely, that you can trust the data as valid. That is, that you assume that it is because you checked it before and now it comes from a safe place, etc. etc.

What you are saying amounts to saying that that situation is never possible. A cosmic ray can hit the computer, alter a bit, and turn a valid utf8 character into an invalid one, by example. If that can happen, by the way (and in fact, it can), my point is then other parts of the system can fail unpredictably or in an undefined way as well. Not only utf8 strings are at risk.

But, maybe I'm wrong. I'm OK with switching back to the safe from_utf8 version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starting from corrupted blockchain input (on verifier hdd),

We do not need to worry about this one, as it is a blockchain. There are dozens of computers. If your computer has some random bit flip, it will produce a different result, this will be detected in the next consensus round (app hash and data hash) and your node will be kicked out of consensus. It doesn't matter how we handle it in the code. If your computer differs from the majority, it will be kicked out.

I would argue we can assume anything that only affects 1 or 2 nodes can be ignored.

However, if you argue there was a previous logic issue that affected all nodes storing the code, and this lead to invalid utf8 being (correctly) stored in the database, only to be read later by code that assumes it's correctness, this is indeed a problem.

We can discuss the second class of issues, but let's only look at the "deterministically invalid" data

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If v1 of a contract writes arbitrary binary and v2 writes strings, but also assumes they are all valid utf8 (unchecked), then we do have a problem.

v2 would have to either (1) migrate all keys when doing the upgrade (we provide the migrate entry_point for this) or (2) use from_utf8 and handle the error case by somehow handling "legacy binary keys". Simply returning the error doesn't make sense in this case.

In short, I favour from_utf8_unchecked(x) over from_utf8(x)?, where we just return an error, as I see no cases where the second is proper behaviour and the first can fail.

In some situations (changing the data format between versions), we should do something like:

let name: String = match String::from_utf8(&key) {
  Ok(x) => x,
  Err(_) => hex::encode(&key),
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good discussion and should be added to the docs somewhere or at least linked from somewhere prominent.

Copy link
Collaborator Author

@hashedone hashedone Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But proper case completely doesn't matter. Only failing scenario matters. If "we just return an error", then we properly return an error and stopping further processing. We are safe. If we do unchecked conversion, we are opening contract for undefined behavior including segmentation fault. _unchecked doesn't panic on failure, it disables all safety promises Rust gives you - returning error is really way better than having UB. The only potential excuse is that it is wasm, but I find it not very reasonable excuse.

And tbh - I completely cannot understand any reasoning what unsafe version may be preferred. It is:

  • longer by itself
  • requires unsafe block
  • exposes for vulnerability including potential exploits (there are multiple known attack vectors related to invalid/lack of utf-8 validation), intoduces UB
    While the only downside is iteration which is literally (actual implementation is more complicated as it performs tricks to make it even more efficient, but this is human-readable equivalent):
let mut i = 0;
while i < s.len() {
  let n = match s[i] {
    c if c < 0x80 => { i += 1; 0 },
    c if c < 0xc0 => { return Err(...) }
    c if c < 0xe0 => { i += 2; 1 },
    c if c < 0xf0 => { i += 3; 2 },
    c id c < 0xf8 => { i += 4; 3 },
    _ => return Err(...)
  }
  
  for j in 0..n {
    if s[i + j] & 0xc0 != 0x80 { return Err(...) };
  }
}

This is in worst case single and + comparison per byte character, right before next string is allocated vs introducing input dependent UB. If we really look for cutting edge optimizations then I would really start on profiling and possibly reduce some obsolete clones, needless v-calls...

And quoting documentation:

This function is unsafe because it does not check that the bytes passed to it are valid UTF-8. If this constraint is violated, undefined behavior results, as the rest of Rust assumes that &strs are valid UTF-8.

and for from_utf8:

If you are sure that the byte slice is valid UTF-8, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_utf8_unchecked, which has the same behavior but skips the check.

There is no word about how error are handled. If you are 100% sure than data are utf8 encoded, and you can prove it (it is not just your intuition), you can use _unchecked (and probably should leave soundess proof in the comment). Otherwise do it right.

Copy link
Contributor

@maurolacy maurolacy Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in worst case single and + comparison per byte character, right before next string is allocated vs introducing input dependent UB. If we really look for cutting edge optimizations then I would really start on profiling and possibly reduce some obsolete clones, needless v-calls...

You have a good point here. The strings we're handling are typically short (less than 100 chars) and the cost of validating them is therefore very small / negligible.
There are other issues with from_utf8, in that we have to handle the error, which in some cases requires a map_err. But, they are somewhat compensated by not requiring an ugly unsafe block or mark (along with the increased safety).

So, I think you are right: the potential advantages of using from_utf8_unchecked are not worth the risk. Particularly when dealing with short strings.

I guess I was biased by the fact of having used from_utf8_unchecked in the past, over long (several kilobytes) strings, with significant performance gains.

Also, I still think that that validation doesn't add / protect from anything significant. But, you are right: better to err on the safe side.

Copy link
Collaborator Author

@hashedone hashedone Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map_err in this case is literally zero cost, it just wraps type. Not even clonning there in many cases. I understand case of long strings, but it is not a case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go forward with this, and please add an issue to cosmwasm-plus to change from the usafe variant to the safe one... in all that code (and add a From helper if needed so we can use ?)

@hashedone hashedone force-pushed the 12-reedem-dso-token branch from 6da5128 to 964f63c Compare August 25, 2021 08:14
@hashedone hashedone marked this pull request as ready for review August 25, 2021 10:45
@hashedone
Copy link
Collaborator Author

@ethanfrey @maurolacy final review-ready version.

Added only single test. I was playing around how to improve it, but basically too much to do for the scope of this. I want to improve query mocking, so it is easy to use it with minimal boilerplate, by having configurable mock (I have an idea how to approach this in future using mockall). Also there should be some way to easy verify events which occurred.

@hashedone hashedone changed the title Reedem implementation for dso-token Redeem implementation for dso-token Aug 25, 2021
Copy link
Contributor

@ethanfrey ethanfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
A few stylistic comments, but the logic seems good and the tests nice.

I would add one more test to cover some failure cases where you cannot redeem


assert!(
suite
.events
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make a helper for this... thinking of it already

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just started CosmWasm/cw-plus#392

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: that was for AppResponse, not Suite, but we could reuse the same code somehow

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I am thinking to return AppResponse from helpers instead of caching events. Possibility of functions chaining is nice, but not so crucial, and I think that possibility to verification on response is kind of relevant. I overlooked it before. Then we could switch to linked solution.

Copy link
Contributor

@ethanfrey ethanfrey Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be a separate PR? We merge this, and then update the tests/suite in a new PR (maybe with cosmwasm-plus 0.8.1 and helpers).

I agree completely with the change to return AppResponse and remove chaining in event execution

.iter()
.find(|&ev| ev.ty == "wasm-reedem")
.is_some(),
suite.events.iter().any(|ev| ev.ty == "wasm-reedem"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this is a lot like what I was trying to simplify in CosmWasm/cw-plus#392, but I guess makes sense when you want to assert one or more attributes as well. It would be like:

suite.assert_event(Event::new("wasm_redeem"));

or more verbose:

assert!(
  suite.has_event(Event::new("wasm_redeem"),
  "No redeem event: {:?}",
  suite.events
);

@@ -150,13 +150,16 @@ pub struct Suite {
pub whitelist: Cw4Contract,
/// dso-token cash contract address
pub cash: Cw20Contract,
/// events collected from executions, for verifiaction purposes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this all executions appended?
Just the last execution?

@@ -188,6 +194,8 @@ impl Suite {
)
.map_err(|err| anyhow!(err))?;

self.events.extend(resp.events);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, you append them all. Seems a bit verbose after a few executions.

Maybe we make it Vec<Vec<Events>> and the first index is the number of the transaction, and we capture the events from each execute separately.

And/or add a clear_events() method to clear them right before the event we want to check. Maybe that would be easier.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said in other place - I am not super happy about caching events solution, I would just return AppResult and give up chaining as it is not so important probably.

to blockchain:

```json
"redeem": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit-pick, buy that is not valid JSON.
You need to wrap it into an object ({ as a first line, another } as a last line)

for key in keys {
REEDEMS.remove(
deps.storage,
std::str::from_utf8(&key).map_err(StdError::from)?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This .map_err(StdError::from) would be gone if we implemented impl From<str::Utf8Error> for ContractError, right? Might be nice to add that, which encourages using std::from_utf8(&key)? rather than the unsafe variant (it becomes even shorter)

@@ -185,6 +312,40 @@ fn query_is_whitelisted(deps: Deps, address: String) -> StdResult<IsWhitelistedR
Ok(IsWhitelistedResponse { whitelisted })
}

fn query_redeem(deps: Deps, code: String) -> StdResult<RedeemResponse> {
REEDEMS
.may_load(deps.storage, &code)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistic point. I typically use:

let redeem = REDEEMS.may_load(deps.storage, &code)?;
Ok(RedeemResponse { redeem })

Which seems shorter and clearer to me. Do you feel this is more idiomatic?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it is more natural to me as I like to think in terms of transformations. I don't think that there is actual relevant argumentation to one being more idiomatic than the other. Technically maybe the ? involves additional branching and maybe some copying on return, but I don't believe in that, it probably compiles to exactly the same, and even if not - the only difference would occur on error-flow which is not so relevant. Also argumentation that intermediate assignments are less functional is not true, all functional languages provides intermediate bindings. And Rust is not functional in any matter so whatever.

Only reason why I prefer map in this kind of simple cases is that id doesn't introduce branches in my head (while reading), it is just transformation after transformation which I like. However I don't mind go your way if you prefer it and it reads better to you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way is fine.
Interesting to hear your reasoning. That was more why I brought it up, not to enforce the style.

assert_eq!(suite.total_supply().unwrap(), 1000);

// members still can redeem after he is removed from whitelist
suite.events.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you do use clear_events() but from the standard library. Looks good to me.

suite
.remove_member(&member)
.unwrap()
.redeem(&member, 1000, "redeem-code-2", None, "Second redeem")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a number other than 1000, so it is different?
If we add a nice events comparison helper, we can even check the proper amounts

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - I didn't care about it as I didn't compare it on event, but if I would then it makes sense. Also I added different sender to check overwriting it works.

@@ -524,3 +524,43 @@ fn whitelist() {
.unwrap();
assert!(!is_whitelisted.whitelisted);
}

#[test]
fn redeem() {
Copy link
Contributor

@ethanfrey ethanfrey Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a cannot_redeem() test with 3 cases:

  • member trying to redeem more than their balance
  • anyone with no balance trying to redeem 1
  • member with sufficient balance trying to redeem with previously used code

Both should show a proper error message, not some underflow/panic

Copy link
Collaborator Author

@hashedone hashedone Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. However here I see the problem with our multitests - errors compared as strings. Even with anyhow, they come there as strings. Errors should be therefore tested on UT level but better mocking would be helpful here (I really don't like setting up test by inserting to internal map in-test).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wrap them in anyhow inside multitest itself (ContractError -> anyhow), then these compares would work better, right?

Let's have some ugly string tests now. And mark a TODO to improve that, when we have a cosmwasm-plus 0.9 with these better error handling lower down

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants