-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify SIGN_MODE_TEXTUAL #6513
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Let's get this discussion started. There are two general approaches that I'm thinking of. JSON/YAMLBasically we take the proto3 JSON serialization (possibly with some additional constraints) and make sure it's serialized canonically with http://gibson042.github.io/canonicaljson-spec/. If we go with this approach, we should consider using a canonical YAML serialization instead as that would be somewhat more human readable which is the goal here. String formattingTaking inspiration from ICU MessageFormat, we could embed format strings in .proto files like: message MsgSend {
option (cosmos.format_string) = "Send {amount, coins} from {from, address} to {to, address}";
} And then clients would sign formatted text that looks like:
The text for each message could be localized to different languages using standard translation bundles. Neither of these approaches is mutually exclusive, i.e. we could support both. Also I would note that we could deal with malleability as a secondary concern by appending these messages with base64 proto bytes from I want to personally express a preference for the string formatting approach. Having studied the ledger app source code a bit, it’s going through a fair bit of effort to extract fields out of JSON structures. Message names and fields are generally hard-coded, and as more and more message types get added the ledger app will be unable to keep up and just default to showing raw JSON for lots of message types which is not ideal. With the string formatting approach, the protocol would define a human readable format and the ledger app has a much simpler task of parsing human readable lines of text. New message types and localized translations can be added without needing any changes to the ledger app. It’s actually hard to imagine how we would scale the ledger app to support all that without this approach. So I really think string formatting is the future of |
I think we should be deprecating JSON. I support the string formatting proposal. We should have a list of primitive types that allowed to be referenced in the formatting strings. There should be the option of supporting multiple formatting strings per message. I expect this to be used to internationalization. There should be a character limit on formatting strings. |
This seems like a great direction and we should figure what we need to do support this across out ecosystem. |
Deprecating JSON/YAML is good, those are mostly "web" formats that allow nesting, are complicated without dynamic memory and are hard to define canonically. With respect to hardcoding, the only aspect that has been hardcoded are some lookup tables used to convert long confusing keys into human readable text descriptions (e.g. "msgs/value/validator_src_address" => "Validator Source"). As cosmos evolved, these less readable names started to creep in and we had to find a workaround. I see how this can happen even with the good intentions behind the string format... once people start using "dev-speak" in the strings, they may get hard to read anyway. Overall, I am not sure yet about this string formatting concept. It may not be easy to parse at all and unless it is very well defined. This requires working on a spec for something new.. and to be fair, there are plenty of good standard encoding options. Given only a string like the example, unless we just take it AS IS and paginate it... then parsing will be hard and it will require plenty of hardcoded "chunks", etc. These plain strings create lots of opportunities for mistakes (or even exploits) once other developers start extending the protocol beyond Cosmos. If the redesign is focused on small and/or embedded devices, then these long strings descriptions (as per the example) are actually not useful or convenient. Screens are not big enough so information needs to be split and navigated. A long string being paginated could be automatically cut off at inconvenient points, etc. If we are thinking of a redesign, personally, I would prefer a well defined flat array of key/value pairs where keys are human readable and values comes pre-formatted. How this array is provided is not really critical, however, something like CBOR works well and is easy to handle. |
The intention of SIGN_MODE_TEXTUAL is that the hardware device will not need to parse the string provided at all. The hardware device just needs to display the string, sign the string + additional data. The correctness of the string as a representation of the additional data will be enforced by the chain. |
I think we need to discuss this on an architecture call with both of you present (and other key stakeholders) with some hypothetical examples worked out. @clevinson can you put it on the schedule sometime soon? |
We just discussed this on our architecture call with @jleni and @gagbo, notes here: https://hackmd.io/CIusfoNYS2u99zBMm-8LNA It seems like it may be valuable to start with adding a protobuf JSON sign mode following https://github.com/Zondax/ledger-cosmos/blob/master/docs/TXSPEC.md as closely as possible, and this doesn't prevent us from exploring a string formatting approach later down the line. If people are aligned with the JSON approach as a starting point, I think it's relatively straightforward and I can propose something more concrete. @zmanian @alessio @alexanderbez One important consideration is that the ledger screen only supports ASCII characters 32-127... so localized text strings in other languages will be cumbersome. |
I created a proto JSON |
A design with similar goal from Eth land https://github.com/ethereum/EIPs/blob/b92a65e0473f4c1e9118c74ed069c1040721bc4e/EIPS/eip-3224.md |
In the last architecture call, we agreed that custom deterministic proto JSON serializer is essentially inevitable. It's fine if it works off of reflection. Just as a note we do have a custom reflection based protobuf deserializer for rejecting unknown fields and it isn't horribly complex. Also, clients could use an RPC endpoint if they don't have deterministic JSON as a stop gap measure. This should be reasonably secure because Closing this issue and opening another issue for implementation. |
I want to re-open this discussion. Previously @zmanian and I had been in alignment that the textual sign mode should be something that is actually human readable, i.e. format strings in the user's language. Also, we are not the only ones to propose such a thing: https://github.com/ethereum/EIPs/blob/b92a65e0473f4c1e9118c74ed069c1040721bc4e/EIPS/eip-3224.md. When we discussed with @jleni and the ledger team, we ended up deciding to just use protobuf JSON because a) that seemed simplier both on the ledger and SDK side and b) concerns about developers not writing good human readable format strings. Regarding a), it does not actually look like this is simple for the SDK, CosmJs or the ledger app. For the SDK and CosmJs (cc @webmaster128), we need to write proper canonical proto JSON serializers which adhere to a number of rules beyond the official proto3 spec. For the ledger app, 3 JSON sign modes will need to be supported with slightly different structures (legacy amino JSON, proto JSON, & proto JSON aux for #9406). With that, we will not get much better UX than we currently have. We will actually need to show users longer Msg names ( We can always do proto JSON signing first and then add textual signing later, but given that proto JSON signing is actually a lot of work for three teams, I want to invite us to consider whether we're maybe better off scraping proto JSON and just doing sign mode textual like we originally intended. My fear is that proto JSON will have felt like such an effort for so little pay off that nobody will have much enthusiasm to do the work for actual textual signing. I know @jleni has concerns that format strings will be poorly implemented by developers and this is a realistic concern, but also something that can be solved by proper QA on the part of the SDK team and teams building on the SDK. We can probably deal with most concerns with proper guidelines and automated linting in CI. For instance, we could enforce rules such as the following:
So far we have started working on the proto JSON signing, but I think we are not too far along to change course. I obviously want to get this done soon so the specification would need to happen quickly, but happy to lean in there to get something formalized ASAP. My gut feeling looking at all this is we're going to feel better about what we end up with if we spend 2-3 months on the textual signing than proto JSON, but would love to hear what others think. |
@hxrts as I mentioned today, I also see this relevant to gov UX in the sense that if we had textual descriptions of all messages as part of the protocol, that would automatically give us a human readable way to describe what a user is voting on. |
I don't think it's a requirement for the messages to be bijective. I think sign mode textual is a fantastic vision and would be a massive improvement over the status quo of custody which is terrible. I wasn't willing to fight a desire to just do a new protobuf but I am willing to support a new protobuf work |
Excellent analysis, @aaronc and I couln't agree more. One more note: we all need to be prepared to support Amino JSON much longer than originally anticipated last year. We already do the heavy lifiting in CosmJS now. So I don't see the point of rushing for something new we are not fully comfortable with. |
I agree with @aaronc concerns. Specifically regarding sdk proto JSON representation. I don't think it's that much user friendly, surely for a developer is easy to understand a JSON message, but even if we assume it's going to be outputted as a single line, it will be hard to read and understand. (Oops, I didn't see aaron proposed a similar approach!) I'm more in support of implementing sign mode textual, even if initially it will not be perfect I think it will give better user experience than jsonpb. Also since we use protobuf for everything and it's the way we share chain API specifications, we could use them to represent how message MsgSendCoins {
option (cosmos.sign_mode_textual) = {
format: "sending coins from account (1) to account (2), total: (3)",
};
string from_address = 1;
string to_address = 2;
repeated Coin amount = 3;
}
message Coin {
option (cosmos.sign_mode_textual) = {
format: "(1)(2)"
};
string amount = 1;
string denom = 2;
} If we used protobuf files, the format can be shared and be canonical because defined in pb files and easily implemented by different languages (especially those who have protoreflection, it should be super straightforward). There would still be a problem with for example |
Assuming this gets implemented in a protov2 environment, we can easily check at chain level which messages contain known types that need to be parsed in a different way (ex: Duration, Timestamp, cosmos "wellknowns") |
Thanks for commenting @zmanian and @webmaster128! And thanks for writing up those ideas @fdymylja 🙏 I think there a few different design areas to think about. I'll break them up here: Format String StorageSo my initial thought was to use a protobuf option for the textual format string. The downside I think might be if we want to have different versions of the format strings independent of the proto files. In that case, maybe having the format strings in separate files makes sense. Then we could have sub-versions, i.e. Template LanguageI've been thinking that something like mustache, might be a good fit. I spent a bit of time brainstorming what the strings might look like for different Msg types based on a mustache model: https://hackmd.io/fsZAO-TfT0CKmLDtfMcKeA?both. This approach assumes that we have some well-known types that are pre-formatted like you're suggesting @fdymylja. I'm thinking we'd start with cosmos scalars (#9694) and proto well know types, but also cover special cases like We could also take an approach where formatting needs to be explicit and use a slightly more complex templating language like https://handlebarsjs.com. Ex:
where I think the advantage of pre-formatted types is that it makes templates easier to write. The advantage of explicit helpers is that it maybe allows for more advanced versioning of helper methods. i.e. say we had a v2 format string that uses a more advanced coins format (say that makes IBC coins look nicer), v1 could still use Also, I would note that I don't think there's any particular reason to prefer a minimalistic template language like mustache over something more complex like handlebars as long as they're reasonable well defined and platform independent. |
So I've fleshed out some ideas in https://hackmd.io/fsZAO-TfT0CKmLDtfMcKeA a bit more. Here's my rough proposal on the syntax:
In terms of versioning:
Would love feedback on whether this direction makes sense. If so, I believe we need to present it to Ledger somewhat formally to see if we can get their approval. At least that's my understanding from chats with @jleni. |
To me everything sounds good, I have one question: do we plan to use an existing library/framework to parse format strings or do we plan to create our own? I'd vote to use something that exists, because creating our own template language would require some time and a lot of care and audits most likely.
I don't want to be too annoying regarding this, but I'd really like for us to explore a way in which we can make this fit inside protobuf files instead of having some other file/format to share this information with consumers. Reasons being the following:
Regarding the concern of format versioning vs protobuf versioning it can be handled in the following way: // NOTE this definition is part of cosmos.proto
extend google.protobuf.MessageOptions {
repeated SignModeTextual sign_mode_textual_formats = 1110000;
}
message SignModeTextual {
string version = 1;
string parser = 2;
string format = 3;
// blablabla
}
// This is part of bank.proto
message MsgSendCoins {
option (sign_mode_textual_formats) = {
version: "v1",
parser: "mustache",
format: "Send {amount} from {from_address} to {to_address}",
};
option (sign_mode_textual_formats) = {
version: "v2",
parser: "custom",
format: "You will send {amount} to {to_address} from your account {from_address} "
};
string from_address = 1;
string to_address = 2;
repeated Coin amount = 3;
} This changes nothing from the proposal you've made above, aside from putting this information into protobuf files, which are something that devs already understand, know how to share and handle. The only downside I can see from this: if we add formats we would most likely need to bump pb files versions too (or maybe not since message spec is not being broken?). Although I prefer to deal with versioning one thing than two, but this is just my personal preference. |
I think we can start with existing mustache libraries. We just need to instruct it to switch delimiters with a preamble I think ( Regarding versioning of value rendering, that actually should be one version per chain I think so maybe it isn't even tied to an individual message set. I think the big issue is how to coordinate between client and server versions as well specifying the version in I don't really have a strong preference as to whether format strings are in proto files or some other file (I was thinking using go embed with a dir of .yaml or .mustache files). Possibly both could be supported? One question I have is how does protobuf deal with multi-line strings in options? In general though, if we can align on how format strings and value renderers work, that's a good start and files/versioning/etc. is sort of a tricky but solvable problem. |
I'm in huge support of this proposal! I think this is an amazing idea, and going to be a huge improvement for signing UX/auditability across the entire blockchain space. I like mustache. If there is some concern around bugs w/ templating injections (which I don't have for mustache, but exists for more complex templating engines ala Jinja, Python fmt strings, C++ fmt strings, etc.), we can lint to block 'complex logic' in templates. Pretty minor point around "does this encoding need to be bijective encoding", that I think changes ~nothing: |
Bijectivity may be a bit of a strong requirement, but if we had bijectivity it would ensure both non-malleablility and losslessness. I think having a lossless encoding is actually a pretty big assurance that no important information is either intentionally or non-intentionally obfuscated from the user. I would argue that anytime there lossiness there's a chance for malleability. Anyway, bijectivity while not a requirement in itself just seems like the easiest way to check for these things in tests. |
From today's architecture call, three variant proposals (based on #6513 (comment)) came up:
Possibly 2 or 3 might provide more of a balanced approach to Ledger so that there's some confidence in the structure of the outer packets, with free-form human readable content on inner values. Obviously there are more variations that might be worth considering where we could provide 2 + 3 or full canonical JSON + 1 in the same |
## [Rendered ADR](https://github.com/cosmos/cosmos-sdk/blob/am/adr-textual/docs/architecture/adr-050-sign-mode-textual.md) ## Description Closes: #6513 Support documents: - context: #6513 - specification living doc HackmD: https://hackmd.io/7RkGfv_rQAaZzEigUYhcXw --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [ ] provided a link to the relevant issue or specification - [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [ ] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable)
For implementation, follow #11970 |
As documented in ADR 020 and discussed in #6078. Let's slate this for the 0.40 release and start a thorough process shortly after 0.39 is wrapped up.
/cc @zmanian @jleni @tarcieri
The text was updated successfully, but these errors were encountered: