Skip to content
This repository has been archived by the owner on May 4, 2024. It is now read-only.

[stdlib] add type_name module for type reflection #566

Merged
merged 1 commit into from
Oct 14, 2022

Conversation

sblackshear
Copy link
Member

Move allows type-safe implementations of key types like fungible tokens using the Coin standard. However, distinguishing coins at the type level only is sometimes limiting--e.g., you might want to write a DEX that has at most one pool for a given pair of currencies (need to compare types + reify the result in a value) or write a lending protocol that supports an extensible pool of currencies (need to reify the currency types + map them to the reserve coin supply).

This PR seeks to add the weakest form of reflection that supports use-cases like the ones above in order to keep static reasoning as simple/predictable as possible. Similar modules exist in the Starcoin and Aptos frameworks, and Sui users have also been requesting this feature.

  • Add opaque TypeName value that can be produced from a Move type via get<T>(): TypeName
  • Add functions for converting a TypeName's into its inner representation as an ASCII strings so it can be compared or inspected. This is safe because the Move binary format enforces ASCII encoding for all identifiers

One issue worth discussing is whether we want to use the full-length representation of addresses (e.g., always produce a 16, 20, or 32 byte string, depending on the platform address length) or the "short" representation which strips leading 0's (e.g., the convention 0x1 for the Move stdlib). I think the former is slightly cleaner/more predictable, but I went with the latter because it already seems to be the standard in the TypeTag implementation of Display, as well as in the Aptos/Starcoin TypeInfo module that provides similar functionality to this code.

@@ -0,0 +1,29 @@
/// Functionality for converting Move types into values. Use with care!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add license header here.

@crazybits
Copy link

crazybits commented Oct 11, 2022

is it possible to get T from the type string in running time after add this feature?

e.g.

let T = ("uint64").Type;
let v = get<T>();

debug_assert!(arguments.is_empty());

let type_tag = context.type_to_type_tag(&ty_args[0])?;
let type_name = type_tag.to_string();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the type_name should be the full address or short address, this is affected by PR #441

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is essential that this is a canonical-forever name for type safety. I'd suggest to introduce a new function to_canonical string in type_tag to make this clear over there. Then we still have to choose w/wo leading 0s. I don't mind here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very supportive of to_canonical_string here, will do!

@wrwg
Copy link
Member

wrwg commented Oct 11, 2022

is it possible to get T from the type string in running time after add this feature?

e.g.

let T = ("uint64").Type;
let v = get<T>();

This is a different feature which we may import from Aptos (called Any over there).

debug_assert!(arguments.is_empty());

let type_tag = context.type_to_type_tag(&ty_args[0])?;
let type_name = type_tag.to_string();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is essential that this is a canonical-forever name for type safety. I'd suggest to introduce a new function to_canonical string in type_tag to make this clear over there. Then we still have to choose w/wo leading 0s. I don't mind here.

Copy link
Member

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets please wait with merge, just checking some more things...

@jolestar
Copy link
Collaborator

This is a different feature which we may import from Aptos (called Any over there).

Good feature, expect import to stdlib.

@sblackshear
Copy link
Member Author

is it possible to get T from the type string in running time after add this feature?

e.g.

let T = ("uint64").Type;
let v = get<T>();

IIUC, this is actually two separate things:

  1. The ability to construct a TypeName from a string
  2. The ability to go back and forth between a TypeName value and a Move type

This PR very deliberately avoids (1), but you can have (2) without (1) (as in Any).

}

/// Return a value representation of the type `T`.
public native fun get<T>(): TypeName;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the thought behind the name get?
My worry is making it clear that it is allocating something, not possibly fetching from a cache. new might be better here from that line of thinking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasoning is that you are "extracting" or "getting" the TypeName from T rather than constructing it (e.g., from a string). Trying to evoke that these come from real types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I think should think about this more then as T::type_name() sorta, rather than TypeName::new()


/// Get the String representation of `self`
public fun borrow_string(self: &TypeName): &String {
&self.name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried about exposing the String here, but if y'all are fine with fine by me.

I think the root of my trepidation around the strings is it allows for cyclic thinking/checks with types. For example consider the difference between t == TypeName::get<0x42::m::R> vs borrow_string(&t) == &ascii::new(b"0x42::m::R"). The second can be used anywhere, where the first requires being able to link against 0x42::m.

But perhaps this cyclic style is desirable?

Copy link
Member Author

@sblackshear sblackshear Oct 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your concern about both borrow_string and into_string, or only the former (guessing both, but wanted to double-check)?

I share the concern, and I think that the safest/most idiomatic usages of TypeName's (and the ones easiest to verify with the prover) will look like your option (1). But I think hiding the string is a bit too paternalistic--if we're going to expose this form of reflection at all, I think we don't buy ourselves much by trying to limit its expressivity in this way.

As one concrete example of where exposing the string adds expressivity we already know we want: we can (I think) make is_one_time_witness an in the Sui stdlib an ordinary Move function!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As one concrete example of where exposing the string adds expressivity we already know we want: we can (I think) make is_one_time_witness an in the Sui stdlib an ordinary Move function!

We used to be able to, but not anymore sadly, with the field change needed to allow things like u256::U256

But I think hiding the string is a bit too paternalistic--if we're going to expose this form of reflection at all, I think we don't buy ourselves much by trying to limit its expressivity in this way.

I worry about that too. I'm not really that concerned about this to be clear. The main thing I wanted to highlight is that it changes the way modules can relate to each other. Before this, I'm not sure there was any way for modules to reason about their dependents (outside of friend). More importantly about their dependents' types. Or more generally, they can now have logic that refers to types that have not yet been published.

I'm not sure if people will "abuse" this feature in that way, but it is a new type of logic that is being exposed here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem in exposing a string. We have a very beautiful property specific to Move: the same 'name' (string) is unique for this type (and its layout) forever (assuming only compatible upgrades). There is no reason to hide the string. C++/Rust do this because they cannot guarantee uniqueness over different processes. We can.

@sblackshear
Copy link
Member Author

Added to_canonical_string + used it in the native function


/// Get the String representation of `self`
public fun borrow_string(self: &TypeName): &String {
&self.name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem in exposing a string. We have a very beautiful property specific to Move: the same 'name' (string) is unique for this type (and its layout) forever (assuming only compatible upgrades). There is no reason to hide the string. C++/Rust do this because they cannot guarantee uniqueness over different processes. We can.

pub fn short_str_lossless(&self) -> String {
/// Return a canonical string representation of the address
/// Addresses are 0x-prefixed, hex-encoded lowercase values with leading 0's stripped
/// e.g., 0xa, *not* 0x00000000a, 0xA, or 0x00000000A
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expect this can be config by a feature flag. Because we have used the full address in Starcoin production env, must keep capability, like #441

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I did not mean to change this--will revert to the old behavior. Thanks for flagging!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed--you should not need to change anything on the StarCoin side. Apologies again for that oversight.

@sblackshear sblackshear force-pushed the type_reflection branch 3 times, most recently from 125802b to df243ff Compare October 13, 2022 20:03
@sblackshear sblackshear enabled auto-merge (squash) October 13, 2022 20:26
Move allows type-safe implementations of key types like fungible tokens using the Coin<T> standard. However, distinguishing coins at the type level only is sometimes limiting--e.g., you might want to write a DEX that has at most one pool for a given pair of currencies (need to compare types + reify the result in a value) or write a lending protocol that supports an extensible pool of currencies (need to reify the currency types + map them to the reserve coin supply).

This PR seeks to add the weakest form of reflection that supports use-cases like the ones above in order to keep static reasoning as simple/predictable as possible. Similar modules exist in the Starcoin and Aptos frameworks, and Sui users have also been requesting this feature.

- Add opaque `TypeName` value that can be produced from a Move type via `get<T>(): TypeName`
- Add functions for converting a `TypeName`'s into its inner representation as an ASCII strings so it can be compared or inspected. This is safe because the Move binary format enforces ASCII encoding for all identifiers

One issue worth discussing is whether we want to use the full-length representation of addresses (e.g., always produce a 16, 20, or 32 byte string, depending on the platform address length) or the "short" representation which strips leading 0's (e.g., the convention `0x1` for the Move stdlib). I think the former is slightly cleaner/more predictable, but I went with the latter because it already seems to be the standard in the `TypeTag` implementation of `Display`, as well as in the Aptos/Starcoin `TypeInfo` module that provides similar functionality to this code.
@sblackshear sblackshear disabled auto-merge October 14, 2022 01:17
@sblackshear sblackshear merged commit e9b0aed into move-language:main Oct 14, 2022
wrwg pushed a commit to wrwg/move-lang that referenced this pull request Nov 5, 2022
Move allows type-safe implementations of key types like fungible tokens using the Coin<T> standard. However, distinguishing coins at the type level only is sometimes limiting--e.g., you might want to write a DEX that has at most one pool for a given pair of currencies (need to compare types + reify the result in a value) or write a lending protocol that supports an extensible pool of currencies (need to reify the currency types + map them to the reserve coin supply).

This PR seeks to add the weakest form of reflection that supports use-cases like the ones above in order to keep static reasoning as simple/predictable as possible. Similar modules exist in the Starcoin and Aptos frameworks, and Sui users have also been requesting this feature.

- Add opaque `TypeName` value that can be produced from a Move type via `get<T>(): TypeName`
- Add functions for converting a `TypeName`'s into its inner representation as an ASCII strings so it can be compared or inspected. This is safe because the Move binary format enforces ASCII encoding for all identifiers

One issue worth discussing is whether we want to use the full-length representation of addresses (e.g., always produce a 16, 20, or 32 byte string, depending on the platform address length) or the "short" representation which strips leading 0's (e.g., the convention `0x1` for the Move stdlib). I think the former is slightly cleaner/more predictable, but I went with the latter because it already seems to be the standard in the `TypeTag` implementation of `Display`, as well as in the Aptos/Starcoin `TypeInfo` module that provides similar functionality to this code.
wrwg added a commit that referenced this pull request Nov 5, 2022
* Show Gas Used In Unit Tests & Allow Client Supplied Gas Cost Table (#616)

* allow cost table in tests

* allow cost table in tests

* Clippy simplification

* docs

* [stdlib] add `type_name` module for type reflection (#566)

Move allows type-safe implementations of key types like fungible tokens using the Coin<T> standard. However, distinguishing coins at the type level only is sometimes limiting--e.g., you might want to write a DEX that has at most one pool for a given pair of currencies (need to compare types + reify the result in a value) or write a lending protocol that supports an extensible pool of currencies (need to reify the currency types + map them to the reserve coin supply).

This PR seeks to add the weakest form of reflection that supports use-cases like the ones above in order to keep static reasoning as simple/predictable as possible. Similar modules exist in the Starcoin and Aptos frameworks, and Sui users have also been requesting this feature.

- Add opaque `TypeName` value that can be produced from a Move type via `get<T>(): TypeName`
- Add functions for converting a `TypeName`'s into its inner representation as an ASCII strings so it can be compared or inspected. This is safe because the Move binary format enforces ASCII encoding for all identifiers

One issue worth discussing is whether we want to use the full-length representation of addresses (e.g., always produce a 16, 20, or 32 byte string, depending on the platform address length) or the "short" representation which strips leading 0's (e.g., the convention `0x1` for the Move stdlib). I think the former is slightly cleaner/more predictable, but I went with the latter because it already seems to be the standard in the `TypeTag` implementation of `Display`, as well as in the Aptos/Starcoin `TypeInfo` module that provides similar functionality to this code.

* [build] Added additional diagnostics for fetching deps (#643)

* [build] Added additional diagnostics for fetching deps

* Fixes to EVM code and tests

* Addressed review comments

* [Tutorial] fix a minor inconsistency with code in step_4 (#623)

* [move-cli] update description for 'move test --coverage' command (#620)

* [unit tests] Improve DevX of unit tests diagnosis (#641)

This does two improvements for UT diagnosis:

- We now capture the stack trace (if according feature flags are enabled) for every interpreter error, not just aborts. The restriction to just aborts seem to be unnecessary.
- We now treat execution failures (arithmetic errors, borrow errors, index out of bounds) as regular errors, not longer as internal VM errors.

Effectively, were we had an output like this before:

```
│ ITE: An unknown error was reported. Location: error[E11001]: test failure
│   ┌─ missing_data.move:6:9
│   │
│ 5 │     fun missing_data() acquires Missing {
│   │         ------------ In this function in 0x1::MissingData
│ 6 │         borrow_global<Missing>(@0x0);
│   │         ^^^^^^^^^^^^^
│
│
│ VMError (if there is one): VMError {
│     major_status: MISSING_DATA,
│     sub_status: None,
│     message: None,
│     exec_state: None,
│     location: Module(
│         ModuleId {
│             address: 00000000000000000000000000000001,
│             name: Identifier(
│                 "MissingData",
│             ),
│         },
│     ),
│     indices: [],
│     offsets: [
│         (
│             FunctionDefinitionIndex(0),
│             1,
│         ),
│     ],
│ }
```

... we now have:

```
│ error[E11001]: test failure
│   ┌─ missing_data.move:6:9
│   │
│ 5 │     fun missing_data() acquires Missing {
│   │         ------------ In this function in 0x1::MissingData
│ 6 │         borrow_global<Missing>(@0x0);
│   │         ^^^^^^^^^^^^^ Execution failure MISSING_DATA
│
│
│ stack trace
│ 	MissingData::missing_data_from_other_function(tests/test_sources/missing_data.move:12)
│
```

Co-authored-by: oxade <93547199+oxade@users.noreply.github.com>
Co-authored-by: Sam Blackshear <sam@mystenlabs.com>
Co-authored-by: Adam Welc <adam@mystenlabs.com>
Co-authored-by: wp-lai <techwplai@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants