You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Oct 31, 2024. It is now read-only.
I'm opening this issue to summarise what IMO are the problems with the current design of some libraries and to propose an alternative.
Problems
From my perspective, there are three interconnected problems in the libraries that represent protocol objects (i.e. Account, Block and Transaction):
They are based on ehereumjs-util's defineProperties. This function is based on super dynamic aspects of javascript, making the libs hard to adapt to typscript, hard to understand, and hard to maintain. This has been discussed during multiple typescript migrations, most notably during ethereumjs-account's one.
They are designed around the RLP representation of each object. While this isn't necessarily a bad thing, these libraries are also used by smart contracts and dapps developers, that don't have enough understanding of the protocol and have trouble understanding some things. A clear example of this is that ethereumjs-tx periodically receives false bug reports about leading zeros being stripped. This is the expected behavior when working with the RLP representation of some of the transactions fields, but totally unexpected for someone using it for another purpose.
All of them are mutable. This is an issue, as there are complex requirements/dynamics between them, and being able to mutate each object individually opens the door to getting objects out of sync. An example of this is parsing a block, which will construct a Block object and a series of Transaction objects, and then changing the block number to 0. This would/should render any EIP155-signed transaction invalid, but it's not clear how this can be implemented with the current design. What will probably happen now is that the Block and the Transactions will have incompatible Common instances.
Proposal
I think (1) can be solved without breaking changes by inlining part of defineProperties logic in each library. I implemented a prototype of a similar idea on ethereumjs-tx in this PR. While I consider this an improvement, it will mostly increase the maintainability of the libraries. This change wouldn't make any difference with (2) and (3), which are the points that can lead to confusion and bugs in the users' code.
In my opinion, to effectively fix (3) without an intricate design that may not be enough in the future, we should prohibit mutating the domain objects. This means making Account, Transaction and Block immutable. And the same should be done for Common. If they are immutable, everything can be validated on construction, and objects can't get out of sync.
Fixing (2) needs a deeper redesign of the libraries. RLP should be treated just as a serialization strategy, and everything should be modeled with their actual data types instead of everything being a Buffer.
I think a good approach to design them would be to aknowledge that there are multiple representations of these objects, implement everything as close to the actual domain as possible, and make it easy to go from one representation to the others. At the moment, I think these should be:
Domain objects: All the fields have their actual data type. If something is a number, for example, it should be represented as a bigint or BN.js. These should be normal classes, everything dealing with how to get from other representation to this one should be placed in a factory method.
Json objects: this is pretty self-explanatory, with a domain object you should be able to construct a json object, where everything is represented as string.
RLP representation: A factory method should be able to construct a domain object from its RLP serialization, and domain objects should be RLP serializable.
Plain javascript objects: The most common way to represent things like transactions in ethereum today are plain js objects, most of the times even object literals. To adapt to this reality, there should be factory methods that turn these representations into a domain object.
Example implementation
I sketched how Transaction would look if reimplemented with this design
// I just use this class to indicate that the Common object should be immutableclassImmutableCommon{}// These functions and types should go to a shared library, -util?functionbigIntToRlp(bn: bigint): Buffer{returnBuffer.from([]);}classAddress{publicserialize(): Buffer{returnBuffer.from([]);}}exportinterfaceTransformableToBuffer{toBuffer(): Buffer;}exporttypeBufferLike=Buffer|TransformableToBuffer|string|number;exporttypeAddressLike=string|Address;exporttypeBigIntLike=string|bigint|number;exportfunctionaddresLikeToAddress(addr: AddressLike): Address{returnnewAddress();}exportfunctionbufferLikeToBuffer(bl: BufferLike): Buffer{returnBuffer.from([]);}exportfunctionbigIntLikeToBigInt(bil: BigIntLike): bigint{return1n;}exportfunctionbigIntToHex(bi: bigint): string{return"0x"+bi.toString(16);}// The -tx proposal starts here.exportinterfaceTxData{gasLimit?: BigIntLike;gasPrice?: BigIntLike;to?: AddressLike;nonce?: BigIntLike;data?: BufferLike;v?: BigIntLike;r?: BigIntLike;s?: BigIntLike;value?: BigIntLike;}exportinterfaceJsonTx{gasLimit?: string;gasPrice?: string;to?: string;nonce?: string;data?: string;v?: string;r?: string;s?: string;value?: string;}classTransaction{// **This is the "entry point" for people working with objects at the protocol// level.** It takes a serialized tx as seen in the protocol, and returns an// immutable domain object.publicstaticdeserialize(rlpSerializedTx: Buffer,common: ImmutableCommon): Transaction{// parse the rlp bufferreturnnewTransaction(/* ... */);}// **This is the "entry point" for people working with the RPC or other higher// level applications.** While the domain object is already pretty high level,// having a way to go from bare javascript objects into a domain object is// super useful.//// This function would probably be the most commonly used.//// This function handles default values.publicstaticfromTxData(txData: TxData,common: ImmutableCommon){returnnewTransaction(common,txData.nonce!==undefined ? bigIntLikeToBigInt(txData.nonce) : 0n,txData.gasLimit!==undefined
? bigIntLikeToBigInt(txData.gasLimit)
: 9000n,txData.gasPrice!==undefined
? bigIntLikeToBigInt(txData.gasPrice)
: 8000000000n,txData.to!==undefined ? addresLikeToAddress(txData.to) : undefined,txData.value!==undefined ? bigIntLikeToBigInt(txData.value) : 0n,txData.data!==undefined
? bufferLikeToBuffer(txData.data)
: Buffer.from([]),txData.v!==undefined ? bigIntLikeToBigInt(txData.v) : undefined,txData.r!==undefined ? bigIntLikeToBigInt(txData.r) : undefined,txData.s!==undefined ? bigIntLikeToBigInt(txData.s) : undefined);}// Everything is readonly.publicreadonlycommon: ImmutableCommon;publicreadonlynonce: bigint;publicreadonlygasLimit: bigint;publicreadonlygasPrice: bigint;publicreadonlyto?: Address;publicreadonlyvalue: bigint;publicreadonlydata: Buffer;publicreadonlyv?: bigint;publicreadonlyr?: bigint;publicreadonlys?: bigint;// This constructor just takes the values, validates them, assigns them and// freezes the object.constructor(common: ImmutableCommon,nonce: bigint,gasLimit: bigint,gasPrice: bigint,to: Address|undefined,value: bigint,data: Buffer,v?: bigint,r?: bigint,s?: bigint){// TODO: Validate everythingthis.common=common;this.nonce=nonce;this.gasLimit=gasLimit;this.gasPrice=gasPrice;this.to=to;this.value=value;this.data=data;this.v=v;this.r=r;this.s=s;Object.freeze(this);}// This function is the way to go from a domain object to the protocol// representation of a transaction.//// Things that were previously handled by `defineProperties` are handled here.// For example, stripping the leading zeros of numeric values.serialize(): Buffer{returnrlpencode([bigIntToRlp(this.nonce),bigIntToRlp(this.gasLimit),bigIntToRlp(this.gasPrice),this.to!==undefined ? this.to.serialize() : Buffer.from([]),bigIntToRlp(this.value),this.data,this.v!==undefined ? bigIntToRlp(this.v) : Buffer.from([]),this.r!==undefined ? bigIntToRlp(this.r) : Buffer.from([]),this.s!==undefined ? bigIntToRlp(this.s) : Buffer.from([])]);}// Methods that previously altered the object, now return a new one.sign(privateKey: Buffer): Transaction{constdata=this.getDataToSign();constsignature=sign(data,privateKey);returnnewTransaction(this.common,this.nonce,this.gasLimit,this.gasPrice,this.to,this.value,this.data,signature.v,signature.r,signature.s);}toJson(): JsonTx{return{nonce: bigIntToHex(this.nonce)// ...};}}// Sample usage:constcommon=newImmutableCommon("mainnet");consttx=Transaction.fromTxData({to: "0x123123123123213123123123123",value: 123123},common);// Modifying a txconsttxNonce123=Transaction.fromTxData({
...tx,nonce: 123},tx.common// Having to pass common here seems error-prone);// Signing a txconstpk=Buffer.from([]);constsignedTx=tx.sign(pk);
Unknowns
There are two things that are hard to answer about this proposal:
How do we go from the current design to this one incrementally? It'd be a major implementation and coordination effort.
How would implementing everything with types more abstract than Buffer impact on the project's performance? I have a strong belief that it wouldn't be much of an impact, as we are copying buffers and transforming them from/to strings super often now.
Previous discussions
Finally, here's a list of interesting links where these things have been discussed:
Previous discussions about the limitations of the design:
Hi Patricio, thanks for this, I'm not completely through with reading but just to let you know that this won't get lost, generally super valuable to have such broadly-ontaking and deep reaching strategic analysis.
I'm opening this issue to summarise what IMO are the problems with the current design of some libraries and to propose an alternative.
Problems
From my perspective, there are three interconnected problems in the libraries that represent protocol objects (i.e. Account, Block and Transaction):
They are based on
ehereumjs-util
'sdefineProperties
. This function is based on super dynamic aspects of javascript, making the libs hard to adapt to typscript, hard to understand, and hard to maintain. This has been discussed during multiple typescript migrations, most notably duringethereumjs-account
's one.They are designed around the RLP representation of each object. While this isn't necessarily a bad thing, these libraries are also used by smart contracts and dapps developers, that don't have enough understanding of the protocol and have trouble understanding some things. A clear example of this is that
ethereumjs-tx
periodically receives false bug reports about leading zeros being stripped. This is the expected behavior when working with the RLP representation of some of the transactions fields, but totally unexpected for someone using it for another purpose.All of them are mutable. This is an issue, as there are complex requirements/dynamics between them, and being able to mutate each object individually opens the door to getting objects out of sync. An example of this is parsing a block, which will construct a
Block
object and a series ofTransaction
objects, and then changing the block number to 0. This would/should render any EIP155-signed transaction invalid, but it's not clear how this can be implemented with the current design. What will probably happen now is that theBlock
and theTransaction
s will have incompatibleCommon
instances.Proposal
I think (1) can be solved without breaking changes by inlining part of
defineProperties
logic in each library. I implemented a prototype of a similar idea onethereumjs-tx
in this PR. While I consider this an improvement, it will mostly increase the maintainability of the libraries. This change wouldn't make any difference with (2) and (3), which are the points that can lead to confusion and bugs in the users' code.In my opinion, to effectively fix (3) without an intricate design that may not be enough in the future, we should prohibit mutating the domain objects. This means making
Account
,Transaction
andBlock
immutable. And the same should be done forCommon
. If they are immutable, everything can be validated on construction, and objects can't get out of sync.Fixing (2) needs a deeper redesign of the libraries. RLP should be treated just as a serialization strategy, and everything should be modeled with their actual data types instead of everything being a
Buffer
.I think a good approach to design them would be to aknowledge that there are multiple representations of these objects, implement everything as close to the actual domain as possible, and make it easy to go from one representation to the others. At the moment, I think these should be:
Domain objects: All the fields have their actual data type. If something is a number, for example, it should be represented as a
bigint
orBN.js
. These should be normal classes, everything dealing with how to get from other representation to this one should be placed in a factory method.Json objects: this is pretty self-explanatory, with a domain object you should be able to construct a json object, where everything is represented as string.
RLP representation: A factory method should be able to construct a domain object from its RLP serialization, and domain objects should be RLP serializable.
Plain javascript objects: The most common way to represent things like transactions in ethereum today are plain js objects, most of the times even object literals. To adapt to this reality, there should be factory methods that turn these representations into a domain object.
Example implementation
I sketched how
Transaction
would look if reimplemented with this designUnknowns
There are two things that are hard to answer about this proposal:
How do we go from the current design to this one incrementally? It'd be a major implementation and coordination effort.
How would implementing everything with types more abstract than
Buffer
impact on the project's performance? I have a strong belief that it wouldn't be much of an impact, as we are copying buffers and transforming them from/to strings super often now.Previous discussions
Finally, here's a list of interesting links where these things have been discussed:
Previous discussions about the limitations of the design:
ethereumjs/ethereumjs-block#69 (comment)
https://github.com/ethereumjs/ethereumjs-account/issues/29
https://github.com/ethereumjs/ethereumjs-tx/issues/151
ethereumjs/ethereumjs-tx#154
Common "bugs" that users report because of the libraries using RLP-based representation.
https://github.com/ethereumjs/ethereumjs-tx/issues/74#issuecomment-318066638
https://github.com/ethereumjs/ethereumjs-tx/issues/112
https://github.com/ethereumjs/ethereumjs-tx/issues/140
https://github.com/ethereumjs/ethereumjs-tx/issues/164
https://github.com/ethereumjs/ethereumjs-util/issues/141
The text was updated successfully, but these errors were encountered: