-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal on integer field size adjustments and clever way to shrink value field #273
Comments
Oh, an addendum. I realize that some parts of the Spec look at the Currency ID. For instance, when you post a property for sale and specify what currency you want. This can be taken advantage of by specifying the Currency ID to be 1000, which the software will interpret as 1001 to 1009 which maps to all the Bitcoin currency ID's (in my example table above). If they list it as 1010, it would map to 1011 to 1019 which is all the USD entries (in my example table above), etc. You just have to assign currencies blocks of 10 integers. The 0 integer is like the broadcast address to the currency assigned to that 10 integer space. This allows for 6553 currency types if the Currency ID field is 2 bytes in size. 6543 if you start at 100 to makes the old currency types backwards compatible. |
An alternative is using a variable-length integer algorithm like http://en.wikipedia.org/wiki/LEB128 and then just null terminating the field. This lets you not have to have a table of currency ID's mapped to byte sizes and you can just specify the currency ID and put in the value which is compressed down and then add one byte for the NULL. Most values would still shrink from the 8 bytes they are now down to 4 or 5 bytes. |
Yay, nice input, thanks! You mention a point where I see great potential for improvement as well. In general I see two routes: define a new type for each action or use sub-action. Here is probably the best (worst) example for the second route:
Because there is a constraint of only one active (Bitcoin) offer per "address", all fields marked with X could be removed entirely. But that aside, let's quickly modify the transaction: Example 1:
Example 2:
In this case the action fields determines the structure of the rest of the transaction, but it would also be possible to define an explicit transaction type for each action, for example:
I would be surprised, if we'd run out of numbers, even if we'd use only one byte to encode both version as well as transaction type. A special value could be reserved to indicate a transition to a greater range, similar to what you proposed by merging currency identifiers with an "amount length indicator" - which could be either combined into one field like
If I'm not mistaken, there would be no need for null-terminating the number due to how LEB128 is constructed: the last byte is always greater or equal than 128 and all bytes before are less than 128. |
I agree we'll probably never have more than 65k REAL properties. unfortunately, since creating a property is very cheap, it's not very hard to see people burning through two bytes just messing around. 4 bytes on the other hand is much harder to burn through. I've tried to not use the ecosystem flag any place the ecosystem could be reliably inferred from a currency ID. Did you find an exception? If so, we should fix that. I like the variable-length integer idea, although I wonder how much range we give up . . . . |
Actually none. It would - in the worst case - result in fields that are longer than in pure form, but without giving up range. I played a bit with LEB128. It has an overhead of 1 bit per 1 byte, that means 4 byte no longer hold 32 bit, but only 28, ... This allows the identify the end of a varlength encoded field, even if it's not fully padded - sort of similar to the trick of using 0x00 to terminate strings. A version of 2 would no longer be encoded as 0x0002, but simply as 0x02. A number of coins field with a value of 321 would be reduced from 0x0000000000000141 to an encoded form of 0xc102 (if I'm not mistaken). Let me phrase it differently: I believe this would be a very, very significant improvement and could save lots of space without any significant downside. The wikipedia article isn't bad for some information about the encoding. I already see some further tweaks here and there - for example we may seperate the ecosystem flag from currency identifiers or using an actual 1 bit wide flag instead of a full byte etc. Now the really tricky question: how could such a transition done best, if at all in the near future? |
I'm tentatively in favor of a change like this IF there are suitable I well-remember the thread about version changes, where we decided that new On Mon, Oct 13, 2014 at 8:53 PM, dexX7 notifications@github.com wrote:
|
I'm assuming that the field sizes are static, other than the null-terminated string fields.
Just for reference with some common values:
1 byte would allow for unsigned integer values from 0 to 255
2 bytes would allow for unsigned integer values from 0 to 65535
3 bytes would allow for unsigned integer values from 0 to 16777215
4 bytes would allow for unsigned integer values from 0 to 4294967295
5 bytes would allow for unsigned integer values from 0 to 1099511627775
6 bytes would allow for unsigned integer values from 0 to 281474976710655
The specification has a few fields that I think might be optimized.
Ecosystem
The Ecosystem field seems wasteful since it's more of just a boolean value that's using a whole byte. Why not just pay attention to the Currency Identifier field? If the Currency Identifier field is set to 2 (Test MSC) then you know the ecosystem is the Mastercoin Testnet. That lets you drop the entire Ecosystem field. 1 byte can be shaved off.
Currency Identifier
The currency identifier field is set to be a 32-bit unsigned integer that takes 4 bytes. I know it's possibly treading on "640k should be enough for everybody" territory, but is Mastercoin anticipating more than 65535 currency types? 2 bytes could be shaved off.
Transaction Types and Transaction Version
I can understand that you want to be able to support a multitude of transaction types. 2 bytes allows for 65535 transaction types in the future, which is a bit excessive. Then you add the field for Transaction Version with another 65535 types in the future. This is a bit excessive. Why not just use the Transaction Type field and everytime there is a change in the spec for a transaction type just add another type ID rather than increment the version number?
Or perhaps, if you want to keep things cleaner, use 1 byte to store the transaction type (255 possible types) with 1 byte for the transaction version (255 possible versions) and if the transaction version ever gets to 255, then create a new transaction type?
Both proposals would shave off 2 bytes.
Possibly crazy idea but kinda neat way to make the Value integer field a variable length
The size of the Number of Coins field could be shaved down if each coin had multiple listings in currency type. This would allow the system to determine how many bytes is needed for the number of coins field during reading or writing the data. For instance:
So, if someone wanted to sell a Bible for 0.3 BTC, the system would convert that to satoshi's and see that it's 30,000,000 satoshi, which fits in Currency ID #1004. When it puts together the data payload to write into the blockchain, it'll make sure to use a 4 byte field and set the two byte Currency ID field to 1004.
When someone reads the mastercoin transaction, they'll look at the Currency ID and see that it's 1004 and know to expect the Value field to be 4 bytes long. They don't even need to look it up in a table, you can use the last integer of the Currency ID to see how many bytes to expect from the Value field!
Currently the currency ID and value takes a total of 12 bytes of space. For a sale with a value set to 0.05 BTC, the currency ID and Value field will just take 6 Bytes! For a 6 BTC transaction, it'll be a total of 7 bytes!
Conclusion
The Ecosystem, Currency ID, Transaction Type, and Value fields currently take 17 bytes. If these proposals are accepted, that can drop down anywhere from 5 to 10 bytes depending on the value listed in the transaction. That's a 7 to 12 byte savings!
Forgive me if any of my calculations aren't quite right, this is more of a theoretical proposal. Someone should bust out a calculator and double check everything :-)
The text was updated successfully, but these errors were encountered: