-
Notifications
You must be signed in to change notification settings - Fork 463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Declarative RLP Encoding/Decoding #7975
base: master
Are you sure you want to change the base?
Conversation
- Test for Set Theoretical Representation
- Extend instances
- Good things happen to those who respect symmetry.
- Code works, but compiler complains
- Primitives are integers, byte sequences, and lists - No need for `IntRlpConverter` (covered by primitives)
- Replace virtual calls with conditional and inner state flag - Refactor call sites
- Rename to `FastRlp` to avoid conflicts
I've added a benchmark that encodes and decodes an
Results on my machine are the following:
There is room for a possible optimization: some records like |
- Nice speedup
Replacing
|
var size = sizeof(T); | ||
Span<byte> bigEndian = stackalloc byte[size]; | ||
value.WriteBigEndian(bigEndian); | ||
bigEndian = bigEndian.TrimStart((byte)0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TrimStart
does not seem to be heavily optimized. There might be something better that we can use, specially considering that we're removing leading zeros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like .net doesn't use SIMD for it!
We could write Vector based way to find start index.
@benaadams
2x slower. Quite a lot. Can you add the ASM diagnoser and memory diagnoser? Would be nice to compare it more. |
After running some benchmarks I found that
|
var decoder = Eip2930.AccessListDecoder.Instance; | ||
|
||
var length = decoder.GetLength(_current, RlpBehaviors.None); | ||
var stream = new RlpStream(length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is responsible for all the allocations as it allocates a new array underneath
_data = new byte[length]; |
RlpStream
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be comparable with the FluentRlp
approach we should allocate a new buffer. Since both will allocate the same buffer size then it should not matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. If they allocate the same buffer, what makes the current allocate over 400kb more then? Is it the different return type Nethermind.Core.Eip2930.AccessList
vs AccessList
in the new one or something else? With 400kb more the current will be greatly penalized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly they're not allocating the same buffer size: the fluent approach uses a buffer of 170850
bytes while the current one uses 172845
. That does not account for the 400kb you mention though. Rider's profiler is not giving me anything useful so I'm kind of stuck now.
Maybe we should add other objects (ex. LogEntry
, BlockInfo
, etc.) to get more accurate benchmarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use NettyRlpStream that uses arena memory
dcd6d85
to
6dc21f4
Compare
- Return value is now `ReadOnlyMemory<byte>` - Add overloads for reading `ReadOnlyMemory<byte>` - Add `FluentAssertions` extensions
…ature/declarative-rlp # Conflicts: # src/Nethermind/Nethermind.Serialization.FluentRlp/Rlp.cs
…ature/declarative-rlp # Conflicts: # src/Nethermind/Nethermind.Serialization.FluentRlp/Rlp.cs
…ature/declarative-rlp
/// <param name="capacity">The capacity of the underlying buffer.</param> | ||
public FixedArrayBufferWriter(int capacity) | ||
{ | ||
_buffer = new T[capacity]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ArrayPool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default I think we should go with a plain array to match the ArrayBufferWriter
behavior. At the end of the day, the RLP
static class is like a "safe default" API.
If we want more control over the buffers we use we can write a custom IBufferWriter
as you suggested earlier while using RlpReader
and RlpWriter
directly.
Changes
Types of changes
What types of changes does your code introduce?
Testing
Requires testing
If yes, did you write tests?
Notes on testing
The core library has 100% test coverage. Source generated code might not be fully covered.
Documentation
Requires documentation update
Requires explanation in Release Notes
Remarks
When we started working on refactoring our
TxDecoder
one thing that came up was how unergonomic is to work with our current RLP API. We even have some comments on the code itself mentioning these difficulties, for example:nethermind/src/Nethermind/Nethermind.Serialization.Rlp/Eip2930/AccessListDecoder.cs
Lines 17 to 23 in b81070d
nethermind/src/Nethermind/Nethermind.Serialization.Rlp/Eip2930/AccessListDecoder.cs
Lines 76 to 81 in b81070d
This PR introduces a new RLP API based on #7334 (comment) with several improvements:
virtual
oroverride
). Interfaces are only used to enforce implementations.static
). You can still use them if you want to, but overloads are provided to avoid them.