Represent small values as single bytes #4929

IGI-111 · 2023-08-09T10:52:00Z

Description

This change leverages the SB/LB instructions to change the memory representation of all small enough values to make them fit in a single byte instead of a full word.

The type size and passing calculations have been changed to align elements of structs and enums to full words.

Structs and data section entries are filled with right-padding to align their elements to words. Enums are still left-padded. The Data section generation has been refactored to allow for these two padding modes.

Arrays and slices contain no inner padding, so byte sequences are now properly consecutive and packed. As a whole, though, they may be right-padded in certain circumstances to maintain word alignment.
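The sizing rules described above can be sketched with a toy model. This is only an illustration under stated assumptions: the `Ty` enum and the `size_in_bytes` and `round_up_to_word` functions below are hypothetical names invented for this sketch and are not the compiler's actual types or API.

```rust
/// Size of a machine word in bytes (the PR aligns struct elements to full words).
const WORD: usize = 8;

/// A deliberately tiny, hypothetical model of types, for illustration only.
enum Ty {
    U8,
    U64,
    Array(Box<Ty>, usize),
    Struct(Vec<Ty>),
}

/// Rounds a byte count up to the next multiple of a word.
fn round_up_to_word(bytes: usize) -> usize {
    (bytes + WORD - 1) / WORD * WORD
}

/// Size of a value in bytes under the rules sketched in the description.
fn size_in_bytes(ty: &Ty) -> usize {
    match ty {
        // Small values occupy a single byte.
        Ty::U8 => 1,
        Ty::U64 => WORD,
        // Arrays are packed: no padding between elements.
        Ty::Array(elem, len) => size_in_bytes(elem) * len,
        // Every struct field is padded out to a word boundary.
        Ty::Struct(fields) => fields
            .iter()
            .map(|field| round_up_to_word(size_in_bytes(field)))
            .sum(),
    }
}
```

In this model a `[u8; 3]` on its own takes 3 bytes, while the same array used as a struct field is padded to a full word.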

Direct usages of LW/SW have been changed to LB/SB where appropriate.

The LWDataId virtual instruction has been changed to LoadDataId to better represent the fact that it can load both word and byte sized values.

Checklist

I have linked to any relevant issues.
I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation where relevant (API docs, the reference, and the Sway book).
I have added tests that prove my fix is effective or that my feature works.
I have added (or requested a maintainer to add) the necessary Breaking* or New Feature labels where relevant.
I have done my best to ensure that my PR adheres to the Fuel Labs Code Review Standards.
I have requested a review from the relevant team or maintainers.

xunilrj · 2023-10-12T07:53:29Z

The most important changes made by this PR:

1 - u8, bool and unit are 1 byte now. That means that [u8; 3], for example, is 3 bytes.
2 - "Virtual" loads are now realized as lb or lw, depending on the case.
3 - When a 1-byte value is inside a struct it is left aligned (padding on the right). For example:

struct A { a: u8, b: u64 }
let a = A { a: 1, b: 2 } // memory: 10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000010

4 - Type checking/inference on arrays changed a little bit. Now the first item of the array does not necessarily start as Unknown, but with the array element type, if it is known. This fixes a problem with let a: [u8; 3] = [1, 2, 3], where the type checking would settle on [Numeric; 3], fall back to [u64; 3] later, and then complain that the types do not match.

5 - StorageVec offset was calculated in words. This causes problems with types smaller than 8 bytes, so we now add padding in these cases.

All the fuels-rs changes are here: FuelLabs/fuels-rs#1163

sway-core/src/asm_generation/fuel/fuel_asm_builder.rs

ironcev

Good job! I have a feeling finding all the places in code that need to be touched by this change wasn't trivial at all.

Speaking of which, my wish is to see #5227 implemented soon 😉

And afterwards same support for u16 and u32 😄

(And afterwards shuffling struct types to get optimized data layout 😄)

ironcev · 2023-11-02T18:00:30Z

@xunilrj Well, after pulling the latest change it looks to me that we were doing the same thing in parallel 😄 I was also onto removing all the size_bytes_round_up_to_word_alignment calls and packing the things while working on #5227 on top of this PR. I'll wait until this PR is merged, because as it looks now, a lot of clean up will already be done here which is great. Forget it, just got confused when switching between the branches 😄 But the question below still holds.

One question. In sway-core/src/asm_generation/fuel/functions.rs, line 799, this is how var_size_words is calculated for string arrays:

TypeContent::StringArray(n) => size_bytes_round_up_to_word_alignment!(n),

Is this the right formula for getting var_size_words for string arrays? The way I understand it, for the string "123" n will be 3 and we will end up with 8 words.
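The unit mismatch behind this question can be sketched as follows. This assumes, as its name suggests, that size_bytes_round_up_to_word_alignment yields a size in bytes; the two functions below are hypothetical stand-ins written for illustration and are not the macro's actual definition.

```rust
const WORD_SIZE_IN_BYTES: u64 = 8;

/// Rounds a byte count up to the next multiple of the word size.
/// Note that the result is still expressed in bytes.
fn round_up_to_word_alignment_in_bytes(bytes: u64) -> u64 {
    (bytes + WORD_SIZE_IN_BYTES - 1) / WORD_SIZE_IN_BYTES * WORD_SIZE_IN_BYTES
}

/// Number of whole words needed to hold `bytes` bytes.
fn size_in_words(bytes: u64) -> u64 {
    round_up_to_word_alignment_in_bytes(bytes) / WORD_SIZE_IN_BYTES
}
```

For a three-character string the rounded size is 8 bytes, which is a single word. Assigning the byte count directly to a variable that counts words would therefore reserve eight times too much.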

xunilrj · 2023-11-02T18:33:03Z

> Is this the right formula for getting var_size_words for string arrays? The way I understand it, for the string "123" n will be 3 and we will end up with 8 words.

That is a nice question. I think this is wrong. But we have older tests expecting string arrays to be padded. My idea was to fix this as part of the layout issue you are working on at the moment.

ironcev · 2023-11-02T18:51:01Z

Ok I'll then take a closer look at the padding logic for string arrays and see if something needs to be changed.

ironcev · 2023-11-04T14:22:09Z

Storage definitely makes heavy assumptions on memory layout.
I think I stumbled upon a slight issue in storage.rs line 229.

let value_size_in_words = ir_type_size_in_bytes(context, ty) / 8;
let constant_size_in_words = ir_type_size_in_bytes(context, &constant.ty) / 8;

This code has assumed that the raw type sizes are word aligned. Our new u8, bool, and unit break this assumption.
If I interpret the code that handles unions/enums correctly, then if we had e.g.

enum E {
    A: u8,
    B: u64,
}

for variant A, the value_size_in_words would be 1 and the constant_size_in_words zero, and the remaining code would emit an unnecessary word of padding with the word of u8 data on top of it.

I am fixing this on the go and extending the tests while refactoring access to memory layout information.
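The effect of the truncating division can be modelled roughly like this. The sketch follows the reading given in the comment above (pad by the size difference, then emit the data). The `words_emitted`, `truncating`, and `ceiling` functions are hypothetical and do not reproduce the actual code in storage.rs.

```rust
/// Size in words using truncating division, as in the quoted snippet.
fn truncating(bytes: u64) -> u64 {
    bytes / 8
}

/// Size in words using ceiling division, so a 1-byte value still needs one word.
fn ceiling(bytes: u64) -> u64 {
    (bytes + 7) / 8
}

/// Hypothetical model: words emitted when a constant of `constant_bytes` is
/// serialized into a slot sized for `value_bytes`.
fn words_emitted(value_bytes: u64, constant_bytes: u64, size_in_words: fn(u64) -> u64) -> u64 {
    let value_words = size_in_words(value_bytes);
    let constant_words = size_in_words(constant_bytes);
    // Padding fills the difference between the slot and the constant.
    let padding_words = value_words - constant_words;
    // Assumption: the serialized data always occupies at least one word.
    let data_words = constant_words.max(1);
    padding_words + data_words
}
```

With truncating division a one-byte variant in a one-word union yields two words in this model, one more than needed; with ceiling division it yields one.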

xunilrj · 2023-11-04T15:47:30Z

> for variant A, the value_size_in_words would be 1 and the constant_size_in_words zero

Yes, but I don't remember a place that deals with variant size alone. ir_type_size_in_bytes for enums returns 8 bytes (tag) plus max(variant size), where the variant is right-aligned.

enum E {
    A: u8,    // 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 AA
    B: u64,  // 00 00 00 00 00 00 00 01 BB BB BB BB BB BB BB BB
}
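The byte layout in the comments above can be reproduced with a small sketch. It assumes an 8-byte big-endian tag followed by one word of payload; both function names are hypothetical and exist only for this illustration.

```rust
/// Lays out a one-byte variant: an 8-byte big-endian tag, then the payload
/// right-aligned (left-padded with zeros) within one word.
fn layout_u8_variant(tag: u64, payload: u8) -> [u8; 16] {
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&tag.to_be_bytes());
    bytes[15] = payload;
    bytes
}

/// Lays out a full-word variant: an 8-byte big-endian tag, then the payload word.
fn layout_u64_variant(tag: u64, payload: u64) -> [u8; 16] {
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&tag.to_be_bytes());
    bytes[8..].copy_from_slice(&payload.to_be_bytes());
    bytes
}
```

Calling `layout_u8_variant(0, 0xAA)` gives fifteen zero bytes followed by `AA`, matching the first comment line; `layout_u64_variant(1, 0xBBBB_BBBB_BBBB_BBBB)` matches the second.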

ironcev · 2023-11-04T17:39:53Z

> Yes, but I don't remember a place that deals with variant size alone. ir_type_size_in_bytes for enums returns 8 bytes (tag) plus max(variant size), where the variant is right-aligned.

It looks to me that storage::serialize_to_word might be such a place. Enums are structs with two fields, one for the tag and one (union) for the variant. The size_in_bytes doesn't have special handling for enums (and I would leave it so), and for union, it will return the word aligned size of the biggest variant without tag included.

So I assume that serialize_to_word, when it encounters an enum, it will first serialize the tag when serializing struct, and then recursively proceed with serializing the union part, but without the tag.

Unfortunately, I have issues locally running the tests that require contract deployment, so I cannot immediately test the described case. On the build machine the changed test did fail, but with an error saying that the contract cannot be found?! I'll let you know about the test outcome once I manage to run it.

ironcev · 2023-11-05T21:52:47Z

In the end, there were two issues in serializing constants into storage slots.

One was the one explained above. In the case of enums with one-byte variants mixed with (multi-)word variants, the one-byte variants were not properly padded (there was an additional shift to the right, as commented above).

~~The second issue was not only in enums but in general. u8s and bools were not properly converted to BE u64. They were positioned as the most significant byte instead as least significant.~~

Interestingly, the existing tests didn't catch the issue, because the storage access test is actually not calling the contract methods, only one, and the basic storage test that is very detailed didn't have constants in the storage statement that would reproduce the issue.

Note that the issue is only in initializing the storage via constants. Reading and writing from and into the storage works fine afterwards. That's why the existing tests pass.

I've pushed the fix together with the test suite.

@xunilrj Can you please triple-check the logic? 😄

Also, the existing storage tests should be extended to check structs, but I've already started doing that in the refactoring of the access to the memory layout information. So I propose to do it there and leave the existing storage tests in this PR as they are.

ironcev · 2023-11-06T08:10:04Z

As it goes, fixing one bug introduced another 😄 It turned out we do have testing of storage initialization within the SDK tests (which I, shame on me, have never run locally so far). These tests also need to be extended to cover one-byte values in enums.

Long story short, it looks like we will need logic here similar to what we have in initializing the data section, where the outer context (struct or enum) decides on the padding (left or right) of the single-byte values. It has nothing to do with endianness, as wrongly pointed out above.

Let me now triple-check that changed logic on my own before pushing 😄, extend the tests etc. and provide the fix.
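The idea that the outer context picks the padding side can be sketched as below. The `Padding` enum and `pad_byte_to_word` function are hypothetical names for this illustration and are not the data section or storage code from the PR.

```rust
/// Which side of the word receives the zero padding.
#[derive(Clone, Copy)]
enum Padding {
    /// Zeros first, value in the last byte (used for enum variants in this PR).
    Left,
    /// Value in the first byte, zeros after (used for struct fields in this PR).
    Right,
}

/// Places a single-byte value into a full word according to the padding mode
/// chosen by the enclosing context.
fn pad_byte_to_word(value: u8, padding: Padding) -> [u8; 8] {
    let mut word = [0u8; 8];
    match padding {
        Padding::Left => word[7] = value,
        Padding::Right => word[0] = value,
    }
    word
}
```

A struct field holding `1` would be serialized with `Padding::Right`, giving `01 00 00 00 00 00 00 00`, while the same byte inside an enum variant would use `Padding::Left`, giving `00 00 00 00 00 00 00 01`.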

ironcev

Storage initialization still needs to be fixed, as pointed out in the comment above.

IGI-111 added the "compiler: ir" (IRgen and sway-ir including optimization passes), "breaking" (May cause existing user code to break. Requires a minor or major release.), and "compiler: codegen" (Everything to do with IR->ASM, register allocation, etc.) labels Aug 9, 2023

IGI-111 requested review from vaivaswatha and a team August 9, 2023 10:52

IGI-111 self-assigned this Aug 9, 2023

IGI-111 force-pushed the IGI-111/int-memory-repr branch 2 times, most recently from fddc699 to 0bdf48e Compare August 9, 2023 12:28

IGI-111 marked this pull request as draft August 9, 2023 17:12

This was referenced Aug 11, 2023

Serialization and encoding/decoding #4769

Closed

Load and Store opcodes for small values > u8 and < u64 FuelLabs/fuel-specs#510

Open

IGI-111 force-pushed the IGI-111/int-memory-repr branch 2 times, most recently from 65d5718 to 9dca3bd Compare August 22, 2023 15:41

IGI-111 assigned xunilrj Sep 13, 2023

xunilrj force-pushed the IGI-111/int-memory-repr branch 2 times, most recently from f923ebf to 3c12320 Compare October 11, 2023 13:03

xunilrj force-pushed the IGI-111/int-memory-repr branch 3 times, most recently from 636eaa5 to d56632a Compare October 19, 2023 11:47

xunilrj force-pushed the IGI-111/int-memory-repr branch 2 times, most recently from c37e200 to 4bbf431 Compare October 24, 2023 18:10

xunilrj marked this pull request as ready for review October 24, 2023 19:25

xunilrj mentioned this pull request Oct 25, 2023

Memory Layout #5227

Closed

ironcev reviewed Oct 30, 2023

sway-core/src/asm_generation/fuel/fuel_asm_builder.rs (2 resolved comments)

ironcev previously approved these changes Oct 30, 2023

xunilrj dismissed ironcev’s stale review via a806f58 November 1, 2023 13:44

xunilrj force-pushed the IGI-111/int-memory-repr branch from bd6e3a4 to a806f58 Compare November 1, 2023 13:44

xunilrj requested review from ironcev and a team November 1, 2023 15:12

xunilrj requested a review from JoshuaBatty November 2, 2023 17:35

arboleya mentioned this pull request Nov 3, 2023

Representation of small values as single bytes FuelLabs/fuels-ts#1398

Closed

ironcev dismissed their stale review via b40988e November 5, 2023 21:50

ironcev approved these changes Nov 5, 2023

ironcev requested changes Nov 6, 2023

ironcev approved these changes Nov 6, 2023

IGI-111 requested a review from a team November 7, 2023 03:53

JoshuaBatty approved these changes Nov 7, 2023

IGI-111 enabled auto-merge (squash) November 8, 2023 12:04

IGI-111 disabled auto-merge November 8, 2023 12:04

xunilrj and others added 8 commits November 8, 2023 13:01

better memory representation for ints/bool

790a365

clippy and fmt issues

4fc1bac

improving readability of two matches expr

f933931

fixing rebase issue

50fa19c

update contract output

192f1c2

Fix serialization of enums into storage slots

4bdac65

Fix serialization of aggregates into storage slots

a4505fb

Fix fmt issues

64945a3

xunilrj force-pushed the IGI-111/int-memory-repr branch from 6ecd2d7 to 64945a3 Compare November 8, 2023 13:01

xunilrj merged commit 95481fc into master Nov 8, 2023
31 checks passed

xunilrj deleted the IGI-111/int-memory-repr branch November 8, 2023 14:28

vaivaswatha mentioned this pull request Jan 10, 2024

Overflow behaviour relies on Optimization #5449

Closed
