Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorder structure fields to maximize usage of immediate offset access #5268

Merged
merged 18 commits into from
Dec 9, 2021

Conversation

gilles-peskine-arm
Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm commented Dec 6, 2021

Resolve part of #4747: PSA crypto and TLS. This is a forward port of #5189.

Also fix #5191.

Supersedes #4821 with refinements to the TLS and PSA crypto parts, but without the classic crypto and X.509 parts.

Compared with #5189, the changes are similar, but there are differences.

  • The commits are not in the same order: here, I grouped the changes to check_names.py at the beginning, instead of mixed with others near the end.
  • The changes in headers could not be rebased due to the extensive presence of MBEDTLS_PRIVATE. I applied similar changes manually and re-did the code size measurements. I didn't spend much time experimenting to see if other variations might be better in 3.x. I consider it an objective to align the field types between the branches where possible, but I don't think aligning the field order is that important.

There is further work to be done on structures that we don't want to change in 2.x, both in crypto and in X.509. Some of this has been done in #4821. I haven't had time to update and expand that part yet, and I think it's better for the current state to go into the upcoming 3.1 release, so I propose to do the further crypto part in a future pull request that won't go into 3.1.

Related PR: a script I wrote to list structure fields and their position, a script I wrote to quickly show code size gains.

Several files among include/psa/crypto_*.h are not meant to be included
directly, and are not guaranteed to be valid if included directly. This
makes it harder to perform some static analyses. So make these files more
self-contained so that at least, if included on their own, there is no
missing macro or type definition (excluding the deliberate use of forward
declarations of structs and unions).

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Fix cases like
```
/*short comment*/ /*long
 comment */
int mbedtls_foo;
```
where the previous code thought that the second line started outside of a
comment and ended inside of a comment.

I believe that the new code strips comments correctly. It also strips string
literals, just in case.

Fixes Mbed-TLS#5191.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Make parse_identifiers less complex. Pylint was complaining that it had too
many local variables, and it had a point.

* Lift the constants identifier_regex and exclusion_lines to class
  constants (renamed to uppercase because they're constants).
* Lift the per-file loop into a new function parse_identifiers_in_file.

No intended behavior change.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Use '|'.join([comma-separated list]) rather than r'...|' r'...|'. This way
there's less risk of forgetting a '|'. Pylint will yell if we forget a comma
between list elements.

Use match rather than search + mandatory start anchor for EXCLUSION_LINES.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
No intended behavior change.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
In computing, brackets are []. () are called parentheses.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Make that part of the code more readable.

Add support for // line comments.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Empty the current line if it's entirely inside a comment.

Don't incorrectly end a block comment at the second line if it doesn't
contain `*/`.

Recognize `/*` to start a multiline comment even if it isn't at the start of
the line.

When stripping off comments, consistently strip off `/*` and `*/`.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Move fields around to have fewer accesses outside the 128-element Thumb
direct access window.

In psa_hkdf_key_derivation_t, move the large fields (output_block, prk,
hmac) after the state bit-fields. Experimentally, it's slightly better
to put hmac last.

Other operations structures don't go outside the window, at least when not
considering nested structures.

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/psa_crypto.o: 16510 -> 16434 (diff: 76)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Move fields around to have fewer accesses outside the 128-element Thumb
direct access window.

Make the same change as in 2.27+, for the same small benefit.

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/psa_crypto.o: 16434 -> 16414 (diff: 20)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Replace bitfields mbedtls_ssl_handshake_params by bytes. This saves some
code size, and since the bitfields weren't group, this doesn't increase the
RAM usage.

Replace several ints that only store values in the range 0..255 by uint8_t.
This can increase or decrease the code size depending on the architecture
and on how the field is used. I chose changes that save code size on Arm
Thumb builds and will save more after field reordering.

Leave the bitfields in struct mbedtls_ssl_hs_buffer alone: replacing them by
uint8_t slightly increases the code size.

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/ssl_cli.o: 19759 -> 19763 (diff: -4)
library/ssl_srv.o: 20790 -> 20754 (diff: 36)
library/ssl_tls13_keys.o: 5153 -> 5133 (diff: 20)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Reorder fields mbedtls_ssl_handshake_params in order to save code on Arm
Thumb builds. The general idea is to put often-used fields in the direct
access window of 128 elements from the beginning of the structure.

The reordering is a human selection based on a report of field offset and
use counts, and informed by measuring the code size with various
arrangements. Some notes:
* This is the same reordering as the corresponding commit in Mbed-TLS#5189 for 2.2x.
* I moved most byte-sized fields at the beginning where they're sure to be
  in the direct access window.
* I moved buffering earlier because it can be around the threshold depending
  on the configuration, and it's accessed in a lot of places.
* I moved several fields, including update_checksum and friends, early so
  that they're guaranteed to be in the early access window.

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/ssl_cli.o: 19763 -> 19687 (diff: 76)
library/ssl_msg.o: 24874 -> 24834 (diff: 40)
library/ssl_srv.o: 20754 -> 20562 (diff: 192)
library/ssl_tls.o: 21003 -> 20907 (diff: 96)
library/ssl_tls13_client.o: 7284 -> 7272 (diff: 12)
library/ssl_tls13_generic.o: 4749 -> 4721 (diff: 28)
library/ssl_tls13_keys.o: 5133 -> 5077 (diff: 56)

Results (same architecture, config-suite-b.h + MBEDTLS_ECDH_LEGACY_CONTEXT +
MBEDTLS_ECP_RESTARTABLE):
library/ssl_cli.o: 3000 -> 2936 (diff: 64)
library/ssl_msg.o: 3084 -> 3080 (diff: 4)
library/ssl_srv.o: 3428 -> 3400 (diff: 28)
library/ssl_tls.o: 6754 -> 6730 (diff: 24)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Move small fields first so that more fields can be within the Arm Thumb
128-element direct access window.

Keep the int section after the pointer section: moving int fields first cost
a few bytes on the reference baremetal-m0plus build.

The ordering in this commit is not based on field access frequency.

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/ssl_cli.o: 19687 -> 19543 (diff: 144)
library/ssl_msg.o: 24834 -> 24726 (diff: 108)
library/ssl_srv.o: 20562 -> 20462 (diff: 100)
library/ssl_tls.o: 20907 -> 20707 (diff: 200)
library/ssl_tls13_client.o: 7272 -> 7252 (diff: 20)
library/ssl_tls13_generic.o: 4721 -> 4705 (diff: 16)

Results (same architecture, config-suite-b.h + MBEDTLS_ECDH_LEGACY_CONTEXT +
MBEDTLS_ECP_RESTARTABLE):
library/ssl_cli.o: 2936 -> 2876 (diff: 60)
library/ssl_msg.o: 3080 -> 3068 (diff: 12)
library/ssl_srv.o: 3400 -> 3372 (diff: 28)
library/ssl_tls.o: 6730 -> 6658 (diff: 72)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
This slightly increases the RAM consumption per context, but saves code
size on architectures with an instruction for direct byte access (which is
most of them).

Although this is technically an API break, in practice, a realistic
application won't break: it would have had to bypass API functions and rely
on the field size (e.g. relying on -1 == 1 in a 1-bit field).

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/ssl_cli.o: 19543 -> 19559 (diff: -16)
library/ssl_msg.o: 24726 -> 24690 (diff: 36)
library/ssl_srv.o: 20462 -> 20418 (diff: 44)
library/ssl_tls.o: 20707 -> 20555 (diff: 152)
library/ssl_tls13_client.o: 7252 -> 7244 (diff: 8)
library/ssl_tls13_generic.o: 4705 -> 4693 (diff: 12)

Results (same architecture, config-suite-b.h + MBEDTLS_ECDH_LEGACY_CONTEXT +
MBEDTLS_ECP_RESTARTABLE):
library/ssl_cli.o: 2876 -> 2864 (diff: 12)
library/ssl_msg.o: 3068 -> 3080 (diff: -12)
library/ssl_srv.o: 3372 -> 3340 (diff: 32)
library/ssl_tls.o: 6658 -> 6566 (diff: 92)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
@gilles-peskine-arm gilles-peskine-arm added enhancement needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review labels Dec 6, 2021
@mpg mpg requested review from mpg and daverodgman December 7, 2021 08:19
@mpg mpg linked an issue Dec 7, 2021 that may be closed by this pull request
mpg
mpg previously approved these changes Dec 7, 2021
Copy link
Contributor

@mpg mpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I noticed a couple of differences with the 2.x PR:

  1. You didn't keep the /* bool /*, /* 2 bits */, etc. comments for fields that used to be bitfields and are now uint8_t. I assume that's because the lines are longer in development due to MBEDTLS_PRIVATE; if that's intentional, I agree with this choice.
  2. In 2.x you moved ecrs_ctx separately from the other ecrs_xxx fields, but didn't do the same in development. I'm not sure if that was intentional. Anyway, that doesn't affect correctness, so not an issue.

(Also, 2.x has a ChangeLog entry and development doesn't, which is fully justified as the fields are private in development, so that's not a user-visible change.)

daverodgman
daverodgman previously approved these changes Dec 7, 2021
@daverodgman daverodgman added approved Design and code approved - may be waiting for CI or backports and removed needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review labels Dec 7, 2021
@gilles-peskine-arm gilles-peskine-arm removed the approved Design and code approved - may be waiting for CI or backports label Dec 8, 2021
@gilles-peskine-arm
Copy link
Contributor Author

Apart from the lack of changelog entry, the differences were unintended. I'll add the bit counts and check if ecrs_ctx makes a difference. Also, I have an uncommitted change to move group_list_heap_allocated which makes a significant difference.

Placing group_list earlier seems to help significantly, not just as a matter
of placing it in the 128-element (512-byte) access window.

Results (arm-none-eabi-gcc 7.3.1, build_arm_none_eabi_gcc_m0plus build):
library/ssl_cli.o: 19559 -> 19551 (diff: 8)
library/ssl_msg.o: 24690 -> 24674 (diff: 16)
library/ssl_srv.o: 20418 -> 20406 (diff: 12)
library/ssl_tls.o: 20555 -> 20519 (diff: 36)
library/ssl_tls13_client.o: 7244 -> 7240 (diff: 4)
library/ssl_tls13_generic.o: 4693 -> 4697 (diff: -4)

Results (same architecture, config-suite-b.h + MBEDTLS_ECDH_LEGACY_CONTEXT +
MBEDTLS_ECP_RESTARTABLE):
library/ssl_cli.o: 2864 -> 2860 (diff: 4)
library/ssl_tls.o: 6566 -> 6546 (diff: 20)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Ensure that the documentation of fields affected by
"mbedtls_ssl_config: Replace bit-fields by separate bytes"
conveys information that may have been lost by removing the exact size of
the type. Extend the preexisting pattern "do this?" for formerly 1-bit
boolean fields. Indicate the possible values for non-boolean fields.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
"mbedtls_ssl_handshake_params: reorder fields to save code size" moved this
filed earlier along with byte-sized fields that should be in the 128-element
access window on Arm Thumb. This took away precious room in the 128-byte
window. Move it back further out.

Results (same architecture, config-suite-b.h + MBEDTLS_ECDH_LEGACY_CONTEXT +
MBEDTLS_ECP_RESTARTABLE):
library/ssl_cli.o: 2860 -> 2816 (diff: 44)
library/ssl_msg.o: 3080 -> 3076 (diff: 4)
library/ssl_srv.o: 3340 -> 3300 (diff: 40)
library/ssl_tls.o: 6546 -> 6478 (diff: 68)

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
@gilles-peskine-arm gilles-peskine-arm dismissed stale reviews from daverodgman and mpg via cfe74a3 December 8, 2021 17:40
@gilles-peskine-arm
Copy link
Contributor Author

  • I moved group_list earlier. This has a significant impact if MBEDTLS_DEPRECATED_REMOVED is defined.
  • I didn't add the /* bool */ annotations, but I made sure that the information is conveyed somehow (such as through a ? at the end of the description of boolean fields).
  • Moving ecrs_ctx early did turn out to have a significant impact, I moved it back out.

@gilles-peskine-arm gilles-peskine-arm added needs-review Every commit must be reviewed by at least two team members, and removed needs-work labels Dec 8, 2021
Copy link
Contributor

@mpg mpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for addressing my comments.

@mpg mpg added approved Design and code approved - may be waiting for CI or backports and removed needs-review Every commit must be reviewed by at least two team members, labels Dec 9, 2021
@mpg
Copy link
Contributor

mpg commented Dec 9, 2021

CI notes that this changes the ABI, which is fully expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Design and code approved - may be waiting for CI or backports enhancement
Projects
None yet
3 participants