Automatically size tasks #595

Merged: mkeeter merged 22 commits into master from autosize-build-redux on Jun 10, 2022

mkeeter
Collaborator

@mkeeter mkeeter commented Jun 8, 2022

This PR implements task autosizing, at long last!

It builds on the previous work with relocatable task builds (#584). After
building the relocatable task ELF file, it runs a "dummy link" against a
linker script with "infinite" memory (in practice, the entirety of memory
available on the chip). It then parses the resulting (static) binary to extract
sizes.

After finding sizes for every task, it runs the same memory packer as before,
then relinks each task with the resulting memory.

Task sizes are based on the target microcontroller, with a new alignment
parameter passed to allocate_one.
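
As a rough illustration of what sizing and alignment involve, here is a minimal sketch. The names `region_size` and `allocate_aligned` are hypothetical and are not the functions from this PR (the real `allocate_one` signature is not shown here). The sketch assumes a 32-byte minimum region size and, when `power_of_two_required` is set, rounds sizes up to the next power of two.

```rust
use std::ops::Range;

/// Rounds a measured size up to a size that an MPU region could cover.
/// Hypothetical helper: the 32-byte granule is an assumption of this sketch.
fn region_size(used: u32, power_of_two_required: bool) -> u32 {
    const MIN_REGION: u32 = 32;
    let used = used.max(MIN_REGION);
    if power_of_two_required {
        used.next_power_of_two()
    } else {
        // Round up to the next multiple of the granule.
        (used + MIN_REGION - 1) / MIN_REGION * MIN_REGION
    }
}

/// Carves `size` bytes from the front of `free`, with the start aligned to
/// `align` (a power of two). Any padding skipped to reach the aligned start
/// is simply lost in this sketch.
fn allocate_aligned(free: &mut Range<u32>, size: u32, align: u32) -> Option<Range<u32>> {
    assert!(align.is_power_of_two());
    let start = free.start.checked_add(align - 1)? & !(align - 1);
    let end = start.checked_add(size)?;
    if end > free.end {
        return None;
    }
    free.start = end;
    Some(start..end)
}

fn main() {
    let mut flash = 0x0800_0000u32..0x0810_0000;
    // 12608 bytes is the measured flash size of `thermal` quoted in this thread.
    let size = region_size(12608, true);
    let region = allocate_aligned(&mut flash, size, size).expect("out of flash");
    println!("{} bytes at {:#010x}..{:#010x}", size, region.start, region.end);
}
```

Passing the size as its own alignment gives naturally aligned power-of-two regions; a target without the power-of-two requirement could pass a smaller `align` instead.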

There are extensive changes to cargo xtask sizes to make it more generically
useful, decoupling the suggestions from the "find the size of a static ELF" logic.

WARNING: this changes the format of the exported JSON files!

In addition, there are a bunch of new helper functions in Config to help with
task and memory sizing / alignment.

This fixes #474 and maybe #439, and deprecates #476

@mkeeter mkeeter added the developer-experience and build labels Jun 8, 2022
Collaborator

@cbiffle cbiffle left a comment

This is looking pretty good. There are a couple of cases I've pointed out here that I think are bugs, but most of my comments are more informational than anything.

build/xtask/src/clippy.rs (outdated, resolved)
build/xtask/src/config.rs (outdated, resolved)
build/xtask/src/config.rs (outdated, resolved)
build/xtask/src/config.rs (resolved)
build/xtask/src/config.rs (outdated, resolved)
build/xtask/src/dist.rs (outdated, resolved)
build/xtask/src/dist.rs (resolved)
t => panic!("Unknown mpu requirements for target '{}'", t),
};

let power_of_two_required = toml.mpu_power_of_two_required();
Collaborator

Given that this routine is already consulting the MPU requirement data, could you move the computation of align into here, rather than consulting the MPU requirement data outside this function and then calling it?

Collaborator Author

I don't quite understand this suggestion, since this function is called after tasks are placed in memory (which is the alignment-sensitive part).

I think the right fix may be to remove the task power-of-two check from this function entirely, since autosizing should enforce MPU requirements! (This would also address your comment below about checking flash / ram but not any other memories...)

if power_of_two_required && !task.requires["flash"].is_power_of_two() {
panic!("Flash for task '{}' is required to be a power of two, but has size {}", task.name, task.requires["flash"]);
if power_of_two_required
&& !task_allocations[name]["flash"].len().is_power_of_two()
Collaborator

This is not a new bug, but: these conditions need to be applied to all regions, not just "flash" and "ram". For instance the sram1 area used by the ethernet driver still needs to be pow2.
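
One way to apply the check to every region is to iterate over all of a task's allocations instead of indexing by name. This is only a sketch: the `BTreeMap<String, Range<u32>>` shape and the function name `non_power_of_two_regions` are assumptions made for illustration, not the types used in dist.rs.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Returns the names of every region whose allocated length is not a power
/// of two, regardless of which memory it lives in.
fn non_power_of_two_regions(allocs: &BTreeMap<String, Range<u32>>) -> Vec<&str> {
    allocs
        .iter()
        .filter(|(_, range)| !(range.end - range.start).is_power_of_two())
        .map(|(name, _)| name.as_str())
        .collect()
}

fn main() {
    let mut allocs: BTreeMap<String, Range<u32>> = BTreeMap::new();
    allocs.insert("flash".to_string(), 0x0802_0000..0x0804_0000);
    allocs.insert("ram".to_string(), 0x2000_4000..0x2000_8000);
    // Hypothetical bad allocation: 12 KiB is not a power of two.
    allocs.insert("sram1".to_string(), 0x3000_0000..0x3000_3000);

    for name in non_power_of_two_regions(&allocs) {
        println!("region '{}' is not a power of two in size", name);
    }
}
```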

.map(|(name, range)| (name, range.end - range.start))
.collect();

// XXX: are there alignment issues with the stack here?
Collaborator

Yes, the stack is 8-byte aligned on all of our current targets -- so, technically, this could underestimate RAM usage by 7 bytes in corner cases.
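
To see where the 7 bytes come from, here is a small sketch with made-up numbers. The layout (some data followed by an 8-byte-aligned stack) and the helper `align_up` are assumptions for illustration only; the actual RAM layout is not described in this thread.

```rust
/// Rounds `value` up to the next multiple of `align`, which must be a power of two.
fn align_up(value: u32, align: u32) -> u32 {
    assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

fn main() {
    // Hypothetical sizes: 1001 bytes of data and a 1024-byte stack.
    let data = 1001;
    let stack = 1024;

    // Summing the sizes ignores the padding needed to align the stack.
    let naive = data + stack;
    // Aligning first adds up to 7 bytes of padding.
    let padded = align_up(data, 8) + stack;

    println!("naive = {naive}, with alignment = {padded}, padding = {}", padded - naive);
}
```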

@cbiffle
Collaborator

cbiffle commented Jun 8, 2022

I'm playing around with this branch, and I'm confused by the new xtask sizes output. (Yeah, I know, I read the code, I should probably understand it.)

What's this mean?

thermal
  flash: 12608 bytes (38%) [autosized to 16384]
  ram:    5284 bytes (64%)

12608 is 77% of 16384, so this is fine. (The yellow numbers used to indicate cases where space is being wasted; this is not one.)

Is it suggesting that the max-size could be reduced? If so, it would be nice to print the max-size it's mad about.

@cbiffle
Collaborator

cbiffle commented Jun 8, 2022

Okay, how about this:

The use case for xtask sizes is, imo, showing the sizes of things. In an autosized world, its previous use case of "see how close I am to needing to adjust size up/down" is no longer the main reason you'd want to do that. So, we may want to reorient the output, maybe something like

kernel
  flash: 32768 bytes / 32768 (0% fragmentation) [limited to 32768]
  ram:    4596 bytes / 8192 (44% fragmentation) [limited to 8192]

thermal
  flash: 12608 bytes / 16384 (23% fragmentation) [limited to 32768]
  ram:    5284 bytes / 8192 (35% fragmentation) [limited to 8192]

Changes I'm suggesting there:

  • Print the size and chosen partition size, X / Y, as the primary piece of information
  • Flip the sense of the percentage to show how much space is being wasted.
  • Add a [limited to X] note only for cases that have limits specified in config, which today is essentially everything, but in the future will not be.

Thoughts?
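
The proposed line format can be sketched roughly as follows. The function names are hypothetical, the column padding from the example above is omitted, and the percentage is rounded to the nearest integer, which reproduces the 0%, 44%, 23% and 35% figures quoted in the proposal.

```rust
/// Percentage of the partition that is unused, rounded to the nearest integer.
fn fragmentation_percent(used: u64, size: u64) -> u64 {
    assert!(size > 0 && used <= size);
    ((size - used) * 100 + size / 2) / size
}

/// Formats one region line; the `[limited to X]` note appears only when the
/// config specifies a limit.
fn size_line(region: &str, used: u64, size: u64, limit: Option<u64>) -> String {
    let mut line = format!(
        "{}: {} bytes / {} ({}% fragmentation)",
        region,
        used,
        size,
        fragmentation_percent(used, size)
    );
    if let Some(limit) = limit {
        line.push_str(&format!(" [limited to {}]", limit));
    }
    line
}

fn main() {
    println!("thermal");
    println!("  {}", size_line("flash", 12608, 16384, Some(32768)));
    println!("  {}", size_line("ram", 5284, 8192, Some(8192)));
}
```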

Contributor

@steveklabnik steveklabnik left a comment

Looks good to me! A few anecdotes and one maybe-typo.

for (mem, &used) in sizes.iter() {
let size = match requires.get(&mem.to_string()) {
Some(s) => *s,
_ => continue,
Contributor

this kind of thing is the one time I wish Rust had Ruby-like semantics for closures; this kind of code has to be written as a match rather than as a method chain because you can't continue into the parent scope. oh well :)
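
For what it's worth, `let ... else` (stable since Rust 1.65, so not available when this review was written) expresses the same early `continue` without a full `match`. The sketch below uses a hypothetical function `over_budget` and plain `BTreeMap<String, u64>` inputs; it is not the code under review.

```rust
use std::collections::BTreeMap;

/// Lists memories whose measured size exceeds the configured requirement.
/// Memories without a configured requirement are skipped.
fn over_budget(
    sizes: &BTreeMap<String, u64>,
    requires: &BTreeMap<String, u64>,
) -> Vec<String> {
    let mut out = Vec::new();
    for (mem, &used) in sizes.iter() {
        // The `else` branch must diverge; `continue` applies to the enclosing loop.
        let Some(&size) = requires.get(mem) else { continue };
        if used > size {
            out.push(format!("{mem}: {used} > {size}"));
        }
    }
    out
}

fn main() {
    let sizes = BTreeMap::from([
        ("flash".to_string(), 20000u64),
        ("ram".to_string(), 1000),
        ("sram1".to_string(), 4096),
    ]);
    let requires = BTreeMap::from([
        ("flash".to_string(), 16384u64),
        ("ram".to_string(), 2048),
    ]);
    for line in over_budget(&sizes, &requires) {
        println!("{line}");
    }
}
```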

panic!("Unknown target: {}", toml.target);
/// Loads the size of the given task (or kernel)
pub fn load_task_size<'a>(
toml: &'a Config,
Contributor

I am not sure I prefer toml as a name, given that the old name was config because it's a Config, but I don't feel strongly enough to argue that you should change it back :)

Collaborator Author

yeah, this is a little awkward, but matches the naming in dist.rs (where config is a BuildConfig and toml is the Config inside it 😱 )

};

let check_task = move |name: &str,
Contributor

i am glad this monstrosity is leaving, haha. i almost wanted to turn it into a free function, but it captured one or two variables and that was convenient.


// If the VirtAddr disagrees with the PhysAddr, then this is a
// section which is relocated into RAM, so we also accumulate
// its FileSiz in the physical address (which is presumably
Contributor

is FileSiz a typo? Shouldn't it be "file size" or FileSz or something? an extremely minor thing.

Collaborator Author

heh, the names in this section are based on the output of readelf, so it's intentional

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000120 0x08039400 0x08039400 0x00060 0x00060 R E 0x20
  LOAD           0x000180 0x08039460 0x08039460 0x00020 0x00020 R E 0x20
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x0
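
The accumulation described in the code comment under review could look roughly like the sketch below. It works on plain structs rather than a real ELF parser, and the details are assumptions: it counts MemSiz at the virtual address, adds FileSiz at the physical address when the two differ, and the third segment in `main` is invented to show that case.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// The fields of a LOAD program header that matter for sizing (names follow readelf).
struct LoadSegment {
    virt_addr: u32,
    phys_addr: u32,
    file_siz: u32,
    mem_siz: u32,
}

/// Finds the memory region containing `addr`, if any.
fn region_of(memories: &[(&'static str, Range<u32>)], addr: u32) -> Option<&'static str> {
    memories
        .iter()
        .find(|(_, range)| range.contains(&addr))
        .map(|(name, _)| *name)
}

/// Sums the bytes each memory region needs for the given LOAD segments.
fn region_sizes(
    memories: &[(&'static str, Range<u32>)],
    segments: &[LoadSegment],
) -> BTreeMap<&'static str, u32> {
    let mut sizes = BTreeMap::new();
    for seg in segments {
        // Space occupied at the address the segment runs from.
        if let Some(name) = region_of(memories, seg.virt_addr) {
            *sizes.entry(name).or_insert(0) += seg.mem_siz;
        }
        // A relocated segment also occupies FileSiz bytes where it is stored.
        if seg.virt_addr != seg.phys_addr {
            if let Some(name) = region_of(memories, seg.phys_addr) {
                *sizes.entry(name).or_insert(0) += seg.file_siz;
            }
        }
    }
    sizes
}

fn main() {
    let memories = [
        ("flash", 0x0800_0000..0x0810_0000),
        ("ram", 0x2000_0000..0x2002_0000),
    ];
    let segments = [
        // The two LOAD rows from the readelf output above.
        LoadSegment { virt_addr: 0x0803_9400, phys_addr: 0x0803_9400, file_siz: 0x60, mem_siz: 0x60 },
        LoadSegment { virt_addr: 0x0803_9460, phys_addr: 0x0803_9460, file_siz: 0x20, mem_siz: 0x20 },
        // Hypothetical initialized-data segment: stored in flash, run from RAM.
        LoadSegment { virt_addr: 0x2000_0000, phys_addr: 0x0803_9480, file_siz: 0x10, mem_siz: 0x10 },
    ];
    for (name, size) in region_sizes(&memories, &segments) {
        println!("{name}: {size:#x}");
    }
}
```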

@mkeeter
Collaborator Author

mkeeter commented Jun 10, 2022

Alright, I've refactored the cargo xtask sizes output and think this is ready to go!

Here's an example:

flash = 0x08000000..0x08100000
ram   = 0x20000000..0x20020000
sram1 = 0x30000000..0x30020000
Used:
  flash: 0x66880 (40%)
  ram:   0x1d500 (91%)
  sram1: 0x4000 (12%)

flash:
      ADDRESS  |   PROGRAM    |   USED |   SIZE | LIMIT
    0x08000000 | kernel       |  32768 |  32768 | (fixed)
    0x08008000 | hiffy        |  20120 |  32768 | 32768
    0x08010000 | gimlet_seq   |  51980 |  65536 | 65536
    0x08020000 | net          |  86080 | 131072 | 131072
    0x08040000 | spi4_driver  |  11436 |  16384 | 16384
    0x08044000 | spi2_driver  |  11852 |  16384 | 16384
    0x08048000 | i2c_driver   |  11916 |  16384 | 16384
    0x0804c000 | spd          |  11608 |  16384 | 16384
    0x08050000 | thermal      |  12608 |  16384 | 32768
    0x08054000 | power        |  12012 |  16384 | 16384
    0x08058000 | hf           |   8768 |  16384 | 16384
    0x0805c000 | jefe         |   7200 |   8192 | 8192
    0x0805e000 | sensor       |   4160 |   8192 | 8192
    0x08060000 | udpecho      |   7456 |   8192 | 16384
    0x08062000 | udpbroadcast |   5024 |   8192 | 16384
    0x08064000 | validate     |   8012 |   8192 | 8192
    0x08066000 | sys          |   1664 |   2048 | 2048
    0x08066800 | idle         |     96 |    128 | 128

ram:
      ADDRESS  |   PROGRAM    |   USED |   SIZE | LIMIT
    0x20000000 | kernel       |   4596 |   8192 | (fixed)
    0x20002000 | thermal      |   5284 |   8192 | 8192
    0x20004000 | net          |  16208 |  16384 | 16384
    0x20008000 | hiffy        |  27312 |  32768 | 32768
    0x20010000 | spd          |  10264 |  16384 | 16384
    0x20014000 | udpecho      |   4228 |   8192 | 8192
    0x20016000 | power        |   2828 |   4096 | 4096
    0x20017000 | gimlet_seq   |   2508 |   4096 | 4096
    0x20018000 | udpbroadcast |   2180 |   4096 | 8192
    0x20019000 | validate     |   2164 |   4096 | 4096
    0x2001a000 | jefe         |   1620 |   2048 | 2048
    0x2001a800 | spi4_driver  |   2036 |   2048 | 2048
    0x2001b000 | spi2_driver  |   2036 |   2048 | 2048
    0x2001b800 | i2c_driver   |   2008 |   2048 | 2048
    0x2001c000 | hf           |   2048 |   2048 | 2048
    0x2001c800 | sensor       |   2048 |   2048 | 2048
    0x2001d000 | sys          |    896 |   1024 | 1024
    0x2001d400 | idle         |    256 |    256 | 256

sram1:
      ADDRESS  |   PROGRAM    |   USED |   SIZE | LIMIT
    0x30000000 | net          |  12480 |  16384 | 16384

PROGRAM       REGION  USED    SIZE    LIMIT
kernel        flash   32768   32768   (fixed)
              ram     4596    8192    (fixed)
jefe          flash   7200    8192    8192
              ram     1620    2048    2048
net           flash   86080   131072  131072
              ram     16208   16384   16384
              sram1   12480   16384   16384
sys           flash   1664    2048    2048
              ram     896     1024    1024
spi4_driver   flash   11436   16384   16384
              ram     2036    2048    2048
spi2_driver   flash   11852   16384   16384
              ram     2036    2048    2048
i2c_driver    flash   11916   16384   16384
              ram     2008    2048    2048
spd           flash   11608   16384   16384
              ram     10264   16384   16384
thermal       flash   12608   16384   32768
              ram     5284    8192    8192
power         flash   12012   16384   16384
              ram     2828    4096    4096
hiffy         flash   20120   32768   32768
              ram     27312   32768   32768
gimlet_seq    flash   51980   65536   65536
              ram     2508    4096    4096
hf            flash   8768    16384   16384
              ram     2048    2048    2048
sensor        flash   4160    8192    8192
              ram     2048    2048    2048
udpecho       flash   7456    8192    16384
              ram     4228    8192    8192
udpbroadcast  flash   5024    8192    16384
              ram     2180    4096    8192
validate      flash   8012    8192    8192
              ram     2164    4096    4096
idle          flash   96      128     128
              ram     256     256     256

========== Suggested changes ==========
kernel:
  ram:    4608 (currently 8192)

@mkeeter mkeeter merged commit 020acab into master Jun 10, 2022
@mkeeter mkeeter deleted the autosize-build-redux branch June 10, 2022 14:11

Successfully merging this pull request may close these issues.

xtask sizes is incorrect with regards to the kernel
3 participants