This repository was archived by the owner on Jan 24, 2022. It is now read-only.

[RFC] make sections .bss and .data opt in #32

Closed
japaric opened this issue Jul 23, 2017 · 17 comments
Labels
RFC This issue needs your input!

Comments

@japaric
Member

japaric commented Jul 23, 2017

The zero_bss and init_data routines will run before main even if the
application makes use of neither or only one of these sections. This may not
matter much if your device has plenty of Flash memory / ROM but if you are
targeting really small devices with e.g. only 512 bytes of ROM then every byte
you can save counts.
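
For context, here is a minimal sketch of what two such routines typically do, assuming word-aligned section boundaries passed in as raw pointers. The function names match the ones used in this issue, but the signatures and bodies are illustrative assumptions, not the crate's actual code.

```rust
use core::ptr;

/// Writes zero to every word in `[sbss, ebss)`.
///
/// # Safety
/// The pointers must be word-aligned and delimit one writable region.
pub unsafe fn zero_bss(mut sbss: *mut u32, ebss: *mut u32) {
    while sbss < ebss {
        unsafe {
            // Volatile write so the loop is not replaced by a `memset` call.
            ptr::write_volatile(sbss, 0);
            sbss = sbss.add(1);
        }
    }
}

/// Copies the initial values stored at `sidata` (e.g. in Flash)
/// into `[sdata, edata)` (in RAM).
///
/// # Safety
/// The pointers must be word-aligned; the source must hold at least as
/// many words as the destination region, and the two must not overlap.
pub unsafe fn init_data(mut sdata: *mut u32, edata: *mut u32, mut sidata: *const u32) {
    while sdata < edata {
        unsafe {
            ptr::write(sdata, ptr::read(sidata));
            sdata = sdata.add(1);
            sidata = sidata.add(1);
        }
    }
}
```

Both loops fall through immediately when the start and end pointers are equal, but the code itself still occupies ROM, which is the cost this RFC is about.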

The only safe way to opt out of these routines is to forbid the existence of the
.bss and / or .data sections. That can be easily done in this crate because
it controls the linker script of the application.

Design

We add two opt-in Cargo features: "bss" and "data". If neither is enabled then
static variables can't be used at all (unless they end up in .rodata) and
neither zero_bss nor init_data will run before main.

If only "bss" is enabled then zero_bss will run before main but only static
variables that end up in .bss are allowed in the program. For instance this
would compile:

static mut STATE: bool = false;

but this would not:

static mut STATE: bool = true;

Something similar would happen with the "data" feature.

Enabling both features gives you today's behavior.
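
A minimal sketch of how the gating could look inside the crate, using the "bss" and "data" feature names proposed above. The `startup_steps` function is a hypothetical stand-in for the reset handler; it only records which routines would run, so it can be compiled and checked on a host.

```rust
/// Returns, in order, the names of the routines that would run at reset.
/// Hypothetical stand-in for the real reset handler.
pub fn startup_steps() -> Vec<&'static str> {
    let mut steps = Vec::new();

    // Compiled in only when the application opts in to `.bss`.
    #[cfg(feature = "bss")]
    steps.push("zero_bss");

    // Compiled in only when the application opts in to `.data`.
    #[cfg(feature = "data")]
    steps.push("init_data");

    steps.push("main");
    steps
}
```

With neither feature enabled the gated statements are removed before code generation, so the corresponding routines never reach the binary.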

Bonus

For the really brave we could allow this pattern when both "bss" and "data" are
disabled:

#[link_section = ".uninit"]
static mut STATE: bool = false;

STATE is uninitialized because neither zero_bss nor init_data runs before
main. Reading this variable before assigning it any value at runtime will return
junk and could even cause undefined behavior, for example if the variable is an
enum.
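
One way to make the "write before read" contract explicit is `core::mem::MaybeUninit`. The sketch below is an assumption about how such a variable might be handled, not part of the proposal; the `init_state` and `read_state` helpers are hypothetical, and the `link_section` attribute is left as a comment so the snippet also builds on a host.

```rust
use core::mem::MaybeUninit;
use core::ptr::{addr_of, addr_of_mut};

// On the target this static would carry `#[link_section = ".uninit"]`.
static mut STATE: MaybeUninit<bool> = MaybeUninit::uninit();

/// Gives STATE its first value. Must be called before `read_state`.
pub fn init_state(value: bool) {
    // Assumption: single-threaded startup code, no references to STATE are live.
    unsafe { addr_of_mut!(STATE).write(MaybeUninit::new(value)) }
}

/// Reads STATE.
///
/// # Safety
/// `init_state` must have been called first; otherwise this reads junk,
/// which is undefined behavior for a `bool`.
pub unsafe fn read_state() -> bool {
    unsafe { addr_of!(STATE).read().assume_init() }
}
```

Marking `read_state` as `unsafe` pushes the obligation onto the caller instead of hiding it behind a plain `static mut` read.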

Drawbacks

This is a breaking change and it's also kind of annoying for people that usually
use both .bss and .data because they'll now have to enable both features or
their program won't compile.

cc @cr1901 @therealprof

@cr1901

cr1901 commented Jul 23, 2017

Hmmmm, this one's tricky. I think I'm against this feature.

Startup Code Reasonable Space Usage

On one hand, I agree that once you get below approximately 2kB, all the space you can get for your application counts. However, the timer-rtfm example with initialization and a single interrupt still leaves over half of the smallest MSP430 for the application (this is for the g2553; I can confidently say the code size wouldn't change for the g2001):

William@William-THINK MINGW32 ~/src/msp430-quickstart
$ size target/msp430-none-elf/release/examples/timer-rtfm
   text    data     bss     dec     hex filename
    210       0       0     210      d2 target/msp430-none-elf/release/examples/timer-rtfm

I don't know any ARM microcontroller that doesn't offer at least 4kB of RAM. AVRs and MSP430s bottom out at 512 bytes. Even the smallest PIC bottoms out at 256 bytes.

Feature Is Very Specific

Most people are going to use .bss and .data. I've also never personally developed a purely stack-based application, so I don't have the experience to say whether the gains in space are worth it.

At the same time, somebody who is targeting such a space-constrained environment is likely to know what they are doing in terms of:

  • Expectations of which Rust constructs are zero-cost (CriticalSection), ones which take little space, and ones which take up more space (i.e. be careful with match, large enums, closures which escape the code where they are defined).
  • What code sections will be dragged in by default, both within (zero_bss, init_data) and beyond (panic_bounds_check) one's control.
  • Doing startup tasks themselves when all else fails (as rust_on_msp demonstrates).

Most people are likely to want the init code, because they are either just experimenting with microcontrollers and/or have an application in mind where space is not a concern. As Rust makes it easier to get started with microcontrollers, I think it's fair to make the general case in embedded Rust more similar to hosted Rust, similar to the differences in freestanding C and hosted C. I'd rather see this feature opt-in, so experimenters can opt-in when ready, and those who are truly space-constrained can enable the feature immediately. Explicitly needing to enable .data/.bss is probably not something most users would expect to need to do. And of course it's a breaking change :).

Break Out Into Disable .bss/.data and Disable Init?

I don't know if Rust will ever target these sorts of micros (and putting aside the fact that a number of people despise these families :)), but a few embedded micros (PIC) don't have a general purpose stack. This means a programmer must have a .data/.bss.

It may be beneficial to have the programmer just initialize their statics manually (assigning 0 in the code to prevent compile errors), as an "unrolled" memcpy (as I did in at2xt) to save space. And only the vars that will ever be read before they are written need to be initialized, at that. That would mean disabling r0 code in rt, but all other portions of the rt crates could be used.

Comparisons to C/Practicality

Of course there will also be naysayers who say "Don't use Rust in such constrained environments/Rust wasn't meant to target that deeply embedded" :). I no longer believe that's true if you keep your expectations realistic. I think it's worth a try to target as deeply as possible while making as few concessions as possible. Perhaps one day idiomatic Rust can target where C can't target well too (PIC, 8051).

I don't know if people who have written code for 256 byte - 2kB micros in C typically disable the startup code, but assuming they don't b/c it's good enough, I don't feel enabling this feature by default enables a fair comparison to C code usage. This was one reason I enabled r0 in at2xt in the first place.

And again, if all else fails, startup, interrupt initialization, etc can be done manually; see at2xt and rust_on_msp links in this comment. This method will of course differ for different microcontroller families, but you probably know what you are doing and/or want to learn if you opted in to this level of control.

@therealprof
Contributor

therealprof commented Jul 23, 2017

While I love the idea of saving a few bytes this feature really would only make sense to me if we could somehow automatically determine that there was no .bss or .data section and automatically omit the code and the addresses from the binary.

Contrary to what @cr1901 said, most of my Rust MCU experiments so far have turned out to be stack-only, thanks to brilliant features of Rust and @japaric's frameworks. I think there's a real chance that we can often do without heap and static mutable variables and thus omit the initialisation. I was actually surprised looking at some disassemblies where .sbss, .ebss, .sdata and .edata were all the same value and thus the code was very much useless...

I totally agree though that it is completely impractical to assume that this will always be the case and require an opt-in otherwise. If we could pull this off via Rust compiler or linker voodoo (sans the targets where having the initialisation is an absolute must) that would be really nice.

@japaric
Member Author

japaric commented Jul 23, 2017

@cr1901

I don't know any ARM microcontroller that doesn't offer at least 4kB of RAM.
AVRs and MSP430s bottom out at 512 bytes. Even the smallest PIC bottoms out at
256 bytes.

Even though I started this RFC in this repo it also applies to the msp430-rt
repo (and the AVR target).

I've also never personally developed a purely stack-based application

You don't necessarily have to completely opt out of static variables. For
example, you can exclusively use .bss (zero out all your statics) and still
get binary size savings by only disabling the "data" feature and dropping
init_data.

Doing startup tasks themselves when all else fails

If you do that and don't use {cortex-m,msp430}-rt then you can't use
{cortex-m,msp430}-rtfm so it may not be a feasible option.

I'd rather see this feature opt-in

Unfortunately, we can't make it like that because "negative" (non additive)
Cargo features are discouraged and making "bss" and "data" enabled by default
makes it very hard (if not downright impossible) to disable the features from an
application (unless the application directly depends on cortex-m-rt and none of
its dependencies depends on cortex-m-rt).

@therealprof

if we could somehow automatically determine that there was no .bss or .data
section and automatically omit the code and the addresses from the binary.

Not possible with the current compilation pipeline. rustc would have to be doing
the linking for that to maybe work. And no, putting LLD in rustc won't fix this
either because LLD behaves exactly like an external linker that can be called
via library calls so it provides no extra optimization / information over ld / gcc.

@therealprof
Contributor

@japaric

Not possible with the current compilation pipeline. rustc would have to be doing the linking for that to maybe work.

Not sure I follow. rustc emits the code in the first place. If there was a way to determine that we're not going to have any .bss or .data sections the code to clear/copy the data could be omitted and the linker doesn't even know about it.

And no, putting LLD in rustc won't fix this either because LLD behaves exactly like an external linker that can be called via library calls so it provides no extra optimization / information over ld / gcc.

The linker already automatically omits empty sections while substituting the values everywhere in the binary. Maybe it's possible to conditionally drop sections; then we'd simply have to put the zeroing and copy functions into separate sections and make sure they're dropped if _sbss == _ebss or _sdata == _edata respectively.

@japaric
Member Author

japaric commented Jul 24, 2017

if there was a way to determine that we're not going to have any .bss or .data sections the code to clear/copy the data could be omitted

I don't think that can work because of how the pipeline is structured. Basically we don't know whether .bss / .data or any other section will be empty until after the LLVM pass is over (LLVM can optimize away unused static variables, move them to .rodata or inline them into code, thus making a section empty). However, dropping a chunk of Rust must be done in the first passes, or at the latest you could feed if false { zero_bss(..) } into LLVM to have it optimize away the routine, but whether that if should be if true or if false depends on the output of the LLVM pass. I think you can see the problem: you have a black box where the output is connected to the input, so that won't fly (that wouldn't even be a pipeline to begin with).

The linker ...

There's some conditional stuff among the linker script commands but I have never tried to do anything smart with it.

@protomors
Contributor

I would also like it if the initialization code were omitted automatically when the respective section is empty. Or at least that initialization would be enabled by default.

Maybe for such constrained environments it would be better to offer a way to bypass all initialization code completely? But is there even a point in using the cortex-m-rt crate at all in such cases?

For instance this would compile:

static mut STATE: bool = false;

but this would not:

static mut STATE: bool = true;

This is good. At first I was afraid that this change would allow use of uninitialized values if you forget to set the required features. I guess this example will fail at link time? It would be nice to at least have a meaningful error message but I don't know if that is possible.

@japaric About conditions in linker scripts: as far as I know there is only support for the "?" operator in expressions. I doubt it will help in this case. For more complex cases people usually generate linker scripts using the C preprocessor or just use different files.

@therealprof
Contributor

However dropping a chunk of Rust must be done on the first passes or at the latest you could feed if false { zero_bss(..) } into LLVM to have it optimize away the routine, but whether that if should be if true or if false depends on the output of the LLVM pass.

I was more thinking in terms of having enough hints that LTO can do its job to eliminate the initialisation.

@cr1901

cr1901 commented Jul 25, 2017

Unfortunately, we can't make it like that because "negative" (non additive)
Cargo features are discouraged and making "bss" and "data" enabled by default
makes it very hard (if not downright impossible) to disable the features from an
application (unless the application directly depends on cortex-m-rt and none of
its dependencies depends on cortex-m-rt).

In light of this, I'll support the feature, as long as it's documented why it is so:

  • Brief version for those who want to "set and forget":

These features exist for "exceptional" circumstances, but are opt-in for dependency reasons. Unless you know you want to opt out, you should opt in.

  • Longer explanation for those who know "opting out is what they want":
    • Talk about possible use cases, valid feature configurations (can disable .bss, .data, both, none, etc), optional .uninit, etc.

@therealprof
Contributor

I don't like having the initialisation as an opt-in feature at all. Rust is comparatively hard to use as it is and even more so for embedded targets; the goal should be to remove barriers preventing people from using it, not introduce new ones.

@kjetilkjeka
Contributor

The biggest concern with having .bss and .data as opt-in features is that it's highly non-intuitive. And the more things that behave in a non-intuitive manner, the harder the package will be to use.

So maybe it is worth changing something simple to something non-intuitive to accommodate (what I see as) a highly specialized scenario. But what is the simplest alternative that won't require any action when using the initialization functions? Even if this is something that requires more work than just opting in to two features, it's more intuitive that an action is required and hence possibly worth it.

Maybe it would be better to expose some interface into the linker script so expert users could configure these kinds of things manually? Perhaps a flag that would both leave out .bss and .data while linking some mock r0? Or maybe some simpler solution exists?

The second biggest concern I have is that by making it "standard" to opt in to the initialization functions (as it should be), libraries that really didn't need to do so will do it anyway since it's "standard". This might make it unnecessarily restrictive to use libraries when not using the initialization functions. But again, with this limited flash, it might not be considered normal to use libraries.

@cr1901

cr1901 commented Aug 5, 2017

I've thought about this a bit more:

Unfortunately, we can't make it like that because "negative" (non additive)
Cargo features are discouraged

I noticed you said "discouraged". Wouldn't this possibly fall under an exception, since .bss/.data is for the vast majority of use cases what you want?

and making "bss" and "data" enabled by default
makes it very hard (if not downright impossible) to disable the features from an
application (unless the application directly depends on cortex-m-rt and none of
its dependencies depends on cortex-m-rt).

I don't see how the reverse (making it opt-in) solves this problem. Most helper crates are going to need .bss/.data regardless of whether it's opt-in or opt-out, correct? I think only bare_metal has no real use for .data/.bss?

Is it possible to "cascade" feature gates for a crate that depends on another crate? E.g. if you "disable" .data/.bss from, say, cortex-m-rt crate, the same feature is also disabled for cortex-m-rtfm (this is a contrived example).

In the worst case, if you opt-out of .bss/.data, you clearly have a reason to do so and accept responsibility/loss of convenient libraries. That being said, I was able to duplicate the functionality (inelegantly :P) of msp430-rt and a device crate without relying on the existence of zero_bss and init_data, at the cost of some unsafe code. I think forbidding statics at all is overkill, and permitting a small amount of unsafety in initialization is an acceptable tradeoff.

@japaric japaric removed this from the 0.4.0 milestone Apr 8, 2018
@jonas-schievink jonas-schievink added the RFC This issue needs your input! label Nov 26, 2019
@korken89
Contributor

We should take a decision on this RFC.

@cr1901

cr1901 commented Dec 30, 2019

My stance hasn't changed from my last message (I noticed I went back and forth throughout this thread)... what's wrong with making .bss and .data initialization default features and if a user wants to override/knows better, disable with a flag to cargo?

@japaric
Member Author

japaric commented Jan 14, 2020

what's wrong with making .bss and .data initialization default features

Default Cargo features are pretty much impossible to disable. Let's say you want to disable the default feature F of crate C; to do that you have to make sure that all the crates in your dependency graph that depend on crate C have disabled the default feature F. If even one of those crates fails to do that (*) then feature F will be enabled everywhere because Cargo takes the "union" of features.

(*) note that the crate that fails to do so may not even be under your control; it could be a third party dependency you pulled from crates.io (e.g. a PAC or HAL).
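
The "union" behavior can be illustrated with a toy model. This is not Cargo's resolver; the `Request` type and `resolve` function are hypothetical and only mimic the rule that a feature is enabled for everyone as soon as any one dependent enables it.

```rust
use std::collections::BTreeSet;

/// How one crate in the graph depends on crate C (hypothetical model).
pub struct Request {
    /// `false` models `default-features = false` in that crate's manifest.
    pub default_features: bool,
    /// Features requested explicitly.
    pub features: &'static [&'static str],
}

/// Returns the set of features crate C ends up being built with:
/// the union of what every dependent asked for.
pub fn resolve(defaults: &[&'static str], requests: &[Request]) -> BTreeSet<&'static str> {
    let mut enabled = BTreeSet::new();
    for request in requests {
        if request.default_features {
            enabled.extend(defaults.iter().copied());
        }
        enabled.extend(request.features.iter().copied());
    }
    enabled
}
```

In this model an application that sets `default_features: false` still gets "bss" and "data" as soon as a single other dependent (say, a HAL) leaves the defaults on.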

@cr1901

cr1901 commented Jan 14, 2020

it could be a third party dependency you pulled from crates.io (e.g. a PAC or HAL).

Can't speak for HALs, but PACs could have a feature enabled by default that enables the initialization code. I don't know how painful that would be for Cortex-M world.

My experience is that the "union" of features problem doesn't really apply to (deeply) embedded Rust code because your dependency "tree" (before it's turned into a graph) isn't that deep in the first place.

If you need to disable this feature, chances are you are not using a large set of dependencies because:

  1. You clearly need the space savings, so you have opted to do some things yourself.
  2. no_std limits the set of crates you can bring in in the first place.

This is a niche use case; if a user needs to disable the init code, then they can put in a bit of extra effort to do so.

@cr1901

cr1901 commented Apr 26, 2020

I discovered that the GNU assembler has special code for optimizing out .bss and .data initialization on MSP430. I explain in better detail here.

My stance at this point is that this shouldn't be a feature. But perhaps the various rt crates and r0 can be modified to feed a sufficiently-smart llvm/GNU assembler the required data to optimize out init code?

@japaric
Member Author

japaric commented Apr 28, 2020

I'm not going to push for this anymore but if anyone is interested in this feel free to submit a new proposal.

@japaric japaric closed this as completed Apr 28, 2020
rukai pushed a commit to rukai/cortex-m-rt that referenced this issue May 1, 2021
32: Build with 2018 edition, and fix the example r=korken89 a=jonas-schievink

r? @therealprof 

Co-authored-by: Jonas Schievink <jonasschievink@gmail.com>