Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmsis: add __BKPT #540

Closed
wants to merge 1 commit into from
Closed

cmsis: add __BKPT #540

wants to merge 1 commit into from

Conversation

japaric
Copy link
Member

@japaric japaric commented Jul 26, 2018

I overlooked this intrinsic in my previous PR because the reference implementation implemented it as a macro rather than as a function.

@japaric
Copy link
Member Author

japaric commented Jul 26, 2018

hold on. It seems this implementation errors at call site when invoked:

error: invalid operand for inline asm constraint 'i'

@japaric
Copy link
Member Author

japaric commented Jul 26, 2018

To preserve the immediate value and match the CMSIS implementation I see two alternatives:

  • We implement the intrinsic as a macro like the reference C implementation does:
macro_rules! __BKPT {
    ($value:expr) => {
        asm!(concat!("bkpt ", stringify!($value)) : : : : "volatile")
    };
}

With this __BKPT!(0) works at call site but it also requires the asm feature gate
(#![feature(asm)]) at call site. So, I think this is a no go.

  • We implement the function as a gigantic match:
#[inline]
pub unsafe fn __BKPT(value: u8) {
    match value {
        0 => asm!("bkpt 0" : : : : "volatile"),
        1 => asm!("bkpt 1" : : : : "volatile"),
        // ..
        254 => asm!("bkpt 254" : : : : "volatile"),
        _ => asm!("bkpt 255" : : : : "volatile"),
    }
}

This also works but produces terrible machine code when compiled without optimizations -- all the
arms of the match are kept in the binary.


The other option is to deviate from the CMSIS specification and omit the immediate value in the
function:

#[inline]
pub unsafe fn __BKPT() {
    asm!("bkpt" : : : : "volatile")
}

From the ARM documentation:

Syntax
BKPT #imm

imm is ignored by the processor. If required, a debugger can use it to store additional
information about the breakpoint.

(emphasis mine)

So, I don't much is lost by not providing a way to set the immediate value. The immediate value is
necessary to implement semihosting (the bkpt 0xAB instruction is used to implement "system
calls"), but a __BKPT(u8) function is not enough to implement the semihosting system calls
because the registers need to be set up in a certain way (and you need to set some clobbers); to
implement semihosting one would still need to use inline assembly or an external assembly file.

I would vote for providing the __BKPT function that has no arguments.

Thoughts? @Amanieu @paoloteti

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 26, 2018

This also works but produces terrible machine code when compiled without optimizations

FWIW, this is what we do for all others intrinsics having the same issue: constify_imm8!: https://github.com/rust-lang-nursery/stdsimd/blob/master/coresimd/x86/macros.rs#L3.

There is a proc macro you can use to enforce that the value passed as the immediate is a const: #[rustc_args_required_const(arg_index)]

The plan was to, once we have const generics, require these arguments to be const using that feature, somehow.

Also, by "terrible code" you probably mean that this bloats the binary (by 1-2kb) and will trash the icache. However, opt-level=1 is typically enough to remove this, so if someone uses this intrinsic a lot enabling opt-level=1 in debug builds is probably a good solution (and if 1kb of code size matter, probably necessary anyways).

@paoloteti
Copy link
Contributor

There are 2 versions of BKPT and the ARMC implementation is

#define __BKPT(value) __breakpoint(value)

void __breakpoint(int val)

Where: val is a compile-time constant integer whose range is:

  • 0 ... 65535: if you are compiling source as ARM code
  • 0 ... 255 : if you are compiling source as Thumb code

So I guess, constify_xxx marcos are not a good idea for 16bits values.

Not sure exist a llvm.arm.bkpt or llvm.dbg.bkpt you can use.

I don't know if it works on Rust, but may be you can do something like

asm ("
...
".inst 0xde01 | (mask)" 
..")

to modify on the fly the BKPT istruction using .inst.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 26, 2018

So I guess, constify_xxx marcos are not a good idea for 16bits values.

There is one x86 intrinsic that takes an imm16 but with only 4096 possible values. The constify macro with 4096 match arms incurred something like 15 min compile-time cost, per call site.

What we decided to do back then is to just not provide the intrinsic and wait for const generics, which should solve this issue (e.g. you can pass the const as a type level value __breakpoint::<1337>(); or with some rfcs, just do __breakpoint(1337) where the function takes a const argument).

For 256 values, the constify macros work just fine with opt-level=1, and even opt-level=0 if you don't care that much about bloated binaries and trashing the icache in debug builds. The #[rustc_args_required_const(arg_index)] attribute prevent callers from passing the intrinsic a non-const value, so we can always improve code-gen in the future in a backwards compatible way.

@alexcrichton has mentioned a couple of times that we should avoid exporting macros for this because exporting macros from the std library is too tricky.

So I think that you should weight:

  • How important are these intrinsics? As in, independently of the form in which they are provided, would it be possible to keep them unstable when we stabilize the rest? (this will require nightly Rust to use them for the forseable future, but we can change their API with const generics)
  • Are they used often? As in, would the code bloat from the constify macro (255 match arm version) be tolerable in debug builds (e.g. they are used once or twice per binary), or are they used a lot, and this would be unacceptable in debug builds?

I'd say that if __BKPT(value) is important (must be provided) but not used that often, then just go with a function, using the #[rustc_args_required_const(arg_index)] attribute to require the argument to be const and and use the constify macro internally. Maybe in 2019 we can remove the constify macro by using const generics, or some internal compiler hook.

For void __breakpoint(int val) with 16-bit immediate, we just have no reasonable way of implementing this in Rust right now beyond a macro that expands the value in place. If this version can wait till 2019, I'd recommend waiting. Otherwise, if @alexcrichton says that a macro is a killer, then you can provide something like this:

// coresimd:
pub unsafe fn __breakpoint_hack<T: BreakPointHack>(x: T) { 
    ... T::VALUE ...
}
pub trait BreakPointHack { const VALUE: i16; } 

// user code: 
struct Eight;
impl BreakPointHack for Eight { const VALUE: i16 = 8; }
__breakpoint_hack(Eight);

Once we get const generics, we can then deprecate __breakpoint_hack and add a proper __breakpoint intrinsic. I'd prefer not doing that and just not providing the intrinsic for now.

@japaric
Copy link
Member Author

japaric commented Jul 26, 2018

@paoloteti

There are 2 versions of BKPT

0 ... 65535: if you are compiling source as ARM code

0 ... 255 : if you are compiling source as Thumb code

Some processors can change between ARM mode and Thumb mode at runtime, I think. Should we provide two versions of BKPT one with signature fn(u8) and one with signature fn(u16)? The thumbv* targets always execute in Thumb mode so they would only have access to the u8 version.

@gnzlbg

How important are these intrinsics?

I use BKPT a lot but without the immediate value: that is I use asm!("bkpt"). The CMSIS API doesn't provide a function to call BKPT without an immediate value AFAICT.

In stable I have to call BKPT via FFI because inline assembly is not stable. This is very annoying because FFI always involves a function call so the debugger will end in the wrong stack frame and I'll have to manually pop the last frame to see from where __bkpt was called. Having a way to inline the BKPT instruction on stable would be a great improvement to the debugging experience.

Are they used often?

They don't appear that often in my code. I would say that, while developing / debugging, I will have at least one instance; at most I would probably have 3 or 4 instances.

When I use semihosting BKPT appears a lot most often because it's used in the implementation of print! like macros. But as I mention above __BKPT(u8) can't be used to implement semihosting system calls so the match bloat would not be an issue.

In conclusion: to me having a stable version of BKPT without immediate value is much more important than having an unstable way to use BKPT with an immediate value.

@paoloteti
Copy link
Contributor

@japaric

Some processors can change between ARM mode and Thumb mode at runtime, I think.

More than a runtime switch is a double support. Cortex-R boot-up in ARM mode, but then can use thumb istructions (boot code shall be in ARM mode). So I agree on an u8 signature for thumb* and u16 for ARM. Also immetiate value is not important, but signature shall default to u8 in any case.

@japaric
Copy link
Member Author

japaric commented Jul 27, 2018

Forwarding a comment by @nagisa from another thread (rust-embedded/wg#63 (comment)):

re __BKPT.

Adding const generics is very likely to be backwards compatible, so we could expose __BKPT() -> bktp 0 now and add a defaulted const generic later on without breaking anybody.

I personally haven’t had a need for bkpt <nonzero> ever in my life, and I’m sure that bkpt 0 will be sufficient for some 99.99% of use-cases.

@paoloteti
Copy link
Contributor

I'm getting old and I used bkpt with non-zero values only with sw in RELEASE mode and/or without link to source code or in general when was not easy to know, from the PC, witch path of the code hits the bkpt (due to optimizations) . BTW I agree that We can survive without immediate value on bkpt.

@gnzlbg gnzlbg self-requested a review July 30, 2018 09:47
Copy link
Contributor

@gnzlbg gnzlbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging more CMSIS intrinsics we should resolve whether CMSIS is the right spec to implement.

I've re-opened #437 to discuss this.

This discussion is something that has to happen for the RFC anyways, so we might just as well have it now.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 3, 2018

@japaric given the new PR #557 - how do you think this would fit in? Or should we just close this for now? (I understand that acle doesn't solve this, but we can just add one intrinsic to solve this specific issue once the acle story is settled).

@japaric
Copy link
Member Author

japaric commented Sep 3, 2018

I think we could add __breakpoint, instead of CMSIS's __BKPT, which is an ARMCC (ARM's own C compiler) intrinsic even though it's not in ACLE (it's not in ACLE 1.1 or 2.0; I haven't found older versions of ACLE).

The signature of __breakpoint is void __breakpoint(int val) and val must be in the range 0..255 when operating in T32 mode (this corresponds to the +thumb-mode LLVM target feature) and in the 0..65535 when operating in A32 mode (-thumb-mode). Given the lack of const generics perhaps we can support __breakpoint in A32 mode too but limiting the range to 0..255?

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 3, 2018

I think we could add __breakpoint, instead of CMSIS's __BKPT, which is an ARMCC (ARM's own C compiler) intrinsic even though it's not in ACLE

This is fine, and there is precedence for this for some of the x86 intrinsics (e.g. has_cpuid etc.). Just add a reference to the ARMCC compiler documentations in the intrinsic docs.

Given the lack of const generics perhaps we can support __breakpoint in A32 mode too but limiting the range to 0..255?

Improving this later is a backwards compatible change (enabling new code to compile that did not compile before), so I think this would be fine.

Care to open a PR that only adds the __breakpoint intrinsic? EDIT: I think it is fine to morph this PR into that too.

@japaric
Copy link
Member Author

japaric commented Sep 3, 2018

Closing in favor of #558

@japaric japaric closed this Sep 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants