How to deal with breaking changes on platform ? [BSDs related] #570

Open
semarie opened this issue Apr 7, 2017 · 86 comments

@semarie
Contributor

semarie commented Apr 7, 2017

I am opening this issue on libc because this is where the problems will start to show up. Depending on the solution chosen, modifications could be needed in the rustc repository too.

At OpenBSD, we don't care about breaking API/ABI between releases. Once a release is done, the API/ABI is stable, but there is no guarantee that it will be compatible with the next release.

Currently, in the upcoming 6.2 version of OpenBSD (6.1-current), there is a breaking change that will affect libc: si_addr should be of type void *, not char * (caddr_t). Here is the current definition in libc.

Under OpenBSD, we deal with the ABI under LLVM by using a triple like amd64-unknown-openbsd6.1. Rust instead uses an unversioned platform, so all OpenBSD versions are treated as defining the same ABI (which isn't quite right).

Do you think it is possible to switch from *-unknown-openbsd to *-unknown-openbsd6.0, *-unknown-openbsd6.1, ... without having to duplicate all the code in libc for each target, and without having to add a new target in rustc for each OpenBSD release?

Any other ideas on how to deal with it?

@alexcrichton
Member

Unfortunately I don't really know how we'd handle this, I just figured that platforms wouldn't do this.

If this happens a lot we'll just need to document what's wrong and stop adding new bindings, it'll be up to crates to implement version compatibility.

@semarie
Contributor Author

semarie commented Apr 9, 2017

I think it isn't just a "version compatibility" issue. My purpose isn't to have a compatibility layer for missing/removed functions or types.

The problem is that the OpenBSD triple is versioned, meaning that the API/ABI of one version could differ from that of another version.

I checked some other systems (running llvm-config --host-target to see whether the triple is versioned), and it seems to be a common situation in the non-Linux world:

  • x86_64-apple-darwin16.0.0
  • x86_64-unknown-freebsd12.0
  • x86_64-unknown-freebsd11.0
  • i386-unknown-openbsd5.8
  • x86_64-unknown-netbsd7.99

I also checked the LLVM source tree: the OS version is part of the triple definition (see getOSVersion() in include/llvm/ADT/Triple.h).
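
For illustration, the split between OS name and OS version in such triples can be sketched in Rust. This is only a minimal sketch and not LLVM's getOSVersion() implementation: the function name split_os_version is hypothetical, and it assumes an arch-vendor-os[-env] layout in which the version starts at the first digit of the OS component.

```rust
/// Splits the OS component of a target triple into (name, version).
///
/// Assumption for this sketch: the triple looks like `arch-vendor-os[-env]`
/// and the version, if any, starts at the first ASCII digit of the OS part.
pub fn split_os_version(triple: &str) -> Option<(&str, Option<&str>)> {
    // The OS is the third dash-separated component.
    let os = triple.split('-').nth(2)?;
    match os.find(|c: char| c.is_ascii_digit()) {
        // A digit somewhere after the first character starts the version.
        Some(idx) if idx > 0 => Some((&os[..idx], Some(&os[idx..]))),
        // No digit (or a leading digit): treat the component as unversioned.
        _ => Some((os, None)),
    }
}
```

Under these assumptions, i386-unknown-openbsd5.8 splits into the name openbsd and the version 5.8, while x86_64-unknown-linux-gnu has no version part.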

Maybe a concept is missing in Rust? If target_os_version were available, it would solve the issue: parts that are only defined for some OS versions could be isolated from the other OS versions.

I don't think the problem is limited to OpenBSD. Any OS with a versioned triple could be hit. OpenBSD exposes it because we make heavy use of the freedom to break API/ABI compatibility (it is a way to remove old stuff that harms security).
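
To make the idea concrete, here is a rough sketch of what version-gated bindings might look like. Note that target_os_version is hypothetical: rustc does not define such a cfg key, so a current compiler treats it as unset (and may warn about an unexpected cfg name), and the fallback definition is the one that actually gets compiled. The alias name SiAddr and the helper null_si_addr are also made up for this example.

```rust
use std::os::raw::{c_char, c_void};

// HYPOTHETICAL cfg key: `target_os_version` is not provided by rustc today.
// It stands in for the concept proposed in this comment.
#[cfg(all(target_os = "openbsd", target_os_version = "6.1"))]
pub type SiAddr = *mut c_char; // caddr_t, the older definition

// Everything else (including every real compiler today) takes this branch.
#[cfg(not(all(target_os = "openbsd", target_os_version = "6.1")))]
pub type SiAddr = *mut c_void; // void *, the newer definition

/// Returns a null address of whichever pointer type was selected above.
pub fn null_si_addr() -> SiAddr {
    std::ptr::null_mut()
}
```

The point of the sketch is only that the two definitions could live side by side in one source file, selected by OS version in the same way code is selected by target_os now.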

@alexcrichton
Member

Yeah, there's no concept of a versioned target in rustc right now, and unfortunately we're not really in a position to add one at the moment.

Our only recourse is basically to take the subset which currently works across all revisions, put that in libc, and then otherwise let downstream crates bind versions that change over time.

@semarie
Contributor Author

semarie commented Apr 11, 2017

I hope you are kidding: you are asking to remove the siginfo_t type for OpenBSD from libc, and so to break stack_overflow detection for OpenBSD (libstd relies on it). And even if we could drop stack_overflow detection, it wouldn't resolve the underlying problem.

So I am looking to extend Target to include os-version information in the target specification.

@alexcrichton
Member

Well, I'm not really kidding. If we feel we must fix this then we currently have no choice but to not add the bindings. If we don't want to do that then the fix must go elsewhere. I don't know the best way to fix this, just spitballing.

@asomers
Contributor

asomers commented Apr 20, 2017

This isn't just a problem for OpenBSD. FreeBSD 12, when it comes out, will change a number of important types, like ino_t and struct stat. If libc's policy is to only bind the greatest common denominator between versions, then over time it will shrink into irrelevance. Such a policy really just kicks the version compatibility can down the road.

Would it be possible to generate bindings dynamically at build time? When writing Ruby bindings, I've always preferred that approach to FFI. If not, then I think libc needs a way to distinguish between OS versions, just as it currently distinguishes between OSes.

@alexcrichton
Member

I would personally be afraid of generating bindings at compile time. It just pushes the problem to consumers without giving them tools to deal with it.

I do think this sounds like libc needs a way to distinguish between OS versions, but Rust currently has no real tool for doing so.

@asomers
Contributor

asomers commented May 1, 2017

The problem just got worse. Linux 4.11, released today, added a new system call: statx. Until libc learns to understand versions, it can't add support for statx. I really think that cargo needs some sort of configure step analogous to autoconf's configure.
https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.11-Statx-System-Call

@alexcrichton
Member

@asomers that's not quite true though, we can add bindings at any time. Rust supports Linux 2.6.18+ and there are a huge number of syscalls bound in libc not present in 2.6.18. It's up to crate authors to pick and choose apis for platform compatibility appropriately.

@asomers
Contributor

asomers commented May 2, 2017

@alexcrichton I guess I was wrong about how libc's CI tests worked. Are you saying that libc's tests do not flag symbols defined in FFI but not present in the system's headers? If that's true, then a Rust program trying to use statx on Linux <= 4.10 would build but get ENOSYS at runtime, right? That's better than what happens if a Rust program tries to use aio_waitcomplete on FreeBSD 10, where the FFI binding would actually be wrong. But Linux is not immune from changing syscalls, either. The first example I could find was utimensat. Its signature changed in 2010, well after 2.6.18 was released. Any Rust program using libc will try to use the new version of utimensat, even when built for older systems.

Would you consider dynamically generating the bindings for select functions, even if most functions have static bindings? Right now, I don't see any way at all for consumers to deal with the versioning problem.

@alexcrichton
Member

No, to be clear:

  • The libc crate is basically just a header file.
  • The libc crate is automatically verified on many platforms, but we have exceptions in libc-test/build.rs. It's not guaranteed that the verification passes on every instantiation of a platform.
  • If you use a function from libc you're referencing a symbol.
  • If that symbol doesn't actually exist on your system, you'll get a linker error.

Programs using statx will likely get a linker error and will then have to deal with that appropriately.

I would like to avoid dynamically generating the bindings, as that's typically not the actual solution to this problem. It makes cross compilation (even just across OS versions) much more difficult.

@asomers
Contributor

asomers commented May 2, 2017

Cross-compilation could be solved by overriding the build script's platform detection. For example, on FreeBSD there's basically only one symbol that a build script would need to detect: __FreeBSD_version. When cross-compiling, cargo could set that in an environment variable and the build script wouldn't try to detect it from the system headers.
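
As a rough sketch of that override idea: the helper below prefers an explicitly supplied value (for example, one read from an environment variable when cross-compiling) and only falls back to scanning header text for __FreeBSD_version. The function names are hypothetical, and cargo does not set any such variable today; that part is an assumption of this example.

```rust
/// Extracts the value of `__FreeBSD_version` from the text of a header
/// such as osreldate.h. Returns None when no matching #define is found.
pub fn parse_freebsd_version(header: &str) -> Option<u32> {
    header.lines().find_map(|line| {
        let mut words = line.split_whitespace();
        match (words.next(), words.next(), words.next()) {
            (Some("#define"), Some("__FreeBSD_version"), Some(value)) => {
                value.parse::<u32>().ok()
            }
            _ => None,
        }
    })
}

/// Prefers an explicit override (hypothetically supplied by the build
/// environment when cross-compiling) over detection from header text.
pub fn detect_freebsd_version(
    override_value: Option<&str>,
    header: Option<&str>,
) -> Option<u32> {
    match override_value {
        Some(v) => v.trim().parse::<u32>().ok(),
        None => header.and_then(parse_freebsd_version),
    }
}
```

A build script could pass std::env::var(...).ok().as_deref() as the first argument and the contents of the system header as the second, so that the host's headers are never consulted when an override is present.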

But it sounds like you might have something else in mind when you say it's "not the actual solution to this problem". Do you? What is the "actual solution", @alexcrichton ?

@alexcrichton
Member

To me the "actual solution" is precisely what we're doing right now. We list a bunch of symbols and authors need to be vigilant about which ones they use. This does not solve the use case of OpenBSD, however, if there are ABI breaking changes. We may need more than one solution but to me there are too many downsides to dynamically generating an API on Linux at least.

@asomers
Contributor

asomers commented May 2, 2017

Not only does it not solve OpenBSD's use case; it doesn't solve the use case where operating systems make changes that don't break the ABI. Both FreeBSD and Linux occasionally change syscalls and provide backwards compatible syscalls with the old signature and syscall number but a new name. For example, FreeBSD 8's "compat7.shmctl" syscall is identical to FreeBSD 7's "shmctl". Similarly, operating systems make changes to system libraries and provide backwards compatibility by bumping the SHLIB version and providing the old libraries as optional packages.

Currently libc handles neither of these cases. Either the libc binding tracks the new function's signature, which breaks Rust programs at runtime on older versions, or the libc binding stays with the old signature, which breaks Rust programs at runtime on newer versions. It's simply not possible for the current libc to compile correctly on multiple versions of an operating system. Your previously suggested solution is to simply remove a binding whenever the OS changes it. But that would break any crates that use the binding, violate semver, and still result in runtime failures for crates that use the old libc but were built on a new OS.

You suggest that libc's consumers should be responsible for versioning issues, but I don't think that's possible. Let's take stat(2), for example, which will likely change in FreeBSD 12. Suppose that when FreeBSD 12 is released, somebody tries to compile the nix crate on it. The linker will be satisfied that stat is present in libc, but the signature will be totally wrong, so nix will fail at runtime. Cargo won't produce any kind of warning. If I understand you correctly, you suggest that stat should at this point be removed from libc. But that won't fix nix until somebody updates that crate's dependencies, and even then it will only change a runtime failure into a compile time failure. Must the nix developer then write a build script that checks __FreeBSD_version and reimplement all of stat's FFI bindings for FreeBSD 12? That would finally fix the problem. But according to crates.io, libc has 1268 dependent crates, and all of them would have to independently write the same build script and add the same FFI bindings for stat on FreeBSD 12.

Alternatively, libc could assume that all operating systems provide backwards but not forwards compatibility (sorry OpenBSD). Then it could pick a minimum supported version, and always link against that version's shared libraries. Currently Cargo doesn't provide a mechanism to specify an exact shared library version to link against, but that could be added. This would fix all of the runtime failures, but at substantial cost: dependent crates would lack access to new OS features, and both developers and users would have to install the compat library packages. Not only would new features that change APIs be unavailable, but the shared library lock would mean that entirely new functions would be unavailable as well, unlike the current situation where using newly added functions will generate link failures when building on an old OS.

In either case, developers will likely fork libc to update their favorite bindings, resulting in a Balkanization of libc and dependent crates that don't support older OS versions.

I understand that cross-compilation is a really cool feature, but I fear that you're underestimating the severity of this problem. Have you looked into how embedded cross development toolchains work? AFAIK the host system requires full headers for the target. Maybe Rust needs to do the same.

@alexcrichton
Member

@asomers if you've got a proposal of what to do, I'd recommend writing up an RFC. With so many dependent crates, changes such as what I think you're proposing can't be taken lightly.

@semarie
Contributor Author

semarie commented May 6, 2017

@alexcrichton I pushed a WIP branch to my GitHub repository. I hope the code will be more explicit than my explanation of what I called support for OS versions.

Tree is at https://github.com/semarie/rust/tree/target-os-version . Please note my code isn't working for now.

Basically, it is:

  • extending Target to embed a (possibly empty) os-version string
  • exposing the string as a target_os_version symbol (in the same way as target_os)

It would then be possible to have conditional code based on the OS version (OpenBSD 6.1 or OpenBSD 6.0), in the same way we have conditional code based on the OS name (OpenBSD or FreeBSD).

@asomers
Contributor

asomers commented May 6, 2017

Good work @semarie. BTW, I've been studying ELF symbol versioning and I think it would be possible to fix libc without modifying Rust itself. Basically, libc would need to grow a bunch of feature flags like "freebsd11+", "freebsd10+", etc, meaning "build code that will work on FreeBSD 11 or greater" and "build code that will work on FreeBSD 10 or greater". Of course, those flags could be conditionalized so they won't appear on other platforms. Then, for every symbol that differs between FreeBSD versions, libc will bind a different version depending on which feature flags are set. The link_name attribute will encode the specific ELF symbol version number used on the oldest OS version chosen. I don't have code yet, but I think this approach will work for FreeBSD and Linux. Does OpenBSD use ELF binaries or is it still using a.out?

Also, I've found several functions in glibc with multiple versions. Linux is not immune from this problem.
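
A minimal sketch of the selection logic described above, with loud caveats: the MinFreeBsd levels stand in for the proposed feature flags, and the version node names (FBSD_1.0, FBSD_1.5) are illustrative assumptions that would need to be checked against the target system's symbol map. In a real binding the chosen string would go into a link_name attribute on the extern declaration; here it is returned from a plain function so the logic can be read and tested on any platform.

```rust
/// Hypothetical stand-in for feature flags such as "freebsd11+":
/// the oldest FreeBSD release the built code must run on.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum MinFreeBsd {
    V10,
    V11,
    V12,
}

/// Picks the versioned symbol name a binding for stat would link against.
/// The node names are assumptions for illustration, not verified values.
pub fn stat_link_name(min: MinFreeBsd) -> &'static str {
    if min >= MinFreeBsd::V12 {
        // Only new systems are targeted, so the new symbol version is safe.
        "stat@FBSD_1.5"
    } else {
        // Older systems must be supported: bind the oldest symbol version,
        // which newer releases keep exporting for compatibility.
        "stat@FBSD_1.0"
    }
}
```

The design choice is that the binding always tracks the oldest supported release, relying on the OS to keep exporting the old symbol version.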

@semarie
Contributor Author

semarie commented May 6, 2017

@asomers OpenBSD uses ELF on all platforms, but ELF symbol versioning doesn't help with breaking changes if the OS doesn't use symbol versioning.

@alexcrichton
Member

@semarie I'd personally recommend writing an RFC before sending that as a PR; I'm sure many others would have comments as well!

@raphaelcohn
Contributor

raphaelcohn commented May 17, 2017

This has hit me too - in particular, with changes in Mac OS X major versions. However, a good solution probably isn't to use a version number in the target triple, as there's a distinction to be drawn between libc version and OS version; they do not necessarily go in lockstep. A classic example might be changes to Linux's uapi headers, which don't yet line up with changes in musl, say.

This problem is a general one: incompatible changes in third-party (usually C) library APIs. It needs a good solution within Rust. It's a problem that's heavily compounded by setups that use dynamic libraries (something I've come to see as more trouble than they're worth for secure or robust systems outside of the desktop. In practice, it's far too difficult for most sysadmins to assess whether a security fix to a dynamic library affects more than one running program, and so they just go for the nuclear option of a reboot). Using autoconf-like tests or dynamic bindings at runtime is probably the wrong way to solve this generally. Such approaches require too much of the system they are on (execute permissions, existence of compilation-associated tools, headers, etc). They are also deeply unfriendly to security audits and locked-down systems (e.g. those built entirely from source). autoconf in particular makes the classic mistake of assuming that the build system is similar to the deployment system; it's always been an absolute beast to get things to cross-compile with it repeatedly, robustly and consistently. Too many things (e.g. time-of-date, location of shell interpreter, absolute sysroot paths, etc) creep into the deployed solution.

(Semantic versioning does absolutely nothing to solve this; in fact, semantic versioning is a deeply broken concept that's become popular recently. In practice, either a version is compatible or it isn't; semantic versioning is just the upstream author's assessment. One man's inconsequential security version or minor change is another's nightmare. In practice, with large system setups and deployments, I always encourage dev teams to think of only two kinds of version: likely-to-be-compatible security fix, and incompatible. Everything incompatible needs to go through the full test cycle before deployment. Security fixes can bypass that if urgent; risk vs reward and all that.)

@comex

comex commented May 25, 2017

Er, is the siginfo.h change in question actually ABI-breaking? char * and void * should have the same memory representation, so I'd expect that change to break the API (for newly compiled C code) but not the ABI.

Though it seems there's a more general problem to be solved here.
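
The API-versus-ABI distinction can be checked from the Rust side with a small sketch. The function names are made up for illustration; the only claim is that the two raw pointer types have the same size and alignment, while Rust's type system still treats them as distinct types that need an explicit cast.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::{c_char, c_void};

/// True when `char *` and `void *` have identical size and alignment on the
/// current target, i.e. swapping one for the other leaves the ABI unchanged.
pub fn pointers_share_layout() -> bool {
    size_of::<*mut c_char>() == size_of::<*mut c_void>()
        && align_of::<*mut c_char>() == align_of::<*mut c_void>()
}

/// The API side of the break: code written against the old field type has to
/// cast explicitly, because the two pointer types are distinct in Rust.
pub fn to_void(addr: *mut c_char) -> *mut c_void {
    addr as *mut c_void
}
```

So changing the declared field type in libc would leave compiled layouts intact but could stop downstream code from compiling, which is why it counts as a breaking change for the crate.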

@semarie
Contributor Author

semarie commented May 25, 2017

@comex yes, the change from char * to void * is just an API break as far as OpenBSD is concerned. But it is a change that can't be committed to the libc crate without a major version bump.

I started a discussion on internals, and I am working to submit a RFC.

@Rufflewind

Related: rust-lang/rust#42681

fs::metadata() crashes on FreeBSD 12 due to layout change in stat.h
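
To illustrate why a layout change of this kind fails at runtime rather than at link time, here is a deliberately simplified sketch. The two structs are NOT the real struct stat definitions; they only model the leading fields, under the assumption that the identifier types are widened from 32 or 16 bits to 64 bits.

```rust
use std::mem::size_of;

/// Simplified stand-in for the older layout (narrow dev/ino/nlink fields).
/// Illustrative only: not the real FreeBSD definition.
#[repr(C)]
pub struct StatOld {
    pub st_dev: u32,
    pub st_ino: u32,
    pub st_mode: u16,
    pub st_nlink: u16,
}

/// Simplified stand-in for the newer layout (64-bit dev/ino/nlink fields).
/// Illustrative only: not the real FreeBSD definition.
#[repr(C)]
pub struct StatNew {
    pub st_dev: u64,
    pub st_ino: u64,
    pub st_nlink: u64,
    pub st_mode: u16,
}

/// The symbol name `stat` is the same in both worlds, so the linker is happy,
/// but the callee writes a buffer whose size and field offsets differ.
pub fn layouts_differ() -> bool {
    size_of::<StatOld>() != size_of::<StatNew>()
}
```

A binding compiled against the old struct hands the OS a buffer that is too small and reads fields at the wrong offsets, which matches the kind of crash described in the linked issue.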

@mattmacy

#775

I don't quite understand how Rust initially missed out on OS and ABI versioning quite so badly, deciding either to assume that structures and types are immutable over time or to remove key structures from libc altogether. Nonetheless, Rust can conditionally compile based on configuration values, so what is stopping this?

@semarie
Contributor Author

semarie commented Sep 21, 2017

@mattmacy my understanding of the problem is:

  • build-time configuration adds more complexity for cross-compiling (you need to target a particular OS ABI)
  • cross-compiling is used a lot in the Rust infrastructure (for testing, for rustup...), so using the OS ABI would mean a potentially high number of new targets to check and binaries to produce, resulting in an infrastructure that is more complex to maintain (it needs to scale)
  • it is only a problem for BSDs, and they are not a high priority

@semarie changed the title from "How to deal with breaking changes on platform ? [OpenBSD]" to "How to deal with breaking changes on platform ? [BSDs related]" on Sep 21, 2017

@asomers
Contributor

asomers commented May 23, 2019

@gnzlbg @semarie's proposal would work well for OpenBSD, but at great cost to the Rust ecosystem. Adding target_os's for every version of FreeBSD and NetBSD would have even more cost. And it wouldn't solve the problem for rare and proprietary OSes. This is why I proposed an alternate solution at https://internals.rust-lang.org/t/pre-rfc-global-source-replacement-in-cargo-for-os-bindings/9383 . In a nutshell:

  • Rust would work on any OS based on one of the supported ones, including weird forks like the Nintendo Switch.
  • Rustup would only support one version of each OS. For example, Rustup would only provide a toolchain for the latest OpenBSD. Older versions of OpenBSD would be supported through OpenBSD's package manager.
  • Where possible, Rustup's toolchain would be backwards compatible with a range of OS versions. For FreeBSD, that means the toolchain would target FreeBSD 11 (but could build crates targeting FreeBSD 12). In a few years Rustup would retarget its toolchain to FreeBSD 12.
  • libc would no longer have to worry about the differences between different platform versions. In the case of FreeBSD, libc would target one particular version (probably 11), and a FreeBSD-12 specific libc would be provided by FreeBSD itself.
  • Most of the code changes would be confined to Cargo.

@gnzlbg
Contributor

gnzlbg commented May 23, 2019

@asomers I remember your proposal and I think it is interesting as well. From my point of view, your proposal and @semarie 's both solve problems in the same domain, but they have quite different goals and constraints on the solution, which is what results in two very different approaches, both having different trade-offs, to the point that we actually could do both since they appear to me to be compatible with each other.

Maybe it would be worth having a mini-RFC here first about what goals and constraints we have, and once we are on the same page about that, reconsider both solutions and try to come up with something that satisfies them all.

@valpackett
Contributor

Hmm, looks like you've decided to check it in libc's build.rs, nothing needed from the compiler:

libc/build.rs

Lines 19 to 23 in d5a599e

    if std::env::var("LIBC_CI").is_ok() {
        if let Some(12) = which_freebsd() {
            println!("cargo:rustc-cfg=freebsd12");
        }
    }

but only for testing right now.

If you just do <= checks instead of specifically 11 and 12, and on __FreeBSD_version from /usr/include/osreldate.h instead of freebsd-version output, this could very well be the easiest solution for production as well! And in cross-compilation, just use 11 for max compatibility.
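
A rough sketch of what such threshold checks could look like. This is not the code in libc's build.rs: the function names are hypothetical, the cfg names simply follow the freebsd12 flag quoted above, and the major version is derived by integer-dividing __FreeBSD_version by 100000.

```rust
/// Returns the cfg names a build script could emit for a given
/// `__FreeBSD_version` value, using threshold comparisons so that each
/// flag means "this major version or newer".
pub fn freebsd_cfgs(version: Option<u32>) -> Vec<&'static str> {
    // No detected version (e.g. when cross-compiling): assume 11 for
    // maximum compatibility, as suggested in the comment above.
    let major = version.map_or(11, |v| v / 100_000);
    let mut cfgs = Vec::new();
    if major >= 11 {
        cfgs.push("freebsd11");
    }
    if major >= 12 {
        cfgs.push("freebsd12");
    }
    cfgs
}

/// Formats the flags as the lines a build script prints for cargo.
pub fn cargo_lines(version: Option<u32>) -> Vec<String> {
    freebsd_cfgs(version)
        .into_iter()
        .map(|cfg| format!("cargo:rustc-cfg={}", cfg))
        .collect()
}
```

With cumulative flags like these, a future FreeBSD 13 would still get the freebsd12 bindings without any change to the build script.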

@gnzlbg
Contributor

gnzlbg commented Jun 8, 2019

@myfreeweb the current system is for private use within libc only. This allows us to implement FreeBSD12 APIs, keep track of the backwards incompatible changes, and test whatever system we choose to use to support this.

@valpackett
Contributor

valpackett commented Jun 8, 2019 via email

@Demi-Marie

@gnzlbg @semarie's proposal would work well for OpenBSD, but at great cost to the Rust ecosystem. Adding target_os's for every version of FreeBSD and NetBSD would have even more cost. And it wouldn't solve the problem for rare and proprietary OSes. This is why I proposed an alternate solution at https://internals.rust-lang.org/t/pre-rfc-global-source-replacement-in-cargo-for-os-bindings/9383 . In a nutshell:

* Rust would work on any OS based on one of the supported ones, including weird forks like the Nintendo Switch.

* Rustup would only support one version of each OS.  For example, Rustup would only provide a toolchain for the latest OpenBSD.  Older versions of OpenBSD would be supported through OpenBSD's package manager.

OpenBSD doesn’t bump Rust versions in stable branches of the ports tree, so one needs to use OpenBSD-CURRENT if one wants a recent Rust on that platform. This solution also does not work for cross-compilation.

For the open-source OSs, one option would be to run bindgen on /usr/include from the most recent release. That is a lot of code, but one advantage is that it is very easy to automate, which makes maintenance simpler.

@semarie
Contributor Author

semarie commented Mar 31, 2020

OpenBSD doesn’t bump Rust versions in stable branches of the ports tree.

It is by design: the stable ports tree receives only security updates. But nothing prevents a user from building a recent Rust release on OpenBSD stable.

@Demi-Marie

OpenBSD doesn’t bump Rust versions in stable branches of the ports tree.

It is by design: the stable ports tree receives only security updates. But nothing prevents a user from building a recent Rust release on OpenBSD stable.

Understandable. That said, this prevents products that depend on the latest stable Rust (such as Firefox) from getting security updates. Also, security vulnerabilities occasionally do appear in Rust itself.

@asomers
Contributor

asomers commented Jun 12, 2021

This RFC proposes a good solution: rust-lang/rfcs#3036 .

@64

64 commented Jun 13, 2021

@asomers How would that work exactly? Have code inside the libc crate which is protected by #[cfg(min_libc_version = ...)]?

@asomers
Contributor

asomers commented Jun 13, 2021

@asomers How would that work exactly? Have code inside the libc crate which is protected by #[cfg(min_libc_version = ...)]?

Exactly. Then consumers can set the min_libc_version in their own .cargo/config files.

@RalfJung
Member

RalfJung commented Dec 4, 2024

TIL that emscripten also likes to change their ABI a lot (see e.g. #3962). Isn't that pretty much the same problem? I am a bit surprised that the discussion above seems to be entirely about the BSDs.
