Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove direct usage of fork(2) in most tests, use trampolined execution instead #113

Merged
merged 8 commits into from
Feb 27, 2023

Conversation

pawelchcki
Copy link
Contributor

What does this PR do?

This addresses long standing issue with fork being inherently unsafe and unreliable way to run tests - and sidecar itself on production environments.

To enable running tests directly on linux - a new LD_PRELOAD based trampoline was added. It will automatically kick off when the spawn_worker crate is used from a linked executable.

However 1 caveat exists that enables this. The tests need to be compiled with "-rdynamic" flag which will export symbols so that they are available for LD_PRELOAD to look up and execute

Motivation

Evils of fork

Additional Notes

Fork is evil

How to test the change?

@pawelchcki pawelchcki requested review from a team as code owners February 21, 2023 12:48
ddtelemetry-ffi/build.rs Show resolved Hide resolved
ddtelemetry/tests/flock.rs Show resolved Hide resolved
@pawelchcki pawelchcki merged commit 8dbefe5 into main Feb 27, 2023
@pawelchcki
Copy link
Contributor Author

Thanks @paullegranddc and @r1viollet for the reviews 🙇

Comment on lines +78 to +86
pub fn recv_passed_fd() -> Option<OwnedFd> {
let val = env::var(ENV_PASS_FD_KEY).ok()?;
let fd: RawFd = val.parse().ok()?;

// check if FD number is valid
nix::fcntl::fcntl(fd, nix::fcntl::FcntlArg::F_GETFD).ok()?;

Some(unsafe { OwnedFd::from_raw_fd(fd) })
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me that we should probably close all non passed file descriptors, in order to not keep alive the applications files and socket open, since the sidecar could outlive the parent process

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the beauty of using exec! we don't have to!
Assuming we can trust most cases where people might have some sockets, and if they don't like those sockets to be shared with other processes they will do set CLOSEXEC

....

Now I realized that its the APM world we're talking about here :| - and maybe we should close them

enum ChildStdio {
Inherit,
Owned(OwnedFd),
Ref(libc::pid_t),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A raw fd should be of type std::os::fd::RawFd, libc::pid_t is confusing

Comment on lines +172 to +176
if fd.as_raw_fd() >= 0 && fd.as_raw_fd() <= libc::STDERR_FILENO {
Ok(ChildStdio::Owned(fd.try_clone()?))
} else {
Ok(ChildStdio::Ref(fd.as_raw_fd()))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this one. Why couldn't we clone non stdin/stdout/stderr file descriptors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code follows what Command::spawn is doing. Also since stdio fd's are not CLOSEXEC by default then we don't need to dupe them.

Comment on lines +324 to +327
// we're stripping the close on exec flag from the FD
// to ensure we will not modify original fd, whose expected lifetime is unknown
// we should clone the FD that needs passing to the subprocess, keeping its lifetime
// as short as possible
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this, on fork all of the processes fd gets cloned, so I am not sure what cloning an fd to a "smaller" lifetime improves..
On the CLOEXEC argument I am split.

  • Passing a fd with the flag set should be an error since the creator of the fd specified that they don't want them to get passed to forks
  • On the other hand, by default try_clone on io-lifetimes fds adds this flag https://doc.rust-lang.org/src/std/os/fd/owned.rs.html#94 . But IMO the fix should be either stopping to clone OwnedFds all over the place and use Borrowed ones instead, or use our own clone function that skips this flag.

@bantonsson bantonsson deleted the pawel/spawn_ld_preload branch March 7, 2024 07:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants