Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more ptrace_* wrappers #666

Closed
wants to merge 5 commits into from
Closed

Conversation

marmistrz
Copy link
Contributor

@marmistrz marmistrz commented Jul 10, 2017

Btw. I think that the nix::sys::ptrace::ptrace function should be marked as unsafe. It allows to change arbitrary registers and addresses in a child process - just as dereferencing a pointer is unsafe in Rust.

I was unable to use libc::user_regs_struct, the Rust compiler complains that it can't find the members.
By the way, we're missing PTRACE_GETREGS.

I tried to use the user_regs_struct in the following way:

extern crate libc;

fn main() {
    println!("{}", libc::user_regs_struct::rdx);
}

which bails out with

file: 'file:///home/marcin/proj/user_regs/src/main.rs'
severity: 'Error'
message: 'no associated item named `rdx` found for type `libc::user_regs_struct` in the current scope'
at: '4,22'
source: ''

@vincenthage
Copy link

vincenthage commented Jul 10, 2017

Hi,
wouldn't it be more useful to have the Register enum like this:

pub enum Register {
    SysArgNum,
    SysArg1,
    SysArg2,
   ...
}

and then use cfg to make the link between the enum and the real register, either with a macro or a function:

#[cfg(all(target_os = "linux", any(target_arch = "x86_64")))]
#[inline]
pub fn get_reg(regs: &user_regs_struct, register: Register) -> c_long {
    match register {
        SysArgNum => regs.orig_rax,
        SysArg1 => regs.rdi,
        SysArg2 => regs.rsi,
        ...
    }
}

As it is implemented in this PR, the Register enum is redundant with the content of user_regs_struct,
and doesn't make it more easy to use.

For user_regs_struct, you can use it like this:

/// Copy all `tracee`'s general purpose registers into a dedicated cache.
/// Returns either `Ok(regs)` or `Err(Sys(errno))` or `Err(InvalidPath)`.
#[inline]
pub fn fetch_regs(pid: pid_t) -> Result<user_regs_struct> {
    let mut regs: user_regs_struct = unsafe { mem::zeroed() };
    let p_regs: *mut c_void = &mut regs as *mut _ as *mut c_void;

    ptrace(PTRACE_GETREGS, pid, null_mut(), p_regs)?;

    Ok(regs)
}

See here how to use all this (though I've implemented it like a macro so that it's faster, it can be converted to functions with big matches).

@marmistrz
Copy link
Contributor Author

marmistrz commented Jul 10, 2017

@vincenthage for me this could be an extra feature. I think that our wrappers should support all registers, since it's what raw ptrace supports. I don't want to cripple existing functionality.

How did you manage to get the user_regs_struct members recognized?

@vincenthage
Copy link

vincenthage commented Jul 10, 2017

I don't understand your question, as user_regs_struct is a structure. On each syscall intercepted, the process' registers will contain the arguments' addresses.

Nix's wrappers should indeed support all registers, but I wonder if a list of field names, where some of them wouldn't be used (or even exist?) depending on the processor type, is a good solution.
The registers calling conventions (application binary interface) are different for every processor (even on a same one, they can differ by whether the program is run in 32-bits or 64-bits mode, see here), so in every case, at some point, you will have to have one cfg for each configuration.

@marmistrz
Copy link
Contributor Author

@vincenthage see the first post. (I tried to use...)

And even if we did as you suggest: what if the user requests the 3rd argument when we have a syscall which only accepts 2? Or uses that after PTRACE_SINGLESTEP. This would automatically make this function unsafe.

We could have an extra wrapper to answer the question what is the register is used for storing nth argument, but I'd treat it as an extra feature. @Susurrus, what do you think?

I'd suggest we left this as it is and focus on another thing: do we declare pokedata/peekdata as unsafe? I'm not fully convinced but I think we should.

@vincenthage
Copy link

vincenthage commented Jul 11, 2017

Let's just start again so that we agree on the basics (it will also be useful for anyone looking into this the first time).
Every syscall arguments/components are like these:

pub enum Register {
   SysArgNum,
   SysArg1,
   SysArg2,
   SysArg3,
   SysArg4,
   SysArg5,
   SysArg6,
   SysArgResult,
   StackPointer,
   InstrPointer,
   RtldFini,
   StateFlags,
   UserArg1
}

Each of them corresponds to a certain field of user_regs_struct. Each field of this structure is a memory address (or just a numeric value, like SysArgNum), or NULL if unused.
For instance, SysArg1, the address to the first argument of any syscall, is rdi on x86_64 processors, and ebx on x86 processors.
user_regs_struct itself is different on every type of processors, which makes all this way more difficult. And libc currently has defined this structure only for x86_64 and x86. Yay.

Every time the tracee is stopped, you can get the content of the syscall by 2 ways (let's say we want the 1st argument):

  • by using the ptrace syscall to fetch only one of the registers. This is done by giving the corresponding offset of the user_regs_struct's field. This is what you're trying to do, if I understood correctly. So something like let offset_arg1 = offset(Register, RDI) and then ptrace(PEEKUSER, pid, offset_arg1, null_mut()).
  • by using the ptrace syscall to fetch all of the regs (user_regs_struct) at once (with ptrace(GETREGS)), and then going to the field(s) you want. So: let regs = fetch_regs(), and then get_reg(regs, SysArg1) which will return the value of the field rdi in the user_regs_struct structure if the mapping is done.

Now, if we agree on all this:

And even if we did as you suggest: what if the user requests the 3rd argument when we have a syscall which only accepts 2?

The get_reg function just redirects to one of the fields of user_regs_struct. If it's not used by the syscall, it will be 0/null, no issue. And worst case, if the user only wants one argument, then they can still use the ABI (Register to user_regs_struct correspondance) to know which register (and so which offset) to use, and call ptrace manually with PEEKUSER. But this would be useless, because to effectively use any of the arguments, you need to know which syscall this is. And so you would first need to get SysArgNum, and then call ptrace again for the other argument. So, we might as well just get all the regs at the same time, it's faster.

Again, the Register class you wrote is redundant with user_regs_struct, and only corresponds to the x86_64 version of it, so it will only have a meaning for this precise type of processor. For instance, in the x86 version of it, R15 doesn't even exist. I don't think writing a (redundant) wrapper for all the other versions of it is that useful.
edit: OK, this wrapper was asked in the corresponding issue, nevermind then.

Now, it all depends on what you mean by a high-level wrapper. But in my opinion, a high-level wrapper shouldn't involve having the user manipulate offsets to find the arguments of a syscall.

What I propose is directly giving a translated version of user_regs_struct, Registers, that would have the memory address of all the syscall's arguments. That would be very convenient for any user, though it might be a bit too high-level. And while this is practical for reading values, it's impractical for writing values. This needs more discussion.

The best intermediate might be a get_reg! macro, because you can use it for reading

let regs = fetch_regs(pid)?;
let sysnum = get_reg!(regs, SysArgNum)

but also writing:

let new_regs = regs.clone();
get_reg!(new_regs, SysArgNum) = 42;

and this could be allied with a function and a match on every enum to convert a user_regs_struct into Registers and vice-versa.

@vincenthage
Copy link

On a side note, there is this offset macro that might work for your wrap, but it would require lazy_static to use it I think.

@marmistrz
Copy link
Contributor Author

All the members of the user_regs_struct are longs. Not to mention that my experience was: peeking wrong register means reading garbage which is completely nondeterministic. (I used PEEKUSER, no experience with GETREGS, but nix::sys::ptrace doesn't support GETREGS anyway)
Even if ptrace returned 0/NULL for all the registers which are not in use, we'd have absolutely no way of finding out if's 0 because that's the arguments - or that it's simply unused.

Moreover, I don't see a ptrace routine to check the number of arguments in a syscall. This would probably require some FFI work which I don't know if I have time for.

And your suggested approach with fetch_regs is something I won't accept - if I only need the value from 2 registers, why fetch all >= 7 (6 arguments + RAX + ...). As far as I remember assembly on x86_64 - you don't put the argument number into any register either.

The whole point of these high-level wrappers is that they wrap the existing, atrocious ptrace API. Casting integers to const* c_void is something that shouldn't appear in user Rust code. This should be hidden from the programmer - and the current nix::sys::ptrace::ptrace is just a 1-1 wrapper when it comes to the input arguments.

My idea is: add a fn getreg(...) which would just give the appropriate value of enum Register for the nth argument. We should not constrain the possibilites of ptrace and sys::nix::ptrace::ptrace should eventually get deprecated.

@vincenthage
Copy link

vincenthage commented Jul 12, 2017

You win. The lighter the wrap, the better anyway.

About GETREGS, it's not that bad actually if you zero the structure beforehand; all the unused arguments are deterministically set at 0. Though I haven't seen any advanced benchmarks comparing one GETREGS call vs multiple PEEKUSER calls, so I can't say which one is faster.

@marmistrz
Copy link
Contributor Author

Is this behavior documented?

Let's get down to the other issue: what about unsafe?
I think that peekdata, pokedata should definitely be unsafe, since they can crash the traced process. What do you think?

I asked a related question on users.rust-lang.org: https://users.rust-lang.org/t/when-should-we-mark-our-functions-unsafe/11834

@vincenthage
Copy link

vincenthage commented Jul 13, 2017

All PtraceRequest that can crash the tracee should be unsafe indeed.

PtraceRequest that can write are probably also concerned (simultaneous writes / memory unsafe / tracee crash by giving invalid arguments), so the poke* and set* families should all be unsafe.

It might actually be simpler to make all ptrace operations unsafe by default, and then whitelist the ones that we can control (traceme and syscall for instance).

@marmistrz
Copy link
Contributor Author

Marked that.
The register offsets are not in the libc crate. Do we leave them hardcoded as they are for now?

And one more thing about the types. The sys::nix::ptrace::ptrace returns Result<c_long>. I think the user API should avoid these types. What do you think about returning Result<u64> in peek*? I think it would be reasonable.

@Susurrus
Copy link
Contributor

The register offsets are not in the libc crate. Do we leave them hardcoded as they are for now?

This won't be merged until we figure out if those constants can be added to libc and get them added if so. We try not to define any FFI values or functions within nix directly.

@vincenthage
Copy link

vincenthage commented Jul 14, 2017

What do you think of this:

use std::ptr;
use libc::user_regs_struct;
use nix::sys::ptrace::ptrace::PTRACE_PEEKUSER;

pub type UserRegisters = user_regs_struct;

// the one in macros.rs
macro_rules! offset_of {
    ($ty:ty, $field:ident) => {
        &(*(0 as *const $ty)).$field as *const _ as usize
    }
}

macro_rules! peekuser {
    ($pid:expr, $reg:ident) => {
        {
            let reg_offset = unsafe {offset_of!(UserRegisters, $reg)} as *mut c_void;
            ptrace(PTRACE_PEEKUSER, $pid, reg_offset, ptr::null_mut())
        }
    }
}

which can be used like this:

#[test]
fn test_peekuser() {
    let pid = -1;
    let reg_value = peekuser!(pid, r15);

    assert!(reg_value.is_err());
}

whereas using an non existing register field is impossible:

peekuser!(pid, r9999);

yields:

error[E0609]: no field `r9999` on type `libc::user_regs_struct`
   --> src/register/regs.rs:80:64
    |
80  |             let reg_offset = unsafe {offset_of!(UserRegisters, $reg)} as *mut c_void;
    |                                                                ^^^^ did you mean `r9`?
...
100 |     let reg_value = peekuser!(pid, r9999);
    |                     --------------------- in this macro invocation

error: aborting due to previous error(s)

With this, no need for a wrapper for each version of user_regs_struct.

@vincenthage
Copy link

vincenthage commented Jul 14, 2017

Also, @marmistrz, from what I gathered the size of c_long can change depending on the processor. It's 8 bytes in 64bits, but 4 in 32bits. If so, we can't replace it by a fixed type like u64.

Though, I agree that a light wrapper (like pub type Word = c_long; ) would make ptrace's functions cleaner.
We could also define it as struct Word(c_long) and implement useful functions for it (has_value, is_null, to_ptr, to_value) if we need it.

edit: Also found out that the return value of ptrace should be c_ulong instead of c_long. Negative memory adresses don't make much sense.

@marmistrz
Copy link
Contributor Author

marmistrz commented Jul 17, 2017

Yeah, I like it! :) This one's cross platform, the API is nice. Should I try adding the offsets to libc nevertheless?
ptrace returns a long: man ptrace

       long ptrace(enum __ptrace_request request, pid_t pid,
                   void *addr, void *data);

I'll ask on the Rust forum about the return types, I don't really like the idea of leaking C types to Rust. The ptrace API will not be portable anyway.

/edit: There's a small problem with the macro.... From #rust-beginners:

<kimundi> marmistrz: For what its worth though, afaik this macro is using UB to do this, and the proper way would be compiler support
(...)
<WindowsBunny> Dereferencing a nullptr is undefined behavior

@vincenthage
Copy link

vincenthage commented Jul 17, 2017

There are safe versions of the offset_of macro, like here.
And according from the comments it compiles down to constants so it's quick.
edit: Ah wait it's for an enum. Well I'm sure there is one for structures too. There is this crate for instance that does it.

About replacing c_ulong, wouldn't usize do the job? It's equivalent to u32 on x86 processors and u64 on 64bits.

@vincenthage
Copy link

vincenthage commented Jul 17, 2017

Version 2:

macro_rules! offset_of {
    ($ty:tt, $field:ident) => {
        {
            let base: $ty = unsafe { ::std::mem::uninitialized() };
            let offset_field = &(*(&base as *const $ty)).$field as *const _ as usize;
            let offset_base = (&base as *const $ty) as *const _ as usize;
            let offset = offset_field - offset_base;
            
            ::std::mem::forget(base);
            offset
        }
    }
}

This one uses an instance of the structure to calculate the offset.
So instead of doing : 0.field - 0 equivalent to 0.field
it does: base.field - base.

If the compiler doesn't optimise it into a constant, we can use lazy_static! to instantiate a constant user_regs_struct base that would be used everytime for this (specialized) macro.

@marmistrz
Copy link
Contributor Author

Does std::mem::uninitialized does heap or stack allocation? It seems to be optimized: https://godbolt.org/g/i4DqKx
Asked about the C types here: https://users.rust-lang.org/t/leaking-c-types-in-rust-interfaces/11898

@marmistrz
Copy link
Contributor Author

I noticed another problem: depending on the context we may want to put a signed or unsigned value into the register. I'm not so sure if a cast to usize will always do what we want. (it could possibly modify the values in some way or panic)

@Susurrus Susurrus changed the title High level improvements for the ptrace module. Add support for PTRACE_GETREGS Jul 18, 2017
@Susurrus Susurrus mentioned this pull request Jul 18, 2017
@marmistrz
Copy link
Contributor Author

First of all, I didn't add PTRACE_GETREGS. I added other wrappers.
Second of all - do you have an idea why the automated tests fail? On Android use of self::ptrace::* fails, but I just added an unsafe here.

In my new test it complains about Word not recognized, but this works on my system.

Copy link
Contributor

@Susurrus Susurrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A variety of minor changes are necessary. Please also run cargo doc --no-deps --open to look at the resulting documentation and make sure it's pretty.

This also needs a rebase due to the latest changes (some which affect ptrace)

type Word = usize;

/// Makes the `PTRACE_SYSCALL` request to ptrace
pub fn syscall(pid: Pid) -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be called ptrace_syscall, we're prefixing our helper functions with ptrace_ to make it more clear. syscall is already a POSIX function, so we don't want any confusion here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has to, I'll change it. But IMHO what I do now is cleaner. With my approach you just use nix::sys::ptrace and call it like ptrace::syscall(), with the suggested approach: use nix::sys::ptrace::{ /* an endless list of imports */ } or you have to use a star, which is not a great idea in general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point as these are all namespaced unlike in C. Could you have a separate commit then that renames the existing ptrace_* wrappers? That'll have to be documented in the CHANGELOG entry as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I just meant a separate commit within this PR, not a new PR. But that's fine too.

use super::*;

#[test]
fn test_types() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document why this test is necessary.

ptrace(ptrace::PTRACE_PEEKDATA, pid, addr as *mut c_void, ptr::null_mut()).map(|r| r as Word)
}

/// Makes the `PTRACE_PEEKDATA` request to ptrace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is wrong.


/// Makes the `PTRACE_PEEKDATA` request to ptrace
///
/// This function allows to access arbitrary data in the traced process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this under a # Safety section, which is a special section understood by rustdoc

}

/// Sets the process as traceable with `PTRACE_TRACEME`
pub fn traceme() -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, should be ptrace_traceme

/// Sets the process as traceable with `PTRACE_TRACEME`
pub fn traceme() -> Result<()> {
unsafe {
ptrace(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put first argument on first line and align subsequent arguments with it only if they take up more than 99 columns of space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran rustfmt on the code (commit 2a295c6) and it's how rustfmt does it.

@@ -140,3 +150,110 @@ pub fn ptrace_setsiginfo(pid: Pid, sig: &siginfo_t) -> Result<()> {
Err(e) => Err(e),
}
}

//-------------------------- Second part: a high-level wrapper for ptrace ----------------------//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments need to be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with them? They clearly separate the sections

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're ugly mostly, the code has a sane ordering and these comments are mostly clutter. Generally source files are written so that low level stuff is at the top and high-level stuff is at the bottom. This is a convention from the C days when declarations had to be in order. It's not necessary in Rust, but it's still followed generally.


//-------------------------- Second part: a high-level wrapper for ptrace ----------------------//

#[cfg(target_arch = "x86_64")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why your tests are failing, you need to either limit peekuser() to x86_64, since that's the only play this register is defined for, or expand Register to more platforms.

@Susurrus
Copy link
Contributor

First of all, I didn't add PTRACE_GETREGS.

You can edit the title as well, feel free to change it to something more appropriate. The previous title didn't help me know which issue this was when reading it, so I changed it.

do you have an idea why the automated tests fail? On Android use of self::ptrace::* fails, but I just added an unsafe here.

It's as the error says, Register is undefined for no-x86_64 targets.

@Susurrus Susurrus changed the title Add support for PTRACE_GETREGS Add more ptrace_* wrappers Jul 19, 2017
@marmistrz
Copy link
Contributor Author

marmistrz commented Jul 19, 2017 via email

@Susurrus
Copy link
Contributor

Yes, because otherwise it's a consistency issue. You should have 3 commits here, the first is renaming the existing functions, the second is all of your changes, and the third is updating the CHANGELOG.

Copy link
Contributor

@Susurrus Susurrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also extend the list of UnsupportedOperations reported by ptrace to include the new wrappers you've introduced. We'd like to get people using the wrappers instead of the raw ptrace() function.

type Word = usize;

/// Makes the `PTRACE_SYSCALL` request to ptrace
pub fn syscall(pid: Pid) -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I just meant a separate commit within this PR, not a new PR. But that's fine too.

@@ -140,3 +149,124 @@ pub fn ptrace_setsiginfo(pid: Pid, sig: &siginfo_t) -> Result<()> {
Err(e) => Err(e),
}
}

#[cfg(target_arch = "x86_64")]
// We're going to export it anyway
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments come before attributes.

@@ -68,9 +73,13 @@ mod ffi {
}
}

/// Performs a ptrace request. If the request in question is provided by a specialised function
/// A low-level wrapper for `ptrace`. If available, the higher-level wrappers should be considered instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a short one-line description with a blank line and then more description below.

///
/// # Safety
/// This function allows to access arbitrary data in the traced process
/// and may crash the inferior if used incorrectly and is thus marked `unsafe`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "the inferior" here? That doesn't read right to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, gdb always talks about the inferior - the process that's spawned by gdb and traced.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the term used on the man pages? That's the reference docs we link to, so we should use that terminology. I believe they use "tracer" and "tracee".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marmistrz Any comment on this?

/// Makes the `PTRACE_PEEKDATA` request to ptrace
///
/// # Safety
/// This function allows to access arbitrary data in the traced process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be "allows for accessing"

/// Makes the `PTRACE_POKEDATA` request to ptrace
///
/// # Safety
/// This function allows to access arbitrary data in the traced process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same grammar nitpicks as above.

/// This function allows to access arbitrary data in the traced process
/// and may crash the inferior or introduce race conditions if used
/// incorrectly and is thus marked `unsafe`.
pub unsafe fn pokedata(pid: Pid, addr: usize, val: Word) -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see some tests for these if possible, even if they just return errors, if for no other reason than to test the ergonomics of our wrappers. For example here I don't know how you'd generate addr since it's supposed to be a raw pointer type.

Copy link

@vincenthage vincenthage Aug 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One simple test that doesn't require any memory allocation is to cancel a syscall by changing its syscall number to 0 during the enter stage, and then reverting it during the exit stage.

@marmistrz
Copy link
Contributor Author

is it ok now?

fn ptrace_peek(request: Request, pid: Pid, addr: *mut c_void, data: *mut c_void) -> Result<c_long> {
let ret = unsafe {
unsafe fn ptrace_peek(request: Request, pid: Pid, addr: *mut c_void, data: *mut c_void) -> Result<c_long> {
let ret = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason to have this in braces, just make these two separate lines instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed that and reformatted to avoid exceeding 80 columns of length.

Copy link
Contributor

@Susurrus Susurrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Biggest question is does it make more sense to make syscall*! a method on Register? There's a runtime overhead but it clarifies the docs and unifies things a little nicer. This was discussed somewhat in the original discussion I believe, so it may already have been considered.

ptr::null_mut(),
ptr::null_mut(),
).map(|_| ()) // ignore the useless return value
ptrace_other(Request::PTRACE_ATTACH, pid, ptr::null_mut(), ptr::null_mut()).map(|_| ()) // ignore the useless return value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line's too long, but it's obvious what the final map() is doing, so just remove the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

#[cfg(target_arch = "x86_64")]
#[allow(non_camel_case_types)]
#[derive(Debug, PartialEq)]
/// Represents all possible ptrace-accessible registers on x86_64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this comment before the #[cfg...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

#[derive(Debug, PartialEq)]
/// Represents all possible ptrace-accessible registers on x86_64
pub enum Register {
R15 = 8 * ::libc::R15 as isize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have all these registers documented, which I think I requested in my earlier review. Could you do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, R15 is just an... R15 register. If someone's using ptrace, it they have to know what the register does anyway.

I don't see any added value in documenting it as

R15 = 8 * ::libc::R15 as isize, /// the R15 register

People using ptrace are not kids.

///
/// 0th argument is considered to be the syscall number.
/// Please note that these mappings are only valid for 64-bit programs.
/// Use syscall_arg32 for tracing 32-bit programs instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please link to syscall_arg32 and put it in backticks. You can see how we link to other types by looking at src/sys/wait.rs:57.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// # }
#[cfg(target_arch = "x86")]
#[macro_export]
macro_rules! syscall_arg {

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

/// ```
/// # #[macro_use] extern crate nix;
/// # fn main() {
/// assert_eq!(syscall_arg!(1), nix::sys::ptrace::Register::RDI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a good example because a) it uses an assert (this should be a test if anything) and b) it doesn't demonstrate actual usage. No one is going to write code like this. How would you actually use syscall_arg! in practice? As an argument to pokeuser, right? This applies to all 3 of these macros.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's similar to what the official docs do: https://doc.rust-lang.org/std/fmt/fn.format.html
And it's not trivial to assert in doc tests without all the boilterplate.

The main thing here is to explain the syntax. The rest is obvious.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no issues with the boilerplate, I'm suggesting that examples be a demonstration to the user of how this is used. While yes, you show the syntax to use with this macro, we can do better by showing it in a broader context, where it's used as an argument to one of the ptrace wrapper functions. So I suggest you copy one of the lines where this is used from the test code (like as an argument to pokedata) and stick it here and make this code block no_run so it will at least be compile-checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a valid struct Pid, which has to be obtained from somewhere (a user will wonder what the heck is pid here). What about just mentioning that the purpose is to achieve arch-independent syscall argument lookups?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave the boilerplate commented-out here as well. Do what you need to do to get it to compile, but having something like ptrace::peekuser(child, syscall_arg!(0)) As the example would be better in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need valid values and can't get them, just mark this as only a compilation test by using ```no_run` to start the example. The point of an example isn't to test the code it's to show the user how they might use the code and provide a little additional context. The example I provided above, when annotated with a little description, would suit both of those goals.

@@ -1,9 +1,14 @@
#![cfg(all(target_os = "linux", any(target_arch = "x86",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already specified in test/sys/mod.rs, why are you respecifying it here? You probably want to restrict just your new tests a little more rather than the whole module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I don't remember. Fixed that.

@marmistrz
Copy link
Contributor Author

I think that having this compile-time checked is a superior benefit.

Could we please get it merged before Friday? I want to rebase a project onto the upstreamed functions before I go on a short vacation.

#[cfg(target_arch = "x86_64")]
#[allow(non_camel_case_types)]
#[derive(Debug, PartialEq)]
pub enum Register {
Copy link
Contributor

@Susurrus Susurrus Sep 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a logical ordering to these registers? I would at least expect R10..R15 to be in order and they're not. I'm not certain how the docs for these are generally arranged, but maybe we can do a better grouping here?

Additionally is there no reference documentation we can link to here to give users more background on these registers? Some canonical reference? The kernel has some docs that touch on this here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the order from user_regs_struct. Please, don't make nix an operating systems kindergarten. As said in another comment, there's a lot of other things that you need to know, to be able to use ptrace.

/// ```
/// # #[macro_use] extern crate nix;
/// # fn main() {
/// assert_eq!(syscall_arg!(1), nix::sys::ptrace::Register::RDI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no issues with the boilerplate, I'm suggesting that examples be a demonstration to the user of how this is used. While yes, you show the syntax to use with this macro, we can do better by showing it in a broader context, where it's used as an argument to one of the ptrace wrapper functions. So I suggest you copy one of the lines where this is used from the test code (like as an argument to pokedata) and stick it here and make this code block no_run so it will at least be compile-checked.

@Susurrus
Copy link
Contributor

Yep we can shoot for Friday and I think it's doable given we're mostly fixing little things at this point and I think the API is solid.

RCX = 8 * ::libc::RCX as isize,
RDX = 8 * ::libc::RDX as isize,
RSI = 8 * ::libc::RSI as isize,
RDI = 8 * ::libc::RDI as isize,

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

@marmistrz
Copy link
Contributor Author

@Susurrus, please remember I'm UTC+2. I'd like to have it merged before Friday morning UTC+2.

Copy link
Contributor

@Susurrus Susurrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant to get to this last night but this is about the busiest week to get me to review things. So here's what should be a final batch of comments.

@@ -55,6 +158,8 @@ fn test_ptrace_cont() {
use nix::unistd::fork;
use nix::unistd::ForkResult::*;

let _m = ::FORK_MTX.lock().expect("Mutex got poisoned by another test");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated to all the other changes in this commit. Can you break this out into a separate commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simply removed that for now. This can be added once the changes are squashed.

///
/// # Safety
/// This function allows to access arbitrary data in the traced process
/// and may crash the inferior if used incorrectly and is thus marked `unsafe`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marmistrz Any comment on this?

/// # }
#[cfg(target_arch = "x86")]
#[macro_export]
macro_rules! syscall_arg {

This comment was marked as resolved.

/// ```
/// # #[macro_use] extern crate nix;
/// # fn main() {
/// assert_eq!(syscall_arg!(1), nix::sys::ptrace::Register::RDI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave the boilerplate commented-out here as well. Do what you need to do to get it to compile, but having something like ptrace::peekuser(child, syscall_arg!(0)) As the example would be better in my opinion.

WaitStatus::PtraceSyscall(child) => {
match syscall_no {
None => {
let no = ptrace::peekuser(child, syscall_arg!(0)).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to run the tests as a 32-bit process on a 64-bit architecture and so these tests will fail because they use syscall_arg!? If so we should either fix the tests to work regardless or disable the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you address my question here?

RCX = 8 * ::libc::RCX as isize,
RDX = 8 * ::libc::RDX as isize,
RSI = 8 * ::libc::RSI as isize,
RDI = 8 * ::libc::RDI as isize,

This comment was marked as resolved.

@Susurrus
Copy link
Contributor

@marmistrz Looked like we were pretty close on this. Sorry we missed your original target date, but any interest in polishing this off for merging?

@marmistrz
Copy link
Contributor Author

Interest yes, but unfortunately severe lack of time. This will have to wait for another week or two at least.

@marmistrz
Copy link
Contributor Author

is this PR still relevant?

@Susurrus
Copy link
Contributor

@marmistrz Who is supposed to answer that question? As you proposed this PR, I guess I'd ask you if it's still relevant: do you want it in nix and are you willing to finish polishing it off to merge? If no to either of those, let's go ahead and close this.

@marmistrz
Copy link
Contributor Author

marmistrz commented Apr 15, 2018

@Susurrus more precisely, I'm asking if the changes accepted during the last half a year have not obsoleted this PR. In the meantime the features could have been partially implemeneted or the API may have gone in a completely different direction, which could make the proposed API out place.

I'll have a while to polish it off, but not overly much, so I want to get it merged with as little hassle as possible.

@Susurrus
Copy link
Contributor

I don't know what's happened with the API, thought the CHANGELOG is comprehensive and I'd suggest you check there first, though git log when you specify a file will tell you what commits touched it, so that'd be easy to run too. Once you've determined that, please rebase this API if there's anything still worth merging.

@marmistrz
Copy link
Contributor Author

marmistrz commented Apr 16, 2018

Fixed the things mentioned in the last review. Should I now squash it all into one commit

ptrace::peekuser(child, syscall_arg!(0)) As the example would be better in my opinion.

this can't be easily asserted, so I'd leave it as it is.

@marmistrz
Copy link
Contributor Author

@Susurrus tests pass, can we merge?

Copy link
Contributor

@Susurrus Susurrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we're pretty close here. I think the examples could be improved to give users a more real-world use case to understand our API here.

@@ -80,6 +80,7 @@ This project adheres to [Semantic Versioning](http://semver.org/).
([#739](https://github.com/nix-rust/nix/pull/739))
- Expose `signalfd` module on Android as well.
([#739](https://github.com/nix-rust/nix/pull/739))
- Added nix::sys::ptrace::detach.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed, I assume it was a rebase error.

- Added specialized wrappers: `sys::ptrace::{peek, poke}{user, data}`
and macros: `syscall_arg`, `syscall_arg32` for register-to-argument
mappings. Using the matching routines
with `sys::ptrace::ptrace` is now deprecated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some weird indentation going on with this line.

/// ```
/// # #[macro_use] extern crate nix;
/// # fn main() {
/// assert_eq!(syscall_arg!(1), nix::sys::ptrace::Register::RDI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need valid values and can't get them, just mark this as only a compilation test by using ```no_run` to start the example. The point of an example isn't to test the code it's to show the user how they might use the code and provide a little additional context. The example I provided above, when annotated with a little description, would suit both of those goals.

/// ```
/// # #[macro_use] extern crate nix;
/// # fn main() {
/// assert_eq!(syscall_arg!(1), nix::sys::ptrace::Register::RDI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please improve this example in line with my comments about syscall_arg32!.

//! Interface for `ptrace`
//!
//! For detailed description of the ptrace requests, consult [`ptrace`(2)].
//! [`ptrace`(2)]: http://man7.org/linux/man-pages/man2/ptrace.2.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't render correctly on Rust 1.25.0.

WaitStatus::PtraceSyscall(child) => {
match syscall_no {
None => {
let no = ptrace::peekuser(child, syscall_arg!(0)).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you address my question here?

@marmistrz
Copy link
Contributor Author

Let's close it. This PR is over two years old and the API has substantially change since this PR was opened.

@marmistrz marmistrz closed this Aug 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants