StdOut Observer Missing last_result in Implementation #2936

Open
DarrionRamos opened this issue Feb 4, 2025 · 10 comments

@DarrionRamos

Hello, I was looking to utilize the standard output observer and tried to run the example code provided (https://docs.rs/libafl/latest/libafl/observers/stdio/struct.StdOutObserver.html). But running the code returns error E0046 that the last_result is missing in the implementation for line 36 ExportStdXObserver's OT: MatchNameRef.

I am not very familiar with Rust, so it is possible that I have not set something up correctly. I have copied the code into the /fuzzers/structure_aware/baby_fuzzer_multi workspace and added track_hit_feedbacks to the Cargo file, which seems to have built everything else fine (I did have to replace stdout and stderr with output in lines 50 and 51). But the error about last_result points at a todo!(), so it is not clear to me whether this feature is complete or not.

So, is the stdout observer example something that should be currently working? If so how can I properly set it up and are there any executors that are supported with it other than CommandExecutor?

What I am trying to test is a bit abnormal in that LibAFL is not able to actually execute the real target system. Instead the idea is that it will generate the inputs for the system that I can then distribute to the appropriate programs. The difficulty I am having is with understanding how to get LibAFL to understand the coverage information I produce externally that is not coming from a built in observer/feedback pair. I was planning on piping the coverage in through stdin and then using the stdout observer to recognize it.

I am sure that there is a better way to accomplish this, but I am having difficulty understanding the internals of LibAFL and how I could get external coverage/feedback used by the mutator/fuzzer. Any pointers on what I should look into, or other ideas, would be greatly appreciated, thanks.

@domenukk
Member

domenukk commented Feb 4, 2025

What executor do you want to use? The executor needs to support StdOutObserver, yes.
Right now it looks like at least NyxExecutor and CommandExecutor support that. I feel like at least forkserver executor support could also be added, but what exactly do you need?

Usually the way to deliver observations back to LibAFL would be through custom observers via shared memory, for example. StdOut is more targeted at actually observing the output of the target (i.e., looking at ASan reports or similar).

So probably it makes sense to take one step back? :)

Maybe it also makes sense to ask in the fuzzing discord https://discord.gg/7Va7K4Fx

@DarrionRamos
Author

Hi Domenukk, thanks for your response. You're right, shared memory makes more sense for this.

For the executor, I do not really even need one. I would like to use LibAFL to create a fuzzer that takes in external coverage and a starting corpus, and outputs a new input after coverage from the previous run has returned. I was just using the inprocess executor as a workaround since it seems to me that an executor is required in the setup.

Regarding the observer, I have gone through a lot of the API, the linked resources, and the code. But as a Rust newbie I still don't have a great understanding of the system. How exactly do the feedback and observer modules communicate? Is there a simple modification I could make to one of the existing observers? Any resources you can recommend would be appreciated.

I will also post a question in the discord if you would rather continue there as that seems more appropriate.

@riesentoaster
Contributor

riesentoaster commented Feb 6, 2025

I'm assuming you're somehow passing the mutated input you receive in the InProcessExecutor to another process which actually calls your target?

Regarding Executors, Observers, and Feedbacks: The executor is responsible for running the target and updating the observers. Certain observers can be configured to be updated automatically, even without explicit executor interaction, e.g. when a target is instrumented so that coverage is written to a global variable/memory location, which is then read by e.g. a StdMapObserver. Other observers, such as StdOutObserver, need to be "filled" manually in the executor. Check out the implementation of CommandExecutor, and how it updates the observer after the subprocess is finished:

if let Some(h) = &mut self.configurer.stdout_observer() {
    let mut stdout = Vec::new();
    child.stdout.as_mut().ok_or_else(|| {
        Error::illegal_state(
            "Observer tries to read stderr, but stderr was not `Stdio::pipe` in CommandExecutor",
        )
    })?.read_to_end(&mut stdout)?;
    let mut observers = self.observers_mut();
    let obs = observers.index_mut(h);
    obs.observe(&stdout);
}

After the execution is finished, the fuzzer runs the feedbacks, which may request data from one (or multiple) of the observers and reduce this raw information down to a bool: should this input be added to the corpus? Or even the solutions? The output from different feedbacks can be combined with the feedback_<and/or>[_fast]! macros.
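
To make that reduction concrete, here is a minimal, std-only sketch of the idea. It does not use LibAFL's real traits: `Feedback`, `MaxHistory`, `NonZero`, and `EagerOr` are hypothetical stand-ins made up for illustration, and the observer data is modeled as a plain byte slice.

```rust
/// Hypothetical stand-in for a feedback: looks at observer data and
/// answers "is this input interesting?".
trait Feedback {
    fn is_interesting(&mut self, map: &[u8]) -> bool;
}

/// Remembers the highest value seen per map entry. An execution is novel
/// if any entry exceeds what has been seen so far.
struct MaxHistory {
    history: Vec<u8>,
}

impl MaxHistory {
    fn new(len: usize) -> Self {
        Self { history: vec![0; len] }
    }
}

impl Feedback for MaxHistory {
    fn is_interesting(&mut self, map: &[u8]) -> bool {
        let mut novel = false;
        for (seen, &now) in self.history.iter_mut().zip(map) {
            if now > *seen {
                *seen = now;
                novel = true;
            }
        }
        novel
    }
}

/// Stateless feedback: interesting whenever the map has any non-zero entry.
struct NonZero;

impl Feedback for NonZero {
    fn is_interesting(&mut self, map: &[u8]) -> bool {
        map.iter().any(|&b| b != 0)
    }
}

/// Logical OR of two feedbacks. Both sides are always evaluated so that
/// each one gets the chance to update its internal state.
struct EagerOr<A, B>(A, B);

impl<A: Feedback, B: Feedback> Feedback for EagerOr<A, B> {
    fn is_interesting(&mut self, map: &[u8]) -> bool {
        let a = self.0.is_interesting(map);
        let b = self.1.is_interesting(map);
        a || b
    }
}
```

In this model, `MaxHistory` answers `true` for a map it has not seen before and `false` when the same map comes back, which is the "should this input be added to the corpus?" decision in miniature. As far as I understand, the `_fast` macro variants differ from the plain ones by short-circuiting instead of evaluating both sides.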

So if you need custom handling of data from the execution back to LibAFL to determine whether an input should be added to the corpus, I would suggest implementing a custom executor (there is an example of a very simplistic implementation; you may want to call your other process in there, and then write your output to observers, e.g. a StdOutObserver). You can use a combination of provided and custom observers and feedbacks to store and process the information generated during an execution. Writing custom implementations of either is fairly straightforward for simple cases, although check first whether LibAFL already has something that does what you need!
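
Roughly, the shape of such a setup could look like the sketch below. This is a std-only model, not LibAFL code: `OutputObserver`, `ExternalExecutor`, and `execute_input` are hypothetical names, and the "other process" is represented by a closure so the example stays self-contained.

```rust
/// Hypothetical observer that stores the bytes the executor hands it.
#[derive(Default)]
struct OutputObserver {
    last: Option<Vec<u8>>,
}

impl OutputObserver {
    /// Forget the previous run's output.
    fn pre_exec(&mut self) {
        self.last = None;
    }

    /// Called by the executor once the run has produced output.
    fn observe(&mut self, data: &[u8]) {
        self.last = Some(data.to_vec());
    }
}

/// Hypothetical executor: forwards the input to an external runner
/// (a closure here, another process in practice) and fills the observer.
struct ExternalExecutor<F> {
    runner: F,
}

impl<F: FnMut(&[u8]) -> Vec<u8>> ExternalExecutor<F> {
    fn new(runner: F) -> Self {
        Self { runner }
    }

    fn run_target(&mut self, input: &[u8], observer: &mut OutputObserver) {
        let report = (self.runner)(input);
        observer.observe(&report);
    }
}

/// One fuzzing step: reset the observer, execute, then let a (very simple)
/// feedback decide whether the input is interesting.
fn execute_input<F: FnMut(&[u8]) -> Vec<u8>>(
    executor: &mut ExternalExecutor<F>,
    observer: &mut OutputObserver,
    input: &[u8],
) -> bool {
    observer.pre_exec();
    executor.run_target(input, observer);
    // Assumed report format: the runner prefixes its output with "new"
    // when it reached something it had not reached before.
    observer
        .last
        .as_deref()
        .map_or(false, |out| out.starts_with(b"new"))
}
```

The point of the split is that the executor only moves data into the observer; deciding what that data means is left to the feedback step.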

I hope this helps :)

@DarrionRamos
Author

Hi Riesen, thanks for taking the time to share some information. Yes, I am passing the mutated input to another process to handle the execution. I have been working on using shared memory with the StdMapObserver. I am using file-backed memory and opening a file descriptor in the fuzzer, but I am still working out how to properly set up the memory in Rust and link it to the observer. It seems that there is some custom AFL handling for shared memory. I will have some time tomorrow to look into it more and hopefully can get it working without a custom executor.

@riesentoaster
Contributor

Check out libafl_bolts::shmem for how "pure" LibAFL fuzzers use shared memory.

Alternatively (and maybe easier) you can just open the shared memory, and if you can access it as a slice, you can pass that to StdMapObserver.

@DarrionRamos
Author

DarrionRamos commented Feb 15, 2025

Thanks for the info. I had thought the LibAFL shmem was required, but I was able to get shared memory working for now using just memmap2's MmapMut. But I still have a couple of issues.

The first is that I cannot open the file as read-only. I am assuming this is because normally the executor directly modifies the mapped region. However, this also occurs with the custom executor example. Ideally, I would open the file read-only in the LibAFL process to avoid races. Is this possible?

I also am having trouble running an in-process executor fuzzer with just using the mem mapped region I created. I currently am using a HitcountsMapObserver with a MaxMapFeedback. But when running, LibAFL reports that the corpus is empty from a lack of instrumentation (but generating the initial corpus does not throw an error). I am not sure why this is happening as the fuzzer will work fine if I replace the mem map observer/feedback with a static SIGNALS map as seen in the examples and these do not seem very different to me.

I also verified that the observer has the proper contents of the mem map. Perhaps I am still not understanding how the executor is affecting things. But I have the same issue when running with the custom executor example that should not update the observer from my understanding.

Any feedback is appreciated, thanks.

@riesentoaster
Contributor

> The first is that I cannot open the file as read only. I am assuming this is because normally the executor directly modifies the mapped region. However, this also occurs with the custom executor example. Ideally, I open the file in read only in the libAFL process to avoid races. Is this possible?

Are you talking about the pseudo-file for mmap? LibAFL needs write access to that to reset the map between runs.

> I also am having trouble running an in-process executor fuzzer with just using the mem mapped region I created. I currently am using a HitcountsMapObserver with a MaxMapFeedback.

If I understand you correctly, you missed a step: I think you always™️ want a StdMapObserver (or a ConstMapObserver would work as well, I think) as a base for map observers, which can then be passed to a HitcountsMapObserver or whatever processing wrapper you want to put on top. These base observers are responsible for many things that aren't done by the wrappers, like resetting the map between runs.

Maybe @domenukk can confirm though, this is just my current understanding.

> But when running, LibAFL reports that the corpus is empty from a lack of instrumentation (but generating the initial corpus does not throw an error). I am not sure why this is happening as the fuzzer will work fine if I replace the mem map observer/feedback with a static SIGNALS map as seen in the examples and these do not seem very different to me.

This essentially means that your fuzzer doesn't add any of the inputs to your corpus, either because they're not deemed interesting at all, or because they're added to the solutions. Which of the two it is, I can't tell from your description.

> I also verified that the observer has the proper contents of the mem map. Perhaps I am still not understanding how the executor is affecting things. But I have the same issue when running with the custom executor example that should not update the observer from my understanding.

If you look at the definition of the Observer trait, there are two important functions: pre_exec and post_exec. They are called by StdFuzzer::execute_input and do pre- and postprocessing. For StdMapObserver, pre_exec resets the map. For HitcountsMapObserver, pre_exec calls pre_exec of the inner observer (which would then do its thing like resetting the map), and post_exec does the binning.
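
As a rough illustration of that delegation, here is a std-only model. `BaseMap` and `Hitcounts` are hypothetical stand-ins rather than the real LibAFL types, and the bucket boundaries are the classic AFL-style hit-count classes, which I am assuming is close to what the binning does.

```rust
/// Hypothetical base observer that owns the raw coverage map.
struct BaseMap {
    map: Vec<u8>,
}

impl BaseMap {
    /// The base observer clears the map before every execution.
    fn pre_exec(&mut self) {
        self.map.fill(0);
    }
}

/// Hypothetical wrapper that post-processes the inner observer's map.
struct Hitcounts {
    inner: BaseMap,
}

/// AFL-style hit-count classes: raw counts collapse into coarse buckets.
fn bucket(count: u8) -> u8 {
    match count {
        0 => 0,
        1 => 1,
        2 => 2,
        3 => 4,
        4..=7 => 8,
        8..=15 => 16,
        16..=31 => 32,
        32..=127 => 64,
        128..=255 => 128,
    }
}

impl Hitcounts {
    /// The wrapper has nothing of its own to reset, so it delegates.
    fn pre_exec(&mut self) {
        self.inner.pre_exec();
    }

    /// After the run, replace every raw count with its bucket.
    fn post_exec(&mut self) {
        for entry in self.inner.map.iter_mut() {
            *entry = bucket(*entry);
        }
    }
}
```

With this model, stale values in the map are gone after `pre_exec`, and raw counts such as 3, 5, and 200 become 4, 8, and 128 after `post_exec`.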

The data in observers can come from a few different sources:

  • From pre_exec and post_exec, like in TimeObserver
  • From the Executor directly (think StdOutObserver)
  • Through some other way, like coverage instrumentation in the target changing the shared memory map underlying a map observer

How you get your data into the observer doesn't matter, as long as it is there.

Well, at least as long as you only add the data between the calls to pre_exec and post_exec, if your observer does anything in those.
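
The timing caveat can be shown with a small std-only model. Everything here is hypothetical: `SharedMap` stands in for the mapped coverage region, `external_write` for whatever tool produces coverage outside the fuzzer, and `When` only exists to pick the moment of the write.

```rust
/// Stand-in for the coverage region shared with an external tool.
struct SharedMap(Vec<u8>);

impl SharedMap {
    /// What a base map observer does before every run: clear the map.
    fn pre_exec(&mut self) {
        self.0.fill(0);
    }

    /// The external coverage producer marking the edges it hit.
    fn external_write(&mut self, edges: &[usize]) {
        for &edge in edges {
            self.0[edge] = self.0[edge].saturating_add(1);
        }
    }

    /// Number of non-zero entries a feedback would get to see.
    fn covered(&self) -> usize {
        self.0.iter().filter(|&&b| b != 0).count()
    }
}

/// When the external tool writes relative to the observer's reset.
enum When {
    BeforeReset,
    DuringRun,
}

/// One execution cycle: optional early write, reset, optional in-run write,
/// then the coverage that is left for the feedback.
fn run_once(map: &mut SharedMap, edges: &[usize], when: When) -> usize {
    if let When::BeforeReset = when {
        map.external_write(edges);
    }
    map.pre_exec();
    if let When::DuringRun = when {
        map.external_write(edges);
    }
    map.covered()
}
```

Writing before the reset leaves nothing for the feedback to see, so every input looks uninteresting; the same write done after `pre_exec` shows up as coverage.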

Hope this helps. Lmk if you have more questions!

@DarrionRamos
Author

> Are you talking about the pseudo-file for mmap? LibAFL needs write access to that to reset the map between runs.

Ok that makes sense thanks.

> If I understand you correctly, you missed a step: I think you always™️ want a StdMapObserver (or a ConstMapObserver would work as well I think) as a base for map observers, which can then be passed to a HitcountsMapObserver or whatever processing wrapper you want to use as a base. These base observers are responsible for many things that aren't done by the wrappers, like resetting the map between runs.

Yes, I think you are correct. I do have a StdMapObserver as a base, I should have included that in my post.

> This essentially means that your fuzzer doesn't add any of the inputs to your corpus, either because they're not deemed interesting at all, or because they're added to the solutions. Which of those I am not sure from your description.

If they are added to the solutions, they would be found in the /crashes folder, correct? In that case this is not the problem, because they are not found there. The other case would be that they are not interesting, but if I keep everything the same in the fuzzer and just swap out the mmap observer for the SIGNALS map, it works. So the generated inputs and the feedbacks are not changing. I have also tried using only a StdMapObserver for the mmap.

I think there must be something that I am not doing correctly with the mmap. Perhaps I do need to use libafl_bolts::shmem? Here is how I am currently handling it:

pub fn main() {
    // Open the file-backed shared memory region for reading and writing
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(false)
        .open("/var/run/shm/s2e_coverage")
        .unwrap();
    let mut contents = Vec::new();
    file.read_to_end(&mut contents).unwrap();
    let mut mmap = unsafe { MmapMut::map_mut(&file).unwrap() };
    assert_eq!(&contents[..], &mmap[..]);

    let mut buf_slice = mmap.deref_mut();
    let metadata = fs::metadata("/var/run/shm/s2e_coverage").unwrap();

    // Create an observation channel using the memory-mapped region
    let observer = unsafe {
        HitcountsMapObserver::new(StdMapObserver::from_mut_ptr(
            "shared_mem",
            buf_slice.as_mut_ptr(),
            metadata.len().try_into().unwrap(),
        ))
        .track_indices()
    };
...

@riesentoaster
Contributor

riesentoaster commented Feb 15, 2025

> If they are added to the solutions they would be found in the /crashes folder correct?

Yes, if you are using an OnDiskCorpus or something similar for the solutions and pass it /crashes as its dir.

> The other case would be that they are not interesting, but if I keep everything the same in the fuzzer and just swap out the mmap observer for the SIGNALS map it works. So the generated inputs and the feedbacks are not changing. I have also tried using only a StdMapObserver for the mmap.

Not sure I understand. Are you talking about the SIGNALS static arrays in the example fuzzers? How would those get changed with your target/executor setup? In the examples, they are set manually from the target, while your target probably doesn't do that. You'd want to set up your coverage instrumentation to do this job: set (or count) certain values in the coverage map as they are executed.
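
For reference, the pattern in the examples looks roughly like the sketch below, where the harness itself marks how far it got. This is a simplified, std-only version: the real examples use a static array, while this hypothetical `harness` takes the signals map as a parameter to keep the snippet free of `unsafe`.

```rust
/// Hypothetical harness in the style of the example fuzzers: the target
/// itself records which branches it reached in the signals map.
fn harness(input: &[u8], signals: &mut [u8]) {
    // Reaching the harness at all counts as the first signal.
    signals[0] = 1;
    if input.first() == Some(&b'a') {
        signals[1] = 1;
        if input.get(1) == Some(&b'b') {
            signals[2] = 1;
        }
    }
}
```

If nothing in the executed path writes to the map like this, a map observer on top of it has nothing new to report, no matter how the map was created.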

Have you checked your coverage instrumentation? Is there actually some data in the shared memory region after a target execution?

@DarrionRamos
Author

Yes I was referring to the static SIGNALS array in the examples. I was using it for debugging purposes, but I got confused and forgot it was being set from the target. Knowing now that the mmap gets reset before each execution, my issue is probably that the data in the mmap is not synchronized properly to be present when it needs to be. I have just been using a dummy coverage generation program for testing. So I think my problem will be solved once I fully implement the coverage so that the data is input at the correct time. Thanks!

3 participants