Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving Handling of Custom Inputs #2422

Merged
merged 37 commits into from
Sep 18, 2024

Conversation

riesentoaster
Copy link
Contributor

@riesentoaster riesentoaster commented Jul 17, 2024

I've finally gotten around to doing something about #2202. It's currently still very much work in progress.

Still missing:

  • Optional Input parts and crossover mutations
    • Ideally, I'd find a way to extract the mapping logic from the actual mutators, so that it can be reused in the crossover mutators as well
    • This would also possibly require a reversal of ownership of the mapping logic within mutators, as the logic would need to be called inside-out instead of outside-in
  • Possibly additional mappers, I'm open to ideas
  • (Improved) Documentation, including examples
  • Tests on all introduced parts
  • Not sure if this is possible, but it would be cool if the names of the mapping mutators included the names of the inner mutator — I've failed at wrangling the lifetime checker to a working implementation so far
  • Not sure if this is possible either, but mapping on tuple_list_type would drastically reduce code duplication

Feel free to leave any suggestions or questions!

@riesentoaster riesentoaster marked this pull request as ready for review August 7, 2024 13:45
@riesentoaster
Copy link
Contributor Author

Besides the optional improvements outlined above, I think this is done.

}
}

impl<S, M> Mutator<Vec<u8>, S> for MutVecMappingMutator<M>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of Vec<u8> this dude should operate on HasMutatorBytes, then it'll work on whatever input you pass in

Copy link
Contributor Author

@riesentoaster riesentoaster Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think HasMutatorBytes is implemented for Vec<u8>, this is the exact reason MutVecMappingMutator is even necessary.

This allows modelling byte array parts in a custom Input as Vec<u8>, without having to manually write a mapping layer to a input that implements HasMutatorBytes, such as MutVecInput. MutVecMappingMutator does nothing except for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not implemented for Vec but it's implemented for MutVecInput and MutVecInput, in turn, is implemented for &mut Vec<u8>, so I think you can easily do this without hard-coding a Vec<u8>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly what MutVecMappingMutator does. I just figured it'd be one more step every user of havoc_mutations_mapped would need to do themselves, and it'd be required basically always, and it can be centrally done, so I opted to include this logic in the library.


impl CustomInput {
/// Returns a mutable reference to the byte array
pub fn byte_array_mut(&mut self) -> &mut Vec<u8> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just return a MutVecInput instead, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments above.

Yes, it could. But it's more logic I can do centrally instead of having to do it at every custom input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it more logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the additional into call, mapping from &mut Vec<u8> to MutVecInput. And since this mapping needs to be done for every input part, I decided to do it centrally.

@domenukk
Copy link
Member

domenukk commented Aug 8, 2024

One other thing: It'd be pretty amazing to #[derive] mappings for each sub part of an input (maybe even recursiveley), so we suddenly end up with a cheap grammar fuzzer

@riesentoaster
Copy link
Contributor Author

One other thing: It'd be pretty amazing to #[derive] mappings for each sub part of an input (maybe even recursiveley), so we suddenly end up with a cheap grammar fuzzer

That's the end goal 😎.

One step at a time though.

@domenukk
Copy link
Member

domenukk commented Aug 9, 2024

@riesentoaster please take a look at c443dbe of #2479

This is how I would like it.
The custom input now needs to implement:

    pub fn mutate_byte_array<M: for<'a> Mutator<MutVecInput<'a>, S>, S>(
        &mut self,
        state: &mut S,
        mutator: &mut M,
    ) -> Result<MutationResult, Error> {
        let mut byte_array = MutVecInput::from(&mut self.byte_array);
        mutator.mutate(state, &mut byte_array)
    }

It's more code than returning a Vec, but this can easily be auto-generated with a macro.
And it will also work for any non-vec, non-bytes, and higher-level inputs.

Please take a look/comment :)

/edit: (Especially given the end goal to derive these, the extra code should not make any difference)

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Aug 10, 2024

I see — thank you for your suggestion!

I think I personally would prefer the simplicity of my initial version for most inputs. If you look at the file havoc_mutations.rs, you see that the most complicated thing about this are the types, the actual logic in this file is fairly straightforward. So it'd be easy to create another additional function that returns mapped havoc_mutations with your interface, if having this level of flexibility is actually needed regularly.

Alternatively, it's fairly straightforward to just create the mappings yourself in your fuzzer — that's the entire point of this architecture: you can plug your mappings together (including creating custom mappings for any custom type) as you like. An example of this is what I did with Option wrapped parts. So the flexibility is there regardless of the simplified interface we provide in havoc_mutations.rs.

Overall, I think it's just a question of taste. And I prefer the version where I don't need to expect that the users remember to call .mutate() manually to get it to work. 😄

@riesentoaster
Copy link
Contributor Author

(Still to do: comments/unit tests for mapping mutators)

// Creating mutators that will operate on input.byte_array
// For now, due to a limitation in lifetime management (see the MappedInput trait),
// the types have to be partially specified
let mapped_mutators: (MappedInputFunctionMappingMutator<_, _, MutVecInput<'_>>, _) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand why the extra type specifyer is necessary here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MappedInput maps the type to itself. The compiler can't deal with that.

I would be very happy about a better solution, but I could not find one, even after way too much time spent trying. Feel free to give it a shot though :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get around the extra type if we always reuturn a BytesSubInput?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BytesSubInput is significantly less general than WrapsInput, because it only works for byte array style inputs. It would probably also introduce a performance penalty (because of the extra logic e.g. compared to MutVecInput, which is just a plain wrapper).

@riesentoaster
Copy link
Contributor Author

Alright, cleanup done.

The failing CI tests seem unrelated to what changed here, not sure what's up with them.

@@ -14,7 +14,7 @@ use libafl::{
fuzzer::{Fuzzer, StdFuzzer},
generators::RandPrintablesGenerator,
inputs::{BytesInput, HasTargetBytes},
mutators::scheduled::{havoc_mutations, StdScheduledMutator},
mutators::{havoc_mutations::havoc_mutations, scheduled::StdScheduledMutator},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should re-export the og havoc_mutations to not break all code out there, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That's a fair point. For code we control, I'd much rather have the import from the source (mutators::scheduled has nothing to do with havoc_mutations anymore).

Unfortunately, deprecating re-exports is not supported yet (see issue). So I guess we're stuck with just re-exporting them and add a comment telling people to do the other thing and explaining why the re-export is there. Maybe link the issue as well.

Opinions?

where
S: HasCorpus + HasMaxSize + HasRand + UsesInput,
I: HasMutatorBytes,
F: for<'b> Fn(&'b S::Input) -> Option<&'b [u8]>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this F return a [u8] and not a WrappedInput?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because WrappedInput implementations generally cannot be used with immutable references. You only get an immutable reference from the input chosen from the corpus, so you can't create a MutVecInput or something similar.

@riesentoaster
Copy link
Contributor Author

Alright:

  • Mapping mutators's names now include the name of their inner mutators
  • Users no longer have to specify types manually
  • mapped_havoc_mutations no longer takes a function returning an Option — this was significantly harder than I thought, because of the same issue I struggled with for the mapping mutators: lifetimes with a mix of owned objects and borrows. I ended up implementing a similar trait to MappedInput that ensures the lifetimes are correct and that includes the actual logic (so wrapping the non-Option slices with Some).

Regarding the new import path for havoc_mutations(): As discussed in person, I haven't added a pub use, since this is not deprecatable.

Thank you for your support and ideas today, btw!

}

/// Returns an immutable reference to the byte array wrapped in [`Some`]
pub fn byte_array_optional<'a>(&'a self) -> &'a [u8] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is no longer correct

}

/// Returns an immutable reference to the optional byte array
pub fn optional_byte_array_optional<'a>(&'a self) -> Option<&'a [u8]> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably also drop this extra optional dude here

@domenukk
Copy link
Member

You can fix the names in a follow-up PR. Great work :)

@domenukk domenukk merged commit 2c676f0 into AFLplusplus:main Sep 18, 2024
101 checks passed
.merge(havoc_crossover_with_corpus_mapper(
&CustomInput::byte_array_optional,
))
.map(ToMappedInputFunctionMappingMutatorMapper::<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait why is this still with extra generics here?

@riesentoaster riesentoaster mentioned this pull request Sep 18, 2024
domenukk pushed a commit that referenced this pull request Sep 20, 2024
* code cleanup

* removing another unnecessary borrow

* cleaning up the cleanup
@riesentoaster riesentoaster mentioned this pull request Oct 3, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants