[QUESTION] Is it possible to use radamsa as a library? #28

Open
joxeankoret opened this issue Nov 24, 2016 · 23 comments

Comments

@joxeankoret

Hi!

I would like to embed radamsa in a few different places as a library instead of having to call a binary on the command line from my own fuzzers. Is there a (recommended) way of doing so?

Thanks in advance!

@aoh
Owner

aoh commented Nov 24, 2016

Hi,

You are probably thinking about a scenario where the linked radamsa would constantly get something to mutate? Supporting that kind of use is something I've been planning to add for a while now.

You can already start radamsa with a fixed set of samples to serve a stream of fuzzed data (radamsa -n inf -o 31337 samples/*), and embed a trivial wrapper which just grabs the next testcase from localhost:31337. The trouble is that you can't extend the sample set on the fly this way.
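
A minimal sketch of such a wrapper in Python. It assumes the radamsa instance started above sends exactly one fuzzed test case per TCP connection and then closes the connection; the function name `fetch_testcase` and the timeout are illustrative choices, not part of radamsa.

```python
import socket


def fetch_testcase(host="localhost", port=31337, timeout=5.0):
    """Fetch one fuzzed test case from a radamsa instance serving over TCP.

    Assumption: the server writes a single test case and then closes the
    connection, so reading until EOF yields exactly one output.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # empty read means the server closed the socket
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

A fuzzer loop would call `fetch_testcase()` once per iteration and feed the returned bytes to the target.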

@joxeankoret
Author

joxeankoret commented Nov 24, 2016

Yes, something similar. Right now I'm using my own port of this: https://github.com/trailofbits/grr/tree/master/third_party/radamsa

...and I have written a Python wrapper for it. I have also used the socket mode for other things, but it is not actually what I want.

@aoh
Owner

aoh commented Nov 25, 2016

Grr seems to use a fun approach. The only issue I have with it is that when you run radamsa once for each output, it doesn't get a chance to collect data about the inputs. Therefore some mutations which may be useful will never occur. There should be some way to either pass state between the runs, or run radamsa in a separate process and keep the state there, which is why the TCP mode was originally added.
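
For illustration, the one-run-per-output pattern can be sketched as a small Python wrapper around the radamsa binary. This is not Grr's code; `mutate_once` and its `cmd` parameter are hypothetical names, and the sketch assumes a `radamsa` executable on the PATH that reads its sample from standard input when no files are given. Every call starts a fresh process, so nothing learned from one input carries over to the next, which is exactly the limitation described above.

```python
import subprocess


def mutate_once(data, seed=None, cmd=("radamsa",)):
    """Run the mutator once on `data` (bytes) and return its output.

    A new process is started per call, so no state survives between calls.
    `cmd` is a hypothetical hook so another executable can be substituted.
    """
    argv = list(cmd)
    if seed is not None:
        # radamsa accepts a random seed on the command line
        argv += ["--seed", str(seed)]
    done = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
    return done.stdout
```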

A few solutions would come to mind:

  • add an incremental mode to radamsa, so that it can store information gathered so far between runs, after which something like the current wrapper would work better
  • extend the TCP mode to handle the kind of use required in a library setting, and maybe bundle functions along which start a background radamsa automatically and call it
  • the same, but with stdio redirection to background process
  • add support for librarization upstream to owl, and get libradamsa as a result

Is there a reason the background process approach doesn't work well in your test setups?

@joxeankoret
Author

The main reason is performance. There is no comparison between using an in-memory mutation engine on N machines (running independently on each one) and using network sockets. The other reason is that the same mutation engine often needs to use different data sources, which makes it necessary to open a listening socket (i.e., a listening radamsa instance) for each format I want to fuzz. An easy example: PDFs. A single PDF document uses a lot of different "formats" internally.

@aoh
Owner

aoh commented Nov 26, 2016

The last one is probably the correct solution. Owl should have some builtin support for building programs to be used as libraries in other C-programs. Then it would be possible to run radamsa incrementally without losing state from within one process.

@aoh
Owner

aoh commented Nov 30, 2016

Current plan: radamsa (and owl programs in general) work by decoding the program to run from a fasl image, encoding the command line parameters as a corresponding lisp object, running the program on that data, and returning the (likely integer) value the program returns. When used as a library, the programs could have a boot/init function to decode the image, and you could then have a lisp object -> lisp object library call for any compiled function, which automatically encodes and decodes the objects. This way the same heap state could be reused, and the library function being called would even remain a purely functional one with state.

In practice, you'd link a libradamsa and have something like radamsa(void *ptr, size_t s, &result, &result_size).
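
The idea of a purely functional call that still carries state can be illustrated with a toy in Python. Everything here is hypothetical: `State`, `init`, and `mutate` are invented names, and the token-insertion "mutation" is a stand-in, not radamsa's algorithm or the planned libradamsa interface. The point is only that state is passed in and returned explicitly, so later calls can draw on what earlier inputs contained.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    seed: int          # drives the next pseudo-random choice
    seen: tuple = ()   # tokens learned from all inputs so far


def init(seed):
    """Toy counterpart of a boot/init call: create the initial state."""
    return State(seed)


def mutate(state, data):
    """Return (new_state, output) without modifying `state`."""
    tokens = data.split()
    known = state.seen + tuple(t for t in tokens if t not in state.seen)
    rng = random.Random(state.seed)
    out = list(tokens)
    if known:
        # insert a token learned from this or any earlier input
        out.insert(rng.randrange(len(out) + 1), rng.choice(known))
    return State(rng.getrandbits(32), known), b" ".join(out)
```

Calling `mutate` repeatedly while threading the returned state through lets an output for one sample contain tokens first seen in another, which a fresh process per output cannot do.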

@joxeankoret
Author

joxeankoret commented Nov 30, 2016

That would be perfect! One thing: it would be great to be able to set the seed too. Something like void radamsa_init(unsigned long long seed);?

@aoh
Owner

aoh commented Dec 4, 2016

That would end up working without radamsa-specific modifications in the planned solution, because you'd initially boot up the embedded radamsa anyway with a fake command line, on which you can give the seed and other settings as usual.

@aoh
Owner

aoh commented Dec 21, 2016

owl-lisp/owl#15 is waiting for spare time. I got DoS'd by various kinds of extra work in December.

@joxeankoret
Author

I understand, don't worry :)

@aoh
Owner

aoh commented Feb 10, 2017

Oh, by the way, https://github.com/aoh/ni might be of interest here. It's a quick port of some radamsa-mutations to C, which should be easy to embed.

@joxeankoret
Author

Thanks! But it doesn't look comparable. It seems "ni" doesn't try to infer the grammar from the inputs.

@hacksysteam

@aoh Could you explain this:

"That would end up working without radamsa-specific modifications in the planned solution, because you'd initially boot up the embedded radamsa anyway with a fake command line, on which you can give the seed and other settings as usual."

If we run radamsa seed.pdf -o mutated.pdf each time for each seed, are we missing out on radamsa-specific mutations?

What about running radamsa -r seeds/*.pdf -o mutated.pdf each time we want a mutated sample? Are we still missing out on radamsa-specific mutations?

@aoh
Owner

aoh commented Jun 1, 2017

In both cases yes, though not by much. Some mutations are only possible if radamsa has had a chance to look at another file, or the same file from a different position. If you have sample files with '' and '', then the first output will never have something like '', because radamsa only learned about one of the attributes while generating the first fuzzed output.

As a workaround, if it's not easy to make sets of files at a time for testing, you can add --seek 2 especially in the latter case to your existing test scripts to allow more cross pollination between sample files. This won't be necessary after issue #24 is fixed, but you still should consider making sets of files at a time to allow radamsa to filter out duplicate testcases.

@hacksysteam

hacksysteam commented Jun 1, 2017

@aoh thank you very much for the explanation. In your opinion, what would be the best way to run radamsa when I have n seeds in a directory named seeds and want the full benefit of radamsa's mutation and generation? I need to be able to get a mutated sample from a single seed file, and also to use radamsa's recursive mode.

My current setup is:

  1. seeds in a seed folder

a. mutation using the seed file: each time, I run radamsa seed.pdf -o mutated.pdf to get a mutated sample
b. mutation using recursive mode: each time, I run radamsa -r ./seeds/*.pdf -o mutated.pdf to get a mutated sample

@aoh
Owner

aoh commented Jun 1, 2017

In both cases you could generate a bunch of files and serve them as they are needed. Something like

# make a directory for fuzzed files if necessary
mkdir -p fuzzed
# generate a new batch if no fuzzed files are left
ls fuzzed | grep -q radamsa || radamsa -n 1000 -o fuzzed/radamsa-%n.out seeds/*
# hand out the next file (ls prints bare names, so prefix the directory)
mv "fuzzed/$(ls fuzzed | head -n 1)" mutated.pdf
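
The same batch-and-serve idea can be sketched in Python for callers that want the bytes directly. The function name `next_testcase` and the `generate` hook are hypothetical; the default branch assumes a `radamsa` executable on the PATH and reuses the flags from the shell commands above.

```python
import os
import subprocess


def next_testcase(pool="fuzzed", seeds="seeds", count=1000, generate=None):
    """Return the bytes of the next pre-generated test case, refilling the
    pool with a fresh batch whenever it is empty."""
    os.makedirs(pool, exist_ok=True)
    names = sorted(os.listdir(pool))
    if not names:
        if generate is not None:
            generate(pool, count)  # hypothetical hook, e.g. for testing
        else:
            samples = sorted(os.path.join(seeds, n) for n in os.listdir(seeds))
            subprocess.run(
                ["radamsa", "-n", str(count),
                 "-o", os.path.join(pool, "radamsa-%n.out"), *samples],
                check=True,
            )
        names = sorted(os.listdir(pool))
    if not names:
        raise RuntimeError("generator produced no test cases")
    path = os.path.join(pool, names[0])
    with open(path, "rb") as fh:
        data = fh.read()
    os.remove(path)  # each test case is handed out only once
    return data
```

Generating a whole batch per radamsa run keeps the cross-file mutations discussed earlier in the thread, while the caller still receives one sample at a time.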

@aoh
Owner

aoh commented Aug 8, 2017

First tests of calling some compiled lisp code from C passed. The code looks roughly like what was planned above: https://github.com/aoh/owl-lisp/blob/develop/c/lib.c#L49

Next step is to add a suitable function to radamsa and try it out from C using some similar wrapper.

@illera88

Can't wait to have a working version. Do you have an estimated date to start testing it?

Thanks a lot!

@KnoooW

KnoooW commented Sep 29, 2017

need it as library too... hope to see this as soon as possible

@aoh
Owner

aoh commented Sep 20, 2019

In case this is still relevant, fix is mostly done at https://gitlab.com/akihe/radamsa/issues/28 . Next versions will likely have a libradamsa.c and radamsa.h.

@joxeankoret
Author

That's awesome! Thank you very much!

@hacksysteam

Fantastic!!

@0ca

0ca commented Sep 20, 2019

Thank you!
