Skip to content

R function type signatures inferring

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
Notifications You must be signed in to change notification settings

PRL-PRG/signatr

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

---
output: md_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# signatr

<!-- badges: start -->
<!-- badges: end -->

The goal of signatr is to infer functions signatures by fuzzing their inputs.

## Installation

You can install `signatr` with `devtools::install_github("PRL-PRG/signatr")` in R.
Apart from CRAN dependencies, it also requires the following ones from github. 
Some need some more preparation before using `devtools::install`:

- [sxpdb](https://github.com/PRL-PRG/sxpdb/): R value database
- [generatr](https://github.com/reallyTG/generatr): fuzzing utilities
- [contractr](https://github.com/PRL-PRG/contractr):type signature parsing and checking for R
- [argtracer](https://github.com/PRL-PRG/argtracer): trace R values and store them in the R value database. It requires a modified R interpreter called R-dyntrace. It can be installed from [here](https://github.com/PRL-PRG/R-dyntrace/tree/r-4.0.2) for version 4.0.2. Then add the resulting `R` binary in `bin` in your `PATH` and run the installation.

## Demo

The following is a short demonstration of the basic signatr functionnalities i.e., how to create the value database by running R code and then how to use it for fuzzing.

```{r}
library(signatr)
```

To generate a database of values, we need some code to run. One way to get it is to extract it from an existing R package, for example `stringr`:

```{r}
extract_package_code("stringr", output_dir = "demo")
```

This will extract all the runnable snippets from the package documentation and tests into the given directory. For example:

```{r}
cat(readLines("demo/examples/str_detect.Rd.R", n = 5))
```


Next, we trace the file by executing it and recording all the calls using the `trace_file` function:

```{r}
trace_file("demo/examples/str_detect.Rd.R", db_path = "demo.sxpdb")
```

The resulting database is stored as demo.sxpdb. In this example, after running the `str_detect.Rd.R`  file, the database contains 20 unique values. This can be repeated for all the other files. To trace multiple files in parallel, one database is created per file, which are merged using the merge_dbs function once they are all available. This also allows us to run larger experiments on multiple machines.

Once the database is ready, we can start fuzzing. The fuzzer has a number of configration points, but the easiest starting point is the `quick_fuzz` helper function:

```{r}
R <- quick_fuzz("stringr", "str_detect", "demo.sxpdb", budget = 100, action = "infer")
```

The infer_call_signature function infers types for each call argument and return value using the type annotation language, specifically, the `contractr` type inference tool provided by the work is used. `infer_call_signature` returns a data frame with details for each call. The data frame includes the inferred call signature in the `result` column:

```{r}
R
```

About

R function type signatures inferring

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •