This repository has been archived by the owner on Apr 26, 2019. It is now read-only.

Classification of sinks and sources in node.js API.


kefth/secloud-taint


TaintClassify

SeCloud project on classifying Node.js sinks and sources. It is based on the OWASP list of JavaScript vulnerabilities and inspired by the paper by Rasthofer et al., "A Machine-learning Approach for Classifying and Categorizing Android Sources and Sinks".

The classification app can be found in the secloudapp folder.

JSON data format

Data is extracted from the multiple files downloaded from Node.js, which are located in the json folder.

Currently only the 'textRaw' and 'params' fields are taken into account. They are aggregated in data.json.

The format is as follows:

{
    "cl": 0,
    "params": [
        "value",
        "message"
    ],
    "textRaw": "assert(value[, message])"
}

The "cl" field holds the class label. There are three classes in this dataset:

    neither:    0
    source:     1
    sink:       2

Examples whose class is not yet known are marked with:

cl: -1
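
As an illustration of the record format and class labels described above, here is a minimal Python sketch that decodes the "cl" field and separates annotated records from unknown ones. The helper names (`CLASS_NAMES`, `describe`, `split_labeled`) and the second sample record are assumptions made for this example; they are not taken from processJSON.py.

```python
import json

# Class labels as listed in this README; -1 marks a record with unknown class.
CLASS_NAMES = {0: "neither", 1: "source", 2: "sink", -1: "unknown"}


def describe(record):
    """Return (textRaw, class name) for one record in the format shown above."""
    cl = record["cl"]
    if cl not in CLASS_NAMES:
        raise ValueError(f"unexpected class label: {cl!r}")
    return record["textRaw"], CLASS_NAMES[cl]


def split_labeled(records):
    """Separate annotated records from those whose class is still unknown."""
    labeled = [r for r in records if r["cl"] != -1]
    unlabeled = [r for r in records if r["cl"] == -1]
    return labeled, unlabeled


# The first record is the example from this README; the second is hypothetical.
sample = json.loads("""
[
  {"cl": 0, "params": ["value", "message"], "textRaw": "assert(value[, message])"},
  {"cl": -1, "params": ["path"], "textRaw": "hypothetical.read(path)"}
]
""")

labeled, unlabeled = split_labeled(sample)
print(describe(labeled[0]))
print(len(labeled), len(unlabeled))
```

To run this against the real dataset, the inline sample could be replaced by the contents of data.json, assuming that file holds a list of such records.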

The Python file that handles parsing is processJSON.py.
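
The following is a rough sketch of how 'textRaw' and 'params' could be collected from a nested documentation file. It is not the code from processJSON.py. It assumes that each method entry is a JSON object with a "textRaw" string and a "signatures" list whose items carry "params" objects with a "name" key; that layout, the function name `collect_methods`, and the sample document are all assumptions for illustration.

```python
def collect_methods(node, out=None):
    """Recursively gather {'cl', 'params', 'textRaw'} records from nested JSON.

    Assumed layout: a method is any object with both 'textRaw' and
    'signatures'; parameter names live in signatures[*].params[*].name.
    """
    if out is None:
        out = []
    if isinstance(node, dict):
        if "textRaw" in node and "signatures" in node:
            names = [
                p["name"]
                for sig in node["signatures"]
                for p in sig.get("params", [])
                if "name" in p
            ]
            # Drop duplicates across signatures while keeping the order.
            params = list(dict.fromkeys(names))
            # New records start as unknown (-1) until annotated by hand.
            out.append({"cl": -1, "params": params, "textRaw": node["textRaw"]})
        for value in node.values():
            collect_methods(value, out)
    elif isinstance(node, list):
        for item in node:
            collect_methods(item, out)
    return out


# Hypothetical input shaped like the assumed layout.
doc = {
    "modules": [
        {
            "textRaw": "Assert",
            "methods": [
                {
                    "textRaw": "assert(value[, message])",
                    "signatures": [
                        {"params": [{"name": "value"}, {"name": "message"}]}
                    ],
                }
            ],
        }
    ]
}

records = collect_methods(doc)
print(records)
```

Each extracted record starts with cl set to -1, matching the unknown class described above, so that it can be labeled later.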

Features

For the handcrafted features used as input, see helperJSON.py.

Currently the features are binary (is a given feature present or not) and are extracted from method names. They are based on the OWASP list of JavaScript vulnerabilities; e.g. get in a method name usually indicates a source of information. There are 15 such features.
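
A minimal sketch of binary features extracted from a method name might look like the following. The keyword list is hypothetical: only "get" is mentioned in this README, and the other entries stand in for the actual 15 features, which are defined in helperJSON.py and not reproduced here. The function names are likewise assumptions.

```python
# Hypothetical keyword list; only "get" is named in this README.
KEYWORDS = ["get", "read", "write", "send", "exec"]


def method_name(text_raw):
    """Strip the parameter list: 'http.get(options[, callback])' -> 'http.get'."""
    return text_raw.split("(", 1)[0].strip()


def binary_features(text_raw):
    """Return one 0/1 flag per keyword: 1 if the keyword occurs in the name."""
    name = method_name(text_raw).lower()
    return [1 if keyword in name else 0 for keyword in KEYWORDS]


print(binary_features("http.get(options[, callback])"))
print(binary_features("assert(value[, message])"))
```

Plain substring matching is used here for brevity; it would also fire on names such as "target", so a real feature extractor may want to split names on dots and camelCase boundaries first.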

Issues

  • The dataset is small, with only 265 hand-annotated examples.
  • The handcrafted features do not cover all possible cases of a source or a sink in Node.js, so some information that would be valuable for classification is missing.
