This project is about ransomware detection using machine learning techniques.
In our experiments, the malware pcaps were taken from Malware-traffic-analysis.
- Python 2.7
- dpkt
- sklearn
We gathered 155 different ransomware samples from Feb. 2015 to Sep. 2016, and separated them into 7 main families:
- Cerber:9
- CrypMic:28
- Cryptowall:34
- CryptXXX:34
- Locky:16
- Teslacrypt:22
- Other:16
We use the three largest families for our experiments: CrypMic, Cryptowall, and CryptXXX.
1. Put your ransomware pcap files in malware\_pcap and your normal pcap files in normal\_pcap
2. Run start.sh to extract HTTP headers from the pcap files
Use pcap_Parser -p if you want to parse TCP payloads
We use PCA to reduce the dimensionality of the initial payloads.
python ./visual/pca.py [extracted_data]
This produces a pickle file, which you can then pass to
python ./visual/show.py [pickle_file]
This will show the structure of those payloads.
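As a rough illustration of the PCA step, here is a minimal sketch built on sklearn. It is not the code in ./visual/pca.py: the way payloads become fixed-length vectors (truncating or zero-padding to 64 bytes and using raw byte values as features), the function names, and the choice of 2 components are all assumptions made for this example.

```python
import pickle

import numpy as np
from sklearn.decomposition import PCA


def payload_to_vector(payload, length=64):
    """Map raw bytes to a fixed-length float vector (assumed encoding)."""
    clipped = bytearray(payload[:length])
    clipped.extend(b"\x00" * (length - len(clipped)))  # zero-pad short payloads
    return np.frombuffer(bytes(clipped), dtype=np.uint8).astype(np.float64)


def reduce_payloads(payloads, n_components=2, length=64):
    """Project payloads onto their first principal components."""
    X = np.vstack([payload_to_vector(p, length) for p in payloads])
    pca = PCA(n_components=n_components)
    return pca.fit_transform(X), pca


payloads = [
    b"GET /index.html HTTP/1.1",
    b"POST /gate.php HTTP/1.1",
    b"\x16\x03\x01\x02\x00",
    b"GET /a",
]
reduced, pca = reduce_payloads(payloads)

# Serialize the reduced data, analogous to the pickle file mentioned above.
blob = pickle.dumps(reduced)
```

With 2 components, each payload becomes a point in the plane, which is convenient for a scatter plot of the payload structure.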
After PCA as a pre-training step, the principal components can be fed into different ML models. As an example, we use the SVM in sklearn to classify the payloads.
python ./model/svm.py [pickle_file]
The 5-fold cross-validation results for CrypMic are
SVM (C=1, linear kernel)
fold-1: 0.82227159
fold-2: 0.83194444
fold-3: 0.815
fold-4: 0.82361111
fold-5: 0.82272548
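The evaluation setup above (linear SVM with C=1, 5-fold cross-validation) can be sketched with sklearn as follows. This is an illustration, not the code in ./model/svm.py: the data here is synthetic, the helper name `cross_validate_svm` is hypothetical, and the scores it prints are unrelated to the results listed above. `cross_val_score` reports accuracy by default for classifiers; whether the figures above use the same metric is not stated.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def cross_validate_svm(X, y, folds=5):
    """Return one score per fold for a linear SVM with C=1."""
    clf = SVC(C=1, kernel="linear")
    return cross_val_score(clf, X, y, cv=folds)


# Synthetic stand-in for PCA-reduced payloads: two Gaussian clusters.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 1.0, (50, 2)),  # class 0, e.g. normal traffic
    rng.normal(3.0, 1.0, (50, 2)),  # class 1, e.g. ransomware traffic
])
y = np.array([0] * 50 + [1] * 50)

scores = cross_validate_svm(X, y)
for i, score in enumerate(scores, 1):
    print("fold-%d: %.4f" % (i, score))
```

To run this on real data, replace `X` and `y` with the principal components and labels loaded from the pickle file.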
Our results were published at IEICE IA 2016.