GitHub - dongchirua/GASubgraph: using GA to find an important subgraph

GAVulExplainer

A genetic algorithm for finding an important subgraph

@article{GAVulExplainer,
title = {Graph-based explainable vulnerability prediction},
journal = {Information and Software Technology},
pages = {107566},
year = {2024},
issn = {0950-5849},
doi = {https://doi.org/10.1016/j.infsof.2024.107566},
url = {https://www.sciencedirect.com/science/article/pii/S095058492400171X},
author = {Hong Quy Nguyen and Thong Hoang and Hoa Khanh Dam and Aditya Ghose},
keywords = {Graph neural network, Explanation, Vulnerability}
}

Get started (Python version >= 3.9)

1a. (CPU) Create the environment with `conda env create -f binder/environment.yml`
1b. (GPU) Create the environment with `conda env create -f binder/environment-cu11.3.yml`
2. Activate the environment with `conda activate ga_subgraph`
3. Download the Reveal dataset: https://bit.ly/3bX30ai
4. Parse the data with Joern. We followed and used the Joern setup provided with the Reveal paper: https://github.com/VulDetProject/ReVeal/blob/master/code-slicer/joern/README.md

If you prefer to install the dependencies yourself, these are the major libraries we used:

1. PyTorch
2. PyTorch Geometric
3. PyTorch Lightning
4. networkx
5. DEAP
6. nltk
7. gensim
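After a manual install, a quick check that every library can be located may save some debugging. The sketch below is only an illustration: the import names (`torch_geometric`, `pytorch_lightning`, `deap`, and so on) are the usual top-level module names for the libraries listed above and are assumed here, not taken from this repository.

```python
from importlib.util import find_spec

# Assumed top-level import names for the libraries listed above.
REQUIRED_MODULES = [
    "torch",
    "torch_geometric",
    "pytorch_lightning",
    "networkx",
    "deap",
    "nltk",
    "gensim",
]


def find_missing(modules):
    """Return the top-level modules that cannot be located for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

This only confirms that the modules are discoverable; it does not verify that the installed versions match the ones locked in the binder folder.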

How to use GAVulExplainer

To run our prepared example (example.py), download data.zip from https://drive.google.com/file/d/1eQBfx3OAOZLJrmX2wby5S_Z_HiWW0BT9/view?usp=sharing, then unzip data.zip and weights.zip at the project level.

To use GAVulExplainer for other tasks, follow the instructions below.

from ga_subgraph.explainer import GASubX
from ga_subgraph.fitness import classifier
from ga_subgraph.individual import Individual

k_node = 5  # explanation size
# saved_model is the trained PyTorch model; device is cuda or cpu
# foo_sample is a PyTorch Geometric Data object
# Further GASubX parameters are documented below.
ga_explainer = GASubX(saved_model, classifier, device, Individual)
ga_subgraph, _ = ga_explainer.explain(foo_sample, k_node, verbose=False)

Documentation of GASubX

:param blackbox: PyTorch model
:param classifier: function that gets the probability from the model, for example `ga_subgraph.fitness.classifier`
:param device: cuda or cpu
:param IndividualCls: class that stores the individual representation
:param n_gen: number of generations to run
:param CXPB: crossover probability
:param MUTPB: mutation probability
:param tournsize: tournament size that controls the selection function
:param subgraph_building_method: function that constructs the subgraph
:param max_population: maximum number of individuals in each generation
:param offspring_population: number of offspring individuals
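To show how these parameters interact, here is a minimal, self-contained sketch of a genetic search over k-node subsets. It is not the GASubX implementation: it uses only the standard library instead of DEAP, the function name `evolve_subgraph` and the swap-based operators are hypothetical, the fitness is a toy function rather than a model probability, and it does not enforce that the chosen nodes form a connected subgraph.

```python
import random


def evolve_subgraph(nodes, k, fitness, n_gen=20, cxpb=0.5, mutpb=0.2,
                    tournsize=3, max_population=30, offspring_population=30,
                    seed=0):
    """Toy GA: search for a k-node subset that maximises `fitness`."""
    rng = random.Random(seed)
    nodes = list(nodes)

    def tournament(pop):
        # Pick `tournsize` random individuals and keep the fittest one.
        return max(rng.sample(pop, min(tournsize, len(pop))), key=fitness)

    def crossover(a, b):
        # The child draws k nodes from the union of both parents.
        return frozenset(rng.sample(sorted(a | b), k))

    def mutate(ind):
        # Swap one selected node for one currently unselected node.
        outside = [n for n in nodes if n not in ind]
        if not outside:
            return ind
        drop = rng.choice(sorted(ind))
        return frozenset((ind - {drop}) | {rng.choice(outside)})

    population = [frozenset(rng.sample(nodes, k)) for _ in range(max_population)]
    for _ in range(n_gen):
        offspring = []
        for _ in range(offspring_population):
            child = tournament(population)
            if rng.random() < cxpb:
                child = crossover(child, tournament(population))
            if rng.random() < mutpb:
                child = mutate(child)
            offspring.append(child)
        # Keep the best distinct individuals among parents and offspring.
        merged = set(population) | set(offspring)
        population = sorted(merged, key=fitness, reverse=True)[:max_population]
    return max(population, key=fitness)


if __name__ == "__main__":
    # Toy fitness: each node has a weight, and heavier subsets score higher.
    weights = {n: n for n in range(10)}
    best = evolve_subgraph(range(10), k=3,
                           fitness=lambda ind: sum(weights[n] for n in ind))
    print(sorted(best))
```

In GAVulExplainer the fitness would presumably be derived from the classifier's output on the candidate subgraph, which is far more expensive to evaluate than this toy score; that cost is one reason the population sizes and number of generations are exposed as parameters.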

Reproduce our experiments (vary explanation size)

Unzip data.zip and weights.zip.
Run `python do_statistic.py 4 cuda`, where 4 is the explanation size and cuda is the device.
The script uses multiple processors to compute explanations in parallel.
Configure the dataset at line 106 of do_statistic.py.
Configure the pretrained model at line 52 of do_statistic.py.
We share raw results for undirected graphs in the statistics folder and for directed graphs in statistics_undirected.
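Varying the explanation size means repeating the command above with a different first argument. The helper below is a hypothetical convenience wrapper, not part of the repository: it assumes the script file is do_statistic.py in the current directory and that it takes the explanation size and device as its two positional arguments, as in the command above. The sizes 5 and 6 are illustrative values only.

```python
import subprocess
import sys


def build_commands(sizes, device="cuda", script="do_statistic.py"):
    """Build one command line per explanation size."""
    return [[sys.executable, script, str(k), device] for k in sizes]


def run_all(sizes, device="cuda", dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(sizes, device):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep as soon as one run fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all([4, 5, 6], device="cuda", dry_run=True)
```

The runs are launched one after another because the script already parallelises the work internally; set `dry_run=False` once the dataset and pretrained model are configured.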

Project structure

weights: stores the pretrained classifier
data: stores the data and the word2vec model
binder: locked library versions for this project
ga_subgraph: our implementation of GAVulExplainer
visualization: helpers for visualization
vulexp: helpers for training the vulnerability predictor, data processing, and SubgraphX, demonstrated in example.py
vulexp/reveal_data.py: class that handles the Reveal dataset

