Notebooks exploring the Canadian Institute of Cybersecurity's IoT dataset.

Madhav-Malhotra/cicIoT

Exploring the CIC-IoT-2023 Dataset

The University of New Brunswick's Canadian Institute of Cybersecurity published the open-source CIC-IoT-2023 dataset. The dataset supports research on detecting 7 kinds of cyberattacks across 100 IoT devices.

This repository comes out of a 15-person team project at the University of Waterloo. It contains notebooks for sampling, preprocessing, visualising, and training models on the CIC-IoT-2023 dataset.

We've republished the dataset on Kaggle to make it easier to use.

Notebook descriptions

  1. downsampling.ipynb - This notebook samples 0.1%, 0.5%, 1%, 5%, and 10% of the rows from each cyberattack class in the dataset. This reduces the dataset size from 14 GB to 12-600 MB, making feature visualisation and feature selection easier. Kaggle. Blog.
  2. heatmaps.ipynb - This notebook investigates which of the dataset's 46 features matter most for training ML models. It notes some of the problems with simple correlational analysis and heatmaps. Kaggle. Blog.
  3. greyWolf.ipynb - This notebook finds useful features from the 46 total features in the dataset. It uses the Grey Wolf Optimiser to do this. Kaggle. Blog.
  4. geneticAlgorithm.ipynb - This notebook reduces the 46 features in the dataset to 20 useful features. It also visualises how the genetic algorithm works while doing this. Kaggle. Blog.
  5. unsupervisedClustering.ipynb - This notebook compares the negative selection algorithm, k-means clustering, and DBSCAN to generate signatures of benign network requests. Kaggle. Blog.
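
As a rough illustration of the per-class sampling idea behind downsampling.ipynb, here is a minimal sketch in plain Python. It is not the notebook's actual code: the function name `downsample_per_class`, the `label` column name, and the toy rows are all assumptions made for this example, and the real notebook may well use a dataframe library instead.

```python
import random
from collections import defaultdict


def downsample_per_class(rows, label_key, fraction, seed=0):
    """Keep roughly `fraction` of the rows from every class (at least one per class)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Group row indices by class label so each class is sampled independently.
    by_class = defaultdict(list)
    for index, row in enumerate(rows):
        by_class[row[label_key]].append(index)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    kept = []
    for indices in by_class.values():
        # Round to the nearest row count, but never drop a class entirely.
        n_keep = max(1, round(len(indices) * fraction))
        kept.extend(rng.sample(indices, n_keep))

    # Return the sampled rows in their original order.
    return [rows[i] for i in sorted(kept)]


# Toy rows standing in for the real CSV data (hypothetical values and column names).
rows = (
    [{"flow_duration": i, "label": "BenignTraffic"} for i in range(1000)]
    + [{"flow_duration": i, "label": "DDoS-ICMP_Flood"} for i in range(200)]
)

# The same sampling fractions listed for downsampling.ipynb.
for fraction in (0.001, 0.005, 0.01, 0.05, 0.1):
    sample = downsample_per_class(rows, "label", fraction)
    print(fraction, len(sample))
```

Sampling within each class, rather than across the whole table, preserves the class proportions and keeps rare attack classes from disappearing at the smallest fractions.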

Useful Links

  1. CIC-IoT-2023 Dataset on Kaggle
  2. Our team blog has more details about the notebooks
  3. The original paper describing the dataset
