yunyueye/honeypot

honeypot

An algorithm for detecting ICS honeypots.

The dataset (data_no_ip.csv) covers honeypots for three common ICS protocols: Modbus, ATG, and S7.

The processed data is located in the 'data' folder, which contains data for the three protocols. It can be used directly to run the proposal_algorithm_1.py script. (These files can also be regenerated by running process_*.py.)

This program must be run in a Python 3 environment and requires the numpy and pandas libraries.

example:
"""
feature_data = np.loadtxt("data/numpy_array_data_atg")
label_data = np.loadtxt("data/numpy_label_data_atg")
"""
The above two lines of code load the data for the ATG honeypot.

The final test results are stored in the variable 'test_recognition':
    The i-th element in the list: the sample that was identified as a honeypot in the i-th iteration.
    Format of the i-th element: (index of sample, p(y=1|X), iteration epoch)
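A minimal sketch of reading 'test_recognition', assuming each element is a flat 3-tuple in the order given above. The values in example_recognition are hypothetical and only illustrate the shape; the helper name confident_samples and the threshold parameter are assumptions for illustration and are not part of proposal_algorithm_1.py.

```python
def confident_samples(test_recognition, threshold=0.9):
    """Return (sample index, epoch) pairs whose p(y=1|X) is at least `threshold`.

    Assumes each element is (index of sample, p(y=1|X), iteration epoch).
    """
    selected = []
    for sample_index, p_honeypot, epoch in test_recognition:
        if p_honeypot >= threshold:
            selected.append((int(sample_index), int(epoch)))
    return selected


# Hypothetical entries, for illustration only (not real results).
example_recognition = [(12, 0.97, 1), (40, 0.88, 2), (3, 0.93, 3)]
picked = confident_samples(example_recognition, threshold=0.9)
print(picked)
```

In a real run, pass the 'test_recognition' variable produced by the script instead of example_recognition.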
    
"""
feature_data = np.loadtxt("data/numpy_array_data_xx")
label_data = np.loadtxt("data/numpy_label_data_xx")
"""
Replace xx with the suffix of another protocol to load the data for the other ICS honeypots.
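The loading pattern above can be wrapped in a small helper. This is a sketch, not code from the repository: the function name load_protocol_data and its data_dir parameter are assumptions, and only the "atg" suffix is confirmed by this README.

```python
import os

import numpy as np


def load_protocol_data(protocol, data_dir="data"):
    """Load the feature matrix and label vector for one protocol suffix."""
    feature_path = os.path.join(data_dir, "numpy_array_data_" + protocol)
    label_path = os.path.join(data_dir, "numpy_label_data_" + protocol)
    # ndmin keeps the shapes predictable even for a single row or single label.
    feature_data = np.loadtxt(feature_path, ndmin=2)
    label_data = np.loadtxt(label_path, ndmin=1)
    # Sanity check: there should be one label per sample row.
    if feature_data.shape[0] != label_data.shape[0]:
        raise ValueError("feature and label row counts differ")
    return feature_data, label_data
```

For example, load_protocol_data("atg") reads the same two files as the ATG example, provided the script is started from the repository root.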

process_*.py: After running one of these files, you will get two files: numpy_array_data_* and numpy_label_data_*. Put these two files into the data folder and then you can run proposal_algorithm_1.py. (All the processed files are already in the "data" folder.)


selectKfeature.py: You can run selectKfeature.py to test whether your Python environment contains the required packages.
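As a quicker check that does not depend on the repository, the sketch below tries to import the two libraries named in this README. It is not what selectKfeature.py does; the helper name missing_packages is an assumption, and selectKfeature.py may need packages beyond these two.

```python
import importlib

# Only the libraries this README names as requirements.
REQUIRED_PACKAGES = ("numpy", "pandas")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names in `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_packages() returns an empty list when both libraries are installed.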


Explanation of part of the code in proposal_algorithm_1.py:

# Process the test dataset
test_output = [[] for i in range(len(selected_feature_index))]
for i in range(len(selected_feature_index)):   # Iterate over the optimal feature list
    optimal_i = selected_feature_index[0:i+1]  # Take the first i+1 features (i is zero-based)
    feature_data = test_feature_array[:, optimal_i]  # Fetch the test data for these features
    feature_size = feature_data.shape
    for sample_i in range(feature_size[0]):    # Index of the sample whose probability is calculated
        f_sample = list(feature_data[sample_i])   # Feature values of the current sample
        for f_i in range(len(poster_p_list[i])):  # Look up the precomputed posterior probability table
            if f_sample == poster_p_list[i][f_i][0]: # Entry for the current feature combination found
                f_p_0 = poster_p_list[i][f_i][1][0]  # p(y=0|(f1, f2, ..,fi))
                f_p_1 = poster_p_list[i][f_i][1][1]  # p(y=1|(f1, f2, ..,fi))
                break
        # Use Bayes' formula to calculate the probability p(y=1) for the sample
        sample_p_1 = prior_p_1 * f_p_1 / (prior_p_1 * f_p_1 + prior_p_0 * f_p_0)
        test_output[i].append(sample_p_1)  # Save the result
        """test_output: The i-th sublist of the result (counting from 1) holds the probability
        of each sample being a honeypot in the i-th calculation;
        The i-th calculation of probability uses the first i features (f1, f2, ..,fi)"""
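The table lookup and the combination step can also be sketched as two small functions. This is an illustration under stated assumptions, not the code in proposal_algorithm_1.py: the function names and the toy table are hypothetical, and the table is assumed to have the same layout as poster_p_list[i], that is, a list of [feature values, [p0, p1]] entries. Unlike the excerpt, where a sample whose feature combination is missing from the table keeps the f_p_0 and f_p_1 values of the previously matched sample, this sketch returns None for an unseen combination.

```python
def lookup_posterior(table, f_sample):
    """Return (p0, p1) for a feature combination, or None if it is not in the table.

    `table` is assumed to be a list of [feature_values, [p0, p1]] entries.
    """
    lookup = {tuple(values): tuple(probs) for values, probs in table}
    return lookup.get(tuple(f_sample))


def honeypot_probability(prior_p_0, prior_p_1, f_p_0, f_p_1):
    """Combine the priors and table values with the same formula as the excerpt."""
    denominator = prior_p_1 * f_p_1 + prior_p_0 * f_p_0
    if denominator == 0:
        return None  # the formula is undefined when both weighted terms are zero
    return prior_p_1 * f_p_1 / denominator


# Hypothetical toy table for a two-feature combination (not real data).
toy_table = [
    [[0.0, 1.0], [0.8, 0.2]],
    [[1.0, 1.0], [0.1, 0.9]],
]

entry = lookup_posterior(toy_table, [1.0, 1.0])
if entry is not None:
    p0, p1 = entry
    print(honeypot_probability(0.5, 0.5, p0, p1))
```

Building the dictionary once per feature prefix, rather than once per sample, would avoid rescanning the table for every sample.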
