Phishing URL Detection

The objective of this project is to analyse a web URL and classify it as either legitimate or phishing.

At its core, this is a binary classification problem. The project applies data mining and machine learning techniques to flag a given URL as phishing or legitimate.

Data:

Source: Mendeley Data
A CSV file containing 88,647 data points covering both legitimate and malicious websites.
Each URL is represented by 111 features, on the basis of which it is labeled as phishing or not.
The dataset is unbalanced: legitimate URLs outnumber malicious URLs by roughly two to one.
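
The class balance described above can be checked with a few lines of Python. This is a minimal sketch, not the project's own code: the label column name `phishing` and the feature column `n_dots` are assumptions made for illustration, and the in-memory sample stands in for the real CSV file.

```python
import csv
import io
from collections import Counter


def class_balance(csv_file, label_column="phishing"):
    """Count rows per class label in an open CSV file.

    `label_column` is an assumed name; adjust it to the actual dataset header.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Tiny made-up sample used in place of the real file (illustration only).
sample = io.StringIO("n_dots,phishing\n2,0\n5,1\n1,0\n")
counts = class_balance(sample)
total = sum(counts.values())
for label, n in sorted(counts.items()):
    print(f"label={label}: {n} rows ({n / total:.1%})")
```

For the real data, the same function can be called on a file opened with `open(path, newline="")`, where `path` points to the downloaded CSV.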

Problem Statement:

Phishing is a type of social engineering attack often used to steal user data, including login credentials and credit card numbers. It occurs when an attacker, masquerading as a trusted entity, dupes a victim into opening an email, instant message, or text message. These attacks are becoming increasingly common as online banking adoption grows and the internet becomes ubiquitous.

Potential Methods:

The approach applies data mining algorithms to identify phishing websites by inspecting their URLs, with the aim of limiting phishing attacks. Attackers typically modify the file path or subdomain, or introduce a deliberate misspelling, so that the URL resembles that of the original website. Identifying such websites requires parsing the URL and examining each of its components. Candidate classifiers are Logistic Regression, Naive Bayes, K-Nearest Neighbors (KNN), and XGBoost.
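
Splitting a URL into its elements can be sketched with Python's standard `urllib.parse` module. The feature names below are hypothetical and chosen for illustration; they are not the 111 features of the dataset, and the subdomain count is a rough estimate that treats every host label before the last two as a subdomain.

```python
from urllib.parse import urlparse


def url_features(url):
    """Split a URL into parts and derive a few simple lexical features."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    labels = host.split(".") if host else []
    return {
        "url_length": len(url),
        "n_dots_in_host": host.count("."),
        "n_hyphens_in_host": host.count("-"),
        # Rough estimate: labels before the last two count as subdomains.
        "n_subdomain_labels": max(len(labels) - 2, 0),
        # Number of non-empty segments in the path.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }


# Made-up URL that imitates a bank login page through extra subdomains.
features = url_features(
    "http://login.example-bank.secure.example.com/account/verify.php"
)
for name, value in features.items():
    print(f"{name}: {value}")
```

A dictionary like this, computed per URL, is the kind of numeric row that any of the classifiers listed above could take as input.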

Measurement of Success:

The project's goal is to find the most accurate data mining algorithm for identifying fraudulent websites that steal personal information from consumers.
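
Because the dataset is unbalanced, accuracy alone may hide missed phishing URLs, so it can help to look at precision and recall alongside it. The sketch below is an illustration under that assumption, not the project's evaluation code; the labels (1 for phishing, 0 for legitimate) and the example predictions are made up.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels with a 2:1 legitimate-to-phishing split.
y_true = [0, 0, 0, 0, 1, 1]
# A degenerate model that labels every URL as legitimate.
y_pred = [0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this made-up case the model reaches about 67% accuracy while catching no phishing URLs at all, which is why recall on the phishing class is worth reporting next to accuracy.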

Team Members:

Anujot Singh
Ashutosh Ojha
Ilisha Aggarwal
Subhadra Rangaswamy
