ioc-parser

IOC Parser is a tool that extracts indicators of compromise (IOCs) from security reports in PDF format. A good collection of APT-related reports containing many IOCs is APTNotes.

Usage

iocp.py [-h] [-p INI] [-i FORMAT] [-o FORMAT] [-d] [-l LIB] FILE

FILE File/directory path to report(s)
-p INI Pattern file
-i FORMAT Input format (pdf/txt/html)
-o FORMAT Output format (csv/json/yara)
-d Deduplicate matches
-l LIB Parsing library
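
To make the option list above concrete, here is a minimal argparse sketch that mirrors the usage line. It is not taken from iocp.py: the function name build_arg_parser and the dest names (INI, INPUT_FORMAT, OUTPUT_FORMAT, DEDUP, LIB) are assumptions made for illustration, and the real script may define defaults or accept values that are not shown here.

```python
import argparse


def build_arg_parser():
    """Hypothetical parser mirroring the documented usage line (not the real iocp.py)."""
    parser = argparse.ArgumentParser(prog="iocp.py")
    parser.add_argument("FILE", help="File/directory path to report(s)")
    parser.add_argument("-p", dest="INI", help="Pattern file")
    parser.add_argument("-i", dest="INPUT_FORMAT", metavar="FORMAT",
                        choices=["pdf", "txt", "html"], help="Input format")
    parser.add_argument("-o", dest="OUTPUT_FORMAT", metavar="FORMAT",
                        choices=["csv", "json", "yara"], help="Output format")
    parser.add_argument("-d", dest="DEDUP", action="store_true",
                        help="Deduplicate matches")
    parser.add_argument("-l", dest="LIB", help="Parsing library")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_arg_parser().parse_args(["-i", "pdf", "-o", "json", "-d", "report.pdf"])
print(args.FILE, args.INPUT_FORMAT, args.OUTPUT_FORMAT, args.DEDUP)
```

The equivalent command line would be iocp.py -i pdf -o json -d report.pdf.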

Usage as a package

Import IOC_Parser and create an iocp object with the 'data' output format. The 'data' output format lets you retrieve the parsed IOCs as a dict.

from ioc_parser.iocp import IOC_Parser
iocp = IOC_Parser(output_format='data')

To add a host to the whitelist after creating the iocp object, build a WhiteList from a dict and merge it into the existing one. The IOC_Parser constructor parses any whitelist_*.ini files supplied in the basedir, but this approach lets you add whitelist entries inline.

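from ioc_parser.whitelist import WhiteList  # assumed import path, mirroring ioc_parser.iocp

# Pattern matching any host that ends in "example.com"
whitelist_host_str = "{}$".format("example.com")
whitelist_dict = {"Host": whitelist_host_str}
wl = WhiteList(whitelist_dict=whitelist_dict)
iocp.whitelist.update(wl)  # merge the inline entries into the parser's whitelist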
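
The "{}$" format above produces a pattern anchored at the end of the host. The following sketch uses only the standard re module to show what such a pattern matches. It assumes whitelist entries are treated as regular expressions and searched against each candidate. The helper host_whitelist_pattern is hypothetical and the WhiteList class itself is not used here.

```python
import re


def host_whitelist_pattern(host, escape=True):
    """Build an end-anchored pattern string in the same "{}$" style as above."""
    body = re.escape(host) if escape else host
    return "{}$".format(body)


# Same pattern as in the whitelist example: the "." is a regex wildcard.
unescaped = re.compile(host_whitelist_pattern("example.com", escape=False))
# Escaped variant: the "." only matches a literal dot.
escaped = re.compile(host_whitelist_pattern("example.com"))

candidates = ("example.com", "www.example.com", "examplexcom", "example.com.evil.net")
for candidate in candidates:
    print(candidate, bool(unescaped.search(candidate)), bool(escaped.search(candidate)))
```

Under these assumptions, the unescaped pattern also matches a string such as "examplexcom", so escaping the host may be worth considering when an exact domain suffix is intended.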

Open a file and pass the file object and path to the parse_pdf_pdfminer method, which selects pdfminer as the PDF parser. Alternatively, specify the PDF parser in the IOC_Parser constructor and call parse_pdf here, or rely on the default PDF parser.

pdf_path = "./downloaded_files/report1.pdf"  # example path to a report

with open(pdf_path, "rb") as f:
    iocp.parse_pdf_pdfminer(f, pdf_path)

iocs = iocp.handler.get_iocs()  # Returns a dictionary of any IOCs found

get_iocs() returns a dictionary in the following format:

{
    "Email": {
        "file": "report1.pdf",
        "match": "domains@winmsn.com",
        "page": 4,
        "path": "./downloaded_files/report1.pdf",
        "type": "Email"
    },
    "IP": {
        "file": "report1.pdf",
        "match": "213.200.66.26",
        "page": 8,
        "path": "./downloaded_files/report1.pdf",
        "type": "IP"
    }
}
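
As a sketch of how the returned dictionary might be consumed, the snippet below collects the distinct "match" values per IOC type. The helper names iter_ioc_records and matches_by_type are hypothetical. The code accepts either a single record per type, as in the example above, or a list of records per type, since the exact shape for multiple matches is an assumption here.

```python
def iter_ioc_records(iocs):
    """Yield (ioc_type, record) pairs from a get_iocs()-style dict."""
    for ioc_type, value in iocs.items():
        # Accept one record (as in the example above) or a list of records.
        records = value if isinstance(value, list) else [value]
        for record in records:
            yield ioc_type, record


def matches_by_type(iocs):
    """Map each IOC type to a sorted list of its distinct matches."""
    grouped = {}
    for ioc_type, record in iter_ioc_records(iocs):
        grouped.setdefault(ioc_type, set()).add(record["match"])
    return {ioc_type: sorted(matches) for ioc_type, matches in grouped.items()}


# Sample data copied from the format shown above.
sample = {
    "Email": {
        "file": "report1.pdf",
        "match": "domains@winmsn.com",
        "page": 4,
        "path": "./downloaded_files/report1.pdf",
        "type": "Email",
    },
    "IP": {
        "file": "report1.pdf",
        "match": "213.200.66.26",
        "page": 8,
        "path": "./downloaded_files/report1.pdf",
        "type": "IP",
    },
}

print(matches_by_type(sample))
```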

Requirements

One of the following PDF parsing libraries:

PyPDF2 - pip install pypdf2
pdfminer - pip install pdfminer

For HTML parsing support:

BeautifulSoup - pip install beautifulsoup4

For HTTP(S) support:

requests - pip install requests
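
To check which of the optional libraries above are available in the current environment, a small sketch using the standard importlib module can help. The mapping OPTIONAL_MODULES and the function missing_dependencies are hypothetical names introduced here. The import names (PyPDF2, pdfminer, bs4, requests) are the modules that the listed pip packages install.

```python
import importlib.util

# Import name -> pip package name, taken from the requirements listed above.
OPTIONAL_MODULES = {
    "PyPDF2": "pypdf2",
    "pdfminer": "pdfminer",
    "bs4": "beautifulsoup4",
    "requests": "requests",
}


def missing_dependencies(modules=OPTIONAL_MODULES):
    """Return the subset of modules that cannot be found for import."""
    return {
        module_name: pip_name
        for module_name, pip_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    }


for module_name, pip_name in missing_dependencies().items():
    print("{} not found; install with: pip install {}".format(module_name, pip_name))
```

Only one of the two PDF libraries is required, so a missing PyPDF2 or pdfminer is not necessarily a problem as long as the other is present.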
