DumpDB

DumpDB imports credential dumps into a database to improve search performance.

DumpDB creates two types of databases: one stores the breach sources, and the other stores the dumped records. There should be a single sources database, which records where each credential comes from (e.g. adobe2013 or collection1). There can be one or more record databases; these are indexed and searched separately.
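For instance, a deployment that has indexed the two dumps used in the examples below would hold three databases (names are illustrative):

    sources        the single sources database, mapping source IDs to dump names
    adobe2013      a records database, indexed and searched on its own
    collection1    a records database, indexed and searched on its own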

Installation

This project requires Go version 1.15 or later. You will also need access to a MariaDB (recommended) or MySQL server.

go get -u github.com/darkmattermatt/dumpdb

Example Usage

Initialise the databases

go run github.com/darkmattermatt/dumpdb init -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013,collection1

Import the dumped data

go run github.com/darkmattermatt/dumpdb import -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013 -p adobe /path/to/data.tar.gz /more/data.txt
go run github.com/darkmattermatt/dumpdb import -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d collection1 -p collections /path/to/data.tar.gz /more/data.txt

Search the indexed data

go run github.com/darkmattermatt/dumpdb search -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013,collection1 -Q "email LIKE '%@example.com' LIMIT 10"

General Info

Verbosity:

Output levels are as follows:

  0. FATAL: Only show fatal errors
  1. RESULT: Show errors and search results
  2. WARNING: Nonfatal errors (usually occurring in one of the query threads)
  3. INFO: The default level, provides minimal information at each step of the process
  4. VERBOSE: Tells you what's going on
  5. DEBUG: Spews out data

Global Parameters:

  • config='': Config file
  • v=3: Verbosity. Set this flag multiple times for more verbosity
  • q=0: Quiet. This is subtracted from the verbosity
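For example, assuming repeated -v flags stack as described above, a search at DEBUG verbosity might be invoked like this (hypothetical invocation):

go run github.com/darkmattermatt/dumpdb search -v -v -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013 -Q "email LIKE '%@example.com' LIMIT 10"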

Init

Initialise a database for importing.

Parameters:

  • databases+: One or more positional arguments of databases to initialise
  • databases="": Comma separated list of databases to initialise
  • conn=: Connection string for the MySQL server, e.g. user:pass@tcp(127.0.0.1:3306)
  • sourcesDatabase="": Name of the database to initialise as the sources store
  • engine="Aria": The database engine. Aria is recommended (requires MariaDB); MyISAM is supported for MySQL
  • indexes="email_rev": Comma separated list of columns to index in the main database. email_rev is strongly recommended, as it enables efficient searching by email domain (e.g. @example.com); see the sketch after this list
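The email_rev column name suggests the common reversed-string trick: storing each email backwards turns a domain search (email LIKE '%@example.com') into a prefix match, which can use a normal index. A minimal sketch of the idea in Go (illustrative only, not DumpDB's actual code):

    package main

    import "fmt"

    // reverse returns s with its bytes in reverse order. Credential
    // dumps are typically ASCII; general UTF-8 text would need a
    // rune-aware reverse instead.
    func reverse(s string) string {
        b := []byte(s)
        for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
            b[i], b[j] = b[j], b[i]
        }
        return string(b)
    }

    func main() {
        fmt.Println(reverse("user@example.com")) // moc.elpmaxe@resu
        // A search for "%@example.com" becomes
        // email_rev LIKE 'moc.elpmaxe@%', a prefix match that an
        // ordinary B-tree index can satisfy.
    }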

Process

Process files or folders into a regularised tab-delimited text file.

Parameters:

  • filesOrFolders+: One or more positional arguments of files and/or folders to import
  • parser=: The custom line parser to use. Modify the internal/parseline package to add another line parser (a hypothetical sketch follows this list)
  • batchSize=4e6: Number of lines per output file. 1e6 = ~64MB, 16e6 = ~1GB
  • filePrefix="[currentTime]_": Temporary processed file prefix
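The parser interface is not documented here; as a rough illustration, a parser for colon-delimited email:password lines might look something like the following. The Record type and function signature are assumptions, not the real internal/parseline API:

    package parseline

    import (
        "errors"
        "strings"
    )

    // Record is a hypothetical parsed line; the real package
    // presumably carries more fields (hashes, usernames, etc.).
    type Record struct {
        Email    string
        Password string
    }

    // ParseColonDelimited splits an "email:password" line.
    func ParseColonDelimited(line string) (Record, error) {
        parts := strings.SplitN(line, ":", 2)
        if len(parts) != 2 || !strings.Contains(parts[0], "@") {
            return Record{}, errors.New("unparsable line: " + line)
        }
        return Record{Email: parts[0], Password: parts[1]}, nil
    }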

File Processing

  • .tar.gz, .tgz: Decompress and open tarball, process each file
  • .txt, .csv: Create bufio.Scanner
  • bufio.Scanner: Process each line
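A condensed sketch of that pipeline using only the Go standard library (illustrative, not DumpDB's implementation):

    package main

    import (
        "archive/tar"
        "bufio"
        "compress/gzip"
        "fmt"
        "io"
        "log"
        "os"
        "strings"
    )

    func main() {
        f, err := os.Open("data.tar.gz")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        // Decompress the gzip stream, then walk the tarball.
        gz, err := gzip.NewReader(f)
        if err != nil {
            log.Fatal(err)
        }
        tr := tar.NewReader(gz)

        for {
            hdr, err := tr.Next()
            if err == io.EOF {
                break
            }
            if err != nil {
                log.Fatal(err)
            }
            // Skip entries without a whitelisted text extension.
            if !strings.HasSuffix(hdr.Name, ".txt") && !strings.HasSuffix(hdr.Name, ".csv") {
                continue
            }
            // Process the entry line by line with a bufio.Scanner.
            sc := bufio.NewScanner(tr)
            for sc.Scan() {
                fmt.Println(sc.Text()) // hand the line to the parser here
            }
            if err := sc.Err(); err != nil {
                log.Print(err)
            }
        }
    }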

Import

Import files or folders into a database.

Parameters:

  • filesOrFolders+: One or more positional arguments of files and/or folders to import
  • parser=: The custom line parser to use. Modify the internal/parseline package to add another line parser
  • conn=: Connection string for the MySQL server, e.g. user:pass@tcp(127.0.0.1:3306)
  • database=: Database name to import into
  • sourcesDatabase=: Database name to store sources in
  • compress=false: Pack the database into a compressed, read-only format. Requires the Aria or MyISAM database engine
  • batchSize=4e6: Number of results per temporary file (used for the LOAD DATA INFILE command; see the sketch after this list). 1e6 = ~64MB, 16e6 = ~1GB
  • filePrefix="[database]_": Temporary processed file prefix
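For reference, bulk-loading one of those tab-delimited batch files from Go might look roughly like this. This is a sketch using the go-sql-driver/mysql package; the table and column names are assumptions, not DumpDB's actual schema:

    package main

    import (
        "database/sql"
        "log"

        _ "github.com/go-sql-driver/mysql"
    )

    func main() {
        // LOCAL INFILE must be enabled on both client and server; with
        // go-sql-driver/mysql that means an allowAllFiles=true DSN
        // parameter or a call to mysql.RegisterLocalFile.
        db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/adobe2013?allowAllFiles=true")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // Hypothetical table and columns.
        _, err = db.Exec(`LOAD DATA LOCAL INFILE '/tmp/adobe2013_00001.txt'
            INTO TABLE records
            FIELDS TERMINATED BY '\t'
            LINES TERMINATED BY '\n'
            (email_rev, password, source_id)`)
        if err != nil {
            log.Fatal(err)
        }
    }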

Notes:

  • By default, only the mysql user is able to read/write to the database file directly. A workaround is to run go build . and then sudo -u mysql ./dumpdb import ...
  • Only files with whitelisted file extensions are processed (to avoid trying to import a binary file as a text file). Currently supported extensions are .tar.gz, .tgz, .txt, .csv.

Search

Search multiple dump databases simultaneously.

Parameters:

  • query="": The WHERE clause of a SQL query. Yes it's injected, so try not to break your own database
  • columns="all": Comma separated list of columns to retrieve from the database
  • conn=: Connection string for the MySQL server, e.g. user:pass@tcp(127.0.0.1:3306)
  • databases=: Comma separated list of databases to search
  • sourcesDatabase="": Database name used to resolve source IDs back to their names

Notes:

  • The query is injected into the SQL command, which means that any LIMIT clause is applied per database (illustrated in the sketch below)
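So searching N databases with LIMIT 10 can return up to 10×N rows in total. A rough illustration of why, assuming the WHERE clause is simply appended to one query per database and the results merged (not the actual implementation; the records table name is hypothetical):

    package main

    import (
        "database/sql"
        "fmt"
        "log"

        _ "github.com/go-sql-driver/mysql"
    )

    func main() {
        where := "email LIKE '%@example.com' LIMIT 10"
        for _, name := range []string{"adobe2013", "collection1"} {
            db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/"+name)
            if err != nil {
                log.Fatal(err)
            }
            // The user-supplied clause, LIMIT included, is appended to
            // each per-database query, so the cap applies per database.
            rows, err := db.Query("SELECT email, password FROM records WHERE " + where)
            if err != nil {
                log.Fatal(err)
            }
            for rows.Next() {
                var email, password string
                if err := rows.Scan(&email, &password); err != nil {
                    log.Fatal(err)
                }
                fmt.Println(name, email, password)
            }
            rows.Close()
            db.Close()
        }
    }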

External Libraries

This project makes use of several excellent open-source libraries.
