Harvest Linux forensic data for operational triage of an event.
This tool produces a considerable amount of JSON logs.
If you just want to run it, download the "lin_fh" binary.
This tool's output is meant to be used by forensic practitioners to investigate suspicious events on live Linux systems.
GNOME Autostart Locations:
- ~/.config/autostart
KDE Autostart Locations:
- $HOME/.kde/Autostart
- $HOME/.config/autostart
- $HOME/.config/plasma-workspace/env
- $HOME/.config/plasma-workspace/shutdown
Misc. Autostart Locations:
- /etc/xdg/autostart
- /var/spool/cron
Services:
- /etc/init.d
- /etc/systemd
- $HOME/.config/systemd/user
Udev rules:
- /usr/lib/udev/rules.d
- /usr/local/lib/udev/rules.d
User cron jobs:
- /var/spool/cron/crontabs
Linux Forensic Harvester
Author: Brian Kellogg
License: MIT
Disclaimer:
This tool comes with no warranty or support.
If you choose to use it, you accept all responsibility and liability.
Must be run as root.
Usage:
lin_fh [options]
lin_fh -fksl
lin_fh [--ip <ip> --port <port>] [--depth <depth>]
lin_fh [--ip <ip> --port <port>] [--limit]
lin_fh [-i <ip> -p <port>] [--suidsgid] [--limit]
lin_fh (-s | --suidsgid) [--limit]
lin_fh (-r <regex> | --regex <regex>) [-ls] [-d <depth>]
lin_fh --max <bytes> [--limit] [-d <depth>]
lin_fh (-l | --limit)
lin_fh --start <start_time> [-d <depth>]
lin_fh --end <end_time> [-d <depth>] [-ls]
lin_fh --start <start_time> --end <end_time>
lin_fh -s [-d <depth>]
lin_fh [-sl] (-x <hex> | --hex <hex>)
lin_fh (-h | --help)
Options:
-d, --depth <depth> Max directory depth to traverse [default: 5]
-f, --forensics Gather general forensic info
-h, --help Print help
-l, --limit Limit CPU use
-k, --rootkit Run rootkit hunts
-m, --max <bytes> Max size (in bytes) of a text file whose content will be
inspected for interesting strings [default: 100000]
- Text files will always be searched for references
to other files.
Remote logging:
-i, --ip <ip> IP address to send output to [default: NONE]
-p, --port <port> Destination port to send output to [default: 80]
Time window:
This option will compare the specified date window to the file's
ctime, atime, or mtime and only output logs where one of the dates falls
within that window. Window start is inclusive, window end is exclusive.
--start <UTC_start_time> Start of time window: [default: 0000-01-01T00:00:00]
- format: YYYY-MM-DDTHH:MM:SS
--end <UTC_end_time> End of time window: [default: 9999-12-31T23:59:59]
- format: YYYY-MM-DDTHH:MM:SS
Custom hunts:
-r, --regex <regex> Custom regex [default: $^]
- Search file content using custom regex
- Does not support look-aheads/look-behinds/...
- Uses Rust regex crate (case insensitive and multiline)
- Tag: RegexHunt
-s, --suidsgid Search for suid and sgid files
- This will search the entire '/' including subdirectories
- Can take a long time
- /dev/, /mnt/, /proc/, /sys/ directories are ignored
-x, --hex <hex> Hex search string [default: FF]
- Hex string length must be a multiple of two
- format: 0a1b2c3d4e5f
- Tag: HexHunt
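For example, some illustrative invocations (all values here are made up):
lin_fh -fkl
lin_fh -r 'password\s*=' -d 3
lin_fh -x 4d5a
lin_fh --start 2024-01-01T00:00:00 --end 2024-02-01T00:00:00
lin_fh -i 192.168.0.10 -p 8443 -l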
Note:
Must be run as root.
A log with a data_type of 'Rootkit' will be generated if the size of a file read into
memory is less than its size reported on disk. This is a simple method of identifying
a possible rootkit.
- See: https://github.com/sandflysecurity/sandfly-file-decloak
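A minimal sketch of that check in Rust (an illustration of the idea, not the tool's actual implementation):

use std::fs;

// If fewer bytes come back from a full read than the size stat reports,
// something may be intercepting reads (see the decloaking link above).
fn possibly_decloaked(path: &str) -> std::io::Result<bool> {
    let on_disk = fs::metadata(path)?.len() as usize;
    let in_memory = fs::read(path)?.len();
    Ok(in_memory < on_disk)
}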
To capture network output, start a netcat listener on your port of choice.
Use netcat's -k option so the listener stays open after a TCP connection closes.
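For example, a minimal setup (port and filename are illustrative; flag syntax varies between netcat implementations, some need -p before the port):
nc -k -l 8443 > harvest.json
Then, on the target host:
sudo lin_fh -i <collector_ip> -p 8443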
Files larger than 256MB will not be hashed.
Files larger than '--max' will not be inspected for interesting strings.
To build a statically linked binary with the musl toolchain (if the project does not already default to the musl target, pass --target x86_64-unknown-linux-musl to cargo build):
sudo apt install musl-tools
rustup target add x86_64-unknown-linux-musl
cargo build --release
To do:
- Further procfs parsing
- Expand on interesting strings to capture in "FileContent" data_type
- Add static examination of binaries, including interesting strings
- Add other persistence mechanisms
- Report on local user (/etc/passwd) and group (/etc/group) membership
- Identification of "interesting" log entries
- Output via network comms
- Web shell detection
- Shell histories
- Setuid / setgid
- Traps
- Document parent and child data type relation
- Add more interesting strings / commands to search for in file contents specific to Linux
- ...
Output is in JSON for import into ELK or any other JSON indexer. I may add other log formats.
No configuration files are currently included. Everything is compiled in to achieve easier remote use of the tool. Just copy the file to the host and run it. Pipe / redirect the output with standard Linux tools, or send it across the network with the --ip and --port options.
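For example (illustrative; assumes jq is installed, and uses the data_type field described below):
sudo ./lin_fh -f > harvest.json
sudo ./lin_fh -fk | jq -c 'select(.data_type == "Rootkit")'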
parent_data_type
- If a log was generated due to something found in another log, this field holds the data_type
of the parent log that caused this log to be generated (e.g. a file path was found in a file's
content, so the tool gathered metadata on the file referenced in the first file's content).
data_type
- The source of telemetry the log is reporting on.
tags
- Tags are added to this array field when something interesting is found by a built-in hunt.
Anything of interest (e.g. rootkit indicators or interesting strings/content) will be noted
in the tags field.
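As an illustration only (the three fields above are real; the path field and all values here are invented for the example), a log for a file harvested because its path appeared in another file's content might look like:

{
  "parent_data_type": "FileContent",
  "data_type": "File",
  "path": "/tmp/.hidden/payload",
  "tags": ["FilePath"]
}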
Information gathered on:
- Cron jobs
  - Data type: Cron
- Drive mounts
  - Data type: MountPoint
- Groups
  - Data type: LocalGroup
- Interesting File Content
  - Encoded strings
    - Tags: Encoding, Base64, Obfuscation
  - File referenced in a file's content
    - Tag: FilePath
    - Added when a file's forensic data was harvested because it was referenced in another file
  - IPs (v4 and v6)
    - Tags: IPv4, IPv6
  - Shell code
    - Tag: ShellCode
  - UNCs
    - Tag: Unc
  - URLs
    - Tag: Url
  - Web shells
    - Tag: WebShell
  - Custom hex search
    - Tag: Hex
  - Custom Regex
    - Tag: Regex
  - Right to left trickery
    - Tag: RightLeft
  - Shell references (sh, bash, zsh, ...)
    - Tag: Shell
  - Possible suspicious commands
    - Tag: Suspicious
- Link files
  - Data type: ShellLink
- Loaded Kernel Modules
  - Data type: KernelModule
- Network connections (via procfs)
  - Data type: NetConn
- Possible rootkit
  - Data type: Rootkit
- Processes (via procfs)
  - Data type: Process
  - Process file (file of the process on disk)
  - Process' open files
    - Data type: ProcessOpenFile
  - Process' loaded libraries
    - Data type: ProcessMap
  - Process' mem mapped files
    - Data type: ProcessMaps
- Users
  - Data type: LocalUser
NOTE: Live machine analysis for rootkits is not entirely reliable. Well-written rootkits will probably not be reliably discoverable with live machine forensics.
- Any logs generated due to a rootkit hunt will have Rootkit set as their parent_data_type
- File data found in memory mapped reads that is not found via a standard file read
  - Tag: DataHidden
- Directory with hidden contents
  - Tag: DirContentsHidden
- Tainted kernel module information
  - Tag: KernelTaint
- Hidden processes
  - Tag: ProcHidden
- World readable run lock files
  - Tag: ProcLockWorldRead
- Odd run lock files
  - Tag: ProcLockSus
- Legit process mimicry
  - Tag: ProcMimic
- Process thread mimicry
  - Tag: ThreadMimic
- Hidden sys modules
  - Tag: ModuleHidden
- Raw packet sniffing processes
  - Tag: PacketSniffer
- Process takeovers
  - Tag: ProcTakeover
- Process run as root with a socket and no deps outside of libc
  - Tag: ProcRootSocketNoDeps
- Odd character devices
  - Tag: CharDeviceMimic
- https://github.com/tstromberg/sunlight/tree/main
- https://sandflysecurity.com/blog/how-to-detect-and-decloak-linux-stealth-rootkit-data/
- https://www.linkedin.com/pulse/detecting-linux-kernel-process-masquerading-command-line-rowland/
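As an example of how one of these hunts can work, here is a minimal sketch of a hidden-process check in Rust (an illustration of the general technique, not the tool's actual code): compare the PIDs that readdir on /proc returns against PIDs whose /proc/<pid>/status is directly reachable.

use std::collections::HashSet;
use std::fs;
use std::path::Path;

// PIDs whose /proc/<pid>/status exists but which readdir("/proc") did not
// list are candidates for processes hidden by a rootkit. A realistic max_pid
// can be read from /proc/sys/kernel/pid_max.
fn hidden_pids(max_pid: u32) -> std::io::Result<Vec<u32>> {
    let listed: HashSet<u32> = fs::read_dir("/proc")?
        .filter_map(|e| e.ok())
        .filter_map(|e| e.file_name().to_string_lossy().parse().ok())
        .collect();
    Ok((1..=max_pid)
        .filter(|pid| !listed.contains(pid))
        .filter(|pid| Path::new(&format!("/proc/{}/status", pid)).exists())
        .collect())
}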
Some file contents are examined for other interesting strings. For example, if another file is referenced within a file, that file's metadata will also be retrieved. Other strings of interest found in file contents are reported: IPs, file paths, URLs, shell code, Base64 and misc. encodings, and UNC paths.
Process information is retrieved via procfs parsing.
The "data_type" field is used to report what the metadata in that log is pulled from. e.g. File, FileContent, Process, ... .
The "parent_data_type" field is used to report if that log was generated due to examining another data_type. e.g. the "FileContent" data_type may trigger a "File" data_type if a file path is found in a file's contents.
The network connection logs do not show originator or responder perspectives simply because procfs reports the IPs as local and remote. You can make a good guess as to whether a connection is incoming or outgoing based upon which of the two ports is higher, but this will not always yield the correct direction; a sketch of that heuristic follows.
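A minimal sketch of that port heuristic (an illustration only; as noted, it can guess wrong):

// Ephemeral client ports are usually numerically higher than well-known
// service ports, so a higher local port suggests an outbound connection.
fn likely_outbound(local_port: u16, remote_port: u16) -> bool {
    local_port > remote_port
}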
If you want to change any field names, edit the struct field names in the data_def source file.
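If the structs are serialized with serde (an assumption here; consult the actual data_def source), a field can also be renamed in the output without renaming it in code. A hypothetical example:

use serde::Serialize;

#[derive(Serialize)]
struct ExampleLog {                 // hypothetical struct for illustration only
    data_type: String,
    #[serde(rename = "pdt")]        // emitted as "pdt" instead of "parent_data_type"
    parent_data_type: String,
    tags: Vec<String>,
}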
This tool comes with no warranty or support. If you choose to use it, you accept all responsibility and liability.
// file paths we want to watch all files in
const WATCH_PATHS: [&str; 14] = [
"/etc",
"/home",
"/lib/modules",
"/proc",
"/root",
"/srv",
"/tmp",
"/usr/lib/systemd/system",
"/usr/local/var/www/html",
"/usr/share/nginx/html",
"/usr/share/nginx/www",
"/var/log",
"/var/spool/cron",
"/var/www",
];
// file MIME types whose content we want to examine for interesting things
const WATCH_FILE_TYPES: [&str; 25] = [
"abiword",
"/pdf",
"/pkix-cert+pem",
"/rtf",
"/vnd.iccprofile",
"/x-desktop",
"/x-object",
"/x-pcapng",
"/x-perl",
"/x-sh",
"/x-tcl",
"/xml",
"bittorrent",
"excel",
"javascript",
"json",
"msword",
"officedocument",
"opendocument",
"powerpoint",
"presentation",
"stardivision",
"text/",
"wordperfect",
"yaml",
];
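Entries like "/pdf" and "text/" suggest these are matched as substrings of the full MIME type string. A minimal sketch of that matching, assuming substring semantics (an assumption, not the tool's confirmed logic):

// "application/pdf" matches the "/pdf" entry; "text/plain" matches "text/".
fn is_watched_mime(mime: &str) -> bool {
    WATCH_FILE_TYPES.iter().any(|t| mime.contains(t))
}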