Robofinder is a powerful Python script that searches Archive.org for historical `robots.txt` files of any given website and retrieves them. It is ideal for security researchers, web archivists, and penetration testers looking to uncover previously accessible paths or directories that were once listed in a site's `robots.txt`.
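
Under the hood, tools like this typically query the Wayback Machine's CDX API to enumerate archived captures of a site's `robots.txt` and then download each snapshot. The following is a minimal sketch of that idea, not Robofinder's actual implementation; the function names and the use of the `requests` library are assumptions:

```python
# Minimal sketch: enumerate and fetch archived robots.txt captures
# via the Wayback Machine CDX API. Not Robofinder's actual code.
import requests

CDX_API = "https://web.archive.org/cdx/search/cdx"

def list_robots_snapshots(domain: str) -> list:
    """Return [timestamp, original_url] rows for archived robots.txt captures."""
    params = {
        "url": f"{domain}/robots.txt",
        "output": "json",
        "fl": "timestamp,original",
        "filter": "statuscode:200",  # only successful captures
        "collapse": "digest",        # skip byte-identical captures
    }
    rows = requests.get(CDX_API, params=params, timeout=30).json()
    return rows[1:]  # the first row is the CDX column header

def fetch_snapshot(timestamp: str, original: str) -> str:
    """Download the robots.txt body as it was archived at `timestamp`."""
    url = f"https://web.archive.org/web/{timestamp}/{original}"
    return requests.get(url, timeout=30).text

if __name__ == "__main__":
    for timestamp, original in list_robots_snapshots("example.com")[:3]:
        body = fetch_snapshot(timestamp, original)
        print(timestamp, original, f"({len(body)} bytes)")
```

Robofinder automates this lookup across all captures and layers on the conveniences listed below.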
- Fetch historical `robots.txt` files from Archive.org.
- Extract and display old paths or directories that were once disallowed or listed.
- Save results to a specified output file.
- Silent mode for unobtrusive execution.
- Multi-threading support for faster processing.
- Option to concatenate extracted paths with the base URL for easy access.
- Debug mode for detailed execution logs.
- Extraction of old parameters from `robots.txt` files (see the sketch after this list).
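
As a rough illustration of the last two extraction features, the snippet below shows how rules pulled from an archived `robots.txt` could be joined onto the base URL (the `-c` behavior) and mined for query-parameter names (the `-p` behavior). This is a hypothetical sketch; every function name here is my own, not Robofinder's:

```python
# Hypothetical illustration of path concatenation (-c) and
# parameter extraction (-p). Not Robofinder's actual code.
from urllib.parse import parse_qs, urljoin, urlparse

def extract_paths(robots_body: str) -> list:
    """Pull the value of every Allow:/Disallow:/Sitemap: rule."""
    paths = []
    for line in robots_body.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() in ("allow", "disallow", "sitemap") and value.strip():
            paths.append(value.strip())
    return paths

def concat_with_base(base_url: str, paths) -> list:
    """Roughly what -c does: join each extracted path onto the base URL."""
    return [urljoin(base_url, p) for p in paths]

def extract_params(paths) -> set:
    """Roughly what -p does: collect query-string parameter names."""
    names = set()
    for p in paths:
        names.update(parse_qs(urlparse(p).query, keep_blank_values=True))
    return names

body = "User-agent: *\nDisallow: /admin/\nDisallow: /search?q=&lang=en\n"
paths = extract_paths(body)
print(concat_with_base("https://example.com", paths))
# -> ['https://example.com/admin/', 'https://example.com/search?q=&lang=en']
print(extract_params(paths))
# -> {'q', 'lang'}
```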
Install Robofinder quickly and securely using `pipx`:

```bash
pipx install git+https://github.com/Spix0r/robofinder.git
```
To install manually:

```bash
git clone https://github.com/Spix0r/robofinder.git
cd robofinder
pip install -r requirements.txt
```
If installed via `pipx`:

```bash
robofinder -u https://example.com
```

For manual installation:

```bash
python3 robofinder.py -u https://example.com
```
- Save output to a file: `robofinder -u https://example.com -o results.txt`
- Silent mode (minimal console output): `robofinder -u https://example.com -s`
- Concatenate paths with the base URL: `robofinder -u https://example.com -c`
- Extract parameters: `robofinder -u https://example.com -p`
- Enable debug mode: `robofinder -u https://example.com --debug`
- Multi-threading (default: 10 threads): `robofinder -u https://example.com -t 10`
Combine options for tailored execution:

```bash
robofinder -u https://example.com -t 10 -c -o results.txt -s
```

For example, running Robofinder on `example.com` with 10 threads, in silent mode, and saving just the parameters to `results.txt`:

```bash
robofinder -u https://example.com -t 10 -o results.txt -s -p
```
Contributions are highly welcome! If you have ideas for new features, optimizations, or bug fixes, feel free to submit a Pull Request or open an issue on the GitHub repository.