An easy-to-use, powerful crawler for OLX that collects various non-sensitive data about ads on the site.
- Sufficient performance
- Anonymous, especially via Tor
- Non-sensitive data only
- Filtering by keywords
- Command chaining
This project demonstrates experience with Selenium for web scraping and makes it possible to analyze non-sensitive data about ads on the site. There are no ready-made solutions for collecting data from the site.
You only need to install Google Chrome, that's all. There is no need to install the WebDriver binary manually. Thank you, @SergeyPirogov, for WebDriver Manager.
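For readers curious how the automatic driver setup usually works, here is a minimal sketch using Selenium 4 and the `webdriver-manager` package. It is not taken from this project's source: the helper names `chrome_arguments` and `make_driver` are hypothetical, and mapping headless mode and a proxy onto Chrome switches is an assumption about how such options are typically implemented.

```python
from typing import List, Optional


def chrome_arguments(headless: bool = True, proxy: Optional[str] = None) -> List[str]:
    """Build Chrome command-line switches (hypothetical helper, not part of olx-crawler)."""
    args = []
    if headless:
        # "--headless=new" needs a recent Chrome; older builds use plain "--headless".
        args.append("--headless=new")
    if proxy:
        # Chrome accepts proxy URLs such as "socks5://host:port" here.
        args.append(f"--proxy-server={proxy}")
    return args


def make_driver(headless: bool = True, proxy: Optional[str] = None):
    """Start Chrome; webdriver-manager fetches a matching chromedriver if needed."""
    # Imported lazily so chrome_arguments() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless, proxy):
        options.add_argument(arg)
    # install() returns the path to the downloaded (or cached) driver binary.
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)
```

A local Tor daemon listens for SOCKS5 connections on `127.0.0.1:9050` by default, so `make_driver(proxy="socks5://127.0.0.1:9050")` would route the browser through Tor under these assumptions.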
- Clone the Repository
- Install this Package (./setup.py install) or install dependencies from Pipfile (pipenv install)
olx ads --help # Show help for ads command and exit
olx ads "https://www.olx.ua/uk/zhivotnye/koshki/" # Collect all ads with cats
olx ads --no-free ... # Only paid ads
olx ads --no-paid ... # Only free ads
olx ads --kind --title --price --location ... # Collect extra fields
olx ad --help # Show help for ad command and exit
olx ad "https://www.olx.ua/d/uk/obyavlenie/laskovye-shotlandskie-malyshi-IDNyrf4.html" # Collect ad details
olx ad --keywords keywords.txt ... # Filter by keywords
olx ad --title --description --author --profile --price --location ... # Collect extra fields
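The format of keywords.txt is not described here. The sketch below shows one plausible way keyword filtering could work, assuming one keyword per line and case-insensitive substring matching against an ad's text. Both assumptions and the function names `load_keywords` and `matches` are illustrative, not the project's actual implementation.

```python
from typing import Iterable, List


def load_keywords(lines: Iterable[str]) -> List[str]:
    """Parse keywords, assuming one per line; blank lines are skipped."""
    return [line.strip().lower() for line in lines if line.strip()]


def matches(text: str, keywords: List[str]) -> bool:
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


# In practice the lines would come from open("keywords.txt", encoding="utf-8").
keywords = load_keywords(["Scottish\n", "\n", "kitten\n"])
print(matches("Affectionate Scottish babies", keywords))  # True
print(matches("Dog food, 10 kg", keywords))  # False
```

Note that with this design an empty keyword list matches nothing; a real tool might instead treat it as "no filter".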
olx ads --progress ... # Show progress
olx ads --no-headless ... # Disable headless mode
olx ads --proxy "socks5://..." # Use proxy server
olx ads --all ... # Collect all fields
olx ads --no-link ... # Skip link field
olx ads "https://www.olx.ua/uk/zhivotnye/koshki/" | olx ad --all --progress > ads.csv # Command chaining
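To illustrate the idea behind chaining, here is a small sketch of a pipeline stage that reads links from an upstream command and writes CSV downstream. It assumes the first stage emits one link per line, which is a guess about the data exchanged between the two commands. The names `read_links` and `write_csv` and the example.com links are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List, TextIO


def read_links(stream: TextIO) -> Iterator[str]:
    """Yield non-empty lines from the upstream command (assumed: one link per line)."""
    for line in stream:
        link = line.strip()
        if link:
            yield link


def write_csv(rows: Iterable[Dict[str, str]], fields: List[str], stream: TextIO) -> None:
    """Write rows as CSV with a header row, as a final pipeline stage might."""
    writer = csv.DictWriter(stream, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Simulate `first_stage | second_stage > ads.csv` with in-memory streams.
upstream = io.StringIO("https://example.com/ad/1\n\nhttps://example.com/ad/2\n")
out = io.StringIO()
rows = [{"link": link} for link in read_links(upstream)]
write_csv(rows, ["link"], out)
print(out.getvalue())
```

In a real script, `sys.stdin` and `sys.stdout` would replace the in-memory streams, so the stage can sit on either side of a shell pipe.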
First off, thanks for taking the time to contribute!
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/awesome-feature)
- Commit your Changes (git commit -m 'Add awesome feature')
- Push to the Branch (git push origin feature/awesome-feature)
- Open a Pull Request
Leave a ⭐ if you think this project is cool or useful for you.
olx-crawler is licensed under the MIT License. See the LICENSE for more information.