Welcome to my repository! This collection features various web crawlers and automation tools I've developed. These projects demonstrate the power of Python libraries such as Selenium, Beautiful Soup, and others, for tasks like data extraction, site reading, audio conversion, and natural language processing.
- Project Overview
- Features
- Libraries and Tools Used
- Project Sections:
  - Site Reader and Learners
  - Data Extraction Bots
  - Audio Book Conversion Bots
  - NLP Learning Chat Bots
- Installation
- Usage
This repository contains all the major and minor web crawlers I've created. These projects are designed to interact with websites, extract data, convert text to speech, and even provide chatbot functionality. They are built using powerful Python libraries to automate tasks and simplify the user experience.
- Automated Web Crawling: Interact with dynamic content using Selenium and parse HTML with Beautiful Soup.
- Text to Speech Conversion: Convert text to speech offline using pyttsx3, which drives the platform's own speech engine (SAPI5 on Windows, NSSpeechSynthesizer on macOS, eSpeak on Linux).
- Speech to Text Conversion: Facilitate voice interaction with bots and web crawlers.
- Natural Language Processing: Use NLP for chatbots that respond based on crawled data.
- 🔗 Selenium: A browser automation tool that allows interaction with dynamic web content, well suited to JavaScript-heavy websites.
- 🍲 Beautiful Soup: A Python library for parsing HTML and XML documents, making it easy to extract data from web pages.
- 🔊 pyttsx3: An offline text-to-speech library for Python that works with the speech engines installed on the system: SAPI5 on Windows, NSSpeechSynthesizer on macOS, and eSpeak on Linux. It turns text data into spoken audio without calling a web service.
- 🌐 url_request: A Python module used for fetching URLs. It performs HTTP requests, handles responses, and is used to download content from the web.
- 🤖 NLP Retrieval-Based Chat Bots: Chatbots built with natural language processing that retrieve their responses from data collected by the web crawlers, providing a more interactive way to engage with the extracted information.
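As a rough illustration of the URL-fetching step, the sketch below assumes that `url_request` refers to the standard library's `urllib.request` module. The function names, the User-Agent string, and the default timeout are hypothetical and are not taken from the projects in this repository.

```python
import urllib.request

# Hypothetical User-Agent string; some sites reject requests without one.
USER_AGENT = "example-crawler/0.1"


def build_request(url: str) -> urllib.request.Request:
    """Create a GET request that identifies the crawler."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it, falling back to UTF-8."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html()` with the address of a page you are allowed to crawl returns its HTML as a string, which can then be handed to a parser such as Beautiful Soup.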
Site Reader and Learners: These tools read content from educational and informational websites so that users can consume it as either text or audio.
Data Extraction Bots: These bots scrape specific data from websites, such as product information, news articles, or other content, so that it can be analyzed later.
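A minimal sketch of what such an extraction step might look like with Beautiful Soup is shown here. The HTML structure (`div.product`, `.name`, `.price`) and both function names are assumptions made for illustration; real pages need selectors that match their own markup.

```python
import re


def parse_price(text: str) -> float:
    """Turn a price string such as '$1,299.00' into a float."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return float(match.group().replace(",", ""))


def extract_products(html: str) -> list[dict]:
    """Collect name and price from every <div class="product"> block."""
    # Imported here so parse_price stays usable without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.select("div.product"):
        name = item.select_one(".name")
        price = item.select_one(".price")
        if name is None or price is None:
            continue  # skip entries that lack a name or a price
        products.append({
            "name": name.get_text(strip=True),
            "price": parse_price(price.get_text()),
        })
    return products
```

Keeping the price clean-up in its own function makes it easy to check separately from the page-specific selectors, which tend to change whenever a site is redesigned.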
Audio Book Conversion Bots: These bots convert text into audio books using text-to-speech libraries such as pyttsx3, which makes content easy to consume on the go.
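One possible way to structure the conversion is sketched below, assuming the text is read aloud sentence group by sentence group so that long documents are not queued as a single utterance. The function names, the chunk size, and the speaking rate are hypothetical choices, not settings from this repository.

```python
import re


def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def read_aloud(text: str, rate: int = 160) -> None:
    """Speak the text chunk by chunk through the local speech engine."""
    import pyttsx3  # imported lazily; needs a speech engine on the system

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    for chunk in split_into_chunks(text):
        engine.say(chunk)
        engine.runAndWait()  # block until this chunk has been spoken
```

To write an audio file instead of playing the speech, pyttsx3 also offers `engine.save_to_file(text, filename)` followed by `engine.runAndWait()`.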
NLP Learning Chat Bots: These chatbots use the data gathered by the web crawlers to converse with users, provide insights, answer queries, and assist with learning.
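The retrieval idea can be sketched with nothing but the standard library: score each crawled passage against the user's question and return the closest one. This is an illustrative bag-of-words approach with cosine similarity, not necessarily the method used by the bots in this repository; the function names, the threshold, and the fallback message are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_answer(question: str, passages: list[str], threshold: float = 0.1) -> str:
    """Return the passage most similar to the question, or a fallback."""
    query = tokenize(question)
    best_score, best_passage = 0.0, ""
    for passage in passages:
        score = cosine_similarity(query, tokenize(passage))
        if score > best_score:
            best_score, best_passage = score, passage
    if best_score < threshold:
        return "Sorry, I could not find that."
    return best_passage
```

For example, given the passages "Selenium automates web browsers." and "Beautiful Soup parses HTML documents.", the question "What parses HTML?" selects the second passage because it shares the words "parses" and "html".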
To install the required dependencies, use the following command:
pip install -r requirements.txt
Each project section comes with its own usage instructions. Refer to the specific folder for detailed steps on how to run the scripts and what inputs are needed.