crawler
Crawl a site to generate knowledge files to create your own custom GPT from a URL
Web scraper with a simple REST API, running in Docker and using a headless browser and Readability.js for parsing.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
A Python script to scrape URLs from major search engines.
Create and modify PDF documents in any JavaScript environment
A next-generation crawling and spidering framework.
Scrapy, a fast high-level web crawling & scraping framework for Python.
HTTP client made for scraping based on got.
A Python module to bypass Cloudflare's anti-bot page.
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
A simple browser extension to bypass YouTube's age verification, disable content warnings and watch age-restricted videos without having to sign in!
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Extract clean data from anywhere, powered by vision-language models ⚡
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
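
One entry in the list above describes converting any URL to LLM-friendly input by prepending https://r.jina.ai/ to it. A minimal sketch of that prefixing step follows. It assumes the target URL is appended to the prefix unchanged; the helper name `to_reader_url` is hypothetical and not part of any listed project. The code only builds the string and makes no network request.

```python
# Prefix taken from the list entry above.
READER_PREFIX = "https://r.jina.ai/"


def to_reader_url(target_url: str) -> str:
    """Return the target URL with the reader prefix prepended.

    Hypothetical helper for illustration: it only concatenates strings
    and assumes the service accepts the full target URL after the prefix.
    """
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URL")
    return READER_PREFIX + target_url


if __name__ == "__main__":
    print(to_reader_url("https://example.com/docs"))
```

Fetching the resulting URL with any HTTP client would then return the converted page, subject to whatever limits the service applies.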