maledorak

Organizations

@rumblefishdev @emmetify

Stars

crawler

16 repositories

Crawl a site to generate knowledge files to create your own custom GPT from a URL

TypeScript 20,489 2,161 Updated Jan 23, 2025

Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.

Python 192 32 Updated Jul 28, 2024

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

TypeScript 16,592 738 Updated Jan 24, 2025

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

TypeScript 22,464 1,809 Updated Jan 23, 2025

A Python script to scrape URLs from major search engines.

Python 15 1 Updated Dec 13, 2022

Python scraper based on AI

Python 17,370 1,458 Updated Jan 22, 2025

Create and modify PDF documents in any JavaScript environment

TypeScript 7,209 692 Updated Jul 17, 2024

A next-generation crawling and spidering framework.

Go 12,898 673 Updated Jan 20, 2025

Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 53,884 10,614 Updated Jan 23, 2025

HTTP client made for scraping based on got.

TypeScript 581 48 Updated Nov 20, 2024

A Python module to bypass Cloudflare's anti-bot page.

Python 4,622 494 Updated Feb 23, 2024

Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/

TypeScript 7,494 586 Updated Jan 21, 2025

A simple browser extension to bypass YouTube's age verification, disable content warnings and watch age-restricted videos without having to sign in!

JavaScript 2,273 102 Updated Nov 16, 2024

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper

Python 26,953 2,112 Updated Jan 23, 2025

Extract clean data from anywhere, powered by vision-language models ⚡

Python 1,217 78 Updated Jan 2, 2025

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Python 3,853 272 Updated Dec 28, 2024