extractor

Here are 215 public repositories matching this topic...

fhamborg / news-please

news-please - an integrated web crawler and information extractor for news that just works

Updated Oct 14, 2024
Python

tatuylonen / wiktextract

Wiktionary dump file parser and multilingual data extractor

multilingual parser lua dictionary extractor templates wikitext scribunto wiktionary wiktionary-parser

Updated Dec 23, 2024
Python

StanGirard / seo-audits-toolkit

SEO & Security Audit for Websites. Lighthouse & Security Headers crawler, Sitemap/Keywords/Images Extractor, Summarizer, etc.

python crawler dashboard analysis seo extractor serp headers summarizer audits lighthouse internal-links seo-tools link-extractor securityheader

Updated Feb 6, 2023
Python

AlexMathew / scrapple

A framework for creating semi-automatic web content extractors

python crawler tutorial extractor scraping web-scraper selector css-selector web-scraping scrapy scrapers beautifulsoup xpath-expression lxml selector-expression

Updated Nov 1, 2024
Python

MikeMeliz / TorCrawl.py

Crawl and extract (regular or onion) webpages through the Tor network

python crawler osint extractor tor onion

Updated Nov 16, 2024
Python

opensemanticsearch / open-semantic-etl

Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database

Updated Oct 9, 2022
Python

lipoja / URLExtract

URLExtract is a Python class for collecting (extracting) URLs from a given text by locating TLDs.

extractor extract urls hacktoberfest

Updated Feb 29, 2024
Python

hxz393 / BrutalityExtractor

Multiprocess decompression software for high-performance systems

Updated Nov 19, 2023
Python

theLSA / burp-sensitive-param-extractor

Burp Suite extension for checking and extracting sensitive request parameters

checker parameters extractor burp-plugin burpsuite sensitive

Updated Nov 29, 2020
Python

mefistotelis / pylabview

Python reader of LabVIEW RSRC files (VI, CTL, LLB). File format description on the Wiki.

extractor reverse-engineering labview python3 fileformat

Updated Nov 14, 2024
Python

DanielJDufour / date-extractor

Extract dates from text

python nlp parser time parse datetime date extractor iso taiwan chinese french arabic temporal kurdish sorani extract-dates

Updated Jan 27, 2021
Python

lucasayres / url-feature-extractor

Extracting features from URLs to build a data set for machine learning. The goal is to find a machine learning model that predicts phishing URLs targeting the Brazilian population.