# html-parser

Here are 85 public repositories matching this topic...

miso-belica / jusText

Heuristic-based boilerplate removal tool

python text-extraction html-parser html-parsing

Updated May 9, 2024
Python

rajatomar788 / pywebcopy

Saves webpages locally to your hard disk with images, CSS, JS, and links as is.

python html crawler web webpage mirror html-parser archive-tool

Updated Jul 31, 2024
Python

ispras / dedoc

Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

html pdf ocr table-of-contents excel html-parser docx documents doc scanned-documents txt document-analysis odt pdf-parser table-recognition docx-parser document-content-extraction logical-structure-extraction

Updated Dec 25, 2024
Python

alphanome-ai / sec-parser

Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual (semantic) structure of the document.

Updated Jul 13, 2024
Python

sihaelov / harser

An easy way to parse HTML and build XPath expressions

python html parser html-parser xpath

Updated Jul 6, 2022
Python

kata198 / AdvancedHTMLParser

Fast, indexed Python HTML parser that builds a DOM node tree and provides common getElementsBy* functions for scraping, testing, modification, and formatting. Also supports XPath.

python html parser formatter tree dom tags attributes filter html-parser create getelementbyid dom-tree getelementsbyclassname getelementsbyname getelementsbytagname

Updated Jul 5, 2023
Python

menggatot / youtube-watch-history-to-csv

This project converts your YouTube watch history HTML file from Google Takeout into a CSV file that universalscrobbler.com can use to scrobble manually in bulk.

scrobble youtube csv lastfm youtube-dl html-parser google-takeout yt-dlp youtube-watch-history universalscrobbler

Updated Apr 16, 2024
Python

OwenOrcan / YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

open-source machine-learning data-mining scraping python3 text-extraction web-scraping html-parser robots-txt data-extraction seotools command-line-tool beginner-friendly contributions-welcome big-data-analytics seo-analysis good-first-issue sitemap-parser web-crawlers

Updated Nov 24, 2024
Python

yannickperrenet / bookmarkdown

✅ Parse your browser's exported HTML bookmark file to Markdown.

markdown html-to-markdown html-parser brave html-to-md brave-browser

Updated Sep 8, 2021
Python

viur-framework / html5

Pure Python HTML abstraction layer, parser and interpreter

python html library framework html5 html-parser viur pyjs pyodide

Updated Nov 5, 2022
Python

vincentlaucsb / pgreaper

A Python library for loading data from various formats into PostgreSQL databases.

python csv-converter sql postgresql convert-data html-parser sql-database sqlite3-database sql-table

Updated Oct 18, 2017
Python

jedmitten / humble_catalog

A script to parse the saved Humble Bundle library HTML

html-parser humblebundle

Updated Aug 13, 2019
Python

Bystroushaak / pyDHTMLParser

Lightweight HTML/XML parser for quick and dirty web scraping.

python parser library html-parser parsing-library

Updated Oct 21, 2022
Python

Epicfisher / TouhouDiscordBot

A work-in-progress Discord bot based on the widely popular Touhou series by ZUN.

bot discord-music-bot discord discord-bot discord-api html-parser touhou-project discord-py touhou quote-parser

Updated May 4, 2020
Python

NullpoGah / reestr

Collects data from the registry of Russian software at https://reestr.minsvyaz.ru

python excel pandas html-parser dataframe beautifulsoup4

Updated Feb 21, 2019
Python

joncutrer / streamlit-data-extraction-tools

Multipage Streamlit app that brings together several HTML data extraction tools.

html html-parser data-extraction streamlit streamlit-webapp streamlit-application

Updated Aug 8, 2021
Python

karambir / ugc-colleges

Python script to extract college names from the UGC (India) website.

python crawler extract python-script html-parser college ugc

Updated Jul 29, 2012
Python

MichaelE919 / ncaa-stats-webscraper

Python webscraping module for NCAA Basketball Stats

python3 requests html-parser webscraping openpyxl beautifulsoup4

Updated Dec 8, 2022
Python

rsharifnasab / telegram_export_analyzer

This script analyzes the number of Telegram messages over time.

python html telegram python3 html-parser html-parsing telegram-desktop beautifulsoup beautifulsoup4

Updated Feb 24, 2020
Python

tonezz / total-wine-scraper

Python scraper for TotalWine.com data 🍷

python html scraper csv html-parser scraping-websites csv-export

Updated Jun 13, 2018
Python
