Cheerio in Production

iangkuo edited this page Oct 11, 2022 · 72 revisions

Companies using cheerio in production

  • The Old Net uses Cheerio to remove modern JavaScript that would break vintage computer web browsers.
  • Sysco Labs uses Cheerio to parse scraped websites
  • Scraper API uses Cheerio to parse scraped websites
  • AfterShip uses Cheerio to parse courier tracking results.
  • Walmart uses Cheerio for the server-side rendering of its mobile website
  • Cloudup uses Cheerio to provide a better viewing experience for certain websites
  • Kimono uses Cheerio to parse the scraped websites
  • Courseoff uses cheerio to scrape college course catalogs and schedule listings
  • Iframely uses cheerio to parse specific domains and generic patterns, such as microformats or oEmbed, and to analyze detected embed codes, make them responsive, check SSL support, etc.
  • Higher Ed Careers Canada uses Cheerio to automatically verify details about job postings and to add "nofollow" to links in submitted HTML.
  • Workray uses Cheerio to parse job alert emails from job boards and extract the job listings, or to extract application information from application confirmation emails.
  • BotFactory uses Cheerio to parse wishlists from Amazon and AliExpress, as well as courier tracking results
  • ZenLocator uses Cheerio to strip out JavaScript from customer templates and to pre-render custom dashboard controls.
  • GitHub Trending API uses Cheerio to scrape GitHub trending projects.
  • InspectorHub uses Cheerio to parse the top e-commerce products
  • Vingle uses Cheerio to detect suspicious content, and to parse XML-based content data.
  • OWEB uses Cheerio to parse scraped websites.
  • Affiliate Stats Tracker uses Cheerio to extract data scraped using Puppeteer.
  • REVOL uses Cheerio to parse scraped websites
  • Remotehour uses Cheerio to modify meta attributes
  • Talon uses Cheerio to parse insurance carrier and health plan portals.
  • BikeSleepBike uses Cheerio to parse blog posts about bikepacking and bicycle travel.
  • JishinAlert uses Cheerio to parse scraped disaster prevention data from government sources such as NHK, National Research Institute for Earth Science and Disaster Resilience (NIED), and the Japan Meteorological Agency.
  • Tibbo uses Cheerio to parse XML files and output HTML topics for its documentation platform.

Libraries Built with cheerio

  • x-ray is a web scraper
  • Backbone.LayoutManager
  • breakdance is an HTML-to-Markdown converter that uses cheerio to parse the HTML
  • itteco/iframely is the library behind Iframely
  • fruit-loops is Walmart's isomorphic JavaScript environment
  • CheerioBin lets you run Cheerio and jQuery commands simultaneously
  • AkashaCMS is a content management system which produces static HTML files. It uses Cheerio extensively for DOM manipulation of generated HTML pages before writing to disk.
  • Postxml is a tool for transforming HTML/XML with plugins based on cheerio.
  • CheerioGetCssSelector extends cheerio to get a unique CSS selector for any cheerio element.
  • jsonframe-cheerio brings a crazy simple way to input/output JSON-structured data
  • temme is a concise and convenient jQuery-like selector for Node crawlers.
  • Jason the Miner harvests data at the <html> mine. Cheerio enables Jason to express simple yet powerful schema definitions for DOM element selection, matching, and extraction.
  • SeaSite is a new approach to simple static website generation using jQuery-like selectors. Convenient predefined plugins and tasks solve everyday problems in complex website building.
  • icsd-scraper retrieves details about professors and courses from the University of the Aegean department ICSD to help students with their academic projects. It uses Cheerio for DOM manipulation and data collection.
  • Tumblweed is a fully cross-platform Tumblr blog downloader, using Cheerio to scrape posts to extract embedded media for download.
  • Typi is a scraper that uses a headless browser along with cheerio to scrape anything that can be viewed by a real user
  • @luxdamore/nuxt-prune-html is a Nuxt module that prunes HTML before sending it to the browser by removing elements that match the given CSS selector(s). It is useful for boosting performance by serving different HTML to bots, with all scripts removed, under dynamic rendering.
  • scrapio is a very simple scraper based on JSON templates, using cheerio.
  • iam-floyd uses Cheerio to generate code from the AWS documentation
  • cairn is an npm package and CLI tool for saving a web page as a single HTML file.
  • @get-set-fetch/scraper is a web scraper supporting multiple databases and headless clients

Feel free to add your company or library using cheerio!
