victornpb/benchmark-html-parser-libraries

Benchmark HTML Parser Libraries

A benchmark of JavaScript libraries for parsing HTML (CPU and RAM usage)

Results

List of results

Date: 2021-01-19T03:13:42.044Z

| Library | ms/file (Mean) | Module Startup | RAM MB/file (Mean) | Max | Lib Overhead | Delta |
|---|---|---|---|---|---|---|
| html-dom-parser.js | 31.3028ms ±13.5821ms | 6.76851ms | 49.1mb ±4.91mb | 53.3mb | 0.434mb | 23.7mb |
| html-parser.js | 24.2716ms ±16.6921ms | 1.57127ms | 65.7mb ±12.3mb | 74.3mb | 0.0614mb | 44.9mb |
| html5.js | 100.166ms ±127.430ms | 16.3293ms | 77.5mb ±6.58mb | 82.3mb | 2.58mb | 50.6mb |
| html5parser.js | 2.49572ms ±2.60323ms | 0.422035ms | 52.9mb ±9.60mb | 61.3mb | 0.0451mb | 26.5mb |
| htmlparser.js | 25.9345ms ±141.275ms | 1.48171ms | 55.9mb ±6.57mb | 59.3mb | 0.201mb | 31.1mb |
| htmlparser2-dom.js | 29.8189ms ±13.3958ms | 20.6422ms | 50.0mb ±5.74mb | 54.7mb | 3.30mb | 23.2mb |
| htmlparser2.js | 26.2480ms ±11.6788ms | 24.9610ms | 48.3mb ±4.27mb | 51.2mb | 3.44mb | 19.0mb |
| jsdom-fragment.js | 105.638ms ±48.3693ms | 422.421ms | 108mb ±10.5mb | 132mb | 40.5mb | 42.7mb |
| jsdom-node-locations.js | 102.662ms ±44.9402ms | 489.440ms | 109mb ±10.8mb | 134mb | 39.3mb | 42.0mb |
| jsdom.js | 125.054ms ±53.2103ms | 397.705ms | 117mb ±11.0mb | 141mb | 36.4mb | 53.3mb |
| libxmljs.js | 4.02398ms ±2.63628ms | 4.58750ms | 43.3mb ±5.67mb | 49.6mb | 0.520mb | 16.1mb |
| neutron-html5parser.js | 3.18042ms ±1.99863ms | 1.01811ms | 45.8mb ±7.86mb | 50.1mb | 0.0696mb | 22.0mb |
| parse5-fragment.js | 35.1238ms ±12.5502ms | 13.9843ms | 60.7mb ±7.56mb | 73.4mb | 2.31mb | 30.4mb |
| parse5.js | 36.3975ms ±13.3324ms | 15.6430ms | 62.0mb ±8.06mb | 74.4mb | 2.45mb | 33.2mb |
| sax-wasm.js | 10.2434ms ±5.75753ms | 13.4242ms | 90.7mb ±29.3mb | 135mb | 7.26mb | 99.0mb |
| sax.js | 11.1351ms ±12.4832ms | 2.33130ms | 62.7mb ±10.4mb | 67.2mb | 0.352mb | 38.6mb |

Notes:

  • Max = The highest memory usage observed across all tests.
    This is roughly the amount of RAM you need to parse a single file.
    Expect higher values in real-world use when parsing several files in sequence, because garbage collection is not guaranteed to run after each parse as it does here.

  • Delta = The amount of RAM still in use at the end of the benchmark, after a forced garbage collection.
    This shows how well the library releases its resources.

  • Lib Overhead = Memory usage just after importing the library and running the setup()
    minus the baseline memory usage before importing the library.
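
The three memory columns can be illustrated with a small Node.js sketch. This is not the benchmark's actual code: the helper names `heapUsedMb` and `measureMemory` are hypothetical, and it assumes memory is read from `process.memoryUsage().heapUsed` and that Delta is taken relative to the baseline. The repository may sample a different metric or compute Delta differently.

```javascript
// Sketch only: helper names are hypothetical, not taken from the benchmark source.
const MB = 1024 * 1024;

// Returns the current heap usage in MB.
// When forceGc is true and Node was started with --expose-gc,
// a garbage collection is triggered before sampling.
function heapUsedMb(forceGc = true) {
  if (forceGc && typeof global.gc === 'function') global.gc();
  return process.memoryUsage().heapUsed / MB;
}

// loadLibrary: function that imports the library, runs its setup,
//              and returns a parse(html) function (assumed shape).
// files:       array of HTML strings to parse.
function measureMemory(loadLibrary, files) {
  const baseline = heapUsedMb();               // before importing the library
  const parse = loadLibrary();
  const libOverhead = heapUsedMb() - baseline; // "Lib Overhead"

  let max = 0;
  for (const html of files) {
    parse(html);
    max = Math.max(max, heapUsedMb(false));    // "Max": sampled before collecting
    heapUsedMb();                              // collect between files
  }

  const delta = heapUsedMb() - baseline;       // "Delta": assumed relative to baseline
  return { libOverhead, max, delta };
}
```

To get forced collections, run the script with `node --expose-gc`; without that flag `global.gc` is undefined and the sketch falls back to plain sampling, which makes Delta much less meaningful.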


Device summary

  • Node: 14.15.1
  • V8: 8.4.371.19-node.17
  • NPM: 6.14.8
  • OS: macOS Catalina 10.15.7 (darwin x64 19.6.0)
  • Device: Apple Inc. MacBookPro15,1
  • CPU: Intel® Core™ i7-8750H 2.20GHz 6C/12T
  • RAM: 16 GB
  • GPU: Intel UHD Graphics 630 (Built-In, 1536 MB) / AMD Radeon Pro 555X (PCIe, 4096 MB)


Running the benchmark

  1. Clone the repository
  2. Run:
    npm install
    npm start

Attribution

Original HTML files from @AndreasMadsen