A Benchmark of javascript libraries for parsing HTML (CPU/RAM)
Date: 2021-01-19T03:13:42.044Z
Library | ms/file (Mean) | Module Startup | RAM MB/file (Mean) | Max | Lib Overhead | Delta |
---|---|---|---|---|---|---|
html-dom-parser.js | 31.3028ms ±13.5821ms | 6.76851ms | 49.1mb ±4.91mb | 53.3mb | 0.434mb | 23.7mb |
html-parser.js | 24.2716ms ±16.6921ms | 1.57127ms | 65.7mb ±12.3mb | 74.3mb | 0.0614mb | 44.9mb |
html5.js | 100.166ms ±127.430ms | 16.3293ms | 77.5mb ±6.58mb | 82.3mb | 2.58mb | 50.6mb |
html5parser.js | 2.49572ms ±2.60323ms | 0.422035ms | 52.9mb ±9.60mb | 61.3mb | 0.0451mb | 26.5mb |
htmlparser.js | 25.9345ms ±141.275ms | 1.48171ms | 55.9mb ±6.57mb | 59.3mb | 0.201mb | 31.1mb |
htmlparser2-dom.js | 29.8189ms ±13.3958ms | 20.6422ms | 50.0mb ±5.74mb | 54.7mb | 3.30mb | 23.2mb |
htmlparser2.js | 26.2480ms ±11.6788ms | 24.9610ms | 48.3mb ±4.27mb | 51.2mb | 3.44mb | 19.0mb |
jsdom-fragment.js | 105.638ms ±48.3693ms | 422.421ms | 108mb ±10.5mb | 132mb | 40.5mb | 42.7mb |
jsdom-node-locations.js | 102.662ms ±44.9402ms | 489.440ms | 109mb ±10.8mb | 134mb | 39.3mb | 42.0mb |
jsdom.js | 125.054ms ±53.2103ms | 397.705ms | 117mb ±11.0mb | 141mb | 36.4mb | 53.3mb |
libxmljs.js | 4.02398ms ±2.63628ms | 4.58750ms | 43.3mb ±5.67mb | 49.6mb | 0.520mb | 16.1mb |
neutron-html5parser.js | 3.18042ms ±1.99863ms | 1.01811ms | 45.8mb ±7.86mb | 50.1mb | 0.0696mb | 22.0mb |
parse5-fragment.js | 35.1238ms ±12.5502ms | 13.9843ms | 60.7mb ±7.56mb | 73.4mb | 2.31mb | 30.4mb |
parse5.js | 36.3975ms ±13.3324ms | 15.6430ms | 62.0mb ±8.06mb | 74.4mb | 2.45mb | 33.2mb |
sax-wasm.js | 10.2434ms ±5.75753ms | 13.4242ms | 90.7mb ±29.3mb | 135mb | 7.26mb | 99.0mb |
sax.js | 11.1351ms ±12.4832ms | 2.33130ms | 62.7mb ±10.4mb | 67.2mb | 0.352mb | 38.6mb |
-
Max = The maximum amount of memory seen during all the tests.
(You should see higher values in the real world when parsing multiple files in sequence,
normally garbage collection isn't guaranteed to happen after each parse, like here. This is the amount of ram you will typically need for parsing a single file) -
Delta = The amount of RAM being used at the end of the benchmark after a forced Garbage Colletion.
This shows how good or bad the library is at releasing its resources. -
Lib Overhead = Memory usage just after importing the library and running the setup()
minus the baseline memory usage before importing the library.
Node:
14.15.1
V8:8.4.371.19-node.17
NPM:6.14.8
OS: Mac OS X macOS Catalina 10.15.7 darwin x64 19.6.0 Device: Apple Inc. MacBookPro15,1 | CPU Intel® Core™ i7-8750H 2.20GHz 6C/12T | RAM 16 GB | GPU Intel Intel UHD Graphics 630 Built-In 1536 MB / AMD Radeon Pro 555X PCIe 4096 MB
- Clone the repository
- Run:
npm install npm start
Original HTML files from @AndreasMadsen