So many browser bookmarks – there are 1,900 URLs in my collection.
And in just one year, 120 of those URLs ceased to exist.
A simple PHP prototype script provided a slow way (~1 URL per second) of checking for dead links.
I switched to Python to leverage its threading capabilities and speed up the process. Then I finally got round to adding cURL Multi to the PHP script.
$ php bookmarks_checker.php
1883 links being checked ...
error | https://www.nxytimes.com/ | 0 | 4.999007 | nxytimes
<...>
See generated logfile bookmarks_checker.log
URL parse time: 177.642 s
95 links failed
1788 links verified
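The Python speed-up comes from checking many URLs concurrently: each check spends most of its time blocked on network I/O, so a thread pool works well despite the GIL. A minimal sketch of the idea (function names, timeout, and worker count here are illustrative, not the scripts' actual code):

```python
# Sketch: concurrent dead-link checking with a thread pool.
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request

def check_url(url, timeout=5):
    """Return (url, True) if the URL answers, (url, False) otherwise."""
    try:
        # HEAD avoids downloading the whole page where servers allow it.
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout):
            return url, True
    except (urllib.error.URLError, ValueError, OSError):
        return url, False

def check_all(urls, workers=20):
    # Threads suit this workload: the interpreter lock is released
    # while each worker waits on the network.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check_url, urls))
```

A pool of a few dozen workers turns a sequential ~1 URL per second crawl into one limited mostly by the slowest responders.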
- Python 3
- Python 2
- PHP and cURL
By default the scripts attempt to load a file called bookmarks.html in the same directory; an alternative filename can be specified on the command line.
The scripts parse the file and try to access each URL, printing a list of URLs that cannot be accessed (this will occasionally include a false positive, e.g. when a slow server exceeds the timeout).
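Exported bookmark files use the old Netscape bookmark format, in which every bookmark is an `<A HREF="...">` tag, so extracting the URLs amounts to pulling href attributes out of anchor tags. A rough sketch with the standard-library parser (an assumption about the approach, not the scripts' exact code):

```python
# Sketch: collect http(s) hrefs from a Netscape-format bookmarks file.
from html.parser import HTMLParser

class BookmarkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Each bookmark is an <A HREF="..."> tag; skip non-web schemes
        # such as place: or javascript: entries.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith("http"):
                    self.urls.append(value)

def extract_urls(html_text):
    parser = BookmarkParser()
    parser.feed(html_text)
    return parser.urls
```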
python3 bookmarks_checker.py
python bookmarks_checker_py2.py
(or make the file executable and run it directly, e.g. ./bookmarks_checker.py)
-h or --help displays help text.
-f <file> loads an alternatively-named file instead of the default bookmarks.html.
php bookmarks_checker.php [file]
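For reference, the option handling described above can be sketched with argparse (illustrative only; the actual scripts may parse their arguments differently):

```python
# Sketch: -f/--file option with bookmarks.html as the default.
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Check a bookmarks file for dead links.")
    parser.add_argument("-f", "--file", default="bookmarks.html",
                        help="bookmarks file to load (default: bookmarks.html)")
    return parser.parse_args(argv)
```

argparse provides the -h/--help text for free, which matches the behaviour described above.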
In Firefox: Bookmarks > Show All Bookmarks > Import and Backup > Export Bookmarks to HTML
Access Chrome's Bookmark Manager with Ctrl + Shift + O or by visiting chrome://bookmarks/ , then click Organize > Export bookmarks to HTML file ... (or the hamburger icon > Export bookmarks).
Setting DEBUG = True will show each URL as access is attempted, along with either the successful response or the failure error message.
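Illustratively, a module-level DEBUG flag usually gates this kind of per-URL progress output (the names below are assumptions, not the script's actual internals):

```python
# Sketch: a DEBUG flag gating per-URL progress output.
DEBUG = True

def report(url, ok, detail=""):
    """Print one progress line per URL when DEBUG is on."""
    if DEBUG:
        status = "ok" if ok else "FAIL"
        print(f"{status:4} | {url} {detail}".rstrip())
```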
Thanks to Doug Hellmann, jfs, and philshem for threading pools in Python.
Scripts are released under the GPL v3.