The HTTP Archive tracks how the Web is built

This repo contains the source code powering the HTTP Archive data collection.

What is the HTTP Archive?

Successful societies and institutions recognize the need to record their history: doing so provides a way to review the past, find explanations for current behavior, and spot emerging trends. In 1996, Brewster Kahle realized the cultural significance of the Internet and the need to record its history. As a result, he founded the Internet Archive, which collects and permanently stores the Web's digitized content.

In addition to the content of web pages, it's important to record how this digitized content is constructed and served. The HTTP Archive provides this record. It is a permanent repository of web performance information such as page size, failed requests, and the technologies used. This performance information allows us to see trends in how the Web is built and provides a common data set for web performance research.
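As a rough illustration of the kind of performance information described above, the sketch below summarizes one page load from a HAR-style (HTTP Archive format) dictionary. It is not code from this repo: the function name `summarize_har`, the sample data, and the rule that a status of 0 or 400 and above counts as a failed request are all assumptions made for the example.

```python
def summarize_har(har: dict) -> dict:
    """Summarize one page load from a HAR-style dictionary (hypothetical helper)."""
    entries = har["log"]["entries"]
    total_bytes = 0
    failed = 0
    for entry in entries:
        response = entry["response"]
        # HAR 1.2 uses bodySize = -1 when the size is unknown; count that as 0.
        total_bytes += max(response.get("bodySize", 0), 0)
        # Assumption: status 0 (no response) and 4xx/5xx count as failed requests.
        status = response.get("status", 0)
        if status == 0 or status >= 400:
            failed += 1
    return {
        "requests": len(entries),
        "total_bytes": total_bytes,
        "failed_requests": failed,
    }


# Made-up sample data, for illustration only.
sample = {
    "log": {
        "entries": [
            {"response": {"status": 200, "bodySize": 1200}},
            {"response": {"status": 404, "bodySize": 300}},
            {"response": {"status": 200, "bodySize": -1}},
        ]
    }
}

summary = summarize_har(sample)
print(summary)
```

Aggregating summaries like this across many pages and crawl dates is one way trends such as growth in page size could be observed.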

Folders and files:
archives
bin
bulktest
custom_metrics
docs
images
lists
.gitignore
.htaccess
LICENSE.md
Makefile
README.md
about.php
addsite.php
admin.php
apple-touch-icon-precomposed.png
apple-touch-icon.png
charts.inc
compare.php
comparedates.php
crawl-data.php
crawls.inc
dbapi.inc
download.php
downloads.php
favicon.ico
filmstrip.js
findurl.php
frame.php
har.css
har.js
har_to_pagespeed
harviewer.js
index.php
interesting-images.js
interesting.php
news.php
package.json
pages.inc
patchwork.js
patchwork.php
removesite.php
requests.inc
robots.txt
runs.js
runs.php
schema.js
settings.inc
sorttable-async.js
stats.inc
status.inc
style.css
tablesort.js
trends.inc
trends.php
ui.inc
urls.inc
urls.php
utils.inc
viewsite.php
websites.php

paulcalvano/httparchive
