This repository has been archived by the owner on Oct 29, 2019. It is now read-only.

Golang package implementing the CDXJ file format used by OpenWayback 3.0.0+ to index web archive contents

datatogether/cdxj

CDXJ

Golang package implementing the CDXJ file format used by OpenWayback 3.0.0 (and later) to index web archive contents (notably in WARC and ARC files) and make them searchable via a resource resolution service. CDXJ builds on the CDX file format originally developed by the Internet Archive for the indexing behind the Wayback Machine. It simplifies the primary fields and adds a flexible JSON 'block' to each record, so that additional data can be included with a high degree of flexibility.
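
To make the record layout concrete, here is a minimal sketch of reading one CDXJ line using only the Go standard library. It does not use or reproduce this package's API. It assumes each record is a single line with three space-separated primary fields (a searchable URI, a timestamp, and a record type) followed by the JSON block; the `Record` type, the `parseRecord` function, and the sample line are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Record is a hypothetical in-memory form of one CDXJ line.
// It is NOT the type exported by the cdxj package.
type Record struct {
	URI        string                 // searchable URI (first field)
	Timestamp  string                 // capture time, kept as raw text
	RecordType string                 // e.g. "response"
	JSON       map[string]interface{} // decoded JSON block
}

// parseRecord splits a line into three space-delimited primary fields
// and treats the remainder of the line as the JSON block.
// The field layout is an assumption stated in the text above.
func parseRecord(line string) (*Record, error) {
	parts := strings.SplitN(strings.TrimSpace(line), " ", 4)
	if len(parts) != 4 {
		return nil, fmt.Errorf("expected 4 fields, got %d", len(parts))
	}
	rec := &Record{URI: parts[0], Timestamp: parts[1], RecordType: parts[2]}
	if err := json.Unmarshal([]byte(parts[3]), &rec.JSON); err != nil {
		return nil, fmt.Errorf("invalid JSON block: %w", err)
	}
	return rec, nil
}

func main() {
	// Made-up sample record, not taken from a real index.
	line := `(com,example,)/ 2017-01-02T03:04:05Z response {"status":"200"}`
	rec, err := parseRecord(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(rec.URI, rec.Timestamp, rec.RecordType, rec.JSON["status"])
}
```

Splitting into at most four parts keeps any spaces inside the JSON block intact, which is why the sketch uses `strings.SplitN` rather than `strings.Fields`.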

License & Copyright

Copyright (C) 2017 Data Together
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, version 3.0.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the LICENSE file for details.

Getting Involved

We would love involvement from more people! If you notice any errors or would like to submit changes, please see our Contributing Guidelines.

We use GitHub issues for tracking bugs and feature requests, and Pull Requests (PRs) for submitting changes.

Installation

Use in any Go package with:

import "github.com/datatogether/cdxj"

Development

Coming Soon
