@rowanmanning/feed-parser

A well-tested and resilient Node.js parser for RSS and Atom feeds.

Introduction

This is a Node.js-based feed parser for RSS and Atom feeds. The project has the following aims:

  • Run automated tests against real-world feeds. It's currently tested against ~40 feeds via Sample Feeds. This ensures that we support real feeds rather than just the specifications.

  • Related to the point above, be as lenient as possible with feed parsing.

  • Keep up to date with the latest Node.js versions, including dropping support for end-of-life versions.

  • Maintain compatibility with the great parts of node-feedparser, e.g. resolving relative URLs.

Requirements

This library requires the following to run:

Usage

Install with npm:

npm install @rowanmanning/feed-parser

Load the library into your code:

const { parseFeed } = require('@rowanmanning/feed-parser');

// or

import { parseFeed } from '@rowanmanning/feed-parser';

You can use the parseFeed function to parse an RSS or Atom feed supplied as a string. The return value is an object representation of the feed:

const feed = parseFeed('<channel> etc. </channel>');
console.log(feed.title);

parseFeed will try to parse even invalid feeds. If no data can be pulled out of the feed, it throws an error with a code property set to INVALID_FEED.
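
One way to handle this error is to check the code property in a catch block. The following is a minimal sketch: tryParseFeed is a hypothetical helper name (not part of the library), and the parser is passed in as an argument so the snippet relies only on the INVALID_FEED code described above.

```javascript
// Hypothetical helper (not part of the library): returns the parsed feed,
// or null when the parser reports that no data could be extracted.
function tryParseFeed(parse, xml) {
  try {
    return parse(xml);
  } catch (error) {
    // INVALID_FEED is the code documented for feeds with no usable data
    if (error && error.code === 'INVALID_FEED') {
      return null;
    }
    // Any other error is unexpected, so let it propagate
    throw error;
  }
}
```

With the library loaded, this could be called as tryParseFeed(parseFeed, xmlString).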

This library does not fetch feeds from a URL itself, but you can do so relatively easily with fetch:

const response = await fetch('https://github.com/rowanmanning/feed-parser/releases.atom');
const feed = parseFeed(await response.text());
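
Because fetch does not reject on HTTP error statuses, it can be worth checking the response before parsing. This is a sketch only: fetchAndParseFeed is a hypothetical name, and both the parser and the fetch implementation are passed in as arguments (an assumption made here so the snippet stands alone).

```javascript
// Hypothetical helper: fetch a feed URL and hand the body to a parser.
// `parse` would be the library's parseFeed; `fetchImpl` defaults to the
// global fetch available in recent Node.js versions.
async function fetchAndParseFeed(url, parse, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  // fetch resolves even for 4xx/5xx responses, so check the status explicitly
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  return parse(await response.text());
}
```

With the library loaded, this could be called as await fetchAndParseFeed(url, parseFeed).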

Parsed feed

The feed object returned by parseFeed has the following properties.

Feed

Represents an RSS or Atom feed.

| Property | Type | Notes |
| --- | --- | --- |
| `authors` | `FeedAuthor[]` | The feed authors. Always an array, which is empty if no authors are found. |
| `categories` | `FeedCategory[]` | The feed categories. Always an array, which is empty if no categories are found. |
| `copyright` | `string \| null` | The feed's copyright notice. |
| `description` | `string \| null` | A short description of the feed. |
| `generator` | `FeedGenerator \| null` | The software that generated the feed. |
| `image` | `FeedImage \| null` | An image representing the feed. |
| `items` | `FeedItem[]` | The content items in the feed. Always an array, which is empty if no items are found. |
| `language` | `string \| null` | The language the feed is written in. |
| `meta` | `FeedMeta` | Meta information about the format of the feed. |
| `published` | `Date \| null` | The date the feed was last published. |
| `self` | `string \| null` | A URL pointing to the feed itself. |
| `title` | `string \| null` | The name of the feed. |
| `updated` | `Date \| null` | The date the feed was last updated. |
| `url` | `string \| null` | A URL pointing to the HTML web page that this feed is for. |
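
As an illustration of working with these types, the sketch below builds a one-line summary from a feed object. summarizeFeed is a hypothetical helper, not part of the library. It supplies fallbacks for the nullable properties and uses the array properties directly, since they are always arrays.

```javascript
// Hypothetical helper: build a one-line summary from a parsed feed object.
function summarizeFeed(feed) {
  // title and updated may be null, so provide fallbacks
  const title = feed.title ?? '(untitled feed)';
  const updated = feed.updated ? `, updated ${feed.updated.toISOString()}` : '';
  // authors is always an array, but each author's name may be null
  const authorNames = feed.authors
    .map((author) => author.name)
    .filter((name) => name !== null);
  const byline = authorNames.length > 0 ? ` by ${authorNames.join(', ')}` : '';
  return `${title}${byline} (${feed.items.length} items${updated})`;
}
```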

FeedAuthor

Represents the author of a Feed or FeedItem.

| Property | Type | Notes |
| --- | --- | --- |
| `email` | `string \| null` | The author's email address. |
| `name` | `string \| null` | The author's name. |
| `url` | `string \| null` | A URL pointing to a representation of the author on the internet. |

FeedCategory

Represents the content category of a Feed or FeedItem.

| Property | Type | Notes |
| --- | --- | --- |
| `label` | `string \| null` | The category display label. |
| `term` | `string` | The category identifier. Often the same as the label. |
| `url` | `string \| null` | A URL pointing to a representation of the category on the internet. |

FeedGenerator

Represents software that generated a Feed.

| Property | Type | Notes |
| --- | --- | --- |
| `label` | `string \| null` | The name of the software that generated the feed. |
| `url` | `string \| null` | A URL pointing to further information about the generator. |
| `version` | `string \| null` | The version of the software used to generate the feed. |

FeedImage

Represents an image for a Feed.

| Property | Type | Notes |
| --- | --- | --- |
| `title` | `string \| null` | The alternative text of the image. |
| `url` | `string` | The image URL. |

FeedItem

Represents an RSS item or Atom entry in a Feed.

| Property | Type | Notes |
| --- | --- | --- |
| `authors` | `FeedAuthor[]` | The feed item authors. Always an array, which is empty if no authors are found. |
| `categories` | `FeedCategory[]` | The feed item categories. Always an array, which is empty if no categories are found. |
| `content` | `string \| null` | The feed item content. |
| `description` | `string \| null` | A short description of the feed item. |
| `id` | `string \| null` | A unique identifier for the feed item. |
| `image` | `FeedImage \| null` | An image representing the feed item. |
| `media` | `FeedItemMedia[]` | Media associated with the feed item. Always an array, which is empty if no media is found. |
| `published` | `Date \| null` | The date the feed item was last published. |
| `title` | `string \| null` | The title of the feed item. |
| `updated` | `Date \| null` | The date the feed item was last updated. |
| `url` | `string \| null` | A URL pointing to the HTML web page that this feed item represents. |
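
Because published and updated can both be null, code that orders items by date needs a fallback. The sketch below shows one possible approach. sortItemsNewestFirst is a hypothetical helper, and preferring published over updated is an assumption made for illustration.

```javascript
// Hypothetical helper: return a new array of items, newest first.
// Items with neither a published nor an updated date sort to the end.
function sortItemsNewestFirst(items) {
  const timestamp = (item) => {
    const date = item.published ?? item.updated;
    return date ? date.getTime() : null;
  };
  // Copy first so the original items array is left untouched
  return [...items].sort((a, b) => {
    const timeA = timestamp(a);
    const timeB = timestamp(b);
    if (timeA === null && timeB === null) return 0;
    if (timeA === null) return 1; // a has no date: place it after b
    if (timeB === null) return -1; // b has no date: place it after a
    return timeB - timeA;
  });
}
```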

FeedItemMedia

Represents a piece of media attached to a FeedItem.

| Property | Type | Notes |
| --- | --- | --- |
| `image` | `string \| null` | A URL pointing to an image representation of the media, e.g. a video cover image. |
| `length` | `number \| null` | The length of the media in bytes. |
| `mimetype` | `string \| null` | The full mime type of the media (e.g. `image/jpeg`). |
| `title` | `string \| null` | The title of the media. |
| `type` | `string \| null` | The type of the media (the first part of the mime type, e.g. `audio` or `image`). |
| `url` | `string` | A URL pointing to the media. |
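
The type property makes it straightforward to pick out one kind of media, for example the audio files attached to podcast episodes. A minimal sketch, where mediaUrlsOfType is a hypothetical helper and not part of the library:

```javascript
// Hypothetical helper: collect the URLs of an item's media of a given type,
// e.g. "audio" for podcast episodes. Media with a null type never match.
function mediaUrlsOfType(item, type) {
  return item.media
    .filter((media) => media.type === type)
    .map((media) => media.url);
}
```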

FeedMeta

Represents meta information about a Feed.

| Property | Type | Notes |
| --- | --- | --- |
| `type` | `"atom" \| "rss"` | The name of the type of feed. |
| `version` | `"0.3" \| "0.9" \| "1.0" \| "2.0"` | The version of the type of feed. |
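
For example, the two meta properties can be combined into a human-readable format label. describeFormat is a hypothetical helper, and the display names it uses are an assumption:

```javascript
// Hypothetical helper: turn feed meta information into a label like "RSS 2.0".
function describeFormat(meta) {
  // Display names for the two documented feed types
  const names = { atom: 'Atom', rss: 'RSS' };
  return `${names[meta.type]} ${meta.version}`;
}
```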

Supported feed formats

Standards

Feeds that adhere to the following standards are supported and most properties will be parsed:

The following XML namespaces are also parsed, and more data will be parsed out for RSS feeds that implement these:

Leniency

Feeds in the real world rarely comply strictly with the standards and can sometimes be invalid XML. We try to be as lenient as possible, only throwing errors if no data can be pulled out of the feed. We test against a suite of real-world feeds.

Contributing

The contributing guide is available here. All contributors must follow this library's code of conduct.

License

Licensed under the MIT license.
Copyright © 2022, Rowan Manning

Credit

This library takes inspiration from the following: