GitHub - ziord/robin: XML/HTML parser and processing library for JavaScript and TypeScript


Robin is an XML parser and processing library that supports a sane version of HTML. It features a set of DOM utilities, including support for XPath 1.0, for interacting with and manipulating XML/HTML documents. Typical use cases include processing XML or HTML files and web scraping. Note that Robin is a non-validating parser: DTD structures are not used to validate the markup document.

Quick Start

All samples below are for the Node.js runtime.

Parsing a Document

JavaScript

const { Robin } = require("@ziord/robin");

const robin = new Robin("<tag id='1'>some value<data id='2'>123456</data></tag>", "XML"); // use "XML" mode - which is the default mode - for XML documents ("HTML" for HTML documents)

// pretty-printing the document
console.log(robin.prettify());

// alternatively
// const root = new Robin().parse("...some markup...");
// console.log(root.prettify());

TypeScript

import { Robin } from "@ziord/robin";

const robin = new Robin("<div id='1'>some value<span id='2'>123456</span></div>", "HTML"); // mode "HTML" for HTML documents
console.log(robin.prettify());

Finding an Element Using the DOM API

By Name

JavaScript

// find "data" element
const element = robin.dom(robin.getRoot()).find("data");

// pretty-print the element
console.log(element.prettify());

TypeScript

// find "span" element
import { ElementNode } from "@ziord/robin";

const element = robin.dom(robin.getRoot()).find<ElementNode>("span")!;

// pretty-print the element
console.log(element.prettify());

By Filters

JavaScript

const { DOMFilter } = require("@ziord/robin");

const root = robin.getRoot();
// find the first "data" element
robin.dom(root).find({filter: DOMFilter.ElementFilter("data")});

// find the first element having attribute "id"
robin.dom(root).find({filter: DOMFilter.AttributeFilter("id")});

// find the first element having attributes "id", "foo"
robin.dom(root).find({filter: DOMFilter.AttributeFilter(["id", "foo"])});

// find the first element having attribute "id"="2"
robin.dom(root).find({filter: DOMFilter.AttributeFilter({ id: "2" })});

// find the first "data" element having attribute "id"="2"
robin.dom(root).find({filter: DOMFilter.ElementFilter("data", { id: "2" })});

The TypeScript variant follows the same logic. The API also provides many other utility functions.
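The comments above describe three argument shapes for `DOMFilter.AttributeFilter`: a single name, a list of names, and a name-to-value map. As a rough mental model only, the standalone predicate below mirrors those descriptions over a plain attribute object. It is not Robin's implementation: `matchesAttributes` is a hypothetical name, and reading a list as "every name must be present" is an assumption drawn from the wording of the comments.

```javascript
// Hypothetical helper, not part of @ziord/robin.
// `attributes` is a plain object such as { id: "2" }.
function matchesAttributes(attributes, spec) {
  const has = (name) =>
    Object.prototype.hasOwnProperty.call(attributes, name);

  // "id": the attribute must exist
  if (typeof spec === "string") return has(spec);

  // ["id", "foo"]: assumed to mean all listed attributes must exist
  if (Array.isArray(spec)) return spec.every(has);

  // { id: "2" }: each attribute must exist with the given value
  return Object.entries(spec).every(
    ([name, value]) => has(name) && attributes[name] === value
  );
}

const attrs = { id: "2" };
console.log(matchesAttributes(attrs, "id")); // true
console.log(matchesAttributes(attrs, ["id", "foo"])); // false
console.log(matchesAttributes(attrs, { id: "2" })); // true
```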

Finding an Element Using XPath

By Queries

JavaScript

// find "data" element
const element = robin.path(robin.getRoot()).queryOne("/tag/data");

// pretty-print the element
console.log(element.prettify());

TypeScript

// find "span" element
import { ElementNode } from "@ziord/robin";

const element = robin.path(robin.getRoot()).queryOne<ElementNode>("//span")!;

// pretty-print the element
console.log(element.prettify());

The XPath API also provides other utilities, such as query and queryAll.
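The two queries above use different XPath 1.0 path forms: `/tag/data` is an absolute location path that follows child steps from the document root, while `//span` selects matching elements at any depth. The standalone sketch below models that difference over plain objects. It does not use Robin, and `selectByPath` and `selectDescendants` are hypothetical names used only for illustration.

```javascript
// Plain-object stand-in for a parsed document (not Robin's node types).
const doc = {
  name: "#document",
  children: [
    {
      name: "tag",
      children: [
        { name: "data", children: [] },
        { name: "wrapper", children: [{ name: "data", children: [] }] },
      ],
    },
  ],
};

// Models an absolute path such as /tag/data: each step keeps only the
// direct children that carry the requested name.
function selectByPath(root, steps) {
  let current = [root];
  for (const step of steps) {
    current = current.flatMap((node) =>
      node.children.filter((child) => child.name === step)
    );
  }
  return current;
}

// Models //name: collects every descendant with the requested name,
// in document order.
function selectDescendants(root, name) {
  const found = [];
  for (const child of root.children) {
    if (child.name === name) found.push(child);
    found.push(...selectDescendants(child, name));
  }
  return found;
}

console.log(selectByPath(doc, ["tag", "data"]).length); // 1
console.log(selectDescendants(doc, "data").length); // 2
```

The absolute path misses the nested `data` element because it is not a direct child of `tag`; the descendant form finds both.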

Finding an Attribute

From an Element

JavaScript

// find "attributeKey" attribute
const attribute = element.getAttributeNode("attributeKey");
console.log(attribute.prettify());

From the DOM Using the DOM API

JavaScript

// find "attributeKey" attribute from any "foo" element
const attribute = robin.dom(robin.getRoot()).findAttribute("foo", "attributeKey");
console.log(attribute.prettify());
console.log("key:", attribute.name.qname, "value:", attribute.value);

From the DOM Using the XPath API

TypeScript

import { AttributeNode } from "@ziord/robin";
// find "attributeKey" attribute from any "foo" element
const attribute = robin.path(robin.getRoot()).queryOne<AttributeNode>("//foo[@attributeKey]/@attributeKey")!;
console.log("key:", attribute.name.qname, "value:", attribute.value);

Finding a Text

From the DOM Using the DOM API

TypeScript

import { TextNode } from "@ziord/robin";
// find any text
const text = robin.dom(robin.getRoot()).find<TextNode>({text: { value: "some part of the text", match: "partial-ignoreCase" }})!; // match: "partial" | "exact" | "partial-ignoreCase" | "exact-ignoreCase"
console.log(text.stringValue());

From the DOM Using the XPath API

TypeScript

import { TextNode } from "@ziord/robin";
// find any text
const text = robin.path(robin.getRoot()).queryOne<TextNode>("(//text())[1]")!;
console.log(text.stringValue());
console.log(text.prettify());

Finding a Comment

TypeScript

import { CommentNode } from "@ziord/robin";
// find a comment
const comment = robin.dom(robin.getRoot()).find<CommentNode>({comment: { value: "some part of the comment", match: "partial" }})!; // match: "partial" | "exact" | "partial-ignoreCase" | "exact-ignoreCase"
console.log(comment.stringValue());
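The `match` option used in the text and comment lookups above accepts four modes. The sketch below is not Robin's implementation. It is a standalone illustration of how such modes are conventionally interpreted, assuming that "partial" means substring containment, "exact" means full equality, and the "-ignoreCase" suffix makes the comparison case-insensitive. The helper name `matchesValue` is hypothetical.

```javascript
// Hypothetical helper, not part of @ziord/robin: models the assumed
// meaning of the four `match` modes.
function matchesValue(content, value, match) {
  const ignoreCase = match.endsWith("-ignoreCase");
  const haystack = ignoreCase ? content.toLowerCase() : content;
  const needle = ignoreCase ? value.toLowerCase() : value;

  // "partial*" checks containment, "exact*" checks full equality
  return match.startsWith("partial")
    ? haystack.includes(needle)
    : haystack === needle;
}

console.log(matchesValue("Some Value", "some", "partial")); // false
console.log(matchesValue("Some Value", "some", "partial-ignoreCase")); // true
console.log(matchesValue("Some Value", "some value", "exact")); // false
console.log(matchesValue("Some Value", "some value", "exact-ignoreCase")); // true
```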

Extracting Texts From an Element

JavaScript

// get the element's textual content
let text = robin.dom(element).text(); // string
console.log(text);

// alternatively
text = element.stringValue();
console.log(text);

See the web scraper example for more usage.

Documentation

Check out the docs. You can also take a look at some examples here.

Quick Questions

If you have small questions that you feel aren't worth opening an issue for, use the project's discussions.

Installation

Simply run the following command in your terminal:

npm install @ziord/robin

Contributing

Contributions are welcome! See the contribution guidelines to learn more. Thanks!

Reporting Bugs/Requesting Features

Please open an issue. Check out the issue template.

License

Robin is distributed under the MIT License.
