A helper for the Script Kit automation app to generate a list of choices from an HTML document.
I made this to scratch an itch of mine. There are sites I visit from time to time to check out what's new. Many of these sites don't provide an API. This helper allows you to 'API-fy' a regular HTML page using CSS selectors and spits out the data in a format that can be readily consumed by John Lindquist's very cool Kit app.
cd into your ~/.kenv folder:
npm install @aseemtaneja/kit-list@latest
Create a new script, e.g. ~/.kenv/scripts/frontend-masters.js:
// Name: Frontend Masters
// Description: Browse courses from frontendmasters.com
// Author: Aseem Taneja
// Twitter: @aseemtaneja
import '@johnlindquist/kit';
const list = await npm('@aseemtaneja/kit-list');
const { choices } = await list('https://frontendmasters.com/courses/', {
  containerSelector: '.MediaItem',
  hrefSelector: 'h2 a',
  descriptionSelector: '.description',
  metaSelector: '.Instructor .name',
});
const itemUrl = await arg('Go to', choices);
await $`open ${itemUrl}`;
Create a new script, e.g. ~/.kenv/scripts/courses.js:
// Name: List
// Description: Browse latest courses from your favourite sites
// Author: Aseem Taneja
// Twitter: @aseemtaneja
// Shortcode: course
import '@johnlindquist/kit';
const list = await npm('@aseemtaneja/kit-list');
const PAGES = [
  {
    name: '🥋 frontendmasters',
    description: 'Courses from frontendmasters.com',
    value: {
      url: 'https://frontendmasters.com/courses/',
      selectors: {
        containerSelector: '.MediaItem',
        hrefSelector: 'h2 a',
        descriptionSelector: '.description',
        metaSelector: '.Instructor .name',
      },
    },
  },
  {
    name: '🥚 egghead',
    description: 'Courses from egghead.io',
    value: {
      url: 'https://egghead.io/q?sortBy=created',
      selectors: {
        containerSelector: 'a[href^="/playlists"]',
        titleSelector: '[data-egghead-card-body] h3',
        descriptionSelector: '[data-egghead-card-author] > span',
      },
    },
  },
];
const { url: pageUrl, selectors, options } = await arg('Select', PAGES);
const { choices } = await list(pageUrl, selectors, options);
const itemUrl = await arg('Go to', choices);
await $`open ${itemUrl}`;
function list(
  url: string,
  selectors: ItemSelectors,
  listOptions?: ListOptions | undefined
): Promise<{
  data: ItemData[];
  choices: Choice<string>[];
}>
url - a valid http/https URL of the page to be scraped.
selectors - an object with the following CSS selectors:
containerSelector (required) - selector for a wrapper element (doubles as the url selector if no hrefSelector is specified, which is useful for 'card' layouts where the 'card' is an anchor tag). 🚨 All other selectors use the container element (specified by the containerSelector) as context.
hrefSelector (optional) - selector for the anchor tag which specifies the item url (doubles as the title selector if no titleSelector is specified).
titleSelector (optional) - selector for item title.
descriptionSelector (optional) - selector for item description.
metaSelector (optional) - selector for item meta (prepended to the title).
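The fallback rules for the selectors can be modelled in a few lines. This is only an illustrative sketch of the documented behaviour, not the library's actual implementation: resolveSelectors, urlFrom and titleFrom are hypothetical names, and what happens when neither hrefSelector nor titleSelector is given is not stated here, so the sketch reports null in that case.

```javascript
// Hypothetical helper: models the documented selector fallbacks only.
// It is NOT part of @aseemtaneja/kit-list.
function resolveSelectors(selectors) {
  const { containerSelector, hrefSelector, titleSelector } = selectors;
  if (!containerSelector) {
    throw new Error('containerSelector is required');
  }
  return {
    // The container doubles as the url source when no hrefSelector is given.
    urlFrom: hrefSelector ?? 'container',
    // The href element doubles as the title source when no titleSelector is given.
    // Neither given: behaviour is not documented, so the sketch reports null.
    titleFrom: titleSelector ?? hrefSelector ?? null,
  };
}

// Selector sets taken from the two example scripts.
const frontendMasters = resolveSelectors({
  containerSelector: '.MediaItem',
  hrefSelector: 'h2 a',
});
const egghead = resolveSelectors({
  containerSelector: 'a[href^="/playlists"]',
  titleSelector: '[data-egghead-card-body] h3',
});

console.log(frontendMasters, egghead);
```

In the egghead case the card itself is the anchor tag, so the container supplies the item url and only the title needs its own selector.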
listOptions (optional) - an object with the following properties:
meta - object that controls how item meta appears in the choices.
  hide - hide item meta.
  afterTitle - append meta to title (instead of prepending it).
translate - an options object as expected by @vitalets/google-translate-api, which can be used to translate the item titles and descriptions to the desired language.
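To make the effect of the meta options concrete, here is a rough sketch of how hide and afterTitle could shape a choice name. formatName is a hypothetical helper, and the ' - ' separator is an assumption; the library may join title and meta differently.

```javascript
// Hypothetical sketch of how the `meta` options could shape a choice name.
// The ' - ' separator is an assumption, not taken from the library.
function formatName(title, meta, metaOptions = {}) {
  const { hide = false, afterTitle = false } = metaOptions;
  // With hide set (or no meta found), the name is just the title.
  if (hide || !meta) return title;
  // Default: meta is prepended; afterTitle appends it instead.
  return afterTitle ? `${title} - ${meta}` : `${meta} - ${title}`;
}

console.log(formatName('Course title', 'Instructor name'));
console.log(formatName('Course title', 'Instructor name', { afterTitle: true }));
console.log(formatName('Course title', 'Instructor name', { hide: true }));
```

In a real script these flags would be passed as the third argument to list, for example { meta: { afterTitle: true } }.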
A Promise which resolves to an object with the following properties:
choices - an array of 'choice' objects (in the structure expected by the Kit app), each with the following properties:
name - the item title and item meta (if any)
description - the item description (if any) or the item url
value - the item url
data - an array of 'item data' objects. Each item in the array has the following properties:
title - the item title or 'No title' if one couldn't be found
url - the item url or the page url if no item url could be found
description - the item description or an empty string
meta - the item meta or an empty string
💡 data can be used to format your results the way you want if what choices returns does not cut it for you.
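As a minimal sketch of that idea, the function below builds Kit-style choices from data and moves the meta into the description line instead of the name. toCustomChoices, the ' | ' separator and the sample items are all made up for illustration; only the field names (title, url, description, meta and name, description, value) come from the lists above.

```javascript
// Hypothetical formatter: builds Kit-style choices from the `data` array.
// The item shape (title, url, description, meta) follows the documented fields.
function toCustomChoices(data) {
  return data.map(({ title, url, description, meta }) => ({
    name: title,
    // Show meta next to the description; fall back to the url if both are empty.
    description: [meta, description].filter(Boolean).join(' | ') || url,
    value: url,
  }));
}

// Made-up sample items standing in for what list() might return.
const sampleData = [
  { title: 'Sample course', url: 'https://example.com/a', description: 'An intro', meta: 'Jane Doe' },
  { title: 'No title', url: 'https://example.com/', description: '', meta: '' },
];

const customChoices = toCustomChoices(sampleData);
console.log(customChoices);
```

In a Kit script, the result would be passed to arg in place of choices, e.g. after const { data } = await list(pageUrl, selectors), call arg('Go to', toCustomChoices(data)).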