feat: select publication date in split filter #85

Merged
merged 3 commits into master from split-with-publish-date
Mar 15, 2024

shouya
Owner

@shouya shouya commented Mar 15, 2024

Previously, the split filter could only generate a feed whose items have a title, link, author, and description.

However, users would also benefit from a publication date field. It lets feed readers sort items by date, and it lets the merge filter order items correctly.

This PR adds a new config field to the split filter, date_selector, which lets users select the publication date for each article. The selector is optional and uses the same syntax as the other selector fields.

Note that the code uses a heuristic to guess the publication date from the selected elements. The heuristic is as follows:

  1. check the element's textContent for a valid date string
  2. check each of the element's attributes for a valid date string

The following formats are recognized as valid date strings:

  • RFC 3339 (a profile of ISO 8601): 1996-12-19T16:39:57-08:00
  • RFC 2822: Thu, 19 Dec 1996 16:39:57 -0800
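
The heuristic and the two formats above can be illustrated with a rough Python sketch. This is not the code from this PR: the Element class, parse_date, and guess_date are hypothetical names invented for the illustration, and the sketch assumes the attributes are only consulted when the text content does not parse.

```python
from dataclasses import dataclass, field
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


@dataclass
class Element:
    """Hypothetical stand-in for one element matched by date_selector."""
    text_content: str = ""
    attributes: Dict[str, str] = field(default_factory=dict)


def parse_date(raw: str) -> Optional[datetime]:
    """Return a datetime if raw looks like an RFC 3339 or RFC 2822 date."""
    text = raw.strip()
    if not text:
        return None

    # RFC 3339: fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    iso = text[:-1] + "+00:00" if text.endswith(("Z", "z")) else text
    try:
        parsed = datetime.fromisoformat(iso)
        if parsed.tzinfo is not None:  # RFC 3339 always carries an offset
            return parsed
    except ValueError:
        pass

    # RFC 2822: older Python versions raise TypeError on bad input,
    # newer ones raise ValueError.
    try:
        return parsedate_to_datetime(text)
    except (TypeError, ValueError, IndexError):
        return None


def guess_date(element: Element) -> Optional[datetime]:
    """Step 1: try the text content. Step 2: try each attribute value."""
    parsed = parse_date(element.text_content)
    if parsed is not None:
        return parsed
    for value in element.attributes.values():
        parsed = parse_date(value)
        if parsed is not None:
            return parsed
    return None


# An element whose visible text is not a full date but whose attribute is,
# similar in spirit to a <relative-time> element (values are made up).
release = Element("Mar 15", {"class": "no-wrap", "datetime": "2024-03-15T11:48:00Z"})
print(guess_date(release))
```

Checking the text first and the attributes second means a selector can target either an element that prints a machine-readable date or one that only stores it in an attribute, without any extra configuration.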

Here's an example that uses date_selector to extract an RSS feed from a GitHub releases page:

  - path: /github-release.xml
    note: Generate release feed for any GitHub repo
    filters:
      - note: Try input <code>https://github.com/shouya/rss-funnel/releases</code> in the source textbox!
      - split:
          title_selector: ".Box .f1 .Link"
          link_selector: ".Box .f1 .Link"
          description_selector: ".Box .markdown-body"
          date_selector: "section .mr-3 relative-time"
      - modify_post: |
          const project_name = feed.title.split(" · ")[1];
          post.title = `New release for ${project_name}: ${post.title}`;

@shouya shouya merged commit 5e70d60 into master Mar 15, 2024
2 checks passed
@shouya shouya deleted the split-with-publish-date branch March 15, 2024 11:48