fix(content-blog): links in feed should be absolute #9151

VinceCYLiao · 2023-07-17T08:52:24Z

Pre-flight checklist

I have read the Contributing Guidelines on pull requests.
If this is a code change: I have written unit tests and/or added dogfooding pages to fully verify the new behavior.
If this is a new API or substantial change: the PR has an accompanying issue (closes #0000) and the maintainers have approved on my working plan.

Motivation

Test Plan

Added two MDX files to the dogfooding docs:
website/_dogfooding/_blog tests/2023-07-19-a.mdx
website/_dogfooding/_blog tests/2023-07-19-b.mdx

Inside 2023-07-19-a.mdx are three links

[absolute full url](https://github.com/facebook/docusaurus)

[absolute url with implicit domain name](/tests/blog/2023/07/19/b)

[relative url](2023-07-19-b.mdx)

Visit /tests/blog/feed.json:
The 1st link stays untouched.
The 2nd link resolves to "https://docusaurus.io/tests/blog/2023/07/19/b".
The 3rd link also resolves to "https://docusaurus.io/tests/blog/2023/07/19/b".
All three results are correct.

Test links

Deploy preview: https://deploy-preview-9151--docusaurus-2.netlify.app/blog/feed.json

Related issues/PRs

issue 9136

facebook-github-bot · 2023-07-17T08:52:29Z

Hi @VinceCYLiao!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

netlify · 2023-07-17T08:57:26Z

✅ [V2]

Name	Link
🔨 Latest commit	`da0fb44`
🔍 Latest deploy log	https://app.netlify.com/sites/docusaurus-2/deploys/64cbd22d4923da00086e8c9f
😎 Deploy Preview	https://deploy-preview-9151--docusaurus-2.netlify.app

To edit notification comments on pull requests, go to your Netlify site configuration.

github-actions · 2023-07-17T08:58:07Z

⚡️ Lighthouse report for the deploy preview of this PR

URL	Performance	Accessibility	Best Practices	SEO	PWA	Report
/	🟠 83	🟢 97	🟠 83	🟢 100	🟠 89	Report
/docs/installation	🟠 76	🟢 100	🟠 83	🟢 100	🟠 89	Report

Josh-Cena

The current solution is too error-prone because it includes too much custom logic. Plus it does not work with all relative links. Consider the following:

<a href="another-post">link</a>

If the page is at /blog/2020/09/13/current-post, then the href should point to /blog/2020/09/13/another-post, not /another-post. I.e. the resolver needs to be aware of the current URL, not just the site's URL.

Also, I would prefer not manually joining URLs in any case. Why not elm.attribs.href = String(new URL(elm.attribs.href, currentPageURL))?
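To make the point about the base URL concrete, here is a minimal sketch of how the standard `URL` constructor resolves the same href against two different bases. The `example.com` domain and the `resolveHref` helper are assumptions made for illustration; they are not taken from this PR.

```typescript
// Illustrative values only: the domain and the post path are assumptions.
const siteUrl = 'https://example.com/';
const currentPageUrl = 'https://example.com/blog/2020/09/13/current-post';

// Hypothetical helper: resolve an href against a base with the URL constructor.
function resolveHref(href: string, base: string): string {
  return String(new URL(href, base));
}

// A bare relative href replaces the last path segment of the base.
const againstSite = resolveHref('another-post', siteUrl);
const againstPage = resolveHref('another-post', currentPageUrl);

// Root-relative and fully absolute hrefs ignore the path of the base.
const rootRelative = resolveHref('/blog/2020/09/13/another-post', currentPageUrl);
const alreadyAbsolute = resolveHref(
  'https://github.com/facebook/docusaurus',
  currentPageUrl,
);

console.log(againstSite); // https://example.com/another-post
console.log(againstPage); // https://example.com/blog/2020/09/13/another-post
console.log(rootRelative); // https://example.com/blog/2020/09/13/another-post
console.log(alreadyAbsolute); // https://github.com/facebook/docusaurus
```

Note that a trailing slash on the base changes the result: with a base ending in `/current-post/`, the bare href would resolve under `current-post/` instead of replacing it.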

VinceCYLiao · 2023-07-17T09:32:06Z

Thanks for your comments. Sorry, I misclicked the request-review button. I'll try to provide a better solution.

VinceCYLiao · 2023-07-18T16:59:08Z

After reading the Node.js docs for the URL class, I found that the URL class can handle this issue by itself.
I have revised my code accordingly, built the website locally, and checked that the hrefs in the generated feed files are resolved correctly.

@Josh-Cena Please review at your convenience. Thank you!
feed.json of deploy preview

VinceCYLiao · 2023-07-18T17:23:46Z

Sorry, I forgot to run yarn test and didn't notice that so many tests would fail due to the changes. I'll look into how to fix the tests if you find my solution fine.

Josh-Cena · 2023-07-18T17:26:49Z

I think it looks good! In fact the test changes look expected to me. I reckon you don't need to resolve paths that are just anchor links, only ones that actually point to another page. We would also need test cases (either adding a new test post file, or adding a link to the existing post file)
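One possible reading of the anchor-link suggestion, sketched under assumptions: `absolutizeHref` is a hypothetical helper and the `example.com` URL is illustrative. Whether the merged code treats hash-only links this way is not shown in this thread.

```typescript
// Hypothetical helper, not the code from this PR.
function absolutizeHref(href: string, pageUrl: string): string {
  // Hash-only links point inside the current document, so leave them untouched.
  if (href.startsWith('#')) {
    return href;
  }
  // Everything else is resolved against the URL of the post itself.
  return new URL(href, pageUrl).href;
}

// Illustrative page URL (an assumption).
const pageUrl = 'https://example.com/blog/post';

const anchorOnly = absolutizeHref('#heading', pageUrl);
const otherPageWithAnchor = absolutizeHref('other#heading', pageUrl);

console.log(anchorOnly); // #heading
console.log(otherPageWithAnchor); // https://example.com/blog/other#heading
```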

VinceCYLiao · 2023-07-19T08:26:57Z

I have added a test case and updated the test plan. Please let me know if it's ok for you. Thanks!

Josh-Cena

Could you add the tests to https://github.com/facebook/docusaurus/tree/main/packages/docusaurus-plugin-content-blog/src/__tests__/__fixtures__/website instead? No one is going to look at the feed of the dogfooding blog.

VinceCYLiao · 2023-07-21T08:09:24Z

I was thinking of parsing the links from the feeds to check whether they are correctly resolved, but found it hard to do and perhaps somewhat meaningless, since the links in the feeds come from the object returned by the defaultCreateFeedItems function.
So my idea is to just write a test case for the defaultCreateFeedItems function, checking that full absolute links stay untouched while other links are correctly prefixed.
I don't know whether the test makes sense to you, or whether it is bad to export the defaultCreateFeedItems function just for a unit test.

slorber

Thanks

The implementation looks good. 👍

But the way it is tested looks surprisingly complex to me, and unit tests are not passing.

We also need to absolutize image URLs.

Let me know if you need help to figure out how to test that.

slorber · 2023-07-21T13:40:16Z

packages/docusaurus-plugin-content-blog/src/feed.ts

+      $(`div#${blogPostContainerID} a`).each((_, elm) => {
+        const {href} = elm.attribs;
+        if (href) {
+          elm.attribs.href = String(new URL(href, link));
+        }
+      });


LGTM 👍

We will also need to convert image links to absolute, see
#9136 (comment)

https://validator.w3.org/feed/docs/warning/ContainsRelRef.html

Done. Image links now are absolutized.

slorber · 2023-07-21T13:50:48Z

packages/docusaurus-plugin-content-blog/src/__tests__/feed.test.ts

@@ -196,3 +228,95 @@ describe.each(['atom', 'rss', 'json'])('%s', (feedType) => {
    fsMock.mockClear();
  });
 });
+
+describe('Test defaultCreateFeedItems', () => {


That test looks super complex to me and I don't understand why.

Just call createBlogFeedFiles and take a snapshot: we'll review the snapshot and validate it contains what we expect

I don't think I need to add a new test case; I just need to update the snapshot. Am I correct?

slorber · 2023-07-21T13:51:17Z

packages/docusaurus-plugin-content-blog/src/__tests__/feed.test.ts

+function isFullAbsolutePath(str: string) {
+  const domain = 'https://domain.com';
+  const {origin} = new URL(str, domain);
+  return origin !== domain;
+}
+
+async function generateLinksOfBlogPosts(outDir: string, blogPosts: BlogPost[]) {
+  const linksOfBlogPosts: {[postId: string]: string[]} = {};
+  const pathOfFile = path.join(outDir, 'blog');
+  const promises = blogPosts.map(async (post) => {
+    try {
+      const content = await readOutputHTMLFile(post.id, pathOfFile, true);
+      const $ = cheerioLoad(content);
+      const anchorElements = $(`div#${blogPostContainerID} a`);
+      if (anchorElements.length > 0) {
+        const href = anchorElements.map((_, elm) => elm.attribs.href).toArray();
+        linksOfBlogPosts[post.id] = href;
+      }
+    } catch {
+      // post is a draft
+    }
+  });
+  await Promise.all(promises);
+  return linksOfBlogPosts;
+}


I don't think we need that complexity in our tests

The test case is removed

VinceCYLiao · 2023-07-23T00:35:54Z

Thanks

The implementation looks good. 👍

But the way it is tested looks surprisingly complex to me, and unit tests are not passing.

We also need to absolutize image URLs.

Let me know if you need help to figure out how to test that.

Thanks! I'll absolutize the image URLs as well.

Regarding the test, I'm thinking of creating a new mock file that contains anchor elements with absolute, relative, and anchor links, and image elements with absolute and relative source URLs. The test case would then just call createBlogFeedFiles and take a snapshot.

VinceCYLiao · 2023-07-23T08:38:33Z

Tested locally; all unit tests pass.

slorber

I don't understand how it works anymore 😅

@Josh-Cena do you remember what updates the build-snap folder exactly? Is this updated manually?

@VinceCYLiao how did this PR generate that new src/__tests__/__fixtures__/website/build-snap/blog/blog-with-links/index.html file?

The CI is failing and snapshots are not easy to review 😓
Surprisingly unit tests are passing locally, but not on GitHub action 🤷‍♂️

slorber · 2023-07-27T11:04:17Z

...n-content-blog/src/__tests__/__fixtures__/website/build-snap/blog/blog-with-links/index.html

@@ -0,0 +1,31 @@
+<!doctype html>


How was this file generated?

I created blog-with-links.mdx in the website/blog folder and ran yarn:build:website:blogOnly. If this is not how the fixture files are created, please let me know the correct way to do it.

slorber · 2023-07-27T11:04:32Z

...s/docusaurus-plugin-content-blog/src/__tests__/__fixtures__/website/blog/blog-with-links.mdx

+import dino from "../static/img/docusaurus.png";
+import useBaseUrl from '@docusaurus/useBaseUrl';
+


ES imports are supposed to come after front matter

The order of imports is now correct. I also moved the front matter to the beginning of the file. The misplaced front matter seems to be why the tests failed.

slorber · 2023-07-27T11:05:08Z

packages/docusaurus-plugin-content-blog/src/feed.ts

+              elm.attribs.srcset = srcset
+                .split(',')
+                .map((s) => {
+                  const [imageURL, ...descriptors] = s.trim().split(/\s+/);
+                  const newImageURL = new URL(imageURL ?? '', link).href;
+                  return [newImageURL, ...descriptors].join(' ');
+                })
+                .join(', ');


Looks a bit unsafe/risky, maybe introduce a dedicated lib to manipulate srcset reliably instead? see https://www.npmjs.com/package/srcset

Thanks. I'll look into it and revise my code

Done. Since the latest version of srcset is pure ESM, I had to use the previous version.
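To illustrate why comma-splitting a srcset value can be risky, here is a self-contained sketch that mirrors the splitting approach from the diff above and runs it on two inputs. The function name `absolutizeSrcsetNaively`, the `example.com` base, and the image paths are assumptions for illustration; this is not the code that was merged.

```typescript
// Hypothetical reimplementation of the comma-splitting approach, for illustration.
function absolutizeSrcsetNaively(srcset: string, base: string): string {
  return srcset
    .split(',')
    .map((candidate) => {
      // First token is treated as the URL, the rest as descriptors (1x, 480w, ...).
      const [imageUrl, ...descriptors] = candidate.trim().split(/\s+/);
      const absoluteUrl = new URL(imageUrl ?? '', base).href;
      return [absoluteUrl, ...descriptors].join(' ');
    })
    .join(', ');
}

// Illustrative base URL (an assumption).
const base = 'https://example.com/blog/post';

// Works as intended when no URL contains a comma.
const simple = absolutizeSrcsetNaively('img/a.png 1x, /img/b.png 2x', base);
console.log(simple);
// https://example.com/blog/img/a.png 1x, https://example.com/img/b.png 2x

// A comma inside a URL is cut in two, producing two bogus candidates.
const withComma = absolutizeSrcsetNaively('/img/a,b.png 1x', base);
console.log(withComma);
// https://example.com/img/a, https://example.com/blog/b.png 1x
```

The second call shows the failure mode: a single candidate whose URL contains a comma is rewritten into two unrelated URLs, which is the kind of edge case a dedicated srcset parser is meant to handle.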

Josh-Cena · 2023-07-28T15:01:15Z

do you remember what updates the build-snap folder exactly? Is this updated manually?

I think I added this part of test but I don't remember how it works either. My guess is it's manual.

VinceCYLiao · 2023-07-29T01:44:41Z

I don't understand how it works anymore 😅

@Josh-Cena do you remember what updates the build-snap folder exactly? Is this updated manually?

@VinceCYLiao how did this PR generate that new src/__tests__/__fixtures__/website/build-snap/blog/blog-with-links/index.html file?

The CI is failing and snapshots are not easy to review 😓 Surprisingly unit tests are passing locally, but not on GitHub action 🤷‍♂️

I just ran the tests again and they all passed. Sorry for the confusion; I'll look into why the tests are failing on GitHub.

Josh-Cena

The implementation looks great to me! Just one stylistic suggestion.

packages/docusaurus-plugin-content-blog/src/feed.ts

Co-authored-by: Joshua Chen <sidachen2003@gmail.com>

slorber · 2023-08-03T14:58:08Z

Hold on, I'm fixing the build-snap generation thing in another PR before merging this PR

Josh-Cena · 2023-08-03T15:02:41Z

Note that you would also want to rebase to get rid of the extra commits

VinceCYLiao requested review from slorber, lex111 and Josh-Cena as code owners July 17, 2023 08:52

Josh-Cena changed the title ~~fix: #9136 Links in blog posts rendered in a feed (rss/atom/json) should be absolute~~ fix(content-blog): links in feed should be absolute Jul 17, 2023

Josh-Cena requested changes Jul 17, 2023

VinceCYLiao requested a review from Josh-Cena July 17, 2023 09:26

facebook-github-bot added the CLA Signed Signed Facebook CLA label Jul 17, 2023

fix: facebook#9136:fix(content-blog): links in feed should be absolute

b607956

VinceCYLiao force-pushed the fix/issue#9136 branch from 77fff54 to b607956 Compare July 18, 2023 16:35

Josh-Cena requested changes Jul 19, 2023

feat: add test to check if links in feed are resolved correctly (+2 squashed commits) Squashed commits: [2db488373] chore: add a new file to test href resolving [6c18cea] docs: added to test if href resolved correctly in feed

a339fc9

VinceCYLiao force-pushed the fix/issue#9136 branch from 6c18cea to a339fc9 Compare July 21, 2023 07:53

VinceCYLiao requested a review from Josh-Cena July 21, 2023 08:10

slorber requested changes Jul 21, 2023

VinceCYLiao added 5 commits July 23, 2023 15:47

feat: absolutize img URLs

e65995e

chore: remove test case added in previous commit

aaa869a

chore: update fixture

af9cbe2

chore: add for cSpell

10ba5cb

feat: update test cases as added on new fixture file and update snapshot

6a8345d

VinceCYLiao requested a review from slorber July 23, 2023 08:41

slorber added the pr: bug fix This PR fixes a bug in a past release. label Jul 27, 2023

slorber requested changes Jul 27, 2023

Josh-Cena mentioned this pull request Jul 28, 2023

Non-https URLs http://schema.org/Blog and BlogPosting in classic theme #9181

Closed

7 tasks

VinceCYLiao added 5 commits August 3, 2023 13:24

chore: correct order of front matter and imports

84493db

feat: use srcset lib to parse urls in img's srcset and make them absolute

f382af6

chore: update snapshot

08e17db

chore: remove console.log

df68905

chore: remove unneeded export declaration

646d22a

VinceCYLiao force-pushed the fix/issue#9136 branch from 0297631 to 646d22a Compare August 3, 2023 05:44

VinceCYLiao requested a review from slorber August 3, 2023 05:53

Josh-Cena approved these changes Aug 3, 2023

slorber and others added 2 commits August 3, 2023 16:12

Update packages/docusaurus-plugin-content-blog/src/feed.ts

d34fda6

Co-authored-by: Joshua Chen <sidachen2003@gmail.com>

Update packages/docusaurus-plugin-content-blog/src/feed.ts

0016a18

Co-authored-by: Joshua Chen <sidachen2003@gmail.com>

slorber mentioned this pull request Aug 3, 2023

test(blog-plugin): fix ability to generate proper blog website fixture build snapshot #9195

Merged

simplify impl

2158d7c

slorber force-pushed the fix/issue#9136 branch from 8099300 to 2158d7c Compare August 3, 2023 16:02

slorber added 5 commits August 3, 2023 18:04

Merge branch 'main' into fix/issue#9136

a62b99f

replace docusaurus.png

c04e223

improve blog-with-links.mdx file links

0afae8b

regenerate build-snap

1f8dfad

Update feed snapshots

da0fb44

slorber merged commit 109ab0c into facebook:main Aug 3, 2023

This was referenced Oct 19, 2023

chore: v3.0.0-rc.0 release #9418

Merged

chore: v3.0.0-rc.1 release #9453

Merged

Abuchtela mentioned this pull request May 20, 2024

[Snyk] Upgrade @docusaurus/preset-classic from 3.0.0 to 3.2.1 Abuchtela/aptos-core#42

Closed
