Sitemap filter does not actually filter pages #7256

Closed
1 task
AkashRajpurohit opened this issue May 31, 2023 · 5 comments · Fixed by #7263

@AkashRajpurohit
Contributor

What version of astro are you using?

2.5.6

Are you using an SSR adapter? If so, which one?

Cloudflare

What package manager are you using?

pnpm

What operating system are you using?

Mac

What browser are you using?

Firefox

Describe the Bug

The sitemap filter option does not actually filter pages based on the callback provided.
The shared repro is a simple Astro blog example in which the filter option is set to exclude /api/ routes; however, the generated sitemap still includes the route /api/test.
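For reference, here is a minimal sketch of the behavior expected from such a filter. The page URLs and the name `excludeApi` are hypothetical stand-ins; the only assumption about the integration is that the `filter` callback receives each page URL as a string and returns `false` to drop it.

```javascript
// Hypothetical page URLs, standing in for what the integration would collect.
const pages = [
  'https://example.com/',
  'https://example.com/blog/first-post/',
  'https://example.com/api/test',
];

// The predicate one would pass as `filter` to sitemap():
// return false to leave a page out of the sitemap.
const excludeApi = (page) => !page.includes('/api/');

// Expected result: every page except the /api/ route.
const kept = pages.filter(excludeApi);
console.log(kept);
```

With a working filter, /api/test should not appear in the output, which is what the repro fails to show.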

Link to Minimal Reproducible Example

https://stackblitz.com/edit/astro-sitemap-repro

Participation

  • I am willing to submit a pull request for this issue.
@SerekKiri
Contributor

Confirmed, I can reproduce the issue. As a workaround, you can use serialize if you need to solve it right away.

integrations: [mdx(), sitemap({
  serialize(item) {
    if (item.url.includes('/api/')) {
      return undefined;
    }
    return item;
  },
})],

@andremralves
Contributor

andremralves commented Jun 1, 2023

Probably the problem is in this line:

pageUrls = Array.from(new Set([...pageUrls, ...routeUrls, ...(customPages ?? [])]));

I will work on a solution.
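To illustrate the kind of ordering problem that line could cause, here is a rough sketch. It is not the integration's actual source: only the merge line is taken from the comment above, and the sample data, the `filter` predicate, and the result names are hypothetical. It assumes the filter is applied before the merge, which is a guess at the cause.

```javascript
// Hypothetical user-supplied filter and inputs.
const filter = (page) => !page.includes('/api/');
const pages = ['https://example.com/', 'https://example.com/api/test'];
const routeUrls = ['https://example.com/api/test'];
const customPages = undefined;

// Filter first, merge afterwards: the entry coming from routeUrls
// is never passed through the filter, so it slips back in.
let pageUrls = pages.filter(filter);
pageUrls = Array.from(new Set([...pageUrls, ...routeUrls, ...(customPages ?? [])]));
const filterThenMerge = pageUrls;

// Merge first, filter last: every source goes through the filter.
const mergeThenFilter = Array.from(
  new Set([...pages, ...routeUrls, ...(customPages ?? [])])
).filter(filter);

console.log(filterThenMerge);
console.log(mergeThenFilter);
```

If this guess is right, applying the filter after all URL sources have been merged would be one way to fix it.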

@xirkus

xirkus commented Aug 25, 2024

This still doesn't work for me regardless of the number of filter arguments provided (1..N). When I tested with the filter matching the complete site URL, it still generated a full list of pages.

Astro Version: 4.14.4
Site Map Version: 3.1.6

Example:

site: 'https://site.url',
integrations: [
  tailwind(),
  sitemap({
    filter: (page) => page !== 'https://site.url/',
  }),
  mdx(),
]

This will generate a full set of <url> entries for the site.

@rnwolf

rnwolf commented Sep 22, 2024

Hi @xirkus

The following works for me:

  integrations: [
    react(),
    sitemap({
      filter: (page) =>
        page !== 'https://www.example.com/contact_problem' &&
        page !== 'https://www.example.com/test-a' &&
        page !== 'https://www.example.com/test-b' &&
        page !== 'https://www.example.com/elements' &&
        page !== 'https://www.example.com/contact_success',
    }),
    tailwind({

with "astro": "^4.11.3", and "@astrojs/sitemap": "^3.1.6",
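As the list of excluded pages grows, the chain of comparisons can be replaced by a lookup. A minimal sketch, assuming the same five URLs as above; the names `excluded` and `filter` are only illustrative, and the arrow function is what would be passed as the `filter` option:

```javascript
// Pages to leave out of the sitemap (same URLs as in the config above).
const excluded = new Set([
  'https://www.example.com/contact_problem',
  'https://www.example.com/test-a',
  'https://www.example.com/test-b',
  'https://www.example.com/elements',
  'https://www.example.com/contact_success',
]);

// Return false for excluded pages, true for everything else.
const filter = (page) => !excluded.has(page);

console.log(filter('https://www.example.com/test-a'));
console.log(filter('https://www.example.com/about'));
```

Note that this is an exact string match, just like the `!==` version: a URL that differs only by a trailing slash will not be excluded.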

@rnwolf

rnwolf commented Sep 22, 2024

OK, based on the Astro docs, I have made the following changes to astro.config.mjs:

  integrations: [
    react(),
    sitemap({
      serialize(item) {
        // Update this pattern to exclude more pages from the sitemap
        if (/contact_.*[a-z]|test-[a-z]|elements/.test(item.url)) {
          return undefined;
        }
        // Make sure that any blog post with today's date in its URL, and the blog index page, get a lastmod date
        const dateString = `${new Date().toLocaleString("en-CA", { timeZone: "Europe/London" }).slice(0, 10)}.*|blog`;
        if (new RegExp(dateString, 'i').test(item.url)) {
          item.changefreq = 'daily';
          item.lastmod = new Date();
          item.priority = 0.9;
        }
        return item;
      },
    }),

The idea is that when I commit a blog post, I prefix the filename with today's date in the format yyyy-mm-dd_some_slug.mdx. This results in the sitemap having a lastmod value for that blog post. Hopefully Google will then index the page sooner than it would otherwise.
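The date check can also be pulled out into small helpers so it is testable on its own. This is only a sketch of the same idea: `londonDateString` and `shouldStampLastmod` are hypothetical names, and it builds the yyyy-mm-dd string with `Intl.DateTimeFormat#formatToParts` rather than slicing a locale-formatted string.

```javascript
// Build a yyyy-mm-dd string for the given instant in the Europe/London time zone.
function londonDateString(date = new Date()) {
  const parts = new Intl.DateTimeFormat('en-GB', {
    timeZone: 'Europe/London',
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(date);
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// True when the URL contains today's date (London time) or the word "blog",
// mirroring the regular expression used in serialize() above.
function shouldStampLastmod(url, date = new Date()) {
  return url.includes(londonDateString(date)) || /blog/i.test(url);
}

const noon = new Date('2024-09-22T12:00:00Z');
console.log(londonDateString(noon));
console.log(shouldStampLastmod('https://www.example.com/posts/2024-09-22_some_slug/', noon));
```

Inside serialize(), the condition would then read `if (shouldStampLastmod(item.url))`. Taking the date as a parameter makes it easy to check edge cases such as a build that runs shortly before midnight UTC.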
