Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generating a sitemap for [...slug].astro pages with slugs starting with / results in broken URLs #9946

Closed
1 task done
moose96 opened this issue Feb 2, 2024 · 9 comments
Closed
1 task done
Assignees
Labels
- P2: nice to have Not breaking anything but nice to have (priority)

Comments

@moose96
Copy link
Contributor

moose96 commented Feb 2, 2024

Astro Info

Astro                    v4.3.1
Node                     v18.18.0
System                   Linux (x64)
Package Manager          unknown
Output                   static
Adapter                  none
Integrations             @astrojs/sitemap

If this issue only occurs in one browser, which browser is a problem?

No response

Describe the Bug

I am creating my own integration with CMS and I've decided to store all slugs inside content management system. Thus, my pages folder contains only one page - [...slug].astro. I have complicated logic in getStaticPaths but it migth be simplified to this:

export async function getStaticPaths() {
  return [
    {
      params: {
        slug: undefined,
      }
    },
    {
      params: {
        slug: '/blog'
      }
    },
    {
      params: {
        slug: '/test'
      }
    }
  ];
}

Let's assume my astro.config.mjs looks like:

export default defineConfig({
  site: 'https://example.com',
  integrations: [sitemap()],
});

In this case I get sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
    <url>
        <loc>https://blog/</loc> <!-- <- wrong url here -->
    </url>
    <url>
        <loc>https://example.com/</loc>
    </url>
    <url>
        <loc>https://test/</loc><!-- <- wrong url here -->
    </url>
</urlset>

What's the expected result?

I expect sitemap to be:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
    <url>
        <loc>https://example.com/blog/</loc>
    </url>
    <url>
        <loc>https://example.com/</loc>
    </url>
    <url>
        <loc>https://example.com/test/</loc>
    </url>
</urlset>

Link to Minimal Reproducible Example

https://stackblitz.com/edit/github-e5zwsf?file=dist%2Fsitemap-0.xml

Participation

  • I am willing to submit a pull request for this issue.
@bluwy
Copy link
Member

bluwy commented Feb 6, 2024

I'm not sure if slugs should have a leading slash or trailing slash, I feel like there would be more bugs down the line if you do so, and not just an issue with the sitemap integration 🤔

@moose96
Copy link
Contributor Author

moose96 commented Feb 7, 2024

I'm not sure if slugs should have a leading slash or trailing slash, I feel like there would be more bugs down the line if you do so, and not just an issue with the sitemap integration 🤔

What kind of bugs? We use leading slashes in our application and everything works except generating sitemap. That's why I think it's the issue with the sitemap integration. If using leading or trailing slashes was forbidded there should be any validation while generating static paths or there should be a note in a documentation that you can't use this kind of slashes in the code with explanation.

@florian-lefebvre
Copy link
Member

I agree with @bluwy, I didn't think leading slashes would be allowed as parameters. Not sure what's the best way here, maybe we should do the following:

  1. If a param starts or ends with a /, remove it
  2. Log a warning
  3. In Astro 5, update the regex and fail is specified?

@lilnasy lilnasy added - P3: minor bug An edge case that only affects very specific usage (priority) needs discussion Issue needs to be discussed and removed needs triage Issue needs to be triaged labels Feb 9, 2024
@ematipico
Copy link
Member

@florian-lefebvre, where should this warning be printed?

@florian-lefebvre
Copy link
Member

florian-lefebvre commented Feb 19, 2024

I'm not 100% sure when this runs in Astro, but basically as soon as we get the result of the getStaticPaths function that returns an "invalid" param?

@ematipico
Copy link
Member

@matthewp what do you think? Does it make sense to you?

@matthewp
Copy link
Contributor

matthewp commented Apr 2, 2024

I think @florian-lefebvre's plan makes sense. There was a related issue where someone was confused today.

@matthewp matthewp added - P2: nice to have Not breaking anything but nice to have (priority) and removed - P3: minor bug An edge case that only affects very specific usage (priority) needs discussion Issue needs to be discussed labels Apr 3, 2024
@matthewp
Copy link
Contributor

matthewp commented Apr 3, 2024

After discussing this we realized that having double slashes is valid in a URL path, (/foo//about is a valid URL!), so we don't want to remove those even though it's what you would want most of the time.

Another idea that could help is having a utility like pathToParam(path) that people can use for this scenario, given that it's a common thing to try and do. With this utility method we can do whatever we want/need to normalize the param.

@matthewp matthewp self-assigned this Apr 3, 2024
@ematipico
Copy link
Member

ematipico commented Oct 9, 2024

Closing. This will be fixed in Astro v5. Follow #7962 (comment) for more information.

Params can contain slashes, but they must be encoded inside getStaticPaths, then the user must decode them when using them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
- P2: nice to have Not breaking anything but nice to have (priority)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants