Malformed Sitemap content when url contains searchParams #2420
Comments
Hello @austinbuckler! Do I understand correctly that the problem is that the sitemap content type is not detected correctly because the extension is followed by a query string? If that's the case, the solution that you propose should help. If you want to contribute a patch, feel free to do so - we'll be grateful to accept that!
You are correct!
Awesome, will submit a patch this week. Thank you for the prompt response!
@janbuchar here is the patch 🥂
Thank you very much! I assumed you'd open a pull request so that we 1. see if it passes tests and 2. can discuss it better. Also, if we accept that PR, you'll be listed as a contributor 🙂 Care to do that?
Also please don't update changelogs, they are generated.
sure, will get that done this evening.
good to know, will adjust. The contribution document implies that it should be done manually 😅
Ah, it's great that somebody actually reads those 😁 We'll update it!
Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/utils
Issue description
WARN Malformed sitemap content: <url>?from=1234&to=5678
Code sample
Package version
3.9.1
Node.js version
20
Operating system
Linux
Apify platform
I have tested this on the next release: N/A
Other context
I see that the Sitemap module was updated recently to support additional content types. I would like to contribute an additional improvement that changes sitemapUrl from a string to a URL object, something like:
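A minimal sketch of that idea, not the actual patch. It assumes the sitemap type is inferred from the file extension, which is why a trailing query string such as ?from=1234&to=5678 breaks a plain string check. The names SitemapKind, detectByString and detectByPathname are hypothetical and are not part of @crawlee/utils.

```typescript
// Hypothetical helpers for illustration only; not the @crawlee/utils API.
type SitemapKind = 'xml' | 'txt' | 'unknown';

// Naive check on the raw string: the query string hides the extension.
function detectByString(sitemapUrl: string): SitemapKind {
    if (sitemapUrl.endsWith('.xml')) return 'xml';
    if (sitemapUrl.endsWith('.txt')) return 'txt';
    return 'unknown';
}

// Parse into a URL object first and inspect only the pathname,
// so search params and fragments no longer affect detection.
function detectByPathname(sitemapUrl: string | URL): SitemapKind {
    const { pathname } = new URL(sitemapUrl);
    if (pathname.endsWith('.xml')) return 'xml';
    if (pathname.endsWith('.txt')) return 'txt';
    return 'unknown';
}

const example = 'https://example.com/sitemap.xml?from=1234&to=5678';
console.log(detectByString(example));   // 'unknown'
console.log(detectByPathname(example)); // 'xml'
```

Accepting string | URL keeps existing string callers working while letting the function body rely on the parsed pathname.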