Starting crawl from long url always loads only one page #233

Revertron · 2024-11-30T12:31:00Z

When crawling a website for new pages, I want to start from a particular page rather than the root, for example https://habr.com/en/feed/. But spider always loads only that page and doesn't follow the links on it.

How can I work around this?

The same thing happens when a trailing slash is added: starting from https://habr.com crawls the site, but starting from https://habr.com/ loads only that page and then stops.
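For illustration only: the expected behavior is that every form of the start URL keeps the same set of links in scope. The sketch below is not spider's code and says nothing about how the bug arose; it is a minimal Python example that assumes crawl scope is decided by host alone. The `same_site` helper and the `links` list are hypothetical names and data made up for this example.

```python
from urllib.parse import urljoin, urlsplit


def same_site(start_url: str, link: str) -> bool:
    """Return True if `link` (possibly relative) resolves to the start URL's host."""
    start_host = urlsplit(start_url).hostname
    # Resolve relative links against the start URL before comparing hosts.
    resolved = urljoin(start_url, link)
    return urlsplit(resolved).hostname == start_host


# Hypothetical links as they might appear on the start page.
links = ["/en/articles/1/", "https://habr.com/en/news/", "https://example.org/"]

# The three start URL forms mentioned in this issue.
for start in ("https://habr.com", "https://habr.com/", "https://habr.com/en/feed/"):
    in_scope = [link for link in links if same_site(start, link)]
    print(start, "->", in_scope)
```

Under that assumption, the path and the trailing slash of the start URL do not change which links are followed, because only the host is compared.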

j-mendez · 2024-12-01T12:50:26Z

Fixed in the latest release. Thank you!

j-mendez added a commit that referenced this issue Dec 1, 2024

chore(website): fix crawl establish domain removal [#233]

f1f13c1

j-mendez closed this as completed Dec 1, 2024

j-mendez added a commit that referenced this issue Dec 1, 2024

chore(website): fix crawl establish domain removal [#233]

f184914
