The crawler should handle HTTP 429 (Too Many Requests) responses more gracefully.
Below is an example log in which the website asked the crawler to slow down, but the crawler kept requesting pages at the same pace: each page that received a 429 was logged as "Page Load Error, skipping page" and the worker moved straight on to the next URL.
Sample website where this happens after a while (after roughly 1 hour): https://radiopaedia.org
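For illustration only, here is a rough sketch of the kind of behavior I have in mind: when a page returns 429, wait before the next request, using the `Retry-After` response header if the server sends one, and falling back to capped exponential backoff otherwise. The function names (`parseRetryAfter`, `backoffDelayMs`) and the delay constants are hypothetical and are not taken from the crawler's code; how this would be wired into `Crawler.loadPage` is left open.

```javascript
// Hypothetical helpers, not part of the crawler's code base.
const BASE_DELAY_MS = 1000;          // assumed starting delay
const MAX_DELAY_MS = 5 * 60 * 1000;  // assumed upper bound (5 minutes)

// Retry-After is either a number of seconds or an HTTP date.
// Returns the delay in milliseconds, or null if the header is absent or unparseable.
function parseRetryAfter(value, nowMs = Date.now()) {
  if (value === undefined || value === null) return null;
  const trimmed = String(value).trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Returns how long to pause before the next request.
// `attempt` is the number of consecutive 429 responses seen so far (0-based).
function backoffDelayMs(statusCode, retryAfter, attempt, nowMs = Date.now()) {
  if (statusCode !== 429) return 0;
  const hinted = parseRetryAfter(retryAfter, nowMs);
  if (hinted !== null) return Math.min(hinted, MAX_DELAY_MS);
  // No usable hint from the server: double the delay on each consecutive 429.
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}
```

With these assumed constants, `backoffDelayMs(429, "30", 0)` gives a 30 second pause, and `backoffDelayMs(429, undefined, 3)` gives 8 seconds. Ideally the page that received the 429 would also be put back in the queue instead of being counted as failed, and the pause would apply to all workers hitting the same host.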
Log capture:
{"logLevel":"info","timestamp":"2023-09-05T00:14:58.691Z","context":"worker","message":"Starting page","details":{"workerid":5,"page":"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us"}}
{"logLevel":"info","timestamp":"2023-09-05T00:14:58.692Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":9,"total":410,"pending":6,"failed":1,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:14:56.143Z\",\"url\":\"https://radiopaedia.org/go-ad-free\",\"added\":\"2023-09-05T00:14:35.344Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:58.308Z\",\"url\":\"https://radiopaedia.org/about\",\"added\":\"2023-09-05T00:14:35.347Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:58.691Z\",\"url\":\"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us\",\"added\":\"2023-09-05T00:14:35.348Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:51.911Z\",\"url\":\"https://radiopaedia.org/?lang=us\",\"added\":\"2023-09-05T00:14:35.340Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:35.414Z\",\"url\":\"https://radiopaedia.org/edits?lang=us\",\"added\":\"2023-09-05T00:14:35.335Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:14:58.844Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:14:59.358Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/go-ad-free","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:14:59.358Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/go-ad-free returned status code 429","page":"https://radiopaedia.org/go-ad-free","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:14:59.358Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/go-ad-free returned status code 429","stack":"Error: Page https://radiopaedia.org/go-ad-free returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 3)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/go-ad-free","workerid":3}}
{"logLevel":"warn","timestamp":"2023-09-05T00:14:59.359Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/go-ad-free","workerid":3}}
{"logLevel":"info","timestamp":"2023-09-05T00:14:59.382Z","context":"worker","message":"Starting page","details":{"workerid":3,"page":"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us"}}
{"logLevel":"info","timestamp":"2023-09-05T00:14:59.383Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":10,"total":410,"pending":6,"failed":2,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:14:58.308Z\",\"url\":\"https://radiopaedia.org/about\",\"added\":\"2023-09-05T00:14:35.347Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:58.691Z\",\"url\":\"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us\",\"added\":\"2023-09-05T00:14:35.348Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:59.381Z\",\"url\":\"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us\",\"added\":\"2023-09-05T00:14:35.348Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:51.911Z\",\"url\":\"https://radiopaedia.org/?lang=us\",\"added\":\"2023-09-05T00:14:35.340Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:35.414Z\",\"url\":\"https://radiopaedia.org/edits?lang=us\",\"added\":\"2023-09-05T00:14:35.335Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:14:59.561Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.023Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.024Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us returned status code 429","page":"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.024Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us returned status code 429","stack":"Error: Page https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 5)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us","workerid":5}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:00.027Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/cases/solitary-fibrous-tumor-of-the-dura-4?lang=us","workerid":5}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.054Z","context":"worker","message":"Starting page","details":{"workerid":5,"page":"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.055Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":11,"total":410,"pending":6,"failed":3,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:14:58.308Z\",\"url\":\"https://radiopaedia.org/about\",\"added\":\"2023-09-05T00:14:35.347Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:59.381Z\",\"url\":\"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us\",\"added\":\"2023-09-05T00:14:35.348Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.054Z\",\"url\":\"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:51.911Z\",\"url\":\"https://radiopaedia.org/?lang=us\",\"added\":\"2023-09-05T00:14:35.340Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:35.414Z\",\"url\":\"https://radiopaedia.org/edits?lang=us\",\"added\":\"2023-09-05T00:14:35.335Z\",\"depth\":1}"]}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.083Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/about","workerid":4}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.083Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/about returned status code 429","page":"https://radiopaedia.org/about","workerid":4}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.084Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/about returned status code 429","stack":"Error: Page https://radiopaedia.org/about returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 4)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/about","workerid":4}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:00.085Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/about","workerid":4}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.103Z","context":"worker","message":"Starting page","details":{"workerid":4,"page":"https://radiopaedia.org/feature_images/previous?lang=us"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.104Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":12,"total":410,"pending":6,"failed":4,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:59.381Z\",\"url\":\"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us\",\"added\":\"2023-09-05T00:14:35.348Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.054Z\",\"url\":\"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:51.911Z\",\"url\":\"https://radiopaedia.org/?lang=us\",\"added\":\"2023-09-05T00:14:35.340Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:35.414Z\",\"url\":\"https://radiopaedia.org/edits?lang=us\",\"added\":\"2023-09-05T00:14:35.335Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.249Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us","workerid":5}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:00.264Z","context":"general","message":"Invalid Page - URL must start with http:// or https://","details":{"url":"javascript:;","page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.272Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/feature_images/previous?lang=us","workerid":4}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.288Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.288Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us returned status code 429","page":"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:00.288Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us returned status code 429","stack":"Error: Page https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 3)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us","workerid":3}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:00.289Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/users/jose-roberto-montanez-sauceda?lang=us","workerid":3}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.308Z","context":"worker","message":"Starting page","details":{"workerid":3,"page":"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.310Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":13,"total":437,"pending":6,"failed":5,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.054Z\",\"url\":\"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:51.911Z\",\"url\":\"https://radiopaedia.org/?lang=us\",\"added\":\"2023-09-05T00:14:35.340Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:35.414Z\",\"url\":\"https://radiopaedia.org/edits?lang=us\",\"added\":\"2023-09-05T00:14:35.335Z\",\"depth\":1}"]}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:00.319Z","context":"general","message":"Invalid Page - URL must start with http:// or https://","details":{"url":"javascript:;","page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.355Z","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://radiopaedia.org/edits?lang=us"],"page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.355Z","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://radiopaedia.org/edits?lang=us","page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.357Z","context":"behavior","message":"Run Script Finished","details":{"frameUrl":"https://radiopaedia.org/edits?lang=us","page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.357Z","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.358Z","context":"pageStatus","message":"Page Finished","details":{"loadState":4,"page":"https://radiopaedia.org/edits?lang=us","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.379Z","context":"worker","message":"Starting page","details":{"workerid":0,"page":"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.380Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":14,"total":494,"pending":6,"failed":5,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.378Z\",\"url\":\"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.054Z\",\"url\":\"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:51.911Z\",\"url\":\"https://radiopaedia.org/?lang=us\",\"added\":\"2023-09-05T00:14:35.340Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.472Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg","workerid":3}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:00.510Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference","workerid":0}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:01.069Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:01.069Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us returned status code 429","page":"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:01.069Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us returned status code 429","stack":"Error: Page https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 5)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us","workerid":5}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:01.070Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/articles/solitary-fibrous-tumour-of-the-dura?lang=us","workerid":5}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.072Z","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://radiopaedia.org/?lang=us"],"page":"https://radiopaedia.org/?lang=us","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.072Z","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://radiopaedia.org/?lang=us","page":"https://radiopaedia.org/?lang=us","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.074Z","context":"behavior","message":"Run Script Finished","details":{"frameUrl":"https://radiopaedia.org/?lang=us","page":"https://radiopaedia.org/?lang=us","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.076Z","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://radiopaedia.org/?lang=us","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.077Z","context":"pageStatus","message":"Page Finished","details":{"loadState":4,"page":"https://radiopaedia.org/?lang=us","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.127Z","context":"worker","message":"Starting page","details":{"workerid":5,"page":"https://radiopaedia.org/podcast"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.128Z","context":"worker","message":"Starting page","details":{"workerid":1,"page":"https://radiopaedia.org/articles/playlists-1"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.129Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":16,"total":494,"pending":6,"failed":6,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/podcast\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/articles/playlists-1\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.378Z\",\"url\":\"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.129Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":16,"total":494,"pending":6,"failed":6,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/podcast\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/articles/playlists-1\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.378Z\",\"url\":\"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:14:48.711Z\",\"url\":\"https://radiopaedia.org/quizzes/all?lang=us\",\"added\":\"2023-09-05T00:14:35.338Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.241Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/podcast","workerid":5}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.302Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/articles/playlists-1","workerid":1}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:01.504Z","context":"general","message":"Invalid Page - URL must start with http:// or https://","details":{"url":"javascript:;","page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:01.549Z","context":"general","message":"Invalid Page - URL must start with http:// or https://","details":{"url":"javascript:;","page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.678Z","context":"behavior","message":"Running behaviors","details":{"frames":1,"frameUrls":["https://radiopaedia.org/quizzes/all?lang=us"],"page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.679Z","context":"behavior","message":"Run Script Started","details":{"frameUrl":"https://radiopaedia.org/quizzes/all?lang=us","page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.692Z","context":"behavior","message":"Run Script Finished","details":{"frameUrl":"https://radiopaedia.org/quizzes/all?lang=us","page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.693Z","context":"behavior","message":"Behaviors finished","details":{"finished":1,"page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.694Z","context":"pageStatus","message":"Page Finished","details":{"loadState":4,"page":"https://radiopaedia.org/quizzes/all?lang=us","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.732Z","context":"worker","message":"Starting page","details":{"workerid":2,"page":"https://radiopaedia.org/courses/editing-radiopaedia-articles"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:01.733Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":17,"total":546,"pending":6,"failed":6,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/podcast\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/articles/playlists-1\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.378Z\",\"url\":\"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}"]}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:01.974Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference","workerid":0}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:01.975Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference returned status code 429","page":"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference","workerid":0}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:01.975Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference returned status code 429","stack":"Error: Page https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 0)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference","workerid":0}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:01.975Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/courses/radiopaedia-2023-virtual-conference","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.005Z","context":"worker","message":"Starting page","details":{"workerid":0,"page":"https://radiopaedia.org/impact"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.006Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":18,"total":546,"pending":6,"failed":7,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/podcast\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.127Z\",\"url\":\"https://radiopaedia.org/articles/playlists-1\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.103Z\",\"url\":\"https://radiopaedia.org/feature_images/previous?lang=us\",\"added\":\"2023-09-05T00:14:35.349Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.005Z\",\"url\":\"https://radiopaedia.org/impact\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.177Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/courses/editing-radiopaedia-articles","workerid":2}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.270Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/impact","workerid":0}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.322Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/feature_images/previous?lang=us","workerid":4}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.323Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/feature_images/previous?lang=us returned status code 429","page":"https://radiopaedia.org/feature_images/previous?lang=us","workerid":4}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.323Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/feature_images/previous?lang=us returned status code 429","stack":"Error: Page https://radiopaedia.org/feature_images/previous?lang=us returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 4)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/feature_images/previous?lang=us","workerid":4}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:02.325Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/feature_images/previous?lang=us","workerid":4}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.374Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/podcast","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.374Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/podcast returned status code 429","page":"https://radiopaedia.org/podcast","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.374Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/podcast returned status code 429","stack":"Error: Page https://radiopaedia.org/podcast returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 5)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/podcast","workerid":5}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:02.375Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/podcast","workerid":5}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.639Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/articles/playlists-1","workerid":1}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.642Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/articles/playlists-1 returned status code 429","page":"https://radiopaedia.org/articles/playlists-1","workerid":1}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.642Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/articles/playlists-1 returned status code 429","stack":"Error: Page https://radiopaedia.org/articles/playlists-1 returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 1)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/articles/playlists-1","workerid":1}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:02.644Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/articles/playlists-1","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.662Z","context":"worker","message":"Starting page","details":{"workerid":4,"page":"https://radiopaedia.org/courses/help-creating-cases"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.676Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":21,"total":546,"pending":5,"failed":10,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.363Z\",\"url\":\"https://radiopaedia.org/courses/help-creating-cases\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.005Z\",\"url\":\"https://radiopaedia.org/impact\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.415Z\",\"url\":\"https://radiopaedia.org/courses/help-multiple-choice-questions\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.679Z","context":"worker","message":"Starting page","details":{"workerid":5,"page":"https://radiopaedia.org/courses/help-multiple-choice-questions"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.684Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":21,"total":546,"pending":5,"failed":10,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.363Z\",\"url\":\"https://radiopaedia.org/courses/help-creating-cases\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.005Z\",\"url\":\"https://radiopaedia.org/impact\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.415Z\",\"url\":\"https://radiopaedia.org/courses/help-multiple-choice-questions\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.686Z","context":"worker","message":"Starting page","details":{"workerid":1,"page":"https://radiopaedia.org/peer-review-policy"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.687Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":21,"total":546,"pending":6,"failed":10,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.363Z\",\"url\":\"https://radiopaedia.org/courses/help-creating-cases\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.005Z\",\"url\":\"https://radiopaedia.org/impact\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:00.308Z\",\"url\":\"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg\",\"added\":\"2023-09-05T00:14:35.350Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.684Z\",\"url\":\"https://radiopaedia.org/peer-review-policy\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.415Z\",\"url\":\"https://radiopaedia.org/courses/help-multiple-choice-questions\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}"]}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.760Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.760Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/articles/general-overview-of-radiopaediaorg returned status code 429","page":"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:02.760Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/articles/general-overview-of-radiopaediaorg returned status code 429","stack":"Error: Page https://radiopaedia.org/articles/general-overview-of-radiopaediaorg returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 3)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg","workerid":3}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:02.761Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/articles/general-overview-of-radiopaediaorg","workerid":3}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.797Z","context":"worker","message":"Starting page","details":{"workerid":3,"page":"https://radiopaedia.org/continuing-medical-education-cme"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.798Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":22,"total":546,"pending":6,"failed":11,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.363Z\",\"url\":\"https://radiopaedia.org/courses/help-creating-cases\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.796Z\",\"url\":\"https://radiopaedia.org/continuing-medical-education-cme\",\"added\":\"2023-09-05T00:14:35.353Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.005Z\",\"url\":\"https://radiopaedia.org/impact\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.684Z\",\"url\":\"https://radiopaedia.org/peer-review-policy\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.415Z\",\"url\":\"https://radiopaedia.org/courses/help-multiple-choice-questions\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:02.975Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/courses/help-multiple-choice-questions","workerid":5}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:03.023Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/peer-review-policy","workerid":1}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:03.026Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/courses/help-creating-cases","workerid":4}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:03.050Z","context":"general","message":"Awaiting page load","details":{"page":"https://radiopaedia.org/continuing-medical-education-cme","workerid":3}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.781Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/peer-review-policy","workerid":1}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.781Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/peer-review-policy returned status code 429","page":"https://radiopaedia.org/peer-review-policy","workerid":1}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.781Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/peer-review-policy returned status code 429","stack":"Error: Page https://radiopaedia.org/peer-review-policy returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 1)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/peer-review-policy","workerid":1}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:04.782Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/peer-review-policy","workerid":1}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.799Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/impact","workerid":0}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.799Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/impact returned status code 429","page":"https://radiopaedia.org/impact","workerid":0}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.800Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/impact returned status code 429","stack":"Error: Page https://radiopaedia.org/impact returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 0)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/impact","workerid":0}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:04.800Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/impact","workerid":0}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:04.816Z","context":"worker","message":"Starting page","details":{"workerid":1,"page":"https://radiopaedia.org/editors"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:04.817Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":24,"total":546,"pending":5,"failed":13,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:04.815Z\",\"url\":\"https://radiopaedia.org/editors\",\"added\":\"2023-09-05T00:14:35.353Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.363Z\",\"url\":\"https://radiopaedia.org/courses/help-creating-cases\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.796Z\",\"url\":\"https://radiopaedia.org/continuing-medical-education-cme\",\"added\":\"2023-09-05T00:14:35.353Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.415Z\",\"url\":\"https://radiopaedia.org/courses/help-multiple-choice-questions\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}"]}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:04.844Z","context":"worker","message":"Starting page","details":{"workerid":0,"page":"https://radiopaedia.org/radiopaedia-educational-board"}}
{"logLevel":"info","timestamp":"2023-09-05T00:15:04.850Z","context":"crawlStatus","message":"Crawl statistics","details":{"crawled":24,"total":546,"pending":6,"failed":13,"limit":{"max":0,"hit":false},"pendingPages":["{\"seedId\":0,\"started\":\"2023-09-05T00:15:04.815Z\",\"url\":\"https://radiopaedia.org/editors\",\"added\":\"2023-09-05T00:14:35.353Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.363Z\",\"url\":\"https://radiopaedia.org/courses/help-creating-cases\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:01.732Z\",\"url\":\"https://radiopaedia.org/courses/editing-radiopaedia-articles\",\"added\":\"2023-09-05T00:14:35.351Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:04.843Z\",\"url\":\"https://radiopaedia.org/radiopaedia-educational-board\",\"added\":\"2023-09-05T00:14:35.353Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.796Z\",\"url\":\"https://radiopaedia.org/continuing-medical-education-cme\",\"added\":\"2023-09-05T00:14:35.353Z\",\"depth\":1}","{\"seedId\":0,\"started\":\"2023-09-05T00:15:02.415Z\",\"url\":\"https://radiopaedia.org/courses/help-multiple-choice-questions\",\"added\":\"2023-09-05T00:14:35.352Z\",\"depth\":1}"]}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.905Z","context":"general","message":"Page Load Error, skipping page","details":{"statusCode":429,"page":"https://radiopaedia.org/courses/editing-radiopaedia-articles","workerid":2}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.906Z","context":"general","message":"Page Load Error, skipping page","details":{"msg":"Page https://radiopaedia.org/courses/editing-radiopaedia-articles returned status code 429","page":"https://radiopaedia.org/courses/editing-radiopaedia-articles","workerid":2}}
{"logLevel":"error","timestamp":"2023-09-05T00:15:04.906Z","context":"worker","message":"Worker Exception","details":{"type":"exception","message":"Page https://radiopaedia.org/courses/editing-radiopaedia-articles returned status code 429","stack":"Error: Page https://radiopaedia.org/courses/editing-radiopaedia-articles returned status code 429\n at Crawler.loadPage (file:///app/crawler.js:1083:17)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Crawler.default [as driver] (file:///app/defaultDriver.js:3:3)\n at async Crawler.crawlPage (file:///app/crawler.js:451:5)\n at async PageWorker.timedCrawlPage (file:///app/util/worker.js:165:7)\n at async PageWorker.runLoop (file:///app/util/worker.js:206:9)\n at async PageWorker.run (file:///app/util/worker.js:187:7)\n at async Promise.allSettled (index 2)\n at async Crawler.crawl (file:///app/crawler.js:793:5)\n at async Crawler.run (file:///app/crawler.js:311:7)","page":"https://radiopaedia.org/courses/editing-radiopaedia-articles","workerid":2}}
{"logLevel":"warn","timestamp":"2023-09-05T00:15:04.907Z","context":"pageStatus","message":"Page Load Failed","details":{"loadState":1,"page":"https://radiopaedia.org/courses/editing-radiopaedia-articles","workerid":2}}
The crawler could be enhanced by:
- detecting HTTP 429 errors and, when one occurs, waiting for a configurable amount of time before continuing
- retrying the same page after an HTTP 429 error (the page is available; the website just asked us to slow down)
- honoring the Retry-After HTTP header, which some websites return to indicate how long the user agent should wait
- counting consecutive HTTP 429 errors and ending the crawl early once a configurable number of them has been returned in a row, so that the crawler does not keep overwhelming the website
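To make the first, second and fourth points concrete, here is a minimal sketch and not the crawler's actual code. Every name in it (`createPoliteLoader`, `TooManyRequestsError`, `delayMs`, `maxConsecutive429`, and the `loadPage` callback that resolves to an object with a `status` field) is a hypothetical placeholder, and the default values are arbitrary assumptions.

```javascript
// Hypothetical helper: none of these names come from the crawler's code base.
class TooManyRequestsError extends Error {}

// Default pause: resolves after `ms` milliseconds.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wraps `loadPage` so that a page answering HTTP 429 is retried after a pause,
// and the crawl is aborted once too many 429 responses arrive in a row.
function createPoliteLoader(loadPage, options = {}) {
  const {
    delayMs = 30000,       // assumed default wait before retrying
    maxConsecutive429 = 5, // assumed default abort threshold
    sleep = defaultSleep,  // injectable so the wait can be stubbed in tests
  } = options;

  // Shared by all pages loaded through this wrapper; any non-429 response resets it.
  let consecutive429 = 0;

  return async function load(url) {
    for (;;) {
      const response = await loadPage(url);
      if (response.status !== 429) {
        consecutive429 = 0;
        return response;
      }
      consecutive429 += 1;
      if (consecutive429 >= maxConsecutive429) {
        throw new TooManyRequestsError(
          `Aborting crawl: ${consecutive429} consecutive HTTP 429 responses (last: ${url})`
        );
      }
      // The page exists, the site only asked us to slow down: wait, then retry it.
      await sleep(delayMs);
    }
  };
}
```

With several workers sharing one wrapper, the counter reflects 429 responses across all of them, which matches the goal of stopping when the site as a whole keeps asking the crawler to slow down. In this sketch only the worker that hit the 429 pauses; pausing every worker would need additional shared state.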
FYI, I finally have a repro of #387, but it would be much better handled as described in this issue:
- Cloudflare is responding with an HTTP 429
- Cloudflare is returning a Retry-After header with a reasonable value of 60 seconds, which progressively decreases (59 secs, 57 secs, ...) because the crawler does not respect it
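For reference, a Retry-After value can be either a number of seconds (as Cloudflare sends here) or an HTTP-date, so honoring it means handling both forms. Below is a rough sketch of that conversion; `parseRetryAfterMs` is a hypothetical helper name, not something that exists in the crawler today.

```javascript
// Hypothetical helper: converts a Retry-After header value into a wait in
// milliseconds. Returns null when the header is missing or cannot be parsed.
function parseRetryAfterMs(value, now = Date.now()) {
  if (value === undefined || value === null) return null;
  const text = String(value).trim();

  // Form 1: a non-negative integer number of seconds, e.g. "60".
  if (/^\d+$/.test(text)) {
    return Number(text) * 1000;
  }

  // Form 2: an HTTP-date, e.g. "Tue, 05 Sep 2023 00:16:02 GMT".
  const date = Date.parse(text);
  if (Number.isNaN(date)) return null;

  // A date that is already in the past means the request can be retried now.
  return Math.max(0, date - now);
}
```

A caller would probably fall back to a configured default delay when this returns null, and might also want to cap very large values so that a single response cannot stall the crawl for hours.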
I'm working on a PR, so you could assign me this issue.