Aborting requests at prerequest #289

Closed
ishan-marikar opened this issue Sep 15, 2018 · 2 comments

Comments

@ishan-marikar

I've been trying to write my own logic to skip duplicates, and I was wondering whether there is a way to abort a request at the preRequest stage.

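const Crawler = require('crawler');

// URLs that have already been crawled (assumed here to be a plain array).
const crawledURLs = [];

const c = new Crawler({
    preRequest: function(options, done) {
        if (crawledURLs.includes(options.uri)) {
            // Abort the request?
        } else {
            // Continue with the request.
            return done();
        }
    },
    callback: function(err, res, done) {
        if (err) {
            console.log(err);
        } else {
            console.log(res.statusCode);
        }
        // Signal that this task is finished so the next one can run.
        done();
    }
});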
@mike442144
Collaborator

Please don't do this in the preRequest stage, because it will leave you confused about the request rate. preRequest is the last stage before the actual request is sent. I think you should filter out duplicates when queuing new tasks instead, and I suggest using the seenreq module to simplify the code.
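
To illustrate filtering at queue time, here is a minimal sketch rather than a definitive implementation. It uses a plain in-memory Set instead of seenreq, and `createUniqueQueue` and `queueUnique` are hypothetical names made up for this example. The queue function is passed in so the snippet runs on its own; with the crawler from the question you would presumably pass something like `(uri) => c.queue(uri)`.

```javascript
// Hypothetical helper: wraps any "queue" function so that each URL
// is handed to it at most once. Not part of node-crawler or seenreq.
function createUniqueQueue(queueFn) {
    const seen = new Set(); // normalized URLs already passed to queueFn

    return function queueUnique(uri) {
        // Normalize through the WHATWG URL parser so that trivially
        // different spellings of the same URL map to the same key.
        const key = new URL(uri).href;
        if (seen.has(key)) {
            return false; // duplicate: it never reaches the queue
        }
        seen.add(key);
        queueFn(uri);
        return true;
    };
}

// Stand-in queue (an array push) so the sketch is self-contained.
const queued = [];
const queueUnique = createUniqueQueue((uri) => queued.push(uri));

queueUnique('http://example.com/a');
queueUnique('http://example.com/a'); // skipped as a duplicate
queueUnique('http://example.com/b');
console.log(queued);
```

Because duplicates are dropped before they are queued, they never occupy a slot in the rate limiter, which is the concern with skipping them in preRequest. A Set like this lives only in memory and only normalizes as far as the URL parser does, so a dedicated module such as seenreq is likely the better fit for larger crawls.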

@mike442144
Collaborator

Closing due to inactivity. Feel free to reopen.
