Unable to deal with captcha. #5

Closed
leovan opened this issue Jul 7, 2018 · 2 comments

Comments

@leovan
Owner

leovan commented Jul 7, 2018

It seems that we sometimes need to enter a CAPTCHA to download the PDF from Sci-Hub.

@leovan leovan closed this as completed in 99bb762 Jul 7, 2018
@ziofat

ziofat commented Aug 2, 2018

image

The CAPTCHA image did not show up.

@leovan
Owner Author

leovan commented Aug 3, 2018

Thanks for the report. I tried this case, and the response from fetching the PDF is actually not the CAPTCHA page. The response is an HTML page that contains an <iframe>, and the content of the <iframe> is also HTML (which is what was used to decide whether a CAPTCHA has to be entered). In this case, however, the HTML inside the <iframe> is not the CAPTCHA page. I also tried sci-hub.tw, and it leads to another page for downloading the paper (https://sci-hub.tw/https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-018-0146-8), which seems to be an open access article. So I don't think I can handle every scenario across different external websites. I can improve the log output, which may help users download the paper manually. I will fix this in the next release.
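To make the distinction above concrete, here is a rough sketch (not the project's actual code) of how a response could be sorted into "PDF", "CAPTCHA page", or "some other HTML page that needs a manual download". The names `classify_response` and `_IframeScanner` are hypothetical, and the CAPTCHA check (an `<img>` whose `id` or `src` contains "captcha") is only an assumption about what such a page looks like. The same function could be applied to the outer page or to the HTML fetched from the `<iframe>`.

```python
import logging
from html.parser import HTMLParser

logger = logging.getLogger("scihub_sketch")  # hypothetical logger name


class _IframeScanner(HTMLParser):
    """Collects <iframe> sources and notes whether a CAPTCHA-like image appears."""

    def __init__(self):
        super().__init__()
        self.iframe_srcs = []
        self.has_captcha = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "iframe" and attrs.get("src"):
            self.iframe_srcs.append(attrs["src"])
        # Assumption: a CAPTCHA page carries an <img> whose id or src
        # mentions "captcha". Real pages may differ.
        if tag == "img":
            marker = (attrs.get("id") or "") + (attrs.get("src") or "")
            if "captcha" in marker.lower():
                self.has_captcha = True


def classify_response(content_type, body, url):
    """Return "pdf", "captcha", or "manual" for a fetched response."""
    if "application/pdf" in content_type.lower():
        return "pdf"
    scanner = _IframeScanner()
    scanner.feed(body)
    if scanner.has_captcha:
        return "captcha"
    # Neither a PDF nor a CAPTCHA: log a link the user can open by hand.
    target = scanner.iframe_srcs[0] if scanner.iframe_srcs else url
    logger.warning(
        "Could not fetch a PDF automatically; try downloading manually from %s",
        target,
    )
    return "manual"
```

With this shape, an HTML response that has no CAPTCHA marker is not mistaken for a CAPTCHA page; it falls through to the "manual" case and the log line tells the user where to look.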

@leovan leovan reopened this Aug 3, 2018
@leovan leovan closed this as completed in 7b8bab0 Aug 3, 2018