added firefox binary path to config #168
This should save users like me from WebDriver permission denied errors.
Few comments. Seems not valid for merging. This will break support for Linux/MacOS/Docker users.
lib/googlesearch.py
Outdated
@@ -23,6 +25,7 @@ def closeBrowser():
    browser.quit()


def search(req, stop):
    time.sleep(5)
What is this?
Adding this sleep call here helped me avoid being blocked by Google. I've written some bot scripts before, and a small wait usually keeps a script from being detected by the website.
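For illustration only, a wait like the one described above could be randomized rather than fixed. This is a minimal sketch, not code from this pull request: the name `polite_delay` and its default bounds are hypothetical, and whether any delay actually avoids blocking is an assumption.

```python
import random
import time


def polite_delay(min_seconds=3.0, max_seconds=8.0, sleep=time.sleep):
    """Pause for a random duration between the two bounds and return it.

    The name and default bounds are hypothetical; `sleep` is injectable so
    the function can be exercised without actually waiting.
    """
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Calling `polite_delay()` at the top of `search` would stand in for the fixed `time.sleep(5)`.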
config.example.py
Outdated
google_cx_id=''
firefox_exe_path = r'C:\Program Files\Mozilla Firefox\firefox.exe'
This is not working. You set up an executable path for your own installation of Firefox. This will not work for Linux users.
I think you should put a UNIX path (/usr/bin/firefox) as the default value, and add a note in the documentation for Windows users. I don't want to provide explicit support for Windows.
Now the script checks whether the system is Windows. If it is, it uses the Firefox binary path; otherwise it doesn't. This should keep the script compatible with Docker/Linux/macOS and make it easy for Windows users. I also increased the time.sleep to 8 seconds, since I still experienced some blocks with the 5-second delay.
Check out the latest commit. This should resolve all your current issues. Also the
Interesting idea to add a field in config, but the default value shouldn't be related to Windows.
lib/googlesearch.py
Outdated
@@ -32,7 +36,11 @@ def search(req, stop):
    if os.environ.get('webdriverRemote'):
        browser = webdriver.Remote(os.environ.get('webdriverRemote'), webdriver.DesiredCapabilities.FIREFOX.copy())
    else:
        browser = webdriver.Firefox()
        if os.name == 'nt':
Use the path directly, or check if it's defined/not empty.
Added a default binary path for Linux as requested. Instead of checking the operating system type, the script now simply checks whether the binary path holds a value or is empty, and uses the binary path accordingly. I hope this finally resolves all issues.
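The "use the path if it is set, otherwise fall back" check described above might look roughly like the sketch below. It is not the code from this pull request: the helper names `resolve_firefox_binary` and `make_browser` are hypothetical, and it assumes a Selenium version that provides `selenium.webdriver.firefox.options.Options` with a `binary_location` attribute.

```python
import os


def resolve_firefox_binary(configured_path):
    """Return the configured Firefox path, or None if it is unset or blank.

    Returning None means "let Selenium locate Firefox on its own".
    """
    if not configured_path or not configured_path.strip():
        return None
    return os.path.expanduser(configured_path.strip())


def make_browser(configured_path):
    """Create a Firefox WebDriver, honoring the configured binary if present."""
    # Imported lazily so resolve_firefox_binary works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    binary = resolve_firefox_binary(configured_path)
    if binary is not None:
        options.binary_location = binary
    return webdriver.Firefox(options=options)
```

With this shape, `make_browser(config.firefox_exe_path)` behaves the same on every operating system, and an empty value in the config simply leaves detection to Selenium.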
lib/googlesearch.py
Outdated
@@ -23,6 +26,7 @@ def closeBrowser():
    browser.quit()


def search(req, stop):
    time.sleep(10)
Waiting 10 sec at each request is ridiculous, it shouldn't be handled that way. A rotating proxy must be used to avoid getting blocked.
Implemented the rotating proxy. Have a look.
Added a rotating proxy as you asked for. This is my first ever implementation of a rotating proxy, so it might have some bugs. I just tested it and it seemed to work fine for me. We can surely fix any bugs that arise in the future.
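The implementation from this pull request is not shown in the conversation. As a rough idea of what a rotating proxy can mean here, the sketch below cycles through a fixed pool in round-robin order and maps each entry to Firefox's manual-proxy preferences. The class and function names and the `host:port` format are assumptions made for illustration.

```python
import itertools


class ProxyRotator:
    """Hand out proxies from a fixed pool in round-robin order (hypothetical helper)."""

    def __init__(self, proxies):
        pool = list(proxies)
        if not pool:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(pool)

    def next_proxy(self):
        """Return the next 'host:port' entry, wrapping around at the end."""
        return next(self._cycle)


def firefox_proxy_prefs(proxy):
    """Map a 'host:port' string to Firefox manual-proxy preference values."""
    host, _, port = proxy.rpartition(":")
    return {
        "network.proxy.type": 1,  # 1 selects manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": int(port),
    }
```

Each returned preference could then be applied with Selenium's Firefox `Options.set_preference(name, value)` before the browser is created, so that every new browser session uses the next proxy in the pool.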
Well.. I didn't ask you to add a rotating proxy feature but anyway I appreciate your concern about this issue 😄
What I'd suggest is to stay focused on the initial issue, which is the Firefox executable path, and simply use the variable. Of course, you can still open another pull request to suggest changes about the rotating proxy feature.
Okay, I mentioned it in the docs. Regarding the rotating proxy, I actually did that out of excitement, because if you merge this pull request it will be my first useful open source contribution. 😃
Just so you know, open source maintainers prefer 10 small PRs to 1 big PR with tons of features. 1 pull request = 1 feature/fix. Small PRs are useful anyway! You can create a new branch on your fork to open another PR if you want to.
Can you please move the rotating proxy changes to another branch so I can review and merge this one? Thank you
Done. I'll open another PR for rotating proxies later.
Good job! Thank you
I've added the Firefox binary path to the config.example.py file. When users create their config file, they can specify the path and avoid the WebDriver errors I ran into.