added firefox binary path to config #168

GrayHat12 · 2019-12-15T09:08:27Z

This should save users like me from WebDriver permission-denied errors.

I've added a Firefox binary path to the config.example.py file. When users create their config file, they can specify the path and avoid the WebDriver errors I ran into.

sundowndev

A few comments. This doesn't seem valid for merging: it will break support for Linux/macOS/Docker users.

sundowndev · 2019-12-15T11:27:59Z

lib/googlesearch.py

@@ -23,6 +25,7 @@ def closeBrowser():
        browser.quit()

 def search(req, stop):
+    time.sleep(5)


What is this?

Adding this sleep helped me avoid being blocked by Google. I've written some bot scripts before, and a small wait usually keeps scripts from being detected by the website.
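
For reference, here is a minimal sketch of a variation on the fixed delay discussed above. Instead of sleeping unconditionally at the start of every search, it only pauses when the previous request was recent. `RequestThrottle`, `min_interval`, and the injectable `clock`/`sleep` parameters are hypothetical names for illustration; none of this is code from the PR.

```python
import time


class RequestThrottle:
    """Ensure at least `min_interval` seconds pass between consecutive calls.

    Hypothetical helper: the PR itself calls time.sleep() unconditionally.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the logic can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Block if needed; return how many seconds were slept."""
        waited = 0.0
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last = self._clock()
        return waited
```

Calling `throttle.wait()` at the top of `search()` would keep the first request immediate and delay later ones only as much as needed.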

sundowndev · 2019-12-15T11:28:41Z

config.example.py

+google_cx_id=''
+firefox_exe_path = r'C:\Program Files\Mozilla Firefox\firefox.exe'


This is not working. You set up an executable path for your own installation of Firefox. This will not work for Linux users.

I think you should put a UNIX path (/usr/bin/firefox) as the default value, and add a statement in the documentation for Windows users. I don't want to provide explicit support for Windows.

Now the script checks whether the system is Windows. If it is, it uses the Firefox binary path; otherwise it doesn't. This should keep the script compatible with Docker/Linux/Mac and easy for Windows users. I also increased the time.sleep to 8 seconds, since I still experienced some blocks with the 5-second delay.
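
The platform check described here can be sketched roughly as follows. This is an illustration of the idea only, not the code from the commit; `firefox_binary_for_platform` and its `os_name` parameter are hypothetical names.

```python
import os


def firefox_binary_for_platform(configured_path, os_name=None):
    """Return the configured Firefox binary path on Windows, else None.

    `os_name` defaults to os.name and exists only to make the check testable.
    """
    current = os.name if os_name is None else os_name
    if current == "nt":          # os.name is "nt" on Windows
        return configured_path
    return None                  # Linux/macOS/Docker: let Selenium find Firefox
```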

GrayHat12 · 2019-12-16T04:37:42Z

Check out the latest commit. This should resolve all your current issues. Also, the time.sleep(8) should now prevent you from being blocked, so you can use the Selenium WebDriver in headless mode. Sure, it makes the script a bit slower, but it should be good overall since the user won't have to keep solving CAPTCHAs.

sundowndev

Interesting idea to add a field in config, but the default value shouldn't be related to Windows.

sundowndev · 2019-12-18T16:59:40Z

config.example.py

+google_cx_id=''
+firefox_exe_path = r'C:\Program Files\Mozilla Firefox\firefox.exe'


I think you should put a UNIX path (/usr/bin/firefox) as the default value, and add a statement in the documentation for Windows users. I don't want to provide explicit support for Windows.

sundowndev · 2019-12-18T17:00:27Z

lib/googlesearch.py

@@ -32,7 +36,11 @@ def search(req, stop):
        if os.environ.get('webdriverRemote'):
            browser = webdriver.Remote(os.environ.get('webdriverRemote'), webdriver.DesiredCapabilities.FIREFOX.copy())
        else:
-            browser = webdriver.Firefox()
+            if os.name == 'nt':


Use the path directly, or check if it's defined/not empty.

Added a default binary path for Linux as requested. Now, instead of checking the operating system type to decide whether to use the binary path, the script simply checks whether the binary path holds a value or is empty and uses it accordingly. Hopefully this finally resolves all issues.
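
A minimal sketch of the "use the path only if it is set" check, assuming Selenium's Firefox `Options.binary_location` is used to pass the path. The helper names `resolve_firefox_path` and `make_firefox_browser` are hypothetical, and this is not necessarily how the merged code does it.

```python
def resolve_firefox_path(configured_path):
    """Return a usable binary path, or None when the setting is empty."""
    path = (configured_path or "").strip()
    return path or None


def make_firefox_browser(configured_path=""):
    """Start Firefox, pointing Selenium at a custom binary only when one is set."""
    # Imported lazily so the path check above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    binary = resolve_firefox_path(configured_path)
    if binary is not None:
        options.binary_location = binary
    return webdriver.Firefox(options=options)
```

With an empty value in the config, Selenium falls back to its own lookup of the Firefox binary, so Linux, macOS and Docker setups are unaffected.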

sundowndev · 2019-12-18T17:59:39Z

lib/googlesearch.py

@@ -23,6 +26,7 @@ def closeBrowser():
        browser.quit()

 def search(req, stop):
+    time.sleep(10)


Waiting 10 seconds on each request is ridiculous; it shouldn't be handled that way. A rotating proxy must be used to avoid getting blocked.

Implemented the rotating proxy. Have a look

A rotating proxy, as you asked for. This is my first ever implementation of a rotating proxy, so it might have some bugs. I just tested it and it seemed to work fine for me. We can fix any bugs that arise in the future.

sundowndev · 2019-12-19T13:04:18Z

Well.. I didn't ask you to add a rotating proxy feature but anyway I appreciate your concern about this issue 😄

- Scraping free-proxy-list.net is an anti-pattern: the HTML is subject to change. Just create an array of IPs instead.
- Google blocks free proxies.
- If we create such a feature, it has to be isolated in a component and then used in the search feature, to avoid increasing code complexity. We also have to be able to disable it in the config file.
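
The points above could look roughly like the sketch below: an isolated component, a static list of addresses instead of scraping, and a flag to turn it off. `ProxyRotator`, `enabled`, and the addresses (taken from the reserved 192.0.2.0/24 documentation range) are placeholders, not part of this PR.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxies from a fixed list in round-robin order.

    Hypothetical component: `enabled` stands in for a config-file switch.
    """

    def __init__(self, proxies, enabled=True):
        # Rotation is off when disabled in config or when the list is empty.
        self.enabled = enabled and bool(proxies)
        self._pool = cycle(proxies) if self.enabled else None

    def next_proxy(self):
        """Return the next "host:port" string, or None when rotation is off."""
        if not self.enabled:
            return None
        return next(self._pool)


# Placeholder addresses only; real proxies would come from the config file.
rotator = ProxyRotator(["192.0.2.10:8080", "192.0.2.11:8080"])
```

The search code would then only ask the component for `next_proxy()` and skip proxy setup when it returns `None`.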

What I'd suggest is to stay focused on the initial issue, which is the Firefox executable path, and simply use the variable firefox_exe_path if it's not empty; this means it is defined but empty by default. Also, why not add a mention of this in the documentation?

Of course you can still open another pull request to suggest changes about the rotating proxy feature.

GrayHat12 · 2019-12-19T13:24:22Z

Okay, I mentioned it in the docs. Regarding the rotating proxy, I actually did that out of excitement, because if you merge this pull request it will be my first useful open source contribution. 😃
I'm not sure how to open another pull request to the same repo and branch, but I'll search for it on Google and maybe go with that.
Although I can implement disabling the proxy from the configuration right now in this pull request if you want.

sundowndev · 2019-12-19T20:51:12Z

Just so you know, open source maintainers prefer 10 small PRs to 1 big PR with tons of features. 1 pull request = 1 feature/fix. Small PRs are useful anyway! You can create a new branch on your fork to open another PR if you want to.

sundowndev · 2019-12-20T16:13:18Z

Can you please move the rotating-proxy-related changes to another branch so I can review and merge this one? Thank you

GrayHat12 · 2019-12-20T16:39:11Z

Done. I'll open another PR for rotating proxies later.

Implement getFirefoxBrowser method

sundowndev

Good job! Thank you

GrayHat12 added 2 commits December 15, 2019 14:34

added firefox binary path to config

bf54ecd

This should save users like me from WebDriver permission-denied errors

Update googlesearch.py

2fdb708

sundowndev requested changes Dec 15, 2019

sundowndev added the geckodriver/selenium and invalid labels Dec 15, 2019

sundowndev requested changes Dec 18, 2019

sundowndev reviewed Dec 18, 2019

rotating proxy

cd8e2b8

A rotating proxy, as you asked for. This is my first ever implementation of a rotating proxy, so it might have some bugs. I just tested it and it seemed to work fine for me. We can fix any bugs that arise in the future.

sundowndev added the kind/feature label and removed the invalid label Dec 18, 2019

removed comments

fa07f27

updated documentation

564904c

GrayHat12 added 2 commits December 20, 2019 22:00

Update googlesearch.py

97530f8

Update googlesearch.py

24515e1

GrayHat12 and others added 5 commits December 20, 2019 22:14

Update googlesearch.py

d738a77

Merge branch 'develop' into develop

cea28d0

refactor(lib): google search

57d4549

Implement getFirefoxBrowser method

refactor(config): rename firefox_exe_path to firefox_path

9703e86

chore(docs): googlesearch

e098d3f

sundowndev approved these changes Dec 22, 2019

sundowndev merged commit f332ff8 into sundowndev:develop Dec 22, 2019
