Skip to content
This repository has been archived by the owner on May 1, 2019. It is now read-only.

Strategies to avoid hitting the rate limit? #349

Closed
adamlundrigan opened this issue Jan 28, 2015 · 21 comments
Closed

Strategies to avoid hitting the rate limit? #349

adamlundrigan opened this issue Jan 28, 2015 · 21 comments
Labels

Comments

@adamlundrigan
Copy link
Contributor

At the very least we should be caching the result of API calls used to generate the module view page. A better solution might be to follow Packagist's lead: indefinitely cache the initial result during module add, then listen for repository changes via a webhook handler. Of course, this will require us to instruct the module owner on how to enable it.

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

allready implement as the github client uses etags to avoid increasing the current api limit if a repository content did not changed.

https://developer.github.com/v3/#conditional-requests
https://github.com/EvanDotPro/EdpGithub/blob/master/src/EdpGithub/Listener/Cache.php#L42

authenticated request to github had a rate limit of 5000 calls per hour and 50 for unauthenticated requests. i would wonder if we hit this rate limit the next time again after we use this caching and now authenticated requests on production. no?

@adamlundrigan
Copy link
Contributor Author

I haven't looked into it too closely yet, but I set up my local dev env, logged in, and tried to sync my modules and after a sizable wait received the "API limit has been exceeded" message. My GitHub account has 56 repositories.

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

now i got your point, there is one request to collect all the user repositorys but the old function https://github.com/zendframework/modules.zendframework.com/blob/master/module/ZfModule/src/ZfModule/Service/Module.php#L71 will use one api call against every single github repository in order to filter only zf2 modules.

i would love to see this check moved to the register process where the check will be executed. or we need to tell module owners how to name there repositorys in order to display it in the module listening eg. zf2- reponame. of course we could also save user repositorys in the db or other storage/cache

also if a user only access the user panel the ajax request will fired and force to load this heavy request machine.

other suggestions?

@adamlundrigan
Copy link
Contributor Author

One optimization could be to use the search instead of loading a list of all the user's repositories. As an example, with this query:

user:adamlundrigan filename:Module.php "class Module"

I get only the repositories that match our criteria for being a module (see my result). This also works for organization names.

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

@adamlundrigan awesome 👍 like

@adamlundrigan
Copy link
Contributor Author

Something's fishy with the rate limit on my local clone. Invoking my awesome powers of VDD I added this to EdpGithub\Listener\Cache::postSend:

var_dump($response->getHeaders()->get('X-RateLimit-Remaining')->getFieldValue());

And this is the progression of X-RateLimit-Remaining when I try to sync my repositories:

image

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

woot call decreases the rate limit from 4992 to 29 😨
VDD made my day - thanks 😆

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

ok now it makes sense the search api had a rate limit from only 30 requests per minute. so i think there is no other option as to leave the search api and looking for a different way to display modules - simple hit url field?

@adamlundrigan
Copy link
Contributor Author

Ah yes, a different rate limit for searches would explain that. I agree that it may be best to take the same approach as Packagist: enter your repository URL to add a module to the site (this would also make supporting sources other than GitHub easier as well). If we do add a webhook receiver to catch updates from GitHub we could have it import the module data if the module doesn't already exist on the site.

If nobody is currently working on refactoring/rewriting the module import and sync I can work on implementing the repository url and webhook method.

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

👍

@gianarb
Copy link
Contributor

gianarb commented Jan 28, 2015

@adamlundrigan can you try if you authenticate the request the problem of rate persist? 🔍
Thanks..

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

@gianarb with unauthenticated requests the limit will drop to 5 requests/minute

@gianarb
Copy link
Contributor

gianarb commented Jan 28, 2015

Ok, thanks.. Sorry :)

@ins0 ins0 mentioned this issue Jan 28, 2015
3 tasks
@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

😄 as fiddler pops up the response headers i was thinking this is a bad joke and unauthenticated too ^^ but unfortunately not :)

@gianarb
Copy link
Contributor

gianarb commented Jan 28, 2015

I don't know but I'm building a little application to manage Github PR status.
Login from Github and resume all my repos (103) and I have not this problem..
After the first request (list of repositories) I save this list into the database and exist a button to resume it from github.. mmm

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

the search api make the problem as it had a different (and looooowwer) rate limit as the repository api.

main problem is this zf2 module check function

$response = $this->githubClient->getHttpClient()->request('search/code?q=' . $query);

@gianarb
Copy link
Contributor

gianarb commented Jan 28, 2015

Ok.. Maybe we can change actual flow..
We can check module.php if he tries to enable repository

@ins0
Copy link
Contributor

ins0 commented Jan 28, 2015

#349 (comment) - i like @adamlundrigan suggestion like packagist handels the "add repository" flow but some name standards for repository names would have positive aspects too without forced to look at the repository file content.

clearly identify a repository for zf2- or in the future for zf3- and users would see ontop "this is a zfX- module i can use for zf" e.g. when browsing github

/cc @localheinz @Ocramius @GeeH

@adamlundrigan
Copy link
Contributor Author

Perhaps we could support setting the Composer package type to "zf2-module"? (see Composer docs, here). We would still need to fallback to searching for Module class for all those packages that don't have the type set properly. It also doesn't help the rate limit issue, but following a Packagist-like module add procedure should negate those issues anyway

@adamlundrigan
Copy link
Contributor Author

I've moved the discussion around the module add procedure to #352

@GeeH
Copy link
Contributor

GeeH commented Mar 5, 2015

I believe this is fixed. I added authentication. If it's not please re-open.

@GeeH GeeH closed this as completed Mar 5, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants