It provides customized GitHub repo recommendations based on your previous activity on GitHub. It works by pulling a large amount of public GitHub user data from the GitHub public events API, modeling each repo with model-based collaborative filtering, and serving customized recommendations.
The code has five main parts:

Component | Deployed to | Tools | Description |
---|---|---|---|
Gathering Data | AWS EC2 | Node.js | Queries the GitHub API for public activity |
Modeling | AWS EMR | Python, Spark | Trains the collaborative filtering model on the gathered data |
Predicting | AWS Lambda | Python, Serverless | KNN over repo feature vectors with NumPy |
API Hosting | AWS Lambda | TypeScript, Node.js, Serverless | General web API services, such as OAuth and user authentication |
UI | GitHub Pages / S3 | React, CSS, Semantic UI | Front-end user interface |
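
To give a rough idea of what the Predicting step does, here is a minimal sketch of KNN over repo feature vectors with NumPy. It is not the project's actual code: the function name `top_k_similar`, the use of cosine similarity as the distance metric, and the toy vectors are all assumptions made for illustration.

```python
import numpy as np


def top_k_similar(repo_vectors, query_index, k=3):
    """Return indices of the k repos closest to repo_vectors[query_index].

    Hypothetical helper: cosine similarity is an assumed metric here,
    not necessarily the one the project uses.
    """
    vectors = np.asarray(repo_vectors, dtype=float)
    # Normalize each row so a dot product equals cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    sims = unit @ unit[query_index]
    # Exclude the query repo itself from its own neighbor list.
    sims[query_index] = -np.inf
    # Sort by descending similarity and keep the first k.
    return np.argsort(-sims)[:k]


# Toy feature vectors (made up), one row per repo.
vectors = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])
neighbors = top_k_similar(vectors, query_index=0, k=2)
print(neighbors)
```

In practice the feature vectors would presumably come from the trained collaborative filtering model, and a full sort may be too slow for a large catalog, so something like `np.argpartition` could replace `np.argsort`.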