This repository has been archived by the owner on Mar 23, 2021. It is now read-only.

Google(mini)

Third project for the Modern Information Retrieval course.

Crawls ResearchGate and indexes the papers found on the site. Clusters papers and authors, and calculates a rank for each paper based on its citations and references.

The crawler is written from scratch. Indexing and retrieval are done with Elasticsearch 2.1, the web interface is powered by Flask and Bootstrap, and NumPy does most of the heavy lifting in the ranking and clustering calculations.
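To illustrate the ranking step, here is a minimal sketch of PageRank by power iteration over a citation graph using NumPy. It is not the code from this repository: the function name `pagerank`, the input format (a dict mapping each paper id to the ids it cites), the damping factor of 0.85, and the handling of papers without references are all assumptions made for the example.

```python
import numpy as np


def pagerank(citations, damping=0.85, tol=1e-10, max_iter=200):
    """Rank papers by power iteration over a citation graph.

    `citations` maps each paper id to the list of paper ids it cites.
    The damping factor 0.85 is an assumed, commonly used default.
    """
    papers = sorted(
        set(citations) | {c for cited in citations.values() for c in cited}
    )
    n = len(papers)
    if n == 0:
        return {}
    index = {paper: i for i, paper in enumerate(papers)}

    # Column j of the matrix spreads paper j's score evenly
    # over the papers it cites.
    transition = np.zeros((n, n))
    for paper, cited in citations.items():
        for target in set(cited):
            transition[index[target], index[paper]] = 1.0
    out_degree = transition.sum(axis=0)
    dangling = out_degree == 0
    transition[:, ~dangling] /= out_degree[~dangling]
    # Assumption: papers that cite nothing share their score uniformly.
    transition[:, dangling] = 1.0 / n

    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (transition @ rank) + (1.0 - damping) / n
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return dict(zip(papers, rank.tolist()))


if __name__ == "__main__":
    # Hypothetical toy graph: C is cited by A, B and D.
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for paper, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(paper, round(score, 4))
```

The scores sum to 1, so they can be compared directly across papers; a paper that nobody cites ends up with the minimum score of `(1 - damping) / n`.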

How to use

Install the requirements listed in requirements.txt, ideally inside a Python virtual environment, then start the web interface:

pip install -r requirements.txt
python ui/ui.py  # requires python3.4 or higher

Then open http://127.0.0.1:5000/admin/ in your browser and run the steps in order: crawl, calculate page ranks, perform clustering, and finally add the documents to the index. Your mini version of Google is now ready to use: just point your browser to http://127.0.0.1:5000/search.

Note: You should set up Elasticsearch before adding documents to the index. For more information read here
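Before adding documents, it can help to confirm that the Elasticsearch node is actually reachable. The following is a minimal sketch using only the standard library. It assumes the node listens on Elasticsearch's default HTTP address, http://localhost:9200, which may differ from this project's configuration; `elasticsearch_version` is a hypothetical helper and not part of this repository.

```python
import json
import urllib.request


def elasticsearch_version(url="http://localhost:9200", timeout=2.0):
    """Return the version string reported by the node, or None.

    Assumes `url` is the node's HTTP root endpoint, which answers
    with a JSON document containing a "version" object.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            info = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    version = info.get("version") if isinstance(info, dict) else None
    if isinstance(version, dict):
        return version.get("number")
    return None


if __name__ == "__main__":
    found = elasticsearch_version()
    if found is None:
        print("Elasticsearch is not reachable; start it before indexing.")
    else:
        print("Elasticsearch", found, "is running.")
```

If the helper returns `None`, start Elasticsearch first and only then use the admin page to add documents to the index.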

Contributors