Custom Search Engine Assignment

This project was assigned for the "Information Retrieval" course at my university. The goal of the exercise is to build a simple, custom search engine that offers the core functionality of the larger, more complex search engines on the market.

Installation

pip3 install -r requirements.txt
docker-compose up

Execution

Begin by executing the crawler: python3 crawler.py [starting_url] [pages_to_crawl] [append_pages] [number_of_threads]

Where:
- starting_url: The URL of the page from which the crawler begins crawling.
- pages_to_crawl: The number of pages the crawler should scan.
- append_pages: Determines whether any page data collected from previous crawls should be deleted (append_pages=0) or new page data from the current crawl should be appended to any existing page data (append_pages=1).
- number_of_threads: The number of threads that should be used during the crawler's execution.

Example: python3 crawler.py https://en.wikipedia.org/wiki/Apage 100 0 8
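
The four positional arguments above could be validated with argparse. This is only a minimal sketch: the README does not show how the crawler actually reads its arguments, and the function name parse_crawler_args is hypothetical.

```python
import argparse


def parse_crawler_args(argv=None):
    """Parse the four positional crawler arguments described above."""
    parser = argparse.ArgumentParser(prog="crawler.py")
    parser.add_argument("starting_url", help="URL where crawling begins")
    parser.add_argument("pages_to_crawl", type=int, help="number of pages to scan")
    # 0 = delete page data from earlier crawls, 1 = append to it
    parser.add_argument("append_pages", type=int, choices=(0, 1))
    parser.add_argument("number_of_threads", type=int, help="worker thread count")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Same values as the example command above.
    args = parse_crawler_args(
        ["https://en.wikipedia.org/wiki/Apage", "100", "0", "8"]
    )
    print(args.starting_url, args.pages_to_crawl,
          args.append_pages, args.number_of_threads)
```

Restricting append_pages to 0 or 1 makes argparse reject any other value with a usage message before crawling starts.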

Once crawling is complete, the crawler automatically calls the indexer to build the index.
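
The README does not describe what the indexer builds. Assuming it produces an inverted index (a common choice for search engines), the idea can be sketched as follows. The function name build_inverted_index, the tokenizer, and the in-memory dictionary are illustrative assumptions, not the project's actual code or storage layout.

```python
import re
from collections import defaultdict


def build_inverted_index(pages):
    """Map each term to a {page_url: term_frequency} dictionary.

    `pages` is assumed to be a dict of page URL -> extracted page text.
    """
    index = defaultdict(dict)
    for url, text in pages.items():
        # Naive tokenizer (an assumption): lowercase alphanumeric runs.
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term][url] = index[term].get(url, 0) + 1
    return dict(index)


if __name__ == "__main__":
    sample_pages = {
        "page-a": "Search engine search",
        "page-b": "Engine",
    }
    for term, postings in sorted(build_inverted_index(sample_pages).items()):
        print(term, postings)
```

Looking up a query term in such an index returns the pages that contain it, along with how often the term occurs on each page.
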
Then start the Flask server to access the search engine's user interface on localhost: python3 app.py [number_of_threads]

Where number_of_threads is the number of threads the query handler should use (the query handler is initialized inside app.py).

Example: python3 app.py 8

Tip:

To visualize MongoDB and inspect its contents easily, try MongoDB Compass.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Custom Search Engine Assignment

Installation

Execution

Tip:

About

Releases

Packages

Contributors 2

Languages

Fantomas4/Custom-Search-Engine-Assignment

Folders and files

Latest commit

History

Repository files navigation

Custom Search Engine Assignment

Installation

Execution

Tip:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages