This project simulates how a web search engine crawls the web by following links from page to page. Real search engines use sophisticated algorithms to crawl efficiently. For example, they do not follow every link equally often; content that changes frequently is revisited more often.
This project demonstrates how to crawl a web site using socket programming, which is fundamental to writing internet applications, and the HTTP application-layer protocol.
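
As a rough illustration of how socket programming and HTTP fit together, the sketch below builds a minimal HTTP/1.1 GET request and sends it over a TCP socket. It is not this repository's actual code: the function names (build_get_request, connect_to_host, fetch_page), the buffer sizes, port 80, and the Connection: close header are all assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: format a minimal HTTP/1.1 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(const char *host, const char *path,
                      char *buf, size_t buflen) {
    assert(host != NULL && path != NULL && buf != NULL);
    int n = snprintf(buf, buflen,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}

/* Hypothetical helper: open a TCP connection to host on port 80
 * (assumed here). Returns a socket descriptor, or -1 on failure. */
int connect_to_host(const char *host) {
    struct addrinfo hints, *res, *p;
    int fd = -1;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, "80", &hints, &res) != 0) return -1;
    /* Try each resolved address until one accepts the connection. */
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* Hypothetical helper: send a GET request and copy the raw response
 * (headers and body) to stdout. Returns 0 on success, -1 on error. */
int fetch_page(const char *host, const char *path) {
    char req[1024], buf[4096];
    int len = build_get_request(host, path, req, sizeof req);
    if (len < 0) return -1;
    int fd = connect_to_host(host);
    if (fd < 0) return -1;
    /* send() may write fewer bytes than requested, so loop. */
    ssize_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, req + sent, (size_t)(len - sent), 0);
        if (n <= 0) { close(fd); return -1; }
        sent += n;
    }
    /* With Connection: close, the server closes the socket when done. */
    ssize_t n;
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(fd);
    return n < 0 ? -1 : 0;
}
```

Calling fetch_page("web1.comp30023", "/") would print the server's raw response, assuming the host is reachable. A crawler would then parse that response for links and repeat the request for each new page.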
To run the program:
- Clone this repository, then compile the program in a Linux terminal using the makefile: make
- Run the program using the executable name crawler followed by the URL. Example: crawler http://web1.comp30023
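
Before connecting, the URL given on the command line has to be separated into a host name and a path. The sketch below shows one way this could be done. The function name split_url is hypothetical and not taken from this repository, and the code assumes a lowercase http:// scheme with no port number, query string, or fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "http://host/path" into host and path.
 * A URL with no path (such as http://web1.comp30023) yields path "/".
 * Returns 0 on success, -1 on an unsupported URL or a too-small buffer. */
int split_url(const char *url, char *host, size_t hostlen,
              char *path, size_t pathlen) {
    assert(url != NULL && host != NULL && path != NULL);
    const char *prefix = "http://";
    size_t plen = strlen(prefix);
    if (strncmp(url, prefix, plen) != 0) return -1;

    /* The host runs from the end of the scheme to the first '/'. */
    const char *start = url + plen;
    const char *slash = strchr(start, '/');
    size_t hlen = slash ? (size_t)(slash - start) : strlen(start);
    if (hlen == 0 || hlen >= hostlen) return -1;
    memcpy(host, start, hlen);
    host[hlen] = '\0';

    /* Everything from the first '/' onward is the path. */
    const char *p = slash ? slash : "/";
    if (strlen(p) >= pathlen) return -1;
    strcpy(path, p);
    return 0;
}
```

For the example URL above, this sketch would produce the host web1.comp30023 and the path /, which are the two values an HTTP GET request needs.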