AUT Information Retrieval Course Projects
- Preprocessing -- normalizing, tokenizing, stemming, removing stop words
- Creating the positional index
- Query processing -- handling the "and", "or", and "not" parts of a query
- Zipf's and Heaps' laws
- TF-IDF -- term frequency, inverse document frequency
- Similarity metrics -- cosine similarity
- Index elimination -- champion lists
- Elasticsearch
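
As a rough illustration of the positional index and the "and" / "or" / "not" query processing listed above, here is a minimal Python sketch. It is not the course's actual implementation: the function names (`build_positional_index`, `docs_with`, `boolean_query`), the sample documents, and the lowercase whitespace tokenizer are all assumptions made for the example, and stemming and stop-word removal are left out.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}.

    Assumption: tokens are produced by lowercasing and splitting on
    whitespace; a real pipeline would also normalize and stem.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(pos)
    return index


def docs_with(index, term):
    """Return the set of document ids that contain the term."""
    return set(index.get(term.lower(), {}))


def boolean_query(index, all_ids, must=(), should=(), must_not=()):
    """Evaluate a simple boolean query (hypothetical helper).

    must     -> "and" terms: every one has to occur in the document
    should   -> "or" terms: at least one has to occur (if any are given)
    must_not -> "not" terms: none may occur
    """
    result = set(all_ids)
    for term in must:
        result &= docs_with(index, term)
    if should:
        any_match = set()
        for term in should:
            any_match |= docs_with(index, term)
        result &= any_match
    for term in must_not:
        result -= docs_with(index, term)
    return result


# Made-up sample collection, for illustration only.
docs = {
    1: "information retrieval course",
    2: "retrieval of web pages",
    3: "course projects on web search",
}
index = build_positional_index(docs)

# Documents that contain "retrieval" but not "web".
print(boolean_query(index, docs, must=["retrieval"], must_not=["web"]))
```

The stored positions are not needed for plain boolean queries; they are kept because a positional index is what makes phrase queries possible later on.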