
Some information retrieval algorithms and data structures (inverted index, ranking with BM25, tf and idf scores, fuzzy search, ...)


raphsenn/info-retrieval-notebooks


info-retrieval-notebooks

  • Designed for viewing on GitHub.

Implemented Algorithms and Data Structures

search

  • InvertedIndex

  • InvertedIndex (via vector space model, linear algebra, sparse matrices)

  • Similarity search (via cosine similarity)

  • Fuzzy string search

  • Ranking and evaluation
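
To illustrate how an inverted index and BM25 ranking fit together, here is a minimal sketch. It is not taken from the notebooks: the function names (`tokenize`, `build_index`, `bm25_search`), the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the three sample documents are all assumptions made for this example, and the implementations in the repository may differ.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to a list of (doc_id, term_frequency) postings."""
    index = defaultdict(list)
    doc_lengths = []
    for doc_id, text in enumerate(docs):
        tokens = tokenize(text)
        doc_lengths.append(len(tokens))
        for term, tf in Counter(tokens).items():
            index[term].append((doc_id, tf))
    return index, doc_lengths


def bm25_search(query, index, doc_lengths, k1=1.5, b=0.75):
    """Return (doc_id, score) pairs sorted by descending BM25 score."""
    n_docs = len(doc_lengths)
    avg_len = sum(doc_lengths) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, [])
        if not postings:
            continue  # term does not occur in any document
        df = len(postings)
        # Smoothed IDF variant that stays positive even for common terms.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings:
            # Length normalization: longer documents are penalized via b.
            norm = k1 * (1 - b + b * doc_lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up sample documents, used only for illustration.
docs = [
    "the godfather crime drama",
    "the dark knight action crime",
    "toy story animation comedy",
]
index, doc_lengths = build_index(docs)
results = bm25_search("crime drama", index, doc_lengths)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

Document 0 matches both query terms and therefore ranks ahead of document 1, which matches only "crime"; document 2 matches neither and is not returned. Storing term frequencies in the postings keeps scoring to a single pass over the lists of the query terms.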

databases

  • Basic database operations (project, select, cartesian product)

  • More database operations (equi join, merge join, hash join, group by)

  • SPARQL to SQL algorithm

  • SQL to SPARQL algorithm
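
The relational operations listed above can be sketched on plain Python tuples. This is a hedged illustration, not the notebooks' code: the function names (`select`, `project`, `hash_join`), the choice to address columns by position, and the two tiny sample relations are assumptions made for this example.

```python
from collections import defaultdict


def select(relation, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Keep only the given column positions, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[c] for c in columns)
        if projected not in seen:
            seen.add(projected)
            result.append(projected)
    return result


def hash_join(left, right, left_col, right_col):
    """Equi join: build a hash table on `left`, then probe it with `right`."""
    buckets = defaultdict(list)
    for row in left:
        buckets[row[left_col]].append(row)
    joined = []
    for row in right:
        for match in buckets.get(row[right_col], []):
            joined.append(match + row)
    return joined


# Made-up sample relations: (movie_id, title, year) and (movie_id, votes).
movies = [(1, "Movie A", 1979), (2, "Movie B", 1995)]
votes = [(1, 120), (2, 340), (3, 15)]

joined = hash_join(movies, votes, 0, 0)
recent = select(joined, lambda row: row[2] > 1990)
titles = project(recent, [1])
print(titles)
```

The build phase touches each row of `left` once and the probe phase touches each row of `right` once, so the join runs in roughly linear time plus the size of the output, whereas a cartesian product followed by a selection compares every pair of rows.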

Used datasets

IMDB movies dataset

https://www.kaggle.com/datasets/ashpalsingh1525/imdb-movies-dataset
