This repository has been archived by the owner on Apr 4, 2022. It is now read-only.

Information retrieval models using an inverted index

hrwX/pyIR

pyIR: Collection of Information Retrieval algorithms

A collection of algorithms for querying a set of documents and returning the ones most relevant to the query.

The algorithms that have been implemented are:

  • Vector Space Model
  • Best Match 25
  • Unigram Language Model using Jelinek Mercer Smoothing
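For readers unfamiliar with these models, here is a minimal sketch of how the three scoring schemes can be computed over an inverted index. It is not pyIR's actual API: the function names (`build_index`, `vector_space`, `bm25`, `jelinek_mercer`), the whitespace tokenizer, the parameter defaults, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}, plus doc lengths."""
    index = defaultdict(dict)
    lengths = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def _ranked(scores):
    """Sort (doc_id, score) pairs from best to worst."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def vector_space(query, index, lengths):
    """Rank by cosine similarity between TF-IDF vectors of query and document."""
    n_docs = len(lengths)
    idf = {t: math.log(n_docs / len(p)) for t, p in index.items()}
    norms = defaultdict(float)  # squared length of each document vector
    for term, postings in index.items():
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * idf[term]) ** 2
    q_tf = Counter(query.lower().split())
    q_norm = math.sqrt(sum((tf * idf.get(t, 0.0)) ** 2 for t, tf in q_tf.items()))
    dots = defaultdict(float)
    for term, qtf in q_tf.items():
        for doc_id, tf in index.get(term, {}).items():
            dots[doc_id] += (qtf * idf[term]) * (tf * idf[term])
    scores = {d: dot / (math.sqrt(norms[d]) * q_norm)
              for d, dot in dots.items() if norms[d] > 0 and q_norm > 0}
    return _ranked(scores)


def bm25(query, index, lengths, k1=1.2, b=0.75):
    """Rank by Okapi BM25; k1 and b are common defaults, not values from pyIR."""
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        idf = math.log(1 + (n_docs - len(postings) + 0.5) / (len(postings) + 0.5))
        for doc_id, tf in postings.items():
            denom = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / denom
    return _ranked(scores)


def jelinek_mercer(query, index, lengths, lam=0.5):
    """Rank by query log-likelihood under a unigram model with JM smoothing.

    P(t|d) = (1 - lam) * tf(t, d) / |d| + lam * cf(t) / |C|
    """
    collection_len = sum(lengths.values())
    scores = {}
    for doc_id, doc_len in lengths.items():
        log_p = 0.0
        for term in query.lower().split():
            postings = index.get(term, {})
            coll_tf = sum(postings.values())
            if coll_tf == 0:
                continue  # skip terms unseen in the whole collection
            p_doc = postings.get(doc_id, 0) / doc_len if doc_len else 0.0
            p_coll = coll_tf / collection_len
            log_p += math.log((1 - lam) * p_doc + lam * p_coll)
        scores[doc_id] = log_p
    return _ranked(scores)


# Toy collection (made up for this example).
docs = {
    "d1": "wing flow at high speed",
    "d2": "heat transfer in boundary layer flow",
    "d3": "structural analysis of plates",
}
index, lengths = build_index(docs)
for model in (vector_space, bm25, jelinek_mercer):
    print(model.__name__, [doc_id for doc_id, _ in model("wing flow", index, lengths)])
```

The vector space and BM25 sketches only visit postings for the query terms, so documents sharing no term with the query are never scored. The language model sketch scores every document, because smoothing gives unmatched documents a non-zero probability.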

Installation

To be sure you're getting the newest version, install it directly from GitHub with pip:

pip install git+ssh://git@github.com/hrwx/pyIR.git

TREC

The algorithms were implemented primarily to run evaluations using the TREC Cranfield collection. The TREC evaluation can be run from the evaluate.py file.
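The README does not say which metrics evaluate.py reports. As an illustration only, the sketch below computes mean average precision, a measure commonly used in TREC-style evaluations, from ranked results and relevance judgments. The function names and the dictionary-based input format are hypothetical and are not taken from pyIR.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision values at each rank where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs, qrels):
    """Mean of per-query average precision.

    runs:  {query_id: [doc_id, ...]} ranked best first (hypothetical format)
    qrels: {query_id: set of relevant doc_ids} (hypothetical format)
    """
    scores = [average_precision(runs.get(qid, []), rel) for qid, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0
```

A query with no retrieved documents contributes an average precision of 0, so missing runs lower the mean rather than being silently ignored.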