Term Frequency-Inverse Document Frequency

This package provides utilities for calculating tf-idf over a set of documents. A document is a bag of terms; what counts as a term is left to the caller.
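As an illustration of the idea (not the package's actual API), here is a minimal Python sketch of tf-idf over bags of terms. The function name `tf_idf` and the weighting scheme, raw term count times the natural log of N / df, are assumptions; the package may use a different variant. Terms can be any hashable value, which mirrors leaving the definition of a term to the caller.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting (not taken from the package):
    weight = raw term count * ln(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: how many documents contain each term at least once.
    df = Counter()
    for doc in documents:
        df.update(set(doc))
    return [
        {term: count * math.log(n_docs / df[term])
         for term, count in Counter(doc).items()}
        for doc in documents
    ]


# Hypothetical toy input: two documents, each a list of terms.
docs = [["a", "b", "a"], ["b", "c"]]
weights = tf_idf(docs)
```

Under this weighting, a term that appears in every document (here `"b"`) gets weight zero, while terms confined to one document are scaled by how often they occur in it.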

The example program NgramTfIdf calculates tf-idf with n-grams as the terms. It takes a single file as an argument and treats each line of that file as a separate document.
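The line-per-document step can be sketched as follows. This is a hypothetical Python illustration, not the NgramTfIdf source: the helper names `ngrams` and `line_documents`, the use of whitespace-separated word n-grams, and the skipping of blank lines are all assumptions, since the description does not say how lines are tokenized.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def line_documents(lines, n=2):
    """Map each non-blank line to a bag (Counter) of word n-grams.

    Assumption: tokens are whitespace-separated words and blank lines are skipped.
    """
    return [Counter(ngrams(line.split(), n)) for line in lines if line.strip()]


# Any iterable of lines works, including an open file object.
bags = line_documents(["the cat sat", "the cat ran away\n"], n=2)
```

Each resulting bag is one document whose terms are n-gram tuples, ready to be weighted by whatever tf-idf variant the caller chooses.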