Skip to content

shib71/latin-stemming

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

latin-stemming

Functions for getting the stems for Latin words. Based on the Schinke algorithm. The key thing to note about this approach is that it can be applied to words when it doesn't know what language part (e.g. noun vs verb) it is. In those cases it will usually return more than one possible stem.

var stemming = require("latin-stemming");

Constants

The module uses several hard-coded arrays for lookups and replacements. You can access them yourself via:

  • stemming.quewords => an array of -que words that are atomic and NOT 'and'
  • stemming.nounsuffixes => an array of regexes for matching noun suffixes
  • stemming.verbsuffixes => an array of regexes for matching verb suffixes, and what those suffixes should be replaced with in the stem

Functions

stemming.stem(word, config) // => []

Word, I hope, is self explanatory. Config is a struct that can contain several optional values:

  • quewords - override the default quewords list provided by the module
  • nounsuffixes - override the default noun suffix regexes
  • verbsuffixes - override the default verb suffixes and replacements
  • type - if this is "Noun", "Adjective", "Adverb", or "Verb", the stemmer will only apply the relevent stemming rules

stemming.couchkey(config) // => Function

Returns a contextless function that can be used for CouchDB indexes. Config can contain several optional values:

  • condition - string to be used inside an if (...) statement; if you include this, documents will only be processed if they pass the condition
  • wordkey (defaults to 'word') - the document property containing the word to be stemmed
  • typekey (defaults to 'wordclass') - the document property that contains language part of the word (e.g. Noun, Verb); if the property is not available, keys for both verb and noun interpretations will be emitted
  • quewords - override the default quewords list provided by the module
  • nounsuffixes - override the default noun suffix regexes
  • verbsuffixes - override the default verb suffixes and replacements

About

Functions for getting and comparing Latin word stems.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published