Skip to content

BlackRabbitt/mspm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Multi-String Pattern Matching algorithm.

Go Report Card GoDoc

This implementation is inspired from Aho-Corasick algorithm

Getting Started

modelA = mspm.NewModel("mspm_model_A")
patternsToSearch = strings.NewReader(words) // words is newline seperated list of words/keywords

// build mspm model with patterns
modelA.Build(patternsToSearch)

inputString := "input document where the above pattern is searched for."
document := strings.NewReader(inputString)
output, err := modelA.MultiTermMatch(document)

// output ~= [{matched_word: n_count}, ..]

Test Coverage

TrieNode vs TrieHashNode benchmark

$ cd github.com/BlackRabbitt/mspm/ds/trie
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/BlackRabbitt/mspm/ds/trie
BenchmarkTrieNodeInsert-4             500000          2582 ns/op
BenchmarkTrieNodeSearch-4           10000000           205 ns/op
BenchmarkTrieHashNodeInsert-4        1000000          1491 ns/op
BenchmarkTrieHashNodeSearch-4       10000000           206 ns/op
PASS
ok      github.com/BlackRabbitt/mspm/ds/trie	8.795s

Resources

  1. Trie
  2. mspm - Multi-String Pattern Matching algorithm. Generally used for Information Retrieval.
  3. Aho-Corasick algorithm

About

Multi-String Pattern Matching Algorithm Using TrieNode

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages