In computer science, the Aho–Corasick algorithm is a string searching algorithm invented by Alfred V. Aho and Margaret J. Corasick.[1] It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings (the "dictionary") within an input text. It matches all strings simultaneously. The complexity of the algorithm is linear in the length of the strings plus the length of the searched text plus the number of output matches. Note that because all matches are found, there can be a quadratic number of matches if every substring matches (e.g. dictionary = a, aa, aaa, aaaa and input string is aaaa).
(Source: Wikipedia)
See this blog for a more detailed explanation (written in Chinese).
go get github.com/kkdai/aca
package main

import (
	"fmt"

	. "github.com/kkdai/aca"
)

func main() {
	ac := NewACA()
	ac.Insert("say")
	ac.Insert("she")
	ac.Insert("shell")
	ac.Insert("shr")
	ac.Insert("her")
	ac.BuildAC()
	fmt.Println(ac.Query("aaashellaashrmmmmmhemmhera"))
	// [shell, shr, her]
}
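- A detailed summary of the principles behind skip lists, tries (word search trees), suffix trees, the KMP algorithm, and the AC automaton (in Chinese)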
- Biosequence Algorithms, Spring 2005 Lecture 4: Set Matching and Aho-Corasick Algorithm
- Wiki: Aho–Corasick algorithm
This package is part of my Project 52.
This package is licensed under the MIT license. See LICENSE for details.