Wukong Full-Text Search Engine

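Efficient indexing and search (1M Weibo posts, 500 MB of data, indexed in 28 seconds; 1.65 ms search response time; 19K search QPS)
Chinese word segmentation (uses the sego segmentation package to segment concurrently, at 27 MB/s)
Computes the token proximity of keywords within the text
BM25 relevance scoring
Custom scoring fields and scoring rules
Adding and deleting index entries online
Persistent storage
Can be used to build distributed indexing and search
Released under the commercial-friendly Apache License v2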
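
To make the BM25 relevance feature above more concrete, here is a minimal, self-contained sketch of the standard BM25 formula. It is not Wukong's implementation: the function names (`idf`, `bm25`, `scoreCorpus`), the parameter values `k1 = 2.0` and `b = 0.75`, the smoothed IDF variant, and the hand-tokenized documents are all assumptions made for illustration. In the real engine, tokens would come from the sego segmenter.

```go
package main

import (
	"fmt"
	"math"
)

// Assumed BM25 parameters; these are common defaults, not values read from Wukong.
const (
	k1 = 2.0
	b  = 0.75
)

// idf returns a smoothed inverse document frequency that is always positive.
func idf(numDocs, docFreq int) float64 {
	return math.Log(1 + (float64(numDocs)-float64(docFreq)+0.5)/(float64(docFreq)+0.5))
}

// bm25 scores one tokenized document against the query tokens.
func bm25(query, doc []string, docFreq map[string]int, numDocs int, avgDocLen float64) float64 {
	tf := make(map[string]int)
	for _, t := range doc {
		tf[t]++
	}
	// Length normalization: longer-than-average documents are penalized.
	norm := k1 * (1 - b + b*float64(len(doc))/avgDocLen)
	score := 0.0
	for _, q := range query {
		f := float64(tf[q])
		if f == 0 {
			continue
		}
		score += idf(numDocs, docFreq[q]) * f * (k1 + 1) / (f + norm)
	}
	return score
}

// scoreCorpus derives document frequencies and the average length from docs,
// then returns one BM25 score per document, in input order.
func scoreCorpus(query []string, docs [][]string) []float64 {
	docFreq := make(map[string]int)
	totalLen := 0
	for _, doc := range docs {
		totalLen += len(doc)
		seen := make(map[string]bool)
		for _, t := range doc {
			if !seen[t] {
				seen[t] = true
				docFreq[t]++
			}
		}
	}
	scores := make([]float64, len(docs))
	if len(docs) == 0 {
		return scores
	}
	avgDocLen := float64(totalLen) / float64(len(docs))
	for i, doc := range docs {
		scores[i] = bm25(query, doc, docFreq, len(docs), avgDocLen)
	}
	return scores
}

func main() {
	// Hand-tokenized for illustration; a real segmenter may split differently.
	docs := [][]string{
		{"此次", "百度", "收购", "将", "成", "中国", "互联网", "最大", "并购"},
		{"百度", "宣布", "拟", "全资", "收购", "91", "无线", "业务"},
		{"百度", "是", "中国", "最大", "的", "搜索引擎"},
	}
	for i, s := range scoreCorpus([]string{"百度", "中国"}, docs) {
		fmt.Printf("doc %d: %.3f\n", i+1, s)
	}
}
```

With this toy corpus, the two documents that contain both query tokens outrank the one that contains only "百度", and the shorter of the two ranks first because of length normalization.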

Install/Update

go get -u -v github.com/huichen/wukong

Requires Go 1.1.1 or later.

Usage

Let's start with an example (from examples/simplest_example.go):

package main

import (
	"github.com/huichen/wukong/engine"
	"github.com/huichen/wukong/types"
	"log"
)

var (
	// searcher is goroutine-safe
	searcher = engine.Engine{}
)

func main() {
	// Initialize the engine
	searcher.Init(types.EngineInitOptions{
		SegmenterDictionaries: "github.com/huichen/wukong/data/dictionary.txt"})
	defer searcher.Close()

	// Add documents to the index; docId starts at 1
	searcher.IndexDocument(1, types.DocumentIndexData{Content: "此次百度收购将成中国互联网最大并购"}, false)
	searcher.IndexDocument(2, types.DocumentIndexData{Content: "百度宣布拟全资收购91无线业务"}, false)
	searcher.IndexDocument(3, types.DocumentIndexData{Content: "百度是中国最大的搜索引擎"}, false)

	// Wait for the index to finish flushing
	searcher.FlushIndex()

	// See the types.SearchResponse struct for the output format
	log.Print(searcher.Search(types.SearchRequest{Text: "百度中国"}))
}

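Simple, isn't it?

Next, take a look at the getting-started tutorial, which shows how to build a Weibo search site in fewer than 200 lines of Go.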
