
Infix, Prefix, Star Search

Gilbert Jeiziner edited this page Sep 12, 2013 · 23 revisions

Our version of SphinxSearch lets us configure infix and prefix indexing separately for each index. Infix indexing suits most of our indices, but for the quad_index prefix indexing is enough.

Also, according to the requirements, we set the min_infix_len value to 2.
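
The split described above could be written per index roughly as follows. This is only a sketch, not our actual configuration: `min_infix_len` and `min_prefix_len` are the Sphinx index directives, while the index names (`address_detail`, `address_quad`) and the helper `render_index_settings` are hypothetical. A real index block also needs at least `source` and `path`.

```python
# Hypothetical per-index settings; only the directive names come from Sphinx.
INDEX_SETTINGS = {
    "address_detail": {"min_infix_len": 2},  # detail index: infix search, 2+ characters
    "address_quad": {"min_prefix_len": 1},   # quad index: prefix search only
}


def render_index_settings(name, settings):
    """Render the directives of one index as sphinx.conf-style lines."""
    lines = [f"index {name}", "{"]
    for directive, value in sorted(settings.items()):
        lines.append(f"    {directive} = {value}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    for index_name, index_settings in INDEX_SETTINGS.items():
        print(render_index_settings(index_name, index_settings))
```

Keeping the two directives in separate indices mirrors the idea that the detail index gets infix search while the quad index only gets prefix search.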

Impact of infix/prefix/quadindex

I recreated the indices according to the requirements.

parcel: 1.9GB -> 783MB (no quadindex, detail infix=2, took 3 minutes to create)

address: 4.1GB -> 1.6GB (no quadindex, detail infix=2, took 5.5 minutes to create)

ivs-reg-log: 40MB -> 57MB (both quadindex (prefix=1) and detail (infix=2), took 80 seconds to create)

gebaeude_wohnungs_register: 4.1GB -> 208MB (only quadindex (prefix=1), no details, took 55 minutes to create)

All indices together shrank from 20GB to 6.3GB (about 68% smaller).

Compared to RE2

Infix and prefix indexing add considerably to the size of the indices. For example, the address index would be around 100MB without any infix/prefix. Infix indexing also increases the size a lot more than prefix indexing does.
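
A rough way to see why infix costs more than prefix is to count how many extra keywords a single word produces under each setting. The sketch below is only an illustration of that growth, not a model of how Sphinx stores its dictionary on disk. The helper names and the sample word are made up for the example.

```python
def prefixes(word, min_len):
    """All prefixes of `word` with at least `min_len` characters."""
    return {word[:end] for end in range(min_len, len(word) + 1)}


def infixes(word, min_len):
    """All substrings of `word` with at least `min_len` characters."""
    return {
        word[start:end]
        for start in range(len(word))
        for end in range(start + min_len, len(word) + 1)
    }


if __name__ == "__main__":
    word = "bahnhof"  # sample word, chosen only for illustration
    print(len(prefixes(word, 1)), "prefixes with prefix=1")
    print(len(infixes(word, 2)), "infixes with infix=2")
```

The number of prefixes grows linearly with the word length, while the number of infixes grows roughly quadratically, which matches the observation that infix inflates the indices much more than prefix.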

RE2 supported only prefix searches, not infix searches. We could therefore probably reduce the size of our indices further if we decide not to offer infix searches.
