
Handle large indexes better #106

Open

timshannon opened this issue Apr 24, 2020 · 1 comment

timshannon (Owner) commented Apr 24, 2020

See #105

The cost of inserting into an index grows as the number of values an index key refers to grows. This shouldn't be the case: the cost should be the same whether you're inserting your 1st value or your millionth.

Indexes will need to be able to split into separate key ranges as they grow.

This will need a new index storage format, so I'll need to continue to support the old format and use the new one going forward if possible.
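For illustration, here is a rough sketch of the read-modify-write cycle that makes inserts linear in the number of existing values, assuming the current format stores every matching primary key as one encoded list under a single index key. The bucket layout, gob encoding, and names are assumptions, not bolthold's actual code:

```go
package index

import (
	"bytes"
	"encoding/gob"

	bolt "go.etcd.io/bbolt"
)

// addToIndex shows why insert cost grows: every insert decodes,
// appends to, and rewrites the full list of primary keys stored
// under one index key.
func addToIndex(b *bolt.Bucket, indexKey, primaryKey []byte) error {
	var keys [][]byte
	if existing := b.Get(indexKey); existing != nil {
		// Decode the entire list of primary keys under this index value...
		if err := gob.NewDecoder(bytes.NewReader(existing)).Decode(&keys); err != nil {
			return err
		}
	}
	keys = append(keys, primaryKey)

	// ...then re-encode and rewrite the whole list for a single insert,
	// so the cost is O(n) in the number of existing values.
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(keys); err != nil {
		return err
	}
	return b.Put(indexKey, buf.Bytes())
}
```

Splitting an index key into separate key ranges as it grows avoids rewriting this ever-growing list on every insert.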

baryluk commented May 21, 2022

In my own implementation of indexing (which also supports multi-column indexing), for unique indexes I use this format:

key:   table\x00s<INDEXINDEX>\x00<COL1>\x00<COL2>
value: primary_id_value

The value points to table\x00p\x00<ID>, whose value is the actual row / struct.

PS. COLs are serialized in a fixed-width format so everything sorts nicely lexicographically, mostly as raw []bytes or hex strings. Strings require some extra care due to their variable length.
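A minimal sketch of that key layout in Go, assuming a big-endian uint16 for INDEXINDEX and column values already serialized to fixed-width, lexicographically sortable bytes (all names here are illustrative):

```go
package index

import "encoding/binary"

// buildUniqueIndexKey produces table\x00s<INDEXINDEX>\x00<COL1>\x00<COL2>.
func buildUniqueIndexKey(table string, indexIndex uint16, cols ...[]byte) []byte {
	key := append([]byte(table), 0x00, 's')

	// Fixed-width, big-endian index number keeps keys sortable.
	var idx [2]byte
	binary.BigEndian.PutUint16(idx[:], indexIndex)
	key = append(key, idx[:]...)

	for _, col := range cols {
		key = append(key, 0x00)
		key = append(key, col...)
	}
	return key
}
```

The value stored under this key is the primary id, so a unique-index lookup is a single point Get followed by a fetch of table\x00p\x00<ID>.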

I use this form because it is trivial to detect unique constraint collisions without doing a range scan, and badger automatically detects on txn.Commit that two concurrent transactions tried to insert the same secondary unique key. Encoding the primary key id value in the key itself (similar to what is above) would actually be more efficient on lookups when using badger (there should be no difference when using boltdb), but would require far more elaborate uniqueness detection (possibly even a two-phase commit).
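A sketch of that flow, using badger's real transaction API but illustrative helper names (the v4 import path is an assumption):

```go
package index

import (
	"errors"

	badger "github.com/dgraph-io/badger/v4"
)

// insertUnique adds one unique-index entry. A duplicate that already
// exists is caught by the Get; two concurrent transactions inserting
// the same key are caught by badger itself, which fails the later
// Commit with badger.ErrConflict.
func insertUnique(db *badger.DB, indexKey, primaryID []byte) error {
	txn := db.NewTransaction(true)
	defer txn.Discard()

	// An existing key means the unique constraint is violated.
	if _, err := txn.Get(indexKey); err == nil {
		return errors.New("unique constraint violation")
	} else if !errors.Is(err, badger.ErrKeyNotFound) {
		return err
	}

	if err := txn.Set(indexKey, primaryID); err != nil {
		return err
	}

	// A concurrent writer racing on the same key surfaces here as
	// badger.ErrConflict; the caller can retry or report a violation.
	return txn.Commit()
}
```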

For non-unique indices I use this format:

key:   table\x00s<INDEXINDEX>\x00<COL1>\x00<COL2>\x00<PrimaryID>
value: ""

(For non-unique indices I even disable value fetching in badger iterators, since the value is irrelevant.)
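A sketch of the non-unique layout and the keys-only iterator; PrefetchValues is badger's actual option for skipping value fetches, while the helper names (including buildUniqueIndexKey from the sketch above) are illustrative:

```go
package index

import badger "github.com/dgraph-io/badger/v4"

// nonUniqueIndexKey appends the primary id to the column values, so
// each entry is unique at the KV level and the stored value stays empty.
func nonUniqueIndexKey(table string, indexIndex uint16, primaryID []byte, cols ...[]byte) []byte {
	key := buildUniqueIndexKey(table, indexIndex, cols...) // defined in the earlier sketch
	key = append(key, 0x00)
	return append(key, primaryID...)
}

// keysOnlyIterator disables value prefetching; the caller must Close it.
func keysOnlyIterator(txn *badger.Txn) *badger.Iterator {
	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = false
	return txn.NewIterator(opts)
}
```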

When I do a point scan, I iterate from table\x00s<INDEXINDEX>\x00<COL1>\x00<COL2>\x000000000000000000 to table\x00s<INDEXINDEX>\x00<COL1>\x00<COL2>\x00ffffffffffffffff, and decode the keys on the fly.

This also allows me to use prefix scans to do range scans over partial column information on the index (i.e. leading equalities plus an optional single trailing range).
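A sketch of such a point scan; iterating over the prefix table\x00s<INDEXINDEX>\x00<COL1>\x00<COL2>\x00 is equivalent to the explicit low/high bounds above, and the primary id is decoded straight from the key suffix (names reused from the earlier sketches):

```go
package index

import badger "github.com/dgraph-io/badger/v4"

// pointScan returns every primary id indexed under the given column values.
func pointScan(txn *badger.Txn, table string, indexIndex uint16, cols ...[]byte) [][]byte {
	prefix := append(buildUniqueIndexKey(table, indexIndex, cols...), 0x00)

	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = false // values are empty anyway
	it := txn.NewIterator(opts)
	defer it.Close()

	var ids [][]byte
	for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
		key := it.Item().KeyCopy(nil)
		// The primary id is everything after the shared prefix.
		ids = append(ids, key[len(prefix):])
	}
	return ids
}
```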

And there is no limit on how many items exist for a given secondary index key: it scales, and adding or removing items is efficient. The long prefix is also not an issue, thanks to key prefix compression in badger (and most other LSM KV stores).

INDEXINDEX is an internal number for the index used. But using a hash of the index specification could be a bit safer with regard to upgrades. (Or store the mapping in metadata in the DB itself, and cache it in memory on startup.)
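A sketch of the hashed alternative, assuming a truncated SHA-256 over the table name and column list (the 4-byte truncation is an arbitrary choice):

```go
package index

import (
	"crypto/sha256"
	"strings"
)

// indexID derives a stable identifier from the index specification, so
// adding or reordering indexes across upgrades cannot silently remap
// existing entries the way an ordinal INDEXINDEX could.
func indexID(table string, columns []string) []byte {
	sum := sha256.Sum256([]byte(table + "\x00" + strings.Join(columns, "\x00")))
	return sum[:4]
}
```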
