ArchiveDB #1911

nytzuga · 2023-08-24T03:11:34Z

This is the MVP for #1715.

This PR creates a thin database layer on top of database.Database. ArchiveDb is an append-only database that stores all state changes at every block height. Each record is stored in a way that allows both fast inserts and fast selects.

Currently its API is quite simple. It has two main functions. The first creates a write Batch for a given block height; inside this batch, entries can be set to a given value or deleted. The second is a Get function that takes a key and a height.

The way it works is as follows:

Height: 10 Set(foo, "foo's value is bar") Set(bar, "bar's value is bar")
Height: 100 Set(foo, "updatedfoo's value is bar")
Height: 1000 Set(bar, "updated bar's value is bar") Delete(foo)

Get(foo, 9) returns an errNotFound error because foo was not defined at block height 9; it was defined later. Get(foo, 99) returns the tuple ("foo's value is bar", 10): the value of foo at height 99, which was set at height 10. Get(foo, 2000) returns an error because foo was deleted at height 1000.
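To make the lookup rule concrete, here is a minimal in-memory sketch of the semantics described above. It is not the PR's implementation: the `archive`, `version`, `record`, and `get` names are hypothetical, and a sorted slice per key stands in for the on-disk layout.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errNotFound is a hypothetical sentinel error for this sketch.
var errNotFound = errors.New("not found")

// version is one recorded change to a key at a given height.
type version struct {
	height  uint64
	value   string
	deleted bool
}

// archive keeps, per key, its versions sorted by ascending height.
type archive struct {
	versions map[string][]version
}

func newArchive() *archive {
	return &archive{versions: make(map[string][]version)}
}

// record appends a change. This sketch assumes heights are written in
// increasing order for each key.
func (a *archive) record(key string, v version) {
	a.versions[key] = append(a.versions[key], v)
}

// get returns the value of key as of height, together with the height at
// which that value was written.
func (a *archive) get(key string, height uint64) (string, uint64, error) {
	vs := a.versions[key]
	// Index of the first version written strictly after the requested height.
	i := sort.Search(len(vs), func(i int) bool { return vs[i].height > height })
	if i == 0 || vs[i-1].deleted {
		// Either the key did not exist yet, or its latest change was a delete.
		return "", 0, errNotFound
	}
	return vs[i-1].value, vs[i-1].height, nil
}

func main() {
	a := newArchive()
	a.record("foo", version{height: 10, value: "foo's value is bar"})
	a.record("bar", version{height: 10, value: "bar's value is bar"})
	a.record("foo", version{height: 100, value: "updatedfoo's value is bar"})
	a.record("bar", version{height: 1000, value: "updated bar's value is bar"})
	a.record("foo", version{height: 1000, deleted: true})

	for _, h := range []uint64{9, 99, 2000} {
		v, at, err := a.get("foo", h)
		fmt.Println(h, v, at, err)
	}
}
```

In this sketch a deleted key and a key that never existed are indistinguishable to the caller, which is exactly the behavior the errKeyDeleted question below is about.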

Things that should be considered

Some safety mechanism to avoid updating a key twice would be nice. If a key is updated twice at the same height (for example, foo at height 100), it should error and the whole write batch should fail.
Maybe return a different error when a key is deleted, instead of treating it as if it never existed. I am thinking of an errKeyDeleted error that also returns the height where the deletion happened. Thoughts @aaronbuchwald @StephenButtolph ?
Maybe the context can be useful for something?

Why this should be merged

How this works

How this was tested

Unit tests

patrick-ogrady · 2023-08-24T14:13:48Z

x/archivedb/db.go

+// If the value does not exists or it was actually removed an error is returned.
+// Otherwise a value does exists it will be returned, alongside with the height
+// at which it was updated prior the requested height.
+func (db *archiveDB) Get(key []byte, height uint64) ([]byte, uint64, error) {


Very curious what the benchmarked performance on this is.

Wondering if it makes sense to add a cache to avoid iteration for commonly used key + heights?

@patrick-ogrady The iterator will consume just a single entry at most. I don't think it would be overkill to be honest; I will create some benchmark scripts with some real load and many million entries to measure performance.

Could we also add the golang style benchmarks?

I'd be in favor of making any caching a follow-on PR
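For reference, a Go-style benchmark has the shape below. This is only a sketch under stated assumptions: `entry`, `buildHistory`, and `lookup` are hypothetical stand-ins for a height-indexed read, not the archivedb API, and the benchmark is driven through `testing.Benchmark` so the snippet runs as a plain program. In the repository it would live in a `_test.go` file and run with `go test -bench`.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// entry is one version of a single key in this stand-in.
type entry struct {
	height uint64
	value  []byte
}

// buildHistory creates n versions written at heights 10, 20, 30, ...
func buildHistory(n int) []entry {
	h := make([]entry, n)
	for i := range h {
		h[i] = entry{height: uint64(i+1) * 10, value: []byte("value")}
	}
	return h
}

// lookup returns the most recent version written at or before height.
func lookup(history []entry, height uint64) ([]byte, uint64, bool) {
	i := sort.Search(len(history), func(i int) bool { return history[i].height > height })
	if i == 0 {
		return nil, 0, false
	}
	return history[i-1].value, history[i-1].height, true
}

// BenchmarkLookup measures reads at varying heights over a fixed history.
func BenchmarkLookup(b *testing.B) {
	history := buildHistory(100_000)
	maxHeight := history[len(history)-1].height
	b.ResetTimer() // exclude setup from the measurement
	for i := 0; i < b.N; i++ {
		lookup(history, uint64(i)%maxHeight)
	}
}

func main() {
	fmt.Println(testing.Benchmark(BenchmarkLookup))
}
```

A real benchmark for this PR would swap the stand-in for reads against a database populated with many heights, which is where the single-entry iterator claim above could be checked.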


darioush

looks like a nice start.


darioush · 2023-08-24T23:08:36Z

x/archivedb/db.go

+// If the value does not exists or it was actually removed an error is returned.
+// Otherwise a value does exists it will be returned, alongside with the height
+// at which it was updated prior the requested height.
+func (db *archiveDB) Get(key []byte, height uint64) ([]byte, uint64, error) {


Could we also add the golang style benchmarks?


nytzuga · 2023-08-28T15:17:18Z

if the length of the key ends up encoded in the bytes does this make iteration difficult?

@darioush No, because when iterating I know the last 9 bytes are outside of the Prefix, and the last byte is isDeleted and the previous 8 bytes are the block height
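A rough sketch of the layout described in this reply: the user key as prefix, then 8 bytes of block height, then 1 byte for isDeleted. The function names and the big-endian height encoding are assumptions made for illustration; the PR's actual encoding may differ.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// suffixLen is the 8-byte height plus the 1-byte isDeleted flag.
const suffixLen = 9

var errShortKey = errors.New("db key shorter than the 9-byte suffix")

// encodeDBKey appends the height and the isDeleted flag to the prefix.
// Big-endian is an assumption of this sketch.
func encodeDBKey(prefix []byte, height uint64, isDeleted bool) []byte {
	out := make([]byte, len(prefix)+suffixLen)
	copy(out, prefix)
	binary.BigEndian.PutUint64(out[len(prefix):], height)
	if isDeleted {
		out[len(out)-1] = 1
	}
	return out
}

// decodeDBKey splits a db key by counting 9 bytes back from the end, so the
// prefix length never has to be known in advance.
func decodeDBKey(dbKey []byte) (prefix []byte, height uint64, isDeleted bool, err error) {
	if len(dbKey) < suffixLen {
		return nil, 0, false, errShortKey
	}
	split := len(dbKey) - suffixLen
	height = binary.BigEndian.Uint64(dbKey[split : split+8])
	isDeleted = dbKey[len(dbKey)-1] == 1
	return dbKey[:split], height, isDeleted, nil
}

func main() {
	k := encodeDBKey([]byte("foo"), 1000, true)
	prefix, height, deleted, err := decodeDBKey(k)
	fmt.Println(string(prefix), height, deleted, err)
}
```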

patrick-ogrady · 2023-08-28T15:17:56Z

My guess is that we'll want to use bbolt (https://github.com/etcd-io/bbolt) as the underlying db to get wicked fast iterations on the read path (it is a B-Tree).

We should compare with Level/Pebble to see if that is true in practice with large DBs and random reads.

nytzuga · 2023-08-28T15:19:59Z

My guess is that we'll want to use bbolt (https://github.com/etcd-io/bbolt) as the underlying db to get wicked fast iterations on the read path (it is a B-Tree).

We should compare with Level/Pebble to see if that is true in practice with large DBs and random reads.

@patrick-ogrady Yes. @StephenButtolph's idea was to get the MVP out ASAP with leveldb and our interface, but to find the fastest database engine before going to master with archivedb. I will play with bbolt today.

StephenButtolph · 2023-08-28T15:34:42Z

Bolt is currently used in high-load production environments serving databases as large as 1TB.

Might be pushing Bolt to its limits.

We should also consider https://github.com/erigontech/mdbx-go (which is just a wrapper around https://libmdbx.dqdkfa.ru/)

darioush · 2023-08-28T15:45:26Z

if the length of the key ends up encoded in the bytes does this make iteration difficult?

@darioush No, because when iterating I know the last 9 bytes are outside of the Prefix, and the last byte is isDeleted and the previous 8 bytes are the block height

As Stephen also mentions, this would be for iterating over keys that share a prefix or in lexicographical order (eg, at a given height). (not referring to iterating over the DB for the most recent record for a single key)

StephenButtolph · 2023-08-28T18:39:17Z

As Stephen also mentions, this would be for iterating over keys that share a prefix or in lexicographical order (eg, at a given height). (not referring to iterating over the DB for the most recent record for a single key)

I was actually talking about a bug in the current Get/Put implementation's internal use of iteration. This DB will not support efficient iteration of the state at a given height to the user.

StephenButtolph · 2023-08-31T03:07:01Z

Please add:

func TestDBKeySpace(t *testing.T) {
	require := require.New(t)

	var (
		key1   = []byte("key1")
		key2   = newKey([]byte("key1"), 2).Bytes()
		key3   = []byte("key3")
		value1 = []byte("value1@1")
		value2 = []byte("value2@2")
		value3 = []byte("value3@3")
	)
	require.NotEqual(key1, key2)
	require.NotEqual(key1, key3)
	require.NotEqual(key2, key3)

	db, err := getBasicDB()
	require.NoError(err)

	writer, err := db.NewBatch()
	require.NoError(err)
	require.NoError(writer.Put(key1, value1))
	require.Equal(uint64(1), writer.Height())
	require.NoError(writer.Write())

	writer, err = db.NewBatch()
	require.NoError(err)
	require.NoError(writer.Put(key2, value2))
	require.Equal(uint64(2), writer.Height())
	require.NoError(writer.Write())

	writer, err = db.NewBatch()
	require.NoError(err)
	require.NoError(writer.Put(key3, value3))
	require.Equal(uint64(3), writer.Height())
	require.NoError(writer.Write())

	val, height, err := db.Get(key1, 3)
	require.NoError(err)
	require.Equal(uint64(1), height)
	require.Equal(value1, val)
}

as a regression test.
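For context, here is one way a key-space collision of this kind can arise, and how a length prefix avoids it. This is an illustrative sketch, not necessarily the exact bug in the PR: `naiveKey`, `lengthPrefix`, `prefixedKey`, and `countMatches` are hypothetical helpers, and a linear scan stands in for a prefix iterator.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// naiveKey is the user key followed directly by the 8-byte height.
func naiveKey(key []byte, height uint64) []byte {
	out := make([]byte, len(key)+8)
	copy(out, key)
	binary.BigEndian.PutUint64(out[len(key):], height)
	return out
}

// lengthPrefix is uvarint(len(key)) followed by the key itself.
func lengthPrefix(key []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(key)))
	return append(buf[:n], key...)
}

// prefixedKey is the length-prefixed key followed by the 8-byte height.
func prefixedKey(key []byte, height uint64) []byte {
	p := lengthPrefix(key)
	out := make([]byte, len(p)+8)
	copy(out, p)
	binary.BigEndian.PutUint64(out[len(p):], height)
	return out
}

// countMatches reports how many stored keys a prefix scan would visit.
func countMatches(stored [][]byte, prefix []byte) int {
	n := 0
	for _, k := range stored {
		if bytes.HasPrefix(k, prefix) {
			n++
		}
	}
	return n
}

func main() {
	key1 := []byte("key1")
	key2 := naiveKey(key1, 2) // a user key that looks like an internal key

	naive := [][]byte{naiveKey(key1, 1), naiveKey(key2, 2)}
	fmt.Println(countMatches(naive, key1)) // 2: the scan for key1 also sees key2's entry

	safe := [][]byte{prefixedKey(key1, 1), prefixedKey(key2, 2)}
	fmt.Println(countMatches(safe, lengthPrefix(key1))) // 1: the lengths differ, so the prefixes differ
}
```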

Signed-off-by: Cesar <137245636+nytzuga@users.noreply.github.com> Co-authored-by: Stephen Buttolph <stephen@avalabs.org> Co-authored-by: Richard Pringle <rpring9@gmail.com> Co-authored-by: Darioush Jalali <darioush.jalali@avalabs.org> Co-authored-by: Dan Laine <daniel.laine@avalabs.org>

darioush · 2023-09-21T19:11:16Z

x/archivedb/prefix_test.go

+	var (
+		key          = []byte("key")
+		value        = []byte("value")
+		maliciousKey = newDBKey(key, 2)


nit: could we pick a different name for maliciousKey?

darioush · 2023-09-21T19:12:35Z

x/archivedb/prefix_test.go

+	require.Equal(value, val)
+	height, err := db.GetHeight(key)
+	require.NoError(err)
+	require.Equal(uint64(1), height)


could we also get the maliciousKey at the end of the iteration?

darioush · 2023-09-21T19:12:59Z

x/archivedb/prefix_test.go

+	require.Equal(uint64(1), writer.Height())
+	require.NoError(writer.Write())
+
+	for i := 0; i < 10000; i++ {


seems we could get the same benefit with a shorter loop like 10 or 100 since this is not a benchmark.

darioush · 2023-09-21T19:14:00Z

x/archivedb/prefix_test.go

+	require.Equal(uint64(1), height)
+}
+
+func TestDBMoreEfficientLookups(t *testing.T) {


how is this test different from the one above?


darioush · 2023-09-21T20:17:22Z

x/archivedb/db_test.go

+	for _, test := range tests {
+		db, err := getBasicDB()
+		require.NoError(t, err)
+		test(t, db)


darioush · 2023-09-21T20:22:40Z

x/archivedb/batch.go

+	c.db.lock.Lock()
+	defer c.db.lock.Unlock()
+
+	batch := c.db.inner.NewBatch()


could we take a batch from the inner db when the batch struct is created then directly call put on the batch from the inner db with the modified keys when put/delete operations are done?

seems it would simplify this file a bit.
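A minimal sketch of what this suggestion could look like, assuming a pared-down batch interface. `innerBatch`, `memBatch`, `heightBatch`, and `versionedKey` are hypothetical names for illustration and are not the types in this PR.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// innerBatch is a hypothetical, minimal stand-in for the wrapped db's batch.
type innerBatch interface {
	Put(key, value []byte) error
	Write() error
}

// memBatch buffers writes and applies them to a map on Write.
type memBatch struct {
	store   map[string][]byte
	pending map[string][]byte
}

func newMemBatch(store map[string][]byte) *memBatch {
	return &memBatch{store: store, pending: make(map[string][]byte)}
}

func (m *memBatch) Put(key, value []byte) error {
	m.pending[string(key)] = append([]byte(nil), value...)
	return nil
}

func (m *memBatch) Write() error {
	for k, v := range m.pending {
		m.store[k] = v
	}
	m.pending = make(map[string][]byte)
	return nil
}

// heightBatch takes the inner batch at construction time and forwards every
// operation immediately with a rewritten key, so it keeps no buffer itself.
type heightBatch struct {
	inner  innerBatch
	height uint64
}

// versionedKey appends an 8-byte height and a 1-byte deleted flag to key.
func versionedKey(key []byte, height uint64, deleted bool) []byte {
	out := make([]byte, len(key)+9)
	copy(out, key)
	binary.BigEndian.PutUint64(out[len(key):], height)
	if deleted {
		out[len(out)-1] = 1
	}
	return out
}

func (b *heightBatch) Put(key, value []byte) error {
	return b.inner.Put(versionedKey(key, b.height, false), value)
}

// Delete records a tombstone entry instead of removing anything.
func (b *heightBatch) Delete(key []byte) error {
	return b.inner.Put(versionedKey(key, b.height, true), nil)
}

func (b *heightBatch) Write() error { return b.inner.Write() }

func main() {
	store := make(map[string][]byte)
	b := &heightBatch{inner: newMemBatch(store), height: 7}
	_ = b.Put([]byte("foo"), []byte("bar"))
	_ = b.Delete([]byte("baz"))
	fmt.Println(len(store)) // 0: nothing is visible before Write
	_ = b.Write()
	fmt.Println(len(store)) // 2
}
```

One tradeoff to keep in mind: with this shape the lock would only need to be held around Write, but the height metadata still has to be written in the same inner batch to stay atomic.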

x/archivedb/key.go

patrick-ogrady · 2023-09-22T03:05:01Z

x/archivedb/batch.go

+		}
+	}
+
+	if err := database.PutUInt64(batch, heightKey, c.height); err != nil {


We should enforce that this is > what is currently in the database?

I think the DB should expose the height for atomic-ness... but if the user wants to modify something at an older height I think it's fine to let them

nytzuga requested review from StephenButtolph and aaronbuchwald August 24, 2023 03:11

nytzuga self-assigned this Aug 24, 2023

nytzuga requested review from danlaine, darioush and dboehm-avalabs as code owners August 24, 2023 03:11

nytzuga marked this pull request as draft August 24, 2023 03:11

nytzuga force-pushed the prototype-archivedb branch from 220b315 to d49c883 Compare August 24, 2023 03:55

patrick-ogrady reviewed Aug 24, 2023

patrick-ogrady changed the title from ArchiveDb to ArchiveDB Aug 24, 2023

StephenButtolph reviewed Aug 24, 2023

x/archivedb/codec.go (outdated)

x/archivedb/key.go (outdated)

x/archivedb/key.go (outdated)

x/archivedb/db.go (outdated)

StephenButtolph reviewed Aug 24, 2023

x/archivedb/db.go (outdated)

patrick-ogrady reviewed Aug 24, 2023

x/archivedb/db.go (outdated)

patrick-ogrady reviewed Aug 24, 2023

x/archivedb/db.go (outdated)

aaronbuchwald reviewed Aug 24, 2023

x/archivedb/db.go (outdated)

darioush reviewed Aug 24, 2023

nytzuga force-pushed the prototype-archivedb branch 4 times, most recently from 2e1bc97 to 0369c4b Compare August 25, 2023 20:22

StephenButtolph reviewed Aug 26, 2023

x/archivedb/db.go (outdated)

x/archivedb/db.go (outdated)

x/archivedb/db.go (outdated)

x/archivedb/key.go (outdated)

nytzuga force-pushed the prototype-archivedb branch from 0369c4b to b37846e Compare August 28, 2023 15:18

nytzuga mentioned this pull request Aug 30, 2023

Fetch all keys defined at a given height #1948

Closed

nytzuga and others added 11 commits September 21, 2023 10:49

Introduce GetHeightReader() function (#1956)

72cee4c

Signed-off-by: Cesar <137245636+nytzuga@users.noreply.github.com> Co-authored-by: aaronbuchwald <aaron.buchwald56@gmail.com>

Update comments

7179de6

Improve GetHeightFromLastFoundKey() and Get() functions

20cfc94

Add more unit tests

26f5a62

Remove context from ArchiveDB

a0cbebc

Context: #1911 (comment)

Implementing the database interface

5b4a3a9

Unless otherwise noted, all operations are performed at the last known height by default. Every operation has a counterpart function that takes an explicit height.

Test database interface

686a8bc

Remove the unimplemented features, pretty much anything related to iterators

Fix linting issues

26c6ce0

Improvements suggested by @aaronbuchwald

b11e0a0

1. Remove height safety checks to make the code simpler
2. Use single byte metadata as keys
3. Use `NewIteratorWithStartAndPrefix` to make iteration simpler. Introduced `parseDBKeyPrefix` to extract the prefix from a dbKey

Add benchmarks to the database

13e0496

nytzuga force-pushed the prototype-archivedb branch from 1556ef2 to e2d2daf Compare September 21, 2023 14:49

Remove safe guards for height

811afef

The database user can now write at any height

nytzuga force-pushed the prototype-archivedb branch from e2d2daf to 811afef Compare September 21, 2023 14:55

dboehm-avalabs reviewed Sep 21, 2023

x/archivedb/key.go (outdated)

dboehm-avalabs reviewed Sep 21, 2023

x/archivedb/key.go (outdated)

darioush reviewed Sep 21, 2023

x/archivedb/key.go (outdated)

StephenButtolph added 2 commits September 21, 2023 22:30

Simplify ArchiveDB interface (#2067)

2ca571d

Merge branch 'dev' into prototype-archivedb

a8e9f2d

StephenButtolph approved these changes Sep 22, 2023

StephenButtolph enabled auto-merge September 22, 2023 02:32

patrick-ogrady reviewed Sep 22, 2023

Add comment around heights

469ded5

patrick-ogrady approved these changes Sep 22, 2023

Merge branch 'dev' into prototype-archivedb

83432e0

StephenButtolph disabled auto-merge September 22, 2023 04:34

StephenButtolph merged commit dd1a148 into dev Sep 22, 2023
16 checks passed

StephenButtolph deleted the prototype-archivedb branch September 22, 2023 04:36

StephenButtolph added this to the v1.10.11 milestone Sep 22, 2023
