[WIP] rushed - shed extension #1006

zelig · 2018-11-20T09:11:39Z

experimental PR - untested - to be discussed

adds to the shed

multi-mode accessors (with update and access manipulating alternative fields/indexes)
continuous optimised batch writes (as is, adds zero performance gain)
continuous iterator support and subscriptions (untested)

adds localstore package with tentative db store using rushed

frncmx

Note: I'll still have to revisit DB.listen(). In textbook examples I did not see WaitGroup used this way... so I want to spend some more time on it.

frncmx · 2018-11-20T11:41:38Z

swarm/shed/rushed/db.go

+)
+
+// batch wraps leveldb batch extending it with a waitgroup and a done channel
+type Batch struct {


We don't expose newBatch(). Do we need to expose Batch?
(or vice-versa, if we expose Batch maybe we should expose newBatch, too)

frncmx · 2018-11-20T11:45:07Z

swarm/shed/rushed/db.go

+// DB extends shed DB with batch execution, garbage collection and iteration support with subscriptions
+type DB struct {
+	*shed.DB                                           // underlying shed.DB
+	update   func(*Batch, Mode, *shed.IndexItem) error // mode-dependent update method


on update/access I would be happy to see more comments :) At this point, as just getting familiar with the code, it's totally not clear what their purpose is

frncmx · 2018-11-20T11:56:05Z

swarm/shed/rushed/db.go

+}
+
+// New constructs a new DB
+func New(sdb *shed.DB, update func(*Batch, Mode, *shed.IndexItem) error, access func(Mode, *shed.IndexItem) error) *DB {


maybe I would extract update and access to a private Type => it would keep the method signature clearer and their definition is already duplicated (see DB struct)

frncmx · 2018-11-20T14:04:23Z

swarm/shed/rushed/db.go

+	}
+	// call the update with the  access mode
+	err := db.update(b, mode, item)
+	if err != nil {


Am I the only one who favors the "simple statement" format if?
I think it's clearer that the variable only needed for a limited scope and does not pollute the scope.
(For me it's also easier to understand as I immediately know the scope I have to consider is just a few lines.)
=> if err:= foo(); err != nil { <3

frncmx · 2018-11-20T14:11:54Z

swarm/shed/rushed/db.go

+	b := newBatch()        // current batch
+	var done chan struct{} //
+	wasdone := make(chan struct{})
+	close(wasdone)


note: I did not get it why the hack we close this channel immediately. 1st though was we missed a defer.

frncmx · 2018-11-20T14:49:03Z

swarm/shed/rushed/db_test.go

+}
+
+func benchmarkPut(b *testing.B, n int, in time.Duration, db putter) {
+	for i := 0; i < b.N; i++ {


before this for loop I would generate the random chunks => so we don't benchmark the generate function, too

actually I would try to start only X number of Goroutines, where X should correlate to the number of CPUs (as we cannot run more in parallel anyway)

each Goroutine would get it's own slice and would try to iterate that as fast with put as possible

frncmx · 2018-11-20T14:52:52Z

swarm/shed/rushed/db_test.go

+	}
+}
+
+func benchmarkPut(b *testing.B, n int, in time.Duration, db putter) {


in parameter not used

frncmx · 2018-11-20T14:54:22Z

swarm/shed/rushed/db_test.go

+	return t.index.Put(*(newItemFromChunk(ch)))
+}
+func BenchmarkPut(b *testing.B) {
+	n := 128	


maybe: n => numberOfChunks ?
It's not clear from here why 128 or why n. Those two together makes it hard to resolve the meaning.

frncmx · 2018-11-20T14:56:49Z

swarm/shed/rushed/db_test.go

+	n := 128	
+	for j := 0; j < 5; j++ {
+		n *= 2
+		in := time.Nanosecond


What is the purpose of in?

frncmx · 2018-11-21T10:15:00Z

swarm/storage/localstore/db.go

+		},
+		EncodeValue: func(fields shed.IndexItem) (value []byte, err error) {
+			b := make([]byte, 16)
+			binary.BigEndian.PutUint64(b[:8], uint64(fields.StoreTimestamp))


Any reason why we don't just implement the BinaryMarshaler interface?

I think we don't use Go interfaces enough. One of the key features of Go are interfaces and they make the code really flexible.

That goes as far as I think EncodeKey and DecodeKey should be binary marshal and unmarshal on an IndexKey and same goes for IndexValue. The IndexIteam could be a composition of the key and value types.

I believe that could result in a more idiomatic Go code and could make the code look immediately familiar to any Gopher.

This comment is pore related how shed is designed, and I would like to see your thoughts on #993. There is a go example test where the usage is demonstrated.

The decision not to use interfaces for encode/decode functions is to avoid performance penalties when using them by introducing abstractions. Interfaces are a bit costly and allocate to heap. Encode/decode functions will be called all the time in swarm, as the storage is lower layer. This is the reasoning why IndexItem is implemented in most simple value semantics, instead interfaces. I tried to document this in comments in the PR #993, and would like to hear other opinions. If needed, I can do another demonstration, maybe in more detail then I did last week where I talked about this decisions.

@janos
The code this complex I have to sit down and read, that's just how my brain works. So I'll go trough shed, too. Thanks for the demonstration offer though. :)

If you have some good articles in mind, our your own measurements about these performance benefits I would be happy to see/read. If not, I'll google. :)

The most simple demonstration how interfaces are complied, escaped or inlined is here golang/go#20116 (comment). But it just boils down to how interfaces in implemented in go and how compiler decides what to leave to runtime.

If you are interested in implementing encode/decode functions with interfaces, I would be glad to see benchmarks and how the example will look like with them as we would need additional types in example/usage instead anonymous functions. Performance was the main question after the first demonstration and for the right reason, as this part of the code is the hot path.

Or maybe we can just leave the performance considerations aside as they may be neglectable compared to database operations. :)

~~I guess this will be a good start for me: golang/go#20116 :)~~
GitHub just did not show your comments without reload.

Also, I guess as to compiler improves some of these penalties will disappear:
golang/go#19361

Or maybe we can just leave the performance considerations aside as they may be neglectable compared to database operations. :)

I guess that's also possible. Perf: never guess. :)

What I was thinking about: Are we optimizing prematurely?

Would it be better 1st write the code with interfaces then optimize?
Considerations:

Maybe the code ends up more ideal in look and we have to tune only a few places.

What about optimizing for change? (XP core value) Change is also a cost and imminent.

If we first go with interfaces, then what would be the blast radius (cost) of replacing them? Given that we try to expose/export only as much as required.

I would vote for interfaces, but I have limited experience here. So I have to dig deeper.

Great, this is the discussion that I was looking for. I would be glad to hear other opinions too. If we are up to use interfaces, then struct fields also can help in simplifying usage by applying declarative approach instead of the current functional one.

swarm/shed/rushed/subscribe.go

zelig · 2018-11-26T16:16:03Z

obsolete as unnecessary based on #1014 (comment)

zelig added 2 commits November 20, 2018 10:01

swarm/shed: new package rushed

9141ccc

swarm/storage: new package localstore

1189c3d

zelig added do-not-merge labels Nov 20, 2018

zelig added this to the 2018 Christmas Edition milestone Nov 20, 2018

zelig self-assigned this Nov 20, 2018

zelig requested review from janos and frncmx November 20, 2018 09:11

zelig added the discussion label Nov 20, 2018

frncmx reviewed Nov 20, 2018

View reviewed changes

frncmx reviewed Nov 21, 2018

View reviewed changes

change to lock

e696a7b

zelig closed this Nov 26, 2018

zelig mentioned this pull request Nov 29, 2018

database rewrite - meta + timeline #1027

Closed

26 tasks

frncmx deleted the rushed branch December 5, 2018 12:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] rushed - shed extension #1006

[WIP] rushed - shed extension #1006

zelig commented Nov 20, 2018

frncmx left a comment •

edited

Loading

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 20, 2018

frncmx Nov 21, 2018

frncmx Nov 21, 2018

janos Nov 21, 2018

frncmx Nov 21, 2018

janos Nov 21, 2018

janos Nov 21, 2018

frncmx Nov 21, 2018 •

edited

Loading

frncmx Nov 21, 2018

frncmx Nov 21, 2018

janos Nov 21, 2018

zelig commented Nov 26, 2018

[WIP] rushed - shed extension #1006

[WIP] rushed - shed extension #1006

Conversation

zelig commented Nov 20, 2018

frncmx left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

frncmx Nov 21, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zelig commented Nov 26, 2018

frncmx left a comment •

edited

Loading

frncmx Nov 21, 2018 •

edited

Loading