
Benchmarks: End iterator manually vs on GC #627

Closed · peakji opened this issue Apr 27, 2019 · 4 comments

@peakji (Member) commented Apr 27, 2019

Original discussion: #601

Benchmark script and full outputs: https://gist.github.com/peakji/0fd6c1529951767697480e708806ae33


The benchmark script records the time taken by each epoch. An epoch comes in two flavors: sequential and parallel.

In sequential mode, we first insert pseudo-random (but deterministic) keys with a batch, then scan the keys using an iterator. In parallel mode we insert and scan at the same time, and unlike the sequential mode, the interleaving makes the run nondeterministic.

Between epochs, we do not clear the database.
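
For reference, a minimal sketch of one sequential epoch, assuming the leveldown callback API (open/batch/iterator/end); the key derivation and epoch size are placeholders, not the actual gist code:

```js
const leveldown = require('leveldown')

const db = leveldown('./bench-db')

// One sequential epoch: batch-insert deterministic pseudo-random keys,
// then scan the whole database with a single iterator.
function sequentialEpoch (n, callback) {
  const ops = []
  for (let i = 0; i < n; i++) {
    // Deterministic "pseudo-random" key; the real script uses a seeded PRNG.
    const key = String((i * 2654435761) % 2 ** 32)
    ops.push({ type: 'put', key, value: 'x' })
  }

  const start = process.hrtime.bigint()

  db.batch(ops, (err) => {
    if (err) return callback(err)

    const it = db.iterator()

    const next = () => it.next((err, key) => {
      if (err) return callback(err)
      if (key === undefined) {
        // Exhausted: end the iterator eagerly (the "master" flavor).
        return it.end((err) => {
          callback(err, process.hrtime.bigint() - start)
        })
      }
      next()
    })

    next()
  })
}

db.open((err) => {
  if (err) throw err
  sequentialEpoch(100000, (err, ns) => {
    if (err) throw err
    console.log(`epoch took ${ns} ns`)
  })
})
```

The gc flavor would simply skip the it.end() call and leave cleanup to the garbage collector.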


The results show that ending on GC is slower than ending eagerly on a busy server with a mixed workload. I guess the main cause is that holding snapshots alive may block compaction of keys written after the snapshot's SequenceNumber.

|                          | master-sequential | gc-sequential | master-parallel | gc-parallel |
| ------------------------ | ----------------- | ------------- | --------------- | ----------- |
| Total time (nanoseconds) | 90005813332       | 104395061841  | 81394247896     | 94485047913 |
@vweevers (Member) commented

> I guess the main cause is that holding snapshots might affect compression of new keys

To be sure: do you mean compaction or compression?

Maybe you can test this hypothesis by manually triggering GC with --expose-gc and global.gc()? That will make the test slower, so it isn't suitable for comparison against the tests above. Perhaps compare two different manual GC intervals instead. The expected outcome is that doing fewer GC cycles is slower.
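
For instance (a minimal sketch; the file name, interval values and stand-in workload are placeholders, not the actual benchmark):

```js
// Run with: node --expose-gc gc-interval.js 100
// Forces a full GC every N milliseconds while the workload runs; repeat
// with a different interval (e.g. 1000) and compare the total times.
if (typeof global.gc !== 'function') {
  throw new Error('re-run with --expose-gc')
}

const intervalMs = Number(process.argv[2]) || 100
const timer = setInterval(() => global.gc(), intervalMs)

// Stand-in workload; the real script would run the benchmark epochs here.
async function workload () {
  for (let i = 0; i < 1000; i++) {
    await new Promise((resolve) => setImmediate(resolve))
  }
}

const start = process.hrtime.bigint()
workload().then(() => {
  clearInterval(timer)
  console.log(`took ${process.hrtime.bigint() - start} ns at ${intervalMs} ms GC interval`)
})
```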

@vweevers (Member) commented

I'm leaning towards dropping the idea (of ending on GC) entirely.

@chjj To get back to your original assertion that it's impossible to catch errors, it's not:

```js
async function* test () {
  try {
    yield 'result'
  } finally {
    // End the iterator here
    console.log('end')
  }
}

async function main () {
  try {
    for await (let result of test()) {
      throw new Error('error')
    }
  } catch (err) {
    console.log(err.message)
  }
}

main()
```

```
$ node example.js
end
error
```
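
The same pattern would apply to leveldown itself. A minimal sketch, assuming the callback-style iterator API (entries is a hypothetical helper, not part of leveldown):

```js
const { promisify } = require('util')

// Hypothetical helper: expose a leveldown iterator as an async iterable
// that always ends the underlying iterator, even on break or throw.
async function * entries (db, options) {
  const it = db.iterator(options)

  const next = () => new Promise((resolve, reject) => {
    it.next((err, key, value) => {
      if (err) reject(err)
      else resolve([key, value])
    })
  })

  try {
    while (true) {
      const [key, value] = await next()
      if (key === undefined) break // iterator exhausted
      yield [key, value]
    }
  } finally {
    // Runs on normal completion, break and throw alike.
    await promisify(it.end.bind(it))()
  }
}
```

A consumer can then write for await (const [key, value] of entries(db)) and the iterator is ended deterministically, with no need to wait for GC.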

@peakji (Member, Author) commented Apr 28, 2019

> To be sure: do you mean compaction or compression?

My mistake, it should be compaction.

> Maybe you can test this hypothesis by manually triggering GC with --expose-gc and global.gc()?

The benchmark was designed to simulate a busy system, where Node.js often delays garbage collection until it exceeds the memory limit, or simply until it decides it's time. When we rely on GC to end iterators, this behavior keeps a lot of unnecessary snapshots alive. My point is that we cannot trust GC for this kind of task, unless we call global.gc() manually in production.

> Perhaps compare two different manual GC intervals.

Good point! I'll update the script and maybe add some batch.del() operations as well.
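
For example, a batch with interleaved deletes might look like this (a sketch; keys are placeholders and db is an open leveldown instance):

```js
// Mixed workload: interleave puts with deletes of previously inserted keys.
db.batch([
  { type: 'put', key: 'key-1', value: 'value-1' },
  { type: 'del', key: 'key-0' },
  { type: 'put', key: 'key-2', value: 'value-2' }
], (err) => {
  if (err) throw err
})
```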

@vweevers (Member) commented

After thinking about it some more, I agree with your reasoning and see no reason to confirm it in different ways (like comparing GC intervals).

Closing this; let's decide on next steps in #601.
