-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Conversation
for i, v := range f.ids[:len(f.ids)-1] { | ||
if f.ids[i+1] < v { | ||
sort.Sort(pgids(f.ids)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to break
here, else you'll sort, then keep testing all following indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right
6436adc
to
278ef9a
Compare
fixed error |
278ef9a
to
2a05b06
Compare
@@ -6,19 +6,21 @@ import ( | |||
"unsafe" | |||
) | |||
|
|||
type ctrue struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessary. Can you simply use map[pgid]struct{}
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will do
Besides the |
2a05b06
to
87b6edd
Compare
I've added commit: reading freelist on write transaction. |
03fb180
to
5f5b4b1
Compare
last commit looks to be complex :( I will not not be surprised if you reject it. |
5f5b4b1
to
ccd0542
Compare
+1 |
@@ -133,6 +133,17 @@ type pgids []pgid | |||
func (s pgids) Len() int { return len(s) } | |||
func (s pgids) Swap(i, j int) { s[i], s[j] = s[j], s[i] } | |||
func (s pgids) Less(i, j int) bool { return s[i] < s[j] } | |||
func (s pgids) isSorted() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Package sort already has an IsSorted function. No need to add this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort.IsSorted uses interfaces, so it is slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much slower? Slower enough that it is worth adding more code to work around it? Numbers would help a lot here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want numbers, you may measure. If you find I was wrong, I will appologise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually the onus is on you to prove that your performance claims are justified. Without having provided proofs that your optimizations were guided by a scientific method, it's to be expected that others will ask you to justify your claims.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aybabtme , as I've said two times down in a page, I don't care, will this pool request be merged or not.
And I don't gonna prove obvious things. If it is not obvious for someone that function call through interface is slower, that man should study a bit more instead of asking meaningless questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@funny-falcon thanks for your feedback, I hadn't yet read the other comments. I now see that this is an old PR, sorry for the confusion. We can disagree on the burden of proof another time maybe. Have a good day.
@@ -171,7 +172,9 @@ func (f *freelist) read(p *page) { | |||
copy(f.ids, ids) | |||
|
|||
// Make sure they're sorted. | |||
sort.Sort(pgids(f.ids)) | |||
if !pgids(f.ids).isSorted() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you benchmarked to confirm that it is faster on sorted data to first ask whether it is sorted? It is not obvious to me. Many sort algorithms run very quickly on already-sorted data, and when f.ids is large, touching all the data twice may be more expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are talking not about "many sort algorithms", but about sort.Sort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. How does sort.Sort behave on already sorted data? Do you have benchmarks to help understand how much of a win this is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you benchmark instead of asking?
Yes, I did benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you bothered to benchmark, why didn't you include it in the commit? And it's not a question of how much faster it is in isolation, it's a question of how much faster it is in the context of things that matter in boltdb.
Anyway, I'll stop reviewing now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look, I was investigating Bolt more than year ago. I need it to open many mostly-read-only files fast. That is why I concentrated on open. All patches were benched, but I have no numbers now. Neither I use Bolt now, cause my director didn't ratify my suggestion of using it.
So now I don't care will this pull request be accepted or not.
If you benchmark it and find it will help you, then it will be your deal to push on merging this pull request.
Though, if you find some error here, I will fix it.
cache map[pgid]bool // fast lookup of all free and pending page ids. | ||
ids []pgid // all free and available free page ids. | ||
pending map[txid][]pgid // mapping of soon-to-be free page ids by tx. | ||
cache map[pgid]struct{} // fast lookup of all free and pending page ids. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A benchmark or some other data demonstrating the memory savings would be useful. I'm not a prior convinced that the memory savings here are noticeable in real world use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance is one thing, and idiomatic code is another:
a map[key]struct{} is just the idiomatic way to store a set. Golang doesn't have many idioms, they should not require more challenges to be accepted. As long as the code passes the tests of course.
@@ -397,6 +393,19 @@ func (db *DB) close() error { | |||
return nil | |||
} | |||
|
|||
func (db *DB) ensureFreelist() *freelist { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain more how this lazy load helps?
It appears that you're trading
- an eager read (that will happen eventually anyway?)
- a regular pointer load
- straightforward code
for
- a lazy read
- an atomic pointer load
- less obvious code
I'd like to see more detail about the benefits to understand why this is worth doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you open file only for read, you don't need to read freelist at all ever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first commit in this series handles that already, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes you want to have single code for both readonly open and readwrite simply cause you don't know will you write or not.
Anyway, I was never confident last commit will be accepted.
To be clear, I'm reviewing this (and the other perf-related PRs) because I am not getting the performance I want out of boltdb, and I'm checking to see whether someone else has already fixed my problems. So I want boltdb to be faster... |
@funny-falcon Any sort of speedup PR should always have reproducible benchmarks demonstrating the advantages of the new code. |
@DavidVorick , I don't care will this pull request be accepted, or not. |
sizeof struct{} == 0
Hint: they are always sorted.
ccd0542
to
e40ad77
Compare
e40ad77
to
ce500b5
Compare
Netherless, I've rebased patch and simplified couple of details. |
map([pgids]struct{}) - is a real set structure, cause struct{} doesn't consume memory