Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffers: Prevent bug-prone iteration, speed up accidentally-quadratic iteration #369

Closed

Conversation

BenWiederhake
Copy link
Contributor

This PR:

  • Raises a new exception when the caller tries to iterate over a nested buffer. This causes the caller to only see the last part of a larger dataset, and is usually a bug. When I first ran into this issue, I first thought this was a bug with libosmium, since there was data missing. Let's just disable these kinds of accesses; none of the examples or code attempt this kind of access anyway.
  • Speeds up Buffer::get_last_nested(). This method used to be accidentally-quadratic: Each invocation takes a linear amount of time in the depth, and needs to be invoked a linear amount of times in order to iterate all buffers. This means an overall quadratic running time, even though one would expect a linear running time. I also added a test that is very sensitive to this (4s versus <0.01s). When iterating over the entire planet, I observe a speedup from 163.183 s to 160.942 s. (And an observed stddev of 0.836 s, so this result has 2.68 sigma. Physics scientists would laugh at that, but it's good enough proof for me, in this case.)

@joto
Copy link
Member

joto commented Dec 2, 2023

Where did you encounter those nested buffers? The Reader class should only ever return unnested buffers from read(), so the user should never see them?

@BenWiederhake
Copy link
Contributor Author

Nested buffers occur internally, so this affects the running time even if the user's code doesn't have direct access to them.

Checking for errors makes even more sense in this case. The user shouldn't be able to trigger this issue even if they wanted to, and the checks make sure that no data is lost internally (e.g. by new code).

I personally encountered these because osmium::io::detail::PBFDataBlobDecoder::operator() returns deeply nested buffers, and that's what the random access code uses / will use.

@joto
Copy link
Member

joto commented Dec 2, 2023

There are two issues here which we should not mix up:

  1. Some functions can now throw which couldn't before. And it is somewhat unexpected that a simple function like begin() can throw. So I am not too happy about that. And because this is internal to libosmium anyway, an assert could be a better solution. But this is something I have to look into.
  2. The performance issue. I can not remember why I implemented it the way I did. Probably because it was the simplest way of doing things. As this code is only run on I/O which does a lot more stuff which will swamp any efficiencies gained here. The 1-2% improvement you measured doesn't seem to me to be a big deal. On the other hand the changs are not huge. I am wondering myself now why I coulnd't implement the linked list the other way around so that the next buffer is always at the beginning of the list and not at the end.

In any case these changes are orthogonal to all the other work you are doing, right?

@BenWiederhake
Copy link
Contributor Author

  1. I take that as a request to change the exceptions into asserts. Sure, I'll do that.
  2. Yes, these changes are orthogonal. I expected the performance impact to be significantly higher when I started. The TEST_CASE("Can quickly handle deeply nested buffer") in test_buffer_nested.cpp proves that this change does speed things up, so I believe it's worth to improve the code in this way.

@BenWiederhake
Copy link
Contributor Author

Changes since last push:

  • Changed the exceptions into asserts, as requested.

get_last_nested() used to walk the entire list, requiring linear time.
Because get_last_nested() is also invoked a linear amount of times, this
leads to an overall quadratic running time in order to access all nested
buffers. This effect is called 'accidentally quadratic', and is a common
issue.

The 'Can quickly handle deeply nested buffer' test makes sure that this
issue does not regress in the future, and also allows me to make some
guesstimates about the performance gain: For a nesting depth of 30000,
the test is sped up from 4 seconds to less than 0.01 seconds.

Here are some more practical numbers: Iterating over the entire planet
(which has very large blocks and thus deeply nested buffers), I observe
a speed up from 163.1 s to 160.9 s according to hyperfine (10 runs each).
@BenWiederhake
Copy link
Contributor Author

Changes since last push summary:

  • Relaxed a test from buffer.written() == 1360 to buffer.written() >= 1360 && buffer.written() <= 1440, since apparently Windows writes 8 bytes more per node. This used to break windows-minimal-2019 and windows-minimal-2022.
  • Remove a few forgotten printf statements. Whoops!

@BenWiederhake
Copy link
Contributor Author

CI failure seems to be a flake: It fails long before the code of this PR becomes relevant, specifically during package installation.

@BenWiederhake
Copy link
Contributor Author

Closing because you don't seem to accept any PRs at this time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants