-
-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add VarInt encoder for unsigned longs [#98] #449
Conversation
https://github.com/onthegomap/planetiler/actions/runs/3985606584 ℹ️ Base Logs 42e2e19
ℹ️ This Branch Logs 123dd1f
|
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
* an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations under the License. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this adapted code, I've been putting the license at the top of the file, with the javadoc for the class used to give a description of what was changed (if anything). See VectorTile.java as an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the top; though at this point the code as-is might be trivial enough to not require this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! And can you add a short javadoc comment with link to the original and say what changed (like un-unrolling the loop) from the original?
planetiler-core/src/main/java/com/onthegomap/planetiler/util/VarInt.java
Show resolved
Hide resolved
*/ | ||
public static long getVarLong(ByteBuffer src) { | ||
long tmp; | ||
if ((tmp = src.get()) >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to make it a loop? I wonder why they did it this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fastPath and slowPath comments here might be relevant: https://github.com/protocolbuffers/protobuf/blob/6c72d103fdc5edb88ecd6342a5dd8dba88f3356f/java/core/src/main/java/com/google/protobuf/CodedInputStream.java#L1048
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a benchmark and changed it to protobuf-java's SlowPath, which uses a for loop and doesn't seem to slow it down much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, wasn't super clear - was looking for the raw license at the top of the file (just regular /*
comment not /**
javadoc) then the javadoc comment on the class says where it's from and what changed.
Also it looks like protobuf java has a different license than java (chatgpt tells me it's BSD-3 plus some extra stuff about generated code) so let's just go back to Bazel's getVarLong implementation - it appears 20% faster anyway.
I tried to just update your branch but it doesn't look like I have access so just here are the suggested changes instead.
/** | ||
* Encode and decode Protocol Buffer-style VarInts. | ||
* | ||
* getVarLong is adapted from readRawVarint64SlowPath in <a href="https://github.com/protocolbuffers/protobuf/blob/main/java/core/src/main/java/com/google/protobuf/CodedInputStream.java">protobuf-java</a>. | ||
* putVarLong is adapted from <a href="https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/util/VarInt.java">Bazel</a>. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* Encode and decode Protocol Buffer-style VarInts. | |
* | |
* getVarLong is adapted from readRawVarint64SlowPath in <a href="https://github.com/protocolbuffers/protobuf/blob/main/java/core/src/main/java/com/google/protobuf/CodedInputStream.java">protobuf-java</a>. | |
* putVarLong is adapted from <a href="https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/util/VarInt.java">Bazel</a>. | |
*/ | |
/** | |
* Encode and decode Protocol Buffer-style VarInts. | |
* | |
* getVarLong is adapted from readRawVarint64SlowPath in <a href="https://github.com/protocolbuffers/protobuf/blob/main/java/core/src/main/java/com/google/protobuf/CodedInputStream.java">protobuf-java</a>. | |
* putVarLong is adapted from <a href="https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/util/VarInt.java">Bazel</a>. | |
*/ |
planetiler-core/src/main/java/com/onthegomap/planetiler/util/VarInt.java
Show resolved
Hide resolved
public static long getVarLong(ByteBuffer src) throws IOException { | ||
long result = 0; | ||
for (int shift = 0; shift < 64; shift += 7) { | ||
final byte b = src.get(); | ||
result |= (long) (b & 0x7F) << shift; | ||
if ((b & 0x80) == 0) { | ||
return result; | ||
} | ||
} | ||
|
||
throw new IOException("Varint exceeded 10 bytes"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the churn here but let's just go back to Bazel's implementation so we don't need to deal with 2 different licenses in the same file. It appears to about 20% faster on my macbook as well so probably worth the extra code.
public static long getVarLong(ByteBuffer src) throws IOException { | |
long result = 0; | |
for (int shift = 0; shift < 64; shift += 7) { | |
final byte b = src.get(); | |
result |= (long) (b & 0x7F) << shift; | |
if ((b & 0x80) == 0) { | |
return result; | |
} | |
} | |
throw new IOException("Varint exceeded 10 bytes"); | |
} | |
public static long getVarLong(ByteBuffer src) { | |
long tmp; | |
if ((tmp = src.get()) >= 0) { | |
return tmp; | |
} | |
long result = tmp & 0x7f; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 7; | |
} else { | |
result |= (tmp & 0x7f) << 7; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 14; | |
} else { | |
result |= (tmp & 0x7f) << 14; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 21; | |
} else { | |
result |= (tmp & 0x7f) << 21; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 28; | |
} else { | |
result |= (tmp & 0x7f) << 28; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 35; | |
} else { | |
result |= (tmp & 0x7f) << 35; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 42; | |
} else { | |
result |= (tmp & 0x7f) << 42; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 49; | |
} else { | |
result |= (tmp & 0x7f) << 49; | |
if ((tmp = src.get()) >= 0) { | |
result |= tmp << 56; | |
} else { | |
result |= (tmp & 0x7f) << 56; | |
result |= ((long) src.get()) << 63; | |
} | |
} | |
} | |
} | |
} | |
} | |
} | |
} | |
return result; | |
} |
planetiler-core/src/main/java/com/onthegomap/planetiler/util/VarInt.java
Outdated
Show resolved
Hide resolved
SonarCloud Quality Gate failed. |
Another prerequisite to #98
The Protobuf package doesn't expose its Varint encoding as a public API, so I copied this one from Bazel
and adapted it since the only parts relevant for pmtiles support are unsigned longs ... maybe this is useful somewhere else too?