Java gatherer #1523
Speedup using
First run shows speedups: 1,001 rows: 13.2x
Integrations/src/main/java/io/deephaven/integrations/learn/Gatherer.java
…eephaven.java for a bit
Here's an example of a Python query with the current code:
In the current instantiation of the … I have tested this for all data types and have only found issues with boolean values. I believe this is because the …

Right now, the following NumPy and Python data types are supported:
Python built-in: int, float, bool (supported by explicit conversion to the corresponding NumPy dtype)

When I tested with Python built-in types, I got strange errors I didn't fully understand, so I decided that explicit conversion to the corresponding NumPy dtype is appropriate. I have yet to implement a …
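The explicit conversion described above could look roughly like the sketch below. The helper name `to_numpy_dtype` and the specific target dtypes (`np.bool_`, `np.int64`, `np.float64`) are assumptions made for illustration; the PR does not state which NumPy dtypes the built-ins map to.

```python
import numpy as np

# Hypothetical mapping from Python built-in types to NumPy dtypes.
# The chosen targets are assumptions, not taken from the PR.
_BUILTIN_TO_NUMPY = {
    bool: np.bool_,
    int: np.int64,
    float: np.float64,
}


def to_numpy_dtype(requested):
    """Return a NumPy dtype for a Python built-in or NumPy type.

    Built-ins are converted explicitly through the table above so the
    result does not depend on platform defaults; anything else is
    handed to np.dtype unchanged.
    """
    return np.dtype(_BUILTIN_TO_NUMPY.get(requested, requested))


builtin_int = to_numpy_dtype(int)        # explicit conversion of a built-in
passthrough = to_numpy_dtype(np.float32)  # NumPy types pass through as-is
```

Routing the built-ins through a fixed table keeps the resulting dtype the same on every platform, which is one plausible reason to prefer explicit conversion over passing `int` or `float` straight through.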
Integrations/src/main/java/io/deephaven/integrations/learn/Scatterer.java
…n accordance with
Currently, |
… and the corresponding Python code
The Java gatherer transfers table data to a Python object. On the Python side it relies on frombuffer, which both NumPy and Torch provide. This is acceptable because these are the two most common data containers used with AI.

Speed testing between Chip's v1 and v2 showed mixed results. This code is his v2, because it showed better speedups more often than v1. Each version won often enough to justify including both methods, but that seems unnecessary given that both showed speedups between 5x and 250x over the legacy method.
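As a rough illustration of the frombuffer step, here is a minimal sketch using NumPy only. The function name `gather_from_buffer`, the row-major float64 layout, and the `bytes` object standing in for the Java-side buffer are all assumptions for the example; the actual layout produced by Gatherer.java is not shown here.

```python
import numpy as np

n_rows, n_cols = 3, 2

# Stand-in for the flat buffer the Java side would hand to Python.
# Assumption: values are float64 and laid out row by row.
raw = np.arange(n_rows * n_cols, dtype=np.float64).tobytes()


def gather_from_buffer(buf, n_rows, n_cols, dtype=np.float64):
    """Interpret a flat buffer as an (n_rows, n_cols) array without copying."""
    flat = np.frombuffer(buf, dtype=dtype)  # view over the buffer, no copy
    return flat.reshape(n_rows, n_cols)


tensor = gather_from_buffer(raw, n_rows, n_cols)
```

Because `np.frombuffer` creates a view rather than a copy, the result over an immutable `bytes` object is read-only; call `.copy()` if the data needs to be modified. Torch offers a similarly named `torch.frombuffer` for building a tensor over a buffer.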
I have tested for memory leaks and for differences between the outputs of the legacy and new code, and found none.