Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Initial Java binding through JNI. #21

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mzhaom
Copy link

@mzhaom mzhaom commented Jan 1, 2022

This is a draft implementation to JNI binding.

Major questions for the next step:

  • Is going through the RiegeliReader/Writer the right starting point?
  • Shall we provide a byte{reader,writer} implementation that can work with java.io.{Input,Out}Stream?

@QrczakMK
Copy link
Member

QrczakMK commented Jan 3, 2022

Thank you!

I will go through this tomorrow.

I assume you mean Record{Reader,Writer}. Yes, it seems to be a good starting point.

Yes, there should be a riegeli::{Reader,Writer} adapting a java.io.{Input,Output}Stream (perhaps called riegeli::Java{Reader,Writer}), like riegeli::Python{Reader,Writer}. Given the API of Java streams (read/write functions which copy pieces of data to/from arrays, as opposed to e.g. providing pointers to buffers owned by the streams — which would not work well across Java/C++ anyway), they too should derive from riegeli::Buffered{Reader,Writer}.

I don’t know how random access should be supported. In Riegeli this is an optional functionality of a Reader (and also Writer, but that is rarely useful). I am not familiar enough with Java to know one typically expresses a random access input stream. There is java.io.RandomAccessFile, but it is file-specific. There is java.nio.channels.SeekableByteChannel, but channels are a separate class hierarchy from streams — are streams or channels recommended? Initially we can ignore random access (and skip wrap functions which require it, like riegeli::RecordReader::Seek()), but eventually something should be done with that.

The Python wrapper ignores the possibility of substituting a custom ChunkWriter (and a ChunkReader is not really substitutable even in C++, because it lacks a separate interface and implementation). This will be fine in Java as well.

@QrczakMK
Copy link
Member

QrczakMK commented Jan 3, 2022

Instead of java.io.{Input,Output}Stream, I think Java{Reader,Writer} should take java.nio.{Readable,Writable}ByteChannel.

Benefits:

  • An interface rather than a class makes it easier to be implemented without creating an adapter object.
  • Random access can be queried with instanceof SeekableByteChannel.
  • https://docs.oracle.com/javase/9/docs/specs/jni/functions.html#getdirectbufferaddress allows to reduce data copying. This means that riegeli::Java{Reader,Writer} would not be implemented using riegeli::Buffered{Reader,Writer} but using riegeli::{PullableReader,PushableWriter}.

@weixin-zhang-weride-ai
Copy link

Hi everyone, just curious how is the progress, I am also looking for Java support for riegeli, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants