Mapping Blob/Clob to ByteBuffer/String is not supported #32

Closed
Michael-A-McMahon opened this issue Jun 30, 2021 · 2 comments · Fixed by #36

Michael-A-McMahon commented Jun 30, 2021

Not supporting BLOB/CLOB to ByteBuffer/String mapping is a known limitation of Oracle R2DBC. Only io.r2dbc.spi.Blob/Clob mapping is supported. This is noted in the documentation, but the reasoning is not explained in depth. So I’d like to share the full extent of my thoughts on this, and see if we can get to a better solution.

When a query returns a LOB value, Oracle Database returns a locator for that value. For the purpose of this discussion, we can think of the locator as a pointer to the actual value. To get the actual value into a buffer, a database driver makes another database call requesting to read the value which a locator points to.
Although the LOB read requires a database call, Oracle R2DBC could still support the ByteBuffer mapping without Row.get(int/String, ByteBuffer.class) having to make a blocking database call. Before emitting the Row to the mapping function, the driver could execute non-blocking database calls to read the LOB content into a buffer. After the content had been buffered, then the Row can be input to the mapping function and the ByteBuffer would be ready to go.
Of course, if the LOB exceeded 2GB, then it would not fit into a ByteBuffer and the driver would need to handle that. But we can ignore this case for the moment, as it doesn't completely prevent Oracle R2DBC from supporting the ByteBuffer mapping.

So, pre-buffering the LOB content is one option to consider. However, this approach devalues the case where user code wants to use io.r2dbc.spi.Blob. Rather than have Blob.stream() respond to backpressure from a Subscriber, the driver would have allocated memory for the entire content of the LOB before a Subscriber has even subscribed.

On the other hand, if Oracle R2DBC only supports the io.r2dbc.spi.Blob mapping, and user code wants to map that into a ByteBuffer, it still has the freedom to implement that mapping itself. If the user code knows that the BLOB value won’t exhaust memory, or exceed the 2GB ByteBuffer capacity, then it can map the BLOB into a ByteBuffer like this:

import io.r2dbc.spi.Blob;
import io.r2dbc.spi.Connection;
import io.r2dbc.spi.ConnectionFactories;
import io.r2dbc.spi.R2dbcException;
import io.r2dbc.spi.Result;
import java.nio.ByteBuffer;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

Flux.usingWhen(ConnectionFactories.get(
  "r2dbc:oracle://test:test@localhost:1521/xepdb1")
  .create(),
  connection ->
    Flux.from(connection.createStatement("DROP TABLE BlobTest")
      .execute())
      .flatMap(Result::getRowsUpdated)
      .onErrorResume(throwable ->
        // If the table doesn't exist, don't emit ORA-00942
        (throwable instanceof R2dbcException)
        && ((R2dbcException) throwable).getErrorCode() == 942,
        throwable -> Mono.empty())
      .thenMany(connection.createStatement(
        "CREATE TABLE BlobTest (value BLOB)")
        .execute())
      .flatMap(Result::getRowsUpdated)
      .thenMany(connection.createStatement(
        "INSERT INTO BlobTest VALUES(?)")
        .bind(0, ByteBuffer.wrap(new byte[128_000]))
        .execute())
      .flatMap(Result::getRowsUpdated)
      .thenMany(connection.createStatement(
        "SELECT value FROM BlobTest")
        .execute())
      .flatMap(result ->
        result.map((row, metadata) -> row.get(0, Blob.class)))
      .flatMap(blob ->
        Flux.from(blob.stream())
          .reduce(ByteBuffer.allocate(8192),
            (buffer, next) -> {
              // Ensure capacity for all remaining bytes in the next buffer.
              if (buffer.remaining() < next.remaining()) {
                ByteBuffer newBuffer = ByteBuffer.allocate(Math.max(
                    buffer.capacity()
                      + (next.remaining() - buffer.remaining()),
                    buffer.capacity() << 1));
                buffer = newBuffer.put(buffer.flip());
              }

              buffer.put(next);
              return buffer;
            })),
  Connection::close)
  .toStream()
  .map(ByteBuffer::flip)
  .forEach(System.out::println);

As shown above, the ability for user code to implement a Blob to ByteBuffer mapping is what ultimately led to the decision for Oracle R2DBC to only support the Blob mapping. With Blob, user code still has the option to map it into a ByteBuffer if it wants to, but it can also choose to process the Blob as a stream of smaller buffers instead.

So far, we’ve only considered solutions that represent two extremes. Either: A) Buffer everything, or B) Buffer nothing. Option C might look like this:

  • Let X be a reasonable buffer size, one that should not exhaust the memory resources of modern hardware when there are N instances of X, and N is a reasonable upper bound for the number of database connections.
  • The Oracle driver buffers X bytes from a BLOB into memory before emitting a Row to the mapping function.
  • The mapping function calls Row.get(…, T), where T may be ByteBuffer.class or Blob.class.
    • If T is Blob.class, then return a Blob instance with X bytes pre-buffered.
    • If T is ByteBuffer.class, then let L be the length of the BLOB.
      • If L > 2GB: Throw an exception. The BLOB won’t fit into a ByteBuffer.
      • If L > X: Execute a blocking database call to read the remaining length of the BLOB into the buffer.
      • Else: Return the fully-buffered BLOB.
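As a rough illustration, the branching above could be sketched like this. All names here (OptionCSketch, PREFETCH_SIZE, Path, choosePath) are hypothetical, not Oracle R2DBC API; X is modeled as a fixed prefetch size:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the "Option C" branching, not Oracle R2DBC API.
public class OptionCSketch {

  static final long PREFETCH_SIZE = 8L * 1024 * 1024;  // X: bytes buffered before a Row is emitted
  static final long MAX_BUFFER = Integer.MAX_VALUE;    // ~2GB: maximum ByteBuffer capacity

  enum Path { STREAM_BLOB, RETURN_PREFETCHED, BLOCKING_READ, TOO_LARGE }

  /** Decide how Row.get(..., type) would serve a BLOB of length lobLength. */
  static Path choosePath(Class<?> type, long lobLength) {
    if (type != ByteBuffer.class)
      return Path.STREAM_BLOB;        // Blob.class: stream, with X bytes pre-buffered
    if (lobLength > MAX_BUFFER)
      return Path.TOO_LARGE;          // L > 2GB: throw, the BLOB can't fit
    if (lobLength > PREFETCH_SIZE)
      return Path.BLOCKING_READ;      // L > X: blocking call reads the remainder
    return Path.RETURN_PREFETCHED;    // L <= X: the BLOB is already fully buffered
  }
}
```

The two branches to worry about are TOO_LARGE and BLOCKING_READ, which is exactly the pitfall discussed next.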

I find the solution described above to be problematic because the cases where errors and blocking database calls occur seem like pitfalls that are easy to miss. It seems too likely that a system would be verified by tests that miss the case where a LOB exceeds 2GB, and then fail in production when the >2GB case occurs. And a blocking database call is really hard to detect unless you have something like Java Mission Control to measure socket read time.
Although having to implement a ByteBuffer mapping with something like the reduce operator above puts some burden on user code, it seemed like a better alternative than introducing the pitfalls I’ve described.

Of course, it would be excellent if Oracle R2DBC could support the ByteBuffer mapping. I’m happy to discuss new solutions with anyone that wants to explore this further.

@lukaseder

> Of course, if the LOB exceeded 2GB, then it would not fit into a ByteBuffer and the driver would need to handle that. But we can ignore this case for the moment, as it doesn't completely prevent Oracle R2DBC from supporting the ByteBuffer mapping.

I think this is an edge case. Sure, it's great that very large BLOBs can be handled by both R2DBC and JDBC, but BLOBs are mostly used to store images, PDFs and the like (KBs, not GBs), in which case the blocking nature of the buffering can be ignored, just like for strings. The VARCHAR2/CLOB distinction is already quite an annoyance in the database; it got a bit better when VARCHAR2 was extended to hold at most 32KB. But the fact that I have to think about this all the time when aggregating strings with LISTAGG or JSON with JSON_ARRAYAGG is really annoying. It feels like some very low-level detail leaking throughout the design of Oracle SQL, PL/SQL, OCI, JDBC, right into my client application.

Frankly, if I had to store GBs, I'm not convinced I'd store them in the database. A file system seems just as good.

I think the tradeoff here should be in favour of usability. It's already very hard to correctly stream LOBs with JDBC (while correctly managing resources), and I doubt most people are doing it, let alone doing it right; e.g. I've met only a few people who are even aware of Blob.free() and Clob.free() in JDBC. It's possible to fetch byte[] and String very easily instead with JDBC. It's much harder even with R2DBC, so this should only be required when strictly needed, in the GB case.

I'm saying this from an adoption perspective. If users have to write so much boilerplate code every time a small-ish BLOB/CLOB is read, then they might re-consider the entire reactive model, at least in my opinion, especially in the CLOB case.

We're now discussing the case of a single Blob in a single column. What if we have a result set with 10 Blobs/Clobs? A simple SQL select will turn into a 100 line mess of streaming infrastructure boilerplate logic.

> I find the solution described above to be problematic because the cases where errors and blocking database calls occur seem like pitfalls that are easy to miss. It seems too likely that a system would be verified by tests that miss the case where a LOB exceeds 2GB, and then fail in production when the >2GB case occurs

It's great that the edge cases are kept in mind, no doubt. But let's also keep in mind that the maximum VARCHAR2 size was 4KB until recently, and it is 32KB now. That is extremely small.

Thinking of all the things that can go wrong when an inexperienced developer deserialises a stream of ByteBuffer chunks, compared to the few things that can occasionally go wrong when the LOB is that big, I would favour the simplicity of making the 80% (or 99%?) case simple to use. Again, this is because Oracle Database has such low limits for VARCHAR2 compared to other RDBMS (e.g. SQL Server's VARCHAR(MAX) supports 2GB), but that's out of scope for this issue.

> Although having to implement a ByteBuffer mapping with something like the reduce operator above puts some burden on user code, it seemed like a better alternative than introducing the pitfalls I’ve described.

Well, I'll tell you what will happen :)

Everyone will implement blocking ByteBuffer calls nonetheless, just like in the old days when ojdbc didn't yet support reading byte[] and String (when was it? Oracle 8i or 9i?). They'll even do that right where they read the Blob from the Row, because that's much easier than thinking about intermediate data structures to store non-LOB data and LOB data and then correctly flatMap the latter later on. At least, libraries like jOOQ and others, which do not have a means of representing such streams within a record, will certainly do that.

jOOQ, for example, doesn't distinguish much between VARCHAR2 and CLOB. It's hardly ever something that a client application will care about. I don't think Hibernate does, either. You can annotate your columns with @Lob, but that's mostly used to generate the appropriate DDL, I think. Users will still consume String values. Neither jOOQ nor Hibernate nor any other ORM I'm aware of will account for 100GB LOB values that are stored in a "record" or "entity" style data structure. Such values are always a special case and it's OK to leave the burden of deciding when to apply the special case to the user, hoping that everyone else doesn't have to pay that price.

> The mapping function calls Row.get(…, T), where T may be a ByteBuffer or Blob

I'm happy with that suggestion from a usability perspective (including the exceptions if > 2GB), can't comment much on the blocking scenario.

@Michael-A-McMahon
Member Author

A major insight for me is that LOBs are rarely larger than 1GB. I confirmed this with several members of the Oracle JDBC Team as well.

If it's uncommon for LOBs to be larger than 1GB, then we don't need to be as concerned about errors arising unexpectedly when LOBs exceed the capacity of ByteBuffer or String. Although it can happen, it will not be a common occurrence. When programmers are dealing with multi-GB LOBs, I think we can assume they are aware of that, and that they know to use the appropriate mapping for this case: Blob/Clob rather than ByteBuffer/String.

So the only concern left is the blocking database call. If the LOB prefetch size is large enough, then this concern can be alleviated as well. If we say that 1GB is the maximum amount of prefetched data, this will be large enough to avoid a blocking call when dealing with LOBs of typical sizes. A call to Row.get(..., ByteBuffer.class) can return a prefetched ByteBuffer without having to perform additional database calls.

So with a default prefetch size of 1GB, and allowing smaller non-default sizes to be configured as well, I think we end up with a satisfactory solution. By default, there are no unexpected errors or blocking database calls when a LOB value of a typical size is mapped to ByteBuffer/String.
For exceptionally large LOBs, a smaller non-default prefetch size can be configured, and the Blob/Clob mapping can be used to conserve memory and to avoid blocking database calls.
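To make the memory tradeoff concrete: in the worst case the driver holds one prefetch buffer per open connection, so the budget is roughly N connections times the prefetch size. A back-of-the-envelope sketch (the names here are illustrative, not Oracle R2DBC API):

```java
// Back-of-the-envelope sketch of the prefetch memory tradeoff described
// above. Names are illustrative; this is not Oracle R2DBC API.
public class PrefetchBudget {

  /** Worst-case prefetch memory: one prefetch buffer held per open connection. */
  static long worstCaseBytes(int maxConnections, long prefetchSizeBytes) {
    return maxConnections * prefetchSizeBytes;
  }

  public static void main(String[] args) {
    long oneGiB = 1L << 30;
    // With a 1GB default, a pool of 20 connections could in the worst case
    // hold 20GB of prefetched LOB data, which is why allowing a smaller
    // non-default prefetch size to be configured matters.
    System.out.println(worstCaseBytes(20, oneGiB)); // prints 21474836480
  }
}
```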

I'll plan to start implementing this new behavior next week, but additional feedback is always welcome.

Thanks to @lukaseder, @Douglas-Surber, @Kuassim for looking into this with me.

@Michael-A-McMahon Michael-A-McMahon linked a pull request Jul 14, 2021 that will close this issue