
Java heap space error when writing > 2 GB file content as Blob into Oracle Table #1617

Closed
harinivas-ganapathy opened this issue Oct 28, 2022 · 13 comments

Comments

@harinivas-ganapathy

I have a scenario where I need to read a remote file of more than 2 GB into an Oracle Blob column. Whenever the remote file is larger than about 1.2 GB, I get a Java heap space error.

Below is my implementation. Can you please help?

object FileDownload : Table("FILE_DOWNLOAD") {
    private val fileID = integer("FILE_ID")
    val fileContent = blob("FILE_CONTENT")
}

val inStream = sftp.get(filePath)

transaction(db) {
    FileDownload.insert {
        it[fileContent] = ExposedBlob(inStream.readBytes())
    }
    inStream.close()
}
@AlexeySoshin
Contributor

Looking at the code, I think Exposed doesn't handle LOBs that well at the moment:

Here the ExposedBlob wrapper will create a copy of the LOB in memory.

Now, under the hood, when writing to the database it actually uses an InputStream, which should read the data in chunks, thus avoiding the OutOfMemory error you're getting:

is InputStream -> stmt.setInputStream(index, toSetValue)

What you can try is using raw statements to see if that solves your issue.

transaction(db) {
    exec(
        "insert into FILE_DOWNLOAD (FILE_CONTENT) values (?)",
        listOf(BlobColumnType() to inStream)
    )
}
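
For reference, roughly what I have in mind end to end, with the stream closed afterwards (the FILE_ID handling and IntegerColumnType are my own assumptions here, since your table also declares that column; adjust to whatever your key strategy actually is):

import org.jetbrains.exposed.sql.BlobColumnType
import org.jetbrains.exposed.sql.Database
import org.jetbrains.exposed.sql.IntegerColumnType
import org.jetbrains.exposed.sql.transactions.transaction
import java.io.InputStream

fun insertViaRawStatement(db: Database, inStream: InputStream) {
    // Close the stream even if the insert fails
    inStream.use { stream ->
        transaction(db) {
            exec(
                "insert into FILE_DOWNLOAD (FILE_ID, FILE_CONTENT) values (?, ?)",
                listOf(
                    IntegerColumnType() to 1,   // placeholder id
                    BlobColumnType() to stream  // bind the stream itself, not readBytes()
                )
            )
        }
    }
}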

If it does help, then I'd be happy to look into adding support for InputStreams to the ExposedBlob type.

@harinivas-ganapathy
Author

Thanks for the suggestion. I tried using raw statements; although I don't see any error, when I check the Oracle database I only see a null value inserted into the Blob column.

@AlexeySoshin
Contributor

Are you sure that your input stream is not null?

@harinivas-ganapathy
Author

Yes, I can confirm the remote file has 5 GB of content. Also, I get the Java heap error when trying to call readBytes() on the input stream.

@harinivas-ganapathy
Author

Also, when I tried a smaller file I only see a null value in the DB. But if I do BlobColumnType() to inStream.readBytes() for the same smaller file, I can see the right value in the Blob column.

@AlexeySoshin
Contributor

readBytes() will always attempt to read the entire file contents into memory.

That's why you'll get OutOfMemory for larger files.

But BlobColumnType() to inStream should work, although I could only test it on PostgreSQL.

Two things to check:

  1. BlobColumnType() to inStream with a small file.
  2. What's the return type of sftp.get(filePath)? Is it InputStream or something more specific? (See the sketch below.)
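
For the second point, a quick way to see what the SFTP client actually hands back at runtime (just a sketch; the goal is to confirm it really is an InputStream subtype, since that is the branch used when binding the value):

val inStream = sftp.get(filePath)
// Print the concrete class, e.g. some vendor-specific stream type
println("sftp.get returned: ${inStream.javaClass.name}")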

Tapac added a commit that referenced this issue Nov 12, 2022
…racle Table #1617 / SQLServer doesn't support read from stream after statement is closed
@Tapac
Contributor

Tapac commented Nov 12, 2022

It might become better with the next release.
Now ExposedBlob stores an inputStream by default, so please wait for the release and check whether it helps or not.
Don't forget to replace ExposedBlob(inStream.readBytes()) with ExposedBlob(inStream)
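
With that change, the insert from the original post would look roughly like this (a sketch against 0.41.x):

val inStream = sftp.get(filePath)

inStream.use { stream ->
    transaction(db) {
        FileDownload.insert {
            // Pass the stream itself; no readBytes(), so nothing is buffered up front
            it[fileContent] = ExposedBlob(stream)
        }
    }
}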

@Tapac
Contributor

Tapac commented Nov 14, 2022

@harinivas-ganapathy , please check with Exposed 0.41.1

@harinivas-ganapathy
Author

Tried with Exposed 0.41.1.
When trying to load a remote file using
val inStream = sftp.get(fileName)

After replacing it with ExposedBlob(inStream), again, though I don't see any errors, I only see null inserted into the Blob column.

This is the same behavior for a 1 GB file and a 2 GB file.

However, I tried loading local files of 1 GB, 2.8 GB, and 5 GB into the Blob column:
the 1 GB file was inserted successfully, while the 2.8 GB and 5 GB files were trimmed to 2.14 GB in the Blob column.

@harinivas-ganapathy
Author

Also, I noticed that inStream.available() (for the local file) returns 2147483647, which is the same as the number of bytes written into the Blob column.
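
For what it's worth, that number is exactly Integer.MAX_VALUE, which also matches the ~2.14 GB cut-off above; available() returns an Int, so it can never report more than that, no matter how large the file really is:

// 2147483647 == Int.MAX_VALUE, i.e. about 2.14 GB
println(Int.MAX_VALUE)                            // 2147483647
println(Int.MAX_VALUE / (1000.0 * 1000 * 1000))   // ~2.147 (GB)
// InputStream.available() returns an Int, so it is capped at this value
// regardless of the actual file size.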

@harinivas-ganapathy
Author

But I was able to write the entire 5 GB of data from inStream back into another local file.
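
Roughly like this (the target path is just an example):

val target = java.io.File("/tmp/copy-check.bin")   // example path
target.outputStream().use { out ->
    inStream.use { input ->
        // copyTo() copies in fixed-size chunks, so nothing is buffered fully in memory
        input.copyTo(out)
    }
}
println("bytes on disk: ${target.length()}")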

Tapac added a commit that referenced this issue Dec 12, 2022
@oharaandrew314
Contributor

Is this issue resolved thanks to commit 367babc?

@joc-a
Collaborator

joc-a commented May 8, 2023

Hello @harinivas-ganapathy, could you please check if this issue is resolved with commit 367babc?

joc-a closed this as completed May 30, 2023