ResultSet & Statement leaks in simple query #871

Closed
kdubb opened this issue Apr 10, 2020 · 7 comments
Labels
waiting for reply Additional information required

Comments

@kdubb

kdubb commented Apr 10, 2020

The PreparedStatement allocated here:

return JdbcPreparedStatementImpl(connection.prepareStatement(sql, generated), returnKeys)

and the ResultSet allocated here:

are leaked because close() is never called. Calling close() is a requirement of JDBC, and some drivers (e.g. pgjdbc-ng) rely on this resource cleanup to keep memory usage bounded.
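For reference, the contract being described here is conventionally satisfied with try-with-resources, which guarantees close() runs even when reading rows throws. A minimal sketch (the Fake* classes are stand-ins, not Exposed or driver code):

```java
import java.util.ArrayList;
import java.util.List;

public class CloseContract {
    static final List<String> closed = new ArrayList<>();

    // Minimal stand-ins for java.sql.Statement / java.sql.ResultSet that
    // just record when close() runs.
    static class FakeResultSet implements AutoCloseable {
        @Override public void close() { closed.add("resultSet"); }
    }

    static class FakeStatement implements AutoCloseable {
        FakeResultSet executeQuery() { return new FakeResultSet(); }
        @Override public void close() { closed.add("statement"); }
    }

    public static void main(String[] args) {
        // try-with-resources guarantees close() even if reading rows throws,
        // closing in reverse order of acquisition: ResultSet, then Statement.
        try (FakeStatement stmt = new FakeStatement();
             FakeResultSet rs = stmt.executeQuery()) {
            // ... read rows ...
        }
        System.out.println(closed); // [resultSet, statement]
    }
}
```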

I was using the smallest of tests while debugging an issue logged against the pgjdbc-ng driver, so I cannot say this is the full extent of the leaks; looking through the codebase, there appear to be quite a few more leaks similar to these.

Here was my simple test:

    val db = Database.connect(
        "jdbc:pgsql://localhost/test",
        user = "test",
        password = "test",
        driver = "com.impossibl.postgres.jdbc.PGDriver"
    )

    transaction(db) {

        for (i in 0 until 100000) {
            TestTable
                .selectAll()
                .map { Test(it[id], it[name], it[bin]) }
        }

    }

FYI, I ran with the pgjdbc-ng driver and -Xmx256m to help force GC which triggers the driver's internal leak housekeeper to report leaks.

@Tapac
Contributor

Tapac commented Apr 13, 2020

Which Exposed version do you use?
ResultSets and PreparedStatements are closed after execution/before the transaction commit.

Could you prepare a sample project to check the memory leak?

Tapac added the "waiting for reply (Additional information required)" label on Apr 14, 2020
@MrPowerGamerBR
Contributor

I'm not sure how to reproduce it, but I can confirm that PG-JDBC is complaining that there are leaks.

[18:07:25.498] [PG-JDBC I/O (1)/ERROR]: LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
Created at:
        io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:349)
        io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
        io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
        io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
        io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
        io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
        io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
        io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
        io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
        io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        java.base/java.lang.Thread.run(Thread.java:832)

@Tapac
Contributor

Tapac commented May 9, 2020

@MrPowerGamerBR , do you use Spring/Spring Boot or a pure Exposed transaction? Also, is it the suspended version?

@MrPowerGamerBR
Contributor

@Tapac pure Exposed transaction, not using suspended queries.

@arbiter34

arbiter34 commented May 20, 2020

I don't know exactly where it is, but everything points to a race between GC and resource releasing, such that under load a stop-the-world GC pause runs before ByteBuf.release().

It's not deterministic behavior, and I'm unable to trigger it without doing a continual read.

The two examples below seem to bear this out.

Note: the only change is moving the .use on rs.getBinaryStream(index) into readObject. This emulates the difference between PGResultSet.getObject, which is the default behavior for BinaryColumnType() since readObject is not overridden, and PGResultSet.getBytes, which is the first method Exposed calls for BlobColumnType on Postgres.

https://github.com/impossibl/pgjdbc-ng/blob/efd596feb89e96786ab55f6e8945c0ba1f9624fc/driver/src/main/java/com/impossibl/postgres/jdbc/PGResultSet.java#L829

https://github.com/impossibl/pgjdbc-ng/blob/efd596feb89e96786ab55f6e8945c0ba1f9624fc/driver/src/main/java/com/impossibl/postgres/jdbc/PGResultSet.java#L716

The only reason this isn't a problem for the standard Postgres driver is that it doesn't support binary streaming, so it just uses ByteArrayInputStream(getBytes(index)).

import com.google.protobuf.Message
import io.netty.buffer.ByteBufInputStream
import java.io.InputStream
import java.sql.ResultSet
import org.jetbrains.exposed.sql.Column
import org.jetbrains.exposed.sql.ColumnType
import org.jetbrains.exposed.sql.Table

fun <R : Message, T : Table> T.protoBytes(name: String, message: R): Column<R> =
    registerColumn(name, ProtoByteAColumnType(message))

class ProtoByteAColumnType<T: Message>(private val message: T): ColumnType() {

    override fun sqlType() = "bytea"

    override fun notNullValueToDB(value: Any): Any =
        when {
            message::class.java.isInstance(value) -> message::class.java.cast(value).toByteArray()
            else -> throw IllegalStateException("Unknown type when serializing ${message::class.java.canonicalName}: ${value::class.java.canonicalName}")
        }

    override fun valueFromDB(value: Any): Any =
        when (value) {
            is ByteArray -> message.parserForType.parseFrom(value)
            is Message -> value
            is InputStream -> value.use { message.parserForType.parseFrom(it) }
            is ByteBufInputStream -> value.use { message.parserForType.parseFrom(it.readBytes()) }
            else -> throw IllegalStateException("Unknown type when serializing ${message::class.java.canonicalName}: ${value::class.java.canonicalName}")
        }

    override fun valueToDB(value: Any?): Any? = value?.let { notNullValueToDB(value) }

    override fun readObject(
        rs: ResultSet,
        index: Int
    ): Any? {
        return rs.getBinaryStream(index)
    }
}

class ProtoByteAColumnTypeFixed<T: Message>(private val message: T): ColumnType() {

    override fun sqlType() = "bytea"

    override fun notNullValueToDB(value: Any): Any =
        when {
            message::class.java.isInstance(value) -> message::class.java.cast(value).toByteArray()
            else -> throw IllegalStateException("Unknown type when serializing ${message::class.java.canonicalName}: ${value::class.java.canonicalName}")
        }

    override fun valueFromDB(value: Any): Any =
        when (value) {
            is ByteArray -> message.parserForType.parseFrom(value)
            is Message -> value
            is InputStream -> value.use { message.parserForType.parseFrom(it) }
            is ByteBufInputStream -> throw IllegalStateException("Known bug when using ByteBufInputStream inside of valueFromDB due to race between GC and Ref Counting")
            else -> throw IllegalStateException("Unknown type when serializing ${message::class.java.canonicalName}: ${value::class.java.canonicalName}")
        }

    override fun valueToDB(value: Any?): Any? = value?.let { notNullValueToDB(value) }

    override fun readObject(
        rs: ResultSet,
        index: Int
    ): Any? {
        return rs.getBinaryStream(index).use { it.readAllBytes() }
    }
}
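The difference between the two classes can be reduced to a toy sketch: materialize the bytes while the driver-owned stream is still valid, then close it, so nothing reference-counted escapes readObject. RefCountedStream below is an invented stand-in for Netty's ByteBufInputStream, not a real class:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class EagerRead {
    static int bytesRead = -1;
    static int refCountAfter = -1;

    // Toy stand-in for a Netty-backed, reference-counted stream.
    static class RefCountedStream extends InputStream {
        private final InputStream delegate;
        int refCount = 1; // dropped to 0 when close() releases the buffer
        RefCountedStream(byte[] data) { delegate = new ByteArrayInputStream(data); }
        @Override public int read() throws IOException { return delegate.read(); }
        @Override public void close() throws IOException { refCount--; delegate.close(); }
    }

    // Analogue of ProtoByteAColumnTypeFixed.readObject: drain and close eagerly.
    static byte[] readObject(RefCountedStream stream) throws IOException {
        try (RefCountedStream in = stream) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) {
        try {
            RefCountedStream stream = new RefCountedStream(new byte[] {1, 2, 3});
            bytesRead = readObject(stream).length;
            refCountAfter = stream.refCount; // released before anything escapes
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        System.out.println(bytesRead + " " + refCountAfter); // 3 0
    }
}
```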

@Tapac
Contributor

Tapac commented May 24, 2020

Maybe you are right and the memory leak's cause is Blobs' InputStreams not being closed.
I will fix it for the binary and blob column types. Please check it in the next release (0.25.1).

@arbiter34

arbiter34 commented May 25, 2020

We really appreciate you looking at this.

I'm using 0.17.7 of Exposed for these code snippets by the way.

I'm not convinced the issue is necessarily with Exposed, Impossibl's driver, or Netty, but rather with an unfortunate combination of the three.

Maybe you are right and memory leak cause is not closed Blobs' InputStreams.

If I'm understanding you right and you are saying the leak may come from not closing Blobs' InputStreams, I'm quite sure that's not the case. In fact (anecdotally, of course), one of my coworkers temporarily switched to BlobColumnType because it bypasses this bug via the following.

Note: for Postgres, blobAsStream is always true, so the top if branch is always taken.

While the else branch is never hit in our case, it does appear to be missing a .use or try-with-resources. It's not the issue here, just worth pointing out.

From BlobColumnType:

override fun readObject(rs: ResultSet, index: Int): Any? {
    return if (currentDialect.dataTypeProvider.blobAsStream) { // <-- this path is always taken in Postgres setups
        rs.getBytes(index)?.let(::ExposedBlob)
    } else {
        rs.getBlob(index)?.binaryStream?.readBytes()?.let(::ExposedBlob)
    }
}

The following is Impossibl's ResultSet.getBytes(index) implementation; as you can see, they use try-with-resources on the input stream, which is why BlobColumnType hasn't been an issue.

public byte[] getBytes(int columnIndex) throws SQLException {
  checkClosed();
  checkRow();
  checkColumnIndex(columnIndex);

  try (InputStream data = getVal(columnIndex, context, InputStream.class, null)) {

    if (data == null) {
      return null;
    }

    return ByteStreams.toByteArray(data);
  }
  catch (IOException e) {
    throw new SQLException(e);
  }
}
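A toy illustration of why this copy-then-stream approach is inherently leak-proof (hypothetical code, not the actual driver): the returned stream wraps a defensive heap copy, so even a caller that never calls close() holds no driver-owned resource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class CopyBasedStream {
    // Analogue of ByteArrayInputStream(getBytes(index)): copy the column
    // bytes out of the driver first, then hand out a stream over the copy.
    static InputStream getBinaryStream(byte[] columnBytes) {
        return new ByteArrayInputStream(Arrays.copyOf(columnBytes, columnBytes.length));
    }

    static int drainLength(InputStream in) {
        try {
            return in.readAllBytes().length;
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        InputStream in = getBinaryStream(new byte[] {10, 20, 30});
        // Even if close() is never called, only heap memory is referenced;
        // there is no native buffer or connection state left open.
        System.out.println(drainLength(in)); // 3
    }
}
```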

Using BinaryColumnType, which doesn't have its own readObject override, falls back to the following from IColumnType.

fun readObject(rs: ResultSet, index: Int): Any? = rs.getObject(index)

In Impossibl's driver, this returns a default based on the Postgres column type, since no return class was specified in the IColumnType readObject above. In fairness, I don't see a clean way for you to specify the correct generic type in IColumnType's readObject.

This snippet is from Impossibl's driver and shows the default return type class for BYTEA when one isn't specified.

    @Override
    public Class<?> getDefaultClass() {
      return InputStream.class;
    }

The end result is that BinaryColumnType, and any class that extends it, will leak memory in a setup that uses Postgres with Impossibl's driver, unless users are aware of the odd lifecycle requirements for binary streams in this configuration.

The ByteBufInputStream is a Netty reference-counted object. The bug manifests when the ByteBufInputStream is used later, outside of readObject; in our case, the later usage was in valueFromDB. We've mitigated our issue, since there was zero reason for us not to just deserialize in the readObject method. That change seems to eliminate the bug, which again leads me to believe there's a race between when Netty decides a given ByteBuf has no remaining references (and reclaims it) and when Exposed, or users of Exposed, are actually done with the stream.
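That race can be mocked up without Netty at all (toy code; DriverOwnedStream and reclaim() are invented names, not real driver APIs): reading eagerly succeeds, while a read deferred past the point the driver reclaims the buffer fails.

```java
import java.io.IOException;
import java.io.InputStream;

public class DeferredUse {
    static int eagerLength = -1;
    static boolean deferredFailed = false;

    // Toy stream whose backing buffer the "driver" may reclaim at any time
    // after it believes the row has been consumed.
    static class DriverOwnedStream extends InputStream {
        private byte[] buf = new byte[] {1, 2, 3};
        private int pos = 0;
        void reclaim() { buf = null; } // simulates the buffer being released
        @Override public int read() throws IOException {
            if (buf == null) throw new IOException("buffer already reclaimed");
            return pos < buf.length ? buf[pos++] : -1;
        }
    }

    public static void main(String[] args) {
        try {
            // Safe: consume inside readObject, before the driver moves on.
            DriverOwnedStream eager = new DriverOwnedStream();
            eagerLength = eager.readAllBytes().length;
            eager.reclaim();

            // Unsafe: hold the raw stream and read later (e.g. in valueFromDB).
            DriverOwnedStream deferred = new DriverOwnedStream();
            deferred.reclaim(); // the race: the driver gets there first
            try {
                deferred.readAllBytes();
            } catch (IOException e) {
                deferredFailed = true;
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        System.out.println(eagerLength + " " + deferredFailed); // 3 true
    }
}
```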

This is not an important issue for our company because we have no need for binary streaming, but I'd expect that a use case with a longer-running binary stream would leak and eventually bring down an app.
