Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to release a message: UnpooledSlicedByteBuf(freed) #11201

Closed
cdgjzzy opened this issue May 13, 2024 · 16 comments
Closed

Failed to release a message: UnpooledSlicedByteBuf(freed) #11201

cdgjzzy opened this issue May 13, 2024 · 16 comments

Comments

@cdgjzzy
Copy link

cdgjzzy commented May 13, 2024

gRPC-java===1.63.0
java: 17.0.9 2023-10-17 LTS 64-Bit

When I use gRPC streaming responses, I handle it like this:

observers.forEach((context, observer) -> {
    if (!context.isCancelled() && observer.isReady()) {
        observer.onNext(data);
    }
});

This method gets triggered about 1500 times per second. Under such concurrent circumstances, there are issues with gRPC failing to send messages.

Stacktrace and logs

2024-05-13 09:44:10.412 [grpc-nio-worker-ELG-1-7] WARN  io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil.safeRelease - Failed to release a message: UnpooledSlicedByteBuf(freed)
io.grpc.netty.shaded.io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
	at io.grpc.netty.shaded.io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:83)
	at io.grpc.netty.shaded.io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:148)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:101)
	at io.grpc.netty.shaded.io.netty.buffer.CompositeByteBuf$Component.free(CompositeByteBuf.java:1959)
	at io.grpc.netty.shaded.io.netty.buffer.CompositeByteBuf.deallocate(CompositeByteBuf.java:2264)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractReferenceCountedByteBuf.handleRelease(AbstractReferenceCountedByteBuf.java:111)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:101)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractDerivedByteBuf.release0(AbstractDerivedByteBuf.java:98)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractDerivedByteBuf.release(AbstractDerivedByteBuf.java:94)
	at io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil.release(ReferenceCountUtil.java:90)
	at io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil.safeRelease(ReferenceCountUtil.java:116)
	at io.grpc.netty.shaded.io.netty.channel.ChannelOutboundBuffer.remove(ChannelOutboundBuffer.java:280)
	at io.grpc.netty.shaded.io.netty.channel.ChannelOutboundBuffer.removeBytes(ChannelOutboundBuffer.java:361)
	at io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:438)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:931)
	at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:354)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:895)
	at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1372)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:921)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:907)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:893)
	at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler.flush(Http2ConnectionHandler.java:197)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:925)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:907)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:893)
	at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:967)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannel.flush(AbstractChannel.java:254)
	at io.grpc.netty.shaded.io.grpc.netty.WriteQueue.flush(WriteQueue.java:143)
	at io.grpc.netty.shaded.io.grpc.netty.WriteQueue.access$000(WriteQueue.java:35)
	at io.grpc.netty.shaded.io.grpc.netty.WriteQueue$1.run(WriteQueue.java:47)
	at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.runTask$$$capture(AbstractEventExecutor.java:173)
	at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java)
	at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	
	
Client error:
On error,listening end
io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
	at io.grpc.Status.asRuntimeException(Status.java:533)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
	at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: io.grpc.StatusRuntimeException: INTERNAL: Invalid protobuf byte sequence
	at io.grpc.Status.asRuntimeException(Status.java:525)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parse(ProtoLiteUtils.java:240)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parse(ProtoLiteUtils.java:134)
	at io.grpc.MethodDescriptor.parseResponse(MethodDescriptor.java:284)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:657)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:644)
	... 5 common frames omitted
Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field.  This could mean either that the input has been truncated or that an embedded message misreported its own length.
	at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:115)
	at com.google.protobuf.CodedInputStream$ArrayDecoder.readStringRequireUtf8(CodedInputStream.java:822)
	at com.google.protobuf.StringValue$Builder.mergeFrom(StringValue.java:386)
	at com.google.protobuf.StringValue$1.parsePartialFrom(StringValue.java:517)
	at com.google.protobuf.StringValue$1.parsePartialFrom(StringValue.java:509)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:86)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:48)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parseFrom(ProtoLiteUtils.java:245)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parse(ProtoLiteUtils.java:237)
	... 9 common frames omitted
	

If I control the sending frequency with a queue, then this exception does not occur.

@kannanjgithub
Copy link
Contributor

Can you provide a more complete code involving your multiple stream observers use case?

@cdgjzzy
Copy link
Author

cdgjzzy commented May 15, 2024

When a client connects, we store the observer in an observerMap; in reality, we currently only have one client.

@Getter
private final Map<Context, ServerCallStreamObserver<StringValue>> observerMap = new ConcurrentHashMap<>();


@Override
public void currentChartUpdateSubscribe(Empty request, StreamObserver<StringValue> responseObserver) {
   var serverCallStreamObserver = (ServerCallStreamObserver<StringValue>) responseObserver;
   Context currentContext = Context.current();
   observerMap.put(currentContext, serverCallStreamObserver);
   currentContext.addListener(context -> {
       observerMap.remove(context);
   }, ExecutorConfig.GRPC_EXECUTOR);
}

When a SpringEvent occurs, it will trigger the onPricePublish method, which in turn invokes the observer to send a gRPC message. You can increase the concurrency of this event to achieve the desired testing effect.

@Async(value = "priceTaskExecutor")
@EventListener(value = TickShort.class)
public void onPricePublish(TickShort tickShort) {
    String symbol = tickShort.getSymbol();
    StringValue data = StringValue.newBuilder()
            .setValue(symbol)
            .build();
    Map<Context, ServerCallStreamObserver<StringValue>> observers = chartBarEndpoint.getObserverMap();
    observers.forEach((context, observer) -> {
        if (!context.isCancelled() && observer.isReady()) {
            observer.onNext(data);
        }
    });
}

@ejona86
Copy link
Member

ejona86 commented May 16, 2024

This is probably the same as #11115 (fixed in 1.64.0)

@sergiitk
Copy link
Member

sergiitk commented May 16, 2024

I'll release the backported fix to v1.63.

@tcronis
Copy link

tcronis commented May 17, 2024

I'm running into the same issue. I upgraded to 1.64.0, as recommended by the fix above, retested my code and didn't see any improvement. My stack trace is slightly different from the one above, but the same message is printed during a high volume test.

@ejona86
Copy link
Member

ejona86 commented May 17, 2024

@tcronis, if running on 1.64.0, please open a new issue and include the messages you are seeing.

@sergiitk
Copy link
Member

The v1.63.1 release is now available.

Bug fixes

@sergiitk
Copy link
Member

I'm closing this ticket, but feel free to reopen it if you still encounter the issue after upgrading to v1.63.1.

@sergiitk sergiitk changed the title Failed to release a message: UnpooledSlicedByteBuf(freed) grpc-java v1.63.0: Failed to release a message: UnpooledSlicedByteBuf(freed) May 17, 2024
@cdgjzzy
Copy link
Author

cdgjzzy commented May 21, 2024

he issue reoccurred when I used version 1.63.1.

2024-05-21 09:28:20.134 [grpc-nio-worker-ELG-1-5] WARN  io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil.safeRelease - Failed to release a message: UnpooledSlicedByteBuf(freed)
io.grpc.netty.shaded.io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
	at io.grpc.netty.shaded.io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:83)
	at io.grpc.netty.shaded.io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:148)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:101)
	at io.grpc.netty.shaded.io.netty.buffer.CompositeByteBuf$Component.free(CompositeByteBuf.java:1959)
	at io.grpc.netty.shaded.io.netty.buffer.CompositeByteBuf.deallocate(CompositeByteBuf.java:2264)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractReferenceCountedByteBuf.handleRelease(AbstractReferenceCountedByteBuf.java:111)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:101)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractDerivedByteBuf.release0(AbstractDerivedByteBuf.java:98)
	at io.grpc.netty.shaded.io.netty.buffer.AbstractDerivedByteBuf.release(AbstractDerivedByteBuf.java:94)
	at io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil.release(ReferenceCountUtil.java:90)
	at io.grpc.netty.shaded.io.netty.util.ReferenceCountUtil.safeRelease(ReferenceCountUtil.java:116)
	at io.grpc.netty.shaded.io.netty.channel.ChannelOutboundBuffer.remove(ChannelOutboundBuffer.java:280)
	at io.grpc.netty.shaded.io.netty.channel.ChannelOutboundBuffer.removeBytes(ChannelOutboundBuffer.java:361)
	at io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:438)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:931)
	at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:354)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:895)
	at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1372)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:921)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:907)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:893)
	at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler.flush(Http2ConnectionHandler.java:197)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:925)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:907)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:893)
	at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:967)
	at io.grpc.netty.shaded.io.netty.channel.AbstractChannel.flush(AbstractChannel.java:254)
	at io.grpc.netty.shaded.io.grpc.netty.WriteQueue.flush(WriteQueue.java:143)
	at io.grpc.netty.shaded.io.grpc.netty.WriteQueue.access$000(WriteQueue.java:35)
	at io.grpc.netty.shaded.io.grpc.netty.WriteQueue$1.run(WriteQueue.java:47)
	at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.runTask$$$capture(AbstractEventExecutor.java:173)
	at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java)
	at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:840)

@ejona86
Copy link
Member

ejona86 commented May 21, 2024

refCnt: 0, decrement: 1. Oh, I see. This isn't a leak, but decrementing when the refcount is zero. So either a double-free or failing to acquire a reference. That is different from the other issue.

@ejona86 ejona86 reopened this May 21, 2024
@rickyma
Copy link

rickyma commented May 22, 2024

Did you work well with previous versions of gRPC? @cdgjzzy

@cdgjzzy
Copy link
Author

cdgjzzy commented May 22, 2024

We were previously using version 1.58. The issue occurred, and then we tried to resolve it by upgrading to version 1.63.0, but unfortunately, it still persists.

@ejona86
Copy link
Member

ejona86 commented May 22, 2024

@cdgjzzy, I'm suspicious that this isn't a gRPC bug if you saw it in 1.58. I think you might be able to get this if you used a StreamObserver from multiple threads without synchronization.

@sergiitk sergiitk changed the title grpc-java v1.63.0: Failed to release a message: UnpooledSlicedByteBuf(freed) Failed to release a message: UnpooledSlicedByteBuf(freed) May 23, 2024
@ejona86
Copy link
Member

ejona86 commented Jun 6, 2024

@cdgjzzy, can you confirm that you are only calling each stream observer instance from a single thread? If you are calling from multiple threads, it is probably missing synchronization in your code.

@cdgjzzy
Copy link
Author

cdgjzzy commented Jun 7, 2024

Yes, this kind of error occurs when the stream observer is in a multi-threaded environment and there is no Synchronization. I previously thought that the stream observer might be thread-safe, and currently, I have added a Lock in the code to ensure thread safety.

@ejona86
Copy link
Member

ejona86 commented Jun 7, 2024

Glad things seem to be working for you now. If not, comment, and it can be reopened.

@ejona86 ejona86 closed this as completed Jun 7, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants