[CI] CorsNotSetIT.testThatOmittingCorsHeaderDoesNotReturnAnything fails with ByteBuf leak #32342

Closed
dakrone opened this issue Jul 24, 2018 · 18 comments
Assignees
Labels
:Distributed Coordination/Network Http and internode communication implementations >test-failure Triaged test failures from CI v6.4.0

Comments

@dakrone
Member

dakrone commented Jul 24, 2018

From https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/2392/console

{node_s0}{mbPv4JhZQuewN5xyN96p5Q}{OMtH6HNuQVWcdv5nl7jWrQ}{127.0.0.1}{127.0.0.1:43712} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)[, ]]])
11:31:14   1> [2018-07-25T05:31:11,917][ERROR][i.n.u.ResourceLeakDetector] LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
11:31:14   1> Recent access records: 4
11:31:14   1> #4:
11:31:14   1> 	Hint: 'cors' will handle the message from this point.
11:31:14   1> 	io.netty.buffer.AdvancedLeakAwareCompositeByteBuf.touch(AdvancedLeakAwareCompositeByteBuf.java:36)
11:31:14   1> 	io.netty.handler.codec.http.HttpObjectAggregator$AggregatedFullHttpMessage.touch(HttpObjectAggregator.java:374)
11:31:14   1> 	io.netty.handler.codec.http.HttpObjectAggregator$AggregatedFullHttpRequest.touch(HttpObjectAggregator.java:454)
11:31:14   1> 	io.netty.handler.codec.http.HttpObjectAggregator$AggregatedFullHttpRequest.touch(HttpObjectAggregator.java:404)
11:31:14   1> 	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:116)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:345)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
11:31:14   1> 	io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
11:31:14   1> 	io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
11:31:14   1> 	io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
11:31:14   1> 	io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
11:31:14   1> 	io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
11:31:14   1> 	io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
11:31:14   1> 	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)

Doesn't reproduce for me, but the line is:

./gradlew :qa:smoke-test-http:integTestRunner -Dtests.seed=7164671BFDB380AB -Dtests.class=org.elasticsearch.http.CorsNotSetIT -Dtests.method="testThatOmittingCorsHeaderDoesNotReturnAnything" -Dtests.security.manager=true -Dtests.locale=pl-PL -Dtests.timezone=Pacific/Wake
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@dakrone dakrone added :Distributed Coordination/Network Http and internode communication implementations >test-failure Triaged test failures from CI v7.0.0 labels Jul 24, 2018
@original-brownbear
Member

This looks like it could have the same origin as #32289. The failing test and the leaking test don't need to be the same, since the leak is only reported when the buffer is garbage-collected.
I can reproduce the linked issue fairly reliably and fixed it in #32296. I'll take a look here too and see if I can reproduce it :)
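
To illustrate why the reported test and the leaking test can differ, here is a small plain-Java model. It is only a sketch and not Netty's implementation: `TrackedBuffer`, `LeakRegistry` and `LeakModelDemo` are hypothetical names, and the explicit `reportLeaks` call stands in for the check that Netty performs when a buffer is garbage-collected. The test names are taken from this thread purely as labels.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a reference-counted buffer; not Netty's ByteBuf.
final class TrackedBuffer {
    final String allocatedBy;   // name of the test that allocated the buffer
    private int refCnt = 1;     // a new buffer starts with one reference

    TrackedBuffer(String allocatedBy) {
        this.allocatedBy = allocatedBy;
    }

    // Drops one reference; returns true once the count reaches zero.
    boolean release() {
        if (refCnt == 0) {
            throw new IllegalStateException("already released");
        }
        refCnt--;
        return refCnt == 0;
    }

    boolean isReleased() {
        return refCnt == 0;
    }
}

// Hypothetical registry modelling a leak detector that reports lazily.
final class LeakRegistry {
    private final List<TrackedBuffer> tracked = new ArrayList<>();

    TrackedBuffer allocate(String testName) {
        TrackedBuffer buf = new TrackedBuffer(testName);
        tracked.add(buf);
        return buf;
    }

    // Models the deferred check: one report per unreleased buffer, naming
    // the allocating test and the test that happens to be running now.
    List<String> reportLeaks(String currentlyRunningTest) {
        List<String> reports = new ArrayList<>();
        for (TrackedBuffer buf : tracked) {
            if (!buf.isReleased()) {
                reports.add("LEAK: allocated by " + buf.allocatedBy
                        + ", reported during " + currentlyRunningTest);
            }
        }
        tracked.clear();
        return reports;
    }
}

final class LeakModelDemo {
    static List<String> run() {
        LeakRegistry registry = new LeakRegistry();
        // An earlier test in the same JVM allocates two buffers but releases only one.
        registry.allocate("Netty4HttpServerPipeliningTests").release();
        registry.allocate("Netty4HttpServerPipeliningTests");
        // The check fires later, while an unrelated test is running.
        return registry.reportLeaks("CorsNotSetIT");
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model the single report names `CorsNotSetIT` as the running test even though the unreleased buffer came from the earlier test, which matches the pattern described above.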

@original-brownbear original-brownbear self-assigned this Jul 24, 2018
@dakrone
Member Author

dakrone commented Jul 24, 2018

Here's another ByteBuf leak that just happened, but in a different test: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=debian/1201/console

@original-brownbear
Member

That one's probably also coming from Netty4HttpServerPipeliningTests leaking (from the log it seems to have run in the same JVM and before the failure) => #32296 should fix this as well.

@dakrone
Member Author

dakrone commented Jul 24, 2018

@nik9000
Member

nik9000 commented Jul 25, 2018

@original-brownbear
Member

This was fixed in master via #32296 but is still an issue in 6.x. The fix for master isn't applicable to 6.x because 6.x doesn't contain #30820 => still looking into this one

@original-brownbear
Member

@tbrooks8 can you take a look here for 6.x? See the comment above. The leak I fixed in master was in the tests, but 6.x doesn't have your PR (#30820), and the test I fixed isn't what's leaking here. You probably know exactly where to look since you made that change :)

@original-brownbear
Member

master has remaining issues too ... see #32354 (activating paranoid leak checking still makes things fail)
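
For context on "paranoid" leak checking: Netty's leak detector chooses its level from the `io.netty.leakDetection.level` system property, and `paranoid` tracks every buffer instead of a sample. Below is a minimal sketch of setting that property from code. How #32354 actually enables it in the build is not shown in this thread, and `LeakLevelConfig` is a hypothetical helper name.

```java
// Hypothetical helper; not part of Elasticsearch or Netty.
final class LeakLevelConfig {
    // System property that Netty's ResourceLeakDetector reads when it initializes.
    static final String PROPERTY = "io.netty.leakDetection.level";

    // Sets the level to "paranoid" and returns the value now in effect.
    // This has to run before Netty's ResourceLeakDetector class is loaded,
    // because the property is read once during class initialization.
    static String enableParanoid() {
        System.setProperty(PROPERTY, "paranoid");
        return System.getProperty(PROPERTY);
    }
}
```

Passing `-Dio.netty.leakDetection.level=paranoid` on the JVM command line has the same effect without any code.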

@original-brownbear
Member

Fixed a few more verified leak candidates here in #32377

@original-brownbear
Member

@tbrooks8 should we close here? master is completely green now even in "paranoid mode" #32354

@Tim-Brooks
Contributor

Waiting on the backport: #32410

@Tim-Brooks
Contributor

Alright. I just backported it.

@original-brownbear
Member

@tbrooks8 thanks :) closing then, I think we're good.

@droberts195
Contributor

It looks like the fix has only been backported as far as 6.x, but this is failing in 6.4 too. Does the fix need backporting to 6.4?

Here is an example of a very similar looking failure today in 6.4:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.4+multijob-unix-compatibility/os=ubuntu&&virtual/6/console

Alternatively, if this isn't a big problem and the fix doesn't need to be backported to 6.4 then please add a comment to the issue to confirm this.

@droberts195 droberts195 reopened this Jul 30, 2018
@original-brownbear
Member

@tbrooks8 I guess your fixes should go to 6.4 too, to fix the tests there, right?

@droberts195
Contributor

Looks like a separate issue has been raised for 6.4: #32494

@original-brownbear
Member

Right, closing here then :)


6 participants