[BUG] metric timer will trigger integer overflow #33

Closed
FMX opened this issue Jan 6, 2022 · 1 comment
Labels
bug Something isn't working

Comments

FMX (Contributor) commented Jan 6, 2022

What is the bug?

The RssWorker metric system has a bug that causes an integer overflow once the worker has been running long enough.

How to reproduce the bug?

Steps to reproduce the bug.

Could you share logs or screenshots?

If applicable, add logs/screenshots to help explain your problem.

22/01/06 10:24:53,820 ERROR [push-server-6-10] TransportRequestHandler: Error while invoking RpcHandler#receive() on PushData PushData{requestId=702699486, mode=1, shuffleKey=application_1640695558204_38730_1-2, partitionUniqueId=151-0, body size=15066}
java.lang.ArrayIndexOutOfBoundsException: -4095
at com.aliyun.emr.rss.server.common.metrics.ResettableSlidingWindowReservoir.update(ResettableSlidingWindowReservoir.scala:35)
at com.codahale.metrics.Histogram.update(Histogram.java:39)
at com.codahale.metrics.Timer.update(Timer.java:164)
at com.codahale.metrics.Timer.update(Timer.java:86)
at com.aliyun.emr.rss.server.common.metrics.source.AbstractSource.doStopTimer(AbstractSource.scala:148)
at com.aliyun.emr.rss.server.common.metrics.source.AbstractSource.stopTimer(AbstractSource.scala:131)
at com.aliyun.emr.rss.service.deploy.worker.Worker$$anon$7.onSuccess(Worker.scala:587)
at com.aliyun.emr.rss.service.deploy.worker.Worker.handlePushData(Worker.scala:660)
at com.aliyun.emr.rss.service.deploy.worker.PushDataRpcHandler.receivePushData(PushDataRpcHandler.java:56)

22/01/06 10:25:31,123 WARN [nioEventLoopGroup-11-1] DefaultChannelPipeline: An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.NegativeArraySizeException
at com.aliyun.emr.rss.server.common.metrics.ResettableSlidingWindowReservoir.getSnapshot(ResettableSlidingWindowReservoir.scala:40)
at com.codahale.metrics.Histogram.getSnapshot(Histogram.java:54)
at com.codahale.metrics.Timer.getSnapshot(Timer.java:159)
at com.aliyun.emr.rss.server.common.metrics.source.AbstractSource.recordTimer(AbstractSource.scala:251)
at com.aliyun.emr.rss.server.common.metrics.source.AbstractSource$$anonfun$getMetrics$4.apply(AbstractSource.scala:282)
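Both traces are consistent with a sliding-window reservoir whose update counter is a 32-bit `int` that wraps to a negative value. The sketch below is not the project's code: the class names, the window size of 4096 (chosen only because it matches the `-4095` index in the log), and the `long`-counter variant are all assumptions made for illustration. It is single-threaded for brevity, whereas a real reservoir would need atomic counters.

```java
public class ReservoirOverflowSketch {
    // Assumed window size; the real value is not stated in this issue.
    static final int WINDOW = 4096;

    /** Hypothetical reservoir that tracks the update count in an int. */
    static class IntCounterReservoir {
        private final long[] measurements = new long[WINDOW];
        private int count;

        IntCounterReservoir(int startCount) {
            this.count = startCount; // lets the demo start right before the overflow
        }

        void update(long value) {
            // After count wraps past Integer.MAX_VALUE it is negative, and in Java
            // a negative dividend yields a non-positive remainder.
            measurements[count++ % measurements.length] = value;
        }

        long[] snapshot() {
            // A negative count makes the requested array length negative.
            long[] values = new long[Math.min(count, measurements.length)];
            System.arraycopy(measurements, 0, values, 0, values.length);
            return values;
        }
    }

    /** Same reservoir with a long counter: one possible remedy, not the confirmed fix. */
    static class LongCounterReservoir {
        private final long[] measurements = new long[WINDOW];
        private long count;

        LongCounterReservoir(long startCount) {
            this.count = startCount;
        }

        void update(long value) {
            measurements[(int) (count++ % measurements.length)] = value;
        }

        long[] snapshot() {
            long[] values = new long[(int) Math.min(count, measurements.length)];
            System.arraycopy(measurements, 0, values, 0, values.length);
            return values;
        }
    }

    public static void main(String[] args) {
        IntCounterReservoir buggy = new IntCounterReservoir(Integer.MAX_VALUE);
        buggy.update(1L); // index 4095, then count wraps to Integer.MIN_VALUE
        buggy.update(2L); // index 0, because Integer.MIN_VALUE % 4096 == 0
        try {
            buggy.update(3L); // index -4095
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("update failed: " + e);
        }
        try {
            buggy.snapshot();
        } catch (NegativeArraySizeException e) {
            System.out.println("snapshot failed: " + e);
        }

        LongCounterReservoir fixed = new LongCounterReservoir(Integer.MAX_VALUE);
        for (long i = 0; i < 3; i++) {
            fixed.update(i);
        }
        System.out.println("long counter snapshot length: " + fixed.snapshot().length);
    }
}
```

Starting the counter at `Integer.MAX_VALUE` stands in for "runs long enough": a timer updated thousands of times per second would reach that count within days or weeks.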

/cc @waitinfuture

/assign @FMX

FMX added the bug label Jan 6, 2022

FMX (Contributor, Author) commented Jan 7, 2022

I think it's fixed.
