Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

10s latency introduced when no external network connectivity #5295

Closed
necrolyte2 opened this issue Feb 2, 2022 · 16 comments · Fixed by #5297
Closed

10s latency introduced when no external network connectivity #5295

necrolyte2 opened this issue Feb 2, 2022 · 16 comments · Fixed by #5297
Labels
bug Something isn't working

Comments

@necrolyte2
Copy link

necrolyte2 commented Feb 2, 2022

Describe the bug
There appears to be some sort of blocking occurring when there is no connection available for the application(disconnect WIFI...)

Steps to reproduce
REPRO

I tested using the above with versions 1.5.2, 1.6.2, 1.7.0, 1.7.2, 1.9.2, and 1.10.1
1.5.2 and 1.6.2 did not have the issue, but all other versions >= 1.7.0 did

What did you expect to see?
Calling the endpoint should take less than 100ms

What did you see instead?
A 10s latency is introduced

What version are you using?
Any version >= 1.7.0

Environment
Compiler: "AdoptOpenJDK 11

Additional context
The environment where our apps run have scenarios where they lose connectivity to the internet frequently which is where this bug was uncovered

@necrolyte2 necrolyte2 added the bug Something isn't working label Feb 2, 2022
@dgund14
Copy link

dgund14 commented Feb 2, 2022

possibly caused by adding this? maybe dns resolution that is timing out

@trask
Copy link
Member

trask commented Feb 2, 2022

@necrolyte2 can you capture a thread dump during the 10s pause so we can see where the blocking is occurring?

@necrolyte2
Copy link
Author

@trask How do we get a thread dump from java? Sorry, I know very little about the Java ecosystem. The jvm is running within a docker container inside of a VM so not entirely sure how to attach and get the thread dump

We were able to simply do a dns call on the VM running the container where the app is running and were seeing a 5s timeout

Wifi Up

colima:~# time nslookup dnstest
Server:         192.168.5.3
Address:        192.168.5.3:53

** server can't find dnstest: NXDOMAIN

** server can't find dnstest: NXDOMAIN

Command exited with non-zero status 1
real    0m 0.21s
user    0m 0.00s
sys     0m 0.00s

Wifi Down

colima:~# time nslookup dnstest
;; connection timed out; no servers could be reached

Command exited with non-zero status 1
real    0m 5.01s
user    0m 0.00s
sys     0m 0.00s

It really feels like it is being caused by some sort of DNS lookup which I think @dgund14 found some code introduced in 1.7.0.

These InetAddress lookup calls are tricky because they rely on the implementation of the host that is running the code to determine how lookups are done.

From my Mac laptop, I get an immediate response of UnknownHostException, where in the container it does not receive an exception, but instead just waits until it gets a response from dns.

@trask
Copy link
Member

trask commented Feb 2, 2022

hey @necrolyte2, I don't believe the code link above points to a method that performs any DNS resolution. jstack is probably the easiest way to capture a thread dump. if you can capture one during one of the 10s pauses, that should show us exactly where the issue is and what needs to be fixed

@dgund14
Copy link

dgund14 commented Feb 2, 2022

It looks like getHostName() can trigger a reverse lookup which is used on line 33 of that file.

@trask
Copy link
Member

trask commented Feb 2, 2022

@dgund14 thx! indeed that looks problematic, I will push a PR to experiment with removing that

@necrolyte2
Copy link
Author

Here is the output from jstack while the 10s timeout is occurring

2022-02-02 21:49:42
Full thread dump OpenJDK 64-Bit Server VM (11.0.13+8 mixed mode, sharing):

Threads class SMR info:
_java_thread_list=0x00007f1a606e1430, length=19, elements={
0x00007f1ab00f9800, 0x00007f1ab00fb800, 0x00007f1ab0102800, 0x00007f1ab0104800,
0x00007f1ab0106800, 0x00007f1ab0109000, 0x00007f1ab010b000, 0x00007f1ab0118800,
0x00007f1ab07fa000, 0x00007f1ab12cc800, 0x00007f1ab1349800, 0x00007f1ab0017800,
0x00007f1a60016000, 0x00007f1a6c283000, 0x00007f1a68001800, 0x00007f1a5c01d000,
0x00007f1a88001800, 0x00007f1a88003000, 0x00007f1a60018000
}

"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=2.80ms elapsed=728.35s tid=0x00007f1ab00f9800 nid=0x15 waiting on condition  [0x00007f1a83efc000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.ref.Reference.waitForReferencePendingList(java.base@11.0.13/Native Method)
	at java.lang.ref.Reference.processPendingReferences(java.base@11.0.13/Reference.java:241)
	at java.lang.ref.Reference$ReferenceHandler.run(java.base@11.0.13/Reference.java:213)

"Finalizer" #3 daemon prio=8 os_prio=0 cpu=1.21ms elapsed=728.35s tid=0x00007f1ab00fb800 nid=0x16 in Object.wait()  [0x00007f1a83dfb000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e15db798> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(java.base@11.0.13/ReferenceQueue.java:155)
	- waiting to re-lock in wait() <0x00000000e15db798> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(java.base@11.0.13/ReferenceQueue.java:176)
	at java.lang.ref.Finalizer$FinalizerThread.run(java.base@11.0.13/Finalizer.java:170)

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.35ms elapsed=728.31s tid=0x00007f1ab0102800 nid=0x17 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.05ms elapsed=728.31s tid=0x00007f1ab0104800 nid=0x18 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 cpu=7724.52ms elapsed=728.31s tid=0x00007f1ab0106800 nid=0x19 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"C1 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=2810.53ms elapsed=728.31s tid=0x00007f1ab0109000 nid=0x1a waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"Sweeper thread" #8 daemon prio=9 os_prio=0 cpu=5.28ms elapsed=728.30s tid=0x00007f1ab010b000 nid=0x1b runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Common-Cleaner" #9 daemon prio=8 os_prio=0 cpu=6.07ms elapsed=728.14s tid=0x00007f1ab0118800 nid=0x1c in Object.wait()  [0x00007f1a83682000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e1668570> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(java.base@11.0.13/ReferenceQueue.java:155)
	- waiting to re-lock in wait() <0x00000000e1668570> (a java.lang.ref.ReferenceQueue$Lock)
	at jdk.internal.ref.CleanerImpl.run(java.base@11.0.13/CleanerImpl.java:148)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)
	at jdk.internal.misc.InnocuousThread.run(java.base@11.0.13/InnocuousThread.java:134)

"BatchSpanProcessor_WorkerThread-1" #10 daemon prio=5 os_prio=0 cpu=154.43ms elapsed=725.59s tid=0x00007f1ab07fa000 nid=0x1e waiting on condition  [0x00007f1a82907000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
	- parking to wait for  <0x00000000e1b22b50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.13/LockSupport.java:234)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.13/AbstractQueuedSynchronizer.java:2123)
	at java.util.concurrent.ArrayBlockingQueue.poll(java.base@11.0.13/ArrayBlockingQueue.java:432)
	at io.opentelemetry.sdk.trace.export.BatchSpanProcessor$Worker.run(BatchSpanProcessor.java:246)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ratpack-compute-1-1" #14 prio=10 os_prio=0 cpu=287.54ms elapsed=711.66s tid=0x00007f1ab12cc800 nid=0x20 runnable  [0x00007f1a80ef5000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:96)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:276)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:305)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ratpack-compute-1-2" #15 prio=10 os_prio=0 cpu=321.23ms elapsed=711.04s tid=0x00007f1ab1349800 nid=0x21 runnable  [0x00007f1a80df4000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:96)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:276)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:305)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"DestroyJavaVM" #17 prio=5 os_prio=0 cpu=8063.16ms elapsed=710.98s tid=0x00007f1ab0017800 nid=0xe waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"ratpack-compute-1-3" #18 prio=10 os_prio=0 cpu=876.42ms elapsed=683.25s tid=0x00007f1a60016000 nid=0x22 runnable  [0x00007f1a80cef000]
   java.lang.Thread.State: RUNNABLE
	at java.net.Inet4AddressImpl.lookupAllHostAddr(java.base@11.0.13/Native Method)
	at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(java.base@11.0.13/InetAddress.java:929)
	at java.net.InetAddress.getAddressesFromNameService(java.base@11.0.13/InetAddress.java:1519)
	at java.net.InetAddress$NameServiceAddresses.get(java.base@11.0.13/InetAddress.java:848)
	- locked <0x00000000e5458b70> (a java.net.InetAddress$NameServiceAddresses)
	at java.net.InetAddress.getAllByName0(java.base@11.0.13/InetAddress.java:1509)
	at java.net.InetAddress.getAllByName(java.base@11.0.13/InetAddress.java:1368)
	at java.net.InetAddress.getAllByName(java.base@11.0.13/InetAddress.java:1302)
	at java.net.InetAddress.getByName(java.base@11.0.13/InetAddress.java:1252)
	at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:156)
	at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:153)
	at java.security.AccessController.doPrivileged(java.base@11.0.13/Native Method)
	at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:153)
	at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:41)
	at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:61)
	at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:53)
	at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:55)
	at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:31)
	at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:106)
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver.lambda$resolve$0(InstrumentedAddressResolverGroup.java:81)
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver$$Lambda$386/0x00000008404c8040.get(Unknown Source)
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver.instrumentResolve(InstrumentedAddressResolverGroup.java:108)
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver.resolve(InstrumentedAddressResolverGroup.java:81)
	at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:206)
	at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:162)
	at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:116)
	at io.netty.channel.pool.SimpleChannelPool.connectChannel(SimpleChannelPool.java:265)
	at io.netty.channel.pool.SimpleChannelPool.acquireHealthyFromPoolOrNew(SimpleChannelPool.java:177)
	at io.netty.channel.pool.SimpleChannelPool.acquire(SimpleChannelPool.java:162)
	at io.netty.channel.pool.SimpleChannelPool.acquire(SimpleChannelPool.java:157)
	at ratpack.http.client.internal.RequestActionSupport.connect(RequestActionSupport.java:90)
	at ratpack.http.client.internal.DefaultHttpClient.lambda$request$1(DefaultHttpClient.java:305)
	at ratpack.http.client.internal.DefaultHttpClient$$Lambda$362/0x00000008404b7040.connect(Unknown Source)
	at ratpack.exec.internal.DefaultExecution.lambda$null$2(DefaultExecution.java:124)
	at ratpack.exec.internal.DefaultExecution$$Lambda$377/0x00000008404b3440.execute(Unknown Source)
	at io.opentelemetry.javaagent.instrumentation.ratpack.ActionWrapper.execute(ActionWrapper.java:26)
	at ratpack.exec.internal.DefaultExecution$SingleEventExecStream.exec(DefaultExecution.java:499)
	at ratpack.exec.internal.DefaultExecution.exec(DefaultExecution.java:224)
	at ratpack.exec.internal.DefaultExecution.exec(DefaultExecution.java:217)
	at ratpack.exec.internal.DefaultExecution.drain(DefaultExecution.java:187)
	at ratpack.exec.internal.DefaultExecController$1.start(DefaultExecController.java:191)
	at ratpack.handling.internal.DefaultContext.start(DefaultContext.java:101)
	at ratpack.server.internal.NettyHandlerAdapter.newRequest(NettyHandlerAdapter.java:183)
	at ratpack.server.internal.NettyHandlerAdapter.channelRead(NettyHandlerAdapter.java:92)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.server.HttpServerRequestTracingHandler.channelRead(HttpServerRequestTracingHandler.java:56)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:424)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ForkJoinPool.commonPool-worker-3" #19 daemon prio=10 os_prio=0 cpu=1.11ms elapsed=682.72s tid=0x00007f1a6c283000 nid=0x23 waiting on condition  [0x00007f1a809f2000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
	- parking to wait for  <0x00000000e352df00> (a java.util.concurrent.ForkJoinPool)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.13/LockSupport.java:194)
	at java.util.concurrent.ForkJoinPool.runWorker(java.base@11.0.13/ForkJoinPool.java:1628)
	at java.util.concurrent.ForkJoinWorkerThread.run(java.base@11.0.13/ForkJoinWorkerThread.java:183)

"Attach Listener" #23 daemon prio=9 os_prio=0 cpu=1.56ms elapsed=105.39s tid=0x00007f1a68001800 nid=0x68 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"OkHttpSender Dispatcher" #24 daemon prio=5 os_prio=0 cpu=18.49ms elapsed=55.02s tid=0x00007f1a5c01d000 nid=0x83 waiting on condition  [0x00007f1a806f1000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
	- parking to wait for  <0x00000000e183d088> (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.13/LockSupport.java:234)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base@11.0.13/SynchronousQueue.java:462)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base@11.0.13/SynchronousQueue.java:361)
	at java.util.concurrent.SynchronousQueue.poll(java.base@11.0.13/SynchronousQueue.java:937)
	at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.13/ThreadPoolExecutor.java:1053)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.13/ThreadPoolExecutor.java:1114)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.13/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"OkHttp TaskRunner" #25 daemon prio=5 os_prio=0 cpu=2.30ms elapsed=55.02s tid=0x00007f1a88001800 nid=0x84 in Object.wait()  [0x00007f1a802ec000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e1916a70> (a okhttp3.internal.concurrent.TaskRunner)
	at java.lang.Object.wait(java.base@11.0.13/Object.java:462)
	at okhttp3.internal.concurrent.TaskRunner$RealBackend.coordinatorWait(TaskRunner.kt:294)
	at okhttp3.internal.concurrent.TaskRunner.awaitTaskToRun(TaskRunner.kt:218)
	at okhttp3.internal.concurrent.TaskRunner$runnable$1.run(TaskRunner.kt:59)
	- waiting to re-lock in wait() <0x00000000e1916a70> (a okhttp3.internal.concurrent.TaskRunner)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.13/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.13/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"Okio Watchdog" #26 daemon prio=5 os_prio=0 cpu=1.25ms elapsed=55.01s tid=0x00007f1a88003000 nid=0x85 in Object.wait()  [0x00007f1a801eb000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e5e6dd10> (a java.lang.Class for okio.AsyncTimeout)
	at java.lang.Object.wait(java.base@11.0.13/Object.java:462)
	at okio.AsyncTimeout$Companion.awaitTimeout$okio(AsyncTimeout.kt:316)
	at okio.AsyncTimeout$Watchdog.run(AsyncTimeout.kt:187)
	- waiting to re-lock in wait() <0x00000000e5e6dd10> (a java.lang.Class for okio.AsyncTimeout)

"ratpack-compute-1-4" #27 prio=10 os_prio=0 cpu=69.36ms elapsed=54.33s tid=0x00007f1a60018000 nid=0x86 runnable  [0x00007f1a54ffb000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:96)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:276)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:305)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"VM Thread" os_prio=0 cpu=484.24ms elapsed=728.38s tid=0x00007f1ab00f7000 nid=0x14 runnable  

"GC Thread#0" os_prio=0 cpu=231.79ms elapsed=728.43s tid=0x00007f1ab002f000 nid=0xf runnable  

"GC Thread#1" os_prio=0 cpu=320.49ms elapsed=727.42s tid=0x00007f1a78001000 nid=0x1d runnable  

"G1 Main Marker" os_prio=0 cpu=9.38ms elapsed=728.43s tid=0x00007f1ab003f000 nid=0x10 runnable  

"G1 Conc#0" os_prio=0 cpu=67.08ms elapsed=728.43s tid=0x00007f1ab0041000 nid=0x11 runnable  

"G1 Refine#0" os_prio=0 cpu=9.24ms elapsed=728.43s tid=0x00007f1ab0064000 nid=0x12 runnable  

"G1 Young RemSet Sampling" os_prio=0 cpu=402.89ms elapsed=728.43s tid=0x00007f1ab0066000 nid=0x13 runnable  
"VM Periodic Task Thread" os_prio=0 cpu=3753.00ms elapsed=719.30s tid=0x00007f1ab0b79000 nid=0x1f waiting on condition  

JNI global refs: 26, weak refs: 0

@trask
Copy link
Member

trask commented Feb 2, 2022

@necrolyte2 thx!

unfortunately I'm a little confused by where the stack trace is pointing to:

at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:106)
at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver.lambda$resolve$0(InstrumentedAddressResolverGroup.java:81)
at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver$$Lambda$386/0x00000008404c8040.get(Unknown Source)
at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver.instrumentResolve(InstrumentedAddressResolverGroup.java:108)
at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup$InstrumentedResolver.resolve(InstrumentedAddressResolverGroup.java:81)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:206)

because InstrumentedResolver is just capturing telemetry about an existing netty call to resolve, so I would expect this same netty call to resolve whether or not the javaagent was attached

just want to confirm is this result consistent? (the stack trace during the pause points to this location, and you don't see the slowness when the javaagent is not attached?)

@necrolyte2
Copy link
Author

It doesn't occur with any version below 1.7.0 of the agent

@trask
Copy link
Member

trask commented Feb 2, 2022

Can you capture another thread dump to see if you happen to get any other locations? The InstrumentedResolver was only introduced in 1.9.0 (#4627), which makes me think that's not the culprit

@necrolyte2
Copy link
Author

Previous dump was while running the agent with the changes from your draft PR.
This one is from running the agent from the main branch

2022-02-02 23:29:28
Full thread dump OpenJDK 64-Bit Server VM (11.0.13+8 mixed mode, sharing):

Threads class SMR info:
_java_thread_list=0x00007fd8e0003010, length=19, elements={
0x00007fd9280f9800, 0x00007fd9280fb800, 0x00007fd928102800, 0x00007fd928104800,
0x00007fd928106800, 0x00007fd928109000, 0x00007fd92810b000, 0x00007fd928119000,
0x00007fd9287ce000, 0x00007fd929519000, 0x00007fd92959d000, 0x00007fd928017800,
0x00007fd8d812c000, 0x00007fd8e404f800, 0x00007fd8ec013000, 0x00007fd8d80b0800,
0x00007fd8d4010000, 0x00007fd900001800, 0x00007fd8e0001800
}

"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=2.47ms elapsed=328.75s tid=0x00007fd9280f9800 nid=0x13 waiting on condition  [0x00007fd8fbe7b000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.ref.Reference.waitForReferencePendingList(java.base@11.0.13/Native Method)
	at java.lang.ref.Reference.processPendingReferences(java.base@11.0.13/Reference.java:241)
	at java.lang.ref.Reference$ReferenceHandler.run(java.base@11.0.13/Reference.java:213)

"Finalizer" #3 daemon prio=8 os_prio=0 cpu=1.33ms elapsed=328.75s tid=0x00007fd9280fb800 nid=0x14 in Object.wait()  [0x00007fd8fbd7a000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e15edc48> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(java.base@11.0.13/ReferenceQueue.java:155)
	- waiting to re-lock in wait() <0x00000000e15edc48> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(java.base@11.0.13/ReferenceQueue.java:176)
	at java.lang.ref.Finalizer$FinalizerThread.run(java.base@11.0.13/Finalizer.java:170)

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=12.51ms elapsed=328.73s tid=0x00007fd928102800 nid=0x15 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.21ms elapsed=328.73s tid=0x00007fd928104800 nid=0x16 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 cpu=5946.65ms elapsed=328.73s tid=0x00007fd928106800 nid=0x17 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"C1 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=1989.93ms elapsed=328.73s tid=0x00007fd928109000 nid=0x18 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"Sweeper thread" #8 daemon prio=9 os_prio=0 cpu=4.57ms elapsed=328.73s tid=0x00007fd92810b000 nid=0x19 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Common-Cleaner" #9 daemon prio=8 os_prio=0 cpu=4.18ms elapsed=328.65s tid=0x00007fd928119000 nid=0x1a in Object.wait()  [0x00007fd8fb601000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e163b9e8> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(java.base@11.0.13/ReferenceQueue.java:155)
	- waiting to re-lock in wait() <0x00000000e163b9e8> (a java.lang.ref.ReferenceQueue$Lock)
	at jdk.internal.ref.CleanerImpl.run(java.base@11.0.13/CleanerImpl.java:148)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)
	at jdk.internal.misc.InnocuousThread.run(java.base@11.0.13/InnocuousThread.java:134)

"BatchSpanProcessor_WorkerThread-1" #10 daemon prio=5 os_prio=0 cpu=98.17ms elapsed=327.60s tid=0x00007fd9287ce000 nid=0x1c waiting on condition  [0x00007fd8fa886000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
	- parking to wait for  <0x00000000e1a687f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.13/LockSupport.java:234)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.13/AbstractQueuedSynchronizer.java:2123)
	at java.util.concurrent.ArrayBlockingQueue.poll(java.base@11.0.13/ArrayBlockingQueue.java:432)
	at io.opentelemetry.sdk.trace.export.BatchSpanProcessor$Worker.run(BatchSpanProcessor.java:246)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ratpack-compute-1-1" #14 prio=10 os_prio=0 cpu=181.38ms elapsed=322.34s tid=0x00007fd929519000 nid=0x1e runnable  [0x00007fd8f8e74000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:96)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:276)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:305)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ratpack-compute-1-2" #15 prio=10 os_prio=0 cpu=234.11ms elapsed=321.81s tid=0x00007fd92959d000 nid=0x1f runnable  [0x00007fd8f8d73000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:96)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:276)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:305)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"DestroyJavaVM" #17 prio=5 os_prio=0 cpu=5838.59ms elapsed=321.76s tid=0x00007fd928017800 nid=0xc waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"ratpack-compute-1-3" #18 prio=10 os_prio=0 cpu=878.80ms elapsed=318.84s tid=0x00007fd8d812c000 nid=0x20 runnable  [0x00007fd8f8c71000]
   java.lang.Thread.State: RUNNABLE
	at java.net.Inet4AddressImpl.getHostByAddr(java.base@11.0.13/Native Method)
	at java.net.InetAddress$PlatformNameService.getHostByAddr(java.base@11.0.13/InetAddress.java:935)
	at java.net.InetAddress.getHostFromNameService(java.base@11.0.13/InetAddress.java:659)
	at java.net.InetAddress.getHostName(java.base@11.0.13/InetAddress.java:602)
	at java.net.InetAddress.getHostName(java.base@11.0.13/InetAddress.java:574)
	at io.opentelemetry.javaagent.shaded.instrumentation.api.instrumenter.net.InetSocketAddressNetServerAttributesGetter.peerName(InetSocketAddressNetServerAttributesGetter.java:33)
	at io.opentelemetry.javaagent.shaded.instrumentation.api.instrumenter.net.NetServerAttributesExtractor.onStart(NetServerAttributesExtractor.java:38)
	at io.opentelemetry.javaagent.shaded.instrumentation.api.instrumenter.Instrumenter.start(Instrumenter.java:174)
	at io.opentelemetry.javaagent.shaded.instrumentation.api.instrumenter.ServerInstrumenter.start(ServerInstrumenter.java:30)
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.server.HttpServerRequestTracingHandler.channelRead(HttpServerRequestTracingHandler.java:51)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:424)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ForkJoinPool.commonPool-worker-3" #19 daemon prio=10 os_prio=0 cpu=1.03ms elapsed=308.34s tid=0x00007fd8e404f800 nid=0x21 waiting on condition  [0x00007fd8f8971000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
	- parking to wait for  <0x00000000e20b9c40> (a java.util.concurrent.ForkJoinPool)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.13/LockSupport.java:194)
	at java.util.concurrent.ForkJoinPool.runWorker(java.base@11.0.13/ForkJoinPool.java:1628)
	at java.util.concurrent.ForkJoinWorkerThread.run(java.base@11.0.13/ForkJoinWorkerThread.java:183)

"OkHttp TaskRunner" #21 daemon prio=5 os_prio=0 cpu=1.61ms elapsed=307.49s tid=0x00007fd8ec013000 nid=0x23 in Object.wait()  [0x00007fd8f836c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e1926c18> (a okhttp3.internal.concurrent.TaskRunner)
	at java.lang.Object.wait(java.base@11.0.13/Object.java:462)
	at okhttp3.internal.concurrent.TaskRunner$RealBackend.coordinatorWait(TaskRunner.kt:294)
	at okhttp3.internal.concurrent.TaskRunner.awaitTaskToRun(TaskRunner.kt:218)
	at okhttp3.internal.concurrent.TaskRunner$runnable$1.run(TaskRunner.kt:59)
	- waiting to re-lock in wait() <0x00000000e1926c18> (a okhttp3.internal.concurrent.TaskRunner)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.13/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.13/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"ratpack-compute-1-4" #23 prio=10 os_prio=0 cpu=46.96ms elapsed=36.98s tid=0x00007fd8d80b0800 nid=0x77 runnable  [0x00007fd8f826b000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:96)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:276)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:305)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory.lambda$newThread$0(DefaultExecController.java:137)
	at ratpack.exec.internal.DefaultExecController$ExecControllerBindingThreadFactory$$Lambda$313/0x000000084040a440.run(Unknown Source)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"OkHttpSender Dispatcher" #24 daemon prio=5 os_prio=0 cpu=5.87ms elapsed=32.27s tid=0x00007fd8d4010000 nid=0x79 waiting on condition  [0x00007fd8f8670000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
	- parking to wait for  <0x00000000e19ca410> (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.13/LockSupport.java:234)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base@11.0.13/SynchronousQueue.java:462)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base@11.0.13/SynchronousQueue.java:361)
	at java.util.concurrent.SynchronousQueue.poll(java.base@11.0.13/SynchronousQueue.java:937)
	at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.13/ThreadPoolExecutor.java:1053)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.13/ThreadPoolExecutor.java:1114)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.13/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)

"Okio Watchdog" #25 daemon prio=5 os_prio=0 cpu=1.61ms elapsed=32.26s tid=0x00007fd900001800 nid=0x7a in Object.wait()  [0x00007fd8f816a000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.13/Native Method)
	- waiting on <0x00000000e78115c0> (a java.lang.Class for okio.AsyncTimeout)
	at okio.AsyncTimeout$Companion.awaitTimeout$okio(AsyncTimeout.kt:300)
	at okio.AsyncTimeout$Watchdog.run(AsyncTimeout.kt:187)
	- waiting to re-lock in wait() <0x00000000e78115c0> (a java.lang.Class for okio.AsyncTimeout)

"Attach Listener" #26 daemon prio=9 os_prio=0 cpu=1.58ms elapsed=15.67s tid=0x00007fd8e0001800 nid=0x8e waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"VM Thread" os_prio=0 cpu=504.15ms elapsed=328.76s tid=0x00007fd9280f7000 nid=0x12 runnable  

"GC Thread#0" os_prio=0 cpu=242.14ms elapsed=328.79s tid=0x00007fd92802f000 nid=0xd runnable  

"GC Thread#1" os_prio=0 cpu=239.79ms elapsed=328.44s tid=0x00007fd8f0001000 nid=0x1b runnable  

"G1 Main Marker" os_prio=0 cpu=4.68ms elapsed=328.79s tid=0x00007fd92803f000 nid=0xe runnable  

"G1 Conc#0" os_prio=0 cpu=97.93ms elapsed=328.79s tid=0x00007fd928041000 nid=0xf runnable  

"G1 Refine#0" os_prio=0 cpu=4.64ms elapsed=328.78s tid=0x00007fd928064000 nid=0x10 runnable  

"G1 Young RemSet Sampling" os_prio=0 cpu=268.98ms elapsed=328.78s tid=0x00007fd928066000 nid=0x11 runnable  
"VM Periodic Task Thread" os_prio=0 cpu=2220.94ms elapsed=325.08s tid=0x00007fd928d9c800 nid=0x1d waiting on condition  

JNI global refs: 26, weak refs: 0

@trask
Copy link
Member

trask commented Feb 2, 2022

oh, this is good, so maybe the PR fixes the issue? or is there still a difference with/without the patched agent?

@necrolyte2
Copy link
Author

Using these changes(which I was lazy so just made them all return null) seems to fix the issue. I'll quick use your code changes to verify that PR as well

diff --git a/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetClientAttributesGetter.java b/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetClientAttributesGetter.java
index 7975aa8630..000ab68ee9 100644
--- a/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetClientAttributesGetter.java
+++ b/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetClientAttributesGetter.java
@@ -5,7 +5,6 @@
 
 package io.opentelemetry.instrumentation.api.instrumenter.net;
 
-import java.net.InetAddress;
 import java.net.InetSocketAddress;
 import javax.annotation.Nullable;
 
@@ -25,37 +24,18 @@ public abstract class InetSocketAddressNetClientAttributesGetter<REQUEST, RESPON
   @Override
   @Nullable
   public final String peerName(REQUEST request, @Nullable RESPONSE response) {
-    InetSocketAddress address = getAddress(request, response);
-    if (address == null) {
-      return null;
-    }
-    if (address.getAddress() != null) {
-      return address.getAddress().getHostName();
-    }
-    return address.getHostString();
+    return null;
   }
 
   @Override
   @Nullable
   public final Integer peerPort(REQUEST request, @Nullable RESPONSE response) {
-    InetSocketAddress address = getAddress(request, response);
-    if (address == null) {
-      return null;
-    }
-    return address.getPort();
+    return null;
   }
 
   @Override
   @Nullable
   public final String peerIp(REQUEST request, @Nullable RESPONSE response) {
-    InetSocketAddress address = getAddress(request, response);
-    if (address == null) {
-      return null;
-    }
-    InetAddress remoteAddress = address.getAddress();
-    if (remoteAddress != null) {
-      return remoteAddress.getHostAddress();
-    }
     return null;
   }
 }
diff --git a/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetServerAttributesGetter.java b/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetServerAttributesGetter.java
index 92a286f37b..e7e13507cb 100644
--- a/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetServerAttributesGetter.java
+++ b/instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/instrumenter/net/InetSocketAddressNetServerAttributesGetter.java
@@ -5,7 +5,6 @@
 
 package io.opentelemetry.instrumentation.api.instrumenter.net;
 
-import java.net.InetAddress;
 import java.net.InetSocketAddress;
 import javax.annotation.Nullable;
 
@@ -25,37 +24,18 @@ public abstract class InetSocketAddressNetServerAttributesGetter<REQUEST>
   @Override
   @Nullable
   public final String peerName(REQUEST request) {
-    InetSocketAddress address = getAddress(request);
-    if (address == null) {
-      return null;
-    }
-    if (address.getAddress() != null) {
-      return address.getAddress().getHostName();
-    }
-    return address.getHostString();
+    return null;
   }
 
   @Override
   @Nullable
   public final Integer peerPort(REQUEST request) {
-    InetSocketAddress address = getAddress(request);
-    if (address == null) {
-      return null;
-    }
-    return address.getPort();
+    return null;
   }
 
   @Override
   @Nullable
   public final String peerIp(REQUEST request) {
-    InetSocketAddress address = getAddress(request);
-    if (address == null) {
-      return null;
-    }
-    InetAddress remoteAddress = address.getAddress();
-    if (remoteAddress != null) {
-      return remoteAddress.getHostAddress();
-    }
     return null;
   }
 }

Both of these are performed in the scenario of not having connectivity to the upstream dns server, but with the code in your draft PR

$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=doesnotexisthost

11.261637


$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=httpapp
did local url for httpapp<br/>{"status": "up"}
0.015626

Same test, but with code from main branch

$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=doesnotexisthost

20.114734


$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=httpapp
did local url for httpapp<br/>{"status": "up"}
10.053395

@necrolyte2
Copy link
Author

necrolyte2 commented Feb 2, 2022

Looks good!
This is coming from the same code change as in #5297

$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=doesnotexisthost

10.718004


$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=httpapp
did local url for httpapp<br/>{"status": "up"}
0.248555

@trask
Copy link
Member

trask commented Feb 3, 2022

awesome, thx for the help @necrolyte2 @dgund14!!

@necrolyte2
Copy link
Author

Just as a follow-up, I ran the same tests with the 1.5.2 agent just to compare which produced the same results as the PR which is good to see(as further validation).

$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=httpapp
did local url for httpapp<br/>{"status": "up"}
0.015770
$ curl -w '\n%{time_total}\n' http://localhost:5050/api/v1/localthing?hostname=httpapp2

10.032659

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants