-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUM-2355 fix: Network Instrumentation #1588
Conversation
Datadog ReportBranch report: ❄️ New Flaky Tests (1)
|
8d1c890
to
894bf48
Compare
f99a874
to
86001c6
Compare
set { | ||
pthread_rwlock_wrlock(&rwlock) | ||
value = newValue | ||
pthread_rwlock_unlock(&rwlock) | ||
} | ||
set { mutate { $0 = newValue } } | ||
} | ||
|
||
/// Provides a non-escaping closure for mutation. | ||
/// The lock will be acquired once for writing before invoking the closure. | ||
/// | ||
/// - Parameter closure: The closure with the mutable value. | ||
public func mutate(_ closure: (inout Value) -> Void) { | ||
public func mutate(_ closure: (inout Value) throws -> Void) rethrows { | ||
pthread_rwlock_wrlock(&rwlock) | ||
closure(&value) | ||
pthread_rwlock_unlock(&rwlock) | ||
defer { pthread_rwlock_unlock(&rwlock) } | ||
try closure(&value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related, but better :)
/// Datadog delegate object. | ||
/// | ||
/// The class implementing `DDURLSessionDelegateProviding` must ensure that following method calls are forwarded to `ddURLSessionDelegate`: | ||
/// - `func urlSession(_:task:didFinishCollecting:)` | ||
/// - `func urlSession(_:task:didCompleteWithError:)` | ||
/// - `func urlSession(_:dataTask:didReceive:)` | ||
var ddURLSessionDelegate: DatadogURLSessionDelegate { get } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to re-introduce this constrain to restore legacy behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which particular behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one:
Legacy
DatadogURLSessionDelegate
Changes in the legacy delegate are preventing from initialising multiple instances with different first-party hosts:
TheDatadogURLSessionDelegate.init
is doing a staticURLSessionInstrumentation.enable
which can be done only once for a given delegate type.
The DatadogURLSessionDelegate
class cannot be instrumented like a custom delegate type since it defines first-party hosts per instance. It will now define it's own NetworkInstrumentationSwizzler
instance to intercept task events, therefor the deprecated __URLSessionDelegateProviding
still needs to forward data to the ddURLSessionDelegate
.
@@ -28,7 +35,7 @@ open class DatadogURLSessionDelegate: NSObject, URLSessionDataDelegate { | |||
return URLSessionInterceptor.shared(in: core) | |||
} | |||
|
|||
/* private */ public let firstPartyHosts: FirstPartyHosts | |||
let swizzler = URLSessionSwizzler() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instances of DatadogURLSessionDelegate
hold their own swizzling instance to instrument the task on resume.
e9228cc
to
b2bd78a
Compare
@@ -38,6 +38,9 @@ internal final class NetworkInstrumentationFeature: DatadogFeature { | |||
@ReadWriteLock | |||
internal var handlers: [DatadogURLSessionHandler] = [] | |||
|
|||
@ReadWriteLock | |||
internal var swizzlers: [ObjectIdentifier: URLSessionSwizzler] = [:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat, never knew such type existed.
} | ||
} | ||
/// Swizzles `URLSession*` methods. | ||
internal final class URLSessionSwizzler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/Blocking
Combining all the swizzling code inside one type was never the intent but I understand the reasoning behind. However, this can be done with the existing approach by making the each swizzler type as intance type.
This makes the testing each swizzling class easy, better code navigation. You can compose those types in a NetworkInstrumentationSwizzler
type.
something like
class NetworkInstrumentationSwizzler {
let urlSessionSwizzler: URLSessionSwizzler
let urlSessionTaskDelegateSwizzler: URLSessionTaskDelegateSwizzler
....
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As suggested, I have split the swizzler class, see the last commit. But TBH, I don't see the benefit 🤔 It just adds another layer and doesn't help much with code navigation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Massive efforts 👌 - I noticed only one blocking change, other look ok. It is really hard to see it well through (the domain is complex, code looks fine), hence I count on the quality of tests. Glad we add more integration scenarios.
Last question, about:
Task Interception
After first configuration, our new Network Instrumentation logic introduced in #1394 is intercepting all URLSessionTask regardless of their delegate.
Is there a test we add to verify that not instrumented requests are ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blocking/ The new logic introduces new global swizzlings that need to be covered in integrity checks. Here we remove old checks instead of updating them ⛔. Balancing following calls must be enforced by DatadogTestsObserver
, otherwise we will run into flakiness ❄️:
urlSessionSwizzler.swizzle()
urlSessionTaskSwizzler.swizzle()
urlSessionTaskDelegateSwizzler.swizzle()
urlSessionSwizzler.unswizzle()
urlSessionTaskSwizzler.unswizzle()
urlSessionTaskDelegateSwizzler.unswizzle()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, it was actually covered twice. Swizzling integrity is already checked by this which will fail if any swizzling is left, printing the method signatures 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, can you add some simple test case?
- swizzling once, Swizzling.methods.count == 1
- same method again, Swizzling.methods.count == 2
- another method, Swizzling.methods.count == 3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this to MethodSwizzlerTests
👍
@@ -38,6 +38,9 @@ internal final class NetworkInstrumentationFeature: DatadogFeature { | |||
@ReadWriteLock | |||
internal var handlers: [DatadogURLSessionHandler] = [] | |||
|
|||
@ReadWriteLock | |||
internal var swizzlers: [ObjectIdentifier: NetworkInstrumentationSwizzler] = [:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion/ We should either mark it private
or make useful in troubleshooting. The problem with its usability is that ObjectIdentifier
completely erases the identity of underlying (identified) meta type. We can preserve it with properly abstracting the ObjectIdentifier
:
internal struct SwizzledDelegateClass: Hashable, Equatable {
internal let delegateClass: URLSessionDataDelegate.Type
private let identifier: ObjectIdentifier
init(delegateClass: URLSessionDataDelegate.Type) {
self.delegateClass = delegateClass
self.identifier = ObjectIdentifier(delegateClass)
}
func hash(into hasher: inout Hasher) { hasher.combine(identifier) }
static func == (lhs: SwizzledDelegateClass, rhs: SwizzledDelegateClass) -> Bool { lhs.identifier == rhs.identifier }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm making this private and adding debugDescription
to MethodSwizzler
that should gives us enough information to troubleshoot.
if let currentRequest = task.currentRequest { | ||
let request = self.intercept(request: currentRequest, additionalFirstPartyHosts: configuredFirstPartyHosts) | ||
task.dd.override(currentRequest: request) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question/ This condition is quite puzzling. In what circumstances we expect task
to not have currentRequest
? Let's explain it in the code comment if appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question, I don't know what could nillify the originalRequest
or currentRequest
🤔 the doc doesn't say..
DatadogInternal/Sources/NetworkInstrumentation/URLSession/DatadogURLSessionDelegate.swift
Outdated
Show resolved
Hide resolved
DatadogInternal/Sources/NetworkInstrumentation/URLSession/DatadogURLSessionDelegate.swift
Outdated
Show resolved
Hide resolved
/// Use a custom delegate. | ||
/// and define `firstPartyHosts` in feature configuration. | ||
case customWithFeatureFirstPartyHosts | ||
/// Use a custom delegate. | ||
/// and define `firstPartyHosts` with delegate's configuration. | ||
case customWithAdditionalFirstyPartyHosts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🏅
d276853
to
3e92e18
Compare
3e92e18
to
4bfa7c4
Compare
Datadog ReportBranch report: ✅ |
4bfa7c4
to
738d599
Compare
It is ready for another round of review :) @ncreated, to your question:
Yes, the |
@@ -132,34 +132,6 @@ internal class DatadogTestsObserver: NSObject, XCTestObservation { | |||
|
|||
If all above conditions are met, this failure might indicate a memory leak in the implementation. | |||
""" | |||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing these checks seems like a back step - in case of a memory leak we will not able to catch any lingering swizzlings.
A check something like if there is an instance of swizzler in memory or not will be helpful.
something like
func checkInstancesOfType(_ type: AnyClass) -> Bool {
let typeString = String(describing: type)
for bundle in Bundle.allBundles {
if let className = bundle.classNamed(typeString) {
// Check if an instance of the class exists
if class_getInstanceSize(className) > 0 {
return true
}
}
}
return false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is already there, see my comment. These checks were redundant, we are already asserting that no swizzling is left here. Failure will print the method signature, so we will know which sizzling is at fault.
guard let self = self, task.dd.delegate?.isKind(of: configuration.delegateClass) == true else { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the dd.delegate lifecycle is managed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a computed var
and not retained. see its implementation here
/// Datadog delegate object. | ||
/// | ||
/// The class implementing `DDURLSessionDelegateProviding` must ensure that following method calls are forwarded to `ddURLSessionDelegate`: | ||
/// - `func urlSession(_:task:didFinishCollecting:)` | ||
/// - `func urlSession(_:task:didCompleteWithError:)` | ||
/// - `func urlSession(_:dataTask:didReceive:)` | ||
var ddURLSessionDelegate: DatadogURLSessionDelegate { get } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which particular behavior?
DatadogInternal/Sources/NetworkInstrumentation/URLSession/URLSessionDataDelegateSwizzler.swift
Show resolved
Hide resolved
DatadogInternal/Sources/NetworkInstrumentation/URLSession/URLSessionTaskDelegateSwizzler.swift
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍. Seeing a lot of changes that involve OS-version-specific swizzlings, can we verify if unit tests pass all major iOS versions from 12 to 16 🙏? Currently we can't rely on nightly tests for this.
I have ran the test-suites on the following versions:
It fails on iOS 13, from what I see it's due to the |
Problem Statement
Task Interception
After first configuration, our new Network Instrumentation logic introduced in #1394 is intercepting all
URLSessionTask
regardless of their delegate.Starting an interception can result in Start Resource in RUM. When the custom delegate is not in play, the interception will never finish and so the RUM Resource will never stop. This means that the resource event won't be transmitted, but it will keep the Resource scopes active and therefore the view scopes open.
Task Completion
Tasks created with completion handler on pre-iOS 15 are never completed: We apply
#available(iOS 15, tvOS 15, *)
condition for completing the interception when receiving metrics, but we have no condition for pre-iOS 15.Swizzling Unbinding
Swizzlings on
URLSession*
are not removed when deallocating theNetworkInstrumentationFeature
, this is preventing from re-enabling the instrumentation on a new core instance. It is blocking the 'stop core instance' feature on iOS.Legacy
DatadogURLSessionDelegate
Changes in the legacy delegate are preventing from initialising multiple instances with different first-party hosts: The
DatadogURLSessionDelegate.init
is doing a staticURLSessionInstrumentation.enable
which can be done only once for a given delegate type.Solutions
Instance-based Swizzler
In #1564, we have introduced reliable unswizzling logic which allows to create multiple instances of swizzler for the same method. I'm leveraging this to associate swizzlers to an instance of
NetworkInstrumentationFeature
, we can then unswizzle safely at deallocation.The instance-based swizzling also allow to fix the legacy
DatadogURLSessionDelegate
by keeping a swizzler per delegate instance.Check for Task's delegate type
We now check for the delegate type before starting an interception,either when using the new
URLSessionInstrumentation
or the legacyDatadogURLSessionDelegate
.Task Completion
The
URLSessionSwizzler
now allow to swizzle theURLSession.dataTask(with:completionHandler:)
methods on bothURL
andURLRequest
and swizzling is applied for all OS versions.Review
I suggest to review the following files first:
Review checklist
Custom CI job configuration (optional)
tools/