Darwin DNS-SD implementation stops browsing services upon finding a wrong service. #19194
Comments
There is no timeout on browse right now, fwiw... I guess since we switched to browsing for the _CM subtype there's a problem if multiple commissionable things are all advertising, because the discriminator filter is applied too late? Is that why this is a problem? How often does this behavior manifest as an issue? The proposed solution on its own is not enough. The caller of Browse can only handle a single callback, so we would need to either change that or always keep going until the timeout and deliver all the results we found, which would lead to pretty undesirable behavior. And the caller is the generic platform mdns code, so changing that would involve changing all the other platform implementations too...
Yes, that is the problem, and it happened very frequently with the test team. It may be unique to test setups, where some other commissionable node might be running. However, I personally think it could also happen frequently in the real world if the commissioning window is as long as a minute or so.
This reverts commit a689c84. Now that we always register a new instance name when opening a new commissioning window, the problem PR project-chip#17356 was trying to work around no longer applies. On the other hand, the new setup introduced a new problem: if there are multiple things advertising the _CM subtype (i.e. multiple things in commissioning mode at once), then we might find the first several (however many fit in a DNS packet) and then platform mdns will stop delivering results, per project-chip#19194 (which is about Darwin, but other platforms have similar issues). If we browse by discriminator instead, the chance of multiple results is much lower, and hence the chance of finding the thing we care about is much higher.
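The difference between browsing by the _CM subtype and browsing by discriminator can be sketched as the difference between two DNS-SD subtype browse names. This is a minimal illustration, not code from the SDK: the helper names are hypothetical, and the `_L<discriminator>` subtype form and the `_matterc._udp` service type are assumed here from the Matter commissionable-discovery conventions rather than taken from this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumed service type for commissionable node discovery.
const std::string kCommissionableServiceType = "_matterc._udp";

// Hypothetical helper: subtype label for a 12-bit long discriminator (assumed "_L<decimal>" form).
std::string LongDiscriminatorSubtype(uint16_t discriminator)
{
    assert(discriminator <= 0x0FFF); // long discriminators are 12 bits wide
    return "_L" + std::to_string(discriminator);
}

// Hypothetical helper: full DNS-SD browse name for a subtype of the commissionable service.
std::string MakeSubtypeBrowseName(const std::string & subtype)
{
    return subtype + "._sub." + kCommissionableServiceType;
}
```

Browsing for `MakeSubtypeBrowseName("_CM")` matches every node currently in commissioning mode, while `MakeSubtypeBrowseName(LongDiscriminatorSubtype(3840))` matches only nodes advertising that discriminator, so far fewer results compete for the single callback.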
Most backends don't implement this yet. Darwin does, and no longer stops Browse operations itself. Fixes project-chip#19194 May provide a way toward fixing project-chip#13275
…at fixes DNS-SD browsing

Add an API to stop a DNS-SD browse operation. (project-chip#22823)
* Add an API to stop a DNS-SD browse operation. Most backends don't implement this yet. Darwin does, and no longer stops Browse operations itself. Fixes project-chip#19194 May provide a way toward fixing project-chip#13275
* Address review comments.
* Address more review comments.

[darwin] Use DNSServiceReconfirmRecord for A and AAAA records to miti… (project-chip#23067)
* [Dnssd] Add ReconfirmRecord method to verify address that appears to be out of date
* [SetUpCodePairer] Ask Dnssd to reconfirm discovered addresses if connecting to them ends with a CHIP_ERROR_TIMEOUT

Fix Logging When Trying to Log Nullptr To Strings (project-chip#23604)
This PR attempts to identify all cases where %s specifiers in the logging APIs (ChipLogError(), ChipLogProgress(), ChipLogDetail()) don't have a guaranteed non-null string parameter. In all identified cases the issue is fixed using StringOrNullMarker() helper method to guarantee it doesn't happen.

Use the "right" byte-swapping function for port in Darwin DnssdImpl. (project-chip#23894)
The incoming port is in host byte order and we are converting to network byte order, so should use htons (which happens to do the same thing as ntohs, so no behavior change).
Co-authored-by: Andrei Litvin <andy314@gmail.com>

Add a way for Resolver consumers to cancel operational resolve attempts. (project-chip#24010)
* Add a way for Resolver consumers to cancel operational resolve attempts. Adds a way for consumers to notify Resolver when they no longer care about an operational resolve, so a Resolver implementation can keep track of how many consumers are interested and stop work as desired if no one is interested. Fixes project-chip#23881
* Address review comments.
* Address review comments.

Make sure we stop resolves triggered by a browse when the browse stops on Darwin. (project-chip#24733)
* Make sure we stop resolves triggered by a browse when the browse stops on Darwin. Without this change, if there is a PTR record that matches whatever we are browsing but no corresponding SRV record, we would end up leaking a resolve forever. Tested by modifying minimal mdns SrvResponder::AddAllResponses to no-op instead of actually adding any responses, then trying to commission the device running the modified minimal mdns. Without this change, when the browse stops the resolves it triggered keep going. With this change, termination of the browse also terminates the resolves. Fixes project-chip#24074
* Also avoid leaking ResolveContext instances.
* Fix handling of multiple interfaces.
* Address review comment.

Improve discovery logging on Darwin. (project-chip#24846)
1) Use progress, not detail, logging, because detail logging is not actually persisted in system logs.
2) Add logging to a few functions that were missing it.

Remove the address type argument from ResolveNodeId. (project-chip#24006)
All consumers were passing kAny in practice, and some of the backends (e.g. minimal mdns) had no capability to filter by type anyway.
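The byte-swapping remark in project-chip#23894 (htons and ntohs perform the same operation, so choosing the "right" one only documents the direction of the conversion) can be checked with a small sketch. `PortToNetworkOrder` is a hypothetical helper written for this illustration, not a function from DnssdImpl, and the example assumes a POSIX system that provides `<arpa/inet.h>`.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a port from host byte order to network byte order.
// htons is the call that states the intent (host to network).
uint16_t PortToNetworkOrder(uint16_t hostPort)
{
    uint16_t networkPort = htons(hostPort);
    // Converting back must give the original value on any platform.
    assert(ntohs(networkPort) == hostPort);
    return networkPort;
}
```

On a little-endian host both functions swap the two bytes, and on a big-endian host both are no-ops, which is why swapping one for the other changes no behavior.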
* Add an API to stop a DNS-SD browse operation. Most backends don't implement this yet. Darwin does, and no longer stops Browse operations itself. Fixes project-chip#19194 May provide a way toward fixing project-chip#13275 * Address review comments. * Address more review comments.
Problem
Darwin chip::Dnssd::Browse() stops browsing as soon as OnBrowse() is called without the kDNSServiceFlagsMoreComing flag.
Per https://developer.apple.com/documentation/dnssd/1823436-anonymous/kdnsserviceflagsmorecoming?language=objc, the absence of that flag only means that there are no more "queued" results at that moment, not that browsing is complete.
As a result, if the desired service is discovered later than an undesired one (a service from another device, for example), the desired service is never found within the timeout, and the operation fails.
Proposed Solution
Remove the sdCtx->Finalize() call that is guarded by the flag check, and make sure the context is finalized either when the desired service is found or when the timeout occurs.
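The proposed behavior can be sketched as a small self-contained model. This is not the Darwin DnssdImpl code: `BrowseContext`, `Service`, `kFlagMoreComing` and the discriminator match are hypothetical stand-ins chosen for illustration, and the timeout is triggered by hand rather than by a real timer.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for kDNSServiceFlagsMoreComing (not the real dns_sd.h constant).
constexpr uint32_t kFlagMoreComing = 0x1;

// Hypothetical description of one discovered service.
struct Service
{
    std::string instanceName;
    uint16_t longDiscriminator;
};

// Hypothetical browse context that is finalized only on a match or on timeout,
// never merely because a reply arrived without the more-coming flag.
class BrowseContext
{
public:
    using DoneCallback = std::function<void(bool found, const Service &)>;

    BrowseContext(uint16_t wanted, DoneCallback onDone) : mWanted(wanted), mOnDone(std::move(onDone)) {}

    // Called once per discovered service, in the way a browse reply would be delivered.
    void OnBrowse(uint32_t flags, const Service & service)
    {
        if (mFinalized)
            return;
        // A cleared flag only marks the end of the currently queued batch.
        if ((flags & kFlagMoreComing) == 0)
            ++mBatchBoundaries;
        if (service.longDiscriminator == mWanted)
            Finalize(true, service);
        // Otherwise keep browsing; later batches may still contain the match.
    }

    // Called by the owner when the browse timeout expires.
    void OnTimeout()
    {
        if (!mFinalized)
            Finalize(false, Service{});
    }

    bool IsFinalized() const { return mFinalized; }
    int BatchBoundaries() const { return mBatchBoundaries; }

private:
    void Finalize(bool found, const Service & service)
    {
        assert(!mFinalized); // the single callback must fire at most once
        mFinalized = true;
        mOnDone(found, service);
    }

    uint16_t mWanted;
    DoneCallback mOnDone;
    bool mFinalized      = false;
    int mBatchBoundaries = 0;
};
```

In this model an undesired service arriving first, even with the flag cleared, leaves the context open, so a later reply for the wanted discriminator still reaches the single callback. The callback fires exactly once, which matches the constraint raised in the comments that the caller of Browse can only handle one callback.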