-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support async NOC generation via op cred delegate API #7863
Conversation
- Reduce memory requirement for generating x509 certificatesconnectedhomeip/src/controller/OperationalCredentialsDelegate.h Lines 34 to 42 in af20809
This comment was generated by todo based on a
|
Fixes #7859 |
/rebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds DeviceCommissioner -> OpCSRDelegate -> DeviceCommissioner workflow coupling. This may work for now, but I think overall it would be great if DeviceCommissioner had more callbacks in general (rather than its delegates) to notify a caller of some steps, so that they can go off, and do everything they need to do async, before just making a next call of "InstallNOC" or such on DeviceCommissioner, that proceed with the flow.
Instead of:
client app -> DeviceCommissioner -> OpCSRDelegate -> DeviceCommissioner.OnDeviceNOCGenerated
We would have something like:
client app -> DeviceCommissioner: start the commissioning as per today
DeviceCommissioner -> client app: OnCSRResponseReceived callback
client app -> client app: do what you need to process CSR, take time if you need...
client app -> DeviceCommissioner: InstallNOC(noc_chain)
This would reduce synchronous flow round trip coupling. Right now, with the callback approach, if you really do want to do real async work, you have to dispatch it while synchronously blocked within the callchain of processing the CSR response, which amounts to needing a different thread (or equivalently a separate dispatch queue entirely) and also not processing other events that may arise from the network, within the stack.
I do understand that the change here "fits what is already there", but we just keep extending the breadth of the long "SDK-owned" workflows and don't leave much control to the client to actually sequence the steps as needed, in their own time. It's odd for a delegate to call a callback of its owner, rather than just a method of its owner. The OperationalCredentialsDelegate is really own by the commissioning logic.
Wait, why? The whole point of this PR is that the commissioner calls As in, this PR pretty much does the "We would have something like" flow, with the one difference I see being that it only allows the app to provide the NOC async and still gets the ICA cert sync, right? |
I wasn't saying that the interface is vendor-specific. I'm pointing out that these are pure virtual methods and the implementations are vendor-specific. Because of this, we have an unusual situation where the majority of important code that needs to implement this doesn't live inside Therefore, I'm advocating for a different design approach for these particular features. My objection to this PR isn't necessarily the changes it introduces. My objection is to the prospect of many subsequent pull requests and many subsequent design iterations. I don't believe that is the right development approach for us to choose in this case. |
Unblocking for merge since we have a robust sideband conversation discussing broader needs.
Send DAC as input parameter to GenerateNodeOperationalCertificate()connectedhomeip/src/controller/CHIPDeviceController.cpp Lines 1277 to 1283 in 5bca5e2
This comment was generated by todo based on a
|
* | ||
* @return CHIP_ERROR CHIP_NO_ERROR on success, or corresponding error code. | ||
*/ | ||
virtual CHIP_ERROR GenerateNodeOperationalCertificate(const PeerId & peerId, const ByteSpan & csr, int64_t serialNumber, | ||
uint8_t * certBuf, uint32_t certBufSize, uint32_t & outCertLen) = 0; | ||
virtual CHIP_ERROR GenerateNodeOperationalCertificate(const Optional<NodeId> & nodeId, FabricId fabricId, const ByteSpan & csr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving as this PR is needed to unblock some activities.
However, I want to point out that even with NodeID as an optional parameter, this interface misses the mark. There should be no concept under project-chip/connectedhomeip/src
of the commissioner proposing a node ID, or even fabric ID, to the CA infrastructure. That is a possible implementation. It is equally possible that all node IDs will be assigned by the CA itself.
The right approach is for such local assignment of IDs in our example implementations to occur in code under project-chip/connectedhomeip/examples
. The interface here must not specify IDs; instead, an opaque context should be passed from the app. There will be an implicit contract between app and ecosystem-specific CA infrastructure as to what context is needed, but it is impossible for any code under project-chip/connectedhomeip/src
to know or predict what that will be.
The immediate consequence here is that any CA-assignment of a node ID or fabric ID other than what is passed is likely to break the commissioner. I don't think it's overstating the problem to call this a bug. And until it's fixed, any implementation that adopts an approach of CA-assignment of all IDs will have to fork and patch the sdk to work around the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@msandstedt these are the main reasons we have the optional node ID in the current CA interface.
- In the current commissioner interface, the node ID is provided by the commissioner app when the
PairDevice()
is called. I understand that it's not optimal and may even be incorrect. But unless we update all current test tools and apps as part of this PR, we will end up breaking the tip of the tree. Given we are using test events as project schedule, and TE4 cut off is 1 week away, not sure if we want to introduce all those changes in this PR. UpdateNOC
command will update the existing NOC on a device with the new NOC. Would node ID be updated as part of this process? I tried reading the spec and could not conclude one way or another. But in case we want to retain the same node ID as part ofUpdateNOC
, this optional node ID parameter can be used to inform CA of the requested node ID value. Again, may be it is not needed, or there are other ways of achieving it. Just my thoughts..
If there is no use case where the optional node ID is needed, it can certainly be removed from the interface. Removing it right now will just cause the code to break.
The commissioner interface was designed before the specifications were finalized, and it is evolving as features are getting integrated. I understand the problem an evolving public API can cause to the customers of the API. Capturing the API requirements in an issue and getting it prioritized in a test event schedule might be a good way to get to the final solution in a deterministic timeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would node ID be updated as part of this process?
Yes, if the new certificate has a new node id.
I tried reading the spec and could not conclude one way or another.
"Effect When Received" for UpdateNOC says:
... such that the Node’s Operational Identifier within the Fabric immediately changes.
and
All internal data reflecting the prior operational identifier of the Node within the Fabric SHALL be revoked and removed, to an outcome equivalent to the disappearance of the prior Node, except for ....
If there's something we can do to make the spec clearer here, we should do it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In the current commissioner interface, the node ID is provided by the commissioner app when the
PairDevice()
is called. I understand that it's not optimal and may even be incorrect. But unless we update all current test tools and apps as part of this PR, we will end up breaking the tip of the tree. Given we are using test events as project schedule, and TE4 cut off is 1 week away, not sure if we want to introduce all those changes in this PR.
The problem is that with controllers in scope for TE4, it is no longer sufficient for the sdk to only support the in-tree use cases. The only alternative would be for test event participants to develop infrastructure solely for the purpose of test event compatibility. That's not desirable.
I've filed #7951 to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would node ID be updated as part of this process?
Yes, if the new certificate has a new node id.
I tried reading the spec and could not conclude one way or another.
"Effect When Received" for UpdateNOC says:
... such that the Node’s Operational Identifier within the Fabric immediately changes.
and
All internal data reflecting the prior operational identifier of the Node within the Fabric SHALL be revoked and removed, to an outcome equivalent to the disappearance of the prior Node, except for ....
If there's something we can do to make the spec clearer here, we should do it...
Is it desired/required for the node ID to be changed as part of UpdateNOC
command? Maybe it's left up to the commissioner implementation. If the use case allows the commissioner to continue to use the same node ID for the device, having the Optional node ID in the CA interface would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine for a commissioner implementation to propose its node ID to ecosystem-specific signing infrastructure. It's not appropriate to formalize that type of flow in the chip::controller::DeviceCommissioner
object.
We need to invert our design here. Ecosystem-specific signing infrastructure will require context to field CSRs that the sdk cannot know or predict. Issues like #6656 are well-meaning but misguided. We do not know how, where or if ecosystems will embed other RDNs. It makes no sense to formalize local APIs for this in the first place.
At the simplest level, app code should pass a void *
context that will be completely opaque to the sdk and only discernible by the associated OperationalCredentialsDelegate
implementation. Consider this: how does ecosystem-specific signing infrastructure verify the commissioner's permission to commission a particular node? OperationalCredentialsDelegate
needs some context from the app to prove this permission. What this should look like though is unknowable from the sdk perspective, just like fabric ID, node ID and tag assignment rules are unknowable from the sdk perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pan-apple @bzbarsky-apple @msandstedt what are the next steps here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we all agree that the app is the only thing that has the full context here; the SDK does not. So the SDK needs to communicate some sort of app-meaningful context indicator here.
In addition to that, we should figure out whether there are parts of the context that the SDK can provide to the app and if so whether that provision is useful in common cases. That could include things like "the CSR is for a node with a certain id", or "the CSR is for a node that does not yet have a node id", or whatnot. If we think the app would always have this information anyway, there's not much point communicating it, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was merged before comments were resolved, I thought I enabled this - will investigate. I want to make sure that we have a plan moving forward here though as there is obviously clarification required.
/rebase |
@mspang can you re-review given @andy31415 is out? @Damian-Nordic @hawk248 @jepenven-silabs, any thoughts here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's continue the design discussions in #7951
Thanks @pan-apple for working to address #7859
Co-authored-by: Michael Spang <mcspang@gmail.com>
Co-authored-by: Michael Spang <mcspang@gmail.com>
) * Support async NOC generation via op cred delegate API * address review comments * remove nodeId from NOC callback * fix Android build * Update src/controller/CHIPDeviceController.cpp Co-authored-by: Michael Spang <mcspang@gmail.com> * Update src/controller/CHIPDeviceController.cpp Co-authored-by: Michael Spang <mcspang@gmail.com> Co-authored-by: Michael Spang <mcspang@gmail.com>
Problem
The
OperationalCredentialDelegate
requires NOC to be generated synchronously. Some CA implementations may not support such design. The delegate should provide a callback mechanism so that the NOC can be provided to the controller layer once CA has generated it.Change overview
Updated
OperationalCredentialDelegate
API to take a callback which the delegate implementation will call on NOC generation. UpdatedExampleOperationalCredentialsIssuer
andCHIPOperationalCredentialsDelegate
to the new interface.Testing
Tested commissioning flow using Python controller, chip-tool and iOS CHIPTool apps.