-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail-safe logic does not match General Commissioning cluster #15585
Labels
commissioning
Involves placing devices on the network, initial setup
spec
Mismatch between spec and implementation
V1.0
Comments
tcarmelveilleux
added
V1.0
commissioning
Involves placing devices on the network, initial setup
V1_FC
labels
Feb 25, 2022
tcarmelveilleux
added
spec
Mismatch between spec and implementation
and removed
bug
Something isn't working
labels
Apr 12, 2022
Please also see #18861 (comment) as an area of race condition |
tcarmelveilleux
added a commit
to tcarmelveilleux/connectedhomeip
that referenced
this issue
Jun 13, 2022
To support moving to non-permanent storage, need to ensure there is never direct access to certificates from FabricInfo classes outside the FabricTable which owns all validations. This prevents dangling FabricInfo instances and enables the changes needed to make the fail-safe work to spec for AddNOC, UpdateNOC and AddTrustedRootCertificate. Issue project-chip#15585 Issue project-chip#7695 - Always go through the FabricTable, don't allow going directly via FabricInfo - Updated CASESession to go through FabricTable also - Getters for certs and root public key are now copying operations, rather than updating a ByteSpan to internally owned data (which may be stale!) - First step towards moving to spec-compliant lifecycle for UpdateNOC with the same model as OperationalKeystore - No functional changes, only structural changes Testing done: - Cert tests still pass - Unit tests still pass
tcarmelveilleux
added a commit
that referenced
this issue
Jun 13, 2022
* Remove direct operational certs access from FabricInfo To support moving to non-permanent storage, need to ensure there is never direct access to certificates from FabricInfo classes outside the FabricTable which owns all validations. This prevents dangling FabricInfo instances and enables the changes needed to make the fail-safe work to spec for AddNOC, UpdateNOC and AddTrustedRootCertificate. Issue #15585 Issue #7695 - Always go through the FabricTable, don't allow going directly via FabricInfo - Updated CASESession to go through FabricTable also - Getters for certs and root public key are now copying operations, rather than updating a ByteSpan to internally owned data (which may be stale!) - First step towards moving to spec-compliant lifecycle for UpdateNOC with the same model as OperationalKeystore - No functional changes, only structural changes Testing done: - Cert tests still pass - Unit tests still pass * Restyled by clang-format * Rename GetXXX to FetchXXX * Fix lint * Fix Darwin CI * Fix build on Darwin * Apply review comments Co-authored-by: Restyled.io <commits@restyled.io> Co-authored-by: Boris Zbarsky <bzbarsky@apple.com>
tcarmelveilleux
added a commit
to tcarmelveilleux/connectedhomeip
that referenced
this issue
Jun 15, 2022
- This PR is an intermediate fully unit-tested step towards replacing all NOC/ICAC/RCAC storage from Fabric-Table to allow spec-compliant fail-safe implementation - Right now, we have hacky code in opcreds cluster server and in FabricTable to attempt to affect fail-safe properly. It doesn't actually work and prevents the TrustedRootCertificates, NOCs and Fabrics attributes from being implemented properly, and prevents UpdateNOC from being implemented to spec Issue project-chip#18633 Issue project-chip#17208 Issue project-chip#15585 This PR implements an operational cert storage interface and provides an implementation based on exact storage from existing FabricTable for backwards compatibility. Its usage in FabricTable and via opcreds cluster is a follow-up. Testing done: - Added large-scale new code unit test - Unit tests pass - Not hooked-up to device software yet, so just the unit tests need to pass
andy31415
pushed a commit
that referenced
this issue
Jun 16, 2022
* Introduce fail-safe compliant Operational Cert storage - This PR is an intermediate fully unit-tested step towards replacing all NOC/ICAC/RCAC storage from Fabric-Table to allow spec-compliant fail-safe implementation - Right now, we have hacky code in opcreds cluster server and in FabricTable to attempt to affect fail-safe properly. It doesn't actually work and prevents the TrustedRootCertificates, NOCs and Fabrics attributes from being implemented properly, and prevents UpdateNOC from being implemented to spec Issue #18633 Issue #17208 Issue #15585 This PR implements an operational cert storage interface and provides an implementation based on exact storage from existing FabricTable for backwards compatibility. Its usage in FabricTable and via opcreds cluster is a follow-up. Testing done: - Added large-scale new code unit test - Unit tests pass - Not hooked-up to device software yet, so just the unit tests need to pass * Restyled by clang-format * Remove 2 unused variables in unit test to fix CI Co-authored-by: Restyled.io <commits@restyled.io>
tcarmelveilleux
added a commit
to tcarmelveilleux/connectedhomeip
that referenced
this issue
Jun 21, 2022
- FabricTable management during commissioning did not properly handle the fact that committing needs to only be done on CommissioningComplete, which prevented the AddTrustedRootCertificate, UpdateNOC and AddNOC command semantics to be implemented properly and prevented proper state observation of operational credential clusters server Fixes project-chip#7695 Issue project-chip#8905 Fixes project-chip#18633 Issue project-chip#17208 Fixes project-chip#15585 This PR: - Removes direct access to FabricInfo from everywhere, which caused possibly stale FabricInfo references during commissioning. - Remove immediate committing of fabric table on UpdateNOC. - Make Fabrics, NOCs and TrustedRootCertificates attributes reflect proper partial state during fail-safe, by using the shadow data capabilities of OperationalCertificateStore and by updates to FabricInfo - Make it possible to unit test fabric table by providing the necessary lifecycle public APIs to test every modality of the commissioning flow - Make Server and DeviceController use OperationalCertificateStore to allow proper external lifecycle management of the operational cert chain. - Update all examples/controller code to new API - Remove dangerous internal APIs from FabricTable and replace with direct accessors where needed - Add more of the necessary spec validations to the UpdateNOC and AddNOC flows Testing done: - Updated all unit tests, all pass - Cert tests still pass as before - Working on further integration tests and unit tests as a follow-up noting that current state has not regressed on existing test coverage, and that new usage of OperationalCertificateStore class in FabricTable gains a large amount of additional coverage transitively via some of the existing tests making use of FabricTable.
tcarmelveilleux
added a commit
that referenced
this issue
Jun 24, 2022
* Implement shadow fail-safe data in FabricTable - FabricTable management during commissioning did not properly handle the fact that committing needs to only be done on CommissioningComplete, which prevented the AddTrustedRootCertificate, UpdateNOC and AddNOC command semantics to be implemented properly and prevented proper state observation of operational credential clusters server Fixes #7695 Issue #8905 Fixes #18633 Issue #17208 Fixes #15585 This PR: - Removes direct access to FabricInfo from everywhere, which caused possibly stale FabricInfo references during commissioning. - Remove immediate committing of fabric table on UpdateNOC. - Make Fabrics, NOCs and TrustedRootCertificates attributes reflect proper partial state during fail-safe, by using the shadow data capabilities of OperationalCertificateStore and by updates to FabricInfo - Make it possible to unit test fabric table by providing the necessary lifecycle public APIs to test every modality of the commissioning flow - Make Server and DeviceController use OperationalCertificateStore to allow proper external lifecycle management of the operational cert chain. - Update all examples/controller code to new API - Remove dangerous internal APIs from FabricTable and replace with direct accessors where needed - Add more of the necessary spec validations to the UpdateNOC and AddNOC flows Testing done: - Updated all unit tests, all pass - Cert tests still pass as before - Working on further integration tests and unit tests as a follow-up noting that current state has not regressed on existing test coverage, and that new usage of OperationalCertificateStore class in FabricTable gains a large amount of additional coverage transitively via some of the existing tests making use of FabricTable. * Restyled by clang-format * Fixes after merging master * Rename callback of FabricTable::Delegate - Remove needless ones - Make the callbacks do nothing by default to avoid more "Intentionally left blank" - Renamed to actually reflect what is happening * Rekick restyle * restyle * Align last known good time commit / revert to shadow fail-safe approach Fabric commit / revert is being refactored to only persist fabric data when we actually commit on receipt of CommissioningComplete. This reworks Last known Good Time to use the same strategy. Now, instead of persisting both last known good time and a fail-safe backup at the time of certificate installation, a single, updated last known good time is stored in RAM at the time of certificate installation. Then, on commit, this is persisted, but never before. * Fix CI * Fix TV app * Reduce flash usage on K32W0 by disabling detail (verbose) logs * Disable progress logging in TI platform lock example to save flash * Disable progress logging in TI platform shell example to save flash * Reduce stack usage of TestEventLogging * Apply review comments from @msandstedt * Save stack for Nordic build * Save stack for Nordic build, some more * Added unit tests for all basic operations * Fix merge conflict * Restyle * Change nrf connect stack limit up by 512 bytes for fake unit test comparisons * Revert "Save stack for Nordic build" This reverts commit 52699c4. * Revert "Save stack for Nordic build, some more" This reverts commit e62391e. * Fix UpdateLabel * Apply review comments, add ACL fabric removal * Apply review comments * per bzbarsky, clear in-memory Last Known Good Time if revert fails * Apply fixes from @bzbarsky-apple * Apply more review comments * Restyled * Fix tests * Add unit test for iterator * Iterator now works * Fix semantic merge conflict * Restyled * Fix predicates for storage presence temporarily issue #16958 * Fix Darwin build * Add missing docs, move methods to private that shouldn't be public * Change stack warning temporarily to pass on nRFConnect * Make MatterControllerFactory use const FabricInfo * Restyle * Fix semantic conflict on SessionManager * Fix an init ordering issue in TestSessionManager.cpp * Fix SessionManager shutdown * Restyled Co-authored-by: Restyled.io <commits@restyled.io> Co-authored-by: Michael Sandstedt <michael.sandstedt@smartthings.com> Co-authored-by: Boris Zbarsky <bzbarsky@apple.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
commissioning
Involves placing devices on the network, initial setup
spec
Mismatch between spec and implementation
V1.0
Problem
Fail-safe expiry logic currently does a full factory reset, rather than doing the spec-required behavior of:
This causes the possibility of having N+1 admin commissioning to terminate in a state worse than the starting state.
Proposed Solution
The text was updated successfully, but these errors were encountered: