Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: mediation routing keys as did key #2516

Merged

Conversation

dbluhm
Copy link
Member

@dbluhm dbluhm commented Sep 25, 2023

This PR is a minor rethink to #2502; as discussed in my comment on that PR, I think I/we went a bit too far with base58 to did:key normalization. This is the minimum set of changes required to fix the two issues.

Fixes #2492.
Fixes #2357.

These changes also cause did:keys to be sent out in mediation grant messages and, subsequently, connections/did exchange as the routingKeys in the DID Document. This has interop implications. A big motivator for these changes however is to enable AFJ/Bifold based agents to connect through a public DID with mediation where AFJ/Bifold is expecting did:keys in the routing keys in the endpoint attrib. So this should help push us towards "good" interop and should only break "bad" interop.

Updates: This PR has evolved. Here is the current state of the changes.

  • Coordinate Mediation protocol will accept either base58 or did:key (this was already implemented)
  • Coordinate Mediation protocol will emit did:key (this was already implemented)
  • Coordinate Mediation protocol manager will store the routing keys as did:key values. This is the critical new stuff that solves the issues linked above.
  • Connections protocol will accept any of base58, did:key, or did:key refs encoded routingKeys (this was already implemented and recently augmented by fix (backport): routing behind mediator  #2537).
  • Connections protocol will emit base58 encoded routingKeys (for legacy support).
  • DID Exchange will accept any of base58, did:key, or did:key ref encoded routingKeys (this was already implemented and recently augmented by fix (backport): routing behind mediator  #2537).
  • DID Exchange will emit did:key ref encoded routingKeys (this is new).
  • OOB will accept either did:key or did:key ref encoded routingKeys (this is new).
  • OOB will emit did:key ref encoded routingKeys for inline services (this is new).

Copy link
Contributor

@usingtechnology usingtechnology left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dbluhm
Copy link
Member Author

dbluhm commented Sep 26, 2023

I've got some integration test issues to sort out.

@usingtechnology usingtechnology dismissed their stale review September 26, 2023 01:27

need to run tests successfully

@usingtechnology
Copy link
Contributor

I've got some integration test issues to sort out.

Ok, that's good to hear actually. My run of BDD wasn't working (hanging?) but I figured you had run them and that it was just my machine acting up (as it did earlier today) and figured it was time for a reboot. I'll cancel the review/approval and try running BDD after your updates.

@dbluhm
Copy link
Member Author

dbluhm commented Sep 26, 2023

A new development: given our recent tightening up of DID Doc semantics and using PyDID to interact with DID Documents, it came to my attention that we were out of spec with the routingKeys values we were sending and accepting previously. Old validation required did:keys without a fragment. The only place we really call out what we're expecting in routingKeys is in the OOB RFC under the description for the services. The text says that they should be DID key references but the examples show just did:keys. It is also mentioned in the DID Doc conventions RFC.

I have corrected our implementation to send and expect did:key references.

This might present an interop challenge. I'll see if I can check in with AFJ's DID Exchange/OOB implementation and see what it is expecting.

@dbluhm
Copy link
Member Author

dbluhm commented Sep 26, 2023

My investigation suggests that AFJ will accept either a DID Key or a DID Key reference. However, it will emit just a DID Key, not a DID Key reference, in OOB invitations but sends references in the DID Docs it forms for DID Exchange...

Perhaps I can adjust things to accept either in OOB but emit the full references for any we send out.

@usingtechnology
Copy link
Contributor

Ran changes locally...BDD tests look good by pytests are failing.

@TimoGlastra
Copy link
Member

My investigation suggests that AFJ will accept either a DID Key or a DID Key reference. However, it will emit just a DID Key, not a DID Key reference, in OOB invitations but sends references in the DID Docs it forms for DID Exchange...

We ran into this issue in the AFJ implementation of OOB, but as the RFC just used keys (no reference) and ACA-Py's implementation also did that we went for the non-key reference. I think there is an issue somewhere (but it could also be that this was discussed in discord)

@dbluhm dbluhm marked this pull request as draft September 28, 2023 13:56
@swcurran
Copy link
Member

FYI -- when I ran the AATH acapy run set, the following two tests fail:

Failing scenarios:
  features/0211-coordinate-mediation.feature:50  Request mediation with the mediator accepting the mediation request and creating a connection using the mediator -- @2.1 0160 connection
  features/0211-coordinate-mediation.feature:149  Two agents creating a connection using a mediator with overlapping transports -- @2.1 Acme and Faber creating a 0160 connection using Bob a mediator with overlapping transports

To run the specific tests use, update requirements-main.txt to use the branch to be merged and then:

./manage run -d acapy-main -t @T002-RFC0211,@T004-RFC0211,@T005-RFC0211 -t @RFC0160

Example error from run:

     Assertion Failed: FAILED SUB-STEP: And "Faber" receives the connection invitation
      Substep info: Assertion Failed: resp_status 422 is not 200; {"routingKeys": {"0": ["Value did:key:z6MkfDd4T7QAeuLrjeigXe8HfdvqNHWA5KGmMjkrLceE27Bz#z6MkfDd4T7QAeuLrjeigXe8HfdvqNHWA5KGmMjkrLceE27Bz is not a raw Ed25519VerificationKey2018 key"]}}
      Traceback (of failed substep):
        File "/usr/local/lib/python3.7/site-packages/behave/model.py", line 1329, in run
          match.run(runner.context)
        File "/usr/local/lib/python3.7/site-packages/behave/matchers.py", line 98, in run
          self.func(context, *args, **kwargs)
        File "features/steps/0160-connection.py", line 152, in step_impl
          assert resp_status == 200, f"resp_status {resp_status} is not 200; {resp_text}"
      
      Captured stdout:
      Data table in step contains an unrecognized role 'recipient', must be inviter, invitee, inviteinterceptor, issuer, holder, verifier, prover, requester, responder, mediator, and recipient
      Data table in step contains an unrecognized role 'sender', must be inviter, invitee, inviteinterceptor, issuer, holder, verifier, prover, requester, responder, mediator, and recipient

@dbluhm
Copy link
Member Author

dbluhm commented Oct 8, 2023

This PR is now ready for review. I ended up trimming out some additional unused code relating to the previously removed old routing protocol and "inbound connection" handling on the create_did_doc method. Also, I cleaned up some items relating to the route managers to further abstract out multi-tenancy from connection protocols.

I've adjusted the implementation to make sure that either did:key or the full did:key references are accepted in out-of-band invitations. With the changes from #2536, did:key or did:key refs can be accepted in connections/did exchange protocols. We will always emit did:key refs for both OOB and DIDExchange. AFJ should be able to process this but other agents may have trouble if they were slightly out of spec like ACA-Py was.

@usingtechnology
Copy link
Contributor

Lots of poking around today and running tests and AATH.
Something on my local dev doesn't like running pytests, but can run ./scripts/run_tests successfully and that's what counts.

Moving on to AATH... there are issues that we need to address in that repo.

We need to update 0160-connection.py to handle recipient and sender - those are not in the case/switch statement so it raises that error when it shouldn't. What I did to bypass this probably isn't proper but it got me further down the call stack to uncover other issues.

Next there is an actual issue with the tests. When there are no inbound transports, the backchannel code doesn't create an endpoint parameter, so we hit an ACA-py config error.

aries_cloudagent.config.error.ArgsParseError: -e/--endpoint is required.

Simple enough to add a default http endpoint.

However, then you hit another config error: aries_cloudagent.config.error.ArgsParseError: -it/--inbound-transport is required.

I am not sure how this scenario ever ran successfully since the test is to not have inbound transport and for 3 years that configuration has been required. At least from what I can see.

WIth ALL of that being said, this branch/PR does have an issue to address that is uncovered in AATH... routing keys not Ed25519VerificationKey2018.

  features/0211-coordinate-mediation.feature:50  Request mediation with the mediator accepting the mediation request and creating a connection using the mediator -- @2.1 0160 connection

fails with:

      Assertion Failed: FAILED SUB-STEP: And "Faber" receives the connection invitation
      Substep info: Assertion Failed: resp_status 422 is not 200; {"routingKeys": {"0": ["Value did:key:z6MkrSmucdbqKmWhVxJM4q2nyAFD1pCLNrfQWttos4ntoqMj#z6MkrSmucdbqKmWhVxJM4q2nyAFD1pCLNrfQWttos4ntoqMj is not a raw Ed25519VerificationKey2018 key"]}}
      Traceback (of failed substep):
        File "/usr/local/lib/python3.7/site-packages/behave/model.py", line 1329, in run
          match.run(runner.context)
        File "/usr/local/lib/python3.7/site-packages/behave/matchers.py", line 98, in run
          self.func(context, *args, **kwargs)
        File "features/steps/0160-connection.py", line 152, in step_impl
          assert resp_status == 200, f"resp_status {resp_status} is not 200; {resp_text}"
  features/0211-coordinate-mediation.feature:149  Two agents creating a connection using a mediator with overlapping transports -- @2.1 Acme and Faber creating a 0160 connection using Bob a mediator with overlapping transports

fails with:

      Assertion Failed: FAILED SUB-STEP: And "Acme" receives the connection invitation
      Substep info: Assertion Failed: resp_status 422 is not 200; {"routingKeys": {"0": ["Value did:key:z6MkrSmucdbqKmWhVxJM4q2nyAFD1pCLNrfQWttos4ntoqMj#z6MkrSmucdbqKmWhVxJM4q2nyAFD1pCLNrfQWttos4ntoqMj is not a raw Ed25519VerificationKey2018 key"]}}
      Traceback (of failed substep):
        File "/usr/local/lib/python3.7/site-packages/behave/model.py", line 1329, in run
          match.run(runner.context)
        File "/usr/local/lib/python3.7/site-packages/behave/matchers.py", line 98, in run
          self.func(context, *args, **kwargs)
        File "features/steps/0160-connection.py", line 152, in step_impl
          assert resp_status == 200, f"resp_status {resp_status} is not 200; {resp_text}"

@usingtechnology
Copy link
Contributor

That seems like a long comment, but think it boils down to a routing keys fix, and we add issues to AATH for some other things.

@dbluhm
Copy link
Member Author

dbluhm commented Oct 10, 2023

Thanks for the testing @usingtechnology! I'll dig into this when I get a chance.

@dbluhm
Copy link
Member Author

dbluhm commented Oct 16, 2023

I'm able to reproduce the issue in my own test environment. Interestingly, it's specific to the connection protocol Admin API. DID Exchange is working fine. Looks like it's a matter of the connection invitation schema being picky about what the routing keys look like.

@dbluhm
Copy link
Member Author

dbluhm commented Oct 16, 2023

After double checking the RFC for the connection protocol, it does actually say that the routing keys must be raw keys and not references. I'll make sure these changes keep that consistent.

@dbluhm
Copy link
Member Author

dbluhm commented Oct 18, 2023

@usingtechnology @swcurran this PR is ready for review!

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Don't reinstall all deps just because some local code changed

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
This enables retrieving the base mediation record in connection
protocol managers without the managers having to be aware of
multitenancy specifics.

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
This makes it more convenient to call from connection protocol managers

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
This makes it easier to call from protocol managers

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
This makes it easier to call from protocol managers

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
In connection protocol.

Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
@dbluhm dbluhm force-pushed the fix/mediation-routing-keys-as-did-key branch from 756b513 to a8cf6ad Compare October 18, 2023 19:02
@swcurran
Copy link
Member

Ran it against the AATH test suite and all passed. Mind you — that was before the 23 commits that @dbluhm just added. Do I need to re-run the AATH test, Daniel?

@dbluhm
Copy link
Member Author

dbluhm commented Oct 18, 2023

Yeah, I think another round of testing is appropriate.

@usingtechnology
Copy link
Contributor

AATH update requirements-main.txt

aries-cloudagent[indy, bbs, askar]@git+https://github.com/hyperledger/aries-cloudagent-python@de09845e4ac32081b2825b448dbb1485c5dbdf5d
acapy-resolver-universal@git+https://github.com/sicpa-dlab/acapy-resolver-universal.git@v0.1.0
BUILD_AGENTS=acapy-main ./manage build
./manage run -d acapy-main -t @T002-RFC0211,@T004-RFC0211,@T005-RFC0211 -t @RFC0160

Still get the errors like:

      Assertion Failed: FAILED SUB-STEP: And "Acme" receives the connection invitation
      Substep info: Assertion Failed: resp_status 422 is not 200; {"routingKeys": {"0": ["Value did:key:z6MkfsqtL1nN8Rj94LXBxYXRrbYZfz9kvac4Br3LZkJXRDAr is not a raw Ed25519VerificationKey2018 key"]}}
      Traceback (of failed substep):
        File "/usr/local/lib/python3.7/site-packages/behave/model.py", line 1329, in run
          match.run(runner.context)
        File "/usr/local/lib/python3.7/site-packages/behave/matchers.py", line 98, in run
          self.func(context, *args, **kwargs)
        File "features/steps/0160-connection.py", line 152, in step_impl
          assert resp_status == 200, f"resp_status {resp_status} is not 200; {resp_text}"

@dbluhm
Copy link
Member Author

dbluhm commented Oct 19, 2023

@usingtechnology I think the commit to checkout should be a8cf6ad

@swcurran
Copy link
Member

FYI — when I run PR tests in AATH, I update the requirements-main.txt file to:

aries-cloudagent[indy, bbs, askar]@git+https://github.com/<user>/aries-cloudagent-python@<branch in user’s fork>
acapy-resolver-universal@git+https://github.com/sicpa-dlab/acapy-resolver-universal.git@v0.1.0

So for this one:

aries-cloudagent[indy, bbs, askar]@git+https://github.com/dbluhm/aries-cloudagent-python@fix/mediation-routing-keys-as-did-key
acapy-resolver-universal@git+https://github.com/sicpa-dlab/acapy-resolver-universal.git@v0.1.0

@usingtechnology
Copy link
Contributor

When I use the correct commit hash, all looks good. pytests and AATH all check out. Great work!

Copy link
Contributor

@usingtechnology usingtechnology left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍 - pytests and AATH all check out.

@sonarcloud
Copy link

sonarcloud bot commented Oct 19, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@swcurran swcurran merged commit 900b1d8 into hyperledger:main Oct 20, 2023
9 checks passed
@dbluhm dbluhm deleted the fix/mediation-routing-keys-as-did-key branch January 30, 2024 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Routing keys for public DIDs should be published as did:keys Mediation webhook routing keys aren't did:keys
4 participants