OpenSSH 8.4: Add remaining algorithms #91

Closed
xvzcf opened this issue Apr 16, 2021 · 9 comments

xvzcf commented Apr 16, 2021

8.4 only has a few OQS algorithms at the moment. We should add the rest in once we deal with #89 and #90

baentsch commented May 20, 2021

So by working this issue, we decided to postpone #89.

Now, when following the build-and-test instructions in the documentation, make tests fails:

Could not upgrade key /root/git/openssh/regress/ssh-oqsdefault-agent.pub to certificate: invalid argument
FATAL: ca sign failed

--> Looking at oqs-test/run-tests.sh I suppose this is a known failure and missing in the set-up documentation, right?

Also "irritating" is that no OQS tests get executed when doing as documented:

python3 -m nose --rednose --verbose 
0 tests run in 0.0 seconds (0 tests passed)

Again, is this a case of missing documentation update @xvzcf ? I'm now operating on branch "mb-v8test" in case you'd like to correct the updates regarding documentation and tests that I'm making so I'm not chasing wrong goals when moving to Ubuntu 20 and extending the algorithm list...

Edit: The tests run_tests.sh and try_connection.py worked identically OK in Ubuntu18 and 20. Good. Even after adding "dilithium-aes". Even better.

Surprising: try_connection.py only works after run_tests.sh has been executed before: Is that intentional? Problem is that the shellscript takes ages to complete and the python script (if run to test all algorithm combinations) also takes a very long time (170 tests/seconds after just adding dilithium-aes).

Then: Should the description in the README (for "manual" SSH setup and execution) work OK? Already on the unchanged code, if I try it for "dilithium2", it fails:

root@6ddea30f00ac:~# /opt/openssh/bin/ssh -p 2222 localhost \
    -o KexAlgorithms=frodokem-640-aes-sha256 \
    -o HostKeyAlgorithms=ssh-dilithium2 \
    -o PubkeyAcceptedKeyTypes=ssh-dilithium2 \
    -o StrictHostKeyChecking=no \
    -i ~/ssh_client/id_dilithium2
key DILITHIUM2 SHA256:fS4kMEl16hoAan5myFh6WdJ0rrHv+fdoSAaRYyKv3mk returned incorrect signature type
sign_and_send_pubkey: signing failed for DILITHIUM2 "/root/ssh_client/id_dilithium2": signature algorithm not supported

--> Do I have to dig deeper into the openssh-8 logic or are these issues for you expected/easy to rectify, @xvzcf ?

xvzcf commented May 20, 2021

> Now, when following the build-and-test instructions in the documentation, make tests fails:
>
> Could not upgrade key /root/git/openssh/regress/ssh-oqsdefault-agent.pub to certificate: invalid argument
> FATAL: ca sign failed
>
> --> Looking at oqs-test/run-tests.sh I suppose this is a known failure and missing in the set-up documentation, right?

Yes this does look like a known failure. Does oqs-test/run-tests.sh pass?

> Again, is this a case of missing documentation update @xvzcf ? I'm now operating on branch "mb-v8test" in case you'd like to correct the updates regarding documentation and tests that I'm making so I'm not chasing wrong goals when moving to Ubuntu 20 and extending the algorithm list...

Yes this is a case of missing documentation. The current (incomplete) set of test commands can be found in the config.yml, and I can update the README in OQS-v8.

> Surprising: try_connection.py only works after run_tests.sh has been executed before: Is that intentional? Problem is that the shellscript takes ages to complete and the python script (if run to test all algorithm combinations) also takes a very long time (170 tests/seconds after just adding dilithium-aes).

Yes, this is intentional (so that the python script does not have to set up the ssh{d}_config, keys, etc). It is possible to do just the setup without running all the regression tests (something like make tests -e LTESTS="" I believe) and this could be added to the python script perhaps. The python script also does not test all possible combinations, just picks a random signature and key-exchange algorithm and sees if the tests work with the choices.

> Then: Should the description in the README (for "manual" SSH setup and execution) work OK? Already on the unchanged code, if I try it for "dilithium2", it fails:
>
> root@6ddea30f00ac:~# /opt/openssh/bin/ssh -p 2222 localhost \
>     -o KexAlgorithms=frodokem-640-aes-sha256 \
>     -o HostKeyAlgorithms=ssh-dilithium2 \
>     -o PubkeyAcceptedKeyTypes=ssh-dilithium2 \
>     -o StrictHostKeyChecking=no \
>     -i ~/ssh_client/id_dilithium2
> key DILITHIUM2 SHA256:fS4kMEl16hoAan5myFh6WdJ0rrHv+fdoSAaRYyKv3mk returned incorrect signature type
> sign_and_send_pubkey: signing failed for DILITHIUM2 "/root/ssh_client/id_dilithium2": signature algorithm not supported
>
> --> Do I have to dig deeper into the openssh-8 logic or are these issues for you expected/easy to rectify, @xvzcf ?

I'm not sure off the top of my head what's going on here, but it seems like there's a good lead to follow up: sign_and_send_pubkey.

baentsch commented

> The python script also does not test all possible combinations, just picks a random signature and key-exchange algorithm and sees if the tests work with the choices

No longer :-) If you give it a parameter, it iterates through all alg combinations.

> I'm not sure off the top of my head what's going on here, but it seems like there's a good lead sign_and_send_pubkey to follow up.

OK, will debug into it.

xvzcf commented May 20, 2021

> No longer :-) If you give it a parameter, it iterates through all alg combinations.

A potential problem here is that we'd be taking ~2 hours to run the tests, just like before. I avoided doing all the combinations for that reason. My preference would be to augment the built-in regression tests first before looking at try_connection.py.

baentsch commented

> My preference would be to augment the built-in regression tests first before looking at try_connection.py.

I'd also prefer that for regular (CI) operations. But this parameter delivers a fast way to check end-to-end if everything works: I wasn't exactly anxiously looking forward to diving into understanding the openssh test suite logic....

baentsch commented May 21, 2021

@xvzcf FYI, an update: try_connection.py has now been changed to run each QSC algorithm exactly once (run here) if passed the parameter "doone": fast and (reasonably) thorough, so a good compromise. If given no parameter, it still tests only one randomly chosen combination; if given the parameter "doall", it exercises all combinations.
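
The three modes described above can be sketched roughly as follows. This is a minimal illustration, not the actual try_connection.py: the function name `select_combinations`, the placeholder algorithm names (other than the two taken from this thread), and the pairing strategy used for "doone" are all assumptions.

```python
import itertools
import random

# Hypothetical algorithm lists for illustration only; the real script
# builds its own lists. "kex-b", "kex-c" and "sig-b" are placeholders.
KEXS = ["frodokem-640-aes-sha256", "kex-b", "kex-c"]
SIGS = ["ssh-dilithium2", "sig-b"]


def select_combinations(mode=None, kexs=KEXS, sigs=SIGS, rng=random):
    """Return the list of (kex, sig) pairs to test for the given mode."""
    if mode is None:
        # No parameter: a single randomly chosen combination.
        return [(rng.choice(kexs), rng.choice(sigs))]
    if mode == "doall":
        # Every kex paired with every signature algorithm.
        return list(itertools.product(kexs, sigs))
    if mode == "doone":
        # Assumed strategy: walk both lists in parallel so that every
        # algorithm is covered; the shorter list wraps around.
        n = max(len(kexs), len(sigs))
        return [(kexs[i % len(kexs)], sigs[i % len(sigs)]) for i in range(n)]
    raise ValueError(f"unknown mode: {mode!r}")
```

With these placeholder lists, "doall" yields 6 connection attempts while "doone" yields 3, which is why the latter scales with the number of algorithms rather than with the number of pairs.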

run_tests.sh unfortunately begins to fail now with too many algorithms enabled; so one more thing to debug... --> Should we again add an algorithm enablement option in generate.yml? If so, which algorithms should we enable by default? Was there an upper limit to the number of algs that could be supported in openssh-7.x that you can recall?

Edit: Found a limiting constant:

openssh/match.c

Line 268 in 371c2e5

#define MAX_PROP 40

Increasing that fixes the latest problem. Do you see any reason why we should refrain from doing so?
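
Why such a constant can break negotiation once many algorithms are enabled can be modeled in a few lines. This is a simplified Python sketch, assuming that the matching routine around that line keeps at most MAX_PROP entries of the server's comma-separated list and ignores the rest; the algorithm names `alg0`, `alg1`, ... are placeholders.

```python
MAX_PROP = 40  # value quoted from match.c above


def match_list(client, server, max_prop=MAX_PROP):
    """Return the first client algorithm also offered by the server.

    Only the first `max_prop` server entries are considered (assumed
    behaviour); anything after that is silently invisible to matching.
    """
    server_props = [p for p in server.split(",") if p][:max_prop]
    for candidate in client.split(","):
        if candidate in server_props:
            return candidate
    return None


# A server offering 45 placeholder algorithms.
server_offer = ",".join(f"alg{i}" for i in range(45))

found_with_cap = match_list("alg42", server_offer)               # entry 43 is cut off
found_with_larger_cap = match_list("alg42", server_offer, max_prop=64)
```

Under this model, an algorithm listed after position 40 never matches even though both sides support it, which is consistent with failures that only appear once the list grows.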

baentsch commented May 22, 2021

> Was there an upper limit to the number of algs that could be supported in openssh-7.x that you can recall?

Answering my own question: Yes, this switch alone makes more than 64 (maybe only 32 for small platforms?) signature algorithms a bad idea:

openssh/ssh-keyscan.c

Lines 800 to 802 in 371c2e5

int type = sshkey_type_from_name(tname);
switch (type) {

--> Adding sig-algorithm en/disablement logic into the code generator (YML and logic).
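
The 32/64 figure makes sense if each key type selected in that switch is recorded as one bit in a fixed-width integer. That is an assumption about the surrounding code, not something shown in the excerpt; the sketch below only illustrates the arithmetic, and `flag_for` is a hypothetical helper.

```python
def flag_for(index, width=32):
    """One-hot flag for the index-th key type in a width-bit integer.

    Returns 0 when the index does not fit, i.e. the type cannot be
    represented as a distinct bit.
    """
    return (1 << index) & ((1 << width) - 1)


def distinct_flags(n_types, width=32):
    """Count how many of n_types key types get a usable, unique flag."""
    return len({flag_for(i, width) for i in range(n_types)} - {0})


usable_32 = distinct_flags(70, width=32)  # capped by the integer width
usable_64 = distinct_flags(70, width=64)
```

With 70 hypothetical key types, only as many as the integer has bits can be told apart, so a switch that enables or disables a subset of signature algorithms at generation time keeps the count below that bound.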

But even then, basic tests on simple algorithms begin to fail after only adding half the algorithms... Quite some more debugging required....

Edit: And this debugging isn't fun: sshd goes to 100% CPU utilization for 1-2 minutes until it responds -- ultimately the client exits with

debug1: Authentication succeeded (publickey).
Authenticated to 127.0.0.1 ([127.0.0.1]:4242).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug3: send packet: type 90
debug1: Entering interactive session.
debug1: pledge: network
Bad packet length 3875568.
debug3: send packet: type 1
ssh_dispatch_run_fatal: Connection to 127.0.0.1 port 4242: Connection corrupted

--> Does this look familiar? This is going "deeper" than I'm comfortable with (and wanted to spend time on).

baentsch commented

Problem was a buffer size limitation:

#define PACKET_MAX_SIZE (256 * 1024)
--> way too small for the combination McEliece & Rainbow.
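
For scale, the limit works out to 262144 bytes. The following sketch models only the upper-bound check that a "Bad packet length" message suggests; the function name is hypothetical, the real check is simplified here, and the value from the debug log earlier in the thread is used purely as sample input.

```python
PACKET_MAX_SIZE = 256 * 1024  # limit quoted above: 262144 bytes


def packet_length_ok(length, max_size=PACKET_MAX_SIZE):
    """Simplified model: accept a packet only if it fits under the limit."""
    return 0 < length <= max_size


logged_length = 3875568  # from the "Bad packet length" line in the log above

fits_default = packet_length_ok(logged_length)
# Hypothetical larger limit, only to show the effect of raising it.
fits_raised = packet_length_ok(logged_length, max_size=4 * 1024 * 1024)
```

The logged length is roughly 15 times the default limit, so any message carrying both a very large public key and a very large signature would be rejected before it could be parsed.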

Merging #97 closes this issue for good.

baentsch commented Jun 1, 2021

Closed by #97

baentsch closed this as completed Jun 1, 2021