-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix pkcs7 #6672
Fix pkcs7 #6672
Conversation
9dc3f6a
to
8137ea1
Compare
Thanks! We're obviously going to prioritize #6670, so once that's merged we'll ask you to rebase on top of it. |
You’re welcome!
That is indeed the correct decision for obvious reasons. It is worth noting that this PR is not backwards compatible: there are inputs that are rejected with this PR that were accepted before it. Also, this PR does not address determining if the certificates in a PKCS7 (really CMS, Cryptographic Message Syntax) message should be trusted. Even after reading the code I am not sure how that is done in this code. In practice, one will likely want checks such as “there must be an EKU that states this cert can be used for signing X”. |
@DemiMarie #6670 has just been merged, can you rebase this to resolve conflicts? Thanks! |
8656410
to
5224654
Compare
@mpg I did the rebase, but the resulting code is quite clumsy, as the test cases generated by fuzzing trip have invalid encapsulated data and are therefore rejected before the code they are meant to test gets to run. I chose to defer rejection until later so as to not break the tests, but better test cases (which I am not really sure how to generate) would be a better approach. |
Thanks. We want to replace those test cases by more robust ones, using the fuzzer inputs was a just quick fix. I've filed #6690 to track this. Can you please make a note of the places where you know some validation is still missing? |
I just pushed a fix for the last place where I know validation to be missing at the ASN.1 level. I did not add a test case, though; mutating ASN.1 by hand is not my strength. It is quite possible that more validation is missing and I recommend doing a full review of the code before shipping it. At a minimum, the TODO for comparing the issuer and serial number in the signature against the certificate should be fixed, and I also recommend using the DigestAlgorithmIdentifier in the signature to avoid having to rehash data. There is a much more serious problem with the current API, however: it does not check that the certificate was signed by a trusted authority. Right now, anyone can sign a message that the PKCS7 APIs will consider valid, which is horribly insecure. I filed #6692 for that. Also, PKCS7 is obsolete; the replacement is Cryptographic Message Syntax (CMS). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @DemiMarie, thanks for the fixes!
It looks to me like this PR does two, mostly distinct things:
- It fixes up the currently exposed API of the parser parser, and adds extra validation
- It adds a new API for dealing with data embedded data in the PKCS7 signed data datastructure.
The current (very restricted) API was defined by the folks from IBM to fit their narrow needs:
- They have a pre-shared certificate they are validating the signatures against. (the embedded certs are ignored)
- They are validating an external blob or the hash of an external blob.
Because they only care about the signatures in the SignedData
struct, the current API is usable / useful, even though all of the struct's fields are private, and lack accessors.
Since we have no way of getting at the embedded data without using private fields at the moment, I don't think we can merge mbedtls_pkcs7_signed_content_verify
without adding an accessor for the data portion at the very least.
Considering all of the above, would you be okay with spliting 3038516 out into its separate PR, where we can consider what APIs we need to add to make it useful - and focus this PR on fixing up the currently exposed API?
|
||
ret = pkcs7_get_digest_algorithm( p, end_signer, &signer->alg_identifier ); | ||
if( ret != 0 ) | ||
goto out; | ||
|
||
/* Assume authenticatedAttributes is nonexistent */ | ||
|
||
/* Asssume signedAttrs is nonexistent */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authenticated / unauthenticated attributes are the phraseology used by the PKCS7 RFC.
Typo "Asssume" -> "Assume"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which RFC are you referring to? My understanding is that PKCS7 is obsolete and has been replaced by CMS, and that CMS refers to signed attributes here. Authenticated attributes are in the context of AEAD-encrypted and authenticated messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to RFC 2315, which documents PKCS7 1.5 - it's the version that was referenced when the module was originally implemented.
After quickly reading through the list of changes in RFC5652, CMS seems to be fully backwards compatible with the subset we currently implement - but we should use wording consistent with the ASN.1 description of the type on top of this function definition.
* content | ||
* [0] EXPLICIT ANY DEFINED BY contentType OPTIONAL } | ||
**/ | ||
static int pkcs7_get_content_info_type( unsigned char **p, unsigned char *end, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of getting rid of this function completely, let's rename it pkcs7_get_content_type
, and Make it only parse the contentType
field, not the entire ContentInfo Sequence.
That way this function's behaviour will be consistent with all the other pkcs7_get_*
functions, and we don't end up duplicating as much code.
I.e. remove the first mbedtls_asn1_get_tag
call, and don't reset the pointer on an error condition.
Also, @DemiMarie - could you run a coverage test (as described in #6745) - and add test cases to exercise any new branches added by this PR? Pre-existing coverage gaps don't need to be fixed in this PR, we can handle those separately - but we don't want to add new gaps. |
Otherwise invalid data could be accepted. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
No change in behavior Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Since only one content type (signed data) is supported, storing the content type just wastes memory. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
There must not be any. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
No change in behavior. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
They will always be constant. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
A CMS signature can have internal data, but mbedTLS does not support verifying such signatures. Reject them during parsing. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Such junk should not appear. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Since only a single hash algorithm is currenlty supported, this avoids having to perform hashing more than once. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
if( ret == MBEDTLS_ERR_ASN1_UNEXPECTED_TAG ) | ||
return( 0 ); | ||
if( ret != 0 ) | ||
return( MBEDTLS_ERROR_ADD( MBEDTLS_ERR_PKCS7_INVALID_FORMAT, ret ) ); | ||
start = *p; | ||
end_set = *p + len1; | ||
|
||
ret = mbedtls_asn1_get_tag( p, end_set, &len2, MBEDTLS_ASN1_CONSTRUCTED | ||
| MBEDTLS_ASN1_SEQUENCE ); | ||
if( ret != 0 ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have new code style standard: https://mbed-tls.readthedocs.io/en/latest/kb/how-to/rewrite-branch-for-coding-style/ , seems we have no need to remove the braces here.
@@ -212,17 +175,12 @@ static int pkcs7_get_certificates( unsigned char **p, unsigned char *end, | |||
* The behaviour would be improved with addition of multiple signer support. | |||
*/ | |||
if ( end_cert != end_set ) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have new code style standard: https://mbed-tls.readthedocs.io/en/latest/kb/how-to/rewrite-branch-for-coding-style/ , seems we have no need to remove the braces here.
Hi @DemiMarie - are you OK to split this into two as Bence suggested above, and rebase / reformat with new code style? If you don't have time, we can pick it up. We'd like to progress this PR next, now that #6816 is just about there, so please let us know what's best. For rebasing, I'd suggest start off by rebasing on top of #6818 as I expect this to land pretty soon. We'd like to pick up the first part (the fixes) first, and would come to the second part (new feature for dealing with embedded data) after we've completed work on the initial parsing functionality. |
@daverodgman: splitting this PR into two is fine, but I do not have time to do it myself right now. Feel free to close the PR or to push to the branch directly (you should be able to if you have write access to this repo). |
OK, thanks, we will pick this up then. We will probably create a new PR taking commits from this one for the fixes, and similarly, raise a second PR for the new feature (#7054 ). We appreciate your work on PKCS 7 - thanks for your involvement here! |
You’re welcome! |
Description
This fixes various problems in the PKCS7 parser. See individual commit messages for details.
Fixes #6671.
Gatekeeper checklist
Notes for the submitter
Please refer to the contributing guidelines, especially the
checklist for PR contributors.