-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fault mitigation #5646
Fault mitigation #5646
Conversation
@jenswi-linaro are there any performance tests you plan to run? I have several boards (xilinx, nxp, stm32...) that I could use if you need help. Not that performance is more critical than security in this case so may be irrelevant. |
* This is implicit since we're normally trying to protect things post | ||
* boot and booting takes quite some time. | ||
* | ||
* [1] https://www.riscure.com/uploads/2020/05/Riscure_Whitepaper_Fault_Mitigation_Patterns_final.pdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jenswi-linaro the link is not visible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a copy at https://web.archive.org/web/20220616035354/https://www.riscure.com/uploads/2020/05/Riscure_Whitepaper_Fault_Mitigation_Patterns_final.pdf perhaps we could use that link?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll update.
@ldts, thanks for the offer. I haven't planned any performance testing because I believe the added overhead will not be noticeable combined with signature verification. That said, if you notice anything odd please let me know. |
aba492b
to
af7b7bf
Compare
Rebased to resolve a merge conflict. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor things in commit "Basic fault mitigation routines"
* While a function is executed it can update its state as a way of keeping | ||
* track of important passages inside the function. Before the function | ||
* returns with for instance ftmn_return_res() to check that the | ||
* accumulated state matches the expected state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setence sounds strange.
| (...). BeforeWhen the function
| * returns with for instance ftmn_return_ res() to it is checked that the
| * accumulated state matches the expected state.
?
* FTMN_PANIC() - FTMN specific panic function | ||
* | ||
* This function is called whenever the FTMN function detects an | ||
* inconsistency. An inconsistency is able to occur if the system is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 space chars
* | ||
* This function is called whenever the FTMN function detects an | ||
* inconsistency. An inconsistency is able to occur if the system is | ||
* subject to an fault injection attack, in this case doing a panic() isn't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/an/a/
#ifdef __KERNEL__ | ||
#define FTMN_PANIC() panic(); | ||
#else | ||
#define FTMN_PANIC() TEE_Panic(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A specific ID could be helpful when printed in generic core debug trace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I never look at the TEE_Panic ID. Do you have any particular ID in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TEE_ERROR_SECURITY
or TEE_ERROR_BAD_STATE
could be good candidates.
But i admit it will not be specifically obvious it specifically from the fault injection countermeasure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only occasion when I have found specific values to be helpful in TEE_Panic()
, is when running negative tests in CI, when the xtest
output is visible but not the secure world log. Not very applicable to this case I suppose. But FWIW either ways are OK with me (0 or something else).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't care which we take either. @etienne-lms which do you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not willing to track glitch attacks on CI runs? :)
Ok, 0
is fine.
static inline void __ftmn_callee_done(struct ftmn_func_arg *arg, | ||
unsigned long my_hash, unsigned long res) | ||
{ | ||
if (IS_ENABLED(CFG_CORE_FAULT_MITIGATION)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reson not testing arg
here, compared to other neighbour functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well spotted, I'll fix.
* | ||
* The passed result will be stored in the struct ftmn_func_arg struct | ||
* supplied by the caller. This function can be called any number of times | ||
* by the callee, provided that one of the FTMN_CALLEE_DONE_XXX() function has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/function/functions/
* __ftmn_get_tsd_func_arg(). | ||
* | ||
* The FTMN_CALLE_* functions only work with the real function name so the | ||
* old hash must be removed and replaces with the new for the calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/replaces/replaced/
(my_old_hash), FTMN_FUNC_HASH(__func__)) | ||
|
||
/* | ||
* FTMN_SET_CHECK_RES() - records a result in local checked state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d/record/Record/
ditto for few macros below.
core/tests/ftmn_boot_tests.c
Outdated
#include <fault_mitigation.h> | ||
#include <initcall.h> | ||
#include <kernel/thread.h> | ||
#include <types_ext.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
order
core/kernel/ree_fs_ta.c
Outdated
@@ -54,6 +54,7 @@ | |||
#include <tee/tee_ta_enc_manager.h> | |||
#include <tee/uuid.h> | |||
#include <utee_defines.h> | |||
#include <fault_mitigation.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
order
Addressed all comments except the one concerning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I find the modified code a bit difficult to read unfortunately, but I suppose there is not much that can be done. The impact could have been much worse ;)
A couple of comments below, then for the whole:
Acked-by: Jerome Forissier <jerome.forissier@linaro.org>
|
||
typedef int (*ftmn_memcmp_t)(const void *p1, const void *p2, size_t nb); | ||
|
||
/* The default hash used when xoring the result in struct ftmn_check */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these values chosen? More or less randomly I suppose? Any particular property they'd better have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, just random.
{ | ||
#if defined(CFG_CORE_FAULT_MITIGATION) && defined(__KERNEL__) | ||
return &thread_get_tsd()->ftmn_arg; | ||
#elif defined(CFG_CORE_FAULT_MITIGATION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CFG_CORE
should not be used for user-space features IMO. How about CFG_FAULT_MITIGATION
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
Adds basic fault mitigation routines designed to help protecting from fault injection attacks on the hardware. This is by no means bullet proof, but it should at least improve the situation. These routines focus on verifying that a function has been called and that the returned value matches the result from the function. This is done by having a handshake between the caller and the callee where also the return value is transmitted in a separate channel. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds some simple test for the fault mitigation routines. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds fault mitigation in mbedtls_rsa_rsassa_pss_verify_ext() by using the macro FTMN_CALLEE_DONE_MEMCMP() instead of memcmp() when checking that the hash in the RSA signature is matching the expected value. FTMN_CALLEE_DONE_MEMCMP() saves on success the result in a thread local storage if fault mitigations was enabled when the function was called. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
…fy() Adds fault mitigation in mbedtls_rsa_rsassa_pkcs1_v15_verify() by using the macro FTMN_CALLEE_DONE_MEMCMP() instead of just mbedtls_safer_memcmp() when checking that the hash in the RSA signature is matching the expected value. FTMN_CALLEE_DONE_MEMCMP() saves on success the result in a thread local storage if fault mitigations was enabled when the function was called. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds fault mitigations in crypto_acipher_rsassa_verify() by checking that the internal call to memcmp() when verifying the hash in the RSA signature was called and was successful. The internal call to memcmp() records the result of the comparison if successful. This is double checked against the normal return value from the called pk_info->verify_func(). If the normal return value is OK then the recorded return value must match or we're likely subject to a fault injection attack and we're triggering a panic. If the normal return value isn't OK we don't care about the recorded value, it's overridden by a new error code. In this case we don't know if we're subject to a fault injection attack or not, the important thing to make sure that the calling function doesn't miss the error. This fault mitigation is only enabled with the calling function enabled fault mitigations and CFG_CORE_FAULT_MITIGATION is 'y'. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds fault mitigations in crypto_acipher_rsassa_verify() and dependent functions in libTomCrypt in order to include the critical final memcompare. This fault mitigation is only enabled with the calling function enabled fault mitigations and CFG_CORE_FAULT_MITIGATION is 'y'. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
…a_verify() Adds a stubbed fault mitigation for the drivers version of crypto_acipher_rsassa_verify). End the function with FTMN_CALLEE_DONE() to record that the function was indeed called and a redundant copy of the return value. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds fault mitigations to shdr_verify_signature() and shdr_verify_signature2(). shdr_verify_signature() and shdr_verify_signature2() are called using the wrapper FTMN_CALL_FUNC() which verifies that the correct function was called and that the return value hasn't been tampered with. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds and enables fault mitigation in buf_ta_open() to check both the signature of the TA and then also the hash of the TA before returning success. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Adds and enables fault mitigation in ree_fs_ta_open() to check the signature of the TA before returning success. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
6a93132
to
705df78
Compare
Squashed and tags applied. |
LGTM. |
Tag applied. |
size_t hash_size = 0; | ||
size_t hash_algo = 0; | ||
|
||
if (shdr->magic != SHDR_MAGIC) | ||
return TEE_ERROR_SECURITY; | ||
goto err; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to 'goto err' on OP-TEE 3.14 and ran 'xtest -t regression 1008' and encountered an 'unhandled pageable abort' error. After analysis, I think it is because crypto_acipher_alloc_rsa_public_key is not executed but crypto_acipher_free_rsa_public_key is executed. Not sure if the latest version is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the (latest) implementation, I think it currently works fine as key
is 0 initialized and all public key free possible callback handles (ltc, mbedtls, versal, caam and se050 do behave nicely in such context. Yet, I may have missed something and all in one, it looks a bit fragile and I think you're right, shdr_verify_signature()
exit error sequence should not always call crypto_acipher_free_rsa_public_key()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is really because the key was not initialized to 0 on the old version. After initialization, all xtests passed. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For info: looking into struct rsa_keypair
management, few other places assume it is safe to call crypto_acipher_free_rsa_public_key()
on a key ref that was zero initialized and not allocated or partially allocated, for example sw_crypto_acipher_alloc_rsa_keypair()
in lib/libtomcrypt/rsa.c and lib/libmbedtls/core/rsa.c. So contrary to above my comment, current latest implementation is fine.
This is #5247 rebased and updated.
The use case is to make
buf_ta_open()
more resilient to fault injection attacks on the hardware, more specifically glitching attacks. The basic assumption is that glitch can cause one or more instructions in sequence to act as nops when executed. Anything can happen of course but I'm using this approximation to have something more concrete to work with.The fault mitigations are supposed to add zero overhead unless enabled and if enabled have a modest overhead.
In this PR I've made fault mitigations enabled by default with
CFG_CORE_FAULT_MITIGATION?=y
inmk/config.mk
. Should it be disabled by default instead?