-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Bug: using EncryptedSharedPreferences, it can cause crashes to users right when initializing it #535
Comments
any comment about this ? I detected this issue occurs in ( Not on all devices ):
|
Do you have a stack trace? How frequently has this happened? |
@thaidn Check the original post. It has enough. |
Fatal Exception: java.security.KeyStoreException: the master key android-keystore://androidx_security_master_key exists but is unusable |
Do we have any fix or workaround for this crash already? Currently we have on Firebase 23 reports of this exception:
The exceptions occur in these devices, mainly in Android 7 (57%) and 6 (26%): Huawei Honor 8 Currently we are catching the exception, but by doing so, every method invoked to the EncryptedSharedPreferences instance afterwards will throw an NPE. |
This is an incredibly frustrating issue and makes the latest versions of this library essentially useless. The workarounds I found were -
to your Application in AndroidManifest.xml. This prevents your encrypted files from getting backed up - EncryptedSharedPreferences can't open them on reinstall anyway.
One or the other may be enough, but this nasty issue popped up for me the day-of our first production release, so I threw the whole kitchen sink at it. As far as I can tell, the issue hasn't been seen since. One other alternative I considered was just not using EncryptedSharedPreferences at all and just falling back to good ol' SharedPreferences. Obviously you lose the entire point of the library, but at least it appears to work more often than not. |
@jarrodrobins Wait, can I avoid the flags of backup, and use the alpha01 and it should work fine? All of the crashes I've reported won't occur again? |
Wait, when you switched to 1.1.0-alpha01, the problem went away? |
We've had the app with allowBackUp as false for a long time before the crash showed up so that's not going to solve the problem. These are very rare cases though and I'm tented to propose showing some dialog to the user explaining na critical error occurred. Because it is a critical error... |
That was my experience with 1.1.0-alpha01 (NOT 02 or 03), yes. It may not be necessary to disable backups, but a number of other posts (eg StackOverflow) referenced that as a possible workaround. The 'testing' I was doing involved finding a teammate who was able to replicate the issue and having them install a bunch of different test builds changing various bits and pieces as it was not something I could replicate locally. The working build for them involved both 'fixes' but yes, it's possible changing it to alpha01 is enough. I implemented this workaround on my project a while ago so my memory is a little hazy on every single test variant I tried, so you might want to double check on your end if you want to try downgrading alone. This post above references the backup fix. One person in there said alpha01 alone didn't solve their issue, for what it's worth. |
The thing is, like I said, it's a very rare occasion, you could go for a long time without having any cases thinking you solved the issue. |
While getting it to occur is very rare, once it does, it occurs 100% of the time following that. Like I said, I couldn't get it to occur locally, and relied on a user to help me work around it. All I know is those two fixes stopped that user from getting the crash (after getting them to install at least five other intermediate builds trying different things). |
But if you don't have the backup flags, would it still occur on alpha 02 ? |
Yes... I wonder if someone over at Google could give us a hint. |
¯_(ツ)_/¯ When I was testing it, alpha02 still caused the crash without the backup flags. But you may want to test it yourself if you're able to replicate it. Ideally Google would just fix the damn thing, considering it's been a problem for at least a year that I'm aware of. |
Ideally yeah. I mean, we use their IDE everyday and they have all our data... |
This is sorta expected. First of all, I apologize for causing so many pains and frustrations. I made a wrong decision trusting Android Keystore which happens to be super unreliable, especially on exotic devices. Let me explain what I think is happening, why there's little Google can do even though I'd spent days on this, and possible workarounds with caveats. Background & root causeThe dependency chain is: Users -> Jetpack Security -> Tink -> Android Keystore -> OEM firmware/hardware. I work on Tink and contributed to Jetpack Security. I think the root cause of the crashes is neither in Tink nor Jetpack, but in Android Keystore and OEM firmware/hardware. Shared prefs are encrypted with a Tink keyset (which is a protobuf). The keyset is encrypted with a master key stored in Android Keystore. The encrypted keyset itself is stored as a special value in the encrypted shared prefs file. We've found that Android Keystore occasionally corrupts the master key on certain devices. We don't know why, we think it could be due to faulty OEM firmware/hardware. When the master key is corrupted, Tink won't be able to decrypt, and return the error WorkaroundThere are two ways to recover from a corrupted master key: 1/ If you don't have any existing encrypted prefs, you can delete the master key (using deleteEntry()), and try again. 2/ If you have existing prefs, it's very likely that they will be lost forever. I'm so sorry! To prevent further crashes, you can:
I'll talk to the Jetpack team to see if we can add a function to do this for you. In the meantime, you have to delete these things by yourself. For example, if you use the following code:
Then you want to delete the file |
Thanks for the really solid explanation @thaidn. I appreciate that this issue is across several teams and is hard for you guys to track down. I'll give these workarounds a try. Thank you! |
@thaidn As you think that it's a hardware issue, can you please tell Google to add a test to licensing devices (I think it's called CTS), so that at least for new devices, this issue won't occur? |
Thanks for your time and information. I got a question though: to delete the master key would this code suffice?
|
You have to load the Android KeyStore.
|
Hmmm. That can also crash 🤔 Oh well, it's our best bet anyway. Thanks for the help again. |
I'm using the following code in an attempt to reset the master key and all (the shared prefs file is also correctly being deleted, according to the log message):
But after running this, the next call to EncryptedSharedPreferences.create() fails with that protobuf exception you mentioned:
Is there anything else I'm missing here? |
I could be wrong @Tharkius but i did something like this and I no longer get that exception.
|
Doesn't deleting the data ruin how the app might work, though ? |
There's not much point in keeping data you can't decrypt anymore... |
So it's not a workaround to still use what you had before. It's resetting of the data you've saved, just to avoid a crash, no? Suppose this is the code I use:
What exactly should I change here for this workaround? |
I had the same issue even with the To reproduce this error, you can make some loops with new threads doing operations with the encrypted preferences. |
Can you provide a stack trace? From your description, it seems to be a bug in EncryptedSharedPreferences itself, and not in AndroidKeysetManager. |
same problem here |
This has worked for the majority of cases where there was an issue on phones that had issue when updating secuirty-cypto versions, but a few months ago I started seeing new cases where we get |
I am also geeting same carsh on firebase.
I am using androidx.security:security-crypto:1.1.0-alpha06 compile and targetSdk. = 34
|
That you now may get a android.security.KeyStoreException instead of a InvalidProtocolBufferException is expected, I changed that in 592c2eb. This happens if the master key stored in android keystore is not able to decrypt the encrypted keyset used by EncryptedSharedPreferences. The question is, why did you get into this state? There are two possibilities: I think it is more likely that you are in situation b). This may have happen if you deleted the master key from keystore. The next time a "MasterKey" object is built, it will automatically create a new master key in keystore, which will not be able to decrypt the keyset encrypted with the deleted master key. Another possibility is that there is a race-condition, where two threads at the same time create an EncryptedSharedPreferences. Both threads see that there is not yet a master key, and both create a new one, and in the end the encrypted keyset is not from the master key in keystore. Looking at the code, I don't think this race-condtion may happen if the EncryptedSharedPreferences is always created with the "MasterKey" object, since the key will always be created by the same function, and it uses a lock, see If, however, EncryptedSharedPreferences is created once with the MasterKey object in one thread, and in a different thread using the deprecated "create" function that uses a key alias, then there may be a race condition because the deprecated create function will let the AndroidKeysetManager create the master key using a different lock. We can't really fix this race condition without breaking the API. So please don't use the deprecated create function. If there are other race conditions in this code that I don't know, I'd need a code example that shows the problem. One way to make sure that you don't run into such a race-condition is to create a MasterKey object at the startup of your application, before you start any other threads. Then this MasterKey object will create the master key in keystore, and it will not be overwritten by any other thread. |
I have updated the sdk code to 34 and update cryto lib from alpha-3 to alpha-6. Then this crash is happening. |
As I said, there are many reasons why you can't decrypt the master key. Without a detailed description on how we can reproduce the problem, there is nothing we can do. The Tink Java library is now hosted on https://github.com/tink-crypto/tink-java. Please open an issue there if you have a way to reproduce the problem. |
Access to the master key that is used to wrap Ed25519 keys relies on Android Keystore, Google Tink and androidx.security.crypto. We are using alpha versions of the latter library, and it is notoriously flaky (see e.g. tink-crypto/tink#535). The main issue that caused me to go over this class was that on newly created API 30 devices, a Ed25519 key with device lock protection could never be used, even if they had just been created (i.e. the newly created master key was probably not corrupted). I realized that *something* had changed so that we now need to authenticate in order to just create the private key object. This has not been the case in my previous tests; authentication was not required until we tried to _use_ they key to do something (in our case, sign an arbitrary string). Since we need to authenticate in more situations than before, I added a function which can be called whenever we catch a UserNotAuthenticatedException. Concrete changes: - Upgrade androidx.security.crypto from 1.1.0-alpha06 (which relies on a newer version of Google Tink, which is hopefully more stable). Note that this requires compiling against SDK >= 33, but I believe that should be fine. - When preparing to use a Ed25519 key for signing, use a signing algorithm which actually works with those keys. This has been causing an InvalidKeyException during each use of a Ed25519 for a long time. - Catch all UserNotAuthenticatedExceptions and try biometric auth when they occur. - Remove the mustAuthenticate attribute, as it becomes useless now that we rely on catching UserNotAuthenticatedExceptions. - Make privateKeyLoadAttempts a class attribute which can be updated from many places. - Don't swallow and silently log all other exception types when failing to use a key for signing. We want to know when (yes, when, not if...) this code breaks again. - Clean up duplicated RequiresApi annotations.
Access to the master key that is used to wrap Ed25519 keys relies on Android Keystore, Google Tink and androidx.security.crypto. We are using alpha versions of the latter library, and it is notoriously flaky (see e.g. tink-crypto/tink#535). The main issue that caused me to go over this class was that on newly created API 30 devices, a Ed25519 key with device lock protection could never be used, even if they had just been created (i.e. the newly created master key was probably not corrupted). I realized that *something* had changed so that we now need to authenticate in order to just create the private key object. This has not been the case in my previous tests; authentication was not required until we tried to _use_ they key to do something (in our case, sign an arbitrary string). Since we need to authenticate in more situations than before, I added a function which can be called whenever we catch a UserNotAuthenticatedException. Concrete changes: - Upgrade androidx.security.crypto from 1.1.0-alpha06 (which relies on a newer version of Google Tink, which is hopefully more stable). Note that this requires compiling against SDK >= 33, but I believe that should be fine. - When preparing to use a Ed25519 key for signing, use a signing algorithm which actually works with those keys. This has been causing an InvalidKeyException during each use of a Ed25519 for a long time. - Catch all UserNotAuthenticatedExceptions and try biometric auth when they occur. - Remove the mustAuthenticate attribute, as it becomes useless now that we rely on catching UserNotAuthenticatedExceptions. - Make privateKeyLoadAttempts a class attribute which can be updated from many places. - Don't swallow and silently log all other exception types when failing to use a key for signing. We want to know when (yes, when, not if...) this code breaks again. - Clean up duplicated RequiresApi annotations.
Feels like this will help many fellow wanderers investigating their AEADBadTagException |
This:
https://issuetracker.google.com/issues/176215143
I was told I should write about this here:
https://issuetracker.google.com/issues/185219606#comment4
The text was updated successfully, but these errors were encountered: