[RFC] Raising PBKDF2 iterations, password strength and StatiCrypt security model #159
Comments
Wonderful information, thank you for taking the time to write this up, great job. Going to take me some time for my old brain to marinate on this, but going to study every word in class next week.
More profiling on the time it takes to compute PBKDF2 on my laptop with a decent CPU (Ryzen 7 4750U). Using Crypto-JS on the decrypt page and averaging two runs for each (for backward-compatibility reasons, we still do 1k iterations based on SHA1, then Xk iterations based on SHA256):
Testing with WebCrypto on Bitwarden help page (which is actually doing more than just computing PBKDF2):
So WebCrypto is indeed much faster. I think it'd be good for the encrypted page to be accessible even on low-end machines. Bitwarden places the bar at 600k iterations, which takes 160ms on my computer. The same time budget would be just a few thousand iterations with Crypto-JS - I think we can stretch things a bit and go to 15k total. That would be ~320ms on my laptop; if we assume a low-end machine is 4 times slower, that's about 1.3s to decrypt. Not great, but not inaccessible, plus if you use the "Remember-me" feature you only need to compute it once. And it still adds an order of magnitude to the brute-forcing difficulty, roughly equivalent to adding a random digit to the password.

This trade-off is made more acceptable by the next step: moving to WebCrypto asap. This will allow raising PBKDF2 to 600k iterations. Thanks to the way the [...] So my current plan is to:
This addresses the concerns and plan detailed in #159
I've just released 1.4.3, increasing the v1 iteration count to 15k - with this, all proposed changes outlined here should be complete. I'm now closing this issue. Thank you for reading or participating in the reflection!
This issue describes my current understanding of PBKDF2 and the security model of StatiCrypt, and why a concern about the PBKDF2 iteration count in StatiCrypt was raised last weekend. This is my current opinion, and might change later.
As I'm writing it, it looks like this is going to be quite long - like, lengthy blog post long. Sorry, it's just a really interesting topic! I tried to express things clearly and keep it practical, and I hope it's somewhat pleasant to read. If you get bored by the theory, you can jump straight to the end section, What can StatiCrypt do about this?
Issue status and commenting: I'm opening this issue to clarify what I understand on the topic for myself, talk about StatiCrypt security openly, and get feedback from whoever reads it. I'm not saying this is definitely right or correct, and if you think I'm wrong or short-sighted somewhere I'd love to learn more - that's the point of the issue, please feel free to reply! If you have different ideas about how StatiCrypt should go forward, you can share them too (with arguments so it's efficient). Standard internet commenting rules, be kind and be constructive, apply as always.
What's being reported
StatiCrypt works by using PBKDF2, a key derivation function built on a hash, to hash the password a number of times before using the result as the key to encrypt the file. This makes the encryption key exactly 256 bits long so it can be used in AES-256, and repeating the hashing slows down brute-force attacks: deriving the encryption key from the password becomes slow, so someone trying to find a working decryption key spends more time generating candidates, and the cost of the attack rises. I understand this as a standard way of doing things.
The security concern here, which a few other people raised this past weekend (StatiCrypt was shared on HN last Saturday), is that the iteration count of the hashing function is lower than what's typically recommended. It's currently at 1000, but the number recommended by OWASP is now 600k for SHA-256 based PBKDF2 and 1300k for SHA1 - more on these numbers and whether they're appropriate later. The recommended number of iterations is regularly raised to keep up with hardware improvements. Looking into the crypto-js code, since it's not mentioned in their doc, it looks like the default underlying hashing algorithm is SHA1.
The consequence is that it's easier (= less expensive) to brute-force the password than it could be.
Password, brute-forcing and threat modeling
So what's the impact? It depends on your threat model and chosen password. Because in StatiCrypt the encrypted data is very often public, an attacker can easily get a copy and try to crack it offline, and you have no way of changing your password or restricting access. The whole security strategy relies on making brute-forcing more expensive than what the people you want to hide your file from are ready to invest.
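There are two opposing factors in balance here:
1. how much an attacker is ready to invest in brute-forcing your file
2. how expensive it is to brute-force it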
You can influence 1. by choosing the threat you want to protect against (eg: "random people on the internet ready to pay a couple hundred $ on cloud GPUs for computing hashes") and not protect against (eg: "I don't care if nation-states can read this file").
You can influence 2. by picking a strong password. StatiCrypt can help make it stronger by setting the PBKDF2 iterations high.
How much do PBKDF2 iterations matter?
This recent article on PBKDF2 iterations, which I really recommend, explains that PBKDF2 iterations are a helping factor but that the password entropy is what matters most. Quoting from it:
Why does the password matter more than PBKDF2 iterations? If your password is a random string of alphanum characters + 8 symbols (`$!@...`), each character is drawn from `26*2 (letters upper/lowercase) + 10 (digits) + 8 (symbols)`, which equals 70 options. Each time you add another character you multiply the number of passwords an attacker has to try by 70 (about 6.1 more bits of entropy).

PBKDF2 iterations, on the other hand, stack roughly linearly - doubling your number of iterations makes the hashing, and so the brute-forcing, take twice as long. So going from 1k iterations to 600k makes the cracking time 600 times longer, equivalent to having 1.5 more characters in your password. [1] (If I'm wrong here, I'd really appreciate someone letting me know!)
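The comparison above can be checked with a few lines of arithmetic. This is a minimal sketch in Python (not StatiCrypt code); the function name `equivalent_extra_chars` is made up for illustration, and it assumes every password character is drawn uniformly at random from the 70-symbol alphabet described above.

```python
import math

def equivalent_extra_chars(old_iterations: int, new_iterations: int,
                           alphabet_size: int = 70) -> float:
    """How many random characters give the same slowdown as raising iterations.

    Assumes brute-force cost scales linearly with the iteration count and that
    each extra random character multiplies the search space by alphabet_size.
    """
    work_factor = new_iterations / old_iterations
    return math.log(work_factor) / math.log(alphabet_size)

# Going from 1k to 600k iterations is a 600x work factor: ln(600)/ln(70).
print(round(equivalent_extra_chars(1_000, 600_000), 2))
```

Changing `alphabet_size` shows how the trade-off shifts for other character sets: with lowercase letters only (26), the same 600x factor is worth roughly two characters.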
If you already have a really strong, long password, you still have a strong password (probably still really strong) relative to what it would be in another tool doing 600k iterations. Keep in mind we can only compare relative strategies, there is no absolute.
What about weak or medium-strength passwords?
This is where PBKDF2 is most useful - it gives the password a baseline resistance against brute-forcing. In a regular context, people's passwords are often pretty bad, meaning their entropy is low, meaning they are picked from a small possible space (so not a long string of random characters). Using PBKDF2 can add the equivalent of a few more random characters to the password, pushing it into the realm of OK passwords, the meaning of "OK" depending on the threat model.
Regarding resistance to brute-forcing, going from 1 iteration to 1k is like adding 1.6 alphanum+symbols characters to a password, and going from 1 iteration to 600k is like adding 3.1 characters. Wikipedia cites a 2007 study of 500k users saying the average password entropy is 40.5 bits, or ~6.6 alphanum+symbols characters. Adding 3.1 characters to that is pretty useful!
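To make the conversions between bits, characters and iterations explicit, here is a small sketch in Python. The helper names are hypothetical, and it assumes a 70-symbol alphabet and that n iterations cost an attacker the same as log2(n) extra bits of password entropy.

```python
import math

ALPHABET_SIZE = 70                        # alphanum + 8 symbols, as above
BITS_PER_CHAR = math.log2(ALPHABET_SIZE)  # a little over 6 bits per character

def bits_to_chars(bits: float) -> float:
    """Express an entropy in bits as a number of random characters."""
    return bits / BITS_PER_CHAR

def iteration_bits(iterations: int) -> float:
    """Bits of brute-force work added by running this many iterations."""
    return math.log2(iterations)

average_password_bits = 40.5  # figure from the study cited above
for count in (1_000, 600_000):
    added = iteration_bits(count)
    print(f"{count} iterations: +{added:.1f} bits "
          f"(~{bits_to_chars(added):.1f} chars), "
          f"total {average_password_bits + added:.1f} bits")
```

Even with 600k iterations, the average password from that study stays below 60 bits, which is why the password itself remains the main lever.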
So why not just do 10 billion billion iterations and call it a day? PBKDF2 iterations have a drawback that strong passwords do not: they take time. Strong passwords make brute-forcing more difficult by increasing the space of possible passwords an attacker has to try. PBKDF2 makes it more difficult by increasing the time required to try a single password.
A legitimate user knows their password, so the password space they have to try has a size of 1 and a longer password doesn't delay them when they decrypt the file. But they will have to run PBKDF2 on that single password: running 10 billion billion iterations even once would take forever for them too. So you can only pick the highest number of iterations that keeps the time to decrypt acceptable for a legitimate user.
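The asymmetry between the attacker and the legitimate user can be written down as a toy cost model. This is a sketch under simplifying assumptions (uniformly random passwords, an attacker who finds the password after searching half the space on average, a hypothetical `parallel_guessers` parameter standing in for attacker hardware), not a real attack estimate.

```python
def attacker_expected_seconds(entropy_bits: float, seconds_per_guess: float,
                              parallel_guessers: int = 1) -> float:
    """Expected brute-force time: half the password space, one KDF run per guess."""
    expected_guesses = 2 ** entropy_bits / 2
    return expected_guesses * seconds_per_guess / parallel_guessers

def legit_user_seconds(seconds_per_guess: float) -> float:
    """The legitimate user knows the password, so they run the KDF exactly once."""
    return seconds_per_guess

# Doubling the per-guess time doubles the cost for both sides,
# while one extra bit of password entropy only doubles it for the attacker.
base = attacker_expected_seconds(40, 0.3)
print(attacker_expected_seconds(40, 0.6) / base)  # slower KDF
print(attacker_expected_seconds(41, 0.3) / base)  # stronger password
print(legit_user_seconds(0.6) / legit_user_seconds(0.3))
```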
This limits how many iterations you can add, and how much entropy you can add through PBKDF2 (and is why Neil Madden writes "The point of this is not to convince you to increase your PBKDF2 iterations. [It] is to point out that there is no sane parameters for password hashing that provide anything like the security levels expected in modern cryptography", meaning that no setting can turn a bad password into a strong cryptographic key).
That's why having a really strong password has to be at the center of the strategy for StatiCrypt.
What can StatiCrypt do about this?
Finally, the practical part for this repo! What's the impact in the context of StatiCrypt, and what can we do better?
Strategy 1: better passwords
As we've seen, having a really strong password is the crux of the matter, so I think this should be the main strategy.
We try to make it quite clear that brute-forcing is very easy to attempt and that people should use a "long, unusual passphrase", as is mentioned multiple times in the README. I hope we can assume that StatiCrypt's context therefore isn't the same as the general web, and that people are using stronger passwords than the general public. I don't know by how much.
We can also do a better job. My general philosophy is that StatiCrypt as a tool should treat users as adults and not completely forbid a short password, but we can nudge people towards having a strong one.
My current ideas are:
displaying a warning (on the CLI and the web) when someone enters what looks like a weak password. Something like
The weak password detection could depend on the type of password - 18 characters for lowercase alpha, 14 for alphanum+symbols for example.
updating the README to give more indication on what's a strong password for StatiCrypt
prompting for confirmation when using a short password, the same as above but adding
This is a breaking change, so it will have to wait for StatiCrypt's next major version.
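As an illustration of the weak-password detection idea in the list above, here is a rough Python sketch. Everything in it is an assumption for illustration, not StatiCrypt's actual check: the function names, the character classes, and the 84-bit threshold, which was picked only because it matches both example lengths (18 lowercase characters or 14 alphanum+symbols characters).

```python
import math

# Hypothetical threshold: 18 * log2(26) and 14 * log2(70) both land just above it.
MIN_ESTIMATED_BITS = 84

def estimated_bits(password: str) -> float:
    """Upper-bound entropy estimate from length and character classes used.

    This assumes characters are chosen at random; a dictionary word or a
    repeated pattern scores far better here than it deserves.
    """
    alphabet = 0
    if any(c.islower() for c in password):
        alphabet += 26
    if any(c.isupper() for c in password):
        alphabet += 26
    if any(c.isdigit() for c in password):
        alphabet += 10
    if any(not c.isalnum() for c in password):
        alphabet += 8  # the 8 symbols assumed earlier in this issue
    if alphabet == 0:
        return 0.0
    return len(password) * math.log2(alphabet)

def looks_weak(password: str) -> bool:
    """True when the password should trigger a warning."""
    return estimated_bits(password) < MIN_ESTIMATED_BITS

print(looks_weak("abcdefghijklmnopq"))   # 17 lowercase characters
print(looks_weak("abcdefghijklmnopqr"))  # 18 lowercase characters
```

A length-and-class heuristic like this only catches short passwords. It can't tell a random string from a predictable one, so it works as a nudge rather than a guarantee.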
Strategy 2: higher PBKDF2 iterations
To provide more defenses to users who still use a weak password, I think it makes sense to run PBKDF2 with a higher number of iterations. The limit is to keep encrypt/decrypt time acceptable. The problem is we run it in a JS implementation, which is pretty slow, and that we run it:
We could say that it's ok to take a while running the JS implementation at encryption time because it's done once and you could adapt your workflow (it's currently done once for each file you encrypt, but I've been wanting to support encrypting multiple files natively for a while and this would fix this).
But we don't have any information on the type of browser that will decrypt the file and the machine it runs on. So I think we have to be pretty conservative.
The good thing is when using the remember-me feature the hashed password is stored, so we don't have to recompute PBKDF2 each time. Same with the auto-decrypt links.
I need to run some more tests to see how high we can raise the iteration count without making decrypt time unacceptable. From what I saw 10k is probably doable, 50k maybe, 100k I'm not sure, 600k definitely not. We're limited by the JS implementation's speed - maybe switching to WebCrypto in the next major version will allow raising that count.
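A timing test of this kind can be sketched as below. Note the swap: this uses Python's `hashlib.pbkdf2_hmac`, a native implementation, so the absolute numbers will be much closer to WebCrypto than to a pure-JS library like crypto-js. It only illustrates the measurement approach and the roughly linear scaling with the iteration count. The password and salt are placeholders.

```python
import hashlib
import time

def time_pbkdf2(iterations: int, repeats: int = 2) -> float:
    """Average wall-clock seconds to derive one 256-bit key with PBKDF2-HMAC-SHA256."""
    password = b"long, unusual passphrase"  # placeholder value
    salt = b"0123456789abcdef"              # placeholder value
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

if __name__ == "__main__":
    for count in (1_000, 10_000, 50_000, 100_000, 600_000):
        print(f"{count:>7} iterations: {time_pbkdf2(count) * 1000:.1f} ms")
```

To decide on a count for the browser, the same loop would need to run on the decrypt page itself, ideally on a low-end device, since that is the environment that sets the limit.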
How to make these changes
Updating the README + adding a warning on a weak password can be done immediately without breaking anything.
Naively using more iterations on PBKDF2 is a breaking change, though. The way to generate the key for encryption/decryption changes, so the previously saved hashes (in remember-me and auto-decrypt links) won't work. It's possible to circumvent that by applying PBKDF2 to the hash itself - doing `pbkdf2(pbkdf2("password", 1000), 599000)` (taking 600k as an example) results in roughly the same thing as doing 600k in one go security-wise, and previously saved hashes can still unlock the file.

But this means the decryption logic has to change to accommodate the new iteration count. This is a breaking change for people using a custom `password_template.html` file - if we just increase the iteration count, this will break their website. Luckily, we can sneak around this for people who are using a version of the template higher than 2.2.0, released in November 2022, where we moved the encrypting/decrypting code out of the template, so it can be updated without touching the password template.

Based on all of this conversation, my current planned strategy is:
SECURITY WARNING
asking them to update (we could try to edit their custom password template on the fly, but it seems pretty brittle)

The other options I see are either: doing a major version bump to signify a breaking change (but a number of people might not upgrade), or a patch update where we encrypt with higher PBKDF2 iterations for everyone even when they use a custom password template, but this means some people will have their website break and we also risk them just rolling back to the last version that was working for them.
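For reference, the chained derivation mentioned earlier in this section (applying PBKDF2 again to the already-stored hash) can be sketched as follows. This is Python with `hashlib`, not StatiCrypt's JavaScript. The 1k SHA-1 + 14k SHA-256 split follows the numbers discussed in this thread, while the function names, the salt handling, and passing the intermediate hash as raw bytes are assumptions made for illustration.

```python
import hashlib

LEGACY_ITERATIONS = 1_000   # original count, SHA-1 based
EXTRA_ITERATIONS = 14_000   # added rounds, SHA-256 based (15k total)
KEY_BYTES = 32              # 256-bit key for AES-256

def legacy_hash(password: str, salt: bytes) -> bytes:
    """What older versions stored for remember-me and auto-decrypt links."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt,
                               LEGACY_ITERATIONS, dklen=KEY_BYTES)

def strengthen(stored_hash: bytes, salt: bytes) -> bytes:
    """Second stage: run the extra iterations on the legacy hash itself."""
    return hashlib.pbkdf2_hmac("sha256", stored_hash, salt,
                               EXTRA_ITERATIONS, dklen=KEY_BYTES)

def key_from_password(password: str, salt: bytes) -> bytes:
    """Full derivation for a user typing their password."""
    return strengthen(legacy_hash(password, salt), salt)

salt = b"example-salt-123"  # placeholder value
saved = legacy_hash("long, unusual passphrase", salt)  # e.g. from an old link
# A previously saved hash reaches the same key as a freshly typed password.
print(strengthen(saved, salt) == key_from_password("long, unusual passphrase", salt))
```

The point of the design is that anything saved under the old scheme is a valid input to the second stage, so old remember-me values and links keep working while new encryptions get the higher work factor.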
In this proposed solution, the people exposed to a security risk after the patch are the subset of users who use both a custom `password_template.html` created before 2.2 and a weak password. I would hope most of them used a weak password on StatiCrypt because they don't really need to protect it from any sort of motivated attacker. After a while, we can subtract from this subset the people who see the WARNING message and update. I don't see a way to reduce the impact more than that at the moment.

If you've read this far, thank you!
I'd love to get feedback and hear what you think about all this. If I've made mistakes or wrong assumptions, or if you would like to see a different course of action, please let me know. The conclusions are tentative and might change as I work on implementing these changes in practice. I have a PR almost ready and will open it soon.
Footnotes
1. `ln(600)/ln(70)`, random alphanum+symbols characters ↩