WIP: RFC: Create type SecureString #24738
Note that this PR also addresses the same issue that was addressed in #24731. The solution proposed here is definitely more developer friendly and contains fewer gotchas when developing new code around secure strings.
base/strings/types.jl
Outdated
function SecureString(str::AbstractString)
    s = new(str)
    finalizer(securezero!, s)
I don't understand a design based on finalizers, since finalizers may take a long time to call. Don't you need to zero the data as soon as it falls out of scope?
Best practice would be to securely zero the data as soon as it is no longer in use. The finalizer is used mostly as a fail-safe to ensure that at the very least the data is zeroed when the instance is garbage collected.
I'll make sure to mention this in the docstring for SecureString
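The fail-safe pattern discussed above can be sketched roughly as follows. This is a minimal illustration rather than the PR's code: `WipedBuffer` and `wipe!` are hypothetical names, the data is held in a `Vector{UInt8}` so that zeroing never touches immutable string memory, and `Base.securezero!` does the zeroing.

```julia
# Hypothetical illustration; `WipedBuffer` and `wipe!` are not names from the PR.
mutable struct WipedBuffer
    data::Vector{UInt8}
    function WipedBuffer(bytes::Vector{UInt8})
        s = new(bytes)         # takes ownership of `bytes` (no extra copy of the secret)
        finalizer(wipe!, s)    # fail-safe: zero the data whenever `s` is collected
        return s
    end
end

# Zero the buffer in place; meant to be called explicitly once the secret is no longer needed.
wipe!(s::WipedBuffer) = (Base.securezero!(s.data); s)

buf = WipedBuffer(collect(codeunits("hunter2")))
# ... use buf.data ...
wipe!(buf)   # explicit wipe; the finalizer later finds only zeros
```

The explicit `wipe!` call bounds how long the secret stays in memory; the finalizer only covers the case where that call was forgotten, and it runs at an unspecified later time.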
base/strings/types.jl
Outdated
```
"""
mutable struct SecureString <: AbstractString
    string::String
Vector{UInt8}?
Yes, I need to switch this to use mutable memory.
One nice thing I have noticed about using `String` is that when doing `SecureString("password")` the string literal memory will also be wiped out when running `securezero!(::SecureString)`.
The above is also true when using `Vector{UInt8}` internally as vector can access the memory of the original `String`.
> The above is also true when using `Vector{UInt8}` internally as vector can access the memory of the original `String`

Yes, but (1) you're not supposed to use such a vector to mutate a string, (2) we plan to fix this to at least make it much harder to mutate a string via a vector.
Wiping string literals is definitely not a good idea. It basically breaks your program. If secure data is put in a string literal there's nothing we can do about it.
After giving this some more thought I think there are two approaches we can take when dealing with secure data:
The main issue with explicitly copying and wiping data is that it is error prone: it is easy to forget to wipe the data. Additionally, making duplicate copies of secure data seems like a bad solution as it increases the chances of exposure. The disadvantage of relying on finalizers is that they may take a long time to be called. This could be mitigated with some kind of reference counting approach.
My end goal here is to find an approach where we can have secure string data without having it be harder to use than any other |
I just think that if you're worried about secure data being in memory for too long, then a finalizer-based approach is inherently insufficient. See also #11207. |
I did some experimentation with trying to implement reference count and some nice results with using |
The current implementation is not dissimilar to just using
I've also introduced a new exported function |
Again, I'm skeptical that anyone who needs this functionality should be relying on finalizers. |
The finalizer is just used as a failsafe. Best practice is to use |
The name |
The name |
I can't tell from a quick perusal if this is the case, but I do like the approach here: having a separate type allows us to limit the behaviors that one can do with this type – such as conversion and copying. Just using To that end, I wonder if this shouldn't be a completely opaque data type instead of a subtype of

    struct Secret
        data::Vector{UInt8}
    end

It would have no methods except constructors and |
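One way to read the opaque-type suggestion is sketched below. Everything beyond the `data::Vector{UInt8}` field is an assumption made for illustration: the type name `OpaqueSecret`, the string constructor, the masked `show` method, and the `shred!` function are hypothetical and not taken from the PR.

```julia
# Hypothetical sketch of an opaque secret type; all names here are illustrative.
struct OpaqueSecret
    data::Vector{UInt8}
end

# Construct from a string by copying its bytes into mutable memory.
OpaqueSecret(s::AbstractString) = OpaqueSecret(collect(codeunits(s)))

# Never print the contents.
Base.show(io::IO, ::OpaqueSecret) = print(io, "OpaqueSecret(*******)")

# Zero the contents in place.
shred!(s::OpaqueSecret) = (Base.securezero!(s.data); s)

secret = OpaqueSecret("hunter2")
println(secret)   # prints the masked form, not the bytes
shred!(secret)
```

Because the type is not a subtype of `AbstractString`, generic string functions have no methods for it, so copies and conversions have to be requested explicitly instead of happening by accident.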
Overall I like this idea. I think we may want a few additional methods like |
I think that doing |
Given the importance of properly handling confidential data, another idea might be to require the user to explicitly shred the secret when they're done with it, and to complain if one is ever left un-shredded. From this perspective, cleaning up the memory "at some future point" with a finalizer feels like a convenience that could mask a bug – better to throw an error instead. (This also feels like a place where Rust-like ownership or |
I like that idea but I'm not sure if we can actually raise an error in a finalizer. Which task gets the error in that case? I also vaguely recall that printing in a finalizer is a problem. |
Would it make sense/be possible to use |
Currently, exceptions caught during finalization print something to stderr rather than throwing "normally" to the caller. |
Printing a warning seems like it might be good. Of course, if only the GC has a reference to a secret then it's pretty safe – after all, malicious code doesn't have a reference to it either. Sure, proactively "shredding" the data is better, but there's nothing that wrong with GC shredding it either. |
My current approach is to print a warning if the data has not been shredded and then proceed to shred the data. |
That sounds like the best option. |
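The "warn, then shred" behaviour agreed on above could look roughly like this. It is a sketch under assumptions, not the PR's implementation: `WarnedSecret`, `final_shred!`, and the warning text are hypothetical, and deferring the message to a task is one possible way to keep I/O out of the finalizer itself.

```julia
# Hypothetical names throughout; a sketch of "warn, then shred" on finalization.
mutable struct WarnedSecret
    data::Vector{UInt8}
    function WarnedSecret(bytes::Vector{UInt8})
        s = new(bytes)
        finalizer(final_shred!, s)
        return s
    end
end

isshredded(s::WarnedSecret) = all(iszero, s.data)
shred!(s::WarnedSecret) = (Base.securezero!(s.data); s)

function final_shred!(s::WarnedSecret)
    # Assumption: the warning is deferred to a task so the finalizer does no I/O directly.
    isshredded(s) || @async @warn "a WarnedSecret was shredded by the GC; call shred! explicitly"
    shred!(s)   # shred regardless, so the data never outlives the object
end
```

A secret that was shredded explicitly produces no warning, so the message only appears when the explicit call was missed.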
Sorry, didn't mean to close and reopen. |
I've done a bunch of work on this but I'm not sure I can have it ready for the feature freeze. The good news is that while implementing this I've uncovered and fixed some issues that I still need to make PRs for.
How would people feel if at the very least I try to get the rename of |
This is not a public API, right? If it's not public, you can always change it after the feature freeze. |
Bump on this, it would be nice to stop all the CI failures due to the runtime incorrectly overwriting immutable memory. |
Did most of the rebase. Will hopefully push something by tomorrow morning.
Originally used `isequal` to deal with `Nullable`
Finished the rebase. I've tested the LibGit2 usage of |
    return s
end

isshredded(s::SecureString) = sum(s.data) == 0
I think we want `all(s.data .== 0)`, as there's a (minute) possibility that the sum can overflow to precisely zero.
Or rather `all(iszero, s.data)`.
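The difference between the two checks can be illustrated with a small sketch. The free-standing `isshredded` method on raw bytes is a hypothetical stand-in for the method in the diff. Julia's `sum` widens small integer element types, which is why the overflow is only a minute possibility there; the wraparound itself is easy to see with plain `UInt8` addition.

```julia
# Hypothetical free-standing version operating on the raw bytes.
isshredded(data::AbstractVector{UInt8}) = all(iszero, data)

bytes = UInt8[0x80, 0x80]

wrapped = bytes[1] + bytes[2]   # UInt8 addition wraps around: 0x80 + 0x80 == 0x00
wrapped == 0x00                 # true, although neither byte is zero
isshredded(bytes)               # false: the element-wise check cannot be fooled by wraparound
```

`all(iszero, data)` also stops at the first non-zero byte and does not allocate the temporary array that `data .== 0` creates.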
I like the direction of this PR, but I think we basically want to get rid of lines 38-52 in With regards to the "right API" for dealing with these objects, I don't think having |
I'm in agreement with the calls for read and write APIs. In fact, I think this would be much easier to use safely as an IO object — perhaps |
Succeeded by: #27565 |
Part of the problem in #24731 is that when working with structures which try to securely wipe themselves when finalized you end up wiping the underlying data once the first struct has been finalized. In an ideal world we would only securely wipe the data once the last reference has been removed.
`SecureString` is a new type which allows other structures to reference secure data which will only be securely wiped once the `SecureString` is no longer referenced or explicitly wiped. The new string type also allows me to stop using `deepcopy` to work around unwanted `securezero!` calls, which avoids having to duplicate sensitive information.
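The aliasing problem described above can be reproduced in a few lines. The `Credential` type here is a hypothetical stand-in, not one of the structures touched by the PR; the point is only that two structs holding the same buffer are both affected when either one wipes it.

```julia
# Hypothetical credential type; not one of the structures from the PR.
struct Credential
    password::Vector{UInt8}
end

shared = collect(codeunits("hunter2"))
a = Credential(shared)
b = Credential(shared)          # `b` references the same bytes as `a`

Base.securezero!(a.password)    # wiping through `a` ...
all(iszero, b.password)         # ... also wipes what `b` sees
```

Working around this with `deepcopy` gives each struct its own bytes, at the cost of more copies of the secret in memory, which is the trade-off the PR description refers to.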