Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 27755 - Using the Subtle Crypto Interface with Streams #73

Open
mwatson2 opened this issue May 24, 2016 · 91 comments
Open

Bug 27755 - Using the Subtle Crypto Interface with Streams #73

mwatson2 opened this issue May 24, 2016 · 91 comments

Comments

@mwatson2
Copy link
Collaborator

Bug 27755:

Though the StreamsAPI is referenced in Informative Reference, the functions under window.crypto.subtle are specified with only one-shot data inputs.

Use-cases: Data may not be available at once. Data may be too huge to keep in memory.

For encrypt()/decrypt() it would make sense to have a streaming readable output if the input is a readable stream.

@jimsch
Copy link
Contributor

jimsch commented May 24, 2016

After listening to Ryan rage about the use of BER encoding for ASN.1 objects, I have a feeling that this should be closed as won't fix because it presents a security issue. When one looks at the encrypt/decrypt APIs for authenticated encryption, it is required that the entire stream be observed on the decrypt side and could be argued that it needs to be observed on the encrypt side prior to emitting the processed stream. This is due to the fact that if the decryption process does not validate then no output is to be produced for consumption. Allowing this to be done in a streaming fashion means that the browser potentially needs to have an infinite size buffer to hold the intermediate result to be returned to the client.

Similar issues hold for processing of signature values for the new X448 EdDSA algorithm where the message M is hashed twice. Allowing for an indefinite length input means that there are potential buffer overrun problems.

@feross
Copy link

feross commented May 24, 2016

Node.js has a streaming crypto API without any security issues:

const crypto = require('crypto');
const hash = crypto.createHash('sha256');

hash.update('some data to hash');
hash.update('more data');
hash.update('even more data');
console.log(hash.digest('hex'));

Why can't the web platform?

@indutny
Copy link

indutny commented May 24, 2016

I absolutely agree with @feross on this. Most (if not all) of the APIs can work in a streaming mode without any security issues. In fact, this is how these APIs are exposed in OpenSSL, so they always work in a streaming mode under the hood anyway, regardless of what high-level API may look like.

@jimsch
Copy link
Contributor

jimsch commented May 24, 2016

All of the current hash functions that I am familiar with will allow for streaming APIs because they are built using a Merkle–Damgård construction. This means that they are processed on a block by block basis. However there are algorithms for which this is not doable. For example, the EdDSA algorithm that I mentioned above computes:

R = fn( SHAKE256(dom(F, C) || prefix || M, 114) )
and then
k = SHAKE256(dom(F, C) || R || A || M, 114)

as you can see, you need to all of the message M to compute R before you can start doing the computation of k. This means that the entire message needs to be buffered unlike the hash example you gave above.

Note also the comment that I made on authenticated decryption where the entire message needs to be kept before doing the validation step at the end.

@indutny
Copy link

indutny commented May 24, 2016

@jimsch in your description SHAKE256 appears to be just a hashing function, most of the hashing functions support streaming input. There is nothing that could prevent one from creating two streaming SHAKE256 hashes and using their digests at the end of the stream to compute R and k.

Authenticated decryption should work as well, as far as I can tell... Though, the fact that the integrity is checked only at the end of decryption process means that the API will be kind of awkward. I don't think that there are much pros of using streams for authenticated decryption.

@jimsch
Copy link
Contributor

jimsch commented May 24, 2016

@indutny please re-read my previous post and look at the requirements to finish R before using M for k

@indutny
Copy link

indutny commented May 24, 2016

@jimsch oh I see it now. Sorry about that! Yeah, streaming won't work for this kind of encryption/decryption schemes indeed.

Still many hashes and ciphers work just fine with streams.

@tanx
Copy link

tanx commented May 24, 2016

A native streaming api would indeed be great. Our use-case would be large file encryption in OpenPGP.js.

@mwatson2
Copy link
Collaborator Author

If we address this, I think it will not be in this version since it requires substantial work.

@mwatson2 mwatson2 added this to the VNext milestone May 24, 2016
@hhalpin
Copy link

hhalpin commented Jun 20, 2016

I imagine we can close this as won't fix, but when streaming stabilizes we can then revisit as part of maintenance of the spec since as @jimsch correctly points out, it won't work for quite a few algorithms. We could also try to test to see if anyone supports streaming - any ideas?

@hhalpin
Copy link

hhalpin commented Jun 20, 2016

v.Next.

@evilaliv3
Copy link

Is there any update on this topic?

@roccomuso
Copy link

+1

@ericmackrodt
Copy link

If streaming/progressive encryption isn't implemented, it's going to hugely limit the scope of usage of the API. I really need that kind of functionality for the software I work on.

@neckaros
Copy link

neckaros commented May 9, 2017

+1!

@neckaros
Copy link

neckaros commented May 9, 2017

Privacy is a groing concern. Being able to decrypt locally without consuming too much memory is a must i think.
For exemple encrypt huge file locally as you send it to a server so the server never has the decrypted data.
It works well on nodejs

@alanwaketan
Copy link

I think digest maybe a good point to start with.

@thiccar
Copy link

thiccar commented Jul 14, 2017

+1000

@daviddias
Copy link

daviddias commented Aug 28, 2017

Hi all, bringing this issue back up. Any updates or recent discussion on it?

I believe that the security considerations do not hold and it what it promotes is for users to find other ways to encrypt their files as the usage of browsers to share large documents grows. Possibly by having to shim their own encryption streaming API which will be considerably slower than a native one through WebCrypto.

@JulianKlug
Copy link

+1

@johnozbay
Copy link

100% agreed with @ericmackrodt & @neckaros & @diasdavid. With GDPR on the horizon this would make things a lot more easier for European establishments.

@dead-claudia
Copy link

@jimsch By any chance, could a streaming API be provided for those encryption schemes that could be streamed? Just because it's not possible for some doesn't make it impossible for all (and there's different tradeoffs for each). And one good example of this is with client decryption of large files on mobile (only high end phones/tablets have the RAM available to reliably decrypt a 750MB video download in-memory).

@jimsch
Copy link
Contributor

jimsch commented Mar 27, 2018

It could, on the other hand there may be other things that could be done as well. For example one could do chunked encryption of large objects such as video which is designed to be streamed so that each chunk can be independently decrypted and streamed. The world is moving towards only using authenticated encrypted algorithms and doing streaming such as you suggest means that you are willing to use a decrypted stream that may have been corrupted w/o being able to detect this.

Additionally, one would need to get a group of people together at the W3C who are interested in doing an update to the document and then decide which algorithms could/should be streamable and which should not.

achingbrain added a commit to libp2p/js-libp2p that referenced this issue Jan 11, 2024
TLDR: the bundle size has been reduced by ~50KB

- parsing/creating PEM/pkix/pkcs1 files is now done by asn1.js
- Streaming AES-CTR ciphers are now in @libp2p/crypto-aes-ctr
- RSA encryption/decryption and PEM import/export are now in @libp2p/crypto-rsa

WebCrypto [doesn't support streaming ciphers](w3c/webcrypto#73).

We have a node-forge-backed shim that allows using streaming AES-CTR in browsers but we don't use it anywhere, so this has been split out into it's own module as `@libp2p/aes-ctr`.

This was added to `@libp2p/crypto` to [support webrtc-stardust](libp2p/js-libp2p-crypto#125 (comment)) but that effort didn't go anywhere and we don't use these methods anywhere else in the stack.

For reasons lost to the mists of time, we chose to require a padding algorithm that WebCrypto doesn't support so node-forge (or some other userland implemenation) will always be necessary in browsers, so these ops have been pull out into @libp2p/crypto-rsa which people can use if they need it.

This is now done by manipulating the asn1 structures directly.

The previous PEM import/export is also ported to `@libp2p/crypto-rsa` because it seems to handle more weird edge cases introduced by OpenSSL.

These could be handled in `@libp2p/crypto` eventually but for now it at least supports round-tripping it's own PEM files.

BREAKING CHANGE: Legacy RSA operations are now in @libp2p/crypto-rsa, streaming AES-CTR ciphers are in @libp2p/crypto-aes-ctr
achingbrain added a commit to libp2p/js-libp2p that referenced this issue Jan 11, 2024
TLDR: the bundle size has been reduced by ~50KB

- parsing/creating PEM/pkix/pkcs1 files is now done by asn1.js
- Streaming AES-CTR ciphers are now in @libp2p/crypto-aes-ctr
- RSA encryption/decryption and PEM import/export are now in @libp2p/crypto-rsa

WebCrypto [doesn't support streaming ciphers](w3c/webcrypto#73).

We have a node-forge-backed shim that allows using streaming AES-CTR in browsers but we don't use it anywhere, so this has been split out into it's own module as `@libp2p/aes-ctr`.

This was added to `@libp2p/crypto` to [support webrtc-stardust](libp2p/js-libp2p-crypto#125 (comment)) but that effort didn't go anywhere and we don't use these methods anywhere else in the stack.

For reasons lost to the mists of time, we chose to require a padding algorithm that WebCrypto doesn't support so node-forge (or some other userland implemenation) will always be necessary in browsers, so these ops have been pull out into @libp2p/crypto-rsa which people can use if they need it.

This is now done by manipulating the asn1 structures directly.

The previous PEM import/export is also ported to `@libp2p/crypto-rsa` because it seems to handle more weird edge cases introduced by OpenSSL.

These could be handled in `@libp2p/crypto` eventually but for now it at least supports round-tripping it's own PEM files.

BREAKING CHANGE: Legacy RSA operations are now in @libp2p/crypto-rsa, streaming AES-CTR ciphers are in @libp2p/crypto-aes-ctr
achingbrain added a commit to libp2p/js-libp2p that referenced this issue Jan 12, 2024
TLDR: the bundle size has been reduced by about 1/3rd

- parsing/creating PEM/pkix/pkcs1 files is now done by asn1.js
- Streaming AES-CTR ciphers are now in [@libp2p/aes-ctr](https://github.com/libp2p/js-libp2p-aes-ctr)
- RSA encryption/decryption and PEM import/export are now in [@libp2p/rsa](https://github.com/libp2p/js-libp2p-rsa)

## AES-CTR

WebCrypto [doesn't support streaming ciphers](w3c/webcrypto#73).

We have a node-forge-backed shim that allows using streaming AES-CTR in browsers but we don't use it anywhere, so this has been split out into it's own module as `@libp2p/aes-ctr`.

## RSA encrypt/decrypt

This was added to `@libp2p/crypto` to [support webrtc-stardust](libp2p/js-libp2p-crypto#125 (comment)) but that effort didn't go anywhere and we don't use these methods anywhere else in the stack.

For reasons lost to the mists of time, we chose to use a [padding algorithm](https://github.com/libp2p/js-libp2p-crypto/blob/3d0fd234deb73984ddf0f7c9959bbca92194926a/src/keys/rsa.ts#L59) that WebCrypto doesn't support so node-forge (or some other userland implemenation) will always be necessary in browsers, so these ops have been pulled out into `@libp2p/rsa` which people can use if they need it.

This is now done by manipulating the asn1 structures directly.

## PEM/pkix/pkcs1

The previous PEM import/export is also ported to `@libp2p/crypto-rsa` because it seems to handle more weird edge cases introduced by OpenSSL.

These could be handled in `@libp2p/crypto` eventually but for now it at least supports round-tripping it's own PEM files.

Fixes #2086

BREAKING CHANGE: Legacy RSA operations are now in @libp2p/rsa, streaming AES-CTR ciphers are in @libp2p/aes-ctr
@vszakd
Copy link

vszakd commented Mar 8, 2024

+1

@gobengo
Copy link

gobengo commented Sep 23, 2024

@jasnell @tniessen wdyt of adding DigestStream to https://github.com/wintercg/proposal-webcrypto-streams ?

@twiss
Copy link
Member

twiss commented Nov 20, 2024

There's an issue about moving the Web Crypto Streams draft to WICG here: WICG/proposals#185.

Implementers: please voice your support there if you're in favor of this work 😊

@davidben
Copy link
Collaborator

This seems to go the route of automatically streaming every algorithm, which is not the way to do this. Cryptographic algorithms being streamable is the exception rather than the rule. In particular, modern symmetric cryptography uses AEADs, which are not streamable. There are some libraries which make them streamable, but this is a mistake. See https://www.imperialviolet.org/2015/05/16/aeads.html#:~:text=AEADs%20with%20large%20plaintexts

@dead-claudia
Copy link

@davidben To be fair, this is much simpler to spec and implement. And OpenSSL and its derivatives provide exactly this same kind of API, so browsers might not even be managing this buffer directly.

@twiss
Copy link
Member

twiss commented Nov 21, 2024

@davidben The issue brought up there mainly affects decryption, not encryption. Though, you might (reasonably) argue that if we shouldn't have streaming decryption, we shouldn't have streaming encryption either, but there are plausible scenarios where it might still be useful, e.g. if the encrypting client might be memory-constrained but the decrypting client isn't (think SecureDrop or so).

I would be cautious to specify streaming encryption and/or decryption of only the non-AEAD modes, as it might incentivize using those, while we probably want to incentivize using AEAD. So if we don't want to have streaming encryption for AEAD, perhaps we shouldn't have streaming encryption at all, and just recommend chunking the plaintext (although this of course won't be possible for legacy protocols).

So, I think that starting with streaming hashing, for example as proposed in #73 (comment), might be better (and simpler).

However, having a draft in the WICG where we can work on that & iterate on the API still wouldn't hurt, IMHO 😊

@jasnell
Copy link
Collaborator

jasnell commented Nov 21, 2024

@gobengo ... sorry I just spotted your question here:

@jasnell @tniessen wdyt of adding DigestStream to wintercg/proposal-webcrypto-streams ?

Definitely +1

@jasnell
Copy link
Collaborator

jasnell commented Nov 21, 2024

@davidben:

This seems to go the route of automatically streaming every algorithm, which is not the way to do this. Cryptographic algorithms being streamable is the exception rather than the rule....

Yep, this has come up in conversations a few times and I would consider the initial explainer / draft that we put together in wintercg as just an initial starting point. I think there's a TON of refinement and clarification that is needed. We wouldn't intend for "automatically streaming every algorithm" to be the actual approach we take.

@knightcode
Copy link

Given the position that OpenSSL has taken, It doesn't seem like it's your job to protect us from ourselves or dictate system design.

@twiss
Copy link
Member

twiss commented Nov 21, 2024

Well - with no offense to the OpenSSL team, I don't think we should take their API as the end-all and be-all of crypto APIs.

AEAD modes are typically defined as single-shot functions, that don't allow passing a stream. Deviating from that would need a stronger justification than "OpenSSL does it", IMHO.

@davidben
Copy link
Collaborator

Indeed many libraries do not expose streaming AEADs, including some of those used by browsers. OpenSSL's streaming API is an artifact of them trying to mix multiple, unrelated algorithms into the same API. (A similar mistake that has been made here.)

Any proposal with a streaming AEAD is a non-starter here. AEADs are simply not streaming primitives.

@knightcode
Copy link

I don't mean to stir trouble, but you didn't address my point. It was not "OpenSSL does it." It was "it's not your decision to make."

@jpsugar-flow
Copy link

jpsugar-flow commented Nov 21, 2024

How much more work would it create to split hashing from encryption here? It seems that the latter is contentious but the former is not.

@dead-claudia
Copy link

Also, due to the current state of cryptography algorithms and their usage, stream input to hashes is a bit more critical.

  • Most symmetric encryption currently uses AES, a block cipher. The streaming modes for block ciphers like AES could mostly be done in userland as (aside from GCM and GCM-SIV) most of them are very straightforward to implement atop sync algorithms. GCM is trickier and requires a bit of math, and counter mode requires an awkward carry calculation, but the rest of the common ones are just key-width XORs at specific times.
  • Asymmetric encryption usually operates on whole buffers in practice, as the relevant messages are normally only kilobytes long.
  • Hashing is very frequently done over streams of indefinite length. S3's API for instance accepts an x-amz-checksum-{ALGO} header where two of the algorithms available are SHA-1 and SHA-256. This, for a chunk of several gigabytes, would necessitate a streaming hash before uploading in most browsers.

@bradisbell
Copy link

Unfortunately, the authors and core proponents of this spec couldn't care less about hashing use cases outside of crypto. We've hashed over this (heh) on previous threads while trying to get something as simple as SHA-1 on "insecure" origins:

The key comment from @twiss:

In any case, I empathize with your use case but it's not one that Web Crypto was intended to serve.

#170 (comment)

Therefore, I doubt we'll get built-in hashing of streams without the associated crypto side... but who knows, maybe someone will prove me wrong. We're stuck with DIY in the meantime. At least it's possible.

@twiss
Copy link
Member

twiss commented Nov 22, 2024

@bradisbell Bringing in unrelated issues isn't really relevant, and also not really helpful while we're actively discussing the addition of streaming to Web Crypto.

@dead-claudia and @jpsugar-flow I tend to agree.

Taking a step back, and perhaps contrary to what I said before in this thread; if we want to focus on adding streaming hashing first since it's least controversial, and if everyone is happy with the proposal in #73 (comment) (i.e. extending the existing function rather than adding a new DigestStream object or some such), perhaps we don't strictly need to have a separate draft in WICG for that, and we could just directly make a PR here?

If folks are OK with that, I can write something up.

@jasnell
Copy link
Collaborator

jasnell commented Nov 22, 2024

... and if everyone is happy with the proposal in #73 (comment) (i.e. extending the existing function rather than adding a new DigestStream object or some such),

@jakearchibald's proposal looks good to me. Workers will need to keep DigestStream around forever as we don't make breaking changes but having the existing crypto.subtle.digest accept a ReadableStream as input would work just as well for our uses cases. Actually, I would go one step further and say that instead of just accepting a ReadableStream, allow it to accept any Iterable and AsyncIterable.

I imagine that crypto.subtle.sign and crypto.subtle.verify could both also follow the same pattern without much controversy.

@taralx
Copy link

taralx commented Nov 22, 2024

Oh, I missed that proposal. Very elegant. +1 from me. (I'm jpsugar-flow, but this is not work-related for me.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests