-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MANTA-5083 JAVA client is taking too long to get file when range start is big. #562
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be prudent to have a dependent test-case reproducing the conditions that were explained earlier in the PR's premise and making comparative evaluations accordingly
...t-unshaded/src/main/java/com/joyent/manta/client/crypto/MantaEncryptedObjectInputStream.java
Outdated
Show resolved
Hide resolved
...-manta-client-unshaded/src/main/java/com/joyent/manta/client/crypto/AesCtrCipherDetails.java
Outdated
Show resolved
Hide resolved
All tests passed:
|
final long startingBlock = position / blockSize; | ||
final byte[] counter = ByteBuffer.allocate(16).putLong(startingBlock).array(); | ||
final BigInteger ivBigInt = new BigInteger(iv); | ||
byte[] updatedIV = Arrays.copyOf(ivBigInt.add(BigInteger.valueOf(startingBlock)).toByteArray(), 16); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we potentially replace 16
with getIVLengthInBytes()
?
This corrects a bug introduced in #562 that prevented correct decryption of some objects encrypted using AES CTR mode. The objects susceptible to this bug are ones where IV values generated during the encryption process are interpreted as negative BigInteger values. The calculation of applying the CTR mode offset to the initial, randomly-generated IV and the conversion back to a byte array yielded incorrect results in such cases and prevented correct decryption of the associated objects. This PR changes the method used to determine the IV value to use for decrypting the request to instead use operations directly on the byte array rather than converting the values to BigInteger values. This PR also adds testing to ensure that the IV calculation when using an offset of zero results in an IV that is identical to the input IV. It also expands the canRandomlyReadPlaintextPositionFromCiphertext testing to exercise both the former and current methods of advancing the Cipher to the requested cipher block. The testing compares each decryption result to the expected plaintext as well as compares the results of each method to one another.
The heart of this problem is that we need a way when using AES/CTR mode to increment the counter that's used along with the nonce or initialization vector (IV) as input to the decryption process based on the block that the bytes returned from the Range header request belong to. Presently, our solution to that is the calls to Cipher.update, but this is an expensive method when really all we need is to increment a counter to reflect the current block being decrypted. So instead what we can do is to take the IV and increment that value manually (I could not find any built-in Java API for this purpose) and then we can proceed to decrypt the block just like before. Calculating this block-targeted IV value is very cheap compared to the update calls.
I created a test program that riffs on the Range header example already in the repo. It uploads an object to Manta and then downloads 1024 bytes chunks from the beggining, middle, and end of the object and reports the time it took and if the downloaded data matched what we expected. Here is the full code I used:
Here is a run of the test program without the changes for this PR (i.e. It is using the
Cipher.update
method to increment the AES/CTR counter):As has been reported the time complete the request for each file chunk increases as the offset into the object increases.
Now here is a run of the program with the changes from this PR:
The reported times for each chunk request are now within the same order of magnitude which is exactly what we want while the data for each chunk still matches our expected data indicating successful decryption.