-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#172621882] generate zip file with user data #43
Conversation
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
…tions-admin into 172621882-accesso-ai-dati
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
…tions-admin into 172621882-accesso-ai-dati
Affected stories
New dependencies added: archiverAuthor: Chris Talkington Description: a streaming interface for archive generation Homepage: https://github.com/archiverjs/node-archiver
|
Created | about 1 year ago |
Last Updated | 3 months ago |
License | MIT |
Maintainers | 1 |
Releases | 9 |
Direct Dependencies | aes-js , archiver , archiver-utils , compress-commons and zip-stream |
Keywords | zip, encryption, aes, archiver and password |
README
archiver-zip-encrypted
AES-256 and legacy Zip 2.0 encryption for Zip files.
Plugin for archiver that adds encryption
capabilities to Zip compression. Pure JS, no external zip software needed.
Install
npm install archiver-zip-encrypted --save
Usage
const archiver = require('archiver');
// register format for archiver
// note: only do it once per Node.js process/application, as duplicate registration will throw an error
archiver.registerFormat('zip-encrypted', require("archiver-zip-encrypted"));
// create archive and specify method of encryption and password
let archive = archiver.create('zip-encrypted', {zlib: {level: 8}, encryptionMethod: 'aes256', password: '123'});
archive.append('File contents', {name: 'file.name'})
// ... add contents to archive as usual using archiver
Encryption methods
Plugin supports 2 encryption methods:
- 'aes256' - this is implementation of AES-256 encryption introduced by WinZip in 2003.
It is the most safe option in regards of encryption, but limits possibilities of opening resulting archives.
It's known to be supported by recent versions 7-Zip and WinZip. It is NOT supported by
Linux unzip 6.00 (by Info-Zip). It is also NOT supported by Windows explorer (i.e. not possible to open Zip file as folder),
even in Windows 10. - 'zip20' - this is implementation of legacy Zip 2.0 encryption (also called "ZipCrypto" in 7-Zip application).
This is the first encryption method added to Zip format and hence is widely supported, in particular
by standard tools in Linux and Windows. However its security is proven to be breakable
so I would not recommend using it unless you absolutely have to make it work w/o external software.
For more information on these encryption methods and its drawbacks in particular see WinZip documentation.
It's worth noting that neither of these encryption methods encrypt file names and their metainformation,
such as original size, filesystem dates, permissions etc.
Generated by 🚫 dangerJS
const writableBlobStream = blobService.createWriteStreamToBlockBlob( | ||
userDataContainerName, | ||
blobName, | ||
(err, _) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not clear why do we need the callback here ? cannot we just call
const writableBlobStream = blobService.createWriteStreamToBlockBlob(userDataContainerName, blobName)
and write to the archiver stream using append
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The callback refers to the saving on the remote storage (docs).
We use append
to add content as you'd expect, but the final result of the operation is given in the callback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's right but the callback is optional (it will write on the remote storage even if you don't provide any). why do we need it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because otherwise there would not be not a way to know if and when the blob has been saved to the storage.
The flow is: zipStream --> blobStream --> blobSaving. We can listen for the end
and finish
event of the streams, but still the saving step can fail imho
@@ -0,0 +1,51 @@ | |||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider using a package ie https://github.com/klughammer/node-randomstring instead of writing our custom implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will if we'll need more complex stuff, so far it's easy job and I'd rather not add another dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do that instead, your code is unsafely using Math.random, we need something better:
https://github.com/klughammer/node-randomstring/blob/master/lib/randomstring.js#L9
more code = more bugs, please use randomstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bonus: is already included: https://github.com/pagopa/io-functions-admin/blob/master/package.json#L61
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random string isn't working for us as it doesn't ensure the string will contain some for character for any character group. I'd need to fin another lib to include
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does this requirement come from ? you can concatenate more random strings if you really need this (do we?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Passwords are needed to be strong". What does strong mean? I don't know. My interpretation was:
- 18 characters
- at least one uppercase letter
- at least one lowercase letter
- at least one number
- at least one symbol
- no pattern
Should we check with security guys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to pietro we don't need a constraint on the number of symbols for each set, so a 18 character string with one character set that has upper/lower case alphanumeric characters and special symbols is ok
utils/zip.ts
Outdated
? archiver("zip-encrypted", { | ||
encryptionMethod, | ||
password, | ||
zlib: { level: 8 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd extract this value into DEFAULT_ZLIB_LEVEL
} | ||
} | ||
); | ||
readableZipStream.pipe(writableBlobStream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pipe should be called before any operation on the stream (append / finalize)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is implicitly true already as append
and finalize
from the archiver
module operate on another eventloop iteration. I agree we should write explicit code
I see three options:
- we use the
archiver.append
function outside of thezip
module, so we can call it after the zip stream has been piped. This leads to a leak of the archiver api to the activity handler, which may not know about it - we keep the
archiver.append
call inside thezip
module, as it is now, but we enforce aprocess.nextTick
- (don't know if it works) we keep the
archiver.append
call inside thezip
module, but we wait for thepipe
event of the readable stream (so we're sure we won't load files before it has been piped to somewhere)
Keep in mind that the main reason of having a zip
module is that module initialization must happen once per application, otherwise it would throw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd simply put all the implementation here and remove the zip module. importing modules must not have side effects, even more if this may cause an exception. if you need to initialize something, implement some init() method and make it safe.
ExtractUserDataActivity/handler.ts
Outdated
createCompressedStream: ( | ||
// tslint:disable-next-line: no-any | ||
data: Record<string, any>, | ||
password: NonEmptyString | ||
) => stream.Readable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would have been useful to pass this as a parameter in case you'd used some generic typings
(ie. () => stream.Readable) but since this is the exact type of createCompressedStream
then you can just import the method and use it directly without passing it as a parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did it mainly for testing, so it's easier to mock such function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll find a way to mock it anyway ie jest.mock("../zip")
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
Co-authored-by: Danilo Spinelli <gunzip@users.noreply.github.com>
@@ -0,0 +1,51 @@ | |||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do that instead, your code is unsafely using Math.random, we need something better:
https://github.com/klughammer/node-randomstring/blob/master/lib/randomstring.js#L9
more code = more bugs, please use randomstring
Codecov Report
@@ Coverage Diff @@
## master #43 +/- ##
=======================================
Coverage 86.91% 86.91%
=======================================
Files 24 24
Lines 795 795
Branches 51 51
=======================================
Hits 691 691
Misses 103 103
Partials 1 1 Continue to review full report at Codecov.
|
* Mock implementation of the zip module | ||
*/ | ||
|
||
export const createCompressedStream = jest.fn(() => ({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally we don't need this kind of mocks, simply call jest.mock(<module path>)
will transform every exported functions form that module into a mock (jest.fn)
@@ -0,0 +1,51 @@ | |||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to pietro we don't need a constraint on the number of symbols for each set, so a 18 character string with one character set that has upper/lower case alphanumeric characters and special symbols is ok
]); | ||
export type StrongPassword = t.TypeOf<typeof StrongPassword>; | ||
|
||
const shuffleString = (str: string): string => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see the comment above, remove this module and just generate a 18 character string from the needed charset
password?: NonEmptyString, | ||
encryptionMethod: EncryptionMethodEnum = EncryptionMethodEnum.ZIP20 | ||
): Readable => { | ||
const zipArchive = password |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in which case we want a passwordless zip ? this is not needed.
} | ||
} | ||
); | ||
readableZipStream.pipe(writableBlobStream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd simply put all the implementation here and remove the zip module. importing modules must not have side effects, even more if this may cause an exception. if you need to initialize something, implement some init() method and make it safe.
The following implements the generation of a zip bundle with all user data, and save it to a storage.
The functionality is exposed as a runnable script for local execution
Run locally
Notable points
zip20
algorithmYet to be solved
AllUserData.encode
fails on actual db data. This is important as the encoder would purge any sensitive data. While this is unsolved, a patch is applied.utils/zip
module should be unit tested