Implement fallback hashing algorithm when crypto module is not available #19941

Merged (4 commits) on Jan 26, 2018

Contributor

@bttf bttf commented Nov 11, 2017

Improves upon #19101; Fixes #19100

@msftclas

msftclas commented Nov 11, 2017

CLA assistant check
All CLA requirements met.

}
else {
// djb2 hashing algorithm: http://www.cse.yorku.ca/~oz/hash.html
const chars = data.split("").map(str => str.charCodeAt(0));
Member

Can you make this an exported function with /*@internal*/ and also use it at https://github.com/Microsoft/TypeScript/blob/master/src/compiler/watch.ts#L645 for the case where system.createHash is undefined?

@mihailik
Contributor

mihailik commented Nov 23, 2017

Thanks for the addition, @bttf. I've been lazy on this one.

Just looking at the original comments, one place concerns me: https://github.com/Microsoft/TypeScript/blob/0d9bc55033a3b5c8f68622803409476c7a6f82b4/src/compiler/sys.ts#L64

Why does it say 'should be cryptographically secure'?

I did raise this question:

@mhegazy just to check: collisions due to weak hash are only affecting performance, cannot cause incorrect behaviour, is that right?

reply:

that is my expectation as well.

An expectation does not carry the same confidence as knowing for sure. Looking at the usage in watch.ts, the answer isn't that clear. @sheetalkamat you've put lots of work in there, can you clarify please?

How does the logic in watch.ts handle hash collisions?

Many thanks!

@sheetalkamat
Member

@mihailik The hash is created from the .d.ts text (so that we know the shape of the module). Suppose hash1 corresponds to text1 and hash2 corresponds to text2. When the shape of the module changes from text1 to text2, hash1 = hash2 would mean that we won't emit any of the other modules that depend on it (import it directly or indirectly).
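
The collision scenario described above can be sketched as follows. This is a minimal illustration and not the logic in watch.ts: `createShapeTracker`, `shapeChanged`, and `weakHash` are hypothetical names, and the length-only hash is chosen purely to force a collision.

```typescript
// Hypothetical tracker: remembers the last signature seen for each file.
function createShapeTracker(createHash: (data: string) => string) {
    const signatures: { [fileName: string]: string | undefined } = {};
    // Returns true when the signature differs from the previous one,
    // i.e. when dependents of the file should be re-emitted.
    return function shapeChanged(fileName: string, declarationText: string): boolean {
        const newSignature = createHash(declarationText);
        const changed = signatures[fileName] !== newSignature;
        signatures[fileName] = newSignature;
        return changed;
    };
}

// A deliberately weak hash (text length only) so that two different texts collide.
const weakHash = (data: string) => String(data.length);
const shapeChanged = createShapeTracker(weakHash);

const text1 = "export declare const x: number;";
const text2 = "export declare const x: string;";

shapeChanged("a.d.ts", text1);                       // first sighting counts as a change
const missedChange = shapeChanged("a.d.ts", text2);  // same length, so no change is reported
```

Here `missedChange` is `false` even though the declaration text differs, which is the case where dependents would not be re-emitted.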

@mihailik
Contributor

mihailik commented Nov 28, 2017

@sheetalkamat that backtracks on previous clarification by @mhegazy. And it implies the current released TSC is guaranteed to produce incorrect build results randomly (albeit rarely).

Also the risk with the weak hash is higher.

Should the weak hash implementation instead return its argument as the result, as the testing harness already does? Without full crypto it would be slower, but it would neither crash nor produce broken builds.

https://github.com/Microsoft/TypeScript/blob/7a4557331195e982abb7e1d493b42d185e301857/src/harness/harness.ts#L761
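
A minimal sketch of the suggestion above, assuming the hash is only ever compared for equality. `identityHash` and `sameShape` are hypothetical names, not functions from the TypeScript sources.

```typescript
type CreateHash = (data: string) => string;

// Returns the input unchanged, so two signatures are equal only if the texts are equal.
const identityHash: CreateHash = data => data;

// Compares two declaration texts through whichever hash function is supplied.
function sameShape(createHash: CreateHash, oldText: string, newText: string): boolean {
    return createHash(oldText) === createHash(newText);
}

const textBefore = "export declare const x: number;";
const textAfter = "export declare const x: string;";

// false: with the identity function a real change can never be hidden by a collision.
const unchanged = sameShape(identityHash, textBefore, textAfter);
```

The trade-off is that the "signature" is as large as the text itself.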

* http://www.cse.yorku.ca/~oz/hash.html
*/
/* @internal */
export function generateDjb2Hash(data: string): string {
Contributor

no need for this to be exported i would say. just make createHash always implemented in sys.

@@ -494,9 +511,16 @@ namespace ts {
}
},
createHash(data) {
Contributor

consider defining two functions instead:

function createMD5HashUsingNativeCrypto(data: string) {
    const hash = _crypto.createHash("md5");
    hash.update(data);
    return hash.digest("hex");
}
function createDjb2Hash(data: string) {
    const chars = data.split("").map(str => str.charCodeAt(0));
    return `${chars.reduce((prev, curr) => ((prev << 5) + prev) + curr, 5381)}`;
}

then in here make createHash either one or the other based on the existence of _crypto, e.g.:

{
...
createHash: _crypto ? createMD5HashUsingNativeCrypto : createDjb2Hash
...
}
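
The selection suggested above can be sketched in a self-contained form. Some assumptions: `CryptoLike` and `HashLike` are hypothetical interfaces that mirror only the calls used in this thread (`createHash`, `update`, `digest`), `selectCreateHash` is an illustrative name, and the fallback wraps the accumulator to an unsigned 32-bit value with `>>> 0`, which the snippet in the review does not do.

```typescript
// Hypothetical minimal shape of the native crypto calls used in this thread.
interface HashLike {
    update(data: string): HashLike;
    digest(encoding: "hex"): string;
}
interface CryptoLike {
    createHash(algorithm: string): HashLike;
}

// djb2: start at 5381, then hash = hash * 33 + charCode for each character.
function createDjb2Hash(data: string): string {
    let acc = 5381;
    for (let i = 0; i < data.length; i++) {
        // (acc << 5) + acc equals acc * 33 modulo 2^32; ">>> 0" keeps the result unsigned 32-bit.
        acc = (((acc << 5) + acc) + data.charCodeAt(i)) >>> 0;
    }
    return acc.toString();
}

// Picks the implementation once, based on whether a crypto module was found.
function selectCreateHash(nativeCrypto: CryptoLike | undefined): (data: string) => string {
    if (nativeCrypto === undefined) {
        return createDjb2Hash;
    }
    const available = nativeCrypto;
    return data => available.createHash("md5").update(data).digest("hex");
}

// No crypto module available: falls back to djb2.
const fallbackHash = selectCreateHash(undefined);
const sample = fallbackHash("a"); // "177670", i.e. 5381 * 33 + 97
```

On Node, the module returned by `require("crypto")` could be passed as `nativeCrypto`; callers then use `createHash` without checking whether it exists.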

@@ -642,7 +642,7 @@ namespace ts {
}

function computeHash(data: string) {
- return system.createHash ? system.createHash(data) : data;
+ return system.createHash ? system.createHash(data) : generateDjb2Hash(data);
Contributor

Just use createHash when available. You already made it use generateDjb2Hash in sys. Watch is not triggered except on node anyway.

@mhegazy
Contributor

mhegazy commented Jan 11, 2018

@sheetalkamat what about @mihailik's suggestion of just returning the string? It is definitely more accurate.

@sheetalkamat
Member

We can use the same string, but then the huge string is stored in the cache and stays there for each output for its lifetime (until it changes to something else), in tsc --w or compile-on-save from the editor.
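
A rough sketch of the memory trade-off described above. It assumes a djb2 variant wrapped to 32 bits (not necessarily the merged implementation) and uses a synthetic declaration text; `djb2`, `identity`, and the other names are illustrative only.

```typescript
// djb2 wrapped to an unsigned 32-bit value (an assumption of this sketch),
// so its decimal form is never longer than 10 characters.
function djb2(data: string): string {
    let acc = 5381;
    for (let i = 0; i < data.length; i++) {
        acc = (((acc << 5) + acc) + data.charCodeAt(i)) >>> 0;
    }
    return acc.toString();
}

// The "return the argument" alternative: the cached signature is the text itself.
const identity = (data: string) => data;

// Synthetic stand-in for a large emitted .d.ts file: 1000 copies of one line.
const declarationLine = "export declare function f(): void;\n";
const declarationText = new Array(1001).join(declarationLine);

const storedWithIdentity = identity(declarationText).length; // grows with the file size
const storedWithDjb2 = djb2(declarationText).length;         // at most 10 characters
```

One such signature is kept per output file for as long as the watch session runs, so the identity approach keeps roughly a second copy of every declaration output in memory.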

@mhegazy
Contributor

mhegazy commented Jan 11, 2018

good point.

@mhegazy
Contributor

mhegazy commented Jan 11, 2018

@bttf I think this change is ready to go. I am just in the middle of snapping for TS 2.7. I will merge this once that is done.

@mhegazy mhegazy merged commit 9677b06 into microsoft:master Jan 26, 2018
@bttf bttf deleted the patch-19100 branch February 2, 2018 16:26
@microsoft microsoft locked and limited conversation to collaborators Jul 3, 2018