Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jest-haste-map: recover from duplicate IDs #3647

Merged
merged 6 commits into from
May 24, 2017

Conversation

jeanlauliac
Copy link
Contributor

This supersedes #3107. This changeset adds a new top-level structure to the "HasteMap" object kept internally. This structure keeps track of duplicated modules. We cannot keep track of these in the existing map field, that accepts only a single resolution per module name/ID. An alternative solution would have been to change the structure of map, for example to keep an array of file paths for each module name/ID. However, this would incur significant overhead persisting the whole "HasteMap". By adding a separate index structure, that only contains the strict necessary, we ensure minimum overhead is added for persistence.

Test plan

Unit tests. As you can see, I fixed the unit tests that were testing the broken behavior, that is now correct.

}
let dupsByPl = hasteMap.duplicates[id];
if (dupsByPl == null) {
dupsByPl = hasteMap.duplicates[id] = (Object.create(null): any);
Copy link
Contributor Author

@jeanlauliac jeanlauliac May 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these any are necessary, otherwise Flow complains below that is cannot find "g" (special case of "platform") in the object. The problem is that we're precisely assigning "g", not trying to fetch it, so that's a little weird. I think it's a false positive from Flow.

expect(data.map.Strawberry[H.GENERIC_PLATFORM][0]).toEqual(
'/fruits/raspberry.js',
);
// duplicate modules are removed so that it doesn't cause
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are sentences, please capitalize the first character :)

@@ -325,6 +325,28 @@ class HasteMap extends EventEmitter {
throw new Error(message);
}
this._console.warn(message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this warning (Can be done in a follow-up PR)? It doesn't provide much use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, we need to leave it so that people know there's duplication during watchmode. Without it it would say "module doesn't exist", without any actionable feedback.

What we can do later is improve the API so that consumers can identify duplicated IDs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, I think in the initial PR the way it was implemented is that we moved this error handling up the stack so that consumers can decide what to do it. We may want to do that because otherwise, in RN, we won't be able to show a redbox. Most people don't look at the cli output during development. However, this can of course be done later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it as a future step, as this changeset make it no worse than the current situation. I definitely think doing "console.warn/log" in the middle of a library is an anti-pattern and that we should expose it as an error condition through getModule instead.

if (Object.keys(moduleMap).length === 1) {
delete map[id];
}
let dupsByPl = hasteMap.duplicates[id];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind spelling out dupsByPlatform? Personally I'd prefer duplicatesByPlatform but I think dups is not ambiguous, so it is fine,

let dupsByPl = hasteMap.duplicates[id];
if (dupsByPl == null) {
dupsByPl = hasteMap.duplicates[id] = (Object.create(null): any);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could turn this into const, which is imho more readable, but up to you:

const dupsByPl = hasteMap.duplicates[id] == null
  ? (hasteMap.duplicates[id] = (Object.create(null): any))
  : hasteMap.duplicates[id];

I guess it only matters if you are trying to avoid let :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but then it does 2 key accesses instead of one for no clear benefit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I get it. This is because at that point the original module won't be part of the moduleMap, so the previous if-statement with existingModule && existingModule[H.PATH] !== module[H.PATH] will be false.

if (dups != null) {
dups[module[H.PATH]] = module[H.TYPE];
}
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the purpose of this block. Can you elaborate on why we need lines 343-349? As I understand it, if an id with a collision has yet another collision, it will overwrite whatever is currently in the duplicates map for that platform. Why?

Copy link
Contributor Author

@jeanlauliac jeanlauliac May 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block checks if there's already duplicates for that name yeah (in which case there are minimum 2 elements in dups already). We need to add the file we're processing to the list of duplicates for that name. We are not overwriting the others here, we are adding it to the 'set'.

dupsByPl = hasteMap.duplicates[moduleName] = (copy(dupsByPl): any);
dups = dupsByPl[platform] = (copy(dups): any);
const dedupType = dups[filePath];
delete dups[filePath];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, the remaining thing I'm confused about is why the currently changed file gets removed from the duplicate map on this line.

Copy link
Contributor Author

@jeanlauliac jeanlauliac May 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the code above the callsite of _recoverDuplicates. What we do is that we remove all references to the changed code, both in the hasteMap.files index and in the hasteMap.map, so that when we process it we know we just have to insert it, there's nothing else to remove elsewhere.

Here is the same idea. Basically as soon as a file changed, whether deleted/changed, we remove it from the duplication index as if it had been deleted. Later on, we run the file-processing code above, that will re-create the duplication index for that file in case the ID conflict still happens. Of course that means if a conflicted file is changed again and again, we'll delete and re-create the duplication index for its ID again and again. But, this makes the logic simpler and more consistent with the rest of the jest-haste-map architecture. We could implement a complicated logic that checks if the ID changed, and delete the file from the duplication index only if it did change. But, I believe that logic would be more complex, and more importantly, more fragile. It'd also be harder to implement given the current structure of the system ^_^

Does that answer the question?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thank you, that totally makes sense. Nice work!

@cpojer cpojer merged commit 02e20c9 into jestjs:master May 24, 2017
tushardhole pushed a commit to tushardhole/jest that referenced this pull request Aug 21, 2017
* jest-haste-map: do not expose any module when ID is duplicate

* jest-haste-map: build up an index of duplicate IDs

* jest-haste-map: recover from duplicate module IDs

* jest-haste-map: fix prettier

* fix all tests+flow

* jest-haste-map: address comments
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants