Improve performance when resolving the workspace root #6530

Merged
merged 7 commits into from
Jul 16, 2024

Conversation

Contributor

@ivyspirit ivyspirit commented Jun 26, 2024

Checklist

  • I have filed an issue about this change and discussed potential changes with the maintainers.
  • I have received the approval from the maintainers to make this change.
  • This is not a stylistic, refactoring, or cleanup change.

Please note that the maintainers will not be reviewing this change until all checkboxes are ticked. See
the Contributions section in the README for more
details.

Discussion thread for this change

Issue number: 5719

Description of this change

We have a pretty big Bazel project, and we use the rules_jvm_external pinned feature, which downloads all the external dependencies to the {bazel-base}/external dir. The external folder can grow very big. As one example, here are the numbers for our smallest cache folder:

% echo "Total directories:" $(find {root}/.cache/bazel/arm64/96fb5a4ccfb8aa9cafd25443b98fa7e6/external -type d | wc -l) && du -sh {root}/.cache/bazel/arm64/96fb5a4ccfb8aa9cafd25443b98fa7e6/external

Total directories: 54733
5.8G	{root}/.cache/bazel/arm64/96fb5a4ccfb8aa9cafd25443b98fa7e6/external

The original logic for resolving a dependency label loops through the entire {bazel-base}/external dir, checks whether each file is a directory, and then creates a map with the workspaceName as key and its dir path as value. When the external dir is really big, this IO operation can hang for a long time and cause the IDE to freeze.

This change uses {bazel-base}/external/{workspaceName} to construct the workspace root dir directly. If that directory exists, it constructs the WorkspaceRoot and returns it.
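
To make the difference concrete, here is a minimal sketch of the two lookup strategies. The class and method names (`ExternalRootLookup`, `scanAll`, `lookupDirect`) are hypothetical and are not the plugin's actual code; only the `external/<workspaceName>` layout comes from the description above.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; names are not taken from the plugin.
final class ExternalRootLookup {

  // Old approach: list every child of <outputBase>/external and keep the directories.
  // The cost grows with the number of entries under external/.
  static Map<String, File> scanAll(File outputBase) {
    Map<String, File> roots = new HashMap<>();
    File[] children = new File(outputBase, "external").listFiles();
    if (children == null) {
      return roots; // external/ is missing or unreadable
    }
    for (File child : children) {
      if (child.isDirectory()) {
        roots.put(child.getName(), child);
      }
    }
    return roots;
  }

  // New approach: build the expected path and check only that one directory.
  static Optional<File> lookupDirect(File outputBase, String workspaceName) {
    File candidate = new File(new File(outputBase, "external"), workspaceName);
    return candidate.isDirectory() ? Optional.of(candidate) : Optional.empty();
  }
}
```

The direct lookup touches one directory per query, so its cost should not depend on how many repositories sit under external.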

@ivyspirit
Contributor Author

Fixing the test

@ivyspirit ivyspirit closed this Jun 26, 2024
@ivyspirit ivyspirit reopened this Jun 27, 2024
@github-actions github-actions bot added the awaiting-review Awaiting review from Bazel team on PRs label Jun 27, 2024
@ivyspirit
Contributor Author

ivyspirit commented Jun 27, 2024

The ExternalWorkspaceReferenceTest test failed because I check file.exists() (here and here) before I return the workspaceRoot. However, in the test, TestFileSystem creates the PsiFile but does not seem to create the file in the test dir. Any suggestions? Once I removed the file-exists check, the tests all passed.

@sgowroji sgowroji added product: IntelliJ IntelliJ plugin awaiting-user-response Awaiting response from author on PRs and removed awaiting-review Awaiting review from Bazel team on PRs labels Jun 27, 2024
@ivyspirit
Contributor Author

@sgowroji do you have any suggestions on this? The test creates a VirtualFile via TempFileSystem, so the file does not exist on disk, but in my code I need to check that the file exists before returning. I checked the codebase and didn't see an example in the tests that handles this case. Any suggestions would be appreciated.
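
One common way to handle this kind of mismatch is to route the existence check through an injectable predicate, so that tests backed by an in-memory file system can supply their own answer. The sketch below uses hypothetical names (`RootResolver`, `exists`) and is not how the plugin or this PR solves it; the commit list for this PR mentions checking for unit test mode instead.

```java
import java.io.File;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch: the existence check is a constructor parameter,
// so production code can pass File::isDirectory and tests can pass a fake.
final class RootResolver {
  private final File externalDir;
  private final Predicate<File> exists;

  RootResolver(File externalDir, Predicate<File> exists) {
    this.externalDir = externalDir;
    this.exists = exists;
  }

  Optional<File> resolve(String workspaceName) {
    File candidate = new File(externalDir, workspaceName);
    return exists.test(candidate) ? Optional.of(candidate) : Optional.empty();
  }
}
```

A test could then construct `new RootResolver(dir, f -> f.getName().equals("junit"))` without touching the disk.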

@sgowroji sgowroji added awaiting-review Awaiting review from Bazel team on PRs and removed awaiting-user-response Awaiting response from author on PRs labels Jun 28, 2024
@ivyspirit
Contributor Author

ivyspirit commented Jun 29, 2024

@mai93 I have fixed the tests. All of the failures were due to the mock setup. Please take a look. Thanks!

@mai93
Collaborator

mai93 commented Jul 2, 2024

LGTM from me, @tpasternak if you have time can you review this?

@tpasternak
Collaborator

yep, I'll try it later today

Collaborator

@tpasternak tpasternak left a comment

This is indeed a really good finding. Thank you for the contribution! I just left some comments, but please don't consider this a finalized review, so you can hold off on fixing them. I'd like to try it out a little more tomorrow.

return ImmutableMap.of();
logger.debug("getExternalWorkspaceRootsFile for " + workspaceName);
File externalBase = SyncCache.getInstance(project)
.get(workspaceName, (Project theProject, BlazeProjectData projectData) -> {
Collaborator

Honestly, I have no idea what the consequences of this are; all other usages of SyncCache use hardcoded keys 😅 It's global storage, so this might cause conflicts. I would prefer to keep the data structure (a map?) under the ProjectHelper key and update it on demand.

Contributor Author

I used the workspaceName as the key to store the workspaceRoot path. It might be a bit overkill to use SyncCache for this. I also think I could just keep a synchronized map instance in this class. But since the workspaceRoot map was previously saved to the SyncCache, I went with that.

Collaborator

Yeah, but before there was a whole map in a single entry under a key named WorkspaceHelper.class, while after your change there are N entries there.

Contributor

@ilisc2 ilisc2 Jul 3, 2024

Yeah, the previous implementation cached the entire map under the WorkspaceHelper class name. The map is huge, with tens of thousands of entries, most of which are not useful, because when you use the rules_jvm_external pinned version, all the external dependencies get downloaded directly into the external dir. The map caches every one of those external dependency dirs.

With the name as key, it only caches the workspace names that you installed with rules_jvm_external, each with its path, for example android_mvn, test_mvn... There are not going to be many of them. I can do an inspection and paste the cache counts for our case here, considering we already have lots of different namespaces.

Collaborator

@tpasternak tpasternak Jul 4, 2024

Please check this out
image

Previously, SyncCache had a static number of entries, typically one per purpose, with hardcoded keys in the plugin's codebase. With your change, the cache becomes dynamic, and the number of entries varies.

The risk is that external workspace names are flat-structured. If another service follows your pattern, conflicts may arise. Other services might also need to store external workspace-related data and should reserve their own keys.

I'm not suggesting we revert to the old map system, but we should keep data within a map, not at the top level of sync-cache.

cc @ujohnny
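
The key-collision concern can be illustrated with a toy cache. The `ToyCache` class below is a stand-in written for this example, not the plugin's `SyncCache`; it only shows why storing one map under a class key keeps workspace names out of the top-level key space.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Toy stand-in for a shared cache keyed by arbitrary objects (NOT the real SyncCache).
final class ToyCache {
  private final Map<Object, Object> entries = new ConcurrentHashMap<>();

  @SuppressWarnings("unchecked")
  <T> T get(Object key, Supplier<T> init) {
    return (T) entries.computeIfAbsent(key, k -> init.get());
  }

  int topLevelSize() {
    return entries.size();
  }
}

final class WorkspaceRoots {
  // One top-level entry, keyed by this class; workspace names live inside the nested map.
  static Map<String, String> rootsIn(ToyCache cache) {
    return cache.get(WorkspaceRoots.class, () -> new ConcurrentHashMap<String, String>());
  }
}
```

However many workspace names are added through `rootsIn`, the shared cache gains a single top-level entry, so another service that keys its own data by repository name would not collide with it.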

Contributor

@ilisc2 ilisc2 Jul 8, 2024

Ok. I can create a map and cache it.

Contributor

I changed the implementation to use a local map to cache the workspace root dirs. Using the SyncCache does not make sense if we want to cache a map and keep modifying its entries. This should also fix the issue you saw below when the initial bazelProjectData is null. The workspace root is only cached if the dir exists, which means the bazelProjectData is not null. Once the project has finished initializing, querying again should return the value as expected. @tpasternak


Path relativePath = bazelRootPath.relativize(path);
if (relativePath.getNameCount() > 0) {
String firstFolder = relativePath.getName(0).toString();
Collaborator

I'm not sure, but this might conflict with --experimental_sibling_repository_layout.

Apart from that, it seems that Bazel allows a directory named external in the source root. We probably need to handle this case, too. But the old algorithm doesn't seem to support it either, so we probably shouldn't care.

Contributor Author

Yeah, the old logic assumes everything is under external; if not, things will break, I think. But I can check that case. Would it be OK to do a follow-up PR?

Collaborator

If the old code looks the same, then yes, no need to fix it.
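
For reference, the first-folder extraction in the snippet under review can be exercised in isolation. This sketch uses only `java.nio.file.Path`; the class name `FirstFolder` is made up, and it assumes the default layout where repositories sit directly under `external`.

```java
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, written for this example only.
final class FirstFolder {
  // Returns the first path segment of `path` below `root`, if `path` is strictly inside `root`.
  static Optional<String> under(Path root, Path path) {
    if (!path.startsWith(root)) {
      return Optional.empty(); // relativize would otherwise yield ".." segments
    }
    Path relative = root.relativize(path);
    // When path equals root, relativize returns an empty path whose single name is "".
    if (relative.toString().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(relative.getName(0).toString());
  }
}
```

Note that an empty `Path` still reports a name count of 1, so a `getNameCount() > 0` check alone does not rule out the case where the path equals the root.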

Collaborator

@tpasternak tpasternak left a comment

Also I found this bug:

  1. Import https://github.com/bazelbuild/bazel
  2. Open the top-level BUILD file

image


OK, this probably happens because, when you run sync, the blazeProjectData entry might be null, so null is written to the cache:

if (blazeProjectData == null) {
logger.debug("the blazeProjectData is null " + project.getName());
return null;
}

@ilisc2
Contributor

ilisc2 commented Jul 9, 2024

The local map cache should help with this, although the workspace root lookup will not work until blazeProjectData has finished initializing. When it is queried again once initialization is done, it should work. I wonder how the old implementation handled this case? It seems there would be NPEs. And once the map is cached in the SyncCache it is not modified, so it might not recover.
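
The recovery behavior described here, where a lookup that fails early succeeds once initialization finishes, depends on not caching the failure. A minimal sketch of that idea, assuming a hypothetical `LazyRoots` class and a `Function` standing in for the real resolution step:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: only successful lookups are stored, so a null result
// (for example, project data not ready yet) is retried on the next call.
final class LazyRoots {
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  String get(String workspaceName, Function<String, String> resolve) {
    // ConcurrentHashMap.computeIfAbsent records no mapping when the function returns null.
    return cache.computeIfAbsent(workspaceName, resolve);
  }
}
```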

Collaborator

@tpasternak tpasternak left a comment

I think this is the last issue, sorry for such a ping pong.

By the way, I just noticed that the current solution, as well as the previous one, doesn't work with bzlmod, where the paths have a <reponame>~<something> stem, but that's another story.

File[] children = provider.listFiles(getExternalSourceRoot(blazeProjectData));
if (children == null) {
return ImmutableMap.of();
if (!workspaceRootCache.containsKey(workspaceName)) {
Collaborator

This is not a primary use case for SyncCache, but I think it's still a good idea to keep it. Otherwise (which is what happens now) the cache is not cleared on resync. How about doing it this way? Sorry for yet another round, but I think it could lead to some bugs.

@@ -201,7 +202,7 @@
     if (bazelProjectData == null) {
       return null;
     }
-
+    var workspaceRootCache = SyncCache.getInstance(project).get(WorkspaceHelper.class, (p, data) -> new ConcurrentHashMap<String, WorkspaceRoot>());
     if (!workspaceRootCache.containsKey(workspaceName)) {
       File externalBase = new File(bazelProjectData.getBlazeInfo().getOutputBase(),
           "external/" + workspaceName);

Contributor

Can you elaborate on the resync case? Do you mean within one user session, when the user clicks sync? In that case we don't want to clear the cache, right? The external workspace dirs are not going to change, so we do want to keep them, right? I'm just trying to understand the cache lifecycle. I thought the cache should live as long as the WorkspaceHelper instance does. @tpasternak

Collaborator

@tpasternak tpasternak Jul 11, 2024

I mean that this data's lifetime was previously managed by the SyncCache class, which is cleared automatically every time a sync occurs. I would prefer to keep this behavior. Otherwise it might cause problems when external repositories are renamed, etc.
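
The lifetime being asked for can be sketched as a lazily filled map that is dropped whenever a sync completes. `SyncScopedRoots` and `onSync` are hypothetical names for this illustration; in the plugin, this clearing is what `SyncCache` is described as doing automatically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a cache whose contents live only until the next sync.
final class SyncScopedRoots {
  private volatile Map<String, String> roots = new ConcurrentHashMap<>();

  // Fill lazily, one workspace name at a time.
  String get(String workspaceName, Function<String, String> resolve) {
    return roots.computeIfAbsent(workspaceName, resolve);
  }

  // Called after each sync: entries for renamed or removed repositories are
  // forgotten, and the map cannot keep growing across syncs.
  void onSync() {
    roots = new ConcurrentHashMap<>();
  }

  int size() {
    return roots.size();
  }
}
```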

Contributor

Hmm, if the external workspace root is renamed, the build dep would need to be renamed too, right? E.g., if for some reason the installed namespace is changed in WORKSPACE from:

maven_install(
    name = "maven",
    artifacts = [
        # artifacts
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)

to

maven_install(
    name = "changed_maven",
    artifacts = [
        # artifacts
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)

Then every reference to that namespace would need to be changed from

"@maven//:artifact",

to

"@changed_maven//:artifact",

right?

Actually, I think busting the cache on every resync is one of the reasons for the slow IDE performance. But I could be missing some cases here.

Collaborator

My understanding is that the primary reason for the improvement was that the cache was previously filled with all external repositories at once. Thanks to your change, it now fills lazily, one by one. The initial version of your PR also reused SyncCache, which cleared the data during each resync. This approach seemed to be working well.

Additionally, it’s not just about renaming but also about cleaning the cache. It shouldn't grow uncontrollably.

Contributor Author

@tpasternak reverted. Please take a look!

Collaborator

Hey, so after your change, we're putting non-qualified names into SyncCache again, which could cause conflicts if other services use the external repo name as keys. How about we try this approach instead? #6530 (comment)

Btw, the cache is not only used during sync, but whenever you click on a label in starlark code.

Contributor

Updated. I actually thought about doing this, but it seemed a bit strange to me that we could not use the blazeProjectData provided by the SyncCache and need to keep modifying the cached value (the map). But I understand your concern. Please take another look.

@tpasternak tpasternak merged commit a04e9ab into bazelbuild:master Jul 16, 2024
6 checks passed
@github-actions github-actions bot removed the awaiting-review Awaiting review from Bazel team on PRs label Jul 16, 2024
LeFrosch pushed a commit to LeFrosch/intellij-bazel that referenced this pull request Jul 22, 2024
* Improve performance when resolving the workspace root

* fix test

* check if in unit test mode

* use a local map to cache the workspace root dirs

* Revert "use a local map to cache the workspace root dirs"

This reverts commit f1eac03.

* cache the map to the syncCache

* cleanup

---------

Co-authored-by: Ivy Li <ili@snapchat.com>
copybara-service bot pushed a commit that referenced this pull request Oct 1, 2024
Description from original PR (#6530):

The original logic for resolving a dependency label loops through the entire {bazel-base}/external dir, checks whether each file is a directory, and then creates a map with the workspaceName as key and its dir path as value. When the external dir is really big, this IO operation can hang for a long time and cause the IDE to freeze.

This change uses {bazel-base}/external/{workspaceName} to construct the workspace root dir directly. If that directory exists, it constructs the WorkspaceRoot and returns it.

PiperOrigin-RevId: 679264404
Labels
product: IntelliJ IntelliJ plugin
5 participants