Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: improves performance of ThumbnailPanel significantly #87

Closed
wants to merge 1 commit into from

Conversation

gingerr
Copy link
Contributor

@gingerr gingerr commented Nov 26, 2019

One performance issue we had is related how layers are handled. This commit allows subviews of Cell to be drawn into its layer and the layer be drawn asynchronously. In my test use-case with 16 windows with 3440x1417 sizing this brought the required time from key event reaction to ThumbnailPanel orderOut from 650ms down to 250ms. There is eventually more performance to gain by explicit control of .layerContentsRedrawPolicy. This commit also adds two NSLog statements for easy timing observations in the Run console.

Related to #45 (closes?)

One performance issue we had is related how layers are handled. This commit allows subviews of Cell to be drawn into its layer and the layer be drawn asynchronously. In my test use-case with 16 windows with 3440x1417 sizing this brought the required time from key event reaction to ThumbnailPanel orderOut from 650ms down to 250ms. There is eventually more performance to gain by explicit control of .layerContentsRedrawPolicy. This commit also adds two NSLog statements for easy timing observations in the Run console.

Related to lwouis#45
@gingerr
Copy link
Contributor Author

gingerr commented Nov 26, 2019

Btw. the first panel summon takes a little longer. Subsequent summons are improved greatly on my system.

@gingerr
Copy link
Contributor Author

gingerr commented Nov 26, 2019

@lwouis
Copy link
Owner

lwouis commented Nov 26, 2019

I checked out the code, then removed the 2 layer lines and tested locally with like 50 finder windows: 1s to 1.1s. I then put back the 2 layer lines: 1s to 1.1s.

Basically I see no performance improvements with or without this PR :/

Looking in Instruments again, I see most of the time being spent in AXUIElementCopyAttributeValue in the background thread, and in Cell.updateWithNewContent. This was the same situation last time we discussed it on Discord.

I'm confused about how the layer situation is taking time on your machine

Update: actually in Instruments, the total time is 250ms, not the full 1s that it actually takes. And indeed when I comment out the filter in AccessibilityApis.windows, 200ms go away. It shows that there is a real problem with AXUIElementCopyAttributeValue, and that we should try to remove its usage, but the time shown by NSLog is still 800ms (1s - 200ms), which is not reflected in Instruments for some reason

@gingerr
Copy link
Contributor Author

gingerr commented Nov 26, 2019

Dammit. Made the same observation on my MacBook running the latest macOS, it makes no difference. My home computer is using an desktop cpu and running on nvidia drivers and also using an older macOS version (high sierra). No idea what of that is causing a difference there. Need to invest more time into this.

Thank you for checking it out and testing and profiling it. I also need to find a better lifecycle method (other than orderOut) cause visually it feels like the NSLog statement in orderOut() appears later in the console than the view actually disappearing. The console could be also running on a different thread and be a little off/async.

@lwouis
Copy link
Owner

lwouis commented Nov 26, 2019

Yes, basically I think we haven't really diagnosed the performance bottleneck correctly yet. First step should be to find a way to actually diagnose. I don't know why Instruments is not helping here :/

@gingerr
Copy link
Contributor Author

gingerr commented Nov 26, 2019

Yeah instruments/profiler does not show the full story or i don't interpret it correctly and thats why i added the NSLog statements.

I did not check AccessibilityApis at all cause the expensive operation on my desktop system is something with images/layers. This class does is not related to fetching the images right? This is about identifying the real/relevant windows?

@lwouis
Copy link
Owner

lwouis commented Nov 26, 2019

This class does is not related to fetching the images right? This is about identifying the real/relevant windows?

Calls to AXUIElementCopyAttributeValue are made to find if the window is of a minimum size (to avoid Chrome search box to show in the thumbnails since it's an NSWindow for some reason). For some reason these calls are expensive. Maybe they copy the image to measure the size at some point to get the dimensions? Who knows. The fact is that it's expensive so as much as possible we should avoid calling it. The same logic from the CG API family is cheap. Maybe checking the size there is enough? To be tested

@lwouis
Copy link
Owner

lwouis commented Nov 26, 2019

image

Maybe the Thread State Trace tells the story? Maybe it is indeed the WindowServer taking time, and somehow that time doesn't appear in Instruments Time Profiler? I'm guessing the CG api to get the screenshots are XPC calls, so their execution happens in the WindowServer process instead of the app threads, so we don't see that time spent in the Time Profiler.

This seems to go in the direction discussed by @galli-leo here. I guess in my previous inspections I focused on the time in the Time Profiler without realizing it's not matching the wall-clock time of what I'm actually seeing on screen. Now that I think about it's clear that if it takes 3s to display on sceeen, and Instrument reports 250ms or something, then the story is happening outside the app.

A last note: I tested HyperSwitch, and they display instantly my 50 finder windows. I tried having a video running and I see that they don't refresh the thumbnails on every alt-tab, so I guess they don't have the issue because they render the thumbnails in the background all the time in advance, and update periodically.

@koekeishiya
Copy link

koekeishiya commented Nov 26, 2019

@lwouis

Calls to AXUIElementCopyAttributeValue are made to find if the window is of a minimum size

All AX API calls are expensive (and slow) as they are serialized requests. If one application takes a long time to respond, this will delay every other request to different applications even if you are doing them through multiple threads.

If all you care about is getting the current dimensions of the window, you can use the following private API call (located in SkyLight.framework).

extern CGError SLSGetWindowBounds(int cid, uint32_t wid, CGRect *frame);

CGRect frame = {};
SLSGetWindowBounds(g_connection, window->id, &frame);

@lwouis
Copy link
Owner

lwouis commented Dec 26, 2019

I'm closing this PR as its improvements are unclear, and anyway I have significantly improved performance in an upcoming PR when I changed pretty much every OS call. See #14 (comment)

@lwouis lwouis closed this Dec 26, 2019
@gingerr gingerr deleted the feature/performance branch March 30, 2020 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants