Display windows from other Spaces #14

Closed
lwouis opened this issue Aug 30, 2019 · 17 comments
Labels
enhancement New feature or request

Comments

@lwouis
Owner

lwouis commented Aug 30, 2019

Currently, windows from other spaces, including fullscreened windows that are in their own space, are not displayed in the thumbnails.

@lwouis lwouis added the enhancement New feature or request label Aug 30, 2019
@coxjc

coxjc commented Oct 15, 2019

i'd like to bump this. i generally keep all my apps in full-screen, so this isn't uber useful until this enhancement is done.

@lwouis lwouis added the M size label Oct 17, 2019
@lwouis
Owner Author

lwouis commented Oct 17, 2019

I checked and HyperSwitch has this feature. You can see at the bottom of the preferences UI here:

image

I'll try to find a way to implement it. There is some interesting conversation in this StackOverflow thread. I can confirm that when removing .optionOnScreenOnly in the CGWindowListCopyWindowInfo call, the windows from other spaces are returned. That's good news. However, we also need to get the same windows from the AXUIElementCopyAttributeValue(AXUIElementCreateApplication(cgOwnerPid), kAXWindowsAttribute, [AXUIElement].self) call, so that we can use the AX API later to focus the window. That API returns [] for apps that have windows in another space.

Also note that for some mysterious reason, removing .optionOnScreenOnly has the side-effect of having the array from CGWindowListCopyWindowInfo always sorted in the same order, instead of being sorted in the order of recently used first, which the app relies upon completely to order the thumbnails.
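
A possible workaround for the lost ordering, as a minimal sketch rather than what the app actually does: keep the most-recently-used order on the app side and update it on every focus change, so the order returned by CGWindowListCopyWindowInfo no longer matters. MruList, mru_touch and mru_remove are hypothetical names, and the code assumes something else (for example a focus notification) supplies the window IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MRU_CAPACITY 256

// Hypothetical app-side record of focus order; index 0 is the most recently focused window.
typedef struct {
    uint32_t ids[MRU_CAPACITY];
    size_t count;
} MruList;

// Move window_id to the front, inserting it if it was not tracked yet.
// When the list is full, the least recently used entry is dropped.
static void mru_touch(MruList *list, uint32_t window_id) {
    assert(list != NULL);
    size_t pos = 0;
    while (pos < list->count && list->ids[pos] != window_id) pos++;
    if (pos == list->count) {              // not tracked yet
        if (list->count < MRU_CAPACITY) list->count++;
        pos = list->count - 1;             // slot that the shift below overwrites
    }
    // Shift entries [0, pos) one slot to the right, then put the window in front.
    memmove(&list->ids[1], &list->ids[0], pos * sizeof list->ids[0]);
    list->ids[0] = window_id;
}

// Forget a window that was closed.
static void mru_remove(MruList *list, uint32_t window_id) {
    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++) {
        if (list->ids[i] == window_id) {
            memmove(&list->ids[i], &list->ids[i + 1],
                    (list->count - i - 1) * sizeof list->ids[0]);
            list->count--;
            return;
        }
    }
}
```

Calling mru_touch each time a window gains focus keeps ids[0] as the window to show first, whatever order the window list call returns.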

If anyone has an idea, please share it here :)

@lwouis lwouis added L size and removed M size labels Oct 17, 2019
@lwouis
Owner Author

lwouis commented Oct 28, 2019

I tested:

  • HyperSwitch: always displays from other spaces correctly, with the Space number on the thumbnail
  • WindowSwitcher: only displays if it was running during the space transition. It seems they maintain state in the app instead of asking the OS. If you move windows to a space and restart, the app doesn't show the windows.

@okoeth

okoeth commented Nov 3, 2019

Would really like to see this feature. Use Spaces a lot... Thanks for this great software!

@lwouis
Owner Author

lwouis commented Nov 6, 2019

Interesting discussion on the Rectangle project on why they don't support Spaces in general: Spaces don't have public APIs.

I've played today with private APIs from CGSPrivate.h and it seems that even these are now buggy:

let connection = _CGSDefaultConnection()
var workspace: Int32 = 0
CGSGetWindowWorkspace(connection, Int32(cgId), &workspace)

workspace comes back as 1 even for windows that are in different workspaces.

This tweet says that they stopped returning correct values in 10.8.

WhichSpace uses CGSCopyManagedDisplaySpaces, which I tested and which works on Mojave. This only gives us a list of spaces though, and only works if you have this setting enabled:

image

This code gives the list of spaces with their window IDs (1 to 3 digit IDs):

let connection = _CGSDefaultConnection()
let info = CGSCopyManagedDisplaySpaces(connection) as! [NSDictionary]
let displayInfo = info[0]
let activeSpaceID = (displayInfo["Current Space"]! as! NSDictionary)["ManagedSpaceID"] as! Int
let spaces = displayInfo["Spaces"] as! [NSDictionary]
for (index, space) in spaces.enumerated() {
  // let spaceID = space["ManagedSpaceID"] as! Int
  // let spaceNumber = index + 1
  var setTags = UInt64(0)
  var clearTags = UInt64(0x4000000000)
  let spaceID64 = space["id64"] as! Int
  let windows = CGSCopyWindowsWithOptionsAndTags(connection, 0, [spaceID64], 2, &setTags, &clearTags) as! NSArray
  debugPrint(windows)
}

I'm trying to find a lead on which tech to use to implement this feature, but it's very hard to find literature on this. There are big projects like yabai, but look at the craziness they have to go through:

It uses a scripting-addition, which is a bundle of code that we inject into Dock.app to elevate our privileges when communicating with the WindowServer. The WindowServer is a single point of contact for all applications. It is central to the implementation of the GUI frameworks and many other services. Because of this, System Integrity Protection must be disabled for yabai to function properly.

They are basically doing extremely low-level and version-specific OS black magic.

Also interesting to note how Catalina broke their app because of private API changes. If we go the private API route to support this feature, we need to be ready to have a very challenging OS migration story...

@lwouis
Owner Author

lwouis commented Nov 6, 2019

Interesting potential framework to switch spaces: https://github.com/bigbearlabs/SpaceSwitcher

@lwouis
Owner Author

lwouis commented Nov 6, 2019

I had the honor of receiving some tips from yabai's author Åsmund. He agreed to let me share our conversation here for others to learn from:

You are indeed correct that it is not possible to retrieve the handle to a window's AXUIElementRef until the space which said window belongs to is the active one.
Thus, being able to change window focus across space boundaries without first visiting each space in order and caching the window references is, at first sight, not possible.
For this solution to work it is also necessary to subscribe to notifications from the accessibility API to get notified when a new window is created, and cache that also.
Now this is a rather complex solution and it will still have the prerequisite of visiting every space at least once, before it can track all visible windows. That's no good..
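
To make the caching idea concrete, here is a rough sketch of the bookkeeping it implies, assuming handles get stored whenever a space becomes active or a window-created notification fires. WindowCache, cache_store and cache_lookup are hypothetical names, and the opaque pointer merely stands in for an AXUIElementRef; no real accessibility calls are made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_CAPACITY 512

// One cached window. `handle` stands in for the AXUIElementRef that could only
// be obtained while the window's space was active (opaque here, hypothetical).
typedef struct {
    uint32_t window_id;
    const void *handle;
    bool used;
} CacheEntry;

typedef struct {
    CacheEntry entries[CACHE_CAPACITY];
} WindowCache;

// Remember (or refresh) the handle for a window. Meant to be called when a
// space becomes active or when a "window created" notification arrives.
// Returns false if the cache is full.
static bool cache_store(WindowCache *cache, uint32_t window_id, const void *handle) {
    assert(cache != NULL && handle != NULL);
    CacheEntry *free_slot = NULL;
    for (size_t i = 0; i < CACHE_CAPACITY; i++) {
        CacheEntry *e = &cache->entries[i];
        if (e->used && e->window_id == window_id) {
            e->handle = handle;            // refresh an existing entry
            return true;
        }
        if (!e->used && free_slot == NULL) free_slot = e;
    }
    if (free_slot == NULL) return false;
    *free_slot = (CacheEntry){ .window_id = window_id, .handle = handle, .used = true };
    return true;
}

// Look up a handle; NULL means the window's space has not been visited yet.
static const void *cache_lookup(const WindowCache *cache, uint32_t window_id) {
    assert(cache != NULL);
    for (size_t i = 0; i < CACHE_CAPACITY; i++) {
        const CacheEntry *e = &cache->entries[i];
        if (e->used && e->window_id == window_id) return e->handle;
    }
    return NULL;
}
```

A NULL result from cache_lookup is exactly the limitation described above: a window on a space that was never visited cannot be focused through this route.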

However, there is an alternative way to do this. To realize this we need some background information as to how the user-space in macOS is constructed.
There is a central service called the WindowServer and this service is the core of all GUI in macOS. Every single window is represented internally by something called a _CGSWindow.
The WindowServer is responsible for allocating the backingstore and setting other core properties of the window, such as which space(s) it belongs to, and other attributes such as shadow and much more.
GUI frameworks such as Cocoa are an abstraction layer that implements drawing functionality, and are compiled as a part of the application itself.
This structure is why it is not possible to perform system-wide application appearance changes, such as removing titlebars and so on.
An application communicates with the WindowServer through the mach messaging system (MIG), and this is how all, or at least most, of the low-level APIs work.
To determine whether an application is authorized to perform a change, it has the concept of a connection to the WindowServer, and this connection can have different levels of access.

The interesting part to notice is that almost all general desktop functionality on macOS is implemented in the Dock (+ a couple of helper services). If you kill the Dock and prevent it from auto-relaunching, you will notice that cmd + tab will stop working, mission-control is gone (the interface, gestures and keyboard shortcuts), wallpaper is no longer rendered, and much more.
Another interesting observation is that the Dock is able to focus ANY window of an application at any time, regardless of which space or display it is located on, and this brings us to the first alternative solution..

In yabai I use a scripting-addition to inject code into Dock.app. This allows us to call private functions with privileges given to the Dock's connection to the WindowServer.
Secondly, it also allows us to call functions that are otherwise internal to the Dock process. In yabai I have tracked down the function that performs a change in focus.
The only requirement for calling this function is to know the CGWindowId of the window you wish to activate regardless of its space or display - no need to go through the accessibility mess at all.
The bad part is that from Mojave and newer versions of macOS SIP must be disabled for code injection to work.

I've done a lot of digging with regards to being able to change window focus in various ways without having to rely on the accessibility API, and without having to rely on Dock.app, because I wanted to implement what is commonly known as autofocus among UNIX communities. I did manage to do this. It turns out that there are certain types of events that can be targeted at applications that alter their conception of what the focused window is.
I partially described how this works in an issue in Hammerspoon: Hammerspoon/hammerspoon#370

This solution does not use code-injection and does not require SIP to be deactivated. It does not inherently require the use of the accessibility API either, which makes me believe this could be tweaked to support the use-case you're trying to achieve.
In case you are feeling adventurous, the relevant function in yabai is: https://github.com/koekeishiya/yabai/blob/master/src/window_manager.c#L809 or https://github.com/koekeishiya/yabai/blob/master/src/window_manager.c#L829

As for getting the thumbnail for a window you may want to check the private function: CGSCaptureWindowsContentsToRectWithOptions
Googling the function name should probably suffice, if not let me know and I'll get you the parameters definition.


How would I visit all spaces? Is there a public API for that, or did you actually mean the user would need to visit the spaces, and somehow the app would listen to the space change event (how to do that?), and cache stuff? Also what is the API for the new window created event?

Unfortunately there is no public API for this. You could trigger a change in visible space using a private API call (see issue referenced below) and this would likely alter the information retrievable by the AX API. I have not tested this, but I can see this being a thing. The Dock, mission-control etc will not detect the space change, and will now be out of sync. The way we achieve space manipulation in yabai is through code-injection which allows us to solve these issues, and keep the system state in sync.
Otherwise, the user would have to visit the spaces manually. You can read the following issue for some in-depth information: https://github.com/koekeishiya/chunkwm/issues/127

You can utilize the NSWorkspace API to get notified when a space change occurs: https://developer.apple.com/documentation/appkit/nsworkspace/1527073-activespacedidchangenotification

To get notified about newly created windows you have to use the accessibility API and subscribe to the desired notification for every application individually through the result of AXUIElementCreateApplication. See documentation for the AXObserver-related functions: https://developer.apple.com/documentation/applicationservices/axuielement_h

I think this is too much for a simple "alt-tab" app.

I completely agree with this assessment.

If I understand the code correctly, this solution relies on the private API SLPSPostEventRecordTo and hardcoded reverse-engineered values. I'm a bit worried about trying this in my app as low-level languages are really not my expertise unfortunately.

That is correct, and I fully understand that it is not trivial to get into at first, but I do think this approach appears to be the most promising way to reach said goal.

A last topic I thought I should ask you about, in case you've dealt with this, is capturing a screenshot of a window that's minimized. It turns out that CGWindowListCreateImage can't do it.
I'm experimenting right now with leveraging the screencapture binary that's shipped with macOS. It's an abstraction layer over the private APIs that is consistent between OS versions. It turns out it can capture minimized windows! I'm struggling a bit with performance at the moment, but looking at Instruments it seems it may be me not using NSImage properly. The capture itself is like 40ms for 15 windows. Certainly more than CGWindowListCreateImage, but I could imagine only using this for minimized windows for instance. What do you think?

I haven't actually tried to solve that problem before either, but after taking a quick peek at the screencapture binary shipped with macOS Mojave 10.14.5, it appears to use the function I mentioned in my previous reply.


I found this particular problem to be quite interesting, so I wanted to give this a test myself. The function I mentioned earlier does indeed allow you to perform a snapshot of a minimized window.

Some sample C code I used to test this (no error checking etc is being performed here, just the minimal amount of code to run a test):

#include <stdbool.h>
#include <ApplicationServices/ApplicationServices.h>

// declaration of external function to retrieve the running process' connection to the WindowServer
extern int CGSMainConnectionID(void);

// declaration of external function for grabbing a snapshot from the WindowServer
extern CGError CGSCaptureWindowsContentsToRectWithOptions(int cid, uint32_t *wid, bool window_only, CGRect rect, uint32_t options, CGImageRef *image);

// Ask the system to return a snapshot of the full rect of the given window
// (window_id is the uint32_t CGWindowID of the window to capture)
CGImageRef image = NULL;
CGSCaptureWindowsContentsToRectWithOptions(CGSMainConnectionID(), &window_id, true, CGRectZero, (1 << 8), &image);

// If we got a valid snapshot we store it to disk as a .png
if (image) {
    CFStringRef string = CFStringCreateWithCString(NULL, "file:///Users/Koe/Documents/test.png", kCFStringEncodingUTF8);
    CFURLRef url = CFURLCreateWithString(NULL, string, NULL);
    CGImageDestinationRef destination = CGImageDestinationCreateWithURL(url, kUTTypePNG, 1, NULL);
    CGImageDestinationAddImage(destination, image, NULL);
    CGImageDestinationFinalize(destination);
    // release the objects we created above
    CFRelease(destination);
    CFRelease(url);
    CFRelease(string);
}

I have not verified which versions of macOS have this function available, but I highly suspect this to work all the way back to macOS El Capitan.

@koekeishiya

@lwouis

Just figured I'd let you know that there are private functions available to retrieve a list of windows per space and so on, without that space being active. The only information the API returns is basically a list of the CGWindowIds. I do not know if they return windows in any particular order, such as recently focused, which is of importance to you in this project. I might investigate this and report back after doing so.

@lwouis
Owner Author

lwouis commented Dec 26, 2019

I would like to share an update. I've been exploring private APIs for the past couple of weeks, and have had very encouraging results. I'm very close to completion.

I will soon release a PR that delivers on the 3 most discussed tickets on this repo: alt-tab'ing to windows from other spaces (#14), minimized windows (#11), and better performance/responsiveness of the app (#45).

Adding spaces support adds a huge scope:

  • What should the windows order be? HyperSwitch does most-recent-first per space, not globally. I think the best UX would be a global most-recent-first list, but that's both probably something that should be a preference, and also probably not easily doable technically. Right now what I have is kind of similar to HyperSwitch (I guess we use the same private APIs)
  • Multi-display support is pretty complex: the use-cases are hard enough already, and then you can check the "Displays have separate Spaces" checkbox in System Preferences and everything changes
  • Lots of corner cases like: what if I press alt-tab during a space transition? What if a space is destroyed or changes midway through focusing? etc
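
The two orderings discussed in the first bullet can be sketched with a pair of comparators. This is only an illustration under assumptions: WindowInfo, its fields and sort_windows are hypothetical, and it presumes the app already knows each window's space and last-focus time, which is the hard part.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical per-window record: which space it is on and when it last had focus.
typedef struct {
    uint32_t window_id;
    uint32_t space_index;   // 0 = first space
    uint64_t last_focused;  // larger = more recent (e.g. an increasing counter)
} WindowInfo;

// Global most-recent-first, ignoring spaces.
static int by_recency(const void *a, const void *b) {
    const WindowInfo *x = a, *y = b;
    if (x->last_focused != y->last_focused)
        return x->last_focused > y->last_focused ? -1 : 1;
    return 0;
}

// Grouped by space, most-recent-first inside each space.
static int by_space_then_recency(const void *a, const void *b) {
    const WindowInfo *x = a, *y = b;
    if (x->space_index != y->space_index)
        return x->space_index < y->space_index ? -1 : 1;
    return by_recency(a, b);
}

// per_space != 0 selects the per-space grouping, 0 the global list.
static void sort_windows(WindowInfo *windows, size_t count, int per_space) {
    assert(windows != NULL);
    qsort(windows, count, sizeof *windows,
          per_space ? by_space_then_recency : by_recency);
}
```

Exposing the choice as a flag mirrors the idea of making the order a preference; a real per-space mode would probably put the current space first rather than sort by index.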

Stay tuned!

lwouis pushed a commit that referenced this issue Dec 27, 2019
Also closes #11 closes #45 closes #62

BREAKING CHANGE: this brings huge changes to core parts of the codebase. It introduces the use of private APIs that hopefully should be compatible with macOS 10.12+, but I couldn't test them. I reviewed the whole codebase to clean it up and improve performance and readability
@lwouis lwouis closed this as completed in 3f5ea25 Dec 27, 2019
lwouis pushed a commit that referenced this issue Dec 27, 2019
# [2.0.0](v1.14.4...v2.0.0) (2019-12-27)

### Features

* display other spaces/minimized windows (closes [#14](#14)) ([3f5ea25](3f5ea25)), closes [#11](#11) [#45](#45) [#62](#62)

### BREAKING CHANGES

* this brings huge changes to core parts of the codebase. It introduces the use of private APIs that hopefully should be compatible with macOS 10.12+, but I couldn't test them. I reviewed the whole codebase to clean it up and improve performance and readability
@lwouis
Owner Author

lwouis commented Dec 27, 2019

This ticket and a bunch of others are closed in v2 released today. Feel free to test that new version out and give feedback here! Hopefully you experience better performance, can interact with minimized windows, and interact with windows from other spaces and displays. Cheers!

@jackbravo

Hidden (cmd-h) windows are not showing up.

@lwouis
Owner Author

lwouis commented Jan 3, 2020

Hidden (cmd-h) windows are not showing up.

@jackbravo i think you're the first one to point that out. I had actually never tested hidden windows before somehow 🙉

Could you please open a ticket for that? Maybe including comparison with alternative apps like HyperSwitch :)

@jackbravo

Sure thing @lwouis. Added #108. It seems like HyperSwitch doesn't provide this functionality either, but the Contexts app does.

@jkelleyrtp


This is such great information! Is there a place where this research has been accumulated into one spot (all the private macOS APIs for doing this window management stuff)? I want to write Rust wrappers for these APIs to build more complex apps on top.

@lwouis
Owner Author

lwouis commented Apr 26, 2022

@jkelleyrtp i've documented some stuff in the file called privateapi.swift i think. Also in various tickets here.

Overall, no: it's all private effort clustered around projects. Contact the maintainers of the top macOS projects on these topics to learn more. We are a small community, with fewer than 10 people really being knowledgeable about this arcane stuff.

@ldenoue

ldenoue commented Jul 8, 2022

There is a new API now in macOS 12.3+ called ScreenCaptureKit that makes capturing high FPS buffers of windows really efficient, see https://developer.apple.com/documentation/screencapturekit/capturing_screen_content_in_macos

I use it in my own app Screegle https://www.appblit.com/screegle and noticed CPU usage is very very low.

@lwouis
Owner Author

lwouis commented Jul 8, 2022

@ldenoue yes it's great for #122 but it doesn't help with the present ticket unfortunately.
