Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce MultiGPU renderer #503

Merged
merged 37 commits into from
Mar 14, 2022
Merged

Introduce MultiGPU renderer #503

merged 37 commits into from
Mar 14, 2022

Conversation

Drakulix
Copy link
Member

@Drakulix Drakulix commented Mar 1, 2022

Best reviewed commit by commit:

45d1ee3

Adds a new trait Offscreen to our renderer module, that describes the feature to create offscreen-framebuffers to render into. (Actually rendering to them is handled by Bind and Renderer as previously.)

05059d3
f25c19a

Adds two implementations of Offscreen (and matching Bind implementations) to our Gles2Renderer.

  • Gles2Texture allow now to render into textures, which is useful to do some post-processing.
  • Gles2Renderbuffer, which is faster than a texture, but cannot be sampled from.

eb0a25c

Renames ImportShm to ImportMem and adds functions to import bitmaps, that don't belong to a wl_buffer.

25f59e1

Introduces a new ExportMem trait that allows copying textures and from bound framebuffers into memory
and implements it for our Gles2Renderer

4bd7f4d

Prequisite for ea9aabd. Adds the ability to convert from egl-images to dmabuf. (We previously only supported the other way around.)

ea9aabd

Introduces a new ExportDma trait, that analogous to ExportMem (and to ImportDma and ImportMem being the opposites) allows to export textures and framebuffers to dmabufs. Again implemented for our Gles2Renderer

ea2a9ac

Changes the DrmNode type to not contain a file descriptor, essentially only describing a drm-device. Can be used across the code base to uniquely identify drm-nodes, but is currently a bit under-used (except for the multigpu module).

This also makes the X11-backend api a little more awkward, but that is meant to switch to rendering via a EGLSurface in the future anyway.

95adeaf

Removes the 'static-constraint on the Renderer provided to Space::render_output by not using trait-objects for custom RenderElements anymore. A new macro is provided to generate a collating enum-type for downstream. This is required, given the MultiRenderer borrows internal Renderers from the GpuManager.

e9493f7
58443b4

Introduce a new multigpu-module, containing the GpuManager, MultiRenderer and some accompanying types.
Also implements its use in anvil's udev-backend.

The idea is, that a GpuManager enumerates available rendering devices for a given Api (only Egl+Gles2 is provided at the moment) and then allows to create MultiRenderers from two gpus represented by DrmNodes. The render-gpu, where compositing will take place and the target-gpu, where the result will reside on. Client buffers will be imported by any source-gpu, if it is available to the used api.

Almost all of the processing required to move stuff between gpus is done transparently in the background and using the new Offscreen- and Export*-traits, meaning this should theoretically just work with a future Vulkan-based renderer.

One exception is the GpuManager::early_import function, that can be used to pro-actively start copy-operations of client buffers on commit as opposed to during compositing, risking running late on the frame.

c10fa70
1ba365d
a64c55e
feecb91
8b4ebb2

A bunch of smaller fixes and additions...

Todo

  • Use GpuManager::early_import in anvil
  • Sometimes windows moved between outputs of different gpus will take a second to show up, probably a caching issue, because performance so far seems to be really smooth for a first attempt. fixed
  • Changelog

Questions

  • @i509VCB Are the new traits realistically compatible with vulkan the way they are or do they need some obvious adjustments?
  • Can we find some more people to test this on different hardware configurations?

@Drakulix Drakulix force-pushed the feature/multigpu branch 8 times, most recently from 60b38b7 to fb4e467 Compare March 3, 2022 18:22
@@ -784,6 +784,89 @@ impl ImportShm for Gles2Renderer {
.map_err(Gles2Error::BufferAccessError)?
}

fn import_memory(&mut self, data: &[u8], size: Size<i32, Buffer>) -> Result<Gles2Texture, Gles2Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be there is quite a bit of repetition, I'm thinking we might:

  • change the import_memory / update_memory to take as argument a &[u8] and a BufferData
  • make import_shm_buffer use import_memory / update_memory

would it make sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly it does, more work for me! :)

}
}

impl std::ops::Add for Transform {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this is extremely subjective, but Add does not feel like the appropriate operator for this to me. I think Mul would make more sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So 90 * 90 = 180? Feels weird to me.
Also Flipped * Flipped = Normal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but that clearly my inner mathematician speaking (seeing composition of rotations as multiplication of complex numbers, and likewise, -1 * -1 = 1 makes sense to me for flipping). But I'm clearly not going to die on that hill. 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I am indecisive about this, do we already have similar cases for composition in the library? Because at least it should be consistent.


// setup the timer
let timer = Timer::new().unwrap();

std::mem::drop(renderer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason for this drop? If so, could we get a comment about the reason?
If this is just a stylistic choice, then don't mind me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because the renderer borrows gpus mutably, so without the drop rustc complains, that it cannot move gpus into the UdevData in the next line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment to explain this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because the renderer borrows gpus mutably, so without the drop rustc complains, that it cannot move gpus into the UdevData in the next line.

Just checked and it compiles just fine for me, so rustc does not complain for me, maybe it'a a certain feature combination? (I'm building the default feature set of anvil)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe some recent changes made this obsolete? I will remove it and see what CI think about it.

@@ -313,7 +314,7 @@ impl X11Handle {
/// Returns the DRM node the X server uses for direct rendering.
///
/// The DRM node may be used to create a [`gbm::Device`] to allocate buffers.
pub fn drm_node(&self) -> Result<DrmNode, X11Error> {
pub fn drm_node(&self) -> Result<(DrmNode, RawFd), X11Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OwnedFd in std is currently unstable. We should probably wrap the fd in something so we can't leak it since the method gives you ownership of the fd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that is also how e.g. IntoRawFd works, ownership is implicit.
But yes, we could probably wrap it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note on that, I want to change the X11-backend soon to use an EGLSurface, which will not require downstream to access the RawFd anymore anyway. I will thus skip wrapping the RawFd in this PR and deal with it in an upcoming one.

Ok(texture)
}

fn update_memory(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this work on an EGLImage or Dmabuf backed texture?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be in the case of the Gles2Renderer (though it likely depends on the EGLImage/Dmabuf),
but the docs of the trait clearly state that:

    /// This function *may* error, if (but not limited to):
    /// - The texture was not created using either [`ImportMem::import_shm_buffer`] or [`ImportMem::import_memory`].
    /// ...

@Drakulix Drakulix force-pushed the feature/multigpu branch 4 times, most recently from dadde8a to 1dffaa6 Compare March 14, 2022 15:04
@Drakulix Drakulix merged commit 1151eea into master Mar 14, 2022
@Drakulix Drakulix deleted the feature/multigpu branch June 23, 2022 19:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants