Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Magnum batch renderer prototype #1798

Merged
merged 11 commits into from
Jul 20, 2022
Merged

Magnum batch renderer prototype #1798

merged 11 commits into from
Jul 20, 2022

Conversation

mosra
Copy link
Collaborator

@mosra mosra commented Jun 30, 2022

Motivation and Context

Halfway through implementing multi-mesh and multi-file support for batch renderer I realized that having this on a WIP branch of a stale fork of a fork of this repository is starting to look more like a problem than a solution for a problem. So, in-line with the proposed integration path we discussed with @eundersander, I'm opening the first PR.

What this does

This PR integrates the core implementation of a Magnum-based batched renderer as a fully tested and maintained part of the main repository. At the moment it's not used by anything in the code but instead, to avoid code rot, its behavior is thoroughly tested, and so will be all future additions to it. Meaning, given the tests pass, my intent is to get it merged to main so #1745 as well as the Galakhtic project can depend on it.

The code itself is also only about 500 lines (the test being another 500), so it's not like the "dead code" should cause significant unwanted overhead. Majority of the actual work is done inside Magnum itself.

The feature set is currently as follows:

  • A narrow BatchRenderer interface that doesn't leak any GPU-API-specific functionality. Meaning it can have the full freedom in optimizing draws however it sees fit. As well as be implemented in either GL or Vulkan without the client-side code having to care.
  • A standalone BatchRendererStandalone implementation that manages also GL context and provides CUDA interop.
  • Designed with multi-draw and multi-scene behavior in mind. In particular, given a batch-optimized input file, it's capable of drawing a grid of dynamic scenes with a single draw call per scene. On the other hand, it's still possible to perform drawing of just a single scene mesh-by-mesh, but it won't be any faster than the current renderer.
  • Web-compatible renderer implementation. While I could design it with only advanced NVidia-only (Vulkan) features in mind, I felt that being compatible with VR devices and the web is an important property. Thus the multi-draw is implemented in a way that can run on WebGL as well.
  • Ability to configure a grid of scenes, with each having an independently controlled camera and a set of objects added.
  • Adding (trees of) named objects from a batch-optimized input file.
  • Flat-shaded triangle meshes, color and depth output.

The test file contains code that builds a batch-optimized glTF from scratch. The GltfSceneConverter plugin is currently in a WIP branch of Magnum, and thus the code is disabled if not building against said branch. Instead, to be still able to actually test the renderer, the test glTF file is bundled in the repository.

What this doesn't have

My goal here is to submit reasonably large chunks of code, to gather feedback on the overall direction and whether it all makes sense together. Thus, to not overwhelm, this is just the base implementation first. Other features will be in followup PRs:

  • Ability to remove arbitrary objects. At the moment it's only possible to clear a particular scene in the grid as a whole.
  • Ability to render multiple meshes from multiple files. At the moment it's only a single input file and a single draw call per scene.
  • Ability to render wireframe meshes (for debug visualization).
  • Ability to have a secondary "debug camera", possibly to a differently sized framebuffer.
  • There's no documentation, because right now it would be just a long list of things it cannot do. It will appear once the above planned functionality is in.

What's currently only present in the Galakhtic fork is the following:

  • A "script" that converts a large bundle of datasets into the optimized glTF. Because it again depends on a Magnum branch, is quite use-case-specific and relies on hardcoded set of datasets, I don't think it should go here.
  • A demo / "stress test" application that uses the batch-optimized glTF created by the above script and populates a grid of 192 scenes.

I expect that the Galakhtic fork eventually synchronizes with Habitat's main, and will contain optional "extra" functionality on top for data conversion and optimization, relying on more bleeding-edge Magnum features. I.e., vanilla Habitat will be fully capable of rendering batch-optimized glTFs, but to produce them one will need to use a different branch.

How Has This Been Tested

There's a test file, and look at the pretty images!

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Jun 30, 2022
@mosra mosra force-pushed the magnum-batch-renderer branch 6 times, most recently from 7c465b0 to 06088f4 Compare June 30, 2022 19:32
@mosra
Copy link
Collaborator Author

mosra commented Jun 30, 2022

Everything seems to build (even for WebGL, yay!) and the test passes including CUDA interop.

However on the CI I get a segfault on exit. Doesn't happen locally with my RTX eGPU nor with Mesa & Intel/AMD. Will try investigating with ASan tomorrow, hopefully I didn't hit any cursed driver path :] It was me being stupid, forgot to call cudaGraphicsUnregisterResource() and looks like the 470.* drivers the CI has didn't like it, while on 510.* it was no problem.

Copy link
Contributor

@eundersander eundersander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments. Looking good. I'm curious if/how we can keep this code Magnum-only (free of Habitat-specific stuff). See my comment below.

const Magnum::Vector2i& tileCount);

struct State;
Corrade::Containers::Pointer<State> state;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this paradigm? You want to keep the state members out of the header to reduce compile dependencies and build times?

Copy link
Collaborator Author

@mosra mosra Jul 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, reducing build times. Here in case of the configuration class it's possibly taken a bit too far and wouldn't need to be there as the state is tiny, but I have this so ingrained in my muscle memory that I used it everywhere.

For the BatchRenderer classes I definitely want to keep it this way because otherwise the header gets way too heavy for the minimal interface it exposes, especially when the alternative Vulkan backend gets added. For the BatchRendererConfiguration I can remove that if you feel it's silly -- let me know.

(Related: it's been a while since I worked with the Habitat codebase last time, and I was surprised by how long everything now takes to build. I may open a few PRs soon to clean some of that up, because it shouldn't be that an incremental build of the gfx library takes more time than rebuilding all of Magnum. I'm not used to so much waiting 😅)

// scene name for the whole scene??
// TODO document this may not return consecutive IDs in case the added name
// is multiple meshes together
std::size_t add(Magnum::UnsignedInt sceneId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add is maybe too concise? This is for adding a "render object instance" or whatever you want to call that. One day we may also have addLight, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes sense, especially since the API is not really typed to differentiate between lights, meshes or else.

I took this originally from the new SceneConverter APIs, where it's also just add(), be it a mesh, an image, a material or a scene. You can see that in the generateTestData() test case. I'm still a bit on the fence about it, whether that will bite back in the future, but there it's at least clearly distinguished by the argument types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to addMeshHierarchy().

const Magnum::Matrix4& transformation);
void clear(Magnum::UnsignedInt sceneId);

Magnum::Matrix4& camera(Magnum::UnsignedInt sceneId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, the user can modify this any time, even immediately after calling draw? It's not like the driver would be in the middle of copying this memory to GPU after calling draw, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, this needs to be documented well.

The draw() function does a synchronous copy of all the data to the driver (and then the GL driver probably does an async copy to the GPU or whatever else it needs to do), so when it returns, it's safe to start modifying everything again. The only place that touches data used by the GPU is the colorImage() / cudaColorBufferDevicePointer() APIs, but those implicitly stall until the GPU has finished rendering, so it's also safe.


renderer.draw();
Mn::Image2D color = renderer.colorImage();
MAGNUM_VERIFY_NO_GL_ERROR();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, do we expect user code to also do this check for GL errors?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No :) Error checking is expensive, potentially causing driver stalls, and if the code causes GL errors, it's broken.

But I customarily do this in all test code because it helps me catch errors easier. In this case this check uncovered a NVidia-specific bug causing wrong camera matrix to be used, which I subsequently fixed in faad5b2.

# GltfSceneConverter and KtxImageConverter needed by BatchRendererTest,
# GltfSceneConverter is not in magnum-plugins master yet
#set(MAGNUM_WITH_GLTFSCENECONVERTER ON CACHE BOOL "" FORCE)
set(MAGNUM_WITH_KTXIMAGECONVERTER ON CACHE BOOL "" FORCE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't expect Habitat users to be doing KTX compression, except in offline tools. (I know we also imagine a load-time preprocessing use case that could invoke this, but I hope to discourage users from this mode of operation.)

Any concern about including this by default in the Habitat runtime, in terms of binary size bloat or whatever? An alternative would be a build switch that is off by default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, actually, the reason I put it here is because find_package(MagnumPlugins OPTIONAL_COMPONENTS KtxImageConverter) causes a CMake error otherwise 🔥 Which I have to fix.

But this particular plugin (unlike e.g. Basis, or OpenEXR) is a tiny self-contained 400-line code with no external dependencies that builds fast, so it felt like a good enough tradeoff until I fix the CMake modules.

Will add a TODO there, at least.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO added.


/* std::hash specialization to be able to use Corrade::String in unordered_map
*/
// TODO put directly into Corrade, it's duplicated ON SEVEN PLACES ALREADY!!!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now implemented directly in Corrade, will remove it from here in the next batch of updates.

// This source code is licensed under the MIT license found in the
// LICENSE file in the root directory of this source tree.

#include "BatchRenderer.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this class doesn't depend on any Habitat code. I'd like to keep it that way.

  1. Have you considered putting this in Magnum instead of here?
  2. If not 1, what about some code organization that discourages including Habitat code here? E.g. a new CMake library (instead of gfx) with a name that conveys "just Magnum code" and doesn't link against Habitat libs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. No, the use case is too narrow and restricted to be useful for any other Magnum users except Habitat itself. Plus, as it's quite high-level compared to for example the multi-draw-enabled shaders, having it here means faster iteration times and easier experiments and perf tuning than the complex cycle of implementing, testing, documenting and upstreaming a feature in Magnum and then using it here. That said, what's here will tend to get smaller over time, as I integrate the lower-level bits of it into Magnum -- such as hierarchic transformation calculation, builtin mesh view support and such.
  2. I wouldn't discourage that, of course in moderation, not like pulling the whole Habitat JSON config parsing in here :) But -- as we discussed several times before -- I can imagine the Habitat "client" and the renderer sharing certain data such as bounding volume annotations for both culling and physics or debug visualizers, and then this would pull in common helpers to retrieve such data consistently on both sides. Such data could also be duplicated in Habitat-specific structures, but to me that feels like wasteful extra work if it would only be to have the renderer not rely on any Habitat code.

Thinking further, I intend to make this batch renderer replace / take over everything that's in the gfx library, and then it'll be as you say -- a standalone "just Magnum code" library that shares very little with the rest of Habitat.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm specifically trying to avoid/discourage: some hacky Habitat-specific renderer "features" like toggling lights, debug-drawing, etc. We want this to be reusable with other simulators in the future. I guess we can leave this here in gfx and then just try to be vigilant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both toggling lights and debug drawing is a general-enough feature that would cause no problem here. If it's made in a way that doesn't negatively affect perf in scenarios that don't need them, of course.

What I would see highly problematic is dragging Habitat config JSON parsing into here, for example. But as long as I'm in charge of this code, that won't happen ;)

Copy link
Contributor

@eundersander eundersander Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree about no Habitat config JSON parsing here.

But poking this a bit more with a concrete example: Imagine we call LightSetup.h getDefaultLights in this file. LightSetup.h includes SceneGraph.h, which includes esp/sensor/VisualSensor.h and SceneNode.h. At this point, all this Habitat code is now hopelessly intertwined and it would become hard to use this batch renderer with another simulator (I'm thinking upcoming batch Mujoco specifically; I recently discussed this with the Google team maintaining Mujoco in an email thread.).

So how should all this actually work? LightSetup or other Habitat code should call into this file, e.g. calling BatchRenderer::addLight.

This is my motivation to separate this from gfx lib: to cut off access to those headers and thus force this one-directional dependency relationship.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I see what you mean ... especially given the current state of "everything includes everything else" 😅

Yep, making it a separate library would cut off the access, at least until someone needs to do a permanent temporary hack that links everything else to it again. But then a cunterexample -- wouldn't it make sense to eventually use Habitat's own logging routines in here? For Python-aware assertions, filtering and all that. Which is in the core library, which however unfortunately somehow pulls in the JSON headers as well, as far as I can see, and we're back at the start :/ I'm afraid the only viable long-term solution is to start enforcing header/dependency hygiene across the whole project...

But you have a point, and this is an opportunity to make the dependencies clean from the start. Unlike the rest of gfx I don't need any physics or Eigen integration here, so should I put this into a new batched_gfx library instead? Ideas for a fancier / shorter name?

Copy link
Contributor

@eundersander eundersander Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then a counterexample -- wouldn't it make sense to eventually use Habitat's own logging routines in here?

Agree this is tough. When we get to that point (adding logging), we can either (1) add the dependency on core and hope for the best, (2) do non-Habitat logging in here (to keep this truly isolated), or (3) clean up and separate core into more reasonable pieces, and only depend on the minimal set of pieces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least until someone needs to do a permanent temporary hack that links everything else to it again

Haha, exactly. We should have a giant scary comment in the CMake file about not adding dependencies. Maybe there's even a clever unit test that can be written (out of scope for this PR).

// is multiple meshes together
std::size_t add(Magnum::UnsignedInt sceneId,
Corrade::Containers::StringView name);
std::size_t add(Magnum::UnsignedInt sceneId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've also used the term "node template", i.e. the name you give here is the name of a node in the composite gltf, and it may have a hierarchy with multiple meshes. What's the status here? Does this return an integer representing the entire instantiated node? And we use that integer to index into transformations below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what this does. (The TODO comments may be confusing, there I was just noting down ideas for how to solve naming when multiple files get involved.)

As a proof, look at the meshHierarchy() test case -- it corresponds to this portion of the glTF file, and this is the rendered output:

BatchRendererTestMeshHierarchy.png

What this doesn't currently handle is deeper hierarchies, as I felt that was non-essential if you have full control over the input data and don't need to modify transformations inside the hierarchy. But I expect this to become a use case when generic input files are allowed, and it's prepared for that.

@eundersander
Copy link
Contributor

While I could design it with only advanced NVidia-only (Vulkan) features in mind, I felt that being compatible with VR devices and the web is an important property. Thus the multi-draw is implemented in a way that can run on WebGL as well.

Respectfully disagree! It seems to me our best chance at making this simple and ultra-fast is to not over-generalize to other platforms. We don't need a batch renderer for VR or web.

@mosra
Copy link
Collaborator Author

mosra commented Jul 1, 2022

It seems to me our best chance at making this simple and ultra-fast is to not over-generalize to other platforms.

Don't worry, I picked the right tradeoffs ;) As far as I know, texture arrays are never slower than descriptor arrays (or however the feature is called, to be able to pick among a divergent set of textures in the shader), and it only puts certain requirements on input data. Which is a solved problem now (including the texture repeat we talked about, that'll be done by the shader), and is done in a preprocessing step that has to be done anyway to make the meshes fast to draw.

That's the core of the renderer design and the optimized data layout, which I want to have as compatible as possible. But on top of that there will be functionality like compute-based culling, GPU self-feeding, virtual memory, mesh shaders and whatnot, which will be opt-in on platforms that can do it. Platforms that can't will use just the base and thus will be slower.

We don't need a batch renderer for VR or web.

We do -- I don't want to keep maintaining the classic renderer once this gets feature parity with it, I really don't. Ending up in a state where we finally have a replacement renderer but can't use it on certain platforms because its core design didn't count on that would be bad for the project. That said, I'm not really "wasting time" explicitly testing the batch renderer for WebGL or anything, I only make sure to design in a way that doesn't make it impossible to run there.

@mosra
Copy link
Collaborator Author

mosra commented Jul 15, 2022

The remaining minor TODOs should be fixed now. @eundersander @aclegg3 let me know if there's anything else :)

Copy link
Contributor

@aclegg3 aclegg3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good WIP for this feature. Thanks for the clear comments and separation from the rest of the codebase. Nicely parceled.

Copy link
Contributor

@Skylion007 Skylion007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

target_link_libraries(gfx_batch PUBLIC ${CUDART_LIBRARY})
endif()

target_link_libraries(
Copy link
Contributor

@eundersander eundersander Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this where the naive Habitat programmer would add core and other stuff? Can we add a scary comment here to discourage this? You could even link to the relevant thread in the PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually using core right now, somehow -- see the TODO on line 13 😅

Not sure if it's a good idea to attempt to move the configure.h header out of core as part of this PR, risking breaking too many things and delaying the merge even further. And it also might be a bit controversial change, so I'll open a dedicated PR with just that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note added.)

The library doesn't depend on anything else in Habitat, which makes it
nicely standalone. Eventually there will be dependencies in the other
direction, though.
@mosra
Copy link
Collaborator Author

mosra commented Jul 19, 2022

Renamed, esp::gfx::BatchRenderer -> esp::gfx_batch::Renderer and similarly for the test and test files.

Will wait on the CIs to ensure I didn't forget anything, but apart from that I think it's ready to get merged.

@mosra mosra merged commit 0c896aa into main Jul 20, 2022
@mosra mosra deleted the magnum-batch-renderer branch July 20, 2022 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants