WebGLRenderer: Async Readback API - WIP #24466
Signed-off-by: Guilherme <guilherme@maply.io>
I am happy to help champion compute and readPixels APIs if we can find a reasonable design to work with (related: #14503, #21934). I have experimented with transform feedback (example) as a mechanism for compute in WebGL 2, but it's vertex-shader only. Providing async readPixels APIs via fenceSync seems more reasonable in the near term for GPGPU pixel shaders and as an intermediary step for transform feedback should that be realized.
I agree, that's my primary understanding and goal as well, and I have a few ideas...
return new Promise( ( resolveProbe, rejectProbe ) => {
    function probe() {
        switch ( gl.clientWaitSync( sync, gl.SYNC_FLUSH_COMMANDS_BIT, 0 ) ) {
            case gl.WAIT_FAILED:
                rejectProbe(); break;
            case gl.TIMEOUT_EXPIRED:
                setTimeout( probe, interval ); break;
            default:
                resolveProbe();
        }
    }
    setTimeout( probe );
});
We could explore using microtasks instead of blank
In the past we shied away from implementing a lot of these features, because they were judged to be in the realm of what users should implement. However, WebGL2 offers many of these from the underlying implementation itself. So in my eyes, we can use this opportunity to introduce how these concepts work on a more niche feature, and let it naturally get mixed with core components. This is my current view of how we should path these changes. However, if @mrdoob and others would prefer a more immediate PR to get this into production, we could just obfuscate the
renderer.readPixel( computeSampler, typedarray );
renderer.readPixelsAsync( computeSampler, {
    sync: true, // should enqueue a fence
    interval: 10, // ms - debounce
    readback: typedarray, // if undefined: fetch only, no copy
} );
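The polling snippet above covers only the wait on the fence. A minimal end-to-end sketch of a fence-based readback might look like the following. It assumes `gl` is a WebGL 2 context with the source framebuffer already bound; the name `readPixelsAsync` and its parameter list are hypothetical and are not the API proposed in this PR.

```javascript
// Hypothetical helper, not part of three.js.
// Assumes `gl` is a WebGL2RenderingContext and the framebuffer to read is bound.
function readPixelsAsync( gl, x, y, width, height, format, type, target, interval = 4 ) {

	// 1. Queue the read into a pixel pack buffer; this does not stall the CPU.
	const buffer = gl.createBuffer();
	gl.bindBuffer( gl.PIXEL_PACK_BUFFER, buffer );
	gl.bufferData( gl.PIXEL_PACK_BUFFER, target.byteLength, gl.STREAM_READ );
	gl.readPixels( x, y, width, height, format, type, 0 );
	gl.bindBuffer( gl.PIXEL_PACK_BUFFER, null );

	// 2. Insert a fence after the read and flush so the commands get submitted.
	const sync = gl.fenceSync( gl.SYNC_GPU_COMMANDS_COMPLETE, 0 );
	gl.flush();

	// 3. Poll the fence with a zero timeout, then copy the data into `target`.
	return new Promise( ( resolve, reject ) => {

		function cleanup() {
			gl.deleteSync( sync );
			gl.deleteBuffer( buffer );
		}

		function probe() {
			const status = gl.clientWaitSync( sync, 0, 0 );
			if ( status === gl.WAIT_FAILED ) {
				cleanup();
				reject( new Error( 'clientWaitSync failed' ) );
			} else if ( status === gl.TIMEOUT_EXPIRED ) {
				setTimeout( probe, interval ); // not ready yet, try again later
			} else {
				gl.bindBuffer( gl.PIXEL_PACK_BUFFER, buffer );
				gl.getBufferSubData( gl.PIXEL_PACK_BUFFER, 0, target );
				gl.bindBuffer( gl.PIXEL_PACK_BUFFER, null );
				cleanup();
				resolve( target );
			}
		}

		setTimeout( probe, interval );

	} );

}
```

The buffer is created and deleted per call to keep the sketch short; a real implementation would likely reuse buffers across frames.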
I know the full-fledged async pipeline scares some people, but it's honestly not too bad for the end-user. If you are on a timed running loop ( main-thread, with requestAnimationFrame ), you simply attach/register a task with the
This means we need to handle it through instantiation parameters, the presets and task priority-queue pop.
// init-time
if ( renderer.capabilities.isWebGL2 ) {
    asyncQueue = new THREE.AsyncQueue();
    computeSampler = new THREE.WebGLSampler( renderTarget, { bounds: new THREE.Box2( /**/ ) } );
    transferSampler = new THREE.WebGLSampler( renderTarget, { bounds: new THREE.Box2( /**/ ) } );
}
// loop time
renderer.render( scene, camera );
asyncQueue.readPixels( computeSampler,
    typedarray, { // [==] undefined - no sync/copy, only flush gpu-queue
        yieldable: false, // [?Boolean/Number] blocks main-thread on callback
        debounce: 6, // [?Number] milliseconds - sync probe interval
    } ).then( () => { /* . . . */ } );
// and/or spread tasks
asyncQueue.fetchPixels( computeSampler, {
    debounce: 6,
    // what should be the default debounce strategy? very application dependent.
    // provide regular schemes / prediction implementations for ease of use.
    // the world is your oyster on this optional
} ).then( () => { /* . . . */ } );
asyncQueue.copyPixels( transferSampler, typedarray, {
    yieldable: 4, // [?Number] - sub-task yield/generator
    // If it predicts an insufficient time-window ( less than .yieldable ),
    // copy only partial data to typedarray, hold sampler priority on queue for next frame
} ).then( () => { /* . . . */ } );
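The `yieldable` idea above (copy only part of the data when the remaining time window is too small, then resume next frame) can be illustrated with a generator. This is only a sketch under assumptions: `copyInChunks` and `runWithBudget` are hypothetical names, and the copy runs between two typed arrays rather than from a GPU buffer.

```javascript
// Hypothetical sketch: none of these names exist in three.js.
// Copies `source` into `target` chunk by chunk, yielding after each chunk so
// the caller can stop when the frame budget is spent and resume later.
function* copyInChunks( source, target, chunkLength ) {

	for ( let offset = 0; offset < source.length; offset += chunkLength ) {

		target.set( source.subarray( offset, offset + chunkLength ), offset );
		yield Math.min( offset + chunkLength, source.length ); // elements copied so far

	}

}

// Advances `task` until it finishes or `budgetMs` has elapsed.
// Returns true when the task is done, false when it should stay queued.
function runWithBudget( task, budgetMs, now = () => performance.now() ) {

	const deadline = now() + budgetMs;

	do {

		if ( task.next().done ) return true;

	} while ( now() < deadline );

	return false;

}

// Usage: keep the task around and give it a few milliseconds per frame.
const source = new Float32Array( 1024 ).fill( 1 );
const target = new Float32Array( 1024 );
const task = copyInChunks( source, target, 256 );
const finished = runWithBudget( task, 4 ); // false means: run it again next frame
```

A queue built on this would keep unfinished tasks at the front so a partially copied sampler retains its priority on the following frame.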
If the proposed path is alright, starting with an initial PR implementing
Updated Example - debounced & dispatch frequency, mobile-friendly.
Signed-off-by: Guilherme ( @ScieOrg )
Would a fallback to setTimeout work instead? How much extra code would that add?
Not much, I've seen similar in react's scheduler, etc.
if ( typeof window.queueMicrotask !== 'function' ) {
    window.queueMicrotask = function ( callback ) {
        Promise.resolve()
            .then( callback )
            .catch( error => setTimeout( () => { throw error } ) );
    };
}
I think a fallback could work, but it can't offer the exact same functionality. There are ever-so-slight differences in the expected behavior. But yeah, I would need to refresh my
What can never work, however, is using this feature with WebGL1, unfortunately.
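One well-defined difference between the two scheduling mechanisms is ordering: microtasks run as soon as the current script finishes, before any timer callback. The sketch below only demonstrates that ordering; `recordSchedulingOrder` is a hypothetical name used for illustration.

```javascript
// Records the order in which synchronous code, a microtask and a timer run.
function recordSchedulingOrder() {

	const order = [];

	return new Promise( ( resolve ) => {

		setTimeout( () => {

			order.push( 'timeout' );
			resolve( order );

		}, 0 );

		Promise.resolve().then( () => order.push( 'microtask' ) );

		order.push( 'sync' );

	} );

}

recordSchedulingOrder().then( ( order ) => console.log( order.join( ' -> ' ) ) );
// logs: sync -> microtask -> timeout
```

A consequence worth keeping in mind: a probe that re-queues itself only through microtasks never returns control to the event loop, so some timer-based or frame-based step is still needed between polls.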
@donmccurdy @Mugen87 - sorry for pinging, but any opinions on these changes? I imagine many user-level websites/portfolios/demos would benefit from the use of asynchronous gpu-calls. I'm really not strong-minded about any of the suggestions, just wanted to get some traction to get a solution path going. This is really, really important to enable efficient out-of-the-box Three.js gpgpu and high-performance applications. I have a Frankenstein solution that works for my projects, but it may not work for others. Fairly certain @gkjohnson ( also sorry for pinging, btw 😄 ) / three-gpu-pathtracer would also be able to use this and similar
I'm very supportive of this feature and in general think we should be encouraging asynchronous readback APIs as much as possible. I'd argue we might even strongly consider replacing the synchronous API with an async one entirely. While maybe convenient, reading back pixels before they're ready will always result in unnecessary performance stalls. So it would be nice to see this kind of feature merged at some point for GPGPU work. There are some nice uses for raycasting and likely three-mesh-bvh raycasting and data generation that would be neat to see. In terms of utility for three-gpu-pathtracer, the only place that pixel readback is happening is for pre-filtering the environment map, which takes ~15-30ms. Not a crazy amount of time, but that is up to two frames of lost parallel work. I don't have the bandwidth at the moment to dig into the code in this PR but I appreciate the work on this!
Thanks for commenting Garrett, appreciate it.
Oh, I thought you were making heavier use of data sync. My mistake then, but still, other aspects of the associated async API (like
That's alright, the code is mostly boilerplate at this point. The main focus is to find the preferred path to implementing gpu readback, and which associated API fits best for the larger audience.
@sciecode is there a reason this was closed? I may pick this back up to get a version of async readback merged for some recent work.
If I recall correctly, I only closed this due to lack of feedback on the API design. I ended up forking the original PR for the project I was working on at the time. I have used this functionality extensively since; it is quite robust and I haven't faced any issues. Back when I proposed the original PR, the WebGPU renderer wasn't a thing. But an async readback API has since been implemented for it in #26326. Repurposing this PR's functionality with a similar interface sounds like a decent plan. Feel free to continue work on it. Hit me up if you have any trouble or require PR review.
* Copy to async implementation from #24466
* probeSync cleanup
* More simplification
* Simplification
* Remove tangential functions
* More simplification
* Convert to thrown errors
* Remove comma
* Update docs, probe frequency
Aims to solve #22779
not meant to be merged in its current state
I have developed a temporary solution that works for any intended usage of asynchronous readback using three.js. This PR is meant to illustrate the performance gains that come with this feature. Alongside this temporary API, this PR also includes two examples of how users might utilize it.
GPU PICKING
This is the most common use of readPixel, where the user simply requests a single asynchronous readback per frame. This use-case and API are pretty much identical to the one we currently use, the only exception being that the method returns a Promise, which is later resolved to inform when the buffer is ready to be used by the CPU.
GPU Picking Example
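Once the promise resolves, the picked pixel still has to be mapped back to an object. A minimal sketch, assuming each pickable object is drawn with a flat color whose RGB channels encode a 24-bit integer id (a common GPU picking scheme, not necessarily what the linked example does); `idToColor` and `colorToId` are hypothetical helper names.

```javascript
// Hypothetical helpers, assuming ids fit in 24 bits and are stored as RGB.

// Splits an integer id into [ r, g, b ] byte values for the picking material.
function idToColor( id ) {

	return [ ( id >> 16 ) & 255, ( id >> 8 ) & 255, id & 255 ];

}

// Rebuilds the id from an RGBA pixel read back from the picking target.
function colorToId( pixel ) {

	return ( pixel[ 0 ] << 16 ) | ( pixel[ 1 ] << 8 ) | pixel[ 2 ];

}

// Usage with any promise-based readback that resolves with a Uint8Array:
// readback.then( ( pixel ) => highlight( colorToId( pixel ) ) );
```

Because the result arrives one or more frames after the request, the id refers to what was under the pointer when the read was issued, not when the promise resolves.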
GPGPU ASYNC-READBACK
Now the more complex use-case, and most interesting - imo. The use of asynchronous readback on GPGPU pipelines, where multiple readPixels calls are made within a frame. This use-case is also handled correctly, however I could not think of a proper API that kept things simple and similar to our current approach. There are many reasons why, but it boils down to the following:
1. glBuffer is not sufficient to handle multiple calls within a frame.
2. readPixels to determine when a fenceSync is needed and when it's not.
3. readPixels calls from the actual read-back procedure getBufferSubData.
4. WebGLRenderer is not the proper place to implement the feature anymore.
With that being said, the proposed API does work and implements exactly the first 3 items:
GPGPU Example
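To make the points above concrete, here is a rough sketch of how several reads in one frame could share a single fence while keeping the `readPixels` calls separate from the `getBufferSubData` step. It is not the API from this PR: `enqueueReadbacks`, `collectReadbacks` and the request fields are hypothetical, and `gl` is assumed to be a WebGL 2 context.

```javascript
// Hypothetical sketch, assuming `gl` is a WebGL2RenderingContext.

// Step 1: queue every read into its own pixel pack buffer, then add ONE fence.
function enqueueReadbacks( gl, requests ) {

	const reads = requests.map( ( { framebuffer, x, y, width, height, format, type, target } ) => {

		const buffer = gl.createBuffer();
		gl.bindFramebuffer( gl.READ_FRAMEBUFFER, framebuffer );
		gl.bindBuffer( gl.PIXEL_PACK_BUFFER, buffer );
		gl.bufferData( gl.PIXEL_PACK_BUFFER, target.byteLength, gl.STREAM_READ );
		gl.readPixels( x, y, width, height, format, type, 0 );
		return { buffer, target };

	} );

	gl.bindBuffer( gl.PIXEL_PACK_BUFFER, null );

	// A single fence after the last read covers all reads queued before it.
	const sync = gl.fenceSync( gl.SYNC_GPU_COMMANDS_COMPLETE, 0 );
	gl.flush();

	return { sync, reads };

}

// Step 2: call only after the fence has signaled; copies the data and cleans up.
function collectReadbacks( gl, batch ) {

	for ( const { buffer, target } of batch.reads ) {

		gl.bindBuffer( gl.PIXEL_PACK_BUFFER, buffer );
		gl.getBufferSubData( gl.PIXEL_PACK_BUFFER, 0, target );
		gl.deleteBuffer( buffer );

	}

	gl.bindBuffer( gl.PIXEL_PACK_BUFFER, null );
	gl.deleteSync( batch.sync );

}
```

Between the two steps, the fence would be polled with `gl.clientWaitSync` as in the probe code earlier in the thread.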
It is pretty obvious how powerful this feature is, so it's likely worth the effort of coming up with a new API. Which I hope to find with the help of the community.
I believe most of my difficulty in integrating this feature into our current API stems from the fact that we try to hide the gl context and most of the WebGL objects from being used directly by the user. In the end, I just exposed a method for creating and disposing of a specialized PIXEL PACK BUFFER buffer. However, I do believe it is possible to hide this by adding another layer of abstraction on top, much like we do with a lot of the other objects.
One reason why I'm emphasizing an API rework is that this specific feature will go hand-in-hand with the upcoming WebGPU compute pipeline. So I deem it worth the extra effort of elaborating it, in order to have an easier path ahead.
Idealizing a specialized component inside WebGLRenderer - let's call it WebGLTransfer - similar to how we approach WebGLUniform, WebGLBindingStates and so on, seems like the best course of action. Most of the modern lower-level graphics implementations, like Vulkan, WebGPU and Metal, have analogous concepts.
I don't have as much free time as I wish I had. So if someone wants to patch according to what is decided, please feel free to do it. I'll gladly review the code and comment on it. If not, I'll slowly work on it