Add selection outline shaders #72

Closed
wants to merge 17 commits into from

gregjohnson2017 (Owner) commented Sep 8, 2020

This PR brings back the dotted outline around arbitrary selections. As groundwork, the "BufferArray" object was rewritten to focus exclusively on VAOs (Vertex Array Objects), and a more general "BufferObject" was added that can wrap any OpenGL buffer object. This matters because, when generating the collection of selected-texel coordinates, the compute shaders write vertex information into an SSBO (Shader Storage Buffer Object). The underlying buffer is no different from any other buffer object, so the VAO needs to be able to switch from the VBO (Vertex Buffer Object) it uses by default to the SSBO, and treat the SSBO's contents as vertex information.

As for the main feature of this PR, selection outlines: the first major step was a geometry shader that converts the coordinate of a selected texel into the line vertices where its outline should be drawn. This was a great improvement, but what remained (99% of the slowness) was the way those texel coordinates were emitted into the rendering pipeline in the first place. We first wrote a parallel CPU algorithm that operates on a selection texture (holding binary information on whether each texel is selected) in 3 steps:

  1. Each worker scans its own chunk of the selection texture and counts the selected texels it sees, storing the count in an array at the index matching its worker ID.
  2. A prefix sum (with a slight modification) is performed over this array to determine the index at which each worker should start writing in the final step.
  3. Each worker scans its chunk again, recording the (X, Y) coordinates of the selected texels into the final answer array, starting at the index computed in the previous step.
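The three steps above can be sketched on the CPU roughly as follows. This is only an illustration and not the code from this PR: the `Point` type, the `collectSelected` function, and the flat row-major `[]bool` mask are assumptions made for the example, and the "slight modification" to the prefix sum is taken here to mean an exclusive scan.

```go
package main

import "sync"

// Point is a hypothetical texel coordinate type used only in this sketch.
type Point struct{ X, Y int32 }

// collectSelected returns the coordinates of every selected texel in a
// row-major mask of the given width, splitting the work across goroutines.
func collectSelected(mask []bool, width, workers int) []Point {
	if workers < 1 {
		workers = 1
	}
	chunk := (len(mask) + workers - 1) / workers
	// bounds returns the half-open index range owned by worker w.
	bounds := func(w int) (int, int) {
		lo := w * chunk
		if lo > len(mask) {
			lo = len(mask)
		}
		hi := lo + chunk
		if hi > len(mask) {
			hi = len(mask)
		}
		return lo, hi
	}

	// Step 1: each worker counts the selected texels in its chunk.
	counts := make([]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			lo, hi := bounds(w)
			for i := lo; i < hi; i++ {
				if mask[i] {
					counts[w]++
				}
			}
		}(w)
	}
	wg.Wait()

	// Step 2: an exclusive prefix sum turns counts into start offsets.
	starts := make([]int, workers)
	total := 0
	for w, n := range counts {
		starts[w] = total
		total += n
	}

	// Step 3: each worker writes its coordinates at its own offset.
	out := make([]Point, total)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			lo, hi := bounds(w)
			at := starts[w]
			for i := lo; i < hi; i++ {
				if mask[i] {
					out[at] = Point{X: int32(i % width), Y: int32(i / width)}
					at++
				}
			}
		}(w)
	}
	wg.Wait()
	return out
}
```

Because every worker writes only to its own slot in `counts` and its own disjoint range of `out`, no locking is needed beyond waiting for each phase to finish.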

This implementation was very good compared to the old sequential approach, where a single thread looped over the entire set of selections and populated a fixed array to send to the rendering pipeline. In our testing, we saw a 105%-107% speedup in extreme cases. The only problem with this approach is that it doesn't take advantage of the GPU: all of these calculations rely on CPU parallelism, which is very limited.

So, mainly to grow our OpenGL knowledge, we implemented the same algorithm in compute shaders. The one issue was that the prefix sum step was incredibly slow when run sequentially on a single worker, so I had to implement a rather messy parallel prefix sum algorithm in another compute shader. (I don't know how to do it better within the limitations of GLSL; it left all of the shaders with 2 copies of essentially the same thing.) The result is a lightning-fast algorithm that performs similarly to the original parallel CPU solution in small cases, but can be up to 10x faster in extreme cases. The run time depends heavily on how chunkSize is set. Right now it is hard-coded at 256, but this could be improved by intelligently choosing a value for a given image.
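For readers unfamiliar with parallel prefix sums, here is a minimal Go sketch of one well-known variant, the Hillis-Steele scan. It is not necessarily the algorithm used in this PR's compute shader; the `inclusiveScan` name is made up for the example, and one goroutine per element stands in for one shader invocation.

```go
package main

import "sync"

// inclusiveScan returns the inclusive prefix sum of in using the
// Hillis-Steele scheme: in each pass, element i adds the value that sits
// `stride` positions to its left, and the stride doubles between passes.
func inclusiveScan(in []uint32) []uint32 {
	src := append([]uint32(nil), in...) // copy so the input is not modified
	dst := make([]uint32, len(in))
	for stride := 1; stride < len(src); stride *= 2 {
		var wg sync.WaitGroup
		for i := range src {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				if i >= stride {
					dst[i] = src[i] + src[i-stride]
				} else {
					dst[i] = src[i]
				}
			}(i)
		}
		// Barrier: every element of this pass must be written before the
		// buffers are swapped for the next pass.
		wg.Wait()
		src, dst = dst, src
	}
	return src
}
```

The scan needs about log2(n) passes instead of n sequential additions, at the cost of more total work and a second buffer. Subtracting each input element from its scanned value gives the exclusive sums, which are the start offsets the workers need.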

Closes #10
Closes #11
Closes #44
Closes #46

kroppt (Collaborator) left a comment

I did run into an issue where the program stalls when you select a pixel in certain images. The image in question is 7680x4320.

if (i == wid * chunkSize) {
chunkIndices[wid] = 0;
next[wid] = 0;
}

The statements in this condition can be moved before the loop.

}

// Render draws the ui.Component
-func (l Layer) Render(view sdl.FRect) {
+func (l Layer) Render(view sdl.FRect, program gfx.Program) {

We should use pointer receivers so that the receiver type is the same for all of the type's methods.
This method and Destroy still have value receivers.
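A small sketch of why the receiver type matters. The `Layer` fields and the `renderByValue` method below are hypothetical and do not reflect the real type in this repository:

```go
package main

// Layer is a stand-in type for this example; its fields are assumptions.
type Layer struct {
	visible bool
	renders int
}

// Render uses a pointer receiver, so the counter on the caller's Layer
// is updated.
func (l *Layer) Render() { l.renders++ }

// Destroy uses a pointer receiver as well, keeping the method set uniform.
func (l *Layer) Destroy() { l.visible = false }

// renderByValue shows the pitfall of a value receiver: it increments a
// copy, so the caller's Layer is left unchanged.
func (l Layer) renderByValue() { l.renders++ }
```

With a mix of receiver kinds, only `*Layer` has the full method set, so a `Layer` value would not satisfy an interface that requires the pointer-receiver methods.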

if iv.selLayer != nil {
p.X -= iv.selLayer.area.X
p.Y -= iv.selLayer.area.Y
return iv.selLayer.SelectTexel(p)

It makes more sense to invert this condition: by checking for the unusual case first, where it's nil, and returning early, we can keep the main logic as far left as possible.
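A sketch of the inverted condition, assuming simplified stand-ins for the real types. `Point`, `Layer`, `ImageView`, the `last` field, and `errNoSelection` are all hypothetical, and what the real code does in the nil case is not shown in this diff:

```go
package main

import "errors"

// Point, Layer and ImageView are simplified stand-ins for this example.
type Point struct{ X, Y int32 }

type Layer struct {
	area Point // top-left corner of the layer
	last Point // last texel selected, in layer-local coordinates
}

// SelectTexel records the texel; the real method's behavior is assumed.
func (l *Layer) SelectTexel(p Point) error {
	l.last = p
	return nil
}

type ImageView struct{ selLayer *Layer }

// errNoSelection is a made-up error for the nil case.
var errNoSelection = errors.New("no selection layer")

// SelectTexel handles the unusual nil case first and returns early, so the
// main logic is not nested inside an if block.
func (iv *ImageView) SelectTexel(p Point) error {
	if iv.selLayer == nil {
		return errNoSelection
	}
	p.X -= iv.selLayer.area.X
	p.Y -= iv.selLayer.area.Y
	return iv.selLayer.SelectTexel(p)
}
```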

kropptrevor closed this Jan 7, 2024