Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for macOS (and other GPUs without hardware ray tracing support) #46

Merged
merged 17 commits into from
Feb 27, 2022

Conversation

expenses
Copy link
Contributor

@expenses expenses commented Feb 3, 2022

Checklist

  • I have read the Contributor Guide
  • I have read and agree to the Code of Conduct
  • I have added a description of my changes and why I'd like them included in the section below

Description of Changes

This contains the changes in my branch mentioned here: #37.

I've marked it as a draft because there are a few things to sort out and because I haven't tested it on a GPU with hardware ray tracing yet.

Related Issues

#37

Copy link
Collaborator

@h3r2tic h3r2tic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty cool! ❤️ I thought it would require way more surgery to get rid of ray-tracing 😅 I've tested the branch, and it works in both modes in my 6800XT! 🥳

Besides the individual comments on the code, my higher-level thoughts on this are that we should probably dynamically check for whether ray tracing should be enabled. It's my fault really for suggesting this approach with the ray-tracing feature. It makes some sense for a purely developer-facing thing, but if anyone does ship a binary made with kajiya, they shouldn't need to ship a variant compiled with and without ray tracing.

Now, we don't strictly need to do that in this PR, as this is a pretty cool first version, but the current checks for the ray-tracing feature are a bit scattered, and apply to things which aren't purely about ray tracing -- that does need a bit of cleanup.

crates/lib/kajiya-backend/src/vulkan/device.rs Outdated Show resolved Hide resolved
crates/lib/kajiya-backend/src/vulkan/device.rs Outdated Show resolved Hide resolved
Comment on lines 254 to 282
let limits = self.pdevice.properties.limits;

let max_dimension = if desc.extent[2] > 1 {
limits
.max_image_dimension1_d
.min(limits.max_image_dimension2_d)
.min(limits.max_image_dimension3_d)
} else if desc.extent[1] > 1 {
limits
.max_image_dimension1_d
.min(limits.max_image_dimension2_d)
} else {
limits.max_image_dimension1_d
};

for i in 0..3 {
let dimension = desc.extent[i];

if dimension > max_dimension {
log::warn!(
"Dimension {} ({}) exceeds max dimension {}. Adjusting to the max. {:?}",
i,
dimension,
max_dimension,
desc
);
desc.extent[i] = max_dimension;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be here 😅 The image creation function cannot just go and create something else, as it will most likely break things up the chain, where code depends on the specific sizes of images. Best case some rendering is glitchy; worst case, we get infinite loops and driver crashes.

Where does this trigger for you? It should probably be higher-level logic that deals with those limits 👀

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, just checked and the problematic volume texture is coming from CSGI because the indirect cascade texture is

[
    VOLUME_DIMS * TOTAL_SUBRAY_COUNT as u32),
    VOLUME_DIMS,
    VOLUME_DIMS,
]

or [3456, 64, 64].

Can this be 1x1x1 as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I believe it can!

crates/lib/kajiya-backend/src/vulkan/shader.rs Outdated Show resolved Hide resolved
@@ -261,6 +261,7 @@ impl CsgiRenderer {
//
// Either is going to leave the GPU quite under-utilized though, and would be a nice fit for async compute.

#[cfg(feature = "ray-tracing")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit messy; ideally the entire csgi renderer should be unused when ray-tracing is disabled. I suspect you're only after the resources that it creates -- and I also suspect it's those resources that are too large on your macbook, so that you ended up clamping texture size.

We can side-step all that by not disabling the whole renderer, and only creating placeholder resources for where it needs to be passed in -- something like a black 1x1x1 volume.

Copy link
Collaborator

@h3r2tic h3r2tic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good cleanup! Found one gotcha, but otherwise works nicely 😀

A few nitpicks too 👀

crates/lib/kajiya-backend/src/vulkan/device.rs Outdated Show resolved Hide resolved
Comment on lines 254 to 282
let limits = self.pdevice.properties.limits;

let max_dimension = if desc.extent[2] > 1 {
limits
.max_image_dimension1_d
.min(limits.max_image_dimension2_d)
.min(limits.max_image_dimension3_d)
} else if desc.extent[1] > 1 {
limits
.max_image_dimension1_d
.min(limits.max_image_dimension2_d)
} else {
limits.max_image_dimension1_d
};

for i in 0..3 {
let dimension = desc.extent[i];

if dimension > max_dimension {
log::warn!(
"Dimension {} ({}) exceeds max dimension {}. Adjusting to the max. {:?}",
i,
dimension,
max_dimension,
desc
);
desc.extent[i] = max_dimension;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀

crates/lib/kajiya/src/renderers/rtdgi.rs Outdated Show resolved Hide resolved
Comment on lines 195 to 212
let mut resolved_tex = rg.create(
gbuffer_depth
.gbuffer
.desc()
.usage(vk::ImageUsageFlags::empty())
.format(vk::Format::R16G16B16A16_SFLOAT),
);

let (temporal_output_tex, history_tex) = self
.temporal_tex
.get_output_and_history(rg, Self::temporal_tex_desc(gbuffer_desc.extent_2d()));

let (mut ray_len_output_tex, ray_len_history_tex) =
self.ray_len_tex.get_output_and_history(
rg,
ImageDesc::new_2d(vk::Format::R16G16_SFLOAT, gbuffer_desc.extent_2d())
.usage(vk::ImageUsageFlags::SAMPLED | vk::ImageUsageFlags::STORAGE),
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically those could also be R8G8B8A8_UNORM, but it would also be possible to plug in SSR here, so not a big deal.

crates/lib/kajiya/src/world_renderer.rs Show resolved Hide resolved
crates/lib/kajiya-backend/src/vulkan/device.rs Outdated Show resolved Hide resolved
@expenses expenses marked this pull request as ready for review February 11, 2022 17:31
Copy link
Collaborator

@h3r2tic h3r2tic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, I was too deep in the trenches of digital color management for a bit 🙈

This looks great, and works great! ❤️ Tested on AMD and NV, also the DLSS branch, and everything runs fine. Nice work 😀

@h3r2tic h3r2tic merged commit 79b7a74 into EmbarkStudios:main Feb 27, 2022
@virtualritz
Copy link

Shouldn't the README be updated re. supported/required GPUs to run this?

@h3r2tic
Copy link
Collaborator

h3r2tic commented May 21, 2022

Yup, will be there in the upcoming GI branch: #57

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants