Are you interested in me adding NVIDIA Maxine Video Effects support (+ some other small features)? #2689
-
First of all, thanks for the kind words :) This sounds really interesting. I'd be more than OK with you working this into a full feature set; I just have a couple of questions first:
My personal approach with this would be to have its feature set show up within chaiNNer only if the SDK is detected on the system; that way it's entirely optional and only applies if the SDK exists (similar to how TensorRT support works within our ONNX support today). Anyway, the preferred approach IMO (and feel free to chime in on this @RunDevelopment) would be to make an issue for each individual feature you'd like to add, and we can discuss it on a case-by-case basis.
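Just to illustrate the detection-gating idea, here's a minimal sketch; the MAXINE_SDK_DIR env var and the default install path below are hypothetical placeholders, not anything the SDK is known to define:

```python
# A minimal sketch of SDK-detection gating, in the spirit of how the
# TensorRT/ONNX support works. The MAXINE_SDK_DIR env var and the
# default install path are hypothetical placeholders.
import os
from pathlib import Path
from typing import Optional

def find_maxine_sdk() -> Optional[Path]:
    candidates = [
        os.environ.get("MAXINE_SDK_DIR"),  # hypothetical env var
        r"C:\Program Files\NVIDIA Corporation\NVIDIA Video Effects",
    ]
    for candidate in candidates:
        if candidate and Path(candidate).is_dir():
            return Path(candidate)
    return None

# Maxine nodes would only be registered when this resolves to a path.
MAXINE_SDK = find_maxine_sdk()
```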
So would I ;) I have an idea for how to implement nested/multilevel iterators; it'll just be a lot of complicated work, and I'm not sure I can necessarily pull it off. As for the directory stuff, I have a WIP PR for that (#2519); it just became a little more complicated than I initially thought due to some edge cases, and I haven't gone back to finish it.
I don't think this is technically possible right now (at least without a bit of a hacky workaround), but similar to the iterator stuff, I have a general idea of how I want to implement conditional nodes like this. Lots of ideas, not enough time to act on them, unfortunately.
-
Awesome. I can't promise a timeline, but it's already working functionally for me; now it's a question of proper integration.
So, the SDK has an installer, but I haven't used it before. Without the installer, the SDK is distributed as a collection of binary objects (including the models) plus an open-source GitHub repo containing a bunch of header files, proxy objects providing a dynamic interface to the Maxine DLLs, and some sample application code. I believe it might be possible to distribute Maxine binaries with your application under certain conditions, but I can't say anything official about that; I would need to carefully review the documentation. Regardless, the obvious way to deal with this is as you said: let the user set it up themselves, and have chaiNNer find it and enable the additional nodes if it is detected. I suppose that could work like the external Stable Diffusion feature currently in the app; the PyTorch approach of offering an automatic download probably isn't the best fit here. In the end, what you need to get it working is paths to the binaries so they can be loaded at runtime.

Regarding GPUs, those will need to be detected as well. Not only are these CUDA models, they are TensorRT models and only work on GPUs with Tensor cores, so Turing and above only. You don't have to specify the GPU yourself at runtime, but you do need to make sure you don't run it on an unsupported GPU.

In any case, making individual issues for each feature sounds good; I can start doing that one at a time. I can see how many of these features would be pretty easy to hack in as prototypes, but making them work within the constraints of the environment you have set up here is a lot harder. There are a lot of requirements that have to be satisfied that aren't immediately obvious when you first write a simple implementation.

Incidentally, I am about to write a simple filesystem-based mutex node that I can put around critical sections, to make sure that multiple instances of chaiNNer don't run more than one (or X) critical nodes at the same time. This would let them make better use of resources for a big execution flow broken into pieces. But I can also see how that might be very difficult to integrate in a clean, idiomatic way.
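As a rough illustration of the load-at-runtime part, something like the ctypes sketch below is the shape of it; the DLL name and directory handling here are assumptions on my part, since the real binary names and entry points come from the headers in the GitHub repo:

```python
# A hedged sketch of "paths to the binaries so they can be loaded at
# runtime". The DLL name below is illustrative, not the SDK's real API.
import ctypes
import os
from pathlib import Path

def load_maxine_dll(sdk_dir: Path) -> ctypes.CDLL:
    # Python 3.8+ on Windows: make the SDK's dependent DLLs resolvable
    # before loading the main library.
    os.add_dll_directory(str(sdk_dir))
    return ctypes.CDLL(str(sdk_dir / "NVVideoEffects.dll"))  # name assumed
```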
-
We already use the Python NVML bindings to do this (it's how we auto-enable or disable GPU support in PyTorch, among other things), so we're good on that front.

That mutex feature would actually be so awesome. We technically don't officially support running two instances of chaiNNer at once, but even with just a basic GPU mutex we might be able to do some parallel processing in iterators without needing to run multiple instances. (@RunDevelopment and I have discussed various ways of potentially accomplishing this before, but never got anywhere close to a solution.)

P.S. Speaking of TensorRT, is there any chance you could convince your colleagues to publish Windows wheels for the official TensorRT bindings on PyPI? Right now they only publish the Linux ones there, even though they build Windows ones (which are locked behind a login gate where you have to download the entire SDK).
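For reference, a sketch of what the Tensor-core gate could look like on top of those bindings; the (7, 5) cutoff is an assumption based on Turing reporting CUDA compute capability 7.5:

```python
# Sketch of a Turing-and-above check via the NVML bindings.
import pynvml

def any_gpu_supports_maxine() -> bool:
    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
            if (major, minor) >= (7, 5):  # assumed Turing cutoff
                return True
        return False
    finally:
        pynvml.nvmlShutdown()
```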
-
I'm definitely all in on the mutex, since what I see when I do very large runs is that each node runs sequentially and there is no pipeline parallelism between separate iterations. I imagine that's kind of what you're talking about wanting to add. Is there anything in particular I should watch out for when running two instances at once? Is it just a matter of testing, or is there a known dragon that will destroy me?

When running two instances at the same time, the "correct" way to synchronize the two programs is to use an OS-level named semaphore. The only problem is that I don't think there is a cross-platform implementation of named semaphores in Python; you have to use something like pywin32, unless I'm missing something. I've done it in PowerShell, and it's easy enough to do there, but then you'd be calling out to a shell that is tied to Windows. That's too janky even for me in this case.

But if you wanted to add parallelism to your iterators, that would be easier. The Python multiprocessing library lets you spawn new Python processes, and part of the library's contract is that you can use its locks and semaphores, since it's easy for the library to set those up. If each iteration is handled by a separate multiprocessing process, then you can use those primitives pretty easily.

As for what I think I'm going to start with, I'd like to try this out:

One other question: where is your pip environment located? I wanted to install portalocker, but right now I'm just messing with a non-development installation of the application. If it would be better for me to set up a development environment, I can do that. (That's almost certainly the case.)
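In the meantime, the filesystem-mutex idea would look roughly like this with portalocker; the lock-file location and timeout are placeholders:

```python
# A minimal sketch of a cross-instance filesystem mutex using
# portalocker. The lock-file path and timeout are placeholders.
import tempfile
from pathlib import Path

import portalocker

LOCK_FILE = Path(tempfile.gettempdir()) / "chainner-gpu.lock"

def run_exclusively(fn, *args, **kwargs):
    # Blocks until no other chaiNNer instance holds the lock, then runs
    # the critical node. The lock is released even if fn raises.
    with portalocker.Lock(str(LOCK_FILE), timeout=3600):
        return fn(*args, **kwargs)
```

If I remember right, portalocker also ships a file-backed BoundedSemaphore, which would cover the "no more than X at once" case without hand-rolling a counter.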
-
I'll look into this. I can't remember who exactly it is that handles that stuff, but I think I've run across them before. If I can figure out who they are, I'll put in a good word about it ;-) One thing I've found is that a pretty rich ecosystem of Windows AI tooling has grown up. Conventional wisdom says that all of that work is done on Linux, but everything I do personally at home is done with Windows tools these days.
-
FYI, I didn't forget about this, but life happened for a few months. I'm coming back around to it, and I have a lot of movies I want to modify, so it's about time I looked at it again.
-
First of all, I really love this tool. It is so pleasant to use for the tasks it's suited for. Thanks for all the work on it.
I got really into AI upscaling and denoising DVDs, and was always curious how the various models out there compare to the NVIDIA Maxine SDK, which includes deartifacting, upscaling, and some other features like green-screen substitution. I'm an NVIDIA employee and, while I'm doing this purely for my own benefit and it has nothing to do with my actual job, I've always thought that being able to plug into the Maxine SDK would be a naturally useful capability for this tool.
So, I went ahead and made a simple Python wrapper and integrated it into the PyTorch Load Model/Upscale Image nodes in spectacularly hacky fashion, so it now works fine for my uses. But it's not done properly for real integration into your framework. Are you interested in me working this up into something more professional?
Additionally, I have added lossless encoding support for x265 (via a simple checkbox rather than additional parameters) and an NVENC option (including lossless) for the H.265 encoder in the Save Video node. I also added a start-frame option in the Load Video node to go with the limit option, so you can choose a range of frames within a video instead of just the first X frames.
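For reference, the flag combinations involved look roughly like this; ffmpeg's hevc_nvenc lossless switch has moved between versions, so the exact spelling below is an assumption:

```python
# Illustrative ffmpeg argument builders for the two lossless paths.
# The hevc_nvenc lossless flag differs across ffmpeg builds; "-tune
# lossless" below is an assumption, and older builds used
# "-preset lossless" instead.
def x265_args(lossless):
    args = ["-c:v", "libx265"]
    if lossless:
        args += ["-x265-params", "lossless=1"]
    return args

def hevc_nvenc_args(lossless):
    args = ["-c:v", "hevc_nvenc"]
    if lossless:
        args += ["-tune", "lossless"]
    return args
```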
Personally, I would love to have nested/multilevel iterator support, and also for directories to be compatible with strings for appending/substitution purposes, but I haven't looked into either yet. Finally, something I would really like to add is an input to the switch node that lets you switch automatically based on a parameter.
Let me know your thoughts on any of this. I understand if it doesn't fit in with your goals, though.