Does this actually override CUDA or NVENC sessions? #53
Comments
Do you use ffmpeg? |
To clarify things :) |
No, we don't; we use the Nvidia Video Codec SDK and GRID SDK directly. |
Nice to meet you. Let's say I'm the guy with the debugger and disassembler here. This patch is intended to patch NVENC (and only NVENC). It should work with any NVENC-enabled software, but the test criterion is still ffmpeg, since a lot of software is derived from it. About your concern over the name of the patched library: as I mentioned before, nvcuvid.dll is loaded dynamically by NvEncodeAPI.dll. Your error In order to sort things out, please run a test with 64-bit ffmpeg. You may run 3 simultaneous transcodes by issuing a command like this:
If it fails with the same error, we will know the patch is simply not applied. If it works, we should look for the problem somewhere else. |
I am getting: Invalid loglevel "h264_nvenc". Possible levels are numbers or: Maybe the binaries I downloaded are compiled without nvenc support? |
Yeah, it has no problems with it, it can encode 3 files concurrently. It seems that it uses CUDA functions to pass a CUDA device as input to NVENC. NVENC can be initialized with CUDA or DirectX, and we use a DirectX device, so maybe this patch currently only unlocks encoding sessions initialized with NV_ENC_DEVICE_TYPE_CUDA. EDIT: We can now encode more than 2 concurrent streams after running the ffmpeg code sample. |
It looks weird. Is it possible that multiple versions of nvcuvid.dll co-exist in your dev environment, perhaps installed with some additional SDK package? |
I can confirm. I was able to encode more than 2 instances with our software. If I reboot Windows after that, I can't encode more than 2 again, but if I run the ffmpeg code again, we can once more encode more than 2 instances. |
No, it removes a conditional jump leading to a failure return when one of the subroutines reports active sessions above the limit. You should probably use x64dbg and see which libraries are getting loaded. This debugger has a useful feature to set a breakpoint on each DLL load, including programmatically initiated dynamic loads. I bet a different set of libraries co-exists in the system. |
I will try, although this is not in my skills :) |
I have exactly the same problem. We use NVENC directly with Direct3D. After patching we still get NV_ENC_ERR_OUT_OF_MEMORY for the third session. When analyzing the libraries that are loaded by the executable, we see 'nvEncodeAPI64.dll' getting loaded. Nvcuvid.dll is not loaded by our binary. I can also confirm that running the FFMPEG command above enables 1 extra session for our software. The fourth session still fails with the out of memory error. When I change the ffmpeg command to create 6 outputs, I can use 6 NVENC sessions in our software. |
@jantenhove Hello, is there some way I can reproduce it on a clean Windows machine? Some minimal executable would probably be ideal. |
Thanks for reopening. I will create a simple test program based on the Direct3D sample from the SDK. |
@Snawoot When it fails after creating 2 encoding sessions, you can run the ffmpeg command from #53 (comment) (with c:v instead of v:c). After that, you should be able to create more than 2 encoding sessions until you restart the computer. |
@jantenhove Thank you! I'm going to start looking at it. |
@jantenhove This is a 32-bit binary which uses libraries from %WINDIR%\SysWOW64. Those are the 32-bit versions of the libraries and they are not patched. Speaking of 32-bit apps, I am not going to support them, because a 32-bit patch requires almost the same effort as the 64-bit one, even though it is a legacy platform. Also I can confirm: nvcuvid.dll doesn't get loaded at all in this app. Could you please provide an x64 build of your test app? Maybe it is possible to derive a solution which fits both D3D and CUDA encoding sessions. |
I just made an important discovery. The 32-bit ffmpeg build exhibits exactly the same behavior. It fails to open 3 sessions on a patched system, but after a successful run of the 64-bit version of ffmpeg, it becomes able to open 3 sessions. @jantenhove your x64 test binary will be very helpful for revealing the root of the problem and for telling the CUDA vs D3D difference apart from the 32-bit vs 64-bit one. |
@Snawoot Sorry for uploading the wrong binary. I debugged the x64 version, but created an x86 release build. Anyway, here is an x64 build: https://www.filehosting.org/file/details/784619/nvenc-patch-test.exe |
Here are my results. A journey through about ten levels of call stack leads to D3D and then to Probably this discovery may also help Plex users on Windows, since Plex currently uses dxva2 and MF. But there are two problems:
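For readers who want to check whether a given executable or DLL is a 32-bit or 64-bit build before testing the patch, here is a minimal Python sketch. It only relies on the documented PE layout (the `MZ` header, the PE offset stored at 0x3C, and the COFF `Machine` field). The function names `pe_machine` and `pe_machine_of_file` are made up for this illustration and are not part of any tool mentioned in this thread.

```python
import struct

# COFF "Machine" values for the two architectures discussed in this thread.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(data):
    """Return the architecture recorded in the PE header bytes in `data`."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The 4-byte little-endian field at 0x3C holds the offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte Machine field follows the signature directly.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def pe_machine_of_file(path):
    """Read the start of a file and report its architecture (hypothetical helper)."""
    # Assumption: the PE header sits within the first 4096 bytes, as is typical.
    with open(path, "rb") as f:
        return pe_machine(f.read(4096))
```

Calling `pe_machine_of_file` on a test binary shows which build it is; a 32-bit result means the process loads its libraries from %WINDIR%\SysWOW64, which, as noted above, are not patched.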
It seems to me that it is more practical to bump D3D sessions by bumping CUDA sessions with some sort of minimal binary that opens several sessions, because it is simpler to add a single one-shot executable to autostart than to bother with system protection every time. Also, maintaining one binary patch takes less effort than maintaining two. If someone feels up to implementing such a bumping binary as a standalone application with source code, feel free to make a Pull Request. A parameterized script for FFmpeg will also do. I wonder if ffmpeg has some sort of dummy input which fits best here. |
And here is a minimal ffmpeg script which bumps 10 sessions with nullsrc input and null output: https://gist.github.com/Snawoot/243c53bb52044297f5ceb6125d59dc93 (don't forget to set the actual ffmpeg path in the script). I'll add this with a proper description to the win readme.md, thereby closing this issue. |
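The idea of a parameterized bump script can be sketched in Python as follows. This is not the content of the linked gist, only an illustration under assumptions: one `lavfi` `nullsrc` input limited to one second, and one `-c:v h264_nvenc -f null -` output group per session. The function name `build_bump_command` is hypothetical, and whether all encoders are really open at the same moment depends on the ffmpeg build and driver.

```python
import subprocess


def build_bump_command(ffmpeg_path, sessions):
    """Build an ffmpeg argument list with `sessions` h264_nvenc outputs."""
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    # Single synthetic input: the lavfi "nullsrc" source, read for 1 second.
    cmd = [ffmpeg_path, "-hide_banner", "-f", "lavfi", "-t", "1", "-i", "nullsrc"]
    # Every output group gets its own encoder; the null muxer discards the result.
    for _ in range(sessions):
        cmd += ["-c:v", "h264_nvenc", "-f", "null", "-"]
    return cmd


def run_bump(ffmpeg_path="ffmpeg", sessions=10):
    """Run the bump once and return the ffmpeg exit code."""
    return subprocess.call(build_bump_command(ffmpeg_path, sessions))
```

Calling `run_bump(r"C:\path\to\ffmpeg.exe", 10)` (the path is a placeholder) would mirror the 10-session case described above. Note that the encoder option is `-c:v`; writing `-v:c` makes ffmpeg treat the next argument as a log level, which matches the "Invalid loglevel" error reported earlier in the thread.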
@jantenhove Thank you for your code! |
@Snawoot Thank you for analyzing the problem so quickly. |
@jantenhove Yes, this will be much better than the current trick with ffmpeg, so we'd appreciate such a contribution. |
I have created a 'session bump' program which bumps the sessions for Direct3D by creating a configurable number of CUDA encoding sessions. Code + binary can be found here: https://github.com/jantenhove/NvencSessionLimitBump Anyone willing to test/comment? |
@jantenhove I'm going to set up a clean VM with a Windows 10 installation within a couple of days. I'm planning to check which dependencies are required (if any) and do the whole walkthrough manually. |
@Snawoot In theory you'll only need the Visual Studio 2017 Redistributable (x64) when using the binary. When compiling it yourself, you need the NVIDIA Video Codec SDK + CUDA SDK installed. I've created a small readme: https://github.com/jantenhove/NvencSessionLimitBump/blob/master/readme.md |
Your app works just fine with the VC++ redist installed. I asked fellows to see if this code can be statically linked against the VC++ runtime in order to simplify things for users and make the app standalone. @svjukov modified the project to link the runtime statically, leaving the link to nvcuda dynamic. I tested the app on a clean system without the VC2017 Redist and it works without a hitch. Sergey prepared a PR which is awaiting your review. I think it is a very useful change and I hope it gets merged. |
The PR is merged. I commented on the PR report. It's nice to see it working for others! |
Thank you! I'll have to update the docs for this patch to add a reference to the new workaround. Could you please issue a new release with the static binary, or add the static binary to the current latest release? |
I will do that tomorrow. I'm currently on mobile. Thanks for all your work! |
New release is uploaded: https://github.com/jantenhove/NvencSessionLimitBump/releases |
Thank you! |
Good job guys! ⭐⭐⭐⭐⭐ |
The D3D bump is working for me. Great work, guys! |
I tried this for 418.81 on 64-bit Windows 10 and it is not working.
Our software uses NVENC, and after testing I was not able to run more than 2 NVENC instances. My concern is why nvcuvid.dll is being patched: as far as I know, this DLL is the old one with the CUDA enc/dec implementations. The new NVENC encoder implementation does not seem to rely on nvcuvid.dll; it is not even loaded into our server process. In contrast, when using NVENC the one loaded is:
NVIDIA Video Encoder API, Version 8.0
C:\Windows\System32\nvEncodeAPI64.dll
Could this patch be made for this DLL?