-
-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Setting for the amount of (V)RAM used for upscaling #2088
base: main
Are you sure you want to change the base?
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
9cd9abf
to
39012ae
Compare
I also implemented the logic for PyTorch. To give you a general overview. This feature currently is Apple Silicon only. That is due to the unique memory architecture on newer Macs. Since there is no separation between RAM and VRAM, it is only necessary to set a general limit for usable RAM for upscaling. The idea is that the minimum amount of RAM left for the system is 8 GB (which means 8 GB Macs are more or less out of the question). The max amount of RAM that can be reserved for the system is 80% of the total system memory. I did some testing on my 32 GB Mac, and it looks promising. RAM usage with 8 GB of reserved system memory: RAM usage with 25.6 GB (80 %) of reserved system memory: I'd also like to implement this for NCNN and would reuse some code snippets from #2070. Haven't looked into it yet, but is there something similar for ONNX? In general, I think this approach could be reused for Windows/Linux. But I assume there needs to be a second input Field for VRAM. Furthermore, the description needs to be reworded there, since the RAM usage, there is coupled to the CPU and the reserved system memory basically is only used for CPU upscaling. @RunDevelopment @joeyballentine, could I get some feedback from you? |
f5f1ffa
to
1c1996c
Compare
backend/src/packages/chaiNNer_pytorch/pytorch/processing/upscale_image.py
Outdated
Show resolved
Hide resolved
backend/src/packages/chaiNNer_pytorch/pytorch/processing/upscale_image.py
Outdated
Show resolved
Hide resolved
We don't support estimated tile sizes for ONNX. We just never implemented it. As for NCNN: it's complicated. We do estimate for NCNN, but that estimation frequently crashed the backend on macs for some reason, so I turned it off (#2006). It will now always estimate a tile size of 256 on mac. This is a good default, because it only needs 3~4GB of (V)RAM for most models. |
That's not unique to macs, anyone upscaling using integrated graphics via NCNN will have the same problem. And of course, anyone upscaling on CPU will only use RAM. |
Are you sure? From my understanding, on x86, the integrated GPUs get a share of the main memory. This memory will then be subtracted from the main memory available for the CPU. e.g., 32 GB Total Memory, 8 GB for the GPU, means the CPU will receive 24 GB of RAM. On the Mac, that is different because the GPU can have 100% of RAM. |
Here is a comprehensive comparison: https://www.sir-apfelot.de/en/compare-shared-memory-unified-memory-41023/ |
backend/src/packages/chaiNNer_pytorch/pytorch/processing/upscale_image.py
Outdated
Show resolved
Hide resolved
I think Intel integrated graphics also uses the unified memory model. https://www.intel.com/content/www/us/en/support/articles/000020962/graphics.html |
Interesting. We would need someone with such a system to test this. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Added calculation for NCNN MaxTileSize on Apple Silicon Macs. Left it 256 for Intel Macs, though. It seems On my system with
And with the
|
3ab5f42
to
2cef556
Compare
I squashed the commits and rebased it on main. This is how it looks now. Windows and LinuxmacOSLog file
|
8a3b0b7
to
16a24fe
Compare
What's still needed before we can get this merged? |
Ah crap, my backend settings change heavily conflicted with this... I'll try to come up with a plan for what to do to get this into the new system. Sorry about that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am so sorry. I forgot to submit my review. My comments just sat there for a week...
16a24fe
to
4d6a53c
Compare
I'll need to wait for @joeyballentine to implement a general settings mechanism, since he did some major changes to the settings in an earlier PR. As soon as this is done, I refactor this PR. |
Yeah, I'll try to get that done as soon as I can |
70ca2a6
to
d4a0027
Compare
d4a0027
to
b4566e5
Compare
CPU upscaling & upscaling on Apple Silicon (CPU & GPU) A value between 20% and 80% of the freely available memory can be chosen for upscaling. If desired, instead of the freely available memory, these values can be applied to the entire available RAM. If a user chooses to do so, a warning will be presented and when upscaling the settings and amount of used RAM will be logged. For GPU upscaling the amount of freely available can be set. This setting is only available on Windows and Linux. MaxTileSize for NCNN on Apple Silicon has been added. add description for memory limit to upscaling node
b4566e5
to
f652d2b
Compare
@JeremyRand I'm currently very busy IRL. If you want to you can take this over and incorporate your changes. I'm not sure when I can continue working on this. |
@stonerl If I take this PR over, what remaining work needs to be done? I admit I haven't kept up with the code review on this PR, so it's hard for me to evaluate whether the remaining work is something I can do efficiently. |
(Or maybe @joeyballentine would be able to answer that?) |
@JeremyRand I'm not sure what still needs to be done for this. I can take a look tomorrow and try to figure that out though |
@joeyballentine Any updates? |
@JeremyRand sorry, we've been focused on the iterator rewrite and now on extracting out our model detection code into a separate package. After we're done with that we gotta finish the iterator rewrite (it's taken way longer than we thought). After that, we should be able to work on stuff like this |
Sure, no worries. The iterator rewrite looks really cool, I'm fine with waiting on this PR if it gets the iterator rewrite done faster. :) |
CPU upscaling & upscaling on Apple Silicon (CPU & GPU)
A value between 20% and 80% of the freely available memory can be chosen for upscaling. If desired, instead of the freely available memory, these values can be applied to the entire available RAM. If a user chooses to do so, a warning will be presented and when upscaling the settings and amount of used RAM will be logged.
For GPU upscaling, the amount of freely available can be set. This setting is only available on Windows and Linux.
MaxTileSize for NCNN on Apple Silicon has been added.
Should fix #876