[FR]: Use OpenCL instead privative alternatives (CUDA, Metal) #595

RafaelLinux · 2019-08-16T11:05:37Z

I previously reported that I cannot render with Meshroom, probably because, although I have an NVIDIA GPU, NVIDIA does not provide a CUDA package for openSUSE 15.1. I use Blender, GIMP ... and all of them use OpenCL. Meshroom is developed for Linux and Windows, and OpenCL is updated continuously for both platforms. OpenCL performance is only slightly below the proprietary NVIDIA or AMD APIs, so why not let Meshroom use the OpenCL GPGPU API? Even Intel GPU users could use Meshroom if it used the OpenCL framework.

Please, could you consider this suggestion?

Thank you

natowi · 2019-08-16T11:22:43Z

Read alicevision/AliceVision#439
Here is the background on why CUDA is used in many applications:
https://www.quora.com/Why-cant-a-deep-learning-framework-like-TensorFlow-support-all-GPUs-like-a-game-does-Many-games-in-the-market-support-almost-all-GPUs-from-AMD-and-Nvidia-Even-older-GPUs-are-supported-Why-cant-these-frameworks

RafaelLinux · 2019-08-16T13:55:52Z

I read the thread. Some comments are from 2018, when OpenCL 2.2 didn't exist, and a lot has changed since then. CUDA is used in many applications, but so is OpenCL. Darktable, which I use regularly, is on that list too.

Anyway, @fabiencastan wrote:

Currently, we have neither the interest nor the resources to do another implementation of the CUDA code to another GPU framework.

That's a pity, because a lot of users cannot try Meshroom, even though it's a great piece of software. Right now I'm on the PC with the Intel GPU, so there is no way to use Meshroom, and I have tried alternatives like Metashape, which doesn't necessarily require an NVIDIA GPU.

skinkie · 2019-08-20T19:44:59Z

The fact that @fabiencastan does not have the time to port an implementation that works for him does not mean that others cannot implement it in their own time. A big question here is whether you would implement it in OpenCL or in something different. Some good pointers on the wiki about viable alternatives could help people who want to start on this task.

RafaelLinux · 2019-08-20T20:42:29Z

Hi skinkie, I don't have sufficient skills to code in C/C++. I'd give it a try if it were Python, PHP or even JS. My point is that "fewer users able to run an application = less interest in the application = less feedback", and in the end the great idea becomes a lost effort. It's true that it's easier to work with the CUDA API, but a lot of users in this forum have reported information about how to migrate or simplify the change to OpenCL. That could be a good starting point. That's only my opinion, of course.

skinkie · 2019-08-20T21:06:10Z

@RafaelLinux As a user you can use Meshroom without CUDA; the only part of the application that is 'hidden' is the DepthMap stage, and even that allows for a preview without CUDA. As a developer, Meshroom is Python + QML, so the entry level for making an impact is low. The first stage where CUDA acceleration is used is feature extraction. You could just try to get this to work: https://github.com/pierrepaleo/sift_pyocl

Personally, my focus for Meshroom is introducing some heuristics for matching images and supervised learning, as opposed to the current brute-force approach. Not that I am a photogrammetry specialist, but I can surely try to work on this open source project.

RafaelLinux · 2019-08-21T00:25:52Z

Maybe I'm using Meshroom incorrectly, because if I only reach DepthMap, I only see a point cloud, so I can't see the resulting model.

skinkie · 2019-08-21T07:19:48Z

https://github.com/alicevision/meshroom/wiki/Draft-Meshing

RafaelLinux · 2019-08-21T09:30:26Z

Thank you, that's a good workaround. I'll try it. Anyway, remember that users don't mind how long it takes; quality is the priority, so please don't forget this feature request ;)

aviallon · 2019-08-25T23:00:50Z

One could also use hipify from AMD to convert CUDA code to HIP, which can be built to work on either NVIDIA or AMD cards (with very nice performance; I currently use it for TensorFlow, and it works like a charm!)

natowi · 2019-08-25T23:43:57Z

@aviallon The last time this was tried (2018), HIP did not support some CUDA functions alicevision/AliceVision#439 (comment)
and there was no full support for Windows and amdgpu on Linux alicevision/AliceVision#439 (comment).

You are welcome to try again using hipify.

arpu · 2019-09-17T22:27:42Z

for reference https://github.com/cpc/hipcl

pppppppp783 · 2019-09-25T17:11:11Z

for reference https://github.com/cpc/hipcl

This is interesting, has anyone tried it?

pppppppp783 · 2019-09-25T17:15:13Z

https://www.computer.org/publications/tech-news/from-cuda-to-opencl-execution/

ShalokShalom · 2019-09-27T08:32:23Z

Nvidia does not provide any CUDA package for OpenSUSE 15.1.

This is simply a packaging issue, since Arch has CUDA despite not being in the list here.

Have you already reported that issue to both the openSUSE packagers and the NVIDIA CUDA team?

And you can probably repackage either the 15.0 variant of the openSUSE package or the Arch package, which uses an independent source, as you can see in the link.

skinkie · 2019-09-27T08:38:26Z

@ShalokShalom The problem with CUDA remains that older hardware absolutely does not work with newer CUDA versions. This causes problems for nvidia-drivers and CUDA, where one is effectively searching for the 'ideal pair' between them. I would be very interested if OpenCL could bridge this gap, even by choosing the execution pipeline of choice.

ShalokShalom · 2019-09-27T09:32:09Z

And how is that with HiP? Nvidia hardware runs on it as well?

I consider using a Geforce GT 610 for CUDA, can you tell me how to choose the suitable CUDA version?

Thanks a lot

natowi · 2019-09-27T09:40:10Z

@ShalokShalom

And how is that with HiP? Nvidia hardware runs on it as well?

"HIP allows developers to convert CUDA code to portable C++. The same source code can be compiled to run on NVIDIA or AMD GPUs"

I consider using a Geforce GT 610 for CUDA, can you tell me how to choose the suitable CUDA version?

On Windows, install the latest version; on Linux this might depend on your distro. The GT 610 supports CUDA compute capability 2.1; Meshroom (MR) requires 2.0 or higher.
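
As a rough illustration of the requirement check mentioned above, here is a small Python sketch that compares a card's CUDA compute capability with a minimum. The names `parse_capability`, `meets_minimum` and `MIN_COMPUTE_CAPABILITY` are hypothetical and not part of Meshroom; the values 2.1 and "2+" come from this comment, and reading "2+" as "2.0 or higher" is an assumption.

```python
# Hypothetical helper, not Meshroom code: compare a GPU's CUDA compute
# capability against a required minimum.

MIN_COMPUTE_CAPABILITY = (2, 0)  # assumption: "2+" read as 2.0 or higher


def parse_capability(text):
    """Turn a string such as "2.1" into a comparable tuple (2, 1)."""
    major, _, minor = text.strip().partition(".")
    return int(major), int(minor or 0)


def meets_minimum(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Return True if the capability is at least the required minimum."""
    return parse_capability(capability) >= minimum


# Example: the GT 610 value quoted in the comment above.
gt610_ok = meets_minimum("2.1")
```

Comparing tuples rather than strings or floats keeps versions such as "10.0" ordered correctly after "2.1".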

ShalokShalom · 2019-09-27T10:22:28Z

I am on Linux, what decides which version is optimal? I am on KaOS, that is a rolling distribution.

So, does HiP make the version differences between CUDA and the different NVIDIA hardware negligible?

Could or should we replace CUDA entirely with it, or is the overhead too big?

natowi · 2019-09-27T10:35:33Z

@ShalokShalom With HiP we can compile two versions of Meshroom: for CUDA and AMD GPUs. For CUDA users nothing changes. (https://kaosx.us/docs/nvidia/ But you won't get far with a 1GB GT 610)

natowi · 2019-09-27T10:55:33Z

We have to wait for HiP to support cudaMemcpy2DFromArray. Then we can add AMD support for AV/MR and try HiPCL.

skinkie · 2019-09-27T12:41:22Z

@natowi But you won't get far with a 1GB GT 610)

Suppose Meshroom allowed parallel computation for nodes where both CPU and GPU could, for example, do feature extraction. Then any additional computing resource could help. It depends on how much overhead the GPU would add compared to a (faster) decent CPU, but I would still see potential for independent computation tasks.

arpu · 2019-11-16T01:58:40Z

Looks like HIP now supports cudaMemcpy2DFromArray. Any progress on this?

natowi · 2019-11-16T10:13:39Z

@skinkie see #175

@arpu Yes, all CUDA functions are now supported by HiP and I was able to convert the code to HiP using the conversion tool (read here for details). The only thing left is to write a new cmake file that includes HiP and supports both CUDA and AMD compilation and the different platforms. Here is the Meshroom PopSift plugin I used for testing. At the moment I don't have the time to figure out how to rewrite the cmake file, but I think @ShalokShalom wanted to look into this.
You are welcome to do so as well.

ShalokShalom · 2019-11-19T07:28:32Z

One question is very critical, I think: Will we ship two versions?

Linux distributions do their packaging themselves and we could benefit enormously by finding someone who is willing to maintain Alice for their userbase since that could result in new developers and funding.

2 versions, one for CUDA and one for HIP is something they will never do.

natowi · 2019-11-19T08:08:37Z

@ShalokShalom From the HiP code we can compile both CUDA and AMD versions. Similar to the target platform/OS parameter in the cmake file, CUDA or AMD can be defined. So depending on the compiler parameters we can define the versions (OS + CUDA/AMD).
So once we can compile all supported platforms from our hipified code, we can create a PR to use HiP instead of CUDA code by default in the official repo.
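
To make the "one source, two targets" idea a bit more concrete, here is a hedged Python sketch that only prepares a `hipcc` invocation for either backend; it does not run the compiler. `HIP_PLATFORM` (with the values `amd` and `nvidia`) is the environment variable HIP's tooling uses to pick a backend in recent ROCm releases. The function name `build_invocation` and the file names are made up for illustration, and this is not the cmake setup discussed above.

```python
import os

# Backends accepted by HIP_PLATFORM in recent HIP releases.
VALID_PLATFORMS = ("amd", "nvidia")


def build_invocation(source, output, platform, base_env=None):
    """Return (command, env) for compiling one HIP source for a backend.

    Nothing is executed here; the caller decides whether to pass the
    result to subprocess.run(command, env=env).
    """
    if platform not in VALID_PLATFORMS:
        raise ValueError(f"unknown HIP platform: {platform!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["HIP_PLATFORM"] = platform  # selects the AMD or NVIDIA toolchain
    return ["hipcc", source, "-o", output], env


# Hypothetical file names: the same source prepared for both targets.
amd_cmd, amd_env = build_invocation("kernel.hip.cpp", "kernel_amd", "amd", {})
nv_cmd, nv_env = build_invocation("kernel.hip.cpp", "kernel_nvidia", "nvidia", {})
```

The command line is identical for both targets; only the environment differs, which is the property that would let a single code base serve CUDA and AMD users.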

PickUpYaAmmo · 2019-11-25T07:57:50Z

Any idea how long that approximately takes? I feel like a child just before Christmas eve :D

natowi · 2019-11-25T08:39:53Z

@PickUpYaAmmo I will take another look at this over the winter holidays.

skinkie · 2021-01-09T13:11:20Z

I just ran into a kernel panic over tesseract using OpenCL. While I agree generally with your statement, OpenCL may fall back to a CPU implementation, but as I just noticed, that is not a given. Even if it worked gracefully, a CPU computation might render some operations useless or cost extreme amounts of time (and therefore power).

ShalokShalom · 2021-01-11T00:19:16Z

We could change the title?

acxz · 2022-04-23T15:35:23Z

I apologize if this is a bit out of touch with the current direction of the conversion, but wanted to share nonetheless:

https://github.com/illuhad/hipSYCL

skinkie · 2022-04-23T15:40:49Z

I apologize if this is a bit out of touch with the current direction of the conversion, but wanted to share nonetheless:

https://github.com/illuhad/hipSYCL

Would that allow the program to use all interfaces simultaneously? (Read: the ability to schedule the tasks over multiple targets)

skinkie · 2022-04-23T17:10:33Z

Would that allow the program to use all interfaces simultaneously?

this would depend on if hip or SYCL can do this, which I don't think they can yet. (maybe they can if so do point me to some resources for this)

My question was more in the direction, would the glue code take care of it ;)

skinkie · 2022-04-23T18:14:59Z

Considering that hip or SYCL do not have that feature, hipSYCL cannot write the code for that feature even if they wanted to. From the motivation of hipSYCL, if hip or SYCL does have that feature, then yes, hipSYCL will expose that feature for users.

I don't think hip or SYCL would require the functionality on their own, if the intermediate would take care of it. Like starting a new thread on the CPU or GPU, anything that would be available.

skinkie · 2022-04-23T18:34:14Z

this is getting offtopic, if you want to discuss more feel free to open an issue over at https://github.com/illuhad/hipSYCL

I really don't think this is off-topic. Meshroom splits a huge task into many smaller steps, but is then limited to a specific backend. Anything that natively allows scheduling the tasks transparently across CPU, GPU, etc. would motivate people to integrate that technology sooner.

ShalokShalom · 2022-04-24T06:39:27Z

Why would this be offtopic? 👀

michal2229 · 2022-05-19T22:09:58Z

I think this could be helpful: https://www.phoronix.com/news/Intel-SYCLomatic-20220829,
SYCLomatic on GitHub

Nosenzor · 2023-05-22T09:01:57Z

SYCL or Vulkan Compute Shader can be the open solution (with a preference to SYCL). SYCL principles are pretty close to CUDA.
What about the MeshroomCL version that has existed in parallel: https://github.com/openphotogrammetry/meshroomcl ? Is there a way for AliceVision to get the source code and work from there?

balaclava9 · 2023-10-22T01:18:33Z

A more practical suggestion: Regard3D is another OpenMVG-based photogrammetry solution. Its author wrote a densification procedure that doesn't need CUDA and is platform-agnostic. He hasn't updated it in a while. Maybe you can add his densification module to Meshroom. It's open source. https://github.com/rhiestan/Regard3D/tree/master

Also, OpenMVS has a very nice platform-independent densification module, which I've been using and which works very well.

zicklag · 2024-02-13T15:48:41Z

Something to look into:

https://github.com/vosen/ZLUDA

ZLUDA is currently alpha quality, but it has been confirmed to work with a variety of native CUDA applications: Geekbench, 3DF Zephyr, Blender, Reality Capture, LAMMPS, NAMD, waifu2x, OpenFOAM, Arnold (proof of concept) and more.

vosen · 2024-03-26T13:39:39Z

To those interested in the topic: could you please test the Meshroom-compatible ZLUDA version? More info here: vosen/ZLUDA#79 (comment)

natowi · 2024-03-30T18:41:29Z

For those of you who want to test it now:
Download the official Windows build and replace the AliceVision folder with the one shared by vosen. Then start Meshroom with ZLUDA (download provided by vosen).

You can download a ready to use ZIP here if you prefer. I put it together to simplify testing.

(It includes Meshroom 2023.3.0 and AliceVision+ZLUDA as provided by vosen. I added Run-Meshroom-ZLUDA.bat that hopefully works and ZLUDA-Info.txt with some info on ZLUDA from the git)

It would be great if you could do some tests with the https://github.com/alicevision/dataset_monstree dataset (mini3 and full) so we can compare the performance.

polarathene · 2024-03-31T21:00:15Z

You can download a ready to use ZIP here if you prefer. I put it together to simplify testing.

This just loads a webpage that says "Not found", response is 404. Perhaps it's only available to you?

I only have a laptop with a 780M APU + RTX 4060 (laptop part not desktop, so weaker part AFAIK?), paired with a Ryzen 7940HS (8/16 core/threads @ 4GHz) and 32GB RAM. A 780M probably isn't ideal for an AMD GPU to test with? 🤷‍♂️

I might find time to give it a try with the dataset if you like, although I haven't done photogrammetry in a while, I only have about 30GB of disk to spare atm, if that's sufficient I can probably tackle it by next weekend 👍

natowi · 2024-04-01T15:23:26Z

@polarathene sorry, my bad. Link is fixed. Just give it a test run on your machines. 30gb should be more than enough to test with the monstree dataset.

natowi · 2024-06-03T21:26:38Z

In addition to being able to run Meshroom using ZLUDA, MeshroomResearch now allows running COLMAP in the backend (or MicMac if you like).

corndog2000 · 2024-06-07T22:31:54Z

@polarathene sorry, my bad. Link is fixed. Just give it a test run on your machines. 30gb should be more than enough to test with the monstree dataset.

It appears that your link is still broken.

natowi · 2024-06-09T21:55:07Z

@corndog2000 could you try again? Meshroom-2023.3.0-ZLUDA-2024-03.zip

polarathene · 2024-06-10T00:58:30Z

I tried with mini 3, note I've never used this software before.

The GUI gave no clear indication of which GPU was detected or used, but I found out that the FeatureExtraction node has an advanced setting (visible once advanced fields are shown) that lets you disable the default that forces CPU.

The logs had no indication of anything about GPU though. And the statistics for that node only show CPU activity, with nothing in the GPU stats.

When the DepthMap node was reached, it failed. There is no log output for any additional context. I did notice that there was a setting for number of GPUs and I had set that to 1, setting it to 0 did not make a difference.

As my system has an NVIDIA RTX 4060 (dGPU) in addition to the AMD 780M (iGPU), if any GPU was used it's unclear when, or which one was used. From what I could make of what happened, it doesn't seem like a GPU was used?

I will keep the files on my system for a few more days. If you have some guidance on how you'd like me to verify, let me know.

polarathene · 2024-06-10T01:02:07Z

Update: Running the full dataset, the FeatureExtraction step runs long enough to observe CPU / GPU activity.

CPU was about 70%
AMD iGPU inactive
Nvidia GPU inactive

Launched via Run-Meshroom-ZLUDA.bat from File Explorer on Windows 11.

If it's relevant, I have CUDA 12.4 on the system.
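
For the NVIDIA side, the kind of observation described above can also be scripted. The following Python sketch assumes the `nvidia-smi` tool shipped with the NVIDIA driver is on the PATH and uses its `--query-gpu` CSV output; it does not cover the AMD iGPU, and the function names are illustrative only.

```python
import subprocess

# nvidia-smi query returning one "name, utilization" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(output):
    """Parse lines such as "Some GPU, 37" into (name, percent) tuples.

    percent is None when the driver reports a non-numeric value.
    """
    rows = []
    for line in output.strip().splitlines():
        name, _, value = line.rpartition(",")
        try:
            percent = int(value.strip())
        except ValueError:
            percent = None
        rows.append((name.strip(), percent))
    return rows


def sample_utilization():
    """Run nvidia-smi once and return the parsed utilization rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(result.stdout)
```

Calling `sample_utilization()` a few times while the FeatureExtraction or DepthMap node is running would show whether the NVIDIA GPU is doing any work at all.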

mkommar · 2024-06-19T00:52:12Z

Thank you for doing this!

Running on a ROG Ally attached to a Nexdock.

Tried the dataset at:
https://github.com/alicevision/dataset_monstree

Downloaded CUDA 11.0.2:
https://developer.nvidia.com/cuda-11.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exenetwork

Ran:
"C:\Meshroom-2023.3.0-ZLUDA-2024-03\Run-Meshroom-ZLUDA.bat"

natowi · 2024-06-19T18:49:30Z

@polarathene did it work now?

If both integrated AMD GPU and dedicated AMD GPU are present in the system, ZLUDA uses the integrated GPU.

This is a bug in underlying ROCm/HIP runtime. You can work around it by disabling the integrated GPU.

On Windows we recommend you use the environment variable HIP_VISIBLE_DEVICES=1 (more here) or disable it system-wide in Device Manager.
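
A minimal Python sketch of the environment-variable workaround quoted above: it starts a program with `HIP_VISIBLE_DEVICES` set so the HIP runtime only sees the chosen device. The helper names `launch_env` and `launch` are hypothetical, and the launcher path in the comment is a placeholder to adapt to your own install.

```python
import os
import subprocess


def launch_env(device_index, base_env=None):
    """Return a copy of the environment with HIP_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HIP_VISIBLE_DEVICES"] = str(device_index)
    return env


def launch(command, device_index=1):
    """Start `command` so the HIP runtime only sees the chosen device."""
    return subprocess.run(command, env=launch_env(device_index), check=False)


# Usage (placeholder path, adjust to where the ZIP was extracted):
# launch([r"C:\path\to\Run-Meshroom-ZLUDA.bat"])
```

Setting the variable only for the child process leaves the rest of the system untouched, unlike disabling the integrated GPU in Device Manager.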

I'd recommend reading the https://github.com/vosen/ZLUDA readme

polarathene · 2024-06-20T04:21:55Z

@polarathene did it work now?

No? I shared my feedback when I tried it. I deleted the files after several days without response.

I'd recommend reading the https://github.com/vosen/ZLUDA readme

I only have a 780M iGPU (AMD), and an RTX 4060 for dGPU (nvidia).

Is the version of Meshroom you provided not meant to detect my CUDA capable 4060? And no logs/errors regarding ZLUDA for the AMD 780M?
My gripe was that Meshroom wasn't very clear in what GPU(s) it was aware of, or would use. I saw no option to select a GPU device, only the checkbox settings for toggling between GPU/CPU and the stats that showed CPU without any errors about GPU in the logs.

I don't think I have ROCm installed, so that's probably why the 780M wasn't used. Not sure why the 4060 wasn't detected unless your build of Meshroom prevented that.

The README link mentions an issue with CUDA 12, but presumably that won't be an issue. Downgrading isn't something I'm interested in. I might install ROCm, but as I don't use Meshroom myself and the mentioned UX issues, I'm probably only going to assist with testing one more time 😅

EDIT: Nope, I can't help test this sorry. ROCm doesn't support AMD iGPUs apparently:

@mkommar are you reporting success with ZLUDA? Your screenshot shows CPU activity. How did you test to verify AMD was working correctly with ZLUDA? (these instructions were not clear to me, as someone who is not a meshroom user)

mkommar · 2024-06-20T04:44:22Z

I'll give a more detailed response after I do a couple more tests tomorrow.


simogasp added the labels "CUDA" and "feature request" (feature request from the community) on Aug 21, 2019

natowi added the label "do not close" (issue that should stay open; avoids automatic closing as stale) on Oct 27, 2019


gerberger mentioned this issue May 4, 2024

Meshroom 2023.3.0 fails at DepthMap stage vosen/ZLUDA#79

Closed

natowi mentioned this issue Aug 11, 2024

Expanding beyond CUDA? alicevision/AliceVision#1503

Closed
