[QST]: Benchmarking `cugraph.leiden()` #4488

wolfram77 · 2024-06-14T10:19:10Z

What is your question?

Hello @afender, I want to benchmark the runtime of `cugraph.leiden()`. A benchmark of the algorithm should measure only the algorithm's runtime and exclude time spent on validation and initial memory allocation. Timing the cugraph call directly includes all of these. Is it possible to get an "algorithm runtime" from the call to `cugraph.leiden()`?

Code of Conduct

I agree to follow cuGraph's Code of Conduct
I have searched the open issues and have found no duplicates for this question


ChuckHastings · 2024-06-14T14:36:30Z

@rlratzel should have a better answer for your question. Alex Fender has moved on to our cuopt effort and doesn't work on this software anymore.

I'm fuzzy on the performance overheads of the Python API: where they exist and if/how you can avoid them. I know at one time we had (and perhaps still have) some lazy computations that occur on the first call to an algorithm. I believe there is a way to avoid those. @rlratzel should be able to clarify.

Expensive validation steps are enabled in the C/C++ layer by passing a parameter called do_expensive_check, which is set to False by default. My quick glance at the latest Python code for Leiden indicates there is no mechanism for you to override this. So the only error checks that occur are fast ones (I think the only validation on the Leiden algorithm is whether you passed in an edge weights pointer).

As implemented, memory allocation for the result is done inside of Leiden. That allocation does not include initialization; we copy the result into uninitialized memory, so the overhead of allocating the result should be minimal. All other memory allocation done inside of Leiden is dynamic, based on the progress of the clustering algorithm. If you configure RMM to use the pool allocator, memory allocations should be pretty fast. Perhaps @rlratzel can clarify how to do that from Python.

rlratzel · 2024-06-14T17:53:42Z

Hi @wolfram77, I don't know if this is acceptable, but I think the best way to benchmark only the algorithm implementation, and eliminate any additional allocations/conversions/input checks done in the cugraph Python library, would be to benchmark Leiden from the C++ library in C++. Because the cugraph Python library calls the libcugraph C++ implementation, you'd be benchmarking as close to the algorithm implementation as possible (without modifying C++ source code to isolate further beyond the API).

If C++ isn't an option, you could benchmark Leiden from our lower-level Python library (pylibcugraph.leiden). The cugraph Python library wraps pylibcugraph and adds various conveniences and additional checks, which you'd want to avoid in the benchmark you're describing, so pylibcugraph.leiden might be the next best function to benchmark after C++.

Finally, configuring RMM to use pool allocation might also be something to consider, as @ChuckHastings mentioned. You can read about how to do that from Python here.

wolfram77 · 2024-06-17T10:02:14Z

Thanks @ChuckHastings and @rlratzel

As suggested, I configured RMM to use pool allocation (code below). This seems to help a lot.

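```python
import rmm

# Pre-allocate a 64 GiB (2**36 bytes) pool on top of the plain CUDA allocator.
pool = rmm.mr.PoolMemoryResource(rmm.mr.CudaMemoryResource(), initial_pool_size=2**36)
rmm.mr.set_current_device_resource(pool)
```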

I also discard the runtime of the first call to cugraph.leiden(). This also helps.
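
For readers who want to reproduce the warm-up approach, here is a minimal sketch of a timing helper that runs a callable a few times untimed and then records the remaining runs. The helper name `benchmark` and its `warmup`/`repeats` parameters are hypothetical and not part of cuGraph. The sketch also assumes the timed call returns only after the GPU work has finished; if that does not hold, a device synchronization would be needed before reading the clock.

```python
import statistics
import time


def benchmark(fn, warmup=1, repeats=5):
    """Call fn() `warmup` times untimed, then return the wall-clock
    duration in seconds of each of `repeats` further calls."""
    for _ in range(warmup):
        fn()  # untimed: absorbs first-call costs such as lazy initialization
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in CPU workload so the sketch runs anywhere; in a real
    # benchmark this would be something like: lambda: cugraph.leiden(G)
    timings = benchmark(lambda: sum(range(100_000)))
    print(f"median: {statistics.median(timings):.6f} s over {len(timings)} runs")
```

Reporting the median (or minimum) of several timed runs, rather than a single measurement, reduces the influence of one-off noise on the result.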

Below are the runtimes we observed for cuGraph Leiden (including other comparisons).

cuGraph Leiden fails to run on the arabic-2005, uk-2005, webbase-2001, it-2004, and sk-2005 graphs due to out-of-memory errors. We use an NVIDIA A100 GPU.

wolfram77 added the question Further information is requested label Jun 14, 2024

ChuckHastings assigned rlratzel Jun 14, 2024

wolfram77 closed this as completed Jun 17, 2024
