
Hamming distance implementation with Numba #512

Merged · 80 commits · Aug 19, 2024

Conversation

@felixpetschko (Collaborator) commented Apr 29, 2024

Close #256

Hamming distance implementation with Numba in the same fashion as the improved TCRdist calculator.
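A minimal sketch of what such a Numba kernel can look like (illustrative only, not the PR's actual code; the encoding of sequences as equal-length integer arrays and the cutoff handling are assumptions):

import numba as nb

@nb.njit(cache=True)
def hamming_distance(seq1, seq2, cutoff):
    # Illustrative kernel: count mismatching positions, with an early exit
    # once the distance exceeds the cutoff. The sequence encoding is assumed.
    if seq1.shape[0] != seq2.shape[0]:
        return cutoff + 1  # different lengths: treat as beyond the cutoff
    dist = 0
    for i in range(seq1.shape[0]):
        if seq1[i] != seq2[i]:
            dist += 1
            if dist > cutoff:
                return cutoff + 1
    return dist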

@grst (Collaborator) commented Apr 29, 2024

Awesome! I assume we can close #481 instead?

@felixpetschko (Collaborator, Author) replied:

> Awesome! I assume we can close #481 instead?

Yes, we can close #481 instead; I will port the changes related to the normalized Hamming distance over to this PR.

This PR still needs some performance-related adaptations that will follow soon. Today I tested a Numba version that runs the Hamming distance on 1 million cells in 80 seconds with 64 cores (TCRdist takes around 210 seconds). This is around 8 times faster than the current Hamming distance implementation in scirpy. The implementation in #481 was fast with fewer resources and smaller inputs, but I don't think it could really outperform the current scirpy Hamming distance on the cluster. I also have a pure NumPy implementation, not shown in any PR so far, which is around 25% slower than the upcoming Numba version.

@felixpetschko (Collaborator, Author) commented May 2, 2024

@grst I just pushed a version of the Hamming distance that uses Numba for the parallelization, via parallel=True for the JIT compiler. I could run 1 million cells in 40 seconds (80 with joblib) and 8 million cells in 2400 seconds with 64 cores. That way threads are used instead of processes, and I only needed 128 GB of RAM for 8 million cells.
What do you think about using Numba parallel threads instead of joblib? The advantages:
- Speed: better optimization by Numba, threads start faster than processes, less copying of data around.
- Memory: shared memory between threads -> less redundancy -> lower memory consumption.
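Roughly, this means letting Numba distribute the outer loop over its own threads with prange. A minimal sketch, assuming the hamming_distance kernel sketched above and 2D integer-encoded sequence arrays (both assumptions, not the PR's actual code):

import numpy as np
import numba as nb

@nb.njit(parallel=True)
def hamming_dist_matrix(seqs1, seqs2, cutoff):
    # Illustrative: rows of the result are distributed across Numba threads.
    n, m = seqs1.shape[0], seqs2.shape[0]
    result = np.empty((n, m), dtype=np.int32)
    for i in nb.prange(n):
        for j in range(m):
            result[i, j] = hamming_distance(seqs1[i], seqs2[j], cutoff)
    return result

The thread count can then be controlled with numba.set_num_threads() or the NUMBA_NUM_THREADS environment variable.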

@grst (Collaborator) commented May 2, 2024

I've been thinking about this before, but I wouldn't have thought that there is so much to gain, since the blocks were already quite large. The only downside I can see is that parallelization across multiple machines is no longer possible that way. 2400 seconds is impressive for this number of cells, but if you have the compute power you could just split the work across several nodes and be faster.

But probably we'd have to resolve other bottlenecks first before this becomes relevant.

@felixpetschko (Collaborator, Author) commented:
@grst We could just introduce two parameters, number_of_processes (joblib jobs) and number_of_threads_per_process (Numba threads), or something like that instead of n_jobs, since everything is already set up for it anyway. That way we could get the best of both worlds and the user can decide :)
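Hypothetically, the two layers could be combined along these lines (parameter names taken from the suggestion above; hamming_dist_matrix is the placeholder kernel sketched earlier and the cutoff is an arbitrary example value, so none of this is the PR's actual API):

import joblib
import numba as nb

def compute_all_blocks(blocks, number_of_processes, number_of_threads_per_process, cutoff=2):
    # Illustrative: joblib spawns processes, each of which limits its own
    # Numba thread pool before running the jitted kernel on its block.
    def run_block(seqs1, seqs2):
        nb.set_num_threads(number_of_threads_per_process)
        return hamming_dist_matrix(seqs1, seqs2, cutoff)

    return joblib.Parallel(n_jobs=number_of_processes)(
        joblib.delayed(run_block)(seqs1, seqs2) for seqs1, seqs2 in blocks
    )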

@grst (Collaborator) left a comment:

That's a great idea! I would maybe rather control the number of blocks via a parameter than the number of jobs. When using dask, smaller blocks may be beneficial for balancing the load, since some workers might be faster than others. The final call would then look something like

with joblib.parallel_config(backend="dask", n_jobs=200, verbose=10):
    ir.pp.ir_dist(
        metric="hamming",
        n_jobs=8, # jobs per worker
        n_blocks = 2000, # number of blocks sent to dask
    )

To document how to do this properly, I'd like to set up a "large dataset tutorial" (#479) at some point.

Reviewed snippet:

arguments = [(split_seqs[x], seqs2, is_symmetric, start_columns[x]) for x in range(n_blocks)]
delayed_jobs = [joblib.delayed(self._calc_dist_mat_block)(*args) for args in arguments]
results = list(_parallelize_with_joblib(delayed_jobs, total=len(arguments), n_jobs=self.n_jobs))
@grst (Collaborator) commented May 3, 2024:

You could directly use

Parallel(return_as="list", n_jobs=self.n_jobs)(delayed_jobs)

here. The _parallelize_with_joblib wrapper is only there for the progress bar; a progress bar doesn't make sense with this small number of jobs, and it is not compatible with the dask backend anyway.
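Applied to the snippet above, that might look roughly like this (a sketch reusing the variables from the reviewed code; return_as="list" requires joblib >= 1.3):

from joblib import Parallel, delayed

# Sketch only: split_seqs, seqs2, is_symmetric, start_columns, n_blocks and
# self._calc_dist_mat_block come from the reviewed snippet above.
arguments = [(split_seqs[x], seqs2, is_symmetric, start_columns[x]) for x in range(n_blocks)]
delayed_jobs = [delayed(self._calc_dist_mat_block)(*args) for args in arguments]
results = Parallel(return_as="list", n_jobs=self.n_jobs)(delayed_jobs)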

codecov bot commented Aug 16, 2024

Codecov Report

Attention: Patch coverage is 53.69458% with 94 lines in your changes missing coverage. Please review.

Project coverage is 81.58%. Comparing base (d1db848) to head (f566211).
Report is 18 commits behind head on main.

Files                           Patch %   Lines missing
src/scirpy/ir_dist/metrics.py   52.06%    93 ⚠️
src/scirpy/util/__init__.py     85.71%    1 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #512      +/-   ##
==========================================
+ Coverage   80.19%   81.58%   +1.39%     
==========================================
  Files          49       49              
  Lines        4079     4204     +125     
==========================================
+ Hits         3271     3430     +159     
+ Misses        808      774      -34     


felixpetschko mentioned this pull request on Aug 16, 2024
@felixpetschko (Collaborator, Author) commented Aug 16, 2024

I implemented the requested changes now. The histogram feature might need some adaptations in a future pull request.
The failing codecov/patch check doesn't make sense here, because the Numba functions are not recognized as covered by tests even though they are (indirectly). From my side this pull request could be integrated. Afterwards I would open a new pull request with a draft of a GPU implementation of the Hamming distance.

@grst (Collaborator) commented Aug 19, 2024

Hi @felixpetschko,

is anything still pending from your side here? I'll try it out locally one more time and then merge this.

@felixpetschko (Collaborator, Author) replied:
There is nothing else pending from my side :)

grst enabled auto-merge (squash) on August 19, 2024, 19:20
grst disabled auto-merge on August 19, 2024, 20:08
grst merged commit 2b8b8e6 into scverse:main on Aug 19, 2024
9 checks passed