Commit

Faster cpu ops (#1434)
* faster binary and cleaner copy

* use recursive template for other ops

* more cleanup

* fix from cleanup

* more clean

* fix binary

* use contiguous iterator

* add 3d

* nits

* fix

* fix?

* fix

* fix rebase
awni authored Sep 26, 2024
1 parent 0b4a586 commit 5b6f38d
Showing 12 changed files with 577 additions and 1,334 deletions.
357 changes: 117 additions & 240 deletions mlx/backend/common/binary.h

Large diffs are not rendered by default.

577 changes: 130 additions & 447 deletions mlx/backend/common/binary_two.h

Large diffs are not rendered by default.

3 changes: 1 addition & 2 deletions mlx/backend/common/common.cpp
@@ -156,8 +156,7 @@ std::pair<bool, std::vector<size_t>> Reshape::prepare_reshape(
}

// Firstly let's collapse all the contiguous dimensions of the input
- auto [shape, _strides] = collapse_contiguous_dims(in);
- auto& strides = _strides[0];
+ auto [shape, strides] = collapse_contiguous_dims(in);

// If shapes fit exactly in the contiguous dims then no copy is necessary so
// let's check.
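
The hunk above relies on collapsing contiguous dimensions before checking whether a reshape needs a copy. As a rough illustration of that idea only, here is a minimal sketch for a single array. The name `collapse_dims_sketch` is hypothetical and this is not MLX's `collapse_contiguous_dims`; it assumes strides are counted in elements and ignores edge cases such as size-1 dimensions with arbitrary strides.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; not the MLX implementation.
// Merges adjacent dimensions of one array whenever they can be traversed
// as a single dimension. Strides are assumed to be in elements.
std::pair<std::vector<int>, std::vector<size_t>> collapse_dims_sketch(
    const std::vector<int>& shape,
    const std::vector<size_t>& strides) {
  std::vector<int> out_shape;
  std::vector<size_t> out_strides;
  for (size_t i = 0; i < shape.size(); ++i) {
    // Dimension i folds into the previous output dimension when one step
    // of the previous dimension covers exactly all of dimension i.
    if (!out_shape.empty() &&
        out_strides.back() == static_cast<size_t>(shape[i]) * strides[i]) {
      out_shape.back() *= shape[i];
      out_strides.back() = strides[i];
    } else {
      out_shape.push_back(shape[i]);
      out_strides.push_back(strides[i]);
    }
  }
  return {out_shape, out_strides};
}
```

Under these assumptions, a row-contiguous array of shape {2, 3, 4} with strides {12, 4, 1} collapses to a single dimension of 24 elements with stride 1, while the same shape with a padded leading stride of 24 collapses only its last two dimensions.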
