Make vectorized store convert and perform multiple stores if required #111
Similarly to #109, when doing a vectorized store, convert the elements so that they match the element type of the layout. In addition, if we want to store more values than a single vectorized store can handle, perform multiple stores.
This matters when the element type of the global layout differs from that of the shared layout. For example, if `global_layout` is `AlignedColMajor{Float16}`, but `shared_layout_a` is `AlignedColMajor{Float32}`, we initially load 8 `Float16`s, but then only store 4 values into the `Float32` shared-memory workspace, because the number of elements is currently always computed as `16 ÷ sizeof(T)`.
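The mismatch in element counts can be sketched as follows (a Python stand-in for the `16 ÷ sizeof(T)` computation; the type sizes are the standard ones, but the names are illustrative, not the actual GemmKernels.jl code):

```python
VEC_BYTES = 16                      # width of one vectorized memory operation, in bytes
sizeof = {"Float16": 2, "Float32": 4}

def num_elements(T):
    # The `16 ÷ sizeof(T)` computation from the text.
    return VEC_BYTES // sizeof[T]

loaded = num_elements("Float16")    # 8 elements per vectorized load from global memory
stored = num_elements("Float32")    # only 4 elements per vectorized store to shared memory
print(loaded, stored, loaded // stored)   # 8 4 2: two stores are needed per load
```

So a single 16-byte store covers only half of the loaded elements once they are widened to `Float32`, which is why the store path must both convert and issue multiple stores.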
The above happens when doing Float16xFloat32, in which case I'm trying to use `T = promote_type(Float16, Float32) = Float32`, causing a type mismatch between global and shared memory. I want to take this approach (as opposed to keeping the shared layout `Float16` too) because this opens up a path to using WMMA for incompatible inputs (say, for Float16xFloat32 we could then use WMMA on a GPU that has `Float32xFloat32 = ...` tensor cores).
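The overall store strategy described above can be sketched like this (a hedged Python sketch with hypothetical names; the real code operates on GPU memory with vectorized instructions, not Python lists):

```python
VEC_BYTES = 16   # width of one vectorized store, in bytes

def store_vectorized(workspace, values, dst_sizeof, convert):
    """Convert `values` to the destination element type, then issue as many
    vectorized stores as needed to cover all of them."""
    converted = [convert(v) for v in values]          # e.g. Float16 -> Float32
    per_store = VEC_BYTES // dst_sizeof               # elements per single vectorized store
    for i in range(0, len(converted), per_store):     # multiple stores if required
        workspace.append(converted[i:i + per_store])  # stands in for one vectorized store

ws = []
store_vectorized(ws, list(range(8)), dst_sizeof=4, convert=float)
print(len(ws), [len(chunk) for chunk in ws])   # 2 [4, 4]: 8 loaded values, 2 stores of 4
```

The key point is that the number of stores is derived from the converted element count and the destination element size, rather than assuming one vectorized load maps to exactly one vectorized store.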