You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For example, consider this code, extracted from one of my libraries:
for &group in samples.array_chunks::<16>() {
let mask = predicate(i8x16::new(group));
offset += mask.move_mask().trailing_zeros() as usize;
if mask.any() { break }
}
offset
The intent here is to move offset by the fractional part of the group that matched the predicate. Unfortunately, if the predicate didn't match at all, the moved mask would have 32 zeroes, throwing off the pointer.
(I know about iter.remainder()--in this particular application it can't be used because by the time the remainder is examined, it is already advanced past the group being matched by the predicate.)
Making i8x16::move_mask() be an u16 would solve this problem, but won't work for i32x4. Adding leading_zeros() and trailing_zeros() to the SIMD types, or perhaps boolean mask types from #43, would also solve it.
The text was updated successfully, but these errors were encountered:
I wouldn't have come up with a loop like that, I think. Though, I see your point about how things can get mixed up if you write it that way.
Given the amount of time I have to work on wide lately (not much), I'd prefer to avoid solutions that are breaking changes in the library.
I'm not totally sure that I understand the proposed loop as well. If there's any elements matching the predicate, you add some to the offset and then break the loop immediately? So the only time you don't break the loop is if "no elements match", so wouldn't the move_mask of a "no elements match" always be a 0? So can't you make it an if-else with adding a constant 16 per loop that continues, and then add move_mask when things don't continue?
For example, consider this code, extracted from one of my libraries:
The intent here is to move
offset
by the fractional part of the group that matched the predicate. Unfortunately, if the predicate didn't match at all, the moved mask would have 32 zeroes, throwing off the pointer.(I know about
iter.remainder()
--in this particular application it can't be used because by the time the remainder is examined, it is already advanced past the group being matched by the predicate.)Making
i8x16::move_mask()
be anu16
would solve this problem, but won't work fori32x4
. Addingleading_zeros()
andtrailing_zeros()
to the SIMD types, or perhaps boolean mask types from #43, would also solve it.The text was updated successfully, but these errors were encountered: