Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize String#to_utf16 #14671

Merged

Conversation

straight-shoota
Copy link
Member

This patch contains a series of optimizations to String#to_utf16 to improve performance:

  • Remove branches for impossible codepoints. each_char already excludes invalid codepoints, so we only have to handle the encoding as one or two UInt16.
  • Drop ascii_only? branch. The performance benefit is questionable because ascii_only? iterates the string. With optimizations to the regular loop, this special case doesn't provide much extra performance, so it's expendable.
  • Use pointer appender to avoid bounds checks on every slice assignment. It also improves convenience.
  • Use wrapping math operators. These operations cannot overflow, so we can use the unsafe variants for better performance.

`each_char` already handles replacement chars
The performance benefit is questionable because `ascii_only?` iterates the
string bytes as well, leading to three consecutive iterations in the worst
case.
@straight-shoota straight-shoota added this to the 1.13.0 milestone Jun 7, 2024
end

# Append null byte
slice[i] = 0_u16
appender << 0_u16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slice#new already guarantees that the returned memory is zeroed out, so this could be removed

..or GC.malloc_atomic could be used to improve performance (+12% for me).

Copy link
Member Author

@straight-shoota straight-shoota Jun 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..or GC.malloc_atomic could be used to improve performance (+12% for me).

That's unexpected. Slice(UInt16).new should use malloc_atomic implicitly.
This is not directly obvious, because there's some indirection through the compiler. Pointer.malloc is a primitive and the compiler decides whether that calls GC.malloc or GC.malloc_atomic depending on the type (whether it could contain internal pointers or not).

Can you share the patch for +12% performance?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slice(UInt16).new should use malloc_atomic implicitly.

Yes, but crystal always zeroes out the memory, even if the GC already guarantees that the memory was zeroed out:

memset pointer, int8(0), size_t(size)

So atomically allocated memory is always cleared once, non-atomic memory is always cleared twice.
In this case, the memory does not need clearing at all.

Can you share the patch for +12% performance?

- slice = Slice(UInt16).new(u16_size + 1)
+ slice = Slice(UInt16).new(GC.malloc_atomic(sizeof(UInt16).to_u64! &* (u16_size + 1)).as(UInt16*), u16_size + 1)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. It's not atomic alloc but the clearing which makes a difference and we can prevent that with GC.malloc_atomic.

Copy link
Member Author

@straight-shoota straight-shoota Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to leave this as is.
Explicitly adding a trailing null shouldn't have any significant impact. So it doesn't matter to do it despite the memory being zeroed out. However, it makes sure the algorithm works correctly with non-zeroed memory. We should discuss this in a separate issue (#14687).

@straight-shoota straight-shoota merged commit ef04b2e into crystal-lang:master Jun 12, 2024
61 checks passed
@straight-shoota straight-shoota deleted the refactor/string-to_utf16 branch June 12, 2024 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants