Deque subscript _modify accessor heap allocates #164
Comments
The latest nightly build on Linux successfully inlines the
(Latest nightly on Linux is
The same is true in the latest nightly on macOS, the
The 5.7 nightlies, however, still allocate.
I wonder what change in the compiler made it capable of inlining this and optimizing out the malloc. Might be worth making sure that there was an adequate test case added in whatever PR that was in the compiler repo to ensure this optimization doesn’t regress.
We have very limited options here to work around this. Would marking the accessor
Hm, force-inlining the accessor does appear to help in Release builds. The code size implications are somewhat worrying, but I'm not sure what else we can do on this end. I can try moving the preparations & cleanup into separate inlinable functions; perhaps that would be enough.
Yep, that seems to help:

    @inlinable
    public subscript(index: Int) -> Element {
      ...
      @inline(__always) // https://github.com/apple/swift-collections/issues/164
      _modify {
        precondition(index >= 0 && index < count, "Index out of bounds")
        var (slot, value) = _prepareForModify(at: index)
        defer {
          _finalizeModify(slot, value)
        }
        yield &value
      }
    }

    @inlinable
    internal mutating func _prepareForModify(at index: Int) -> (_Slot, Element) {
      _storage.ensureUnique()
      // We technically aren't supposed to escape storage pointers out of a
      // managed buffer, so we escape a `(slot, value)` pair instead, leaving
      // the corresponding slot temporarily uninitialized.
      return _storage.update { handle in
        let slot = handle.slot(forOffset: index)
        return (slot, handle.ptr(at: slot).move())
      }
    }

    @inlinable
    internal mutating func _finalizeModify(_ slot: _Slot, _ value: Element) {
      _storage.update { handle in
        handle.ptr(at: slot).initialize(to: value)
      }
    }
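For anyone who wants to experiment with this shape outside the package, here is a minimal, self-contained sketch of the same prepare/finalize split. Everything in it is an assumption made for illustration: `TinyRing` and `_RingStorage` are hypothetical stand-ins, not the swift-collections types, and a raw pointer replaces the managed buffer. It only mirrors the structure of the accessor (move the element out, yield it, re-initialize the slot in a `defer`); whether the coroutine frame is actually allocated still depends on the compiler version and build settings.

```swift
// Hypothetical fixed-size storage; NOT the swift-collections buffer.
final class _RingStorage<Element> {
  let capacity: Int
  let base: UnsafeMutablePointer<Element>

  init(_ elements: [Element]) {
    precondition(!elements.isEmpty, "Sketch assumes at least one element")
    capacity = elements.count
    base = .allocate(capacity: elements.count)
    base.initialize(from: elements, count: elements.count)
  }

  deinit {
    base.deinitialize(count: capacity)
    base.deallocate()
  }

  // Copy the elements into fresh storage (used for copy-on-write).
  func copy() -> _RingStorage {
    _RingStorage(Array(UnsafeBufferPointer(start: base, count: capacity)))
  }
}

// Hypothetical stand-in for Deque: a full ring whose logical element 0
// lives at physical slot `_head`.
struct TinyRing<Element> {
  var _storage: _RingStorage<Element>
  var _head: Int

  init(_ elements: [Element], head: Int = 0) {
    _storage = _RingStorage(elements)
    _head = head % elements.count
  }

  var count: Int { _storage.capacity }

  func _slot(forOffset offset: Int) -> Int {
    (_head + offset) % _storage.capacity
  }

  subscript(index: Int) -> Element {
    get {
      precondition(index >= 0 && index < count, "Index out of bounds")
      return _storage.base[_slot(forOffset: index)]
    }
    @inline(__always)  // same workaround as in the snippet above
    _modify {
      precondition(index >= 0 && index < count, "Index out of bounds")
      var (slot, value) = _prepareForModify(at: index)
      defer {
        _finalizeModify(slot, value)
      }
      yield &value
    }
  }

  // Ensures unique storage, then moves the element out, leaving its
  // slot temporarily uninitialized.
  mutating func _prepareForModify(at index: Int) -> (Int, Element) {
    if !isKnownUniquelyReferenced(&_storage) {
      _storage = _storage.copy()
    }
    let slot = _slot(forOffset: index)
    return (slot, (_storage.base + slot).move())
  }

  // Puts the (possibly mutated) value back into its slot.
  mutating func _finalizeModify(_ slot: Int, _ value: Element) {
    (_storage.base + slot).initialize(to: value)
  }
}

// In-place mutation goes through `_modify` rather than get + set.
var ring = TinyRing([10, 20, 30], head: 1)
ring[0] += 5
```

Keeping the uniqueness check and the pointer work in separate helper functions means only the small accessor body is force-inlined into each call site, which is the code-size trade-off discussed above.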
CC: @eeckstein
Not inlined co-routines almost always allocate. There is not much we can do here.
I'm inclined to say we should investigate that idea: the cost of the coroutine allocations ends up being really painful.
Coroutine lowering is currently pretty naive and allocates a lot more than it needs to. We have a whole bunch of issues tracking various improvements to the lowering that would reduce the likelihood of allocation without relying on inlining.
But still, the inline buffer size (which is 2 or 3 words) is a hard limit. Unless we increase that, it's still very likely that a co-routine will allocate.
Consider the following simple function (the operation is not useful, it's just intended to be complex enough to defeat being optimized away and simple enough to make the generated assembly simple):
When compiled in release mode using swift-collections 1.0.2 and current `main` (c3fdcf7), this generates the following assembly on my arm64 Mac:
As you can see, `_modify` has not been inlined but it has been specialized. As a result, the call to the continuation has been split and delayed (visible as `blr x8`). `_modify` has compiled to:
Notice the call to `malloc` at the start of the function. This call drastically dominates the performance of `_modify`, and it makes `Deque` extremely difficult to use in performance-sensitive code.
Information
`main` (c3fdcf7)
swift-driver version: 1.45.2 Apple Swift version 5.6.1 (swiftlang-5.6.0.323.66 clang-1316.0.20.12) Target: arm64-apple-macosx12.0
Checklist
`main` branch of this package.
Steps to Reproduce
Use the above code sample in an empty Swift project that depends on swift-collections.
Expected behavior
No allocations in `_modify`.
Actual behavior
A direct call to `malloc` and a subsequent call to `free`.