- Proposal: SE-0228
- Authors: Becca Royal-Gordon, Michael Ilseman
- Review Manager: Doug Gregor
- Status: Implemented (Swift 5.0)
- Review: Discussion thread, Announcement thread
- Implementation: apple/swift#20214
String interpolation is a simple and powerful feature for expressing complex, runtime-created strings, but the current version of the ExpressibleByStringInterpolation protocol has been deprecated since Swift 3. We propose a new design that improves its performance, clarity, and efficiency.
Swift-evolution thread: [Draft] Fix ExpressibleByStringInterpolation, String interpolation revamp, String interpolation revamp: design decisions
An interpolated string literal contains one or more embedded expressions, delimited by \( and ). At runtime, these expressions are evaluated and concatenated with the string literal to produce a value. They are typically more readable than code that switches between string literals, concatenation operators, and arbitrary expressions.
Like most literal features in Swift, interpolated string literals are implemented with a protocol, ExpressibleByStringInterpolation. However, this protocol has been known to have issues since Swift 3, so it is currently deprecated.
We see three general classes of types that might want to conform to ExpressibleByStringInterpolation:
- Simple textual data: Types that represent simple, unconstrained text, like Swift.String itself. String types from foreign languages (like JavaScriptCore.JSValue) and alternative representations of strings (like a hypothetical ASCIIString type) might also want to participate in string interpolation.
- Structured textual data: Types that represent text but have some additional semantics. For example, a Foundation.AttributedString type might allow you to interpolate dictionaries to set or clear attributes. This gist’s LocalizableString type creates a format string, which can be looked up in a Foundation.Bundle’s localization tables.
- Machine-readable code fragments: Types that represent data in a format that will be understood by a machine, like SQLKit.SQLStatement or this blog post’s SanitizedHTML. These types often require data included in them to be escaped or passed out-of-band; a good ExpressibleByStringInterpolation design might allow this to be done automatically without the programmer having to do anything explicit. They may only support specific types, or may want to escape by default but also have a way to insert unescaped data.
The current design handles simple textual data, but struggles to support structured textual data and machine-readable code fragments.
The compiler parses a string literal into a series of segments, each of which is either a literal segment containing characters and escapes, or an interpolated segment containing an expression to be interpolated. If there is more than one segment, it wraps each segment in a call to init(stringInterpolationSegment:), then wraps all of the segments together in a call to init(stringInterpolation:):
// Semantic expression for: "hello \(name)!"
String(stringInterpolation:
  String(stringInterpolationSegment: "hello "),
  String(stringInterpolationSegment: name),
  String(stringInterpolationSegment: "!"))
The type checker considers all overloads of init(stringInterpolationSegment:), not just the one that implements the protocol requirement. Swift.String uses this to add fast paths for types conforming to CustomStringConvertible and TextOutputStreamable.
The current design is inefficient and inflexible for conformers. It does not permit special handling such as formatting or interpolation options.
Each init(stringInterpolationSegment:) call creates a temporary instance of Self; these instances are then concatenated together. Depending on the conformer or the segment size, this may trigger a heap allocation and ARC overhead for every single interpolated segment.
Furthermore, while the compiler knows the sizes and numbers of literal and interpolated segments, this is not communicated to the conformer.
If size information were available to the conformer, they could estimate the final size of the value and preallocate capacity. If segments were not converted to Self before concatenation, their data could be directly written to this preallocated capacity without using temporary instances.
The current approach does not permit conformers to specify additional parameters or options to govern the evaluation of an interpolated expression. Many conformers may want to provide alternative interpolation behaviors, such as disabling escaping in SanitizedHTML. Others may want to accept options, like controlling the format string used in a LocalizableString. String itself would like to support a format argument eventually.
init(stringInterpolationSegment:) takes an unconstrained generic value, so its parameter can be of any type. However, some conformers may want to limit the types that can be interpolated. For example, SQLKit.SQLStatement can only bind certain types, like integers and strings, to a SQL statement’s parameters.
This unconstrained generic parameter causes a second problem: when a literal is passed to init(stringInterpolationSegment:), it defaults to forming a String, the default literal type. This deviates from the standard library’s common practice of allowing the conformer to supply a literal type for use.
Finally, the conformer cannot easily determine whether an incoming segment was from a literal or an expression without resorting to hacks baking in compiler-internal details.
Compiler-internal details
An init(stringInterpolationSegment:) implementation cannot determine whether its parameter is a literal segment or an interpolated segment. However, the init(stringInterpolation:) call can exploit a compiler quirk to do so: the parser always generates a literal segment first, and always alternates between literal and interpolated segments (generating empty literal segments if necessary), so the position of a segment can tell you whether it is literal or interpolated.
Needless to say, this is the sort of obscure implementation detail we don’t want users to depend upon. And preserving enough data for init(stringInterpolation:) to treat a segment as either type often requires conformers to add extra properties or otherwise alter the type's design purely to support string interpolation.
If semantic analysis simply generated the semantic expression and then type-checked it normally, many string interpolations would be too complex to type-check. Instead, it type-checks each segment separately, then creates the init(stringInterpolationSegment:) call for the segment and type-checks just the one call to resolve its overload.
String interpolation is the only remaining client of this type-checker entry point; we want to get rid of it.
An improved string interpolation design could open many doors for future functionality in the standard library, in framework overlays, and in user code. To illustrate, here are some things we could use it for in code shipped with the Swift compiler. We're not proposing any of this, and any future proposal might look different—we're just demonstrating what's possible.
There are a number of approaches we could take to formatting values interpolated into strings. Here are a few examples with numbers:
// Use printf-style format strings:
"The price is $\(cost, format: "%.2f")"
// Use UTS #35 number formats:
"The price is \(cost, format: "¤###,##0.00")"
// Use Foundation.NumberFormatter, or a new type-safe native formatter:
"The price is \(cost, format: moneyFormatter)"
// Mimic String.init(_:radix:uppercase:)
"The checksum is 0x\(checksum, radix: 16)"
You could imagine analogous formatting tools for other types, like Data, Date, or even just String itself.
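As a rough illustration of the radix example above, the sketch below adds a hypothetical appendInterpolation(_:radix:uppercase:) overload to DefaultStringInterpolation. The overload and the sample checksum value are assumptions made for this example; nothing like it is part of the proposal.

```swift
// Hypothetical overload (not part of this proposal) that mirrors
// String.init(_:radix:uppercase:) for interpolated integers.
extension DefaultStringInterpolation {
    mutating func appendInterpolation<T: BinaryInteger>(
        _ value: T, radix: Int, uppercase: Bool = false
    ) {
        // Format the integer, then append it like any other literal text.
        appendLiteral(String(value, radix: radix, uppercase: uppercase))
    }
}

let checksum = 48879  // 0xBEEF, an arbitrary sample value
let message = "The checksum is 0x\(checksum, radix: 16)"
print(message)  // The checksum is 0xbeef
```

Because the interpolation is parsed as an argument list, the extra radix: label selects this overload without any special syntax.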
Some logging facilities restrict the kinds of data that can be logged or require extra metadata on certain values; a more powerful interpolation feature could support that:
log("Processing \(public: tagName) tag containing \(private: contents)")
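One way such a logging type might be structured under the design proposed here is sketched below. LogMessage, formatLog, and the redaction placeholder are hypothetical names chosen for illustration; a real logging facility would likely keep richer metadata than a flat string.

```swift
// Hypothetical message type: only labeled interpolations are accepted, so
// every interpolated value must be declared public or private.
struct LogMessage: ExpressibleByStringInterpolation {
    struct StringInterpolation: StringInterpolationProtocol {
        var text = ""

        init(literalCapacity: Int, interpolationCount: Int) {
            text.reserveCapacity(literalCapacity)
        }

        mutating func appendLiteral(_ literal: String) {
            text += literal
        }

        // Public values are recorded verbatim.
        mutating func appendInterpolation(public value: String) {
            text += value
        }

        // Private values are replaced by a placeholder.
        mutating func appendInterpolation(private value: String) {
            text += "<redacted>"
        }
    }

    let text: String

    init(stringLiteral value: String) {
        text = value
    }

    init(stringInterpolation: StringInterpolation) {
        text = stringInterpolation.text
    }
}

// Stand-in for a real logging call; it just returns the rendered line.
func formatLog(_ message: LogMessage) -> String {
    return message.text
}

let tagName = "img"
let contents = "secret"
let line = formatLog("Processing \(public: tagName) tag containing \(private: contents)")
print(line)  // Processing img tag containing <redacted>
```

Since no unlabeled appendInterpolation(_:) overload exists, writing \(contents) without a label would be rejected at compile time.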
NSAttributedString or a value-type wrapper around it could allow users to interpolate dictionaries of attributes to enable and disable them:
"\([.link: supportURL])Click here\([.link: nil]) to visit our support site"
A LocalizableString type could be expressed by a string literal, which would be used to generate a format string key and a list of arguments; converting a LocalizableString to an ordinary String would look up the key in a Bundle's localization table, then format the value with the arguments.
// Builds a LocalizableString(key: "The document “%@” could not be saved.", arguments: [name])
let message: LocalizableString = "The document “\(name)” could not be saved."
alert.messageText = String(localized: message)
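A minimal sketch of how the key and argument list could be accumulated is shown below. It assumes only String arguments and does not escape literal percent signs or perform the Bundle lookup; the type layout is hypothetical and is not the implementation from the linked gist.

```swift
// Hypothetical sketch: literal segments extend the format key, and each
// interpolation contributes a "%@" placeholder plus an argument.
struct LocalizableString: ExpressibleByStringInterpolation {
    struct StringInterpolation: StringInterpolationProtocol {
        var key = ""
        var arguments: [String] = []

        init(literalCapacity: Int, interpolationCount: Int) {
            // Each placeholder is two characters long.
            key.reserveCapacity(literalCapacity + interpolationCount * 2)
            arguments.reserveCapacity(interpolationCount)
        }

        mutating func appendLiteral(_ literal: String) {
            key += literal
        }

        mutating func appendInterpolation(_ value: String) {
            key += "%@"
            arguments.append(value)
        }
    }

    let key: String
    let arguments: [String]

    init(stringLiteral value: String) {
        key = value
        arguments = []
    }

    init(stringInterpolation: StringInterpolation) {
        key = stringInterpolation.key
        arguments = stringInterpolation.arguments
    }
}

let name = "Report"
let message: LocalizableString = "The document “\(name)” could not be saved."
print(message.key)        // The document “%@” could not be saved.
print(message.arguments)  // ["Report"]
```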
We propose completely reworking the currently-deprecated ExpressibleByStringInterpolation as follows (doc comments omitted for brevity):
public protocol ExpressibleByStringInterpolation
  : ExpressibleByStringLiteral {
  associatedtype StringInterpolation : StringInterpolationProtocol
    = String.StringInterpolation
    where StringInterpolation.StringLiteralType == StringLiteralType
  init(stringInterpolation: StringInterpolation)
}
public protocol StringInterpolationProtocol {
  associatedtype StringLiteralType : _ExpressibleByBuiltinStringLiteral
  init(literalCapacity: Int, interpolationCount: Int)
  mutating func appendLiteral(_ literal: StringLiteralType)
  // Informal requirement: mutating func appendInterpolation(...)
}
An interpolated string will be converted into code that:
- Initializes an instance of an associated StringInterpolation type, passing the total literal segment size and interpolation count as parameters.
- Calls its appendLiteral(_:) method to append literal values, and appendInterpolation to append its interpolated values, one at a time. Interpolations are treated as call parentheses; that is, \(x, with: y) becomes a call to appendInterpolation(x, with: y).
- Passes the instance to init(stringInterpolation:) to produce a final value.
Below is code roughly similar to what the compiler would generate:
// Semantic expression for: "hello \(name)!"
String(stringInterpolation: {
  var temp = String.StringInterpolation(literalCapacity: 7, interpolationCount: 1)
  temp.appendLiteral("hello ")
  temp.appendInterpolation(name)
  temp.appendLiteral("!")
  return temp
}())
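The same sequence can be written out by hand to see that it yields the same value as the literal. The snippet below is purely illustrative: the sample name is an assumption, and in ordinary code these calls are left to the compiler.

```swift
let name = "world"  // sample value for illustration

// Manually perform the steps the compiler would generate for "hello \(name)!".
// literalCapacity is 7: six bytes for "hello " plus one for "!".
var temp = String.StringInterpolation(literalCapacity: 7, interpolationCount: 1)
temp.appendLiteral("hello ")
temp.appendInterpolation(name)
temp.appendLiteral("!")
let manual = String(stringInterpolation: temp)

// The ordinary interpolated literal produces the same string.
let literal = "hello \(name)!"
print(manual == literal)  // true
```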
We have written a few examples of conforming types.
This design has been implemented in apple/swift#18590.
The associated StringInterpolation type is a sort of buffer or scratchpad where the value of an interpolated string literal is accumulated. By having it be an associated type, rather than Self as it currently is, we realize a few benefits:
- A new type can serve as a namespace for the various appendLiteral and appendInterpolation methods. This allows conformers to add new interpolation methods without them showing up in code completion, documentation, etc.
- A separate type can store extra temporary state involved in the formation of the result. For example, Foundation.AttributedString might need to track the current attributes in a property; a type backed by a parsed data structure, like a LambdaCalculusExp or Regexp type, could store an unparsed string or parser state. When a type does not need any extra state, the associated type does not add any overhead.
- Several different types can share an implementation. For instance, String and Substring both use a common StringInterpolationProtocol-conforming type.
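To make the namespace and scratchpad roles concrete, here is a sketch of a SanitizedHTML-like type whose nested StringInterpolation holds the buffer and the append methods. The escaping rules, the raw: label, and the capacity estimate are assumptions for this example, not the design from the blog post mentioned earlier.

```swift
// Hypothetical sketch: the nested type carries the working buffer and the
// append methods, keeping them off SanitizedHTML's own API surface.
struct SanitizedHTML: ExpressibleByStringInterpolation {
    struct StringInterpolation: StringInterpolationProtocol {
        var html = ""

        init(literalCapacity: Int, interpolationCount: Int) {
            // Rough guess of eight bytes per interpolated value.
            html.reserveCapacity(literalCapacity + interpolationCount * 8)
        }

        // Literal segments are trusted markup and are appended unchanged.
        mutating func appendLiteral(_ literal: String) {
            html += literal
        }

        // Interpolated strings are escaped by default.
        mutating func appendInterpolation(_ value: String) {
            for character in value {
                switch character {
                case "&": html += "&amp;"
                case "<": html += "&lt;"
                case ">": html += "&gt;"
                case "\"": html += "&quot;"
                default: html.append(character)
                }
            }
        }

        // Already-sanitized markup can be inserted without escaping.
        mutating func appendInterpolation(raw value: SanitizedHTML) {
            html += value.html
        }
    }

    let html: String

    init(stringLiteral value: String) {
        html = value
    }

    init(stringInterpolation: StringInterpolation) {
        html = stringInterpolation.html
    }
}

let userInput = "<script>alert(1)</script>"
let bold: SanitizedHTML = "<b>hi</b>"
let page: SanitizedHTML = "<p>\(userInput) \(raw: bold)</p>"
print(page.html)
// <p>&lt;script&gt;alert(1)&lt;/script&gt; <b>hi</b></p>
```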
The standard library will provide a DefaultStringInterpolation type; StringProtocol, and therefore String and Substring, will use this type for their interpolation. (Substring did not previously permit interpolation.)
The standard library will also provide two sets of default implementations:
- For types using DefaultStringInterpolation, it will provide a default init(stringInterpolation:) that extracts the value after interpolation and forwards it to init(stringLiteral:). Thus, types that currently conform to ExpressibleByStringLiteral and use String as their literal type can add simple interpolation support by merely changing their conformance to ExpressibleByStringInterpolation.
- For other types, it will provide a default init(stringLiteral:) that constructs a Self.StringInterpolation instance, calls its appendLiteral(_:) method, and forwards it to init(stringInterpolation:). (An unavailable or deprecated init(stringLiteral:) will ensure that this is never used with the init(stringInterpolation:) provided for DefaultStringInterpolation-using types, which would cause infinite recursion.)
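The first default can be seen in a small sketch: a type that only implements init(stringLiteral:) gains interpolation support by declaring the ExpressibleByStringInterpolation conformance. Identifier and its property are hypothetical names used for illustration.

```swift
// Hypothetical type that previously conformed to ExpressibleByStringLiteral.
// Changing the conformance is the only edit; the default
// init(stringInterpolation:) forwards the finished string to init(stringLiteral:).
struct Identifier: ExpressibleByStringInterpolation {
    let rawValue: String

    init(stringLiteral value: String) {
        rawValue = value
    }
}

let userID = 42
let identifier: Identifier = "user-\(userID)"
print(identifier.rawValue)  // user-42
```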
StringInterpolation types must conform to StringInterpolationProtocol, which requires the init(literalCapacity:interpolationCount:) initializer and the appendLiteral(_:) method.
Non-literal segments are restricted at compile time to the overloads of appendInterpolation supplied by the conformer. This allows conforming types to restrict the values that can be interpolated into them by implementing only methods that accept the types they want to support. appendInterpolation can be overloaded to support several unrelated types.
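The sketch below shows such a restriction for a SQL-like statement type that binds only integers and strings. The type is hypothetical and unrelated to any real SQLKit API; the "?" placeholder convention is likewise an assumption.

```swift
// Hypothetical statement type: each interpolation becomes a "?" placeholder
// and a bound parameter, and only Int and String can be bound.
struct SQLStatement: ExpressibleByStringInterpolation {
    enum Parameter: Equatable {
        case integer(Int)
        case text(String)
    }

    struct StringInterpolation: StringInterpolationProtocol {
        var sql = ""
        var parameters: [Parameter] = []

        init(literalCapacity: Int, interpolationCount: Int) {
            sql.reserveCapacity(literalCapacity + interpolationCount)
            parameters.reserveCapacity(interpolationCount)
        }

        mutating func appendLiteral(_ literal: String) {
            sql += literal
        }

        mutating func appendInterpolation(_ value: Int) {
            sql += "?"
            parameters.append(.integer(value))
        }

        mutating func appendInterpolation(_ value: String) {
            sql += "?"
            parameters.append(.text(value))
        }
    }

    let sql: String
    let parameters: [Parameter]

    init(stringLiteral value: String) {
        sql = value
        parameters = []
    }

    init(stringInterpolation: StringInterpolation) {
        sql = stringInterpolation.sql
        parameters = stringInterpolation.parameters
    }
}

let userID = 7
let userName = "O'Brien"
let statement: SQLStatement = "SELECT * FROM users WHERE id = \(userID) AND name = \(userName)"
print(statement.sql)  // SELECT * FROM users WHERE id = ? AND name = ?
```

Interpolating a value of any other type, such as a Double, has no matching overload and so fails to compile.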
appendInterpolation methods can specify any parameter signature they wish. An appendInterpolation method can accept multiple parameters (with or without default values), can require a label on any parameter (including the first one), and can have variadic parameters. appendInterpolation methods can also throw; if one does, the string literal must be covered by a try, try?, or try! keyword. Future work includes enhancing String to accept formatting control.
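As a small sketch of these rules, the hypothetical overload below takes a labeled first parameter and a defaulted second one, and it throws, so the literal that uses it has to be marked with try. The error type, labels, and helper function are all assumptions made for this example.

```swift
// Hypothetical error type for the example.
struct EmptyValueError: Error {}

extension DefaultStringInterpolation {
    // Labeled first parameter, defaulted second parameter, and a throwing body.
    mutating func appendInterpolation(nonEmpty value: String, quoted: Bool = false) throws {
        guard !value.isEmpty else { throw EmptyValueError() }
        appendLiteral(quoted ? "'" + value + "'" : value)
    }
}

func greeting(for name: String) throws -> String {
    // The whole literal is covered by `try` because the interpolation can throw.
    return try "Hello, \(nonEmpty: name, quoted: true)!"
}

let success = try? greeting(for: "Ada")
let failure = try? greeting(for: "")
print(success ?? "nil")  // Hello, 'Ada'!
print(failure ?? "nil")  // nil
```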
While this part of the design gives us great flexibility, it does introduce an implicit relationship between the compiler and ad-hoc methods declared by the conformer. It also restricts what values can be interpolated in a context generic over StringInterpolationProtocol, though further constraints can lift this restriction.
Even though there is no formal requirement listed in the protocol, we have modified the compiler to emit an error if a StringInterpolationProtocol-conforming type does not have at least one appendInterpolation overload that is as public as the type, returns no value (or a discardable one), and is not static.
Interpolations will be parsed as argument lists; labels and multiple parameters will be permitted, but trailing closures will not.
This change is slightly source-breaking: a 4.2 interpolation like \(x, y), which tries to interpolate a tuple, would need to be written \((x, y)). While we could address un-labeled tuples with n-arity overloads of appendInterpolation, labeled tuples would still break. We emulate the current behavior in Swift 4.2 mode, and we can easily correct it during migration to Swift 5.
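For example, in Swift 5 mode a tuple is interpolated by adding the extra pair of parentheses (the variable names are arbitrary):

```swift
let x = 1
let y = 2

// The inner parentheses form a tuple; the outer pair belongs to the interpolation.
let pair = "\((x, y))"
print(pair)  // (1, 2)
```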
We will add ExpressibleByStringInterpolation conformance to StringProtocol, and thus to Substring, allowing interpolations in string literals used to create Substrings.
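A short sketch of what this enables, using an arbitrary sample value:

```swift
let count = 3

// With the StringProtocol conformance, an interpolated literal can
// initialize a Substring directly.
let label: Substring = "count=\(count)"
print(label)  // count=3
```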
We will add TextOutputStreamable conformances to Float, Double, and Float80, along with an underscored, defaulted method for writing raw ASCII buffers to TextOutputStreams. These changes together reduce a regression in Float interpolation benchmarks and completely reverse regressions in Double and Float80 interpolation benchmarks.
The `DefaultStringInterpolation` type
The standard library uses make() to extract the final value; a CustomStringConvertible conformance, whose description property returns the same accumulated string, is provided as a public equivalent for types that want to use DefaultStringInterpolation but do some processing in their init(stringInterpolation:) implementation.
/// Represents a string literal with interpolations while it is being built up.
///
/// Do not create an instance of this type directly. It is used by the compiler
/// when you create a string using string interpolation. Instead, use string
/// interpolation to create a new string by including values, literals,
/// variables, or expressions enclosed in parentheses, prefixed by a
/// backslash (`\(`...`)`).
///
/// let price = 2
/// let number = 3
/// let message = "If one cookie costs \(price) dollars, " +
/// "\(number) cookies cost \(price * number) dollars."
/// print(message)
/// // Prints "If one cookie costs 2 dollars, 3 cookies cost 6 dollars."
///
/// When implementing an `ExpressibleByStringInterpolation` conformance,
/// set the `StringInterpolation` associated type to `DefaultStringInterpolation`
/// to get the same interpolation behavior as Swift's built-in `String` type and
/// construct a `String` with the results. If you don't want the default behavior
/// or don't want to construct a `String`, use a custom type conforming to
/// `StringInterpolationProtocol` instead.
///
/// Extending default string interpolation behavior
/// ===============================================
///
/// Code outside the standard library can extend string interpolation on
/// `String` and many other common types by extending
/// `DefaultStringInterpolation` and adding an `appendInterpolation(...)`
/// method. For example:
///
/// extension DefaultStringInterpolation {
/// fileprivate mutating func appendInterpolation(
/// escaped value: String, asASCII forceASCII: Bool = false) {
/// for char in value.unicodeScalars {
/// appendInterpolation(char.escaped(asASCII: forceASCII))
/// }
/// }
/// }
///
/// print("Escaped string: \(escaped: string)")
///
/// See `StringInterpolationProtocol` for details on `appendInterpolation`
/// methods.
///
/// `DefaultStringInterpolation` extensions should add only `mutating` members
/// and should not copy `self` or capture it in an escaping closure.
@_fixed_layout
public struct DefaultStringInterpolation: StringInterpolationProtocol {
/// The string contents accumulated by this instance.
@usableFromInline
internal var _storage: String = ""
/// Creates a string interpolation with storage pre-sized for a literal
/// with the indicated attributes.
///
/// Do not call this initializer directly. It is used by the compiler when
/// interpreting string interpolations.
@inlinable
public init(literalCapacity: Int, interpolationCount: Int) {
let capacityPerInterpolation = 2
let initialCapacity = literalCapacity + interpolationCount * capacityPerInterpolation
_storage.reserveCapacity(initialCapacity)
}
/// Appends a literal segment of a string interpolation.
///
/// Do not call this method directly. It is used by the compiler when
/// interpreting string interpolations.
@inlinable
public mutating func appendLiteral(_ literal: String) {
_storage += literal
}
/// Interpolates the given value's textual representation into the
/// string literal being created.
///
/// Do not call this method directly. It is used by the compiler when
/// interpreting string interpolations. Instead, use string
/// interpolation to create a new string by including values, literals,
/// variables, or expressions enclosed in parentheses, prefixed by a
/// backslash (`\(`...`)`).
///
/// let price = 2
/// let number = 3
/// let message = "If one cookie costs \(price) dollars, " +
/// "\(number) cookies cost \(price * number) dollars."
/// print(message)
/// // Prints "If one cookie costs 2 dollars, 3 cookies cost 6 dollars."
@inlinable
public mutating func appendInterpolation<T: TextOutputStreamable & CustomStringConvertible>(_ value: T) {
value.write(to: &_storage)
}
/// Interpolates the given value's textual representation into the
/// string literal being created.
///
/// Do not call this method directly. It is used by the compiler when
/// interpreting string interpolations. Instead, use string
/// interpolation to create a new string by including values, literals,
/// variables, or expressions enclosed in parentheses, prefixed by a
/// backslash (`\(`...`)`).
///
/// let price = 2
/// let number = 3
/// let message = "If one cookie costs \(price) dollars, " +
/// "\(number) cookies cost \(price * number) dollars."
/// print(message)
/// // Prints "If one cookie costs 2 dollars, 3 cookies cost 6 dollars."
@inlinable
public mutating func appendInterpolation<T: TextOutputStreamable>(_ value: T) {
value.write(to: &_storage)
}
/// Interpolates the given value's textual representation into the
/// string literal being created.
///
/// Do not call this method directly. It is used by the compiler when
/// interpreting string interpolations. Instead, use string
/// interpolation to create a new string by including values, literals,
/// variables, or expressions enclosed in parentheses, prefixed by a
/// backslash (`\(`...`)`).
///
/// let price = 2
/// let number = 3
/// let message = "If one cookie costs \(price) dollars, " +
/// "\(number) cookies cost \(price * number) dollars."
/// print(message)
/// // Prints "If one cookie costs 2 dollars, 3 cookies cost 6 dollars."
@inlinable
public mutating func appendInterpolation<T: CustomStringConvertible>(_ value: T) {
_storage += value.description
}
/// Interpolates the given value's textual representation into the
/// string literal being created.
///
/// Do not call this method directly. It is used by the compiler when
/// interpreting string interpolations. Instead, use string
/// interpolation to create a new string by including values, literals,
/// variables, or expressions enclosed in parentheses, prefixed by a
/// backslash (`\(`...`)`).
///
/// let price = 2
/// let number = 3
/// let message = "If one cookie costs \(price) dollars, " +
/// "\(number) cookies cost \(price * number) dollars."
/// print(message)
/// // Prints "If one cookie costs 2 dollars, 3 cookies cost 6 dollars."
@inlinable
public mutating func appendInterpolation<T>(_ value: T) {
_print_unlocked(value, &_storage)
}
/// Creates a string from this instance, consuming the instance in the process.
@inlinable
internal __consuming func make() -> String {
return _storage
}
}
extension DefaultStringInterpolation: CustomStringConvertible {
@inlinable
public var description: String {
return _storage
}
}
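As a sketch of the use case mentioned above for the public description property, the hypothetical type below keeps DefaultStringInterpolation but post-processes the accumulated string in its own init(stringInterpolation:). The type name and the uppercasing step are assumptions for illustration.

```swift
// Hypothetical type that reuses the default interpolation behavior and then
// transforms the result.
struct ShoutedText: ExpressibleByStringInterpolation {
    let text: String

    init(stringLiteral value: String) {
        text = value.uppercased()
    }

    init(stringInterpolation: DefaultStringInterpolation) {
        // description exposes the accumulated string, since make() is internal.
        text = stringInterpolation.description.uppercased()
    }
}

let item = "coffee"
let shout: ShoutedText = "more \(item), please"
print(shout.text)  // MORE COFFEE, PLEASE
```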
Generating the append calls
This design puts every appendLiteral(_:) and appendInterpolation call in its own statement, so there’s no need for special type checker treatment. Each interpolation will naturally be type-checked separately, and the overloads of appendInterpolation will be resolved at the same time as the value being interpolated. This helps us with ongoing refactoring of the type checker.
Due to issues with capturing of partially initialized variables, we do not enclose these statements in a closure. Instead, we use a new kind of AST node.
Performance
While some string interpolation benchmarks show regressions of 20–30%, most show improvements, sometimes dramatic ones.
Benchmark | -O speed improvement | -Osize speed improvement |
---|---|---|
StringInterpolationManySmallSegments | 2.15x | 1.80x |
StringInterpolationSmall | 2.01x | 2.03x |
ArrayAppendStrings | 1.16x | 1.14x |
FloatingPointPrinting_Double_interpolated | 1.15x | 1.16x |
FloatingPointPrinting_Float80_interpolated | 1.09x | 1.08x |
StringInterpolation | 0.82x | 0.79x |
FloatingPointPrinting_Float_interpolated | 0.82x | 0.73x |
The StringInterpolation benchmark's regression is caused by the particular sizes of its literal and interpolated segments; in the new design, these happen to cause the benchmark to grow its buffer an extra time. We don't think it's representative of the design's performance.
Initially, all three FloatingPointPrinting_<type>_interpolated tests regressed with the new design. We conformed these types to TextOutputStreamable and added a private ASCII-only fast path in TextOutputStream; this turned the Double and Float80 regressions into small improvements, but did little to help Float.
Benchmark code size slightly improved on average:
Benchmark file | -O size improvement | -Osize size improvement |
---|---|---|
StringInterpolation.o | 1.18x | 1.16x |
FloatingPointPrinting.o | 1.12x | 1.11x |
All files with notable changes | 1.02x | 1.02x |
So did Swift library code size:
Library | Size improvement |
---|---|
libswiftSwiftPrivateLibcExtras.dylib | 1.20x |
libswiftFoundation.dylib | 1.15x |
libswiftXCTest.dylib | 1.10x |
libswiftStdlibUnittest.dylib | 1.06x |
libswiftCore.dylib | 1.04x |
libswiftNetwork.dylib | 1.02x |
libswiftSwiftOnoneSupport.dylib | 1.02x |
libswiftsimd.dylib | 1.01x |
libswiftMetal.dylib | 0.90x |
libswiftSwiftReflectionTest.dylib | 0.92x |
We believe the current results already look pretty good, and further performance tuning is possible in the future. Other types can likely improve interpolation performance using TextOutputStreamable. Overall, this design has nowhere to go but up.
The default init(stringLiteral:) (which is only used for types implementing fully custom string interpolation) is currently about 0.5x the speed of a manually-implemented init(stringLiteral:), but prototyping indicates that inlining certain fast paths from String.reserveCapacity(_:) and String.append(_:) can reduce that penalty to 0.93x, and we may be able to squeeze out gains beyond that. Even if we cannot close this gap completely, performance-sensitive types can always implement init(stringLiteral:) manually.
Since ExpressibleByStringInterpolation has been deprecated since Swift 3, we need not maintain source compatibility with existing conformances, nor do we propose preserving existing conformances to ExpressibleByStringInterpolation even in Swift 4 mode.
We do not propose preserving existing init(stringInterpolation:) or init(stringInterpolationSegment:) initializers, since they have always been documented as calls that should not be used directly. However, the source compatibility suite contains code that accidentally uses init(stringInterpolationSegment:) by writing String.init in a context expecting a CustomStringConvertible or TextOutputStreamable type. We have devised a set of overloads to init(describing:) that will match these accidental, implicit uses of init(stringInterpolationSegment:) without preserving explicit uses of init(stringInterpolationSegment:).
We propose a set of String.StringInterpolation.appendInterpolation overloads that exactly match the current init(stringInterpolationSegment:) overloads, so “normal” interpolations will work exactly as before.
“Strange” interpolations like \(x, y) or \(foo: x), which are currently accepted by the Swift compiler, will be errors in Swift 5 mode. In Swift 4.2 mode, we will preserve the existing behavior with a warning; this means that Swift 4.2 code will only be able to use appendInterpolation overloads with a single unlabeled parameter, unless all other parameters have default values. Migration involves inserting an extra pair of parens or removing an argument label to preserve behavior.
ExpressibleByStringInterpolation will need to be ABI-stable starting in Swift 5; we should adopt this proposal or some alternative and un-deprecate ExpressibleByStringInterpolation before that.
This API is pretty foundational and it would be difficult to change compatibly in the future.
We considered several designs that, like the current design, passed segments to a variadic parameter. For example, we could wrap literal segments in init(stringLiteral:) instead of init(stringInterpolationSegment:) and otherwise keep the existing design:
String(stringInterpolation:
  String(stringLiteral: "hello "),
  String(stringInterpolationSegment: name),
  String(stringLiteral: "!"))
Or we could use an enum to differentiate literal segments from interpolated ones:
String(stringInterpolation:
  .literal("hello "),
  .interpolation(String.StringInterpolationType(name)),
  .literal("!"))
However, this requires that conformers expose a homogeneous return value, which has expressibility and/or efficiency drawbacks. The proposed approach, which is statement-based, keeps this as a detail internal to the conformer.
We considered having a formal appendInterpolation(_:) requirement with an unconstrained generic parameter to mimic current behavior. We could even have a default implementation that vends strings and still honors overloading.
However, we would have to give up on conformers being able to restrict the types or interpolation segment forms permitted.