Comparison operators #702

Merged
merged 17 commits into from
Sep 24, 2021
Changes from 4 commits
93 changes: 62 additions & 31 deletions proposals/p0702.md
@@ -23,6 +23,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Precedence](#precedence)
- [Associativity](#associativity)
- [Conversions](#conversions)
- [Performance](#performance)
- [Overloading](#overloading)
- [Default implementations for basic types](#default-implementations-for-basic-types)
- [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
@@ -32,7 +33,6 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Convert operands like C++](#convert-operands-like-c)
- [Provide a three-way comparison operator](#provide-a-three-way-comparison-operator)
- [Allow comparisons as the operand of `not`](#allow-comparisons-as-the-operand-of-not)
- [Disallow relational comparisons of Boolean values](#disallow-relational-comparisons-of-boolean-values)

<!-- tocstop -->

@@ -165,10 +165,10 @@ if (m > 1 == n > 1) {}

### Conversions

When both operands are of standard Carbon numeric types (`Int(n)` or
`Float(n)`), no conversions are performed on either operand, and the result is
the mathematically correct result for that comparison, or `False` if either
operand is a NaN. For example:
When both operands are of standard Carbon numeric types (`Int(n)`,
`Unsigned(n)`, or `Float(n)`), no conversions are performed on either operand,
and the result is the mathematically correct result for that comparison, or
`False` if either operand is a NaN. For example:

```
// The value of `v` is True, because `a` is less than `b`, even though the
@@ -184,13 +184,39 @@ let w: Bool = f == n;
```

An equivalent viewpoint is that the comparison is performed in a hypothetical
suffiicently large type. For example, a comparison of `i32` against `u32` can be
sufficiently large type. For example, a comparison of `i32` against `u32` can be
performed in `i64`, and a comparison of `f32` against `i32` can be performed in
`f64`. However, no such type is required to actually exist.

Note that this diverges from C++, which would convert both operands to a common
type first, sometimes performing a lossy conversion.
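
As an illustration of the difference, here is a minimal C++ sketch of mixed-sign integer comparison. The helper names are hypothetical and are not part of the proposal. The `i32`/`u32` case is performed in `i64`, as described above; the `i64`/`u64` cases have no wider standard type to use, so they branch on the sign instead.

```cpp
#include <cstdint>

// What C++ does for `a < b`: the usual arithmetic conversions turn `a`
// into an unsigned value, so a negative `a` becomes a very large number.
constexpr bool cpp_less_i32_u32(std::int32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>(a) < b;
}

// Mathematically correct result, performed in a sufficiently large type.
constexpr bool math_less_i32_u32(std::int32_t a, std::uint32_t b) {
  return static_cast<std::int64_t>(a) < static_cast<std::int64_t>(b);
}

// No wider standard type exists here, so handle negative values separately.
constexpr bool math_less_i64_u64(std::int64_t a, std::uint64_t b) {
  return a < 0 || static_cast<std::uint64_t>(a) < b;
}

constexpr bool math_less_u64_i64(std::uint64_t a, std::int64_t b) {
  return b >= 0 && a < static_cast<std::uint64_t>(b);
}

// -1 < 1 holds mathematically, but not after the C++ conversion.
static_assert(!cpp_less_i32_u32(-1, 1u), "lossy conversion changes the result");
static_assert(math_less_i32_u32(-1, 1u), "comparison in i64 is correct");
```

C++20 provides `std::cmp_less` and related functions in `<utility>` for the same purpose.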

#### Performance

The choice to not convert has a performance impact in practice, because it
exposes operations that some processors do not currently directly support.
[Sample microbenchmarks](https://godbolt.org/z/dfGe4MhEx) for implementations of
several operations show the following performance on x86_64 (use the Quick-bench
link in Compiler Explorer to run the benchmarks):

| Operation | Mathematical comparison time | C++ comparison time | Ratio |
| --------------- | ---------------------------- | ------------------- | ----- |
| `i64 < u64` | 2814 | 992 | 2.8x |
| `u64 < i64` | 1957 | 1012 | 1.9x |
| `f64 == i64` | 4996 | 2197 | 2.3x |
| `f64 < i64` (a) | 2012 | 2841 | 0.7x |
| `f64 < i64` (b) | 5332 | 2647 | 2.0x |

The mathematical code sequence used for `f64 < i64` introduces a branch around a
slow path; in the benchmark, that branch should never be mispredicted. Line (a)
demonstrates the fast path and line (b) demonstrates the slow path.

The mixed-type operations are typically 2-3x slower than the same-type
operations. However, this is a predictable performance change, and can be
controlled by the developer by converting the operands to a suitable type prior
to the comparison if a faster same-type comparison is preferred over a correct
mixed-type comparison.
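
The fast-path and slow-path structure described above might look like the following C++ sketch. This is not the code from the linked benchmark: the function names and the choice of fast-path test (whether the integer is exactly representable as an `f64`) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// What C++ does for `f < n`: convert `n` to double, possibly rounding it.
inline bool cpp_less_f64_i64(double f, std::int64_t n) {
  return f < static_cast<double>(n);
}

// Mathematically correct `f < n`; returns false if `f` is a NaN.
inline bool math_less_f64_i64(double f, std::int64_t n) {
  // Every integer with magnitude at most 2^53 is exactly representable
  // as a double, so the conversion below loses nothing.
  constexpr std::int64_t kExactLimit = std::int64_t{1} << 53;
  if (n >= -kExactLimit && n <= kExactLimit) {
    return f < static_cast<double>(n);  // fast path (NaN compares false)
  }

  // Slow path: `n` may not be representable, so compare in the integer
  // domain after dealing with NaN and out-of-range values of `f`.
  if (std::isnan(f)) return false;
  constexpr double kTwoPow63 = 9223372036854775808.0;  // exactly 2^63
  if (f >= kTwoPow63) return false;  // above every i64 value
  if (f < -kTwoPow63) return true;   // below every i64 value
  assert(f >= -kTwoPow63 && f < kTwoPow63);
  // For an integer n, f < n holds exactly when floor(f) < n, and
  // floor(f) now fits in an i64 without loss.
  return static_cast<std::int64_t>(std::floor(f)) < n;
}
```

A developer who prefers the faster same-type comparison can still write the conversion explicitly, as in `cpp_less_f64_i64`. The two functions disagree only on inputs such as `f = 2^53` and `n = 2^53 + 1`, where converting `n` rounds it down to the value of `f`.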

### Overloading

Separate interfaces will be provided to permit overloading equality and
@@ -210,21 +236,24 @@ proposal. As non-binding design guidance for such a proposal:

### Default implementations for basic types

In addition to being defined for standard Carbon numeric types, equality
comparisons are also defined for all "data" types:
In addition to being defined for standard Carbon numeric types, equality and
relational comparisons are also defined for all "data" types:

- Tuples.
- Structs (structural data classes).
- Classes implementing an interface that identifies them as data classes.

In addition, relational comparisons are defined for tuples, and provide a
lexicographical ordering.
Relational comparisons for these types provide a lexicographical ordering. This
proposal defers to
[#561](https://github.com/carbon-language/carbon-lang/pull/561) for details on
comparison support for classes.

In each case, the ordering is only available if it is supported by all element
In each case, the comparison is only available if it is supported by all element
types.
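
For readers more familiar with C++, lexicographical ordering of this kind is what `std::tuple` already provides. The sketch below uses a hypothetical `Version` struct to show the same ordering applied field by field; it illustrates the intended semantics only and says nothing about how Carbon will implement them.

```cpp
#include <string>
#include <tuple>

// Hypothetical data type, used only for illustration.
struct Version {
  int major_number;
  int minor_number;
  std::string tag;
};

// Equality: all fields must be equal.
inline bool operator==(const Version& a, const Version& b) {
  return std::tie(a.major_number, a.minor_number, a.tag) ==
         std::tie(b.major_number, b.minor_number, b.tag);
}

// Relational: compare fields in declaration order; the first field that
// differs decides the result (lexicographical ordering).
inline bool operator<(const Version& a, const Version& b) {
  return std::tie(a.major_number, a.minor_number, a.tag) <
         std::tie(b.major_number, b.minor_number, b.tag);
}
```

As in the rule above, these operators compile only because every field type (`int`, `std::string`) itself supports the comparison being used.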

The `Bool` type supports equality comparisons and relational comparisons. For
relational comparisons, `False` is treated as being less than `True`.
The `Bool` type should be treated as a choice type, and so should support
equality comparisons and relational comparisons if and only if choice types do
in general. That decision is left to a future proposal.

## Rationale based on Carbon's goals

@@ -266,6 +295,8 @@ Disadvantages:
- Unfamiliar to C++ programmers.
- `a /= b` would likely be expected to mean an `a = a / b` compound
assignment.
- Breaks consistency with Python, which uses `not` for logical negation and
`!=` for inequality comparison.

We could use `=/=` instead of `!=` for not-equal comparisons.

@@ -275,7 +306,8 @@ Advantages:

Disadvantages:

- This would be inventive and unlike all other languages.
- This would be inventive and unlike all other languages. As above, breaks
consistency with Python.
- This would make `=/=` one character longer, and harder to type on US-ASCII
keyboards because the keys are distant but likely to be typed with the same
finger.
@@ -287,6 +319,7 @@ We could support Python-like chained comparisons.
Advantages:

- Small ergonomic improvement for range comparisons.
- Middle operand is evaluated only once.

Disadvantages:

@@ -295,7 +328,17 @@ Disadvantages:
changing the semantics of the operator expression as we can no longer move
from the operand.
- Both short-circuiting behavior and non-short-circuiting behavior will be
surprising and unintuitive to some.
surprising and unintuitive to some. The short-circuiting option will
introduce control flow without a keyword to announce it, which goes against
our design decision to use a keyword for `and` and `or` to announce the
control flow. The non-short-circuiting option will evaluate subexpressions
unnecessarily, which creates a tension with our performance goal.
- Experienced C++ developers may expect a different behavior, such as
`a < b == cmp` comparing the result of `a < b` against the Boolean value
`cmp`.
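
To make the competing readings concrete, here is a small C++ sketch with hypothetical function names. The first function is how C++ parses `a < b < c` today. The second models the Python-like chained meaning with short-circuiting, where the middle operand is evaluated once and the last operand is evaluated only if needed.

```cpp
#include <functional>

// C++ parse of `a < b < c`: the Boolean result of `a < b` (0 or 1) is
// compared against `c`.
inline bool cpp_parse(int a, int b, int c) {
  return static_cast<int>(a < b) < c;
}

// Python-like chained meaning: `a < b and b < c`. The operands are passed
// as callables so that the number of evaluations is observable.
inline bool chained_less(int a, const std::function<int()>& middle,
                         const std::function<int()>& last) {
  const int b = middle();      // middle operand: evaluated exactly once
  if (!(a < b)) return false;  // short-circuit: `last` is never called
  return b < last();
}
```

For `a = 1`, `b = 5`, `c = 3` the two readings disagree: `cpp_parse` yields true because `1 < 3`, while `chained_less` yields false because `5 < 3` does not hold.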

See also the ongoing discussion in
[#451](https://github.com/carbon-language/carbon-lang/issues/451).

### Convert operands like C++

@@ -307,8 +350,10 @@ Advantages:
- May ease migration from C++.
- May allow programmers to reuse some intuition, for example when comparing
floating-point values against integer values.
- May allow more efficient machine code to be generated for source code that
- Allows more efficient machine code to be generated for source code that
takes no special care about the types of comparison operands.
- Improves performance predictability for C++ developers unfamiliar with
Carbon's rules.

Disadvantages:

@@ -343,19 +388,5 @@ Advantages:
Disadvantages:

- Introduces ambiguity when comparing Boolean values: `not cond1 == cond2`
might intend to compare `not cond1` to `cond2` rather than cmoparing
might intend to compare `not cond1` to `cond2` rather than comparing
`cond1 != cond2`.

### Disallow relational comparisons of Boolean values

We could disallow ordered comparisons of Boolean values.

Advantages:

- Disallows an operation that might be unintended.

Disadvantages:

- Disallows an operation that might be intended.
- Likely to make `Bool` behave differently from discriminated union types,
which are likely to be treated as data types.