Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

And, or, not #680

Merged
merged 8 commits into from
Aug 3, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions proposals/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,5 +63,6 @@ request:
- [0618 - var ordering](p0618.md)
- [0623 - Require braces](p0623.md)
- [0646 - Low context-sensitivity principle](p0646.md)
- [0680 - And, or, not](p0680.md)

<!-- endproposals -->
358 changes: 358 additions & 0 deletions proposals/p0680.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,358 @@
# And, or, not

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/680)

<!-- toc -->

## Table of contents

- [Problem](#problem)
- [Background](#background)
- [Usage in existing languages](#usage-in-existing-languages)
- [Known issues with punctuation operators](#known-issues-with-punctuation-operators)
- [Proposal](#proposal)
- [Details](#details)
- [Precedence](#precedence)
- [Associativity](#associativity)
- [Conversions](#conversions)
- [Overloading](#overloading)
- [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
- [Alternatives considered](#alternatives-considered)
- [Use punctuation spelling for all three operators](#use-punctuation-spelling-for-all-three-operators)
- [Precedence of AND versus OR](#precedence-of-and-versus-or)
- [Precedence of NOT](#precedence-of-not)
- [Punctuation form of NOT](#punctuation-form-of-not)
- [Two forms of NOT](#two-forms-of-not)
- [Repeated NOT](#repeated-not)
- [AND and OR produce the decisive value](#and-and-or-produce-the-decisive-value)

<!-- tocstop -->

## Problem

Logical AND, OR, and NOT are important building blocks for working with Boolean
values. Carbon should support them.

## Background

### Usage in existing languages

Most mainstream programming languages use one (or both) of two syntaxes for
these operators:

- `&&`, `||`, and `!`. The `&&` and `||` operators were
[first used by C](https://www.bell-labs.com/usr/dmr/www/chist.html) to
distinguish short-circuiting behavior from that of regular operators. They
are now seen in C, C++, C#, Swift, Rust, Haskell, and many more languages.
`&&` and `||` invariably short-circuit.
- `and`, `or`, and `not`. These are often used by languages with more emphasis
on readability and scripting, such as Python, Pascal, Nim, SQL, and various
variants of BASIC. In Python, the `and` and `or` operators short-circuit; in
Pascal and Visual Basic, they do not, but variants of them do:

- `and then` and `or else` in Pascal short-circuit.
- `AndAlso` and `OrElse` in Visual Basic short-circuit.

C++ recognizes `and`, `or`, and `not` as keywords and treats them as lexical
synonyms for `&&`, `||`, and `!`. C provides an `<iso646.h>` standard header
that exposes `and`, `or`, and `not` as macros expanding to `&&`, `||`, and
`!`.

Perl, Ruby, and Raku permit both syntaxes, giving the punctuation forms higher
precedence and the keyword forms lower precedence. Both forms short-circuit.
Raku also provides `andthen` and `orelse`, but these differ from `and` and `or`
in what value is produced on short-ciruiting, not in whether they short-ciruit.

Zig provides `and`, `or`, and `!`.

### Known issues with punctuation operators

The use of `&&` and `||` for short-circuiting logical operators is a common
source of error due to the potential for confusion with the bitwise `&` and `|`
operators. See:

- [CERT rule EXP46-C: Do not use a bitwise operator with a Boolean-like operand](https://wiki.sei.cmu.edu/confluence/display/c/EXP46-C.+Do+not+use+a+bitwise+operator+with+a+Boolean-like+operand)
- ChromiumOS bug prevents login due to `&&` / `&` typo
([news coverage](https://www.theregister.com/2021/07/23/chromebork_bug_google/)).

We have anecdotal evidence that the `!` operator is hard to see for some
audiences in some contexts, particularly when adjacent to similar characters:
parentheses, `I`, `l`, `1`.

## Proposal

Carbon will provide three operators to support logical operations on Boolean
values.

- `and` provides a short-circuiting logical AND operation.
- `or` provides a short-circuiting logical OR operation.
- `not` provides a logical NOT operation.

Note that these operators are valid alternative spellings in C++; PR
[#682](https://github.com/carbon-language/carbon-lang/pull/682) provides a
demonstration of what this proposal would look like if applied to the C++ code
in the Carbon project.

## Details

`and` and `or` are infix binary operators. `not` is a prefix unary operator.

### Precedence

`and`, `or`, and `not` have very low precedence. When an expression appearing as
the condition of an `if` uses these operators unparenthesized, they are always
the lowest precedence operators in that expression.

These operators permit any reasonable operator that might be used to form a
boolean value as a subexpression. In particular, comparison operators such as
`<` and `==` have higher precedence than all logical operators.

A `not` operator can be used within `and` and `or`, but `and` cannot be used
directly within `or` without parentheses, nor the other way around.

For example:

```
if (n + m == 3 and not n < m) {
```

can be fully parenthesized as

```
if (((n + m) == 3) and (not (n < m))) {
```

and

```
if (cond1 == not cond2) {
// ...
if (cond1 and cond2 or cond3) {
```

are both errors, requiring parentheses.

### Associativity

`and` and `or` are left-associative. A `not` expression cannot be the operand of
another `not` expression -- `not not b` is an error without parentheses.

```
// OK
if (not a and not b and not c) { ... }
if (not (not a)) { ... }

// Error
if (not a or not b and not c) { ... }
if (not not a) { ... }
```

### Conversions

The operand of `and`, `or`, or `not` is converted to a Boolean value in the same
way as the condition of an `if` expression. In particular:

- If we decide that certain values, such as pointers or integers, should not
be usable as the condition of an `if` without an explicit comparison against
null or zero, then those values will also not be usable as the operand of
`and`, `or`, or `not` without an explicit comparison.
- If an extension point is provided to determine how to branch on the truth of
a value in an `if` (such as by supplying a conversion to a Boolean type),
that extension point will also apply to `and`, `or`, and `not`.

### Overloading

The logical operators `and`, `or`, and `not` are not overloadable. As noted
above, any mechanism that allows types to customize how `if` treats them will
also customize how `and`, `or`, and `not` treats them.

## Rationale based on Carbon's goals

- _Code that is easy to read, understand, and write:_

- The choice of `and` and `or` improves readability by avoiding the
potential for visual confusion between `&&` and `&`.
- The choice of `not` improves readability by avoiding the use of a small
and easy-to-miss punctuation character for logical negation.
- Having no precedence rule between `and` and `or` avoids readability
problems with expressions involving both operators, by requiring the use
of parentheses.
- The use of a keyword rather than punctuation for these operators helps
emphasize that they are not regular operators but instead have
control-flow semantics.
- Using the same precedence for `not` as for `and` and `or` allows
conditions to quickly be visually scanned and for the overall structure
and its nested conditions to be identified.

- _Interoperability with and migration from existing C++ code:_
- The use of the operators `and`, `or`, and `not`, which are keywords with
the same meaning in C++, avoids problems with a Carbon keyword colliding
with a valid C++ identifier and allows early adoption of the Carbon
syntax in C++ codebases.
- While these operators are overloadable in C++, `&&` and `||` are nearly
the least-overloaded operators, and `!` is generally only overloaded as
a roundabout way of converting to `bool`. There is no known need for
Carbon code to call overloaded C++ `operator&&`, `operator||`, or
`operator!`, nor for Carbon code to provide such operators to C++. We
will need a mechanism for `if`, `and`, `or`, and `not` to invoke
(possibly-`explicit`) `operator bool` defined in C++, but that's outside
the scope of this proposal.

## Alternatives considered

### Use punctuation spelling for all three operators

We could follow the convention established by C and spell these operators as
`&&`, `||`, and `!`.

Advantage:

- This would improve familiarity for people comfortable with this operator
set.

Disadvantages:

- The use of keywords rather than punctuation gives a hint that `and` and `or`
are not merely performing a computation -- they also affect control flow.
- The `!` operator is known to be hard to see for some readers.
- If we also support `&` and `|` operators, there is a known risk of confusion
between `&&` and `&` and between `||` and `|`.
- If we want to change the precedence rules, using a different syntax can
serve as a reminder that these operators do not behave like the `&&`, `||`,
and `!` operators.
- `&&`, `||`, and `!` are likely to be harder to type than `and`, `or`, and
`not` on at least most English keyboard layouts, due to requiring the use of
the Shift key and reaching outside the letter keys.

### Precedence of AND versus OR

Most languages give their AND operator higher precedence than their OR operator:

```c++
if (a && b || c && d) { ... }
// ... means ...
if ((a && b) || (c && d)) { ... }
```

This makes sense when viewed the right way: `&&` is the multiplication of
Boolean algebra and `||` is the addition, and multiplication usually binds
tighter than addition.

We could do the same. However, this precedence rule is not reliably known by a
significant fraction of developers despite having been present across many
languages for decades, leading to common recommendations to enable compiler
warnings for the first form in the above example, suggesting to rewrite it as
the second form. This therefore fails the test from proposal
[#555: when to add precedence edges](/proposals/p0555.md#when-to-add-precedence-edges).

### Precedence of NOT

We could give the `not` operator high precedence, mirroring the behavior in C++.
In this proposal, the `not` operator has the same precedence rules as `and` and
`or`, which may be inconvenient for some uses. For example:

```
var x: Bool = cond1 == not cond2;
```

is invalid in this proposal and requires parentheses, and

```
var y: Bool = not cond1 == cond2;
```

is equivalent to

```
var y: Bool = cond1 != cond2;
```

which may not be what the developer intended.

However, giving `not` higher precedence would result in a more complex rule and
would break the symmetry between `and`, `or`, and `not`.

### Punctuation form of NOT

We could use the spelling `!` for the NOT operator instead of `not`.

The syntax of the `not` operator may be harder to read than a `!` operator for
jonmeow marked this conversation as resolved.
Show resolved Hide resolved
some uses. For example, when its operand is parenthesized, because it contains a
nested `and` or `or`:

```
if (not (thing1 and thing2) or
not (thing3 and thing4)) {
```

Also, the spelling of the `!=` operator is harder to justify if there is no `!`
operator.
chandlerc marked this conversation as resolved.
Show resolved Hide resolved

However:

- Changing the syntax would break the symmetry between `and`, `or`, and `not`.
- Python uses this set of keywords and has a `!=` operator, and that doesn't
appear to cause confusion in practice.
- `not` may be easier to read than `!` in some uses, and easier to pick out of
surrounding context.
- `!` is being used to indicate early substitution in generics, so avoiding
its use here may avoid overloading the same syntax for multiple purposes.
- A function-like `not (...)` may better conform to reader expectations than a
`!(...)` expression.

### Two forms of NOT

We could combine the two previous alternatives, and introduce both a
low-precedence `not` operator and a high-precedence `!` operator.

This would allow us to provide both a convenient `!` operator for use in
subexpressions and a consistent `not` keyword that looks and behaves like `and`
and `or`.

However, including both operators would either lead to one of the forms being
unused or to the extra developer burden of style rules suggesting when to use
each. Also, it may be surprising to provide a `!` operator but not `&&` and `||`
operators.
chandlerc marked this conversation as resolved.
Show resolved Hide resolved

### Repeated NOT

We could permit the syntax `not not x` as an idiom for converting `x` to a
Boolean type. However, we hope that a clearer syntax for that will be available,
such as perhaps `x as Bool`. In the presence of such syntax, `not not x` would
be an anti-pattern, and may indicate a bug due to an unintentionally repeated
`not` operator. Per
[#555: when to add precedence edges](/proposals/p0555.md#when-to-add-precedence-edges),
when in doubt, we omit the precedence rule and wait for real-world experience.

### AND and OR produce the decisive value

We could make the AND and OR operator produce the (unconverted) value that
determined the result, rather than producing a Boolean value. For example, given
nullable pointers `p` and `q`, and assuming that we can test a pointer value for
nonnullness with `if (p)`, we could treat `p or q` as producing `p` if `p` is
nonnull and `q` otherwise.

This is the approach taken by several scripting languages, notably Python, Perl,
and Raku. It is also the approach taken by `std::conjunction` and
`std::disjunction` in C++.

Advantages:

- Provides a syntactic method to compute the first "truthy" or last "falsy"
value in a list.

Disadvantages:

- The resulting code may be harder to read and understand, especially when the
value is passed onto a function call.
- Requires all conditions in a chained conditional to have a common type, or
the use of a fallback rule in the case where there is no common type.
- When combined with type deduction, an unexpected type will often be deduced;
for example, `var x: auto = p and q;` may deduce `x` as having pointer type
instead of Boolean type.