Problems with comparability between string literal types #6167

Closed
DanielRosenwasser opened this issue Dec 19, 2015 · 18 comments
Labels
Domain: Literal Types (unit types including string literal types, numeric literal types, Boolean literals, null, undefined)
Fixed (a PR has been merged for this issue)
Suggestion (an idea for TypeScript)

Comments

@DanielRosenwasser
Member

String literals currently only acquire a string literal type if they are contextually typed by another string literal or a union of string literals. They are then comparable and assertable to any string-like type (i.e. string, another string literal type, and any union with a string literal type inside of it).

By and large, this gives us the behavior you want; however, there are some problems with this approach.

Problems

switch/case

This is valid, which seems bad:

let x: "foo" = "foo";
// ...
switch (x) {
    case "bar":
        console.log("wat");
}

When can x be "bar"? Without bypassing the type system, clearly never.

Equality

This is valid, which seems bad:

let x: "foo";
// ...
if (x === "bar") {
}

This is similar to the switch statement example. When will this condition be true? Again, without fooling the type system, it can't be.

Type assertions

These are both valid, which seems bad:

let x = "hello" as "world";
let y = <"bar" | "baz">"foo";

This is where we allow you to lie to the type system. This is probably way too lenient, so I'd say that this is bad.

Solutions?

Most of the solutions here would require us to tighten our rules regarding allowing these operations between string-like entities. Essentially, we would need to capture some of the same behavior originally proposed in the pull request about a new "comparable" type relationship (#5517).

Widening

Why don't we have every string literal start out with a string literal type and widen to string when needed? We could certainly do that, but the question is when does a string literal type need to be widened?

To quote @JsonFreeman on #5300 (comment), widening as it would stand today would be a problem for type argument inference:

I think the biggest issue with the widening approach is that types are widened after type argument inference, which is an issue we have discussed before. It makes it so that any string literals that get inferred are automatically widened to string, which is a little unfortunate.

Here's the problem Jason was talking about:

declare function f<T>(x: T): T;

let x = f("hello");

Here, we would always widen to string after picking a type for T. And even if we didn't, we'd still have other issues:

declare function f<T>(x: T, y: T): T;

let x = f("hello", "world");

When we try to figure out what T should be, we'd have two different types: "hello" and "world". Usually, when trying to infer T given the choice between something like number and string, we'll error. We could change this behavior, and while it might be more desirable to infer "hello" | "world" here, it would certainly be less consistent.
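Whatever inference ends up choosing, the ambiguity disappears when the type argument is spelled out. A minimal sketch, assuming a hypothetical `firstOf` with the same signature as `f` (the original only declares `f`, so the body here is invented for illustration):

```typescript
// Hypothetical implementation with the same signature as the declared `f`.
function firstOf<T>(x: T, y: T): T {
    return x;
}

// Explicitly asking for the union: both arguments are checked against it.
const asUnion = firstOf<"hello" | "world">("hello", "world");

// Explicitly asking for `string`: the literals are accepted as plain strings.
const asString = firstOf<string>("hello", "world");

console.log(asUnion, asString); // hello hello
```

The open question in the thread is only what happens when no type argument is written.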

Contextually type case expressions

For the switch/case example, we could contextually type each case clause expression by the type of the switch expression:

let x: "foo" | "bar";
// ...
switch (x) {
    case "baz": // <- error: type '"baz"' is not compatible with '"foo" | "bar"'
        console.log("wat");
}

But what about the other way around? We can't contextually type in both directions, because that would create a circularity. A case clause expression would try to grab a contextual type from the switch expression, which would try to grab the contextual type from its case clauses... etc.

So we could just make things unidirectional, which still has undesirable results:

let x: "foo" | "bar";
// ...
switch ("baz") {
    case x: // <- okay: type '"foo" | "bar"' is compatible with 'string'
        console.log("wat");
}

And admittedly, that code is less likely to be written, but really, what is fundamentally different about the first switch (x) example and the following?

let x: "foo" | "bar";
// ...
if (x === "baz") { // <- okay: type '"foo" | "bar"' is compatible with 'string'
    console.log("wat");
}

Clearly you want an error here too, so it would be kind of weird to specially treat switch statements, but not equality comparisons.

Come up with some new flow of information, like contextual typing

For equality checks, it'd be nice to have something that can flow both ways. For instance, in this example

let x: "foo" | "bar";
// ...
if (x === "baz") {
    console.log("wat");
}

you want "baz" to grab the type of x, independent of this type relation, and apply the same sort of information towards creating a string literal type as you would given a contextual type. Since x would have the type of "foo" | "bar", which is a union containing string literal types, "baz" would acquire its string literal type for the purpose of this check.

You wouldn't end up catching the following:

if ("hello" === "world") {
    // ...
}

because each side would just have type string, and inform the other side with the type string. But you can already do that today, and this is a fairly silly case anyway, so it probably doesn't matter.
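The two-way flow described above can be sketched as a toy model. Everything here (`StrType`, `Operand`, `equalityIsPlausible`) is a hypothetical name invented for illustration, not compiler code; the sketch only assumes that a bare literal picks up a literal type when the other operand's type is a literal or a union of literals.

```typescript
// Toy model only; none of these names come from the compiler.
type StrType =
    | { kind: "string" }
    | { kind: "literals"; values: string[] }; // a literal type or a union of literals

// An operand is either a bare string literal or an expression with a known type.
type Operand =
    | { kind: "literalExpr"; text: string }
    | { kind: "typedExpr"; type: StrType };

// The type an operand can offer to the other side, if it has one of its own.
function offeredType(op: Operand): StrType | undefined {
    return op.kind === "typedExpr" ? op.type : undefined;
}

// Type of an operand given the type flowing in from the other side.
function typeOfOperand(op: Operand, other: StrType | undefined): StrType {
    if (op.kind === "typedExpr") return op.type;
    // A bare literal acquires a literal type only when the other side is literal-like.
    return other !== undefined && other.kind === "literals"
        ? { kind: "literals", values: [op.text] }
        : { kind: "string" };
}

// Returns true when the comparison could ever succeed under this model.
function equalityIsPlausible(left: Operand, right: Operand): boolean {
    const l = typeOfOperand(left, offeredType(right));
    const r = typeOfOperand(right, offeredType(left));
    if (l.kind === "string" || r.kind === "string") return true;
    return l.values.some(v => r.values.indexOf(v) !== -1);
}

const x: Operand = { kind: "typedExpr", type: { kind: "literals", values: ["foo", "bar"] } };
const baz: Operand = { kind: "literalExpr", text: "baz" };
const hello: Operand = { kind: "literalExpr", text: "hello" };
const world: Operand = { kind: "literalExpr", text: "world" };

console.log(equalityIsPlausible(x, baz));       // false: x === "baz" would be flagged
console.log(equalityIsPlausible(hello, world)); // true: both sides are just `string`
```

As in the prose, the `"hello" === "world"` case slips through because neither side has a literal type to offer the other.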

The biggest question is how we plan to implement this. It would certainly need a proposal.

Come up with some ad hoc checks

We could come up with some weak checks in these positions to figure out if the user is likely making an error. This would need a proposal.

@DanielRosenwasser added the Needs Proposal (this issue needs a plan that clarifies the finer details of how it could be implemented) and Discussion (issues which may not have code impact) labels on Dec 19, 2015
@RyanCavanaugh
Member

/cc @Aleksey-Bykov

@JsonFreeman
Contributor

For the part where you quoted me, this is not quite what I meant. In the example you gave:

declare function f<T>(x: T, y: T): T;
let x = f("hello", "world");

The widening approach would be a problem here, but not because of the widening itself. Rather, because the string literals are initially typed as string literal types, the type argument inference will fail. Neither "hello" nor "world" is a supertype of the other.

What I meant was in an example like this:

var hello: "hello";
declare function f<T>(param: T): T;
f(hello);

You'd want this to infer "hello" for T, since that's the type you passed. But with widening, it would infer string.

I'm pretty sure widening is not appropriate here. Widening is what you do when some kind of type can arise in an expression context, but you don't want that type to land on any named entity (like a variable). You clearly want to allow variables to have string literal types, so to me widening is automatically ruled out.

One other argument. In the statement var a = b;, a and b should have the same type. If you say that variables can have string literal types, yet string literal types can be widened to string, you break this expectation, like this:

var b: "hello" = "hello";
var a = b; // a would have type string if you widen

So the two variables would not have the same type.

@zpdDG4gta8XKpMCd

I think that the solution for all 3 problems should be one big "NO". This is a type script after all, don't use literal types if they scare you. "NO" doesn't mean that there is no way around for someone who is looking for troubles:

let x = "i don't feel like" : "making sense today"; // <-- should be a problem
let x = <"making sense today">("i don't feel like" : any); // <-- fine, be this way

So I wish that the right way of doing things was a default, and the wrong way was a harder-to-come-with option.
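The "harder" route can be sketched with the `as` form of type assertions. This is a minimal illustration, not the commenter's exact code; routing through `unknown` is an alternative added here and is not mentioned in the comment.

```typescript
// Routing through `any` (or `unknown`) is the deliberate, visible way to lie.
const viaAny = ("i don't feel like" as any) as "making sense today";
const viaUnknown = "i don't feel like" as unknown as "making sense today";

// An assertion changes only the static type; the runtime value is untouched.
console.log(viaAny as string);     // i don't feel like
console.log(viaUnknown as string); // i don't feel like
```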

@falsandtru
Copy link
Contributor

I think,

Nondeterministic intersection type for string literal value

"foo" string literal value has a string&"foo" intersection type. This is a normal intersection type. TypeScript uses this intersection type as a nondeterministic type. Probably, almost all operations can use existing language features.

strliteral type

strliteral type is a nondeterministic new type. Set is types of string literal value(string&strliteral) >= strliteral types >= any union types of string literal types("foo"|"bar"). strliteral types infer a minimum string literal type set. It works like a placeholder for string literal types. <strliteral>"foo" means <"foo">"foo".

strliteral is a super set of string literal types. This type can use only with type parameters.

Declarations

let x = "foo"; // `x` is `string&"foo"`
let x: "foo" = "foo"; // `x` is `"foo"`
let x: string = "foo"; // `x` is `string`
let x: "foo" = <string>"foo"; // error
//let x: strliteral&"foo" = "foo"; // `x` is `"foo"`
//let x: string&strliteral = "foo"; // `x` is `string&"foo"`
let x: "foo"|"bar" = "foo"; // `x` is `"foo"|"bar"`
let x: "foo"&"bar" = "foo"; // error
let x: "foo"&"bar"; // `x` is `"foo"&"bar"` but useless
//let x: strliteral = 0; // error
//let x: strliteral = null; // error
//let x: strliteral = undefined; // error

Assertions

let x = "foo"; // `x` is `string&"foo"`
let x = <"foo">"foo"; // `x` is `"foo"`
let x = <string>"foo"; // `x` is `string`
let x = <"foo"><string>"foo"; // error
//let x = <strliteral>"foo"; // `x` is `"foo"`
//let x = <strliteral&"foo">"foo"; // `x` is `"foo"`
//let x = <string&strliteral><string>"foo"; // `x` is `string`
let x = <"bar">"foo"; // error
let x = <"foo"|"bar">"foo"; // `x` is `"foo"|"bar"`
//let x = <strliteral><"foo"|"bar">"foo"; // `x` is `"foo"|"bar"`
//let x = <strliteral>0; // error
//let x = <strliteral>null; // error
//let x = <strliteral>undefined; // error
let x = <string&"bar"><string>"foo"; // ok...

Equality

let x = "foo";
x === "bar"; // ok, string === string
let x: string = "foo";
x === "bar"; // ok, string === string
let x: "foo" = "foo";
x === "bar"; // error, "foo" === "bar"

Inferences

let x = "foo"; // `x` is `string&"foo"`
let x = "foo" + "bar"; // `x` is `string`
let x = "foo" || "bar"; // `x` is `string`
let x = "foo" && "bar"; // `x` is `string`
let x = <"foo">"foo" + "bar"; // error, invalid operation
let x = <"foo">"foo" || "bar"; // error, meaningless code
let x = <"foo">"foo" && "bar"; // error, meaningless code
//let x = <strliteral>"foo" + "bar"; // error, invalid operation
//let x = <strliteral>"foo" || "bar"; // error, meaningless code
//let x = <strliteral>"foo" && "bar"; // error, meaningless code

Inference constraints

declare function f<T>(x: T, y: T): T;

let x = f("hello", "world"); // `T` is `string`
let x = f<string>("hello", "world"); // `T` is `string`
//let x = f<strliteral>("hello", "world"); // `T` is `"hello" | "world"`
let x = f<"hello" | "world">("hello", "world"); // `T` is `"hello" | "world"`
let x = f<"hello">("hello", "world"); // error
declare function f<T extends strliteral>(x: T, y: T): T;
let x = f("hello", "world"); // `T` is `"hello" | "world"`

Type guards

let x = "foo";
switch (x) {
    default: // ok, `x` is `string&"foo"` here
    case "foo": // ok, `x` is `string&"foo"`
    case "bar": // ok, `x` is `string`
    default: // ok, `x` is `string` here
}
switch (<"foo">x) {
//switch (<strliteral>x) {
    case "foo": // ok
    case "bar": // error
    default: // ok, `x` is `string` here, this is a special rule for an exception handling
}
switch (<string>x) {
    case "foo": // ok
    case "bar": // ok
    default: // ok
}
switch (<"foo"|"bar">x) {
    default: // ok, `x` is `"foo"|"bar"` here
    case "foo": // ok, `x` is `"foo"`
    default: // ok, `x` is `"bar"` here
    case "bar": // ok, `x` is `"bar"`
    case "fizzbazz": // error
    default: // ok, `x` is `string` here
}

@DanielRosenwasser
Member Author

@JsonFreeman sorry, my mistake. There are two issues there. I'll amend the original content.

@JsonFreeman
Contributor

In the example

declare function f<T>(x: T, y: T): T;
let x = f("hello", "world");

You claim that it would be more desirable to infer "hello" | "world". Why is this more desirable? I would think string is the correct inference here. There is no indication that you want string literal types to play any part here, as you want to be able to assign any string to x after it is initialized (I think).

@JsonFreeman
Contributor

Why does the type assertion problem happen? I thought in a type assertion, the operand is contextually typed by the asserted type. Doesn't that take care of it?

@DanielRosenwasser
Member Author

Why does the type assertion problem happen? I thought in a type assertion, the operand is contextually typed by the asserted type. Doesn't that take care of it?

Here is why something like "foo" as "bar" or <"hello">"world" succeeds.

Prior to string literal types, we would perform to/from assignability checking on all types to see if they are compatible in the three listed locations. However, that couldn't capture the following behavior:

var x: "foo" | boolean;
// ...
var y = x as string; // error: 'string' is not assignable to 'boolean', and vice versa

The comparable relationship was supposed to fix this, but instead we just simply settled for a different fix where a string-like type, or a union containing a string-like type, is compatible with any other string-like type, or union containing a string-like type.

@JsonFreeman
Contributor

I see, that makes sense. But I also don't think it is specific to string literal types. In logical terms, what you really want to check is not assignability, but consistency. For all three of the cases you pointed out, whether or not they include string literal types, what matters is whether the types involved have any values in common. In other words, there should be an error if the intersection of the sets corresponding to the types is empty. This is not adequately captured by assignability, regardless of contextual typing or any other typing mechanism. I think this needs to be a new type relation (consistency).

If that is deemed too complex, then I think the only other realistic alternative is to just allow all three cases. Anything else is smoke and mirrors, it's not really addressing the issue at its core. I also do not agree with the notion of any string-like type being assignable to all other string-like types. It is nice for type assertions, but too permissive for other scenarios involving assignability.
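The consistency idea (two types are related when some value inhabits both) can be sketched as a toy model. The representation below, with `Member`, `isConsistent` and the `STRING`/`BOOLEAN` constants, is hypothetical and covers only string literals plus two primitives; it is not how any compiler represents types.

```typescript
// Toy model only: a type is a list of union members.
type Member =
    | { kind: "literal"; value: string }
    | { kind: "primitive"; name: "string" | "boolean" };

function membersOverlap(a: Member, b: Member): boolean {
    if (a.kind === "literal" && b.kind === "literal") return a.value === b.value;
    if (a.kind === "primitive" && b.kind === "primitive") return a.name === b.name;
    // One literal and one primitive: a string literal only inhabits `string`.
    const prim = a.kind === "primitive" ? a : b;
    return prim.kind === "primitive" && prim.name === "string";
}

// Two types are consistent when at least one pair of members shares a value.
function isConsistent(a: Member[], b: Member[]): boolean {
    return a.some(x => b.some(y => membersOverlap(x, y)));
}

const lit = (value: string): Member => ({ kind: "literal", value });
const STRING: Member = { kind: "primitive", name: "string" };
const BOOLEAN: Member = { kind: "primitive", name: "boolean" };

console.log(isConsistent([lit("foo")], [lit("bar")]));             // false
console.log(isConsistent([lit("foo"), BOOLEAN], [STRING]));        // true
console.log(isConsistent([lit("bar"), lit("baz")], [lit("foo")])); // false
```

Under this model the `"foo" | boolean` versus `string` assertion is accepted because the two share `"foo"`, while `"foo"` versus `"bar"` is rejected, without any appeal to assignability.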

@JsonFreeman
Contributor

I realize now what is different about string literal types versus object types for example. For object types, consistency is theoretically the goal, but it holds trivially in almost all cases. Pretty much all pairs of objects are consistent if you can take their intersection. So this is not very interesting, and so consistency is not enough to imply that the types are related in some meaningful way.

For string literals on the other hand, because the types map to specific values, consistency is not trivial, and it makes more sense to rely on it for type assertions and equality checks. So I think the answer is a relation that somehow covers consistency, but checks something stronger in the cases where consistency is trivial (like object types).

@DanielRosenwasser added the Domain: Literal Types label on Jan 12, 2016
@DanielRosenwasser
Member Author

Spoke offline with @ahejlsberg and @RyanCavanaugh. @ahejlsberg brought up the idea of a specialized "top-level" widening.

Basically, here are the semantics that would be involved.

  1. Every string literal gets a string literal type.

  2. Widening on a singleton string literal type occurs at every widening location, except when bound to a readonly location. For example:

    // 'c' has type '"hello"'
    const c = "hello";
    
    // 'd' has type 'string'
    let d = c;
    
    // 'e' has type '"hello"'
    let e: "hello" = c;
    
    // 'f' has type 'string'
    let f = e;
    
    // 'g' has type '{ prop: string }'
    const g = {
       prop: c
    };

  3. String literal types are not widened when within a union, intersection, or within a property of any other type:

    // 'c' has type '"hello" | "world"'
    const c = randBool() ? "hello" : "world";
    
    // 'd' has type '"hello" | "world"'
    let d = c;
    
    // 'e' has type '{ prop: "hello" | "world" }'
    const e = {
       prop: c
    };

Additionally, another change would be to union multiple string literal return types rather than complaining about no best common type.

@JsonFreeman
Contributor

Reactions:

  1. Initially I was very opposed to having something like let d = c; where c and d have different types. I think it's pretty unexpected. But apart from user intuition, I am starting to soften on this, because I don't have any concrete objections. However, I do think it's important to realize that this breaks an invariant of the language, that has held up to this point.
  2. What is the reason for rule 3 above?
  3. What does the best common type for return expressions have to do with string literals?

@DanielRosenwasser
Member Author

  1. I have the same feeling and we're wrangling with this right now.

  2. Rule 3 is applied because we kind of see a singleton literal type as "useless", whereas a composite type was probably written intentionally. A user typically doesn't want to keep around a singleton type when assigning to a let/var binding because they might assign to it later. On the other hand, with a union of string literals, mutation makes more sense.

  3. Basically that in the following

    function f(b: boolean) {
       if (b) {
           return "true";
       }
    
       return "false";
    }

    you'd get an error because there's no best common type between types "true" and "false". It's reasonable that the user just wants "true" | "false".
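One way to check whether a given compiler unions the return types is to assign the result to a variable annotated with the union. A minimal sketch; as far as I know, recent TypeScript releases do infer `"true" | "false"` here, but treat that as an assumption about your compiler version rather than something this thread establishes.

```typescript
function f(b: boolean) {
    if (b) {
        return "true";
    }

    return "false";
}

// This annotation type-checks only if the inferred return type fits the union.
const result: "true" | "false" = f(true);
console.log(result); // true
```

If the return type were inferred as `string`, the annotated assignment would be rejected.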

@DanielRosenwasser
Member Author

Here are a few motivating scenarios.

const a = "foo";
const b = a;

Both a and b should have the type "foo" because they are read-only locations. They can never be modified, so a widening to string should never occur.


namespace Kind {
    export const Foo = "Foo";
    export const Bar = "Bar";
}

let kind = Kind.Foo;

The general consensus is that kind should have type string (though it would be interesting if one could specify what type Foo and Bar would actually become).


interface Option {
    kind: "string" | "number" | Map<number>;
}

let option: Option;
let kind = option.kind;

Ideally, kind should continue to have the type "string" | "number" | Map<number>. There's a set of types that Option.kind could take on, and it's likely we don't want to touch that.


const supportedTsExtensions = [".ts", ".tsx", ".d.ts"];
const supportedJsExtensions = [".js", ".jsx"];
const allSupportedExtensions = supportedTsExtensions.concat(supportedJsExtensions);

Ideally, this should succeed. supportedTsExtensions should have the type string[], as should supportedJsExtensions.

This is in contrast to potential behavior in which supportedTsExtensions has the type (".ts" | ".tsx" | ".d.ts")[] and the invocation of concat fails because supportedJsExtensions has the type (".js" | ".jsx")[].
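Two of these scenarios can be turned into self-checking code: each annotated or reassigning line type-checks only if the desired type was inferred. A minimal sketch, assuming a compiler that implements the behavior argued for above (recent TypeScript releases do, to my knowledge); the names `stillLiteral` and `kind` are illustrative.

```typescript
const a = "foo";
const b = a;
const stillLiteral: "foo" = b; // accepted only if `b` kept the type "foo"

let kind = a;      // mutable binding initialized from a const
kind = "anything"; // accepted only if `kind` was widened to `string`

const supportedTsExtensions = [".ts", ".tsx", ".d.ts"];
const supportedJsExtensions = [".js", ".jsx"];
// Accepted only if both arrays are `string[]` rather than arrays of literal unions.
const allSupportedExtensions = supportedTsExtensions.concat(supportedJsExtensions);

console.log(stillLiteral, kind, allSupportedExtensions.length); // foo anything 5
```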

@JsonFreeman
Contributor

So widening occurs unless both

  • The target of the assignment is const
  • The thing to be widened is at the top level of the initializer, not nested inside the initializer.

Does this include other effects of widening (I'm thinking of null and undefined, fresh object literals, etc)?

const c = null;
var array = ["hello", c]; // is this any[] or string[]?

If you suppress widening for top level initializers, what about nested parts of the initializer if the result is destructured? I'm trying to come up with an example to demonstrate this, but it doesn't seem like a very useful pattern:

var kindAndVal: {
     kind: "kindA";
     val: any;
};
const { kind } = kindAndVal; // Does the const kind have type "kindA"?

Overall, I'm still of the opinion that widening should be used in order to prevent certain types from "landing on" a variable. That was the original goal of widening. It seems like this is stretching the goal of widening in an unnatural way, particularly because string literal types can be written in a type annotation. Previously, every type affected by widening could only be inferred. This seemed like a crucial property of widening. Why the sudden switch in the mentality of widening?

@DanielRosenwasser
Member Author

@JsonFreeman we came up with a few rules yesterday:

  1. All string literals start off with a "fresh" string literal type.
  2. When assigning to a const, a "fresh" string literal type becomes a "non-fresh" string literal type.
  3. When assigning to a let/var/etc., a singleton string literal type is widened to string (regardless of freshness).
  4. When widening in any other scenarios, all fresh string literal types are widened to string. For instance, given the expression ["a", "b", "c"], the array literal's type starts off as "a" | "b" | "c" and is widened to string since the constituents are all fresh.
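The four rules can be sketched as a toy model. All names (`Lit`, `ModelType`, `bindSingleton`, `widen`) are hypothetical; the handling of a union that mixes fresh and non-fresh members is an assumption, since the rules above do not spell it out.

```typescript
// Toy model of the four rules; illustrative names, not compiler internals.
type Lit = { kind: "literal"; value: string; fresh: boolean };
type ModelType = { kind: "string" } | Lit | { kind: "union"; members: Lit[] };

const STRING_TYPE: ModelType = { kind: "string" };

// Rule 1: every string literal expression starts out fresh.
function literalExpr(value: string): Lit {
    return { kind: "literal", value, fresh: true };
}

// Rule 2: binding to `const` keeps the literal type but drops freshness.
// Rule 3: binding to `let`/`var` widens a singleton literal, fresh or not.
function bindSingleton(declaration: "const" | "let", t: Lit): ModelType {
    return declaration === "const" ? { ...t, fresh: false } : STRING_TYPE;
}

// Rule 4: at other widening locations, fresh literals widen to `string`.
// Assumption: a union with any fresh member collapses to `string`,
// because `string` absorbs the remaining literals.
function widen(t: ModelType): ModelType {
    if (t.kind === "literal") return t.fresh ? STRING_TYPE : t;
    if (t.kind === "union") return t.members.some(m => m.fresh) ? STRING_TYPE : t;
    return t;
}

const hello = literalExpr("hello");
console.log(bindSingleton("const", hello).kind);                    // literal
console.log(bindSingleton("let", { ...hello, fresh: false }).kind); // string

// ["a", "b", "c"]: the element type starts as a union of fresh literals.
const elementType: ModelType = { kind: "union", members: ["a", "b", "c"].map(v => literalExpr(v)) };
console.log(widen(elementType).kind); // string
```

A string literal type written in an annotation would enter this model as a non-fresh `Lit`, which `widen` leaves alone.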

@JsonFreeman
Contributor

Ok, and I presume that means a string literal type annotation introduces a non-fresh type, correct?

Though my above questions about other effects of widening, and destructuring still stand.

And I am still curious about the philosophy of widening. To confirm, is it not important that types affected by widening can only be inferred, never denoted directly?

@DanielRosenwasser
Member Author

Given that the comparable relationship has been implemented (#5517 and #7140), I think we can close this. 😄

@DanielRosenwasser removed the Needs Proposal label on May 8, 2016
@microsoft locked and limited conversation to collaborators on Jun 19, 2018