Support a type representing any literal string, a la Python's LiteralString type #51513
Comments
Related: #41114

From a performance perspective, this seems like a nightmare, and possibly even undecidable given how TS's type system works. Is there a workable formal definition of this?
@RyanCavanaugh For the performance concerns, I don't know the TS implementation well enough to understand the issue. Can you elaborate a bit? The linked PEP did seem to imply that this ended up being simple to implement for Python, which gives me some hope; but, of course, that might not translate to TS for a million reasons.

As far as a definition goes, what type of "formality" did you have in mind? I'm not sure what would be helpful, but I'll try to give some examples...

This is the simplest case:
const x = "hello";
let y: LiteralString = x;

Beyond that, the most common way of combining literal strings is probably with +:
const x = "hello" + " world";
let y: LiteralString = x;

If the type of

Concatenating w/ template strings would ideally be supported too:
const x = "hello";
const y = `${x} world`;
let z: LiteralString = y;

I think this builds pretty straightforwardly on the above. It also seems like the constant expression machinery for enums might be applicable.

In addition to literal types being assignable to
let a: LiteralString = "SELECT * from foo";
if (applyLimit) {
  a += " LIMIT 1"; // assignment should succeed.
}

More generally, if

I think this gets tricky with widening. I.e., in
declare const a: string;
let query = "SELECT * from foo";
await executeQuery(query); // ok. typeof query = LiteralString
query += a; // typeof query silently changes to string upon concatenating the non-literal string `a`
await executeQuery(query) // this now fails

Assuming the above can't be supported, I think we'd have to keep the current behavior where
let query: LiteralString = "....";

Finally, for
// again, an explicit annotation's probably needed here, or `as const`;
// otherwise, `conditions` inferred as just `string[]`.
const conditions: LiteralString[] = [
  "status = 'published'",
  "created_at > '2022-01-01'",
  "author_id = ?"
];
await query(`SELECT * from posts WHERE ${conditions.join(' AND ')}`);

So, there's an implicit overload on
interface Array<T> {
  join(separator?: LiteralString): T extends LiteralString ? LiteralString : string;
}

The basic idea would be that any deterministic operation involving only

For some of these overloads, especially of methods that live on strings, I'm not sure if TS supports a good place to put them. E.g., how would we specify that calling

That said, I think defining
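The "literal in, literal out" idea from the comment above can be roughly approximated in today's TypeScript with a branded type. This is only a sketch of a different, swapped-in technique, not the proposed built-in: the names `LiteralStringLike`, `lit`, `concatLit`, and `joinLit` are hypothetical, and the brand is only as trustworthy as the code that is allowed to call `lit` or to cast.

```typescript
// Hypothetical emulation; none of these names exist in TypeScript itself.
declare const literalBrand: unique symbol;

// A string that, by convention, was built only from string literals.
type LiteralStringLike = string & { readonly [literalBrand]: true };

// Accepts only arguments whose inferred type is narrower than `string`:
// `string extends S` holds when S was inferred as plain `string`.
function lit<S extends string>(s: string extends S ? never : S): LiteralStringLike {
  return s as unknown as LiteralStringLike;
}

// Concatenating branded strings yields a branded string.
function concatLit(...parts: LiteralStringLike[]): LiteralStringLike {
  return parts.join("") as LiteralStringLike;
}

// Rough analogue of the `join` overload sketched in the comment above.
function joinLit(
  parts: readonly LiteralStringLike[],
  separator: LiteralStringLike,
): LiteralStringLike {
  return parts.join(separator) as LiteralStringLike;
}

const conditions = [lit("status = 'published'"), lit("author_id = ?")];
const sql: string = concatLit(
  lit("SELECT * FROM posts WHERE "),
  joinLit(conditions, lit(" AND ")),
);
```

Unlike the proposal, this requires wrapping every literal explicitly, and plain `+` or template strings still produce `string`; it only shows how the "operations on literals stay literal" rule could look at API boundaries.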
Ah, I was taking this much more literally (ha!) that

In terms of TypeScript relative to Python, I think there'd be a very difficult cognitive leap at the point where the runtime behavior crosses into the type system behavior. I believe with the definitions given, this program is supposed to have an error, but it seems like a hard sell:
function foo(x: "bar") {
  fn(x);
}
function fn(x: LiteralString) {
}
@RyanCavanaugh Now I’m confused haha. Why would the example code you showed have an error? The type of

Totally my fault! I see how the original text implied that. I’ve updated the OP to hopefully make it much clearer what I’m actually proposing

I think the implication was that despite
For now I found a shitty workaround for this:
const fn = <const S extends string>(str: string extends S ? never : S) => {}
fn("test") // passes
fn("test" as string) // fails
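To make the behavior of that workaround a bit more concrete, here is a small sketch of three cases. The names `NotPlainString`, `acceptLiteral`, and `userInput` are hypothetical, and the comments describe what I'd expect from a recent TypeScript version. The check works because `S` is inferred from the argument: if the argument is plain `string`, the parameter type collapses to `never`.

```typescript
// Hypothetical alias for the conditional type used in the workaround.
type NotPlainString<S extends string> = string extends S ? never : S;

// Returns the argument as a string; it exists only to exercise the type check.
function acceptLiteral<S extends string>(value: NotPlainString<S>): string {
  return String(value);
}

// A parameter declared with a literal type is accepted, because the check
// looks at the type, not at whether the value was written as a literal here.
function foo(x: "bar"): string {
  return acceptLiteral(x);
}

// Stand-in for user-controlled input; its declared type is plain `string`.
const userInput: string = "1 OR 1=1";

// @ts-expect-error: S is inferred as `string`, so the parameter type is `never`.
acceptLiteral(userInput);

// Limitation: a template literal type with a `${string}` hole is not `string`,
// so it is accepted even though it embeds arbitrary input.
const leaky = `id = ${userInput}` as const;
const accepted: string = acceptLiteral(leaky);
```

The last case is where the workaround is looser than the proposal, which excludes template string types that contain `string`.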
🔍 Search Terms
literal string, xss, sql injection, security, user input handling
✅ Viability Checklist
My suggestion meets these guidelines:
⭐ Suggestion + Motivating Example
The idea is to add a built-in type called LiteralString, which would be the supertype of all literal string types. I.e., LiteralString is inhabited by all the subtypes of string, excluding string itself and template string types that contain string. In addition to introducing this type, TS would be more careful about tracking whether a string has a literal type (e.g., when two strings with literal types are concatenated with +, the result would remain a literal type, rather than becoming string).

The motivation here is to allow the type system to check that certain security-sensitive strings haven't been unsafely manipulated by user-controlled input. For example, one could write a function like queryDb(query: LiteralString, params?: unknown[]): Promise<Results> to enforce that the query string does not have any values interpolated into it that could've been user-controlled and created SQL injection vulnerabilities. The idea is that the value from user input would’ve had to be typed as string, which can’t be mixed into a LiteralString without producing a string, which would then not be an acceptable input to queryDb.

There is a bunch of prior art for such a type, with identical motivation, including the LiteralString type in Python. There was also a proposal to have JS engines track whether a string was created entirely from literals, which would've been used to allow DOM APIs like innerHTML to treat literal strings as safe, as part of a broader strategy to protect against XSS. (Of course, this TS proposal is compile-time only, but the motivation is the same.) Additionally, there was/is an analogous type in Google's Closure Compiler, with the same motivation. Finally, Scala has an analogous type, Singleton, which is inhabited by all literal types.

Potentially, the built-in type could be called Literal, rather than LiteralString, and could also include other kinds of literals (numbers, bigints, etc); APIs which need a string would then do Literal & string, or TS could provide LiteralString as a built-in alias.

I guess there's an argument that tracking all literal values in the same way, and having a unified Literal type, is more elegant, and perhaps there are some use cases outside of security for which such a type would be valuable. For the security use case, though, if an API takes a non-string, and you pass user input to that API (or some value derived from user input), it seems almost certain that you intended to let the user control the API with their input. In these non-string cases, there's nothing analogous to the "you intended to allow the user to provide some data, but they tricked the system into interpreting that data as code" problem that's at the heart of SQL injection, XSS, and related vulnerabilities.

Given all that, I guess I'd propose starting with only LiteralString, as that's presumably less effort to implement and adds less overhead to compile times. If legitimate use cases for a more general Literal type arise, then it's easy to implement that later and redefine LiteralString as Literal & string.
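As a rough illustration of the queryDb scenario described above, the sketch below approximates the intended checks in current TypeScript. It is not the proposed built-in: `LiteralStringArg` is a hypothetical generic stand-in for LiteralString, the shape of `Results` is assumed, and the function body is a stub that never touches a database.

```typescript
// Hypothetical stand-in for the proposed type: resolves to `never` when S is
// plain `string`, so only narrower string types are accepted.
type LiteralStringArg<S extends string> = string extends S ? never : S;

// Assumed result shape; the issue does not define `Results`.
interface Results {
  query: string;
  params: unknown[];
}

// Stub that echoes what it would execute instead of talking to a database.
async function queryDb<S extends string>(
  query: LiteralStringArg<S>,
  params: unknown[] = [],
): Promise<Results> {
  return { query: String(query), params };
}

// Stand-in for user-controlled input, typed as plain `string`.
const userId: string = "42 OR 1=1";

// Accepted: the query text is a literal and the user value travels as a parameter.
const safe = queryDb("SELECT * FROM users WHERE id = ?", [userId]);

// @ts-expect-error: literal + string has type `string`, which is rejected.
void queryDb("SELECT * FROM users WHERE id = " + userId);
```

With a real LiteralString type the signature would not need the generic parameter, and concatenating two literals with + would also be accepted, which this approximation cannot express.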