Enhanced Parsing with TemplateLiteralParser, closes #3307 (#3347)
gcanti authored Jul 31, 2024
1 parent 657fc48 commit 3dce357
Showing 7 changed files with 306 additions and 7 deletions.
51 changes: 51 additions & 0 deletions .changeset/dirty-geese-wash.md
@@ -0,0 +1,51 @@
---
"@effect/schema": patch
---

Enhanced Parsing with `TemplateLiteralParser`, closes #3307

This update introduces `TemplateLiteralParser`, an API for parsing strings into structured data. It addresses a limitation of `Schema.TemplateLiteral` and `Schema.pattern`: both validate a string's format but do not extract structured data from it.

**Overview of Existing Limitations**

The `Schema.TemplateLiteral` function is useful as a simple validator, but it only verifies that an input conforms to a specific string pattern, which it does by converting the template literal definition into a regular expression. `Schema.pattern` applies a regular expression directly for the same purpose. After validation, both approaches require additional manual parsing to convert the validated string into a usable data format.
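
The manual step described above might look like the following minimal sketch. It is plain TypeScript with no dependency on `@effect/schema`; the names `format`, `isValid`, and `parseManually` are hypothetical and stand in for a validator plus hand-written parsing code.

```ts
// Hypothetical stand-in for a validator: it only answers "does the string match?"
const format = /^\d+a.+$/

const isValid = (input: string): boolean => format.test(input)

// Manual post-validation parsing: the structure has to be recovered by hand,
// and nothing ties this code to the pattern used for validation.
const parseManually = (input: string): readonly [number, "a", string] => {
  if (!isValid(input)) {
    throw new Error(`Invalid input: ${input}`)
  }
  const index = input.indexOf("a")
  return [Number(input.slice(0, index)), "a", input.slice(index + 1)]
}

console.log(parseManually("100ab"))
// [ 100, 'a', 'b' ]
```

The validation pattern and the slicing logic are maintained separately here, which is the duplication the new API is meant to remove.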

**Introducing TemplateLiteralParser**

`TemplateLiteralParser` addresses these limitations and removes the need for manual post-validation parsing. It validates the input format and also parses the input into a structured, type-safe output: a **tuple** with one element per span.

This reduces boilerplate and simplifies working with complex string inputs.
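
One way to picture validating and extracting in a single step is a regular expression that wraps each span in its own capturing group. The sketch below is plain TypeScript and is not the library's implementation: the `Span` type, the two pattern constants, and the helper names are assumptions made for illustration.

```ts
// Hypothetical span description: a literal, or a "string" / "number" placeholder.
type Span =
  | { readonly kind: "literal"; readonly value: string }
  | { readonly kind: "string" }
  | { readonly kind: "number" }

// Assumed patterns for this sketch only; the library's internal patterns may differ.
const STRING_PATTERN = "[\\s\\S]*"
const NUMBER_PATTERN = "[+-]?\\d*\\.?\\d+(?:[Ee][+-]?\\d+)?"

// Escape characters that have a special meaning inside a regular expression.
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")

// Wrap every span in its own capturing group so that one match yields every part.
const buildCapturingRegExp = (spans: ReadonlyArray<Span>): RegExp => {
  let pattern = "^"
  for (const span of spans) {
    if (span.kind === "literal") {
      pattern += `(${escapeRegExp(span.value)})`
    } else if (span.kind === "string") {
      pattern += `(${STRING_PATTERN})`
    } else {
      pattern += `(${NUMBER_PATTERN})`
    }
  }
  return new RegExp(pattern + "$")
}

// Validate and extract in one pass; numeric spans are converted with Number.
const parseSpans = (
  spans: ReadonlyArray<Span>,
  input: string
): ReadonlyArray<string | number> | null => {
  const match = buildCapturingRegExp(spans).exec(input)
  if (match === null) return null
  return match.slice(1).map((part, i) => (spans[i]?.kind === "number" ? Number(part) : part))
}

const spans: ReadonlyArray<Span> = [{ kind: "number" }, { kind: "literal", value: "a" }]

console.log(parseSpans(spans, "1a"))
// [ 1, 'a' ]
console.log(parseSpans(spans, "xa"))
// null
```

A failed match returns `null` in this sketch; a schema-based API would instead report a structured parse error.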

**Example** (string based schemas)

```ts
import { Schema } from "@effect/schema"

// const schema: Schema.Schema<readonly [number, "a", string], `${string}a${string}`, never>
const schema = Schema.TemplateLiteralParser(
  Schema.NumberFromString,
  "a",
  Schema.NonEmptyString
)

console.log(Schema.decodeEither(schema)("100ab"))
// { _id: 'Either', _tag: 'Right', right: [ 100, 'a', 'b' ] }

console.log(Schema.encodeEither(schema)([100, "a", "b"]))
// { _id: 'Either', _tag: 'Right', right: '100ab' }
```

**Example** (number based schemas)

```ts
import { Schema } from "@effect/schema"

// const schema: Schema.Schema<readonly [number, "a"], `${number}a`, never>
const schema = Schema.TemplateLiteralParser(Schema.Int, "a")

console.log(Schema.decodeEither(schema)("1a"))
// { _id: 'Either', _tag: 'Right', right: [ 1, 'a' ] }

console.log(Schema.encodeEither(schema)([1, "a"]))
// { _id: 'Either', _tag: 'Right', right: '1a' }
```
27 changes: 27 additions & 0 deletions packages/schema/README.md
@@ -1459,6 +1459,33 @@ The `TemplateLiteral` constructor supports the following types of spans:
- Literals: `string | number | boolean | null | bigint`. These can be either wrapped by `Schema.Literal` or used directly
- Unions of the above types

## Enhanced Parsing with TemplateLiteralParser

The `Schema.TemplateLiteral` function is useful as a simple validator, but it only verifies that an input conforms to a specific string pattern, which it does by converting the template literal definition into a regular expression. `Schema.pattern` applies a regular expression directly for the same purpose. After validation, both approaches require additional manual parsing to convert the validated string into a usable data format.

The `TemplateLiteralParser` API addresses these limitations and removes the need for manual post-validation parsing. It validates the input format and also parses the input into a structured, type-safe output: a **tuple** with one element per span.

This reduces boilerplate and simplifies working with complex string inputs.

**Example**

```ts
import { Schema } from "@effect/schema"

// const schema: Schema.Schema<readonly [number, "a", string], `${string}a${string}`, never>
const schema = Schema.TemplateLiteralParser(
  Schema.NumberFromString,
  "a",
  Schema.NonEmptyString
)

console.log(Schema.decodeEither(schema)("100afoo"))
// { _id: 'Either', _tag: 'Right', right: [ 100, 'a', 'foo' ] }

console.log(Schema.encodeEither(schema)([100, "a", "foo"]))
// { _id: 'Either', _tag: 'Right', right: '100afoo' }
```
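
The tuple keeps the literal spans (such as `"a"`) as elements, so turning a tuple back into a string can be as simple as concatenating its elements. The sketch below illustrates that round trip in plain TypeScript, without `@effect/schema`; the `Parsed` type and the `decodeParts` / `encodeParts` helpers are hypothetical, and the regular expression is hard-coded for this one shape.

```ts
// Hypothetical tuple shape for this sketch: a number, the literal "a", then a non-empty string.
type Parsed = readonly [number, "a", string]

// Decoding: one capturing group per element, with the first converted to a number.
const decodeParts = (input: string): Parsed | null => {
  const match = /^(\d+)(a)([\s\S]+)$/.exec(input)
  if (match === null) return null
  return [Number(match[1]), "a", String(match[3])]
}

// Encoding: the literal is already part of the tuple, so joining the elements restores the string.
const encodeParts = (tuple: Parsed): string => tuple.join("")

const decoded = decodeParts("100afoo")
console.log(decoded)
// [ 100, 'a', 'foo' ]

if (decoded !== null) {
  console.log(encodeParts(decoded))
  // 100afoo
}
```

Dropping the literal from the tuple would make the output shorter, but encoding would then need to know where to reinsert it.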

## Unique Symbols

```ts
7 changes: 7 additions & 0 deletions packages/schema/dtslint/Context.ts
@@ -412,3 +412,10 @@ declare const myRequest: MyRequest

// $ExpectType Schema<Exit<boolean, number>, ExitEncoded<boolean, number, unknown>, "bContext" | "cContext">
Serializable.exitSchema(myRequest)

// ---------------------------------------------
// TemplateLiteralParser
// ---------------------------------------------

// $ExpectType Schema<readonly [string, "a", string], `${string}a${string}`, "a" | "b">
S.asSchema(S.TemplateLiteralParser(hole<S.Schema<string, string, "a">>(), "a", hole<S.Schema<string, string, "b">>()))
22 changes: 22 additions & 0 deletions packages/schema/dtslint/Schema.ts
@@ -2655,3 +2655,25 @@ S.asSchema(S.Array(S.String).pipe(S.minItems(2), S.maxItems(3)))

// $ExpectType filter<Schema<readonly string[], readonly string[], never>>
S.Array(S.String).pipe(S.minItems(1), S.maxItems(2))

// ---------------------------------------------
// TemplateLiteralParser
// ---------------------------------------------

// $ExpectType Schema<readonly [number, "a"], `${number}a`, never>
S.asSchema(S.TemplateLiteralParser(S.Int, "a"))

// $ExpectType TemplateLiteralParser<[typeof Int, "a"]>
S.TemplateLiteralParser(S.Int, "a")

// $ExpectType Schema<readonly [number, "a", string], `${string}a${string}`, never>
S.asSchema(S.TemplateLiteralParser(S.NumberFromString, "a", S.NonEmptyString))

// $ExpectType TemplateLiteralParser<[typeof NumberFromString, "a", typeof NonEmptyString]>
S.TemplateLiteralParser(S.NumberFromString, "a", S.NonEmptyString)

// $ExpectType Schema<readonly ["/", number, "/", "a" | "b"], `/${number}/a` | `/${number}/b`, never>
S.asSchema(S.TemplateLiteralParser("/", S.Int, "/", S.Literal("a", "b")))

// $ExpectType TemplateLiteralParser<["/", typeof Int, "/", Literal<["a", "b"]>]>
S.TemplateLiteralParser("/", S.Int, "/", S.Literal("a", "b"))
24 changes: 24 additions & 0 deletions packages/schema/src/AST.ts
@@ -2008,6 +2008,30 @@ export const getTemplateLiteralRegExp = (ast: TemplateLiteral): RegExp => {
  return new RegExp(pattern)
}

/**
 * @since 0.70.1
 */
export const getTemplateLiteralCapturingRegExp = (ast: TemplateLiteral): RegExp => {
  let pattern = `^`
  if (ast.head !== "") {
    pattern += `(${regexp.escape(ast.head)})`
  }

  for (const span of ast.spans) {
    if (isStringKeyword(span.type)) {
      pattern += `(${STRING_KEYWORD_PATTERN})`
    } else if (isNumberKeyword(span.type)) {
      pattern += `(${NUMBER_KEYWORD_PATTERN})`
    }
    if (span.literal !== "") {
      pattern += `(${regexp.escape(span.literal)})`
    }
  }

  pattern += "$"
  return new RegExp(pattern)
}

/**
 * @since 0.67.0
 */
81 changes: 74 additions & 7 deletions packages/schema/src/Schema.ts
@@ -705,7 +705,7 @@ const makeEnumsClass = <A extends EnumsDefinition>(
  */
 export const Enums = <A extends EnumsDefinition>(enums: A): Enums<A> => makeEnumsClass(enums)

-type Join<T> = T extends [infer Head, ...infer Tail] ?
+type Join<Params> = Params extends [infer Head, ...infer Tail] ?
   `${(Head extends Schema<infer A> ? A : Head) & (AST.LiteralValue)}${Join<Tail>}`
   : ""

@@ -718,14 +718,12 @@ export interface TemplateLiteral<A> extends SchemaClass<A> {}
 type TemplateLiteralParameter = Schema.AnyNoContext | AST.LiteralValue

 /**
- * @category constructors
+ * @category template literal
  * @since 0.67.0
  */
-export const TemplateLiteral = <
-  T extends readonly [TemplateLiteralParameter, ...Array<TemplateLiteralParameter>]
->(
-  ...[head, ...tail]: T
-): TemplateLiteral<Join<T>> => {
+export const TemplateLiteral = <Params extends array_.NonEmptyReadonlyArray<TemplateLiteralParameter>>(
+  ...[head, ...tail]: Params
+): TemplateLiteral<Join<Params>> => {
   let astOrs: ReadonlyArray<AST.TemplateLiteral | string> = getTemplateLiterals(
     getTemplateLiteralParameterAST(head)
   )
@@ -786,6 +784,75 @@ const getTemplateLiterals = (
   throw new Error(errors_.getSchemaUnsupportedLiteralSpanErrorMessage(ast))
 }

+type TemplateLiteralParserParameters = Schema.Any | AST.LiteralValue
+
+type TemplateLiteralParserParametersType<T> = T extends [infer Head, ...infer Tail] ?
+  readonly [Head extends Schema<infer A, infer _I, infer _R> ? A : Head, ...TemplateLiteralParserParametersType<Tail>]
+  : []
+
+type TemplateLiteralParserParametersEncoded<T> = T extends [infer Head, ...infer Tail] ? `${
+  & (Head extends Schema<infer _A, infer I, infer _R> ? I : Head)
+  & (AST.LiteralValue)}${TemplateLiteralParserParametersEncoded<Tail>}`
+  : ""
+
+/**
+ * @category API interface
+ * @since 0.70.1
+ */
+export interface TemplateLiteralParser<Params extends array_.NonEmptyReadonlyArray<TemplateLiteralParserParameters>>
+  extends
+    Schema<
+      TemplateLiteralParserParametersType<Params>,
+      TemplateLiteralParserParametersEncoded<Params>,
+      Schema.Context<Params[number]>
+    >
+{
+  readonly params: Params
+}
+
+/**
+ * @category template literal
+ * @since 0.70.1
+ */
+export const TemplateLiteralParser = <Params extends array_.NonEmptyReadonlyArray<TemplateLiteralParserParameters>>(
+  ...params: Params
+): TemplateLiteralParser<Params> => {
+  const encodedSchemas: Array<Schema.Any> = []
+  const typeSchemas: Array<Schema.Any> = []
+  const numbers: Array<number> = []
+  for (let i = 0; i < params.length; i++) {
+    const p = params[i]
+    if (isSchema(p)) {
+      const encoded = encodedSchema(p)
+      if (AST.isNumberKeyword(encoded.ast)) {
+        numbers.push(i)
+      }
+      encodedSchemas.push(encoded)
+      typeSchemas.push(p)
+    } else {
+      const literal = Literal(p as AST.LiteralValue)
+      encodedSchemas.push(literal)
+      typeSchemas.push(literal)
+    }
+  }
+  const from = TemplateLiteral(...encodedSchemas as any)
+  const re = AST.getTemplateLiteralCapturingRegExp(from.ast as AST.TemplateLiteral)
+  return class TemplateLiteralParserClass extends transform(from, Tuple(...typeSchemas), {
+    strict: false,
+    decode: (s) => {
+      const out: Array<number | string> = re.exec(s)!.slice(1, params.length + 1)
+      for (let i = 0; i < numbers.length; i++) {
+        const index = numbers[i]
+        out[index] = Number(out[index])
+      }
+      return out
+    },
+    encode: (tuple) => tuple.join("")
+  }) {
+    static params = params.slice()
+  } as any
+}
+
 const declareConstructor = <
   const TypeParameters extends ReadonlyArray<Schema.Any>,
   I,
101 changes: 101 additions & 0 deletions packages/schema/test/Schema/TemplateLiteralParser.test.ts
@@ -0,0 +1,101 @@
import * as Schema from "@effect/schema/Schema"
import * as Util from "@effect/schema/test/TestUtils"
import { describe, expect, it } from "vitest"

describe("TemplateLiteralParser", () => {
  it("should throw on unsupported template literal spans", () => {
    expect(() => Schema.TemplateLiteralParser(Schema.Boolean)).toThrow(
      new Error(`Unsupported template literal span
schema (BooleanKeyword): boolean`)
    )
    expect(() => Schema.TemplateLiteralParser(Schema.SymbolFromSelf)).toThrow(
      new Error(`Unsupported template literal span
schema (SymbolKeyword): symbol`)
    )
  })

  it("should expose the params", () => {
    const params = ["/", Schema.Int, "/", Schema.String] as const
    const schema = Schema.TemplateLiteralParser(...params)
    expect(schema.params).toStrictEqual(params)
  })

  describe("number based schemas", () => {
    it("decoding", async () => {
      const schema = Schema.TemplateLiteralParser(Schema.Int, "a")
      await Util.expectDecodeUnknownSuccess(schema, "1a", [1, "a"])
      await Util.expectDecodeUnknownFailure(
        schema,
        "1.1a",
        `(\`\${number}a\` <-> readonly [Int, "a"])
└─ Type side transformation failure
   └─ readonly [Int, "a"]
      └─ [0]
         └─ Int
            └─ Predicate refinement failure
               └─ Expected Int, actual 1.1`
      )
    })

    it("encoding", async () => {
      const schema = Schema.TemplateLiteralParser(Schema.Int, "a", Schema.Char)
      await Util.expectEncodeSuccess(schema, [1, "a", "b"], "1ab")
      await Util.expectEncodeFailure(
        schema,
        [1.1, "a", ""],
        `(\`\${number}a\${string}\` <-> readonly [Int, "a", Char])
└─ Type side transformation failure
   └─ readonly [Int, "a", Char]
      └─ [0]
         └─ Int
            └─ Predicate refinement failure
               └─ Expected Int, actual 1.1`
      )
      await Util.expectEncodeFailure(
        schema,
        [1, "a", ""],
        `(\`\${number}a\${string}\` <-> readonly [Int, "a", Char])
└─ Type side transformation failure
   └─ readonly [Int, "a", Char]
      └─ [2]
         └─ Char
            └─ Predicate refinement failure
               └─ Expected Char, actual ""`
      )
    })
  })

  describe("string based schemas", () => {
    it("decoding", async () => {
      const schema = Schema.TemplateLiteralParser(Schema.NumberFromString, "a", Schema.NonEmptyString)
      await Util.expectDecodeUnknownSuccess(schema, "100ab", [100, "a", "b"])
      await Util.expectDecodeUnknownFailure(
        schema,
        "-ab",
        `(\`\${string}a\${string}\` <-> readonly [NumberFromString, "a", NonEmptyString])
└─ Type side transformation failure
   └─ readonly [NumberFromString, "a", NonEmptyString]
      └─ [0]
         └─ NumberFromString
            └─ Transformation process failure
               └─ Expected NumberFromString, actual "-"`
      )
    })

    it("encoding", async () => {
      const schema = Schema.TemplateLiteralParser(Schema.NumberFromString, "a", Schema.Char)
      await Util.expectEncodeSuccess(schema, [100, "a", "b"], "100ab")
      await Util.expectEncodeFailure(
        schema,
        [100, "a", ""],
        `(\`\${string}a\${string}\` <-> readonly [NumberFromString, "a", Char])
└─ Type side transformation failure
   └─ readonly [NumberFromString, "a", Char]
      └─ [2]
         └─ Char
            └─ Predicate refinement failure
               └─ Expected Char, actual ""`
      )
    })
  })
})
