Rewrite lexer and parser #196

Merged on Jul 22, 2024

138 commits
566e9cc
First prototype of lexer with parcours.
01mf02 May 7, 2024
cd5d735
Rewrite tokeniser without library support.
01mf02 May 8, 2024
32d0fcd
Bare metal string parser.
01mf02 May 8, 2024
6293c14
Finish lexer conversion.
01mf02 May 8, 2024
db2b78e
New Token type for simpler and faster lexing.
01mf02 May 9, 2024
6909174
Remove parcours dependency.
01mf02 May 9, 2024
0c671f1
Nicer handling of words.
01mf02 May 9, 2024
3bde326
Error reporting.
01mf02 May 9, 2024
8ac7319
Report Unicode errors.
01mf02 May 9, 2024
9c1a3e9
Correct lexing of incorrect string escapes, e.g. "\0".
01mf02 May 9, 2024
4f001de
Remove unused function.
01mf02 May 9, 2024
03df662
Make lexer an object.
01mf02 May 10, 2024
0f91ca2
Improve string handling.
01mf02 May 10, 2024
c596dd9
Document.
01mf02 May 10, 2024
3928fc8
Documentation, a bit of refactoring.
01mf02 May 10, 2024
fd85d66
'"' is a delimiter, too.
01mf02 May 10, 2024
9ec392f
Enable new lexer!
01mf02 May 10, 2024
8eb8ed6
Compress strings.
01mf02 May 10, 2024
0845350
Removed the old lexer.
01mf02 May 13, 2024
0a126d1
Make `Punct` derive `Eq`.
01mf02 May 15, 2024
195f694
Format.
01mf02 May 15, 2024
03ea6f8
Work on new term parser.
01mf02 May 15, 2024
50232d8
Variable bindings.
01mf02 May 16, 2024
378539b
Fix a typo.
01mf02 May 16, 2024
bbf75f1
Parse definitions, less verbose error reporting.
01mf02 May 17, 2024
b31f014
Properly report next token in blocks.
01mf02 May 18, 2024
28a4994
Lex `?//` operator.
01mf02 May 21, 2024
fef8322
Simplify definitions.
01mf02 May 21, 2024
022d913
Definitions inside terms.
01mf02 May 21, 2024
458333a
Remove `Main`; restrict object construction.
01mf02 May 21, 2024
5087238
Implement `elif`.
01mf02 May 22, 2024
bb7452c
More powerful object construction.
01mf02 May 22, 2024
e5e6d94
String parsing with non-owned strings.
01mf02 May 22, 2024
da08256
Restore old lexer.
01mf02 May 22, 2024
9e45931
Restore bincode.
01mf02 May 22, 2024
6d9271f
Make jaq-parse no_std again!!!
01mf02 May 22, 2024
0541ee2
Reenable jaq-std, measure performance.
01mf02 May 22, 2024
26ada80
Thanks to clippy!
01mf02 May 22, 2024
2694dc0
Avoid unnecessary allocation.
01mf02 May 22, 2024
408aa88
Remove unused variable.
01mf02 May 22, 2024
07cd6e4
Correctly advance input after Unicode escape sequence.
01mf02 May 22, 2024
26dee39
Load standard library with new parser.
01mf02 May 22, 2024
0ad3dc4
More robust Unicode escape handling.
01mf02 May 22, 2024
ef0b298
Parse module syntax.
01mf02 May 24, 2024
7f7bd0a
Pedantic clippy.
01mf02 May 28, 2024
a4cbd97
Parse module syntax.
01mf02 Jun 5, 2024
c4e42c2
Parse label-break.
01mf02 Jun 19, 2024
a5c6b73
Labels and definitions are atoms.
01mf02 Jun 23, 2024
899a882
For `{(k): v}`, `v` must not have commas.
01mf02 Jun 23, 2024
bb62357
Merge branch 'main' into faster-lexer
01mf02 Jun 27, 2024
edcd731
Move new lexer/parser to jaq-syn.
01mf02 Jun 27, 2024
ef6ee2c
Start work on conversion to legacy terms.
01mf02 Jun 27, 2024
bc2b0da
Remove empty test.
01mf02 Jun 27, 2024
645dccf
Make Def fields public.
01mf02 Jun 27, 2024
3209191
Convert definitions and main.
01mf02 Jun 27, 2024
b31f89f
More descriptive panic.
01mf02 Jun 27, 2024
b6b515d
Split away leading `$` for variables.
01mf02 Jun 28, 2024
8737856
Make folding operations accept only atoms for now.
01mf02 Jun 28, 2024
7374b0c
Refactor.
01mf02 Jun 28, 2024
4201dd5
Make `Module` usable.
01mf02 Jun 28, 2024
08947bd
Remove dependency on jaq-std!
01mf02 Jun 28, 2024
9cc0187
Simplify final parsing, make compile with MSRV.
01mf02 Jun 28, 2024
b444f6b
Exclude unpermitted numbers such as `三`.
01mf02 Jul 2, 2024
695d012
Use more `iter::from_fn`.
01mf02 Jul 2, 2024
95e40a7
Correct more numeric lexing.
01mf02 Jul 2, 2024
f75d26c
Make consumed take usize instead of Chars.
01mf02 Jul 2, 2024
17c77fc
Move conversion functions to own module.
01mf02 Jul 2, 2024
a36cde2
Calculate (more) correct spans!
01mf02 Jul 2, 2024
3a94958
Use lex::span instead of duplicate code.
01mf02 Jul 3, 2024
cdd8806
Make span() public.
01mf02 Jul 3, 2024
0aefb7d
String representation for lex::Expect.
01mf02 Jul 3, 2024
637f58c
Make Expect::Delim return only delimiter, not whole remaining input.
01mf02 Jul 3, 2024
95c511b
Make a few lexer types abstract over S, not over 'a.
01mf02 Jul 3, 2024
8a4a78b
Report lexer errors!
01mf02 Jul 3, 2024
edd9070
Proper reporting of reports.
01mf02 Jul 4, 2024
18fc877
Make parse error public.
01mf02 Jul 4, 2024
4a2a15c
Handle unclosed quotes.
01mf02 Jul 4, 2024
201f9ee
Store start and end of strings.
01mf02 Jul 4, 2024
668bc1f
Report parse errors!
01mf02 Jul 4, 2024
7a68ede
Flatten.
01mf02 Jul 4, 2024
5142761
Output compilation errors via new report infrastructure.
01mf02 Jul 15, 2024
2d9b3c0
Remove chumsky and jaq_parse dependencies from jaq.
01mf02 Jul 15, 2024
c87c678
Remove unused imports.
01mf02 Jul 15, 2024
c4298a5
Warn for unimplemented functionality.
01mf02 Jul 15, 2024
91f6916
Document.
01mf02 Jul 15, 2024
fde7a83
Nicer conversion.
01mf02 Jul 15, 2024
02079a5
Document Term type.
01mf02 Jul 15, 2024
c7b106f
Format.
01mf02 Jul 15, 2024
2007fef
Nicer handling of leading key in path expression.
01mf02 Jul 15, 2024
c077331
Merge branch 'main' into faster-lexer
01mf02 Jul 15, 2024
bfb52f1
Remove duplicate check.
01mf02 Jul 15, 2024
d3f9189
More robust handling of paths starting with a key.
01mf02 Jul 15, 2024
2d16f1a
Document.
01mf02 Jul 15, 2024
033ef57
Remove unused function.
01mf02 Jul 15, 2024
77aa7d0
Unify logic for key followed by optionality.
01mf02 Jul 16, 2024
70b4ab4
More streamlined lexing/parsing API.
01mf02 Jul 16, 2024
c872e4e
Document.
01mf02 Jul 16, 2024
00fd324
Distinguish string lifetime from token lifetime.
01mf02 Jul 16, 2024
bbf2425
Correct a few mistakes in the parser.
01mf02 Jul 16, 2024
aa165a6
Remove invalid test.
01mf02 Jul 16, 2024
324ea28
Remove jaq-parse from jaq-interpret.
01mf02 Jul 16, 2024
4bf1db2
Remove jaq-parse from jaq-std.
01mf02 Jul 16, 2024
072ec5f
Bump jaq-syn to 1.6.0.
01mf02 Jul 16, 2024
a1fb0da
Remove jaq-parse from jaq-core.
01mf02 Jul 16, 2024
cd8a336
Update Cargo.lock.
01mf02 Jul 16, 2024
e97afb3
Use jaq-std in jaq again.
01mf02 Jul 16, 2024
bcc3d5a
Format.
01mf02 Jul 16, 2024
f800a82
Use new parser in jaq-play.
01mf02 Jul 16, 2024
ee34b4c
Remove old parser from Cargo.toml.
01mf02 Jul 16, 2024
f483266
Update Cargo.lock --- bye bye, chumsky!
01mf02 Jul 16, 2024
6a2c7b2
Clippy.
01mf02 Jul 16, 2024
02c05cb
Do not panic on interpolated strings in module paths.
01mf02 Jul 17, 2024
4ec2ca3
Identifiers.
01mf02 Jul 17, 2024
fc15a22
More permissive key syntax.
01mf02 Jul 17, 2024
bdbc9a5
If you can write `{key}`, then you can also write `.key`.
01mf02 Jul 17, 2024
9968bf6
Document.
01mf02 Jul 17, 2024
6f2478f
Allow `.key` and `{key}` where `key` is a keyword, and disallow `. key`.
01mf02 Jul 17, 2024
abd09d5
Remove `def_head`.
01mf02 Jul 17, 2024
f30c3fe
Clippy.
01mf02 Jul 17, 2024
1b82d32
Document.
01mf02 Jul 17, 2024
6729198
Merge branch 'main' into faster-lexer
01mf02 Jul 17, 2024
44e5851
New test.
01mf02 Jul 17, 2024
c379734
Document.
01mf02 Jul 17, 2024
b46c2b1
Report lex errors for characters `c` like `💣` where `c.len_utf8() != 1`.
01mf02 Jul 17, 2024
9cde4e4
Clippy.
01mf02 Jul 17, 2024
2772166
Document.
01mf02 Jul 17, 2024
1174607
Document term type.
01mf02 Jul 22, 2024
2f970c0
Example for parse function.
01mf02 Jul 22, 2024
88d55f0
Report unsupported operator.
01mf02 Jul 22, 2024
39c3429
Document.
01mf02 Jul 22, 2024
cd8085c
Remove `KEYWORDS`.
01mf02 Jul 22, 2024
33f65ab
Document expectation.
01mf02 Jul 22, 2024
e3df12f
Document.
01mf02 Jul 22, 2024
dce1814
Do not attempt to support destructuring alternative operator for now.
01mf02 Jul 22, 2024
27798d4
Correctly compute `{$k}`.
01mf02 Jul 22, 2024
7a74ea7
Atomicity tests.
01mf02 Jul 22, 2024
b410289
Make test more meaningful.
01mf02 Jul 22, 2024
cf3fc71
Document.
01mf02 Jul 22, 2024
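
Commit b46c2b1 above mentions characters `c` such as `💣` for which `c.len_utf8() != 1`. The following is a minimal sketch of why that length matters when an error span is computed; it is not taken from the jaq sources. The helper name `unexpected_span` and its predicate argument are hypothetical; only `char_indices` and `char::len_utf8` are standard library calls.

```rust
use core::ops::Range;

/// Hypothetical helper (not part of jaq): byte range of the first
/// character in `input` for which `unexpected` returns true.
fn unexpected_span(input: &str, unexpected: impl Fn(char) -> bool) -> Option<Range<usize>> {
    input
        .char_indices()
        .find(|&(_, c)| unexpected(c))
        // The end of the range has to account for multi-byte characters:
        // `i + 1` would fall inside `💣`, and slicing `input` there panics.
        .map(|(i, c)| i..i + c.len_utf8())
}
```

With the input `"1 + 💣"` and a predicate that flags non-ASCII characters, the returned range covers the four bytes of `💣`, so `&input[span]` yields the whole character and can be shown in an error report.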
47 changes: 2 additions & 45 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 0 additions & 1 deletion Cargo.toml
@@ -1,7 +1,6 @@
 [workspace]
 members = [
     "jaq-syn",
-    "jaq-parse",
     "jaq-interpret",
     "jaq-core",
     "jaq-std",
2 changes: 1 addition & 1 deletion jaq-core/Cargo.toml
@@ -29,5 +29,5 @@ base64 = { version = "0.22", optional = true }
 urlencoding = { version = "2.1.3", optional = true }

 [dev-dependencies]
-jaq-parse = { version = "1.0.0", path = "../jaq-parse" }
+jaq-syn = { version = "1.6.0", path = "../jaq-syn" }
 serde_json = "1.0"
7 changes: 4 additions & 3 deletions jaq-core/tests/common/mod.rs
@@ -4,9 +4,10 @@ fn yields(x: jaq_interpret::Val, f: &str, ys: impl Iterator<Item = jaq_interpret
     let mut ctx = jaq_interpret::ParseCtx::new(Vec::new());
     ctx.insert_natives(jaq_core::core());

-    let (f, errs) = jaq_parse::parse(f, jaq_parse::main());
-    assert!(errs.is_empty());
-    ctx.yields(x, f.unwrap(), ys)
+    let f = jaq_syn::parse(f, |p| p.module(|p| p.term()))
+        .unwrap()
+        .conv(f);
+    ctx.yields(x, f, ys)
 }

 pub fn fail(x: Value, f: &str, err: jaq_interpret::Error) {
3 changes: 0 additions & 3 deletions jaq-interpret/Cargo.toml
@@ -23,6 +23,3 @@ hifijson = { version = "0.2.0", optional = true }
 indexmap = "2.0"
 once_cell = "1.16.0"
 serde_json = { version = "1.0.81", optional = true }
-
-[dev-dependencies]
-jaq-parse = { version = "1.0.0", path = "../jaq-parse" }
5 changes: 2 additions & 3 deletions jaq-interpret/src/lib.rs
@@ -22,11 +22,10 @@
 //! let mut defs = ParseCtx::new(Vec::new());
 //!
 //! // parse the filter
-//! let (f, errs) = jaq_parse::parse(filter, jaq_parse::main());
-//! assert_eq!(errs, Vec::new());
+//! let f = jaq_syn::parse(filter, |p| p.module(|p| p.term())).unwrap().conv(filter);
 //!
 //! // compile the filter in the context of the given definitions
-//! let f = defs.compile(f.unwrap());
+//! let f = defs.compile(f);
 //! assert!(defs.errs.is_empty());
 //!
 //! let inputs = RcIter::new(core::iter::empty());
7 changes: 4 additions & 3 deletions jaq-interpret/tests/common/mod.rs
@@ -2,9 +2,10 @@ use serde_json::Value;

 fn yields(x: jaq_interpret::Val, f: &str, ys: impl Iterator<Item = jaq_interpret::ValR>) {
     let mut ctx = jaq_interpret::ParseCtx::new(Vec::new());
-    let (f, errs) = jaq_parse::parse(f, jaq_parse::main());
-    assert!(errs.is_empty());
-    ctx.yields(x, f.unwrap(), ys)
+    let f = jaq_syn::parse(f, |p| p.module(|p| p.term()))
+        .unwrap()
+        .conv(f);
+    ctx.yields(x, f, ys)
 }

 pub fn fail(x: Value, f: &str, err: jaq_interpret::Error) {
4 changes: 3 additions & 1 deletion jaq-interpret/tests/path.rs
@@ -42,7 +42,6 @@ fn index_access() {
 fn iter_access() {
     gives(json!([0, 1, 2]), ".[]", [json!(0), json!(1), json!(2)]);
     gives(json!({"a": [1, 2]}), ".a[]", [json!(1), json!(2)]);
-    gives(json!({"a": [1, 2]}), ".a.[]", [json!(1), json!(2)]);
     gives(json!({"a": 1, "b": 2}), ".[]", [json!(1), json!(2)]);
     // TODO: correct this
     //gives(json!({"b": 2, "a": 1}), ".[]", [json!(2), json!(1)]);
@@ -74,6 +73,9 @@ fn iter_assign() {
     );
 }

+yields!(index_keyword, r#"{"if": 0} | .if"#, 0);
+yields!(obj_keyword, "{if: 0} | .if", 0);
+
 yields!(key_update1, "{} | .a |= .+1", json!({"a": 1}));
 yields!(key_update2, "{} | .a? |= .+1", json!({"a": 1}));

29 changes: 26 additions & 3 deletions jaq-interpret/tests/tests.rs
@@ -20,9 +20,9 @@ yields!(cartesian_arith, "[(1,2) * (3,4)]", [3, 4, 6, 8]);
 #[test]
 fn add() {
     give(json!(1), ". + 2", json!(3));
-    give(json!(1.0), ". + 2.", json!(3.0));
+    give(json!(1.0), ". + 2.0", json!(3.0));
     give(json!(1), "2.0 + .", json!(3.0));
-    give(json!(null), "1.e1 + 2.1e2", json!(220.0));
+    give(json!(null), "1.0e1 + 2.1e2", json!(220.0));

     give(json!("Hello "), ". + \"world\"", json!("Hello world"));
     give(json!([1, 2]), ". + [3, 4]", json!([1, 2, 3, 4]));
@@ -48,7 +48,7 @@ yields!(sub_arr, "[1, 2, 3] - [2, 3, 4]", json!([1]));
 #[test]
 fn mul() {
     give(json!(1), ". * 2", json!(2));
-    give(json!(1.0), ". * 2.", json!(2.0));
+    give(json!(1.0), ". * 2.0", json!(2.0));
     give(json!(1), "2.0 * .", json!(2.0));

     give(json!("Hello"), "2 * .", json!("HelloHello"));
@@ -117,6 +117,27 @@ fn precedence() {
     give(json!(null), "2 * 3 + 1", json!(7));
 }

+// these tests use the trick that `try t catch c` is valid syntax only for atomic terms `t`
+// TODO for v2.0
+//yields!(atomic_def, "try def x: 1; x + x catch 0", 2);
+yields!(atomic_neg, "try - 1 catch 0", -1);
+yields!(atomic_if, "try if 0 then 1 end catch 2", 1);
+yields!(atomic_try, "try try 0[0] catch 1 catch 2", 1);
+yields!(atomic_fold, "try reduce [][] as $x (0; 0) catch 1", 0);
+yields!(atomic_var, "0 as $x | try $x catch 1", 0);
+yields!(atomic_call, "def x: 0; try x catch 1", 0);
+yields!(atomic_str1, r#"try "" catch 1"#, "");
+yields!(atomic_str2, r#"def @f: .; try @f "" catch 1"#, "");
+yields!(atomic_rec, "try .. catch 0", json!(null));
+yields!(atomic_id, "try . catch 0", json!(null));
+yields!(atomic_key1, "{key: 0} | try .key catch 1", 0);
+yields!(atomic_key2, r#"{key: 0} | try . "key" catch 1"#, 0);
+yields!(atomic_key3, r#"def @f: .; {key: 0} | try .@f"key" catch 1"#, 0);
+yields!(atomic_num, "try 0 catch 1", 0);
+yields!(atomic_block, "try (1 + 1) catch 0", 2);
+yields!(atomic_path, "try [1][0] catch 0", 1);
+yields!(atomic_opt, "def x: 0; try x? catch 1", 0);
+
 yields!(neg_arr_iter1, "[-[][]]", json!([]));
 yields!(neg_arr_iter2, "try (-[])[] catch 0", 0);

@@ -164,6 +185,8 @@ yields!(
     "{a: 1, b: 2} | {a, c: 3}",
     json!({"a": 1, "c": 3})
 );
+yields!(obj_var, r#""x" as $k | {$k}"#, json!({"k": "x"}));
+yields!(obj_var_val, r#""x" as $k | {$k: 0}"#, json!({"x": 0}));
 yields!(
     obj_multi_keys,
     r#"[{("a", "b"): 1}]"#,
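
The `add` and `mul` tests in `jaq-interpret/tests/tests.rs` replace `2.` with `2.0` and `1.e1` with `1.0e1`. One plausible reading is that the rewritten lexer only treats `.` as part of a number when at least one digit follows it. The sketch below illustrates that rule under this assumption. It is not the lexer from this pull request, and the function name `lex_num` is hypothetical.

```rust
/// Hypothetical sketch (not jaq's lexer): number of bytes at the start of `s`
/// that form a number, or `None` if `s` does not start with an ASCII digit.
fn lex_num(s: &str) -> Option<usize> {
    // count leading ASCII digits
    let digits = |s: &str| s.bytes().take_while(u8::is_ascii_digit).count();

    let mut len = digits(s);
    if len == 0 {
        return None;
    }
    // fraction: '.' is consumed only if at least one digit follows it
    if s[len..].starts_with('.') {
        let frac = digits(&s[len + 1..]);
        if frac > 0 {
            len += 1 + frac;
        }
    }
    // exponent: 'e' or 'E', optional sign, then at least one digit
    let rest = &s[len..];
    if rest.starts_with(|c: char| matches!(c, 'e' | 'E')) {
        let mut i = 1;
        if rest[i..].starts_with(|c: char| matches!(c, '+' | '-')) {
            i += 1;
        }
        let exp = digits(&rest[i..]);
        if exp > 0 {
            len += i + exp;
        }
    }
    Some(len)
}
```

Under this rule, `2.0` and `1.0e1` are consumed as a whole, whereas for `2.` and `1.e1` only the leading integer is consumed and the `.` is left for the next token.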
4 changes: 1 addition & 3 deletions jaq-parse/src/lib.rs
@@ -12,14 +12,12 @@ mod prec_climb;
 mod string;
 mod token;

-use jaq_syn as syn;
-
 pub use def::{defs, main};
 use token::{Delim, Token};

 use alloc::{string::String, string::ToString, vec::Vec};
 use chumsky::prelude::*;
-use syn::Spanned;
+use jaq_syn::Spanned;

 /// Lex/parse error.
 pub type Error = Simple<String>;
2 changes: 0 additions & 2 deletions jaq-play/Cargo.toml
@@ -17,13 +17,11 @@ crate-type = ["cdylib", "rlib"]

 [dependencies]
 jaq-syn = { version = "1.1.0", path = "../jaq-syn" }
-jaq-parse = { version = "1.0.0", path = "../jaq-parse" }
 jaq-interpret = { version = "1.2.0", path = "../jaq-interpret" }
 jaq-core = { version = "1.2.0", path = "../jaq-core" }
 jaq-std = { version = "1.2.0", path = "../jaq-std" }
 aho-corasick = "1.1.2"
 codesnake = { version = "0.1" }
-chumsky = { version = "0.9.0", default-features = false }
 hifijson = "0.2"
 log = "0.4.17"
 unicode-width = "0.1.13"