line numbers in the AST re #732 #750

Draft
wants to merge 8 commits into base: master

Conversation

ffrank
Contributor

@ffrank ffrank commented Mar 23, 2024

I wrote a suggestion of how to store the mcl code coordinates of notable AST nodes in the parser. (It's not done for each last node.)

The approach is possibly flawed; the amount of copy/paste makes me sad.

This was still the easier part though. Next I was trying to figure out how to generate an error message during type unification that points at the == operator in the following snippet:

$a = 1
$b = "one"

if $a == $b {
        $c = 3
}

But the unification approach that collects all invariants, and then loops over them, makes this very hard, maybe impossible.

Will we need to keep references to the AST nodes in the invariants? So that we can trace where problematic invariants originate?
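
To make the question concrete, here is a minimal Go sketch of invariants that keep a pointer to the AST node they came from. All names (`Pos`, `Node`, `EqualsInvariant`, `solve`) are hypothetical and are not mgmt's actual unification API; types are reduced to strings and the position values are illustrative.

```go
package main

import "fmt"

// Pos is a hypothetical source coordinate (1-based line and column).
type Pos struct{ Line, Col int }

// Node is a hypothetical AST node that remembers where it was parsed.
type Node struct {
	Text string
	Pos  Pos
}

// EqualsInvariant is a hypothetical constraint saying two types must match.
// Origin points back at the AST node that generated the constraint.
type EqualsInvariant struct {
	Left, Right string // type names, simplified to strings for this sketch
	Origin      *Node
}

// solve loops over all collected invariants and returns the first violation,
// annotated with the position of the node the invariant came from.
func solve(invariants []EqualsInvariant) error {
	for _, inv := range invariants {
		if inv.Left != inv.Right {
			return fmt.Errorf("type error: %s != %s in %s at line %d, column %d",
				inv.Left, inv.Right, inv.Origin.Text, inv.Origin.Pos.Line, inv.Origin.Pos.Col)
		}
	}
	return nil
}

func main() {
	// The comparison from the snippet above, with an assumed position.
	cmp := &Node{Text: "$a == $b", Pos: Pos{Line: 4, Col: 4}}
	err := solve([]EqualsInvariant{{Left: "int", Right: "str", Origin: cmp}})
	fmt.Println(err)
}
```

The cost of this design is one extra pointer per invariant; whether that fits the real solver depends on how invariants are generated and rewritten there.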

@purpleidea
Owner

the amount of copy/paste makes me sad.

Some copy-paste is to be expected.

during type unification

Don't worry about this at the moment. I have some structural changes which are coming and I don't know if this will break anything that happens there yet, so TBD.

(It's not done for each last node.)

Just a POC is fine!

@purpleidea
Owner

PS: Note there is also the %error stuff in parser-- that works well enough. Line numbers in the code is the goal =D

Owner

@purpleidea purpleidea left a comment


Just some random quick thoughts.

lang/parser/parser.y (outdated, resolved)
lang/parser/parser.y (resolved)
lang/ast/structs.go (outdated, resolved)
@ffrank
Contributor Author

ffrank commented Mar 24, 2024

In the latest commit 913c100 the code essentially works. It will store start and end positions in node types that have received the TextArea embed.

This does not survive interpolation though, unless Interpolate is adapted as demonstrated for StmtBind and StmtIf here.
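
As a rough illustration of the embed and of why `Interpolate` has to carry it along, consider the following Go sketch. It is written under assumptions: the field and method names (`startLine`, `SetPosition`, `Locate`) are invented for this example and may not match the branch's `TextArea`, and `StmtBind` is a stripped-down stand-in for the real statement type.

```go
package main

import "fmt"

// TextArea stores the start and end coordinates of a node in the mcl source.
// Field names are assumptions made for this sketch.
type TextArea struct {
	startLine, startCol int
	endLine, endCol     int
}

// SetPosition records where the node begins and ends.
func (t *TextArea) SetPosition(sl, sc, el, ec int) {
	t.startLine, t.startCol, t.endLine, t.endCol = sl, sc, el, ec
}

// Locate renders the start position for use in messages.
func (t *TextArea) Locate() string {
	return fmt.Sprintf("%d:%d", t.startLine, t.startCol)
}

// StmtBind is a simplified stand-in for a bind statement such as `$a = 1`.
// Embedding TextArea promotes SetPosition and Locate onto StmtBind.
type StmtBind struct {
	TextArea
	Ident string
	Value string
}

// Interpolate returns a copy of the statement. Copying the TextArea field
// explicitly is what keeps the position alive in the new node.
func (s *StmtBind) Interpolate() *StmtBind {
	return &StmtBind{
		TextArea: s.TextArea,
		Ident:    s.Ident,
		Value:    s.Value,
	}
}

func main() {
	stmt := &StmtBind{Ident: "a", Value: "1"}
	stmt.SetPosition(1, 1, 1, 6)
	fmt.Println(stmt.Interpolate().Locate())
}
```

If the `TextArea: s.TextArea` line were left out, the copy would report a zero position, which is the "does not survive interpolation" effect described above.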

I even have an idea for a type unifier that could allow us to drill down to the offensive expressions.

EDIT: after some shy attempts to implement a prototype, I want to say this idea did not pan out. Refactored unification sounds much more promising 👇🏻

@purpleidea
Owner

FYI: I'm currently refactoring the type unification stuff, so after that's done it will be great to revisit this and get it merged in some form. Sneak peek: The new unification has a pointer to the Expr in question, so as long as the Expr has the ROW/COL information, we'll be able to show that without doing any additional plumbing.
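
A minimal Go sketch of that idea, assuming a hypothetical optional interface (`Positioned`) and toy expression types that are not part of mgmt: the error builder only holds an `Expr` and adds row/column when the expression can supply them.

```go
package main

import "fmt"

// Expr is a minimal stand-in for an AST expression.
type Expr interface {
	String() string
}

// Positioned is a hypothetical optional interface for expressions that know
// their source coordinates.
type Positioned interface {
	Position() (row, col int)
}

// ExprInt is a toy expression that carries a position.
type ExprInt struct {
	V        int
	Row, Col int
}

func (e *ExprInt) String() string { return fmt.Sprintf("int(%d)", e.V) }

func (e *ExprInt) Position() (int, int) { return e.Row, e.Col }

// ExprAny is a toy expression without position information.
type ExprAny struct{}

func (e *ExprAny) String() string { return "any" }

// unifyError builds an error for the expression the solver points at, adding
// row/col only when the expression can provide them.
func unifyError(expr Expr, reason string) error {
	if p, ok := expr.(Positioned); ok {
		row, col := p.Position()
		return fmt.Errorf("unify error with: %s @ %d:%d: %s", expr, row, col, reason)
	}
	return fmt.Errorf("unify error with: %s: %s", expr, reason)
}

func main() {
	fmt.Println(unifyError(&ExprInt{V: 1, Row: 2, Col: 10}, "int != str"))
	fmt.Println(unifyError(&ExprAny{}, "int != str"))
}
```

Using a type assertion keeps the plumbing out of the solver: node types opt in by gaining the method, and everything else keeps printing as before.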

@ffrank
Contributor Author

ffrank commented Apr 13, 2024

I'm not moving this in any direction until that change lands then.

@purpleidea
Owner

Yeah, sorry for the delay. The start of that is here: https://github.com/purpleidea/mgmt/tree/feat/unification in case you're knowledgeable in this area and want to help. I'm kind of a terrible algorithmist and a few things are up in the air, but if you or anyone is an expert in this area, please chime in =D

@purpleidea
Owner

Boy has time flown by! I guess you know all the things I was busy with. In any case, unification has merged, so if you'd like to rebase this, that would be awesome!

It might also be very easy for you to plug things into unification error messages too (or I can help with that)

LMK thanks!

@ffrank
Contributor Author

ffrank commented Sep 8, 2024

Hey, nice work on the resolver, feels good so far.

I can build this branch and ./mgmt run lang error.mcl where error.mcl has

# this shall fail
$value = 1 + "a"
test "t1" { anotherstr => $value, }

The error message will be this:

cli parse error: could not unify types: unify error with: topLevel(func() { <built-in:_operator> }): type error: int != str

The trouble is that the error is reported at the definition of the + operator rather than the invocation.

Note that I did change the String function for FuncCalls to give the location. But it is not used at the moment. I have a commit in here that makes mgmt dump a string representation of the AST and it includes

call:_operator(str("+"), int(1), str("a")) @ (1 1)

(Never mind that this position is also wrong, but right now it seems more important that the resolver can report it at all.)
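
For readers unfamiliar with the dump format, a `String` method producing output of that shape could look like the Go sketch below. `ExprCall` here is a simplified stand-in with assumed fields (`Name`, `Args`, `Row`, `Col`), not the struct from the branch, and the position values are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// ExprCall is a simplified stand-in for a function call node.
type ExprCall struct {
	Name     string
	Args     []string // already-rendered arguments, to keep the sketch short
	Row, Col int
}

// String renders the call followed by its position, mirroring the dump
// format shown above.
func (c *ExprCall) String() string {
	return fmt.Sprintf("call:%s(%s) @ (%d %d)", c.Name, strings.Join(c.Args, ", "), c.Row, c.Col)
}

func main() {
	call := &ExprCall{
		Name: "_operator",
		Args: []string{"str(\"+\")", "int(1)", "str(\"a\")"},
		Row:  2,
		Col:  7,
	}
	fmt.Println(call)
}
```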

@ffrank
Contributor Author

ffrank commented Sep 8, 2024

(Wrong text position now also fixed)

@purpleidea
Owner

resolver

What's a resolver?

The trouble is that the error is reported at the definition of the + operator rather than the invocation.

All type unification errors that are found during unification are reported at the same place... We can likely improve errors a bit though.

You asked me to comment and look at this here, but I need a bit more info on what you're asking, sorry!

@ffrank
Contributor Author

ffrank commented Sep 10, 2024

Okay trying to clear it up. I meant solver of course, not resolver.

The current issue is that we get this error:

topLevel(func() { <built-in:_operator> }): type error: int != str

That is correct, but more helpful to the user will be the information that the call of this operator is 1 + "a" in line 2 column 7.

In essence, I see a need to show a "call stack". Of course, this is at type checking time, so nothing is called (is it?)

Maybe what's rather needed is a few levels of AST parents. Rather than just saying "this unnamed internal operator was given bad operands", we should include that "its parent is this expression 1 + "a" in this bind statement $value = ...". (The immediate parent will be enough in this example, but not sure if more complex code will bring a need for more.)
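
The "few levels of AST parents" idea can be sketched in Go as follows. This assumes nodes carry a parent pointer, which is an assumption of the example and not something the thread establishes for the real AST; `Node` and `parentTrail` are hypothetical names, and the positions are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical AST node with a link to its parent.
type Node struct {
	Desc      string
	Line, Col int
	Parent    *Node
}

// parentTrail walks up to `levels` parents of n and describes each one,
// innermost first. It returns "" when n has no parent.
func parentTrail(n *Node, levels int) string {
	var parts []string
	for p := n.Parent; p != nil && levels > 0; p, levels = p.Parent, levels-1 {
		parts = append(parts, fmt.Sprintf("in %s (line %d, column %d)", p.Desc, p.Line, p.Col))
	}
	return strings.Join(parts, ", ")
}

func main() {
	bind := &Node{Desc: "bind statement $value = ...", Line: 2, Col: 1}
	call := &Node{Desc: "expression 1 + \"a\"", Line: 2, Col: 7, Parent: bind}
	op := &Node{Desc: "<built-in:_operator>", Parent: call}
	fmt.Printf("type error: int != str on %s, %s\n", op.Desc, parentTrail(op, 2))
}
```

The `levels` argument reflects the open question above: one level is enough for this example, while more complex code may need a deeper trail.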

@purpleidea
Owner

Okay trying to clear it up. I meant solver of course, not resolver.

Duh now! I don't know why my mind missed that, lol

That is correct, but more helpful to the user will be the information that the call of this operator is 1 + "a" in line 2 column 7.

Yeah I agree, we probably want to hide the "ExprTopLevel" and "ExprSingleton" expressions. To do so, you'd want to wrap this x.Expr call:

return nil, errwrap.Wrapf(err, "unify error with: %s", x.Expr)

With trueCallee from here:

// trueCallee is a helper function because ExprTopLevel and ExprSingleton are

I expect this will cause an import cycle, so I'd have to ponder what the best move is... I'd like to avoid needing to add a new method on Expr to get the "TrueCallee" for everyone.

Does that help?

@ffrank
Contributor Author

ffrank commented Sep 10, 2024

Yes the cycle prevented doing it from the solver. Spot on.

I pushed a silly little hack to do this recursion from the GAPI scope, but it does not work. This is the output now:

cli parse error: could not unify types [[ in expression: func() { <built-in:_operator> } ]]: unify error with: topLevel(func() { <built-in:_operator> }): type error: int != str

So the result from trueCallee still does not show information.

This makes sense to me. As said, it's true that it's the + operator that ultimately is affected. But we don't care. We are interested in its AST parent(s).

@purpleidea
Owner

You want to unwrap the x.Expr part...

If you see this as your error:

topLevel(func() { <built-in:_operator> }): type error: int != str

Then TrueCallee(_) would return:

func() { <built-in:_operator> }: type error: int != str

If that's not what you want, I am lost. But I don't see how this error is not correct. What do you expect it to print?

@ffrank
Contributor Author

ffrank commented Sep 11, 2024

I may be barking up the wrong tree, going on about the AST.

But the issue (in terms of AST nodes) is that this type error is reported on the ExprFunc (i.e. the declaration) of the + operator, which will never be helpful. I need it reported on the ExprCall representing (in this case) 1 + "a".

Remember, ultimately we want to report the line and column number of the code that caused the unification issue.

@purpleidea
Owner

the issue (in terms of AST nodes) is that this type error is reported on the ExprFunc (i.e. the declaration) of the + operator, which will never be helpful. I need it reported on the ExprCall representing

I agree. I don't know the correct way to achieve this at the moment, but I will dig into it if I can.

Remember, ultimately we want to report the line and column number of the code that caused the unification issue.

If it reports the definition site instead of the call site for now, that's okay =D

@ffrank
Contributor Author

ffrank commented Sep 12, 2024

If it reports the definition site instead of the call site for now, that's okay =D

I don't believe I agree, at least until I see a typing issue that is not reported on this kind of builtin element. Why have line numbers when we cannot report errors that use them?

I had the idea to include the pointer to the entire AST node in the error, rather than just the Expr within. Then, we could run a graph search through the AST, so that we can basically "zoom in" on the problematic expression(s).

The problem is that I cannot very easily run a search through the AST. That's why I created #773 in order to have some discussion around that. (It might be good to talk about this in person a little.)
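
As a starting point for that discussion, here is a hedged Go sketch of such a search: a depth-first walk that returns the chain of nodes from the root down to the node named in the error. It assumes a generic `Children` accessor, which is exactly what the paragraph above says is not easily available today; `Node` and `pathTo` are hypothetical, and whether the operator really sits below the call in the tree is also an assumption.

```go
package main

import "fmt"

// Node is a hypothetical AST node exposing its children for traversal.
type Node struct {
	Desc     string
	Children []*Node
}

// pathTo runs a depth-first search from root and returns the chain of nodes
// leading to target (root first), or nil if target is not in the tree.
func pathTo(root, target *Node) []*Node {
	if root == nil {
		return nil
	}
	if root == target {
		return []*Node{root}
	}
	for _, child := range root.Children {
		if sub := pathTo(child, target); sub != nil {
			return append([]*Node{root}, sub...)
		}
	}
	return nil
}

func main() {
	op := &Node{Desc: "<built-in:_operator>"}
	call := &Node{Desc: "call 1 + \"a\"", Children: []*Node{op}}
	bind := &Node{Desc: "bind $value", Children: []*Node{call}}
	prog := &Node{Desc: "program", Children: []*Node{bind}}

	// Print the path with increasing indentation, outermost node first.
	for depth, n := range pathTo(prog, op) {
		fmt.Printf("%*s%s\n", depth*2, "", n.Desc)
	}
}
```

With the path in hand, the error reporter could pick the nearest ancestor that carries a position and print that instead of the built-in definition.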


2 participants