Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implicit value at start of array failing fix #674

Merged
merged 3 commits into from
Mar 2, 2021

Conversation

gluax
Copy link
Contributor

@gluax gluax commented Feb 17, 2021

Resolves #607.

@gluax gluax self-assigned this Feb 17, 2021
Copy link
Contributor

@Protryon Protryon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an ideal approach. What we should do, is skip over unresolvable types and keep AST node in some kind of backlog array then go back and reprocess them if a type is found. We need to enforce all subnodes have the same type in this process.

asg/src/expression/array_inline.rs Outdated Show resolved Hide resolved
asg/src/expression/array_inline.rs Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Feb 17, 2021

Codecov Report

Merging #674 (c7bcab3) into master (2794b9a) will decrease coverage by 0.04%.
The diff coverage is 92.30%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #674      +/-   ##
==========================================
- Coverage   75.14%   75.10%   -0.05%     
==========================================
  Files         563      563              
  Lines       17645    17658      +13     
==========================================
+ Hits        13260    13262       +2     
- Misses       4385     4396      +11     
Impacted Files Coverage Δ
asg/src/expression/array_inline.rs 60.33% <88.88%> (+1.40%) ⬆️
asg/tests/pass/array/mod.rs 100.00% <100.00%> (ø)
compiler/tests/integers/int_macro.rs 92.15% <0.00%> (-2.95%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2794b9a...c7bcab3. Read the comment docs.

@acoglio
Copy link
Collaborator

acoglio commented Feb 17, 2021

@Protryon @gluax @collinc97 Apologies if the following is obvious already, but is the current Leo type inference using a Hindley-Milner-style algorithm?

@Protryon
Copy link
Contributor

@Protryon @gluax @collinc97 Apologies if the following is obvious already, but is the current Leo type inference using a Hindley-Milner-style algorithm?

I don't actually know how you would map it to formal definitions -- it's a simple type bubble up/down kind of system. We don't do any implicit type changes (i.e. casts) or reverse-dependent type inference.

I.e. the following doesn't work:

let x = 5;
let y = x + 5u32;

but this does:

let x = 5u32;
let y = x + 5;

This isn't what rust does AFAIK, but it's performant, effective, and easy to implement/maintain. It'd be more accurate to call this fill-in type checking. There is also an argument to be made that it avoids complex "WTF" moments with type inference due to bidirectional inference.

@acoglio
Copy link
Collaborator

acoglio commented Feb 17, 2021

Understood, seems fine. If we run into limitations at some point, we can consider Hindley-Milner. It looks like Hindley-Milner is what Rust uses too -- I just looked that up.

In case you are curious (but feel free to skip the rest of this message if not interested), it works like this (which you may well know it already, perhaps under a different name).

Conceptually:

  1. Wherever a type is missing (in an untyped literal or in a let statement, for Leo), you add a unique type variable.
  2. You go through the program to type-check it, and when the checking involves some type variable (which you can't type-check because you don't have a concrete type for it yet), you generate an assertion expressing the type-checking conditions.
  3. You iteratively solve the system of generated type assertions (analogously to Gaussian elimination for linear systems), until you either find a unique solution for each type variable, or the system has no solution (the program is type-incorrect), or the system has multiple solutions (the program is type-ambiguous).

In an implementation, you would generally perform the steps together, resolving assertions as they come up when possible, etc.

Languages like Standard ML use this approach.

There are variants in which the assertions are equalities (e.g. typevar = u8), or inequalities (e.g. typevar <= u32), etc.

This kind of algorithm provably generates the most general typing (if successful). It has exponential complexity (EXPTIME class), but that behavior seems to occur only in pathological cases. A drawback is that sometimes error messages may be obscure (due to the involvement of type variables in possibly distant places in the code), just as you note in your message.

@collinc97
Copy link
Collaborator

This issue and PR have gone a bit stale. To revive it, we need to answer the question:

Do we allow inferred types for implicit array elements?

function main () {
    let a = [1u8, 2u8, 3u8, 4];
    let b = [1u8, 2u8, 3, 4u8];
    let c = [1u8, 2, 3u8, 4u8];
    let d = [1, 2u8, 3u8, 4u8];
}

If Yes, then these should all pass.
If No, then these should all fail.

It is my opinion that the syntax we choose should encourage developers to write easy to read code. Therefore, I would answer No to the above question because I think the cleanest syntax would be:

    let a: [u8; 4] = [1, 2, 3, 4];

We would also allow:

    let a = [1u8, 2u8, 3u8, 4u8];

Which is only slightly better than the original 4 cases in question, but nonetheless it is subjectively easier to read for me.

Requesting comments from @acoglio @gluax @howardwu

@acoglio
Copy link
Collaborator

acoglio commented Feb 25, 2021

@collinc97 My own inclination is to say 'yes' just because all 4 have a unique type inference solution, so it's clear cut -- soundness and completeness. I may also be biased by having used languages where those would be legal for the reason just stated -- I just write a few explicit sufficient types, imagining how they "fill" the other missing types based on the type requirements, and let Hindley-Milner work its magic.

However, if we prefer to use, at least initially, a simpler, sound but not complete, type inference algorithm (as we have now), I think it's fine for the answer to be 'no'.

One thing that the documentation should ideally make clear to the user is when type inference succeeds and when it doesn't. Just saying that "the compiler will either infer or error" might be okay, but it doesn't seem great. A good thing for languages like Standard ML that use Hindley-Milner (and also Rust does, I believe) is that there is a simple characterization -- succeeds if inference exists and is unique, errors otherwise.

I also agree that, in any case, it's a bit odd to have some of the 4 cases pass and others fail, since there is a form of symmetry among them. Nonetheless, if there is a clear explanation of an asymmetry (e.g. the first element counts), it might be reasonable to have some pass and others fail.

@collinc97
Copy link
Collaborator

@collinc97 My own inclination is to say 'yes' just because all 4 have a unique type inference solution, so it's clear cut -- soundness and completeness. I may also be biased by having used languages where those would be legal for the reason just stated -- I just write a few explicit sufficient types, imagining how they "fill" the other missing types based on the type requirements, and let Hindley-Milner work its magic.

However, if we prefer to use, at least initially, a simpler, sound but not complete, type inference algorithm (as we have now), I think it's fine for the answer to be 'no'.

One thing that the documentation should ideally make clear to the user is when type inference succeeds and when it doesn't. Just saying that "the compiler will either infer or error" might be okay, but it doesn't seem great. A good thing for languages like Standard ML that use Hindley-Milner (and also Rust does, I believe) is that there is a simple characterization -- succeeds if inference exists and is unique, errors otherwise.

I also agree that, in any case, it's a bit odd to have some of the 4 cases pass and others fail, since there is a form of symmetry among them. Nonetheless, if there is a clear explanation of an asymmetry (e.g. the first element counts), it might be reasonable to have some pass and others fail.

Thank you for clarifying I change my answer to Yes. If it necessary for type inference completeness then let's support all 4 cases.

@gluax can you update this PR to the current master when you get the chance?

Copy link
Collaborator

@collinc97 collinc97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

@bors
Copy link
Contributor

bors bot commented Mar 1, 2021

👎 Rejected by code reviews

@collinc97 collinc97 dismissed Protryon’s stale review March 1, 2021 23:57

Conversation resolved above

@collinc97
Copy link
Collaborator

bors r+

@bors
Copy link
Contributor

bors bot commented Mar 2, 2021

Build succeeded:

@bors bors bot merged commit 4253ac9 into master Mar 2, 2021
@collinc97 collinc97 deleted the bug/607-array-start-implicit branch March 31, 2021 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Arrays starting with implicit value failed to resolve type
4 participants