Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make \cdot and \cdotp operators equivalent #25157

Merged
merged 1 commit into from
Apr 23, 2021

Conversation

iagobaapellaniz
Copy link
Contributor

@iagobaapellaniz iagobaapellaniz commented Dec 18, 2017

@fredrikekre
Copy link
Member

Shouldn't this be done as in #19464?

@iagobaapellaniz
Copy link
Contributor Author

Probably. I'm not an expert on how to do this.

I'll follow this thread and I'll study the PR #19464 in order to figure out how it can be done.

Thanks

@stevengj
Copy link
Member

stevengj commented Dec 18, 2017

Yes, the right way to do this is to just add a line to https://github.com/JuliaLang/julia/blob/master/src/flisp/julia_charmap.h to make this substitution.

Also add a note to NEWS, documentation at the end of https://github.com/JuliaLang/julia/blob/master/doc/src/manual/variables.md#allowed-variable-names (along with the note about ε), and add a test.

@stevengj
Copy link
Member

(Note also that you shouldn't open a new PR. Just revert the commit in this branch and push new commits to fix it the right way. It will all get squashed into a single commit when it is merged. That way all of the discussion will go together.)

@iagobaapellaniz
Copy link
Contributor Author

Yes. I'll do that :-) Thanks @stevengj

@@ -122,7 +122,8 @@ Different ways of entering Unicode combining characters (e.g., accents)
are treated as equivalent (specifically, Julia identifiers are NFC-normalized).
The Unicode characters `ɛ` (U+025B: Latin small letter open e)
and `µ` (U+00B5: micro sign) are treated as equivalent to the corresponding
Greek letters, because the former are easily accessible via some input methods.
Greek letters, and '·' (U+00B7: Middle dot) is treated as the mathematical
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevengj
Copy link
Member

Needs a NEWS item, then should be good to go.

@iagobaapellaniz
Copy link
Contributor Author

iagobaapellaniz commented Dec 18, 2017

Not yet, I think. I don't know if I'm missing something but Julia still complains about \cdotp[TAB] with

julia> ·
ERROR: syntax: invalid character "·"

I'm pretty sure that this should be fixed in julia-parser.scm. HELP needed.

Summary: I checked a random equivalence and it works {€ -> +}, now I can sum with 2 € 2. On the other hand the error of invalid character is present at that file, but it would take me too long to learn SCHEME. We need to U+00b7 ('·': middle dot) be a valid character for Julia.

julia>+ (generic function with 177 methods)

Could I suggest @JeffBezanson to review this?

@ararslan ararslan added the needs news A NEWS entry is required for this change label Dec 18, 2017
@stevengj
Copy link
Member

stevengj commented Dec 19, 2017

Oh, I know the problem. The issue is that normalization of symbols only happens after they are parsed in the file, and since · is a punctuation character (Unicode category Po) it is not treated as a valid operator symbol. This doesn't happen with because is already a valid identifier.

The fix, at first glance, should be to add · to the list of operators at the top of src/julia-parser.scm. Just add to the prec-times list.

@iagobaapellaniz
Copy link
Contributor Author

iagobaapellaniz commented Dec 19, 2017

I checked this and it does not work. Now it is a valid identifier but it seems that the mapping arrives before since

julia> ·
ERROR: UndefVarError: · not defined

Maybe this is necessary, but still does not fix it. The character inserted and the displayed in the error are U+00b7.

@stevengj
Copy link
Member

stevengj commented Dec 19, 2017

It could be that the normalization is not done to operators (as opposed to ordinary variable names), let me check.

@stevengj
Copy link
Member

Yup, the normalization is only done to variable names, not to operators. There would need to be a call to a normalization routine at the end of read-operator.

@iagobaapellaniz
Copy link
Contributor Author

Nice! With that I think I'm able to finish this. Thanks a lot!

@@ -549,7 +549,9 @@
op))
(else '|.|)))))

((opchar? c) (read-operator port (read-char port)))
((if (opchar? c)
(accum-julia-symbol c port)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is wrong; accum-julia-symbol reads an identifier (not an operator) from the input stream, leaving nothing else for read-operator to read.

I think you will need to export a new flisp function to normalize a symbol

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new commit. It works now after make cleanall and make

@iagobaapellaniz
Copy link
Contributor Author

iagobaapellaniz commented Dec 20, 2017

It works now for me after make cleanall and make

julia>    # \cdotp[TAB] or U+00b7
dot (generic function with 9 methods)
julia> ·   # \cdot[TAB] or U+22c5
dot (generic function with 9 methods)
julia> dot
dot (generic function with 9 methods)

Should I do it in another way? I'm not able to find an alternative.

src/julia-parser.scm Outdated Show resolved Hide resolved
@iagobaapellaniz iagobaapellaniz changed the title Make \cdot and \cdotp equivalent WIP: Make \cdot and \cdotp equivalent Dec 20, 2017
@iagobaapellaniz iagobaapellaniz changed the title WIP: Make \cdot and \cdotp equivalent WIP: Make \cdot and \cdotp operators equivalent Dec 20, 2017
@stevengj
Copy link
Member

Ping, what this the status here?

@iagobaapellaniz
Copy link
Contributor Author

I'm not able to write anything that really works. I think we should work on the parser, which is written in SCM/C++. At the moment I have no time to learn all this.

It seems that there is a function in the parser that only works to make equivalent the identifiers, but not the operators. Help is needed, or a new PR should be opened to solve the issue.

@JeffBezanson
Copy link
Member

I can help with this at some point, but it's not urgent since it's a non-breaking change.

@mlhetland
Copy link
Contributor

Some relevant discussion on Discourse.

@mlhetland
Copy link
Contributor

mlhetland commented Jul 18, 2018

A summary of the Discourse discussion based on my understanding:

  • Greek ano teleia (U+0387) is a valid identifier character. This is motivated by it having the Other_ID_Continue property, as recommended by UAX notofonts/noto-fonts#31.
  • Middle dot (U+00b7, \cdotp) is not a valid identifier character, despite falling under the same UAX recommendation, and being canonically equivalent to the ano teleia.
  • Julia currently normalizes the (valid) ano teleia into the (invalid) middle dot.
  • Ano teleia, middle dot and dot operator look quite similar in many fonts; the first two are often identical (cf. ··⋅).
  • The middle dot is easy to type on many keyboards.

@stevengj
Copy link
Member

PR #28167 removes U+0387 from the list of valid identifiers. If this PR is revived at some point, we will want to normalize both U+0387 and U+00b7 to U+22c5 (\cdot).

@stevengj
Copy link
Member

stevengj commented Aug 5, 2020

I rebased this on master and fixed it by adding code to normalize operators as they are parsed, as I suggested above. U+0387 is also treated as equivalent. The test now passes for me.

@stevengj stevengj added parser Language parsing and surface syntax triage This should be discussed on a triage call and removed needs news A NEWS entry is required for this change triage This should be discussed on a triage call labels Aug 5, 2020
@iagobaapellaniz
Copy link
Contributor Author

Thanks! This will help a lot of people using a spanish keyboard layout, since U+00B7 is just typed by pressing SHIFT+3. Thanks again

@stevengj stevengj added the triage This should be discussed on a triage call label Aug 14, 2020
@iagobaapellaniz
Copy link
Contributor Author

Could this make its way to 1.6?

@iagobaapellaniz iagobaapellaniz changed the title WIP: Make \cdot and \cdotp operators equivalent Make \cdot and \cdotp operators equivalent Dec 15, 2020
@stevengj
Copy link
Member

Too late for 1.6, I think.

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
@oscardssmith oscardssmith removed the triage This should be discussed on a triage call label Apr 22, 2021
@vtjnash
Copy link
Member

vtjnash commented Apr 22, 2021

Triage didn't seem to see why this was put on the list. It seemed like nobody had objected to merging?

@JeffBezanson JeffBezanson added the merge me PR is reviewed. Merge when all tests are passing label Apr 22, 2021
@musm musm merged commit d7d2b0c into JuliaLang:master Apr 23, 2021
@simeonschaub simeonschaub removed the merge me PR is reviewed. Merge when all tests are passing label Apr 28, 2021
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
johanmon pushed a commit to johanmon/julia that referenced this pull request Jul 5, 2021
Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parser Language parsing and surface syntax
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants