Add "unusual Julia features" section in the manual noteworthy diffs. #11966

Closed
Ismael-VC opened this issue Jun 30, 2015 · 45 comments
Labels
docs This change adds or pertains to documentation

Comments

@Ismael-VC
Contributor

Reference:

Some things in Julia don't compare to anything else, help me list those things so I can document them!

  • Julia allows identifiers with complex unicode, i.e. x² = x*x. (more examples of this)
@ScottPJones
Contributor

That particular example does bear distinguishing, julia is great in allowing Unicode, and using Unicode,
but using what look like operators as parts of identifiers, maybe should be revisited.
👍 on adding this section to the documentation!

@tkelman
Contributor

tkelman commented Jul 1, 2015

"unique" is a strong word - for essentially any feature you can find an example of it having been done before somewhere, Julia's not really about treading new ground in terms of individual unique language features but rather synthesizing a compelling set of features in new combinations.

@ScottPJones
Contributor

@tkelman @JeffBezanson What's your opinion on using a Unicode "Mathematical Operator" class character (0x2200-0x22ff) as part of a julia variable name?
I understand that functions (and operators) are just variables in Julia, but I think that these need to be handled specially, so that they are only allowed in single character names.
Being able to say: x² = 5 seems like a very bad thing to me.
(edited: I originally thought √x was a problem, but that works like I would expect, it is equivalent to sqrt(x))

@Ismael-VC
Contributor Author

@ScottPJones that doesn't seem bad at all to me:

julia> x² = 5
5

julia> x = sqrt(x²)
2.23606797749979

julia> with_rounding(Float64, RoundDown) do
           x² == x^2
       end
true

@tkelman yes that's what I meant! Things like:

Julia is, so far, the only dynamic language with generic functions, parametric types and multiple dispatch all by default, and this in itself is a unique feature because it solves x use case like this ...and then an example.

From Wikipedia:

| Language | Type system | Generic functions | Parametric types |
|---|---|---|---|
| Julia | dynamic | default | yes |
| Common Lisp | dynamic | opt-in | yes (but no dispatch) |
| Dylan | dynamic | default | partial (no dispatch) |
| Fortress | static | default | yes |
How could I rephrase it?
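
One possible example for the claim above, as a minimal sketch assuming Julia 1.x. The `Point` type and the `describe` function are hypothetical names invented for this illustration; they combine a parametric type with a generic function whose methods are selected by the types of all arguments:

```julia
# Hypothetical parametric type, used only for this illustration.
struct Point{T<:Real}
    x::T
    y::T
end

# One generic function with several methods. Julia selects the most
# specific method using the runtime types of *all* arguments.
describe(a::Number, b::Number) = "two numbers"
describe(a::Integer, b::Integer) = "two integers"
describe(a::Point{T}, b::Point{T}) where {T} = "two points of $T"
describe(a::Point, b::Point) = "two points of mixed element types"

println(describe(1, 2))                              # Integer method
println(describe(1, 2.5))                            # Number fallback
println(describe(Point(1.0, 2.0), Point(3.0, 4.0)))  # same parameter T
println(describe(Point(1, 2), Point(1.0, 2.0)))      # different parameters
```

No method is marked as generic or virtual; every function defined this way is open to new methods by default.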

[pao: fix CLisp in table]

@Ismael-VC
Contributor Author

@ScottPJones this is another advantage:

julia> with_rounding(Float64, RoundDown) do
            @time (x^2 + x^2*x^2) / (x^2 - (x^2)^(x^2))
    end
3.688 microseconds (15 allocations: 320 bytes)
-0.009615384615384616

julia> with_rounding(Float64, RoundDown) do
            @time (x² + x²*x²) / (x² - x²^x²)
    end
2.603 microseconds (7 allocations: 192 bytes)
-0.009615384615384616

More performant and also more readable IMHO.

@Ismael-VC
Contributor Author

Is this a bug?

julia> with_rounding(Float64, RoundDown) do
           @show x² == x^2
           @show x², x^2
           println()
           @show x² * x² == x^4
           @show x² * x², x^4
       end;
x² == x ^ 2 = true
(x²,x ^ 2) = (5,5.0)

x² * x² == x ^ 4 = false
(x² * x²,x ^ 4) = (25,25.000000000000004)


julia> x² * x² == round(x^4)
true

@Ismael-VC
Contributor Author

This xˆ2 (char 710) shouldn't be allowed, even if x² is, though. Now I understand how this can be confusing and make us think it's an expression instead of an identifier:

julia> xˆ2 = round(x^2); xˆ2 == x²
true

julia> Int('ˆ'), Int('^')
(710,94)

@ScottPJones
Contributor

@Ismael-VC I'm not saying that you shouldn't use a variable to store x^2, I had simply suggested earlier that it should be x_squared instead of x², because x² looks like a postfix square operator
(which might be nice to have in Julia).

@jiahao
Member

jiahao commented Jul 1, 2015

@ScottPJones you keep mentioning the "postfix square operator", but no character of this description exists in the mathematical operators block of Unicode characters. What you are actually seeing is U+B2 in the Latin-1 block, which is semantically a superscript digit 2 (official Unicode name SUPERSCRIPT TWO) and not an operator.
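
The point about U+B2 can be checked from the REPL. A minimal sketch, assuming Julia 1.x (where `Meta.parse` and `Base.isidentifier` are available):

```julia
c = '²'
println(UInt32(c) == 0xb2)            # SUPERSCRIPT TWO lives in the Latin-1 block
println(Base.isidentifier("x²"))      # the parser accepts x² as an identifier

ex = Meta.parse("x² = 5")
println(ex.args[1] === Symbol("x²"))  # the left-hand side is a single symbol

println(Meta.parse("x⁻¹²") isa Symbol)  # also one name, not x^-12
println(Meta.parse("x^2").head)         # the ASCII spelling is a call expression
```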

@rsrock
Member

rsrock commented Jul 1, 2015

@ScottPJones Being able to write x² instead of x_squared is one of the nicer aspects of writing formulas in Julia. I would also include with that the implicit multiply in expressions like 4x. It means that formulas can often match exactly what comes out of a paper, which really reduces the friction when going back and forth. I use this all the time. See @jiahao 's example of the L-J potential from the other issue for another good example of this.
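
A small sketch of the two features mentioned here, a superscript in a variable name and implicit multiplication by a numeric literal. The polynomial is made up for illustration:

```julia
x = 2.0
x² = x^2            # cache the square under the name used on paper
y = 3x² + 4x + 1    # literal coefficients multiply implicitly
println(y)          # 3*4 + 4*2 + 1

# x² is an ordinary variable: it keeps its value when x is reassigned.
x = 3.0
println(x²)
```

The last two lines show the behaviour debated in this thread: the cached value does not follow later changes to `x`.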

@KristofferC
Member

I'll give you my Unicode variables when you can pry them from my cold, dead hands.

@johnmyleswhite
Member

Unicode variable names are staying. That's not really up for debate, so let's mellow this conversation out.

@ScottPJones
Contributor

@jiahao, that is somewhat for historical reasons, they were already in Latin1 before Unicode existed.
Unicode is a superset of Latin1, which is why those are in the Latin1 area, and not along with other similar characters.
@KristofferC I never said that there shouldn't be Unicode variables. I said that characters that act as operators (even if that operation is not yet supported in julia) should not be allowed as part of a name, because it can be dangerously confusing.

@ScottPJones
Contributor

@rsrock I didn't say anything at all against being able to write x² in a formula. I said that it should be compiled as square(x), instead of depending on having been assigned a value earlier in the program.
Also, if the compiler can inline square(x), it should be able to determine that it is constant within a loop, so that the argument about using xx or x² as a variable to cache the value doesn't really hold water.

@jiahao
Member

jiahao commented Jul 1, 2015

Nevertheless, x² contains a superscript 2, not a mathematical operator. Semantically, ² is not an operator. I don't think you are seriously suggesting that x⁻¹² gets magically parsed as x^-12...

Also, if the compiler can inline square(x), it should be able to determine that it is constant within a loop, so that the argument about using xx or x² as a variable to cache the value doesn't really hold water.

In the LJ example I gave, I had to compute a lot of different powers other than squares. I have yet to find a compiler smart enough to figure out the optimizations needed for that case, like the fact that computing the intermediate inverse square allows one to build a common subexpression tree. The naming of things like x_cubed, x_inversecubed, x_inversesixthpower gets exceedingly cumbersome.
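
The LJ code itself is not reproduced in this thread. As a rough sketch of the idea only, assuming the standard 12-6 Lennard-Jones form V(r) = 4ε[(σ/r)¹² − (σ/r)⁶], the intermediate powers can be named with superscripts and reused. The function name `lj` and the variable names are hypothetical:

```julia
# Hypothetical sketch: each power is built from the previous one, so the
# squared ratio is computed once and shared by both terms.
function lj(r; ε=1.0, σ=1.0)
    s² = (σ / r)^2
    s⁶ = s² * s² * s²
    s¹² = s⁶ * s⁶
    return 4ε * (s¹² - s⁶)
end

println(lj(1.0))        # zero crossing at r = σ
println(lj(2^(1 / 6)))  # potential minimum, approximately -ε
```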

@carnaval
Contributor

carnaval commented Jul 1, 2015

Can we stop with the "confusion" argument ? In that case the confusion is extremely simple to solve (and thus not "dangerous" at all) : just paste it in the REPL. Every choice like this will induce some bounded form of confusion on some class of people, what we should instead aim for is not to have confusing and subtle things in the language.

@rsrock
Member

rsrock commented Jul 1, 2015

@ScottPJones But that's exactly the point, and probably what you're missing here. I don't want x² to be compiled as square(x). I want it to be its own symbol. For example, in many such cases I won't take a sqrt(x²) at the end of my calculation, because I don't need it.

@jiahao
Member

jiahao commented Jul 1, 2015

@Ismael-VC please see JuliaStrings/utf8proc#11 for a discussion of possibly canonicalizing all visually confusable characters to the same canonical form.

@ScottPJones
Contributor

If you see x² in a formula, what do you assume, that it is a variable, or the variable x squared?
I said nothing about doing a sqrt() on anything.
It just doesn't seem consistent at all that things like ∛x act as cbrt(x), but x² is a variable name.

@pao
Member

pao commented Jul 1, 2015

There are variables in a number of domains that legitimately have a superscript 2 as a part of their identity: the χ² statistic, for instance. (Yes, I work with code where I wish I could actually just use that as the name of the variable and I can't because it's not Julia code.)
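
For instance, a Pearson χ² statistic can carry its usual name in Julia. A minimal sketch with made-up observed and expected counts:

```julia
# Made-up data, for illustration only.
observed = [18.0, 22.0, 20.0]
expected = [20.0, 20.0, 20.0]

# Pearson statistic: sum of (O - E)^2 / E over all cells.
χ² = sum((observed .- expected) .^ 2 ./ expected)
println(χ²)
```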

(I'm not a huge fan of the sqrt symbol, myself.)

@ScottPJones
Contributor

@pao That's the first good reason I've heard for having it as part of a name.
It doesn't seem strange to you though that things like x²y are also valid identifier names?

@rsrock
Member

rsrock commented Jul 1, 2015

@ScottPJones I expect it to be a variable, like below. This is equivalent to the use of χ².

function expected!(ret, pixels, β)
    μx, μy = β[1:2]
    σ = β[3]
    b² = β[4]
    Na² = β[5]
    for i in 1:length(pixels)
        x, y = pixels[i]
        ret[i] = Na²*twoDGauss(x, y, μx, μy, σ) + b²
    end
    ret
end

Before the change that allowed this, I had b2 in there. Imagine my joy when I could write b²!

(By the way, I know you said nothing about sqrt(). I did. It was my example, for you)

@pao
Member

pao commented Jul 1, 2015

Strange? Sure. "Doctor, it hurts when I use this identifier."

There is a natural tension between trying to make style uniform (on this end of the spectrum, Go, where your code fails to compile if it's stylistically invalid) and allowing users to do things that may make sense in context but are dangerous in others. So far, Julia has been mostly designed along a "consenting adults" principle, and you aren't prevented from doing dumb things.

When you hit technical computing, you start to get users who have a preferred notation. I learned a notation for kinematics/dynamics that requires a left superscript, a right superscript, and a right subscript. Even Julia doesn't give me that much notational flexibility! And when I compare what I implement on my computer with what I have in my notes, I have to translate, which makes it easier to introduce errors between what the math says and what I'm computing.

There are a lot of applications for Julia where allowing an idiosyncratic notation will be valuable. This code may be seen by only a few people, or people working in a particular domain. If you are working in a bigger group, or a more disparate group, or aren't doing things where this is valuable--no one will prevent you from putting your own house style rules in place.

(For those keeping score at home, yes, my opinion on Unicode-characters-in-code-related issues has changed drastically over the past couple of years.)

@ScottPJones
Contributor

OK, I hope people understand why I raised this as an issue, which I think definitely needs to be pointed out in @Ismael-VC's documentation section, as something that would definitely be confusing to people not coming from a technical computing background. As my background has been in software engineering and computer languages, followed by decades in an industry where confusability leading to bugs can literally lead to life or death situations, I always like things to be as non confusable as possible.
For the record, I don't mind the Unicode characters in code in general at all, it's quite nice, I just get concerned about ones that seem inconsistent or confusable.

@rsrock
Member

rsrock commented Jul 1, 2015

Sure thing-- it's absolutely a valid point, and a valid discussion that followed. One of the things that makes Julia so great is that all of these angles are considered.

One other thing to keep in mind here. When my code crashes, it doesn't kill anyone. Instead, it's part of an experiment, and is constantly in flux. I appreciate language elements that make such changes easier. That's why some practices that would be horrific in your earlier experience are desirable, even necessary, in the technical computing arena (necessary, within reason, of course).


@ScottPJones
Contributor

Maybe this is something a Lint option could check.
If you have a variable x, and it is used to initialize a variable such as x², and then x is modified before the use of x², it could give a warning.

@kshyatt added the "docs" label (This change adds or pertains to documentation) Jul 2, 2015

@stevengj
Member

stevengj commented Jul 2, 2015

Note that we also allow superscript parens, which are commonly used in identifiers in mathematics in combination with superscript letters or numbers or ± signs. e.g. χ⁽³⁾ is a valid variable name in Julia, and is e.g. used in electromagnetism for the third-order nonlinear susceptibility. Another example is ∇², which is the name of the Laplacian operator (also denoted ∆, U+2206, not to be confused with delta), and you wouldn't want it to be parsed as ∇^2.
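
A short sketch of both names in use, assuming Julia 1.x. The numeric value of χ⁽³⁾ is made up, and the finite-difference `∇²` below is a hypothetical one-dimensional stand-in for the Laplacian, written only for this illustration:

```julia
χ⁽³⁾ = 1.0e-22   # made-up value for a third-order susceptibility

# Hypothetical 1-D discrete Laplacian on a uniform grid with spacing h,
# evaluated at the interior points of u.
∇²(u, h) = [(u[i-1] - 2 * u[i] + u[i+1]) / h^2 for i in 2:length(u)-1]

u = [0.0, 0.25, 1.0, 2.25, 4.0]   # samples of x^2 at x = 0, 0.5, ..., 2
println(∇²(u, 0.5))               # second derivative of x^2 is 2 everywhere

println(Meta.parse("∇²") isa Symbol)    # one name, not ∇^2
println(Meta.parse("χ⁽³⁾") isa Symbol)
```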

Anyway, this entire issue seems to have gone off the rails into an unrelated argument over what characters should be allowed in identifiers. Wasn't this about documentation?

@ScottPJones
Contributor

By that principle, for consistency, it would seem you would also have wanted ∛x as a variable name, not cbrt(x).
Given that it seems the inconsistency is going to stay in julia, for better or worse, what do you think about tightening up the rules a bit, and/or having Lint enhanced to check for possible bugs?
Are there any rules that could be expressed about using superscripts/subscripts for example in identifiers? Such as, they must come at the end of an identifier, and superscripts and subscripts should not be intermixed, and perhaps must always be superscript(s) followed by subscript(s)?
i.e. χ⁽³⁾ is valid, but χ⁽³⁾5 would not be. (The reason for superscripts and subscripts not being intermixed is that if
x superscript 3 subscript 5 and x subscript 5 superscript 3 get rendered the same, that would be confusable.) There are a lot of issues with confusability in Unicode these days, because of composed and decomposed characters.

@stevengj
Member

stevengj commented Jul 2, 2015

@ScottPJones, it would be helpful if we could keep issues focused. Open separate issues (or comment on existing relevant issues) if you want to enhance Lint (e.g. to warn about superscripts, or by warning about usage of two variables that are NFKC-equivalent in the same scope), restrict usage of super/subscripts, canonicalize (or warn about) confusables (see also JuliaStrings/utf8proc#11), or to allow ∛x as a variable name (I doubt you'll get much traction on that one).... Composed/decomposed characters are already handled by NFC normalization (#5462).
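
To make the normalization terms concrete, here is a minimal sketch assuming Julia 1.x and its `Unicode` standard library. `nfkc_collisions` is a hypothetical helper written for this illustration; it is not part of Lint or of Base:

```julia
using Unicode

# NFC composes canonically equivalent sequences (e + combining acute -> é).
println(Unicode.normalize("e\u0301", :NFC) == "\u00e9")

# NFKC additionally folds compatibility characters such as superscripts.
println(Unicode.normalize("x²", :NFKC))

# Hypothetical lint helper: group names by NFKC form and keep only the
# groups that contain more than one distinct spelling.
function nfkc_collisions(names)
    groups = Dict{String,Vector{String}}()
    for name in names
        key = Unicode.normalize(name, :NFKC)
        push!(get!(groups, key, String[]), name)
    end
    return filter(kv -> length(unique(kv.second)) > 1, groups)
end

println(nfkc_collisions(["x²", "x2", "y"]))
```

A real check would have to collect the names per scope; this sketch only shows the grouping step.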

This issue was supposed to be about (potentially) documenting a short summary of features that are unique to Julia, not about debating endlessly whether those features are a good idea.

@Ismael-VC
Contributor Author

This issue was supposed to be about (potentially) documenting a short summary of features that are unique to Julia

@stevengj I will certainly do this, I'm waiting for more material to come in order to document a good chunk of info at once. I already have several good examples of unicode identifiers.

What other languages use staged functions? I'd like more examples of peculiar if not unique julian features.

@Ismael-VC changed the title from: Add "unique Julia features" section in the manual noteworthy diffs. to: Add "peculiar Julia features" section in the manual noteworthy diffs. (Jul 2, 2015)

@ScottPJones
Contributor

@stevengj This particular issue was opened precisely to discuss this and other similar issues, where Julia was different from most languages, because I noted the inconsistency in #11927 (comment), as referenced in the opening comment. I also never proposed ∛x as a variable name, why did you say that? I'd be dead set against that (I think it is nice that it does act as a prefix operator, and is consistent with other things in Julia). That would just make things worse than they are now with x².
This is also not an endless debate, this issue was raised two days ago to sound out peculiarities of Julia that need to be at the very least documented. Here is another fun thing to consider:

julia> cube(x) = x³
cube (generic function with 1 method)
julia> cube(3) # Silly me, I think I'll get 27
ERROR: UndefVarError: x³ not defined
 in cube at none:1
julia> x³ = 42
42
julia> cube(3)
42
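
The same behaviour as a script, next to the spelling that actually cubes its argument. This is a minimal sketch; `cube_lookup` is a hypothetical name chosen to make the difference explicit:

```julia
x³ = 42               # an unrelated global that happens to be named x³

cube_lookup(x) = x³   # reads the global x³; the argument x is never used
cube(x) = x^3         # calls ^ on the argument

println(cube_lookup(3))
println(cube(3))
```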

And also things like this:

julia> y = div(typemax(UInt),3)
0x5555555555555555
julia> z = typemax(UInt)/3
6.148914691236517e18
julia> Int(y)
6148914691236517205

Because of the / always returns Float rule, different from any other language I've dealt with, you lose 3 digits from your result, which doesn't seem very good, and it is totally surprising to programmers coming from anywhere else.
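
A sketch of the available division spellings, assuming Julia 1.x on any platform (the `UInt64` type is spelled out so the numbers do not depend on word size):

```julia
n = typemax(UInt64)

q = div(n, 3)   # integer quotient, exact
f = n / 3       # / on integers always produces a Float64

println(n ÷ 3 == q)       # ÷ is the infix spelling of div
println(typeof(f))
println(UInt64(f) == q)   # false: Float64 cannot hold all 64 bits of q
println(7 / 2, " ", 7 ÷ 2)
```

When the exact integer result matters, `div` or `÷` keeps every digit, while `/` rounds to the nearest representable `Float64`.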

@StefanKarpinski
Member

what we should instead aim for is not to have confusing and subtle things in the language.

Thank you, Oscar.

Let's call this "unusual features" – "unique" is too strong, "peculiar" sounds too negative.

@StefanKarpinski
Member

@ScottPJones, this was discussed previously – before you were around. The current behavior was agreed upon and people who have reason to use this kind of mathematical notation a lot like it. There's no compelling reason to revisit that decision, despite your misgivings. Feel free to document it, however.

@Ismael-VC
Contributor Author

@ScottPJones ...empty your cup of 🍵

I know I had to do it (coming from Python) when I started watching MIT 6.001 Structure and Interpretation in order to learn Scheme, in order to learn Lisp like macros, in order to learn Julia macros, staged functions ...and beyond! (still stuck there)

Maybe I don't have many years of programming experience like you all do (and everyone started from scratch at one point), but the little that I do have right now, has cost me a lot of effort. Julia being relatively new and radical in many aspects is of course going to be misunderstood quite often (which is why I'm trying to help diminish that).

But I would at least expect that the other people interested in Julia also study how to use it (I'm not expecting that they also study how it works, like I'm also doing). If they don't have the common sense to do this or the will to exercise their effort and they simply expect that it should just work like x or y thing that they already know ...then they are free to shoot themselves in their feet for all I care.

it is totally surprising to programmers coming from anywhere else.

And what about the millions of future programmers that are out there! ...yet to be tainted by anything non Julian?

@Ismael-VC
Contributor Author

Let's call this "unusual features" – "unique" is too strong, "peculiar" sounds too negative.

@StefanKarpinski thanks! unusual it is then, my english is not that good to tell the difference.

@Ismael-VC changed the title from: Add "peculiar Julia features" section in the manual noteworthy diffs. to: Add "unusual Julia features" section in the manual noteworthy diffs. (Jul 3, 2015)

@ScottPJones
Contributor

@StefanKarpinski I wasn't asking to change that particular issue, and I had read some of the long ago discussions, but that is exactly the sort of thing that belongs in the proposed "unusual Julia features" documentation. They don't belong in the "differences between Julia and C/C++", or some other section, because they are different from most if not all languages out there.
(The / returning float wasn't really even a problem for me, because I also used CacheObjectScript, where / was floating point division, and \ was integer division, but it seems to come up frequently as a "gotcha" from people coming from most other languages)

@ScottPJones
Contributor

But I would at least expect that the other people interested in Julia also study how to use it

@Ismael-VC Agreed, but in order to save them valuable time, they need something to study, which is precisely why I wrote the "Noteworthy differences between Julia and C/C++" section, and why we need to determine what unusual features in Julia people are stubbing their toes on already, or are likely to in the future, and at the very least, get them well documented in the section you have nicely proposed.
Besides documentation, I think the compiler itself can help, like in my example of cube(x) above.
If you define a method, where you have an identifier that is not local in scope, but is composed of a local identifier followed by superscripts, then that would be a good time to give a warning. If you have an assignment in the local scope, but an assignment to what looks like the "root" identifier happens between that assignment and the use of the superscripted variable, that's another place where a warning might prevent problems.
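
The first of these warnings could be prototyped outside the compiler. The following is only a rough sketch under simplifying assumptions: it inspects a single expression, takes the local names as an explicit list, and ignores scoping rules entirely. `SUPERSCRIPTS`, `symbols_in` and `suspicious_superscripts` are hypothetical names, not an existing Lint or compiler API:

```julia
# Superscript characters considered by this sketch.
const SUPERSCRIPTS = Set("⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾")

# Collect every symbol that appears anywhere in an expression tree.
function symbols_in(ex, acc=Symbol[])
    if ex isa Symbol
        push!(acc, ex)
    elseif ex isa Expr
        foreach(a -> symbols_in(a, acc), ex.args)
    end
    return acc
end

# Report non-local names that look like "local name + superscripts".
function suspicious_superscripts(locals, body)
    hits = Symbol[]
    for s in unique(symbols_in(body))
        s in locals && continue
        name = String(s)
        root = rstrip(c -> c in SUPERSCRIPTS, name)
        if root != name && Symbol(root) in locals
            push!(hits, s)
        end
    end
    return hits
end

println(suspicious_superscripts([:x], :(x³)))        # x³ is flagged
println(suspicious_superscripts([:x, :x²], :(x² + x)))  # nothing to flag
```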

And what about the millions of future programmers that are out there! ...yet to be tainted by anything non Julian?

Well, I don't know if you would consider my children "tainted" by Lua, I think that's a very fine little language, but I'm spending a bit of time each day this summer teaching them Julia, to hopefully save them from C++ hell!

@Ismael-VC
Contributor Author

@ScottPJones that's what I've been talking about all this time! I believe no one disagrees to document it.

I'm spending a bit of time each day this summer teaching them Julia

That's great!

Also learn from them! You could show them the pros of using unicode (x², x³), they will probably like it!

After that you could let them use it in more and more examples and then after all that show them the cons (if they haven't found them yet) and the alternatives (xx, x_cube, etc.) all over the place in C++ code! I'm curious to know what they would think about that. 💭

@stevengj
Member

stevengj commented Jul 3, 2015

@ScottPJones, this issue was opened to document unusual features of Julia. Can't you see that turning it into an open-ended discussion of what identifiers should be allowed and what warnings should be raised is a mite off-topic?

Having a long unrelated (or marginally related) discussion thread in any issue makes life harder for everyone trying to follow Julia development.

@stevengj
Member

stevengj commented Jul 3, 2015

At this point I'm inclined to just close this issue and @Ismael-VC can open a fresh one in order to increase the signal-to-noise ratio.

@ScottPJones
Contributor

@stevengj No. If something is unusual enough from most all other languages that it needs to be specially documented, it deserves at least a little time to discuss whether or not it actually is needed in the language, and if so, what else, besides documentation, can be done to preserve the sanity of people coming new to Julia.

@stevengj
Member

stevengj commented Jul 3, 2015

This is the wrong issue for that discussion. Please don't hijack issues that are not about changing the language in order to focus on your preferences for changing the language.

@ScottPJones
Contributor

@stevengj I haven't said anything about changing the language (except giving warnings) for most of this discussion, ever since @pao's comment. I've only been thinking of ways of decreasing the possibility of confusion among programmers new to Julia, by documentation or other means (not everyone reads all of the documentation).

@tkelman
Contributor

tkelman commented Jul 3, 2015

Yes, examples to document these points would be great. Please open a PR to start with any you can think of. This issue has gone down a rabbit hole and does not need to be kept open.

@ScottPJones ...empty your cup of 🍵

Please heed this advice. This was supposed to be a request for other examples, and it's instead become an overly long discussion about unicode in identifiers that belongs on the mailing list. Open-ended questions that aren't directly addressing the current issue or PR don't belong on the bug tracker.

@tkelman closed this as completed Jul 3, 2015

@Ismael-VC
Contributor Author

julia> julia  :👽
true
