julep: a plan for backticks #12139

Closed
8 tasks
StefanKarpinski opened this issue Jul 13, 2015 · 68 comments
Labels
design Design of APIs or of the language itself julep Julia Enhancement Proposal speculative Whether the change will be implemented is speculative

Comments

@StefanKarpinski
Sponsor Member

It has not infrequently seemed a shame to me that Markdown has trained us all to quote code by wrapping it in backticks. `a + b` is how we want to write the quoted expression a + b. That's considerably nicer than :(a + b) – the frownieface operator is just kind of weird – and it has some syntactic issues since the parens are actually part of the expression being quoted, not the quotation syntax; this has tripped quite a few people up.

Currently, backticks are used for quoting external commands using a convenient shell-like syntax. You don't want to use single or double quotes for this since it's quite common to want to use those quote characters in command expressions. But there's one bit of syntax we haven't exploited yet: backtick custom-literal strings. (This option only just occurred to me the other week.) So I would propose the following syntaxes:

  1. Use bare backticks to quasiquote Julia code: `a + b + $ex`. The dollar sign splices expressions into the quoted code as it does inside of :(...) currently.
  2. Use cmd-prefixed backticks to write commands: pipe(cmd`find -name *.$ext`, cmd`head -n$n`). The dollar sign splices values into commands as it does into backticks currently.
  3. Use colon for symbol literals, allowing double quotes to write symbols that aren't valid identifiers: e.g. :foo for symbol("foo"), :"foo bar" for symbol("foo bar") or :123 for symbol("123").
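
For reference, here is a minimal sketch of how quasiquotation and symbol construction are spelled today. It assumes Julia 1.x, where `symbol(...)` has become `Symbol(...)`; the variable names are illustrative only and none of the proposed backtick syntax is used.

```julia
# Sketch in current Julia 1.x syntax, not the proposed backtick syntax.
ex = :(c * d)

# `$` splices the expression held in `ex` into the quoted code.
quoted = :(a + b + $ex)

# Symbols: the colon form covers valid identifiers only;
# anything else currently needs the Symbol constructor.
sym_plain  = :foo
sym_spaces = Symbol("foo bar")
sym_digits = Symbol("123")

# Today, quoting a numeric literal just yields the literal itself.
literal = :(123)
```

Under item 3 of the proposal, the last three cases would instead be written with colon syntax.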

Using backticks for quasiquoting has the advantage that it's what lisp does. Getting to this point without breaking everything will require a substantial deprecation process:

  • Introduce custom backtick literals – foo`...` – and allow people to use those for a while.
  • Introduce cmd`find -name *.$ext` as a syntax for external commands.
  • Deprecate bare backticks for commands.
  • Wait a release cycle to let the deprecation "take".
  • Change `a + b` to mean quasiquotation – this breaks code using `...` for commands.
  • Deprecate :(a + b) as quasiquotation.
  • Wait a release cycle to let the deprecation set in.
  • Disable :(a + b) for quasiquotation, enable :"foo bar" for non-identifier symbols.

That's a long process, but I think it's a better use of backticks. It has the advantage of matching both how we write quoted code in Markdown and how most Lisps use backtick for quasiquotation – in Lisp style just at the front, of course, but still, I think it will be more familiar to Lispers.

@StefanKarpinski StefanKarpinski added the julep Julia Enhancement Proposal label Jul 13, 2015
@jakebolewski
Member

-1 from me. Although I like the proposal, I don't think what we gain here is worth the massive breakage.

@simonster
Member

If we're going to change the command syntax, why not have cmd be an ordinary custom string literal? It's not clear to me what we'd gain from backtick custom literals besides confusion and extra special cases in the code. We can always use cmd"""x""" if there are quotes, but I'm not sure there are many cases where you can't use single quotes in the command. (In fact I'm not sure there are many cases where you actually want quotes at all if you have interpolation.)
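
As a rough sketch of what an ordinary custom string literal for commands could look like without parser changes: the macro name `mycmd` is hypothetical, and the naive whitespace split below does not handle quoting or interpolation the way Julia's built-in backticks do.

```julia
# Hypothetical string macro; the name `mycmd` is an assumption for illustration.
macro mycmd_str(s)
    # Naive whitespace split: no quoting rules and no `$` interpolation.
    words = String.(split(s))
    return :(Cmd($words))
end

# Builds a Cmd object; nothing is executed until it is passed to `run`.
c = mycmd"echo hello world"
```

The gap between this and real backticks (quoting and interpolation semantics) is what the later comments in the thread are about.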

Ref #9945 for the proposed symbol changes.

@ScottPJones
Contributor

I agree with @simonster, cmd"""xxx""" or cmd"xxx" instead of backticks for commands.
I don't think backticks are used for commands that frequently, in base, I found 9 files that used it, plus 8 files in the pkg directory, and a lot more places where backticks were part of documentation.
In packages, it seems that in most places where backticks are used as commands, they are a literal argument to run, so for those cases, why can't run also accept a normal string, and treat it as if it were in backticks?
Add the cmd"..." and cmd"""...""", along with run("...") and run("""..."""), deprecate backticks for commands at the same time, and then think about using backticks for other things.

@tkelman
Contributor

tkelman commented Jul 14, 2015

I'm not sure there are many cases where you can't use single quotes in the command.

The cmd shell on Windows doesn't handle single quotes the same way a posix shell does, as one case.

@hayd
Member

hayd commented Jul 14, 2015

cmd".." is also easy to add to Compat, no need for parser changes.

-1 to run et al accepting plain strings due to the difference between interpolation (e.g. of arrays; see http://julialang.org/blog/2013/04/put-this-in-your-pipe/).

@toivoh
Contributor

toivoh commented Jul 14, 2015

+1 to this proposal, and the end goal.
Beyond making quoting much more readable, it also introduces a distinction between quoted symbols and bare ones, which I seem to recall is needed to improve macro hygiene.

One thing that would be lost though: it would no longer be possible to nest quasiquotes lexically. But I think the needs of that would be infrequent enough that you could easily work around it.

@ivarne
Sponsor Member

ivarne commented Jul 14, 2015

This seems like two independent issues.

  1. Enable backtick custom string literals, and how they should work.
  2. Reclaim unprefixed backtick literals for a more widely used purpose.

Previously we had @*_str and @*_mstr macros, but they were merged when the deindentation function for triple quoted strings was moved to the parser. Should prefixed backtick quoting be just another string literal that calls @*_str, with different parser behavior with regard to escaping, or do we want a different concept?

2 will be a long process with deprecation periods to allow people to migrate to the new solution, so there is no hurry deciding what we will use the syntax for.

@mauro3
Contributor

mauro3 commented Jul 14, 2015

Would triple back-ticks replace quote-blocks?

@ScottPJones
Contributor

@hayd Pardon my ignorance, could you give an example of why run("string") could not be treated as the equivalent of run(`string`)? I read the link, but couldn't see just where that said or implied that that wouldn't work. Thanks.

@toivoh Example of nested quasiquotes please? Thanks.

@MikeInnes
Member

+1, this seems like a nice improvement. If nothing else I'm not going to be able to unsee the frownyface operator now.

@toivoh I don't think we'd necessarily lose that ability. We can already nest strings as e.g. "foo $("bar") baz", and the parser realises that " doesn't end the string because it's inside an expression. foo(`bar`) could work in the exact same way.
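
A small sketch of the two kinds of nesting being compared here, in current Julia 1.x syntax. The variable names are illustrative; the nested-quote half uses the existing `:( )` form, since the backtick form is only a proposal.

```julia
# Strings already nest: the inner quotes sit inside a `$( )` expression,
# so the parser does not treat them as the end of the outer string.
s = "foo $("bar") baz"

# Quasiquotes nest today via parentheses. The `$x` belongs to the inner
# quote, so it is kept unevaluated and `x` need not be defined here.
nested = :( :(1 + $x) )
inner  = nested.args[1]
```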

@pao
Member

pao commented Jul 14, 2015

@ScottPJones Command interpolation works differently. For instance, assume we had a program nargs which returns argc, and let arg = "one two". run("nargs $arg") would return 3, but run(`nargs $arg`) would return 2.
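
The difference can be inspected without running anything, by looking at the argument vector a command literal builds. This is a sketch assuming Julia 1.x, where a `Cmd` stores its words in the `exec` field; `nargs` is the hypothetical program from the comment above and is never executed here.

```julia
arg = "one two"

# Backtick interpolation keeps the value as a single argument.
as_command = `nargs $arg`

# A plain string has no word structure; a shell-style split on
# whitespace would see three words.
as_string = "nargs $arg"
words = split(as_string)

# Interpolating an array expands to one argument per element.
files = ["a.txt", "b.txt"]
multi = `nargs $files`
```

Counting the program name, `as_command.exec` has two entries while `words` has three, which matches the argc values of 2 and 3 described above.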

@tbreloff

+1 for the @StefanKarpinski proposals 1 and 3 (backticks become expressions, colons are symbols), and also the @mauro3 suggestion to use triple back-ticks to replace quote/end blocks. Also agree that cmd"..." and cmd"""...""" are sufficient and don't require special back-tick notation. I was burned yesterday by the subtlety of the frownyface notation, and I think there should be a clear distinction between expressions and symbols. As @ScottPJones pointed out, there are very few current uses of back-ticks so I say just go ahead and break stuff.

@ScottPJones
Contributor

@pao Thanks. That's a rather subtle difference I would think.
With the backtick quoting, what would one do if they want things split up like in the first example (with " quotes)?

@StefanKarpinski
Sponsor Member Author

@ScottPJones, please read http://julialang.org/blog/2012/03/shelling-out-sucks/ and http://julialang.org/blog/2013/04/put-this-in-your-pipe/ for more background on why Julia's backticks exist, work the way they do, and are important for calling external programs reliably. I went through it there in a great bit of detail with lots of examples. No point in rehashing that unnecessarily.

@StefanKarpinski
Sponsor Member Author

@one-more-minute wrote:

@toivoh I don't think we'd necessarily lose that ability. We can already nest strings as e.g. "foo $("bar") baz", and the parser realises that " doesn't end the string because it's inside an expression. foo(`bar`) could work in the exact same way.

Good point. Since you can parenthesize expressions you could always write `(...)` for nested quasiquotation. I also like the idea of triple backticks for quote end – again, it fits nicely with how triple backticks are used in Markdown.

@ScottPJones
Contributor

OK, I did read the second one, that @hayd mentioned, I'll read the other one. Thanks.

@jakebolewski
Member

It is not a very strong argument that just because the cmd syntax is not used often in Base, it can just be freely deprecated without too much impact. Of course it is not used in Base, you should make minimal assumptions about your environment if you want to be cross platform. Command syntax is used often in data processing pipelines. It is often faster to call unix functionality through shelling out than to use Julia code to munge your data as the unix utilities are currently much faster.

The real deprecation here is not with the cmd syntax but with expression quoting (which is used everywhere). I agree that the backtick is marginally nicer syntax, but is it _that_ much nicer to go through all this code churn? @tbreloff you say that the current syntax is subtle, could you give a concrete example?

At the current release rate, this proposal would have us adapting to deprecations and rewriting code for ~2 years. To go through and fix packages is a lot of effort for often little gain. I'm just raising the red flag that we should actually be gaining something tangible from this proposal (other than it is more aesthetically pleasing) before committing to it.

@StefanKarpinski
Sponsor Member Author

@ivarne wrote:

Previously we had @*_str and @*_mstr macros, but they were merged when the deindentation function for triple quoted strings was moved to the parser. Should prefixed backtick quoting be just another string literal that calls @*_str, with different parser behavior with regard to escaping, or do we want a different concept?

Ah, I see you beat me to this proposal.

@StefanKarpinski
Sponsor Member Author

@jakebolewski, this is a good point, but I do think that aesthetics matter and this is something that will be in the language forever. I don't really want to live with the frowneyface operator forever, especially when there's this other much nicer syntax so tantalizingly close.

@tbreloff

regarding subtlety... here's a few quick examples which are non-obvious with a quick glance (for non-expert users anyways):

julia> x = :(); typeof(x)
Expr

julia> x = :(x); typeof(x)
Symbol

julia> x = :(+); typeof(x)
Symbol

julia> x = :(+5); typeof(x)
Int64

julia> x = :(+(5)); typeof(x)
Expr

I feel like it would be much clearer to see:

``   # equivalent to :()
:x
:+
`+5`
`+(5)`

Aesthetics matter a ton. I want to be able to scan code in 1-2 seconds to understand what it's doing.. I don't want to spend my time looking for matching parens and reasoning about what something means in context. This is doubly valuable if I can add logic to my syntax highlighter that clearly identifies expressions in the code. I can't easily do that if symbols and expressions share syntax.

@MikeInnes
Member

:(x) == :x is particularly fiddly, because it means that you can't reliably return quasiquote syntax from macros, and instead have to have use :($(Expr(:quote, x))) everywhere (linking back to the nested-quasiquotes issue). Making a distinction between symbols and quoted expressions makes a ton of sense.
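
To make the `:(x) == :x` point concrete, here is a minimal sketch in Julia 1.x. `QuoteNode` is shown as one current alternative to the `Expr(:quote, x)` spelling mentioned above; the variable names are illustrative.

```julia
# Parenthesised quoting of a bare name collapses to the symbol itself,
# so it cannot stand for "code that evaluates to the symbol x".
collapsed = :(x)

# To get code that evaluates to the symbol, an explicit quote node is needed.
quoted_expr = Expr(:quote, :x)
quoted_node = QuoteNode(:x)

# Both evaluate to the symbol :x rather than looking up a variable x.
from_expr = eval(quoted_expr)
from_node = eval(quoted_node)
```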

@jakebolewski
Member

@tbreloff, @one-more-minute wouldn't using explicit quote ... end blocks solve most of the points you raise (except +5 which is transformed in the parser).

@MikeInnes
Member

Possibly, although wrapping things in a redundant Expr(:block) isn't always convenient either.

@tbreloff

@jakebolewski yes you can obviously get around these problems, but quote ... end adds its own confusion and messiness.

Julia is still 0.4 (dev)... if there are good solutions to making the language easy to understand/read, we should do it.

@jakebolewski
Member

@tbreloff what is the confusion and messiness with quote ... end blocks? Block syntax is fundamental to Julia.

Users who are manipulating quoted expressions have entered "sufficiently advanced user territory". We don't even commit to having a stable Expr AST representation.

@mbauman
Sponsor Member

mbauman commented Jul 14, 2015

I think there'd be a certain elegance to have backticks always wrap their contained expression in an Expr(:quote, …) Expr, akin to how quote … end is always Expr(:block, …) (and that could become triple backticks). Then you'd no longer need to worry about potentially getting AST literals back, either.
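
A short sketch of the asymmetry being described, using Julia 1.x behaviour: `quote … end` always yields a block expression, `:( )` may yield a bare value, and the parser itself represents the quoting construct as an `Expr(:quote, …)` before evaluation.

```julia
# quote ... end always produces an Expr with head :block.
blk = quote
    x
end

# :( ) around a bare name yields the Symbol, not an Expr.
paren = :(x)

# Parsing the source text of a quote (without evaluating it) shows the
# wrapper that the suggestion above would make uniform.
parsed = Meta.parse(":(a + b)")
```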

(Edited, thanks @toivoh)

@mbauman
Sponsor Member

mbauman commented Jul 14, 2015

Thinking about the commonalities between quasi-quotation and command line syntax, they're both some sort of executable string with syntax. Perhaps the custom foo`…` literals should be encouraged for writing DSLs or other interop like sql`…`. With that in mind, should there be any differences in the parsing or macro name between foo"…" and bar`…`?

@quinnj
Member

quinnj commented Jul 14, 2015

+1 to eventually using backticks as the default quoting syntax. I also think that having the shell> mode lessens some of the impact here since that's, at least for me, the most common use of shelling out in Julia.

@mbauman brings up a good point. Maybe the convention going forward is foo".." string literals return objects, while foo`...` backtick literals actually call a method of some kind, i.e. execution.

@tkelman
Contributor

tkelman commented Jul 15, 2015

No, I still think that's an imprecise analogy - it works the way code inside an interpolation (or existing quoting) works, not the way interpolation inside strings works. Interpolation is its own parsed context within the string. You're proposing making backticks a parsed context, except when prefixed by a formatting macro? Seems maybe useful, but not a dramatic improvement. The funny lowering of custom string literals is already kind of hidden and confusing, now we're going to add another version of it?

@tkelman
Contributor

tkelman commented Jul 15, 2015

Or to take this another step, if we're going to do this, why not apply the exact same treatment to single quotes while we're at it. I'm sure there's a better use for them than chars, we can just use char'a' for that. (not sure if joking)

@mschauer
Contributor

That is quite nice. `1 + 1` is quasi-quoted (julia)-code, and cmd`echo -e "\033[2J"` is code (a command with args in the execvp sense) and bash`echo -e "\E[2J"` is some specific shell code etc

@hayd
Member

hayd commented Jul 16, 2015

You're proposing making backticks a parsed context, except when prefixed by a formatting macro?

Isn't the point that you could define parsing rules on prefixed backticks? e.g. cmd/cxx/sql.

The conventions between parsing and executing are a bit unclear: sql/cxx execute on construction (IIUC), cmd/:( don't and need to be run/eval'd.

@StefanKarpinski
Sponsor Member Author

Yes, there's a bit of a distinction to be made between constructs that construct code versus evaluate code. It might be worth making it more uniform, but it's unclear why one would want to construct but not evaluate SQL or C++ code objects.

@jakebolewski
Member

If we really are going to change the syntax, I think we should start deprecation process in this release so that at least we can write non-deprecated code going forward in the 0.5 release cycle.

@prcastro
Contributor

This the chance to get infix operators. That's the only nice way to achieve custom infix operators that I'm aware of, and if we decide to use this syntax for quasi-quotation we will never get that.

@StefanKarpinski
Sponsor Member Author

I don't see why backticks are a desirable choice for infix operator syntax.

@prcastro
Contributor

Because they are a minimal piece of syntax that we can put surrounding a custom operator:
a `operator` b is much nicer than a $operator$ b or something like that.

Unless we can define a symbol to be an infix operator without surrounding it with syntax, say:

@infix :operator

@johnmyleswhite
Member

Can we not worry about infix operators when we're getting close to a release and need to get things moving sooner rather than later?

@prcastro
Contributor

Got it

@StefanKarpinski
Sponsor Member Author

I've never liked backticks for infix operators, so I'm not too worried about that.

@quinnj
Member

quinnj commented Jul 23, 2015

Yeah, I used to think having generalized infix operators was super important, but over time, it's become less and less important IMO. I think the natural Julia style with multiple dispatch generalizes to more than just two arguments, so infix suddenly isn't as important. Using backticks for quoting code is much more natural.

@StefanKarpinski
Sponsor Member Author

x f y is currently a syntax error outside of array and macro context, which would be more appealing, especially since it generalizes the x in y special syntax. To me the main question is how (if) one would generalize it to more arguments, associativity, etc.

@mbauman
Sponsor Member

mbauman commented Jul 23, 2015

Back on topic, I think the order of Stefan's checklist is right. We first need to allow and implement foo`...` literals. How we do that is still in discussion (line numbers? conventions for construction vs execution?) and a bit of work. I'm not sure it's worth delaying 0.4 more for this.

@jakebolewski
Member

Leaving huge breaking syntax changes up in the air doesn't really help anyone. If this is going to change it should happen sooner rather than later.

@ViralBShah
Member

I personally feel that we should put this on the backburner for now, and discuss earlier in the 0.5 cycle.

@JeffBezanson
Sponsor Member

I agree with @ViralBShah ; at this point I don't want a single additional thing to worry about for 0.4. The "do it sooner rather than later" argument has merit, but can only go so far. Otherwise as soon as somebody raises a breaking syntax idea, we have to delay whatever release we're working on until it's settled. It's sort of a breaking-change-filibuster.

I like backticks for code quoting, but I'm worried about the nesting behavior. It vaguely reminds me of the syntax of prefix $ before we made it parse just one atom. One was never sure exactly how much syntax it would eat. People may end up preferring to disambiguate by writing `( ... )`, but then we're in a similar situation as we are with :(). See for example #11611. So we might want to make parens part of the syntax from the beginning, i.e. the open backquote is backtick-open-paren, and the close backquote is close-paren-backquote.

@StefanKarpinski
Sponsor Member Author

While I'm still in favor of this, I agree that jumping into it too fast is a bad idea. This is post 0.4.

@StefanKarpinski StefanKarpinski added design Design of APIs or of the language itself speculative Whether the change will be implemented is speculative labels Aug 1, 2016
@ararslan
Member

ararslan commented Aug 4, 2016

I think I'm in the vast minority here, but I actually prefer the frowny face operator; having :(...) construct an Expr seems a natural extension of having :x construct a Symbol. If anything, I'd prefer something like :{...} as a replacement for frowns over backticks, or even go full on Lisp and use '(...).

I also think that the triple backticks for block syntax are much less clear than quote/end, and are less consistent with how we do blocks in other contexts (e.g. do, for, begin). It bears resemblance to the multiline string syntax, but multiline strings contain arbitrary stuff whereas blocks contain code. I think it's safe to assume that quoted blocks more often contain code than arbitrary text.

I agree with Jake that it's a lot of ecosystem-wide breakage for, in my very humble opinion, marginal benefit.

@andyferris
Member

I was (perhaps unnecessarily) a bit worried about how nesting expressions will work nicely, which is the nice thing about having them surrounded by brackets. Here's just an idea:

`I am a symbol with spaces and #hashes and $string-like interpolation`
`(I am an expression with $expr-like interpolation #comment)`

with triple backticks being more like the latter, I suppose? Having symbol makes it seem a bit more like it is just another type of string (an interned string - though I'm not sure whether or not to add all the String-like methods to make it so...)

@StefanKarpinski
Sponsor Member Author

I don't think we're going to do this. The problem with backticks in place of :( ) is that they don't nest, which is very annoying.
