Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Escape non janet values from the repl as a string #214

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ALSchwalm
Copy link
Contributor

Fixes #211

This isn't done, but I wanted to get it up to see what you thought of the idea. Essentially it's just doing essentially what (I think) @bakpakin was suggesting. We add our own peg for janetsh, which allows embedded janet values, but also raw tokens that are not limited to valid janet symbols. Those tokens are then escaped as strings before being passed to ($ ...).

This seems to work, but I could be missing something. Or there could be a better way to do this. For example, we might want to expose something to the user that they could call like ($-raw "ls 1.png") or something.

Note also this is my first time writing a PEG, so if things can be done more cleanly, let me know.

@andrewchambers
Copy link
Owner

Thank you for experimenting with this. A huge downside to this approach is it entirely breaks expectations in script mode.

I am not yet convinced this work around is better than just having users understand that janet shell is a janet repl more than 'sh'. Perhaps we could just have an explanation in a well written tutorial telling people to use quotation in this case?

The thing I am most concerned about floats being reprinted differently from what a user typed. It is a correctness foot gun which bothers me much more than the error messages about symbols starting with numbers.

In any case, I am still quite tempted to close #211 as won't fix (after changing keywords).

@andrewchambers
Copy link
Owner

Something like this makes some sense to me as a user extension/plugin that can be advertised so people can make the explicit choice.

@bakpakin
Copy link
Contributor

bakpakin commented Jun 18, 2019

The thing I am most concerned about floats being reprinted differently from what a user typed. It is a correctness foot gun which bothers me much more than the error messages about symbols starting with numbers.

Why do we need to parse everything, and then write it back via (buffer/format buf "(sh/$? %s)\n" (escape-raw-line line))? Couldn't the peg parse the arguments and invoke sh/$? on them directly? I think that would eliminate correctness and reprinting issues. This obviously might require changes in the repl.

Edit: After reading this PR, I actually don't see how the numbers issue is a problem. As far as I can tell this should work fine. That said, I don't think that needing to quote things that start with a number is such a problem in the first place, as long as its possible to enter any possible filename.

@andrewchambers
Copy link
Owner

andrewchambers commented Jun 18, 2019

here is the problem with numbers:

$ echo 1.34322343243232432432532432432432
1.34322

Obviously not what the user intended. It an edge case, I am probably over thinking it, and my point was mainly that it may not be worth fixing even though it annoys me.

Adding a string based macro and then hiding the fact from the user is confusing in it's own way.

@andrewchambers
Copy link
Owner

andrewchambers commented Jun 18, 2019

An example of why I object to this change as is:

This will be totally ok:

echo 1.c 

But now this is broken because it doesn't use the string version of the macro:

(and ($ echo 1.c) ($ echo 2.c))

This is about as confusing to me as the original problem.

So the proposed solution is:

(and ($-raw "echo 1.c") ($-raw "echo 2.c"))

and the already working solution:

(and ($ echo "1.c") ($ echo "2.c"))

@bakpakin
Copy link
Contributor

I think the issue here is that the peg defines :value to be (+ :janet ... :number). If :janet were last in the ordered choice, the long number would be converted to a string before it was truncated.

@bakpakin
Copy link
Contributor

Yeah, if you want to be able to have identical syntax inside ($ ...), you would need either a completely custom parser, wrap the whole command in quotes, or just live with the fact that some tokens will need quotes that wouldn’t otherwise. I think the original problem isn’t so bad in retrospect.

@andrewchambers
Copy link
Owner

Maybe a case of https://en.wikipedia.org/wiki/Worse_is_better

@ALSchwalm
Copy link
Contributor Author

Hmm, my initial reaction was that not being able to, for example, cat files starting with numbers without quoting them would be very annoying. In practice though, such files are probably not much more common than ones containing parenthesis. I think you may be right and just documenting this is a better solution. I'll think on this a bit more an probably close this PR if I can't come up with any compelling cases where it would be a problem.

@ALSchwalm
Copy link
Contributor Author

Hmm, looking at my bash history, there are quite a few cases where I would need to quote things with janetsh without this pr. For example, any git command using a hash would need the hash quoted, as well as any branch creation starting with a number (like this branch, which I think is a common idiom). Even more significantly, any ip address would need to be quoted. I think those together are a pretty significant usability issue, and suggest that we should not consider this something that we should close as just a documentation issue (though I am not saying this particular PR is the best approach).

@andrewchambers
Copy link
Owner

andrewchambers commented Jun 19, 2019

I'm not sure what can be done without adding something like a parser macro to janet that lets you overload the janet parse module within the brackets. I do think that would be pretty cool, but it has been ruled out as overcomplicated.

Some research could be done as to how racket adds this feature to make a more informed proposal. I think the problem is that if janet grows features like that, things like that simple PEG describing the whole of janet no longer work, so in the future things like auto formatters of code no longer work.

Janet shell could fairly easily implement it's own reader by forking the parser module, but then it is no longer janet code and we might as well change the name.

@x4lldux
Copy link
Contributor

x4lldux commented Jul 9, 2019

This seems to work, but I could be missing something.

@ALSchwalm I've been playing with this for few days now and there were some things missing. I've added/modified (see x4lldux/janetsh@cebd357) to make it more bearable in day to day use:

  • quoting (any thing after a quote was removed),
  • support for double & single quotes (single quotes are translated to double quotes and any double quote in them is escaped),
  • support for env variables (env vars were quoted as strings and weren't interpolated by janetsh),
  • redirecting (it was impossible to redirect stderr),
  • support for escaping spaces (translates it to a quoted string)

But still something like echo "User's home dir is $HOME" is not possible.

A huge downside to this approach is it entirely breaks expectations in script mode.

@andrewchambers I think it's actually the opposite. Shell users expect to be able do the same things in scripting mode as they can do in shell mode. In Janetsh you can write expression ls | wc -l in shell but you can't do that in script where you need to enclose it in sh/$ functions/macros. Whilist you can use the same expresion in both scripting and shell mode in other lisp shells like Closh or Eshell (Emacs Lisp shell).
Right now janetsh scripting is nothing more than a janet scripting (even sh module is not auto imported) - so what's the use case here? Users wanting to write Janet scripts, can just use plain Janet.

I am not yet convinced this work around is better than just having users understand that janet shell is a janet repl more than 'sh'.

@andrewchambers IMHO, this work is a step in a right direction.

That said, I don't think that needing to quote things that start with a number is such a problem in the first place, as long as its possible to enter any possible filename.

@bakpakin Unless you're a sys-admin or and has to deal with a lot of IP addresses (also port numbers, git hashes, orprocess ids). Quoting starts be annoying real quick.

Janet shell could fairly easily implement it's own reader by forking the parser module, but then it is no longer janet code and we might as well change the name.

@andrewchambers It still would be janet code but it would be enhanced to be more usable in a shell environment. Once again, look at Closh or eshell (Emacs Lisp shell).

@andrewchambers
Copy link
Owner

andrewchambers commented Jul 9, 2019

echo "User's home dir is $HOME"

This is something that can be fixed at the macro level and I am interested in fixing this.

Right now janetsh scripting is nothing more than a janet scripting

Yes this is by design, compatibility with janet is a feature. The only reason the 'sh' mode in scripting is not just a janet library is that janetsh has a 'runtime' to do with signal handling. This may be fixed up in the future.

Whilist you can use the same expresion in both scripting and shell mode in other lisp shells like Closh or Eshell

Last time I used closh it did not have a scripting mode at all. I don't really have too much interest in Eshell or Elisp.

@bakpakin Unless you're a sys-admin or and has to deal with a lot of IP addresses (also port numbers, git hashes, orprocess ids). Quoting starts be annoying real quick.

IP addresses are definitely annoying.

It still would be janet code but it would be enhanced to be more usable in a shell environment. Once again, look at Closh or eshell (Emacs Lisp shell).

It still would be janet code but it would be enhanced to be more usable in a shell environment. Once again, look at Closh or eshell (Emacs Lisp shell).

I feel the only way this is going to get implemented is if Janet gets a custom reader api. I know GNU guile has an api that should make this possible, it is probably quite doable to port janetsh to guile scheme which may be a fun experiment.

@x4lldux
Copy link
Contributor

x4lldux commented Jul 10, 2019

Last time I used closh it did not have a scripting mode at all. I don't really have too much interest in Eshell or Elisp.

I know GNU guile has an api that should make this possible, it is probably quite doable to port janetsh to guile scheme which may be a fun experiment.

I'm not advertising to change Janet lisp to some other lisp language, but that both Closh and Eshell were able to retain it's lisp-iness and also get shell-iness.

Yes this is by design, compatibility with janet is a feature.
That creates inconsistency between what you can do interactively & in scripting mode.

I feel the only way this is going to get implemented is if Janet gets a custom reader api.

Not necessarily. We can use transpilation. In a sens it has always been used with implicit parens, while this PR extends it further by using PEG's replace to quote tokens or automatically add a colon before redirects.
To do this, we could add an import-sh function, which would work like import but would first transpile from Janetsh to Janet (maybe via that ($-raw "foo")). That would need to create a version of dofile & require. Or if @bakpakin would allow to add custom transformation to be run before loading a file on dofile we could skip those -sh functions all together. Maybe a custom module/loaders kind could also work here...

@bakpakin
Copy link
Contributor

This is basically my original suggestion, although I had not originally envisioned using dofile. Perhaps dofile could take some data structure for reading forms, and this would make it easier. However, if you read through dofile, it shouldn’t be too hard to modify for the needs of janetsh.

@andrewchambers
Copy link
Owner

andrewchambers commented Jul 14, 2019

I hit #217, the more issues I encounter the more likely this is to be merged, so don't give up hope totally.

We should also consider some edge cases and how mixing strings/janet works.

($ "ls -la (+ 1 2 3) ")
($ "ls -la" (+ 1 2 3))
($ "ls -la" (+ 1 2 3))
($ echo " \" " " \" ")

@andrewchambers
Copy link
Owner

@x4lldux janet already supports custom loaders btw, so we definitely could do that.

@x4lldux
Copy link
Contributor

x4lldux commented Jul 14, 2019

@andrewchambers I don't :) In fact, after reading through bakpakin/mendoza repo some more, I did find that Janet already supports custom loaders and was I planning to start working on it. Just now, it would translate to ($ ) in the form it is defined right now, so I could just concentrate in unification of shell/repl & script mode. But that is a little different story than this PR and #211.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Confusing behavior with filenames starting with a digit
4 participants