Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Something changed with printing, now syntax highlighting is confused #2628

Closed
harold opened this issue Apr 23, 2019 · 35 comments
Closed

Something changed with printing, now syntax highlighting is confused #2628

harold opened this issue Apr 23, 2019 · 35 comments
Labels
bug help wanted high priority Tickets of particular importance

Comments

@harold
Copy link
Contributor

harold commented Apr 23, 2019

Actual behavior

I often return sequences of maps to have them printed in the repl. Since upgrading to 0.21.0 I am seeing a lot of confused syntax highlighting like this:

image

Steps to reproduce the problem

Evaluating this form in the repl is enough to do it:

(for [i (range 40)]
     {:a (java.util.UUID/randomUUID)
     :b "String String String String String String String"})

Environment & Version information

CIDER version information

;; CIDER 0.21.0 (New York), nREPL 0.6.0
;; Clojure 1.9.0, Java 1.8.0_171

Lein/Boot version

$ lein --version
Leiningen 2.9.1 on Java 1.8.0_171 Java HotSpot(TM) 64-Bit Server VM

Emacs version

GNU Emacs 26.2 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.30) of 2019-04-12

Operating system

Ubuntu 18.04

PS. I love Cider it's the best part of my day every day. <3

@bbatsov bbatsov added the bug label Apr 24, 2019
@bbatsov
Copy link
Member

bbatsov commented Apr 24, 2019

I've noticed this as well recently. Probably it's some regression related to the REPL performance work done by @cichli in that release. We'll have to investigate this further.

@bbatsov
Copy link
Member

bbatsov commented Apr 24, 2019

PS. I love Cider it's the best part of my day every day. <3

Thanks for the kind words! 🙇

@harold
Copy link
Contributor Author

harold commented Apr 24, 2019

We'll have to investigate this further.

Great. Let me know if there's anything I can do to help.

@dpsutton
Copy link
Contributor

@harold i feel like i've seen this for a while so:

  1. thanks for reporting. things don't happen without that
  2. do you know which version you were last on that didn't have this bug?

@harold
Copy link
Contributor Author

harold commented Apr 24, 2019

On my main machine I have this in package-list-packages:
image

0.16.0 I used for a long time, and never saw this particular brand of syntax highlighting confusion.

0.20.0 I used for a much shorter time, and don't recall seeing this, but it could be in there.

For a long time (years) I have had this in my config as well: (setq cider-print-fn 'puget), but after reading about the print improvements in 0.21.0 I commented it out. So if puget use could mask this, it may have been around for a long time.

Does that help/make sense?

@dpsutton
Copy link
Contributor

more info is always helpful :)

i'm thinking this has been there for a while and I may have been the one to break it a few years ago. The breaks in font-locking happen when the response is split over several messages:

(<--
  id         "10"
  session    "714f85bb-5674-4149-8a7e-a08d8f4318fd"
  time-stamp "2019-04-23 22:13:51.769192492"
  value      "({:a0 #uuid "5e7120c0-38a9-4087-a4b5-156d92337968",
  :b0 "S..."
)
(<--
  id         "10"
  session    "714f85bb-5674-4149-8a7e-a08d8f4318fd"
  time-stamp "2019-04-23 22:13:51.811907403"
  value      "bb-2cbf37bdf8a5",
  :b9 "String String String String String ..."
)
(<--
  id         "10"
  session    "714f85bb-5674-4149-8a7e-a08d8f4318fd"
  time-stamp "2019-04-23 22:13:51.813519873"
  value      "
  :b18 "String String String String String String String"}
..."
)

so here it begins a new message in the middle of a response with "bb...". And the font locking breaks there:
font-locking

I remember there's a function that has to do with context and ansi coloring and I'm looking for that now.

It would be nice if you could confirm if puget as pretty printer can suppress this behavior

@harold
Copy link
Contributor Author

harold commented Apr 24, 2019

Ok, just tried w/ puget, syntax highlighting gets lost at the exact same spot as in my first screenshot (see the first gray 'g' about halfway through the print).

So puget no effect as far as I can tell.

@Malabarba
Copy link
Member

I get the same bug in other types of buffer (like compilation-mode, or inf-ruby) whenever there's a lot of output being produced really fast. I think it's just a matter of emacs font-locking "skipping" portions of the buffer when it can't keep-up with the speed of the output.

@harold
Copy link
Contributor Author

harold commented Apr 29, 2019

I get the same bug in other types of buffer

Be that as it may, some recent change has exacerbated the problem for the Cider repl - since upgrading I experience this on basically every print, while before I don't recall ever having seen it.

@bbatsov
Copy link
Member

bbatsov commented May 11, 2019

I'm almost certain this was broken by the streamed printing. It's like the font-locking got applied to the individual chunks that were streamed in the REPL independently, instead of to the REPL buffer as a whole. Someone will have to dig a bit deeper into that problem.

@stale
Copy link

stale bot commented Aug 9, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

@stale stale bot added the stale label Aug 9, 2019
@bbatsov bbatsov added the high priority Tickets of particular importance label Aug 9, 2019
@stale stale bot removed the stale label Aug 9, 2019
@harold
Copy link
Contributor Author

harold commented Apr 24, 2020

@bbatsov - Is it true that you're being paid to work on this project again?

Found out today (on the anniversary of this issue Apr 23!).

This must be a sign.

Let's fix this issue! What can I do to help?

@bbatsov
Copy link
Member

bbatsov commented Apr 24, 2020

Must be a sign indeed! 😆 I'll see what I can do about this, but I first have to wrap up some new nREPL middleware that I'm working on currently.

@harold
Copy link
Contributor Author

harold commented Apr 24, 2020

Good, good, good. This is tremendous news. Long live CIDER.

@bbatsov
Copy link
Member

bbatsov commented Apr 25, 2020

I didn't mention this before, but a simple workaround is to just disable font-locking in the REPL completely:

(seq cider-repl-use-clojure-font-lock nil)

I'm almost certain this was broken by the streamed printing. It's like the font-locking got applied to the individual chunks that were streamed in the REPL independently, instead of to the REPL buffer as a whole. Someone will have to dig a bit deeper into that problem.

Seems I was right in the past. The potential solutions I see for this are:

  • add an option to disable streaming in the REPL, which would fix the font-locking, but degrade performance
  • try to find some way to force the font-locking of results to redone once they are fully sent (this would probably have some impact on performance as well)

@bbatsov
Copy link
Member

bbatsov commented Apr 25, 2020

For more context - the problem is here:

(defun cider-repl-emit-result (buffer string show-prefix &optional bol)
  "Emit into BUFFER the result STRING and mark it as an evaluation result.
If SHOW-PREFIX is non-nil insert `cider-repl-result-prefix' at the beginning
of the line.  If BOL is non-nil insert at the beginning of the line."
  (with-current-buffer buffer
    (save-excursion
      (cider-save-marker cider-repl-output-start
        (goto-char cider-repl-output-end)
        (when (and bol (not (bolp)))
          (insert-before-markers "\n"))
        (when show-prefix
          (insert-before-markers (propertize cider-repl-result-prefix 'font-lock-face 'font-lock-comment-face)))
        (if cider-repl-use-clojure-font-lock
            (progn
              (insert-before-markers (cider-font-lock-as-clojure string)))
          (cider-propertize-region
              '(font-lock-face cider-repl-result-face rear-nonsticky (font-lock-face))
            (insert-before-markers string)))))))

As you can see we font-lock the result in chunks (they weren't chunks before nREPL 0.6, though) and this is where the breakage is coming from. Anyways, I'll figure something out.

@bbatsov
Copy link
Member

bbatsov commented Apr 25, 2020

After playing with the code a bit more I'll probably just remove the font-locking completely. It's going to be a big mess to keep track of the result boundaries and fontify them, so I guess we're better off without this.

@harold
Copy link
Contributor Author

harold commented Apr 25, 2020

remove the font-locking completely

Does this imply losing syntax highlighting on printed results in the repl?

@harold
Copy link
Contributor Author

harold commented Apr 25, 2020

One more neophyte question, how is the chunk-size determined?

If I understand your diagnosis correctly, it seems like the likelihood of a font-locking error is inversely proportionate to this 'chunk size' (bigger chunks are more likely to be font-locked correctly because there are fewer boundaries per result).

If the chunk size were enormous, or perhaps configurable, one could tune it to encounter this bug less.

Thanks again for looking into this, anyone doing anything nontrivial with CIDER is hitting this every day.

@WorldsEndless
Copy link

I have just received with this issue for time immemorial. I haven't ever expected repl to give pretty feedback for all but small data. You're telling me there's supposed to be another way?

@bbatsov
Copy link
Member

bbatsov commented Apr 25, 2020

Does this imply losing syntax highlighting on printed results in the repl?

Yes. I simply don't see a simple way to preserve streaming the result in chunks and do the font-locking effectively. The current approach was meant for the case when the entire result comes in one piece.

If the chunk size were enormous, or perhaps configurable, one could tune it to encounter this bug less.

True, but that you defeat the purpose of streaming the result in chunks, right? :-) The chunk size is configurable, but so is using streaming or not. However, without streaming if you make a mistake and print something big this will lock-up both nREPL and CIDER, which is why we introduced this feature in the first place.

Thanks again for looking into this, anyone doing anything nontrivial with CIDER is hitting this every day.

Well, it's annoying for me as well, especially now that I'm actually doing some Clojure development again. :-)

I have just received with this issue for time immemorial. I haven't ever expected repl to give pretty feedback for all but small data. You're telling me there's supposed to be another way?

In the past the result was returned from the server in one piece and this was font-locked just before it was inserted in the REPL. Now the server splits the result in 1024 byte chunks (by default) and because they are font-locked individually this breaks the font-locking on anything bigger.

As noted above I don't see any reasonable way around this, as we can't really collect all results and print them in bulk - that'd undo the massive performance gains of streaming results in chunks and would make printing uninterruptible. I can write some logic that constantly goes back to the expression boundary and font-locks the code again, but that'd be super slow and quite complex.

That's why it seems to me this font-locking should go away completely. I can also expose the streaming settings to the end users, so they can disable it if they want to or increase the result chunk size.

@harold
Copy link
Contributor Author

harold commented Apr 25, 2020

I got it now. Thanks for the clarification. I think the screenshot in the original report is actually typical.

Syntax highlighting on printed results is too good to lose.

What about just cranking the default chunk size? Like, maybe 8x? Was the current size chosen carefully? Does it optimize some other concern?

If font locking breaks on ginormous prints that’s definitely better than bringing Emacs and cider down (we’ve all done that).

What do you think about just increasing the chunk size? Can you tell me how to do it locally so I can test for myself?

Welcome back into the fold, can’t express how sweet it is to have you back.

@dpsutton
Copy link
Contributor

is it conceivable to not font-lock on messages until we get a done message and then call (font-lock-fontify-region beginning-of-first-message point-at-done-message-time) or something equivalent?

@bbatsov
Copy link
Member

bbatsov commented Apr 25, 2020

It is, but it's not very easy and it will also result in you seeing bigger results without font-locking until they are streamed until the end. The main complexity comes from the fact you can't just take everything between the previous and the next prompt, as some of it might be output.

@dpsutton
Copy link
Contributor

good point. Does the result viewer functionality give us some breathing room? Maybe the repl output isn't font-locked but you could call some type of cider-inspect-last-result and see a font-locked-version if needed?

@bbatsov
Copy link
Member

bbatsov commented Apr 25, 2020

What about just cranking the default chunk size? Like, maybe 8x? Was the current size chosen carefully? Does it optimize some other concern?

Frankly, I don't remember how we chose the default. :D Maybe increasing it 4/8x is fine. 1k buffer-sizes are relatively small indeed.

What do you think about just increasing the chunk size? Can you tell me how to do it locally so I can test for myself?

You can redefine the cider--nrepl-print-request-map function like this:

(defun cider--nrepl-print-request-map (&optional right-margin)
  "Map to merge into requests that require pretty-printing.
RIGHT-MARGIN specifies the maximum column-width of the printed result, and
is included in the request if non-nil."
  (let* ((width-option (cider--print-option "right-margin" cider-print-fn))
         (print-options (thread-last
                            (map-merge 'hash-table
                                       `((,width-option ,right-margin))
                                       cider-print-options)
                          (map-pairs)
                          (seq-mapcat #'identity)
                          (apply #'nrepl-dict))))
    (map-merge 'list
               `(("nrepl.middleware.print/stream?" "1")
                  ("nrepl.middleware.print/buffer-size" "8192"))
               (when cider-print-fn
                 `(("nrepl.middleware.print/print" ,(cider--print-fn))))
               (when cider-print-quota
                 `(("nrepl.middleware.print/quota" ,cider-print-quota)))
               (unless (nrepl-dict-empty-p print-options)
                 `(("nrepl.middleware.print/options" ,print-options))))))```

It's current definition doesn't specify an explicit buffer-size, but I can expose this via a defcustom. 

@bsless
Copy link

bsless commented Apr 25, 2020

Is it possible to perhaps play dynamically with the quota (looking at replying-PrintWriter) such that streamed chunks will end at distinct Clojure objects? It might require double buffering but if the next object can only be partially added to the stream, maybe it should be moved to the next chunk?
This could enable maintaining correct syntax highlighting with higher probability with less need to increase buffer size (it might have to be temporarily exceeded for hefty object though)
What do you think, could this work?

@harold
Copy link
Contributor Author

harold commented Apr 26, 2020

Ok! We've learned a lot.

There are a couple of typos in what you shared, @bbatsov - but I got this working:

(defun cider--nrepl-print-request-map (&optional right-margin)
  "Map to merge into requests that require pretty-printing.
RIGHT-MARGIN specifies the maximum column-width of the printed result, and
is included in the request if non-nil."
  (let* ((width-option (cider--print-option "right-margin" cider-print-fn))
         (print-options (thread-last
                            (map-merge 'hash-table
                                       `((,width-option ,right-margin))
                                       cider-print-options)
                          (map-pairs)
                          (seq-mapcat #'identity)
                          (apply #'nrepl-dict))))
    (map-merge 'list
               `(("nrepl.middleware.print/stream?" "1")
                 ("nrepl.middleware.print/buffer-size" 8192))
               (when cider-print-fn
                 `(("nrepl.middleware.print/print" ,(cider--print-fn))))
               (when cider-print-quota
                 `(("nrepl.middleware.print/quota" ,cider-print-quota)))
               (unless (nrepl-dict-empty-p print-options)
                 `(("nrepl.middleware.print/options" ,print-options))))))

And with that, the original code in the report works nicely:
image

Now, 8192 is a number, but 16384, 32768, and 65536 are also numbers. The last one maybe being the best choice here as Gates once said, "640K ought to be enough for anyone." .. err maybe 1/10th that, or well, whatever.

Some of the more clever solutions proposed in this issue sound interesting, but I think for simplicity my vote would currently be for just cranking this print buffer size a bit. For sure, turning off font locking by default for repl prints is bad.

One other interesting tidbit I discovered in playing today is that there seems to be some kind of upper bound on the size of output that CIDER will attempt to font-lock at all.

This (with a suitably large buffer) works:
image

But this (regarless of the buffer size), just gives up:
image

This all seems super-reasonable to me; and I think we're dialing in toward something great here.

Thanks everyone.

@dpsutton
Copy link
Contributor

dpsutton commented Apr 26, 2020 via email

@bbatsov
Copy link
Member

bbatsov commented Apr 26, 2020

There are a couple of typos in what you shared

Sorry about this! I've updated my example. Code editing outside of Emacs is not my strong suit! :D

One other interesting tidbit I discovered in playing today is that there seems to be some kind of upper bound on the size of output that CIDER will attempt to font-lock at all.

There's no limit in CIDER's font-locking function, so I'm guessing there's some Emacs limit after which font-locking doesn't kick-in at all.

I'll think a bit more about the solution, but I'm leaning towards disabling this by default as I tend to value simplicity and consistency.

@harold
Copy link
Contributor Author

harold commented Apr 26, 2020

I'm leaning towards disabling this by default as I tend to value simplicity and consistency

To be clear, the current behavior (inconsistent font locking) is much better than having no font locking (syntax highlighting) on printed repl results.

I never thought logging this issue would lead to losing syntax highlighting functionality.

Increasing nrepl.middleware.print/buffer-size is minimally invasive, fixes the problem as originally reported, and preserves imporant CIDER functionality (human readability of printed repl results).

@pbwolf
Copy link

pbwolf commented Apr 26, 2020

Legibility is nice when the decorating can be quick and accurate. The larger the output, the less I care about decoration. (In other words, I should refine my query.) Best would be to decorate whole outputs if small (32k?), and for anything bigger, optimize ruthlessly for speed.

@practicalli-johnny
Copy link
Contributor

I stopped using the REPL buffer directly a long time ago. The cider-inspect tools are far quicker and more effective for browsing data, especially for nested or large results.
I've recently been working with Clojure versions of large GeoJSON files and they worked flawlessly in cider-inspect.
The way the inspector works means I don't feel the need for highlighting to understand the result.

All the other evaluation is done in the source code buffers.

@bbatsov
Copy link
Member

bbatsov commented May 19, 2020

For those curious about the solution - I just opted to check if the result is a balanced expression and font-lock it only in this case. Almost always this would mean that we're doing with a single chunk result that we can safely font-lock.

bbatsov added a commit that referenced this issue May 19, 2020
This makes it possible to stream results in bigger chunks (and as a corollary -
fontify as Clojure bigger results).
@bbatsov
Copy link
Member

bbatsov commented May 19, 2020

I've also introduced cider-print-buffer-size which is set to 4k by default. I'm wary of using bigger buffer sizes by default, but everyone can bump them if they want to. I'm wondering whether to allow pretty-printing without streaming, but I probably won't do it as I don't see many benefits of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help wanted high priority Tickets of particular importance
Projects
None yet
Development

No branches or pull requests

8 participants