Make Merge behave consistently #4042

pascalkuthe · 2022-09-29T21:29:09Z

Supersedes #3695.

Currently, the HighlightEvent merge function overlays spans Vec<(usize, std::ops::Range<usize>)>`` on an Iterator of HighlightEvent`. The current merge doesn't handle overlapping spans well:

diagnostic ranges may be discarded if they overlap
ghost text may appear when a language server sends overlapping diagnostics

After a discussion with @the-mikedavis we realized that the solution in #3695 has a problem:
The spans must remain in a certain order to allow (somewhat performant) conversion to HighlightEvents.
However, with #3015 we usually want to show the highest priority diagnostic on top (so that an error is not overlapped by a hint for example). This is not possible with existing approach or the one in #3695.

However, a few key realizations lead to an alternative approach:

Implementing an overlay for non-overlappig spans is trivial and always produces the correct result
overlapping spans with the same Highlight scope can be merged together so that the resulting spans do not overlap
The only case where overlapping spans occur are diagnostics. However, there are only 5 different Highlight scopes for diagnostics: One for each severity. Therefore, the diagnostics can be separated into 5 sets of spans that all share the same highlights. These can then be used as an overlay as explained above.

This PR implements that approach.
The overlay I am referring to here is essentially merge (renamed it to make it more clear).
I implemented it from scratch because the old implementation was more complex than it needed to be and it was easier to integrate the merging directly into the iterator instead of handeling it in a seperate function.
Compared to previous implementation it should also perform a bit better (does not use any allocations for example).
This implementation can also automatically merge overlapping spans under the assumption that they have the same highlight. Therefore, there are two functions that produce such an iterator:

monotonic_overlay: accepts Spans (range + Highlight) that must be monotonically ascending (no overlaps and sorted by start)
overlapping_overlay: accepts a single Highlight and multiple Ranges. (sorted by start)

The overlapping_overlay is used to layer the diagnostic levels and the monotonic_overlay is used for cursor/selections (and in the future rainbow brackets).

As we are now layering up to 6 iterators on top of each other, I was a bit concerned about performance.
However, the new overlay implementation is very performant so the overhead is very small.
Furthermore, I refactored the calls to the iterators a bit to completely avoid dynamic dispatch.
When dynamically dispatching an iterator a v-table lookup is necessary for each call to next (occurs in the hot loop).
While LTO might theoretically move this out of the loop, the code is pretty complex so its probably not happening.
Benchmarking the syntax highlighting code is kind of tricky but my -not very scientific- glances at flamegraphs show that this is probably neutral or an improvement to performance.

I have not written an extensive test-tsuite as @the-mikedavis did for 3695 #because the algorithm is so simple that one can easily reason about all possible scenarios.
In particular all test-cases used there trivially work here.
The old merge implementation was more complex and didn't have any tests either.
However, a test-case that covers #2248 would probably be good, although I struggle to come up with one.
I am certain that this PR fixes that issue but until I can test it properly I am marking this as a draft.
This definitely fixes #2248

It might be good to rebase #2857 on this as that will probably produce the most overlays when thats merged.
However, the conflicts looked pretty daunting so I didn't get into that yet.
I merge #2857 with this and have not found any problems

helix-term/src/ui/editor.rs

the-mikedavis · 2022-10-02T17:32:30Z

helix-core/src/syntax.rs

+    fn next(&mut self) -> Option<Self::Item> {
+        let res = match self.0.next()?.unwrap() {
+            // TODO: use byte slices directly
+            // convert byte offsets to char offset
+            HighlightEvent::Source { start, end } => {
+                let text = self.0.source;
+                let start = text.byte_to_char(ensure_grapheme_boundary_next_byte(text, start));
+                let end = text.byte_to_char(ensure_grapheme_boundary_next_byte(text, end));
+                HighlightEvent::Source { start, end }
+            }
+            event => event,
+        };
+
+        Some(res)
+    }


The old maps on HighlightIter were doing two things:

Unwrapping the result (which currently always returns Ok)

I'm concerned that helix_core is not the appropriate place to be unwrapping. HighlightIter returns Result items so that in the future we can implement cancellation of highlighting (timer based for example). There isn't a concrete plan for that AFAIK so it's hard to say how/where the Result should be handled.

Translating bytes to chars in Source events.

@archseer does the TODO for using byte slices applies anymore? I think we can just translate bytes to chars in the Iterator implementation for HighlightIter

I'm concerned that helix_core is not the appropriate place to be unwrapping. HighlightIter returns Result items so that in the future we can implement cancellation of highlighting (timer based for example). There isn't a concrete plan for that AFAIK so it's hard to say how/where the Result should be handled.

I have moved unwrapping the closure back into helix-term in 25427ce altough it make the code simpler if we simply returned HighlightEvent from the HighlightIterator (panicking directly in the iterator) and then switched to a result once its actually required (because an unwrap is required everywhere the HighlighterIterator is used all call-sites will have to be changed in the future anyway). What do you think is better

@archseer does the TODO for using byte slices applies anymore? I think we can just translate bytes to chars in the Iterator implementation for HighlightIter

In markdown.rs the highlighting directly indexes into a string and therefore uses byte indices. Using char indices and translating back would be inefficient in that case so I am not sure there is a better solution right now.
Rewriting the rendering to use byte indices seems to amount to translating byte indices to chars anyways so there might not be a better approach.

helix-core/src/syntax/overlay.rs

helix-core/src/syntax/overlay/test.rs

helix-term/src/ui/editor.rs

the-mikedavis · 2022-10-02T18:49:07Z

helix-term/src/ui/editor.rs

-                )
-            })
-            .collect()
+        let diagnostic_ranges = doc.diagnostics().iter().filter_map(move |diagnostic| {


We end up filtering doc.diagnostics() for each severity, right? Could we iterate over it once and separate out the diagnostics into a HashMap<Severity, SmallVec<Range<usize>> to save some work?

I think the extra allocations would outweigh the cost of filtering here. I think SmallVec would be even worse because you essentially get an extra branch on every call to next() so the overhead on the iteration would be similar to just filtering.

helix-term/src/ui/editor.rs

the-mikedavis · 2022-10-02T18:54:55Z

helix-term/src/ui/editor.rs

+pub enum DocSyntaxHighlight<'d> {
+    TreeSitter(CharacterHighlightIter<'d>),
+    Default(iter::Once<HighlightEvent>),
+}


This is so that we can use static dispatch for everything, right? As opposed to a Box of a dyn trait implementation?

This enum makes the uses of doc_syntax_highlights awkward (looking at the usages in picker.rs and here in editor.rs) and I think what we lose in elegance of the code might not be worth the performance improvement.

I agree that this was a bit ugly, this is one of the rare cases where the limitations of the rust type system are kind of showing (generic assoicated types would help here or generic closures in general).

I managed to implement an alternative solution in 2c6f681 that turned out quite good.
The idea is that we never return the iterator and instead call the rendering code in the same function.
An Overlay (like diagnostics, selections) are passed as parameter that implements the HighlightOverlayso that they are generic over the input Iterator. HighlightOverlay are composable using tuples. For example the diagnostics Layers are implemented by creating a tuple of 5 DiagnosticOverlays (hint_overlay, info_overlay, unkown_overlay, warning_overlay, error_overlay).

Co-authored-by: Michael Davis <mcarsondavis@gmail.com>

the-mikedavis · 2022-10-05T15:30:14Z

I hacked together an FPS counter for the render loop: the-mikedavis@02ae7ad and this change looks costly based on those rough numbers. I see master at 868 FPS and this branch at 805 FPS for the same code when there are no diagnostics. It seems like there is some amount of overhead for overlaying an empty vec on the highlight stream that I think can be avoided.

I have some local changes that take care of the merging of overlapping diagnostics within Editor::doc_diagnostics_highlights which seems to be fast; I'll make a PR soon. I think it should be possible to solve this with minimal or no changes to Merge and maybe come out with a performance boost when there are no diagnostics.

pascalkuthe · 2022-10-05T22:26:35Z

closed in favor of #4113

pascalkuthe force-pushed the highlight-overlay branch from 67d8401 to 8929f6a Compare September 29, 2022 21:30

kirawi added A-core Area: Helix core improvements S-waiting-on-review Status: Awaiting review from a maintainer. labels Sep 29, 2022

pascalkuthe marked this pull request as draft September 29, 2022 21:35

implement consistent behaviour for highlight overlays

5b098c3

pascalkuthe force-pushed the highlight-overlay branch from 8929f6a to 5b098c3 Compare September 29, 2022 21:41

pascalkuthe commented Sep 29, 2022

View reviewed changes

helix-term/src/ui/editor.rs Outdated Show resolved Hide resolved

This was referenced Oct 1, 2022

Span merge review the-mikedavis/helix#3

Closed

Rainbow tree-sitter matches 🌈 #2857

Draft

pascalkuthe marked this pull request as ready for review October 1, 2022 11:41

the-mikedavis reviewed Oct 2, 2022

View reviewed changes

pascalkuthe and others added 8 commits October 3, 2022 20:51

Apply stylistic suggestions from code review

ed3cdd3

Co-authored-by: Michael Davis <mcarsondavis@gmail.com>

more elegant static dispatch

2c6f681

remove unnecessary close from DiagnosticsOverlay

c3f652f

move unwrap from helix-core to helix-term

25427ce

move contents of overlay/tests.rs to overlay.rs

1f26bca

fixup overlay test documentation

87f1a17

check overlay invariants

c2bdbbf

add missing assert

7ed579d

the-mikedavis mentioned this pull request Oct 5, 2022

Overlay all diagnostics with highest severity on top #4113

Merged

pascalkuthe closed this Oct 5, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make Merge behave consistently #4042

Make Merge behave consistently #4042

pascalkuthe commented Sep 29, 2022 •

edited

Loading

the-mikedavis Oct 2, 2022

pascalkuthe Oct 3, 2022

the-mikedavis Oct 2, 2022

pascalkuthe Oct 3, 2022

the-mikedavis Oct 2, 2022

pascalkuthe Oct 3, 2022

the-mikedavis commented Oct 5, 2022

pascalkuthe commented Oct 5, 2022

Make Merge behave consistently #4042

Make Merge behave consistently #4042

Conversation

pascalkuthe commented Sep 29, 2022 • edited Loading

the-mikedavis Oct 2, 2022

Choose a reason for hiding this comment

pascalkuthe Oct 3, 2022

Choose a reason for hiding this comment

the-mikedavis Oct 2, 2022

Choose a reason for hiding this comment

pascalkuthe Oct 3, 2022

Choose a reason for hiding this comment

the-mikedavis Oct 2, 2022

Choose a reason for hiding this comment

pascalkuthe Oct 3, 2022

Choose a reason for hiding this comment

the-mikedavis commented Oct 5, 2022

pascalkuthe commented Oct 5, 2022

pascalkuthe commented Sep 29, 2022 •

edited

Loading