Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using tree-sitter in csharp-mode? #201

Closed
theothornhill opened this issue Dec 7, 2020 · 29 comments
Closed

Using tree-sitter in csharp-mode? #201

theothornhill opened this issue Dec 7, 2020 · 29 comments
Labels
font-lock tree-sitter Solved by using csharp-tree-sitter-mode

Comments

@theothornhill
Copy link
Collaborator

I have been following emacs-tree-sitter development for a while, and after watching https://emacsconf.org/2020/talks/23/, I feel we at least should talk about this feature being incorporated into csharp-mode after #200 and numerous other issues.

What would be the benefits?

  1. No reliance on CC Mode
  2. Highlighting would be easier to configure
  3. Indentation could be provided (I think?)
  4. More?

This would be yet another rewrite - yay! However, csharp-mode is in a kind of good place, in that what it provides is mostly indentation and highlighting, leaving most of the smarts to lsp-mode. This is obviously a wishlist feature.

CC @ubolonton (Just letting you know this issue exists)

@josteink
Copy link
Collaborator

josteink commented Dec 7, 2020

There has been similar discussions for typescript-mode too, although nobody has stepped up to do the actual work 👍

emacs-typescript/typescript.el#4

@theothornhill
Copy link
Collaborator Author

There has been similar discussions for typescript-mode too, although nobody has stepped up to do the actual work 👍

emacs-typescript/typescript.el#4

I'll look i to it 😄

@theothornhill
Copy link
Collaborator Author

theothornhill commented Dec 11, 2020

I've started to look into this, and have created a branch - tree-sitter to develop this feature.
I based it on emacs-tree-sitter/elisp-tree-sitter#70.

What I've done so far:

  • Comment out all my cc mode code (yay!)
  • Enable one proof of concept highlighting

To try this you need to use my fork of emacs-tree-sitter or wait until emacs-tree-sitter/elisp-tree-sitter#79 is merged

I think the simplest transition would be to keep the indentation provided by CC Mode until we have some tree-sitter-indentation implementation to rely on.

Yet another rework! 😄

For context:
image

Edit:
This is fun 😄
image

@theothornhill
Copy link
Collaborator Author

More progress:
image

@josteink
Copy link
Collaborator

josteink commented Dec 13, 2020

I just tried running the branch, but it seems I'm doing something wrong.

When trying to activate it, I'm not getting any highlighting at all, I just instead the following in my logs:

File mode specification error: (error No language registered for major mode ‘csharp-mode’)

Probably an easy fix, or a quick one-liner I'm missing, but it's not currently runnable OOB.

Edit: After reading some docs and some of your PRs against emacs-tree-sitter I found the following to work:

;;;###autoload
(add-to-list 'tree-sitter-major-mode-language-alist '(csharp-mode . c_sharp))

Should I add it to our branch for now, or should we just assume it will be merged upstream soon enough?

@josteink
Copy link
Collaborator

After playing slightly with this code, I'm very much sold on this.

  • we've externalized the general "recursive descent parsing" problem to fast native code
  • we've externalized bindings to that native code to another module
  • we've externalized the language-specification to a third party

All we have to do is tell Emacs how we want that language to look inside our major-mode. And that's really all we should be focusing on. Perfect!

For instance, I took note that attributes are rendered differently with and without arguments:

image

I looked at the (to me!) completely alien major-mode definition, found this:

image

Noticed and fixed an inconsistency:

image

And voila!

image

This really couldn't be easier.

Compared to any other Emacs major-mode I have worked with, this was by far the simplest fix. I'm all in awe!

So:

  • Nice job on suggesting the initiative.
  • Nice job on putting in the initial research!
  • Nice job actually following through with a complete major-mode, and not just a POC!

This almost has me rooting for a similar solution for typescript-mode!

My only concern/regret about such an approach is the dependency on native executables, and what (few) platforms they currently exist for, compared to the wide array of platforms Emacs itself supports.

But again, I guess that can be solved upstream, and if the use of tree-sitter-mode in major-modes start increasing, there will be a natural push, by more people than just me, for better platform support (*BSD, ARM64, etc).

@theothornhill
Copy link
Collaborator Author

Great! I'm so glad to hear that!

My only concern/regret about such an approach is the dependency on native executables, and what (few) platforms they currently exist for, compared to the wide array of platforms Emacs itself supports.

But again, I guess that can be solved upstream, and if the use of tree-sitter-mode in major-modes start increasing, there will be a natural push, by more people than just me, for better platform support (*BSD, ARM64, etc).

Yeah, this is an issue, and we need to find a good solution for this.

Right now I've just hurled everything into one spot, as you can see. I think we should factor this one out in separate variables and then just ,@ everything into the proper variable. It should be super easy for people to find inconsistencies (like you did - I'm sure there are lots of them) and make simple prs. No more relying on monkey patches etc.

Yeah, tree sitter is amazing. I'm looking into tree-sitter-indent as we speak as well :)

@josteink
Copy link
Collaborator

Just sent in 2 patches to your branch! Attributes and namespaces.

This shit is simple!

@theothornhill
Copy link
Collaborator Author

theothornhill commented Dec 13, 2020

Just fire away! We can extract stuff later.

Some things I've noted:

  • modifiers could be both supplied as a syntax construct, or as a string. We have to supply the try, but not the internal. I think we should try to keep as little as possible in the "catch-all" at the top

  • some constructs have to be defined several times. ex:

         (parameter
          type: (identifier) @type
          name: (identifier) @variable)
         (parameter (identifier) @variable)

This is intentional. Also:

@theothornhill
Copy link
Collaborator Author

I opened an issue here:
https://codeberg.org/FelipeLema/tree-sitter-indent.el/issues/1
in this repo: https://codeberg.org/FelipeLema/tree-sitter-indent.el/

I feel like I'm missing something here, but I have a proof of concept for indentation also 🎉 . I hope @FelipeLema can help us out here. If we can iron out some bugs (or teach me how to read) I think we can ditch CC Mode pretty soon(tm)

@theothornhill
Copy link
Collaborator Author

Right now this code indents correctly when typing it out manually

namespace Foo
{
    public class Bar
    {
        public Baz Quux()
        {
            // ...
            var test = new Generic();
            var text = new Thing.Thang
            {
                a,
                b,
                c
            };
            yield return new InnerA.InnerB
            {
                Boo = "foo",
                May = "Yay"
            };
            
        }
    }
}

Marking the whole thing and pressing TAB yields some strange behavior.

However, note that the yield part indents correctly! It doesn't on master, and required huge hacks (if I recall correctly) from before rework. So we are already harvesting benefits here. From very limited late night hacking 🪓

@theothornhill
Copy link
Collaborator Author

@josteink If you want, you can test the branch now both for indentation and coloring, and add things as you see fit. Most of the infrastructure should be usable, and we have the same "config-based" rules for indentation!

One small tip: If something doesn't indent properly, try making it syntactically valid first I.E add a semi after a brace. See the issue over at codeberg for more details 👍

Most of the code in indentation-tests indents correctly as you type. It's still rough around the edges and needs to be cleaned up of course :) Just letting you know the status

@theothornhill theothornhill changed the title Using tree-sitter in charp-mode? Using tree-sitter in csharp-mode? Dec 18, 2020
@josteink
Copy link
Collaborator

Found this gem here when doing a fat finger:

image

Hopefully not too difficult to figure out :)

@theothornhill
Copy link
Collaborator Author

oOo!

My guess is that this is pretty hard :P We have no logic for dealing with that, now at least. I'll look into this :)

Side note: Lots of changes coming :)

@theothornhill
Copy link
Collaborator Author

I added back your interface code. It was a hassle with the defcustoms, much easier to iterate on highlighting when everything is in one place. Sorry for the back and forth. We will factor it out when we are satisfied in the end.

@theothornhill
Copy link
Collaborator Author

theothornhill commented Dec 20, 2020

Good news! I think we have feature parity, both with indentation and highlighting! 🤞

Caveat: You need https://codeberg.org/theo/tree-sitter-indent.el/ (fork of FelipeLemas thing) for the latest changes. They haven't been merged yet. However, it should be a simple eval-buffer to get that.

Now indentation-tests doesn't alter buffer when I use C-x h TAB. I think that is a huge step forward!

One thing to note, which is a little weird:

Consider this code:

namespace Foo
{
|                                     // This is how things are opened - point is '|'
}

If I start typing something, then we get the extra indentation. I'm thinking of ways to get tree-sitter-indent to do this immediately, but haven't figured it out yet. However, if you type something, then press RET we have correct indentation, curtesy of electric-indent

Why don't you test it a little? I'm a little eager to get this merged into csharp-mode, documenting how to adjust highlighting etc so that people can start to contribute. I do want it to be functional though.

(I'm also thinking of stealing the indent part - it's only a couple hundred LOC and making it C# specific. However, I think the community could need some debugging on that thing to make it super robust 😄 )

@theothornhill
Copy link
Collaborator Author

Found this gem here when doing a fat finger:

image

Hopefully not too difficult to figure out :)

I don't think this is solvable: https://github.com/tree-sitter/tree-sitter-c-sharp/blob/master/grammar.js#L1471

Roslyn doesn't seem to acknowledge scandinavia, I'm afraid

@josteink
Copy link
Collaborator

Given how that is legal, compilable code (even though bad form), that seems a bit odd. Oh well.

Good researching 🙂

@theothornhill
Copy link
Collaborator Author

Ok I stole the tree-sitter-indent. Now it works OOTB. Will still contribute upstream, but it is faster to iterate this way :)

@theothornhill theothornhill added the tree-sitter Solved by using csharp-tree-sitter-mode label Dec 20, 2020
@intrasight
Copy link

Question: does this require installing tree-sitter?

https://ubolonton.github.io/emacs-tree-sitter/installation/

@josteink
Copy link
Collaborator

Question: does this require installing tree-sitter?

https://ubolonton.github.io/emacs-tree-sitter/installation/

I believe our dependency on Emacs-tree-sitter should ensure we get a bundled version of the actual tree-sitter binary for all "common" platforms.

What OS are you using?

@theothornhill
Copy link
Collaborator Author

You don’t need any external tree sitter related packages. Everything should be included with the tree sitter branch.

However, we depend on ubolontons mode as well

@theothornhill
Copy link
Collaborator Author

Closing after 3cff337

@seagle0128
Copy link

Sorry guys! I got this error with the latest version if I didn't install tree-sitter.

Error loading autoloads: (void-variable tree-sitter-major-mode-language-alist)

@theothornhill
Copy link
Collaborator Author

Do you get the error if you manually install tree-sitter first? If not, I’ll update readme to be clearer about this behavior. For now I don’t want to introduce dependencies to csharp mode.

@jcs090218
Copy link
Collaborator

@seagle0128 Are you sure you are installing csharp-mode with the latest version? I have opened this issue at #208 and it should resolved by #209. You should probably see the following error log if you don't have tree-sitter installed.

Error (bytecomp): Cannot open load file: No such file or directory, tree-sitter

@seagle0128
Copy link

Thanks, @theothornhill @jcs090218 .

Now I got this message while installing csharp-mode.

Cannot open load file: No such file or directory, tree-sitter

@jcs090218
Copy link
Collaborator

jcs090218 commented Jan 3, 2021

@seagle0128 Notice that tree-sitter support is experimental and this isn't an official yet. You may have to manually install package tree-sitter to resolve this issue.

By seeing csharp-mode.el it does not have Package-Requires specify.

@seagle0128
Copy link

Gocha, glad to know. Thanks, @jcs090218 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
font-lock tree-sitter Solved by using csharp-tree-sitter-mode
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants