Request: Official Tree-Sitter grammar #564
Comments
The fundamental complication is that Quarto allows code cells that should be parsed with a different grammar, and we don't control ahead of time which language will be allowed there. I'm not sure how to do that within tree-sitter outside of being able to dynamically generate tree-sitter files, or providing a (massive) tree-sitter definition that includes "most" languages someone might be interested in. With that said, it's probably not too hard for us to provide a Quarto tree-sitter grammar for the Markdown parts of Quarto as a small set of extensions over CommonMark. In practice, the syntax highlighting in VS Code, Emacs and RStudio is done with help from the editor. In VS Code, we use virtual docs. In Emacs, we use polymodes (I'm not sure about RStudio). I think a good editor experience will necessarily require help from Zed here.
Hi there. I may be wrong, but I believe that tree-sitter and the editors that use it include the concept of injections. That is, the grammar simply says "this block here is Python" and the editor then uses its Python grammar for it. It then falls on the user to have installed the plugins for R, Python and YAML. There is a bit more about it in this link. Tree-sitter also supports extending other grammars. That is, you can take the Markdown grammar and extend it, defining only what you add. How well that works I do not know.
Yes, the problem is that we don't know ahead of time which language it is, so a single .qmd tree-sitter grammar will have to decide, ahead of time, which languages it supports, and that's not how Quarto works.
Apologies if that came off terse. I am grateful for a great product and I was a bit unclear. I am 90% sure that the tree-sitter grammar for regular Markdown does not specify a list of languages it supports for the code blocks. And as tree-sitter is extendable and there is the possibility of npm requiring the Again. Thank you for an amazing product. I especially loved seeing the Typst support added a few releases back.
My responses are not meant to imply we won't do it! Rather, I'm attempting to scope the problem.
I think, but again am not sure, that in that case it's not the tree-sitter grammar itself that's doing the forwarding work, but something in Zed controlling the "grammar injection", and this is what I meant by "I think a good editor experience will necessarily require help from Zed here." Otherwise, where's the knowledge that In any case, I think we'd start by finding a CommonMark grammar and extending it. We have internal reasons to want to do so, but it's not going to happen in the immediate future.
I understand it as being both. Injections are a tree-sitter directive that tells the editor: "find a parser that handles this language and parse this block with it". The editor still needs to have said parser, i.e. the user would have had to install that extension. I guess it's a mode command. So in a way it's the editor. But there is nothing Zed-specific about it, and NVim probably handles things in the same manner.
I had not thought of that. Yes in the case of ojs it would have required the javascript extension to specify speaks
I understand. As it's pretty early days for the REPL in Zed, there is no rush on it from my perspective. I mostly still work in NVim when it comes to data science (VSCode does not like batteries) but every now and then I open.
@cscheid you can think of the injection as being a two step process. The quarto tree-sitter grammar would parse the file to the best of its abilities, recognizing Zed would take the resulting tree and do a second pass over certain sections of it. It would use an injection "query" like this one to find all of the fenced code blocks in the document: And it uses the As a side note, note that tree-sitter supports both dynamic and static language detection:
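The two-step process described above can be sketched in plain Python. This is only an illustration, not tree-sitter's or Zed's actual implementation: in a real grammar the first pass is done by the generated parser and the second is driven by an injection query (tree-sitter's documented capture names are `@injection.content` and `@injection.language`). The regexes, the `Region` type, the `find_regions` and `inject` helpers, and the `installed` parser registry are all hypothetical names made up for this sketch. It shows static detection (front matter is always treated as YAML) next to dynamic detection (the language name is read from each fence).

```python
import re
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    language: str  # name the editor would use to look up a parser
    text: str      # raw content handed to that parser


# Opening fence, optional "{" (Quarto cell syntax), language name, body, closing fence.
FENCE = re.compile(
    r"^`{3}[ \t]*\{?(?P<lang>[A-Za-z0-9_+-]+)[^\n]*\n(?P<body>.*?)^`{3}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)
# Front matter block at the very start of the document.
FRONT_MATTER = re.compile(
    r"\A---\n(?P<body>.*?)^---[ \t]*$", re.MULTILINE | re.DOTALL
)


def find_regions(doc: str) -> List[Region]:
    """Pass 1: the host grammar marks regions without parsing their content."""
    regions: List[Region] = []
    front = FRONT_MATTER.match(doc)
    if front:
        # Static detection: the language is fixed by the grammar author.
        regions.append(Region("yaml", front.group("body")))
    for cell in FENCE.finditer(doc):
        # Dynamic detection: the language is whatever the document says.
        regions.append(Region(cell.group("lang").lower(), cell.group("body")))
    return regions


def inject(
    regions: List[Region],
    installed: Dict[str, Callable[[str], object]],
) -> List[Tuple[str, Optional[object]]]:
    """Pass 2: the editor looks up a parser per region; unknown languages stay unparsed."""
    results: List[Tuple[str, Optional[object]]] = []
    for region in regions:
        parser = installed.get(region.language)
        results.append((region.language, parser(region.text) if parser else None))
    return results


TICKS = "`" * 3  # built at runtime so this example does not contain a literal fence
SAMPLE = "\n".join([
    "---", "title: demo", "---", "",
    TICKS + "{python}", "print(1)", TICKS, "",
    TICKS + "{ojs}", "viewof x = 1", TICKS, "",
])

if __name__ == "__main__":
    # Pretend only a Python parser is installed in the editor.
    installed = {"python": lambda src: ("python-tree", src)}
    for language, tree in inject(find_regions(SAMPLE), installed):
        print(language, "parsed" if tree is not None else "no parser installed")
```

The point of the split is that the host grammar never lists the languages it supports. It only reports a name and a span, and whether that span gets parsed depends on which parsers the user has installed in the editor.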
@DavisVaughan Thanks, this is very helpful. (I'm going to follow up with you at some point in 2025 when we have the cycles to do this.)
There are a few attempts at a tree-sitter grammar for Quarto out there, but they tend to fall short or are barely usable.
Most major editors now support (sometimes exclusively) tree-sitter grammars, so an official grammar would make supporting Quarto in Zed much easier.