Start macro expansion chapter #26
@nikomatsakis I don't really know anything about hygiene, proc macros, or custom derive, but I added a bit about macros-by-example, and left TODOs for the rest...
This is great! Thanks =) I left some small suggestions
src/macro-expansion.md
Outdated
Macro expansion happens during parsing. `rustc` has two parsers, in fact: the
normal Rust parser, and the macro parser. During the parsing phase, the normal
Rust parser will call into the macro parser when it encounters a macro. The
Can you be more precise about what a reference to a macro is? E.g., do you mean a macro invocation, like `foo!(...)`?
Also, is it really called from the parser? I thought there was a second phase that came after parsing, but maybe I'm going to learn something here =)
Let me verify that :)
Ok, so it looks like
- the macro parser is called from `macro_rules::compile` (https://github.com/rust-lang/rust/blob/ca9cf3594ab25d2809ac576dfc9defb8e87b45b8/src/libsyntax/ext/tt/macro_rules.rs#L184), which transforms a macro invocation into a syntax extension. `macro_rules::compile` is called from `librustc_resolve/macros.rs` and `librustc_resolve/build_reduced_graph.rs`... which I'm guessing is doing name resolution? Does this run after the parser?
src/macro-expansion.md
Outdated
normal Rust parser, and the macro parser. During the parsing phase, the normal
Rust parser will call into the macro parser when it encounters a macro. The
macro parser, in turn, may call back out to the Rust parser when it needs to
bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to
here, you mean when the macro is trying to parse the contents of the macro invocation against one of the macro arms?
src/macro-expansion.md
Outdated
Basically, the macro parser is like an NFA-based regex parser. It uses an
algorithm similar in spirit to the [Earley parsing
algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro parser is
defined in `src/libsyntax/ext/tt/macro_parser.rs`.
can we make links into GH here (master branch)? this at least allows us to detect if those links rot
src/macro-expansion.md
Outdated
Rust parser will call into the macro parser when it encounters a macro. The
macro parser, in turn, may call back out to the Rust parser when it needs to
bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to
be explained. The code for macro expansion is in `src/libsyntax/ext/tt/`.
can we make links into GH here (master branch)? this at least allows us to detect if those links rot
src/macro-expansion.md
Outdated
bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to
be explained. The code for macro expansion is in `src/libsyntax/ext/tt/`.

### The macro parser
as a meta-comment, I think it's a good idea to start out with some kind of concrete example and walk it through. For example:
Imagine we have a macro:
    macro_rules! foo {
        ($metavariable:tt) => { ... }
    }
Now you can reference this example from the text below.
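A minimal sketch of what such a running example could look like when made runnable. The body of the arm is an assumption (the suggestion above elides it with `...`); here it just turns the bound token tree back into a string so the binding is visible.

```rust
// Hypothetical running example: one arm whose matcher is a single
// `tt` metavariable. The transcriber body is made up for illustration.
macro_rules! foo {
    ($metavariable:tt) => {
        stringify!($metavariable)
    };
}

fn main() {
    // `bar` is a single token, so `$metavariable` binds to it directly.
    println!("{}", foo!(bar));
    // A parenthesized group is also a single token tree, so this
    // matches the same arm even though it contains several tokens.
    println!("{}", foo!((a + b)));
}
```

The text can then refer to `foo!(bar)` as "the example invocation" and to `$metavariable:tt` as "the example matcher".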
src/macro-expansion.md
Outdated
parse different types of metavariables, such as `ident`, `block`, `expr`, etc.,
the macro parser must sometimes call back to the normal Rust parser.

Interestingly, both definitions and invokations of macros are parsed using the
Spelling: invocations
src/macro-expansion.md
Outdated
_using the macro parser itself_.

When the compiler comes to a macro invokation, it needs to parse that
invokation. This is also known as _macro expansion_. The same NFA-based macro
Spelling: invocation
src/macro-expansion.md
Outdated
When the compiler comes to a macro invokation, it needs to parse that
invokation. This is also known as _macro expansion_. The same NFA-based macro
parser is used that is described above. Notably, the "pattern" (or _matcher_)
used is the first token tree extracted from the rules of the macro _definition_.
this is where the running example would be really handy
src/macro-expansion.md
Outdated
parser is used that is described above. Notably, the "pattern" (or _matcher_)
used is the first token tree extracted from the rules of the macro _definition_.
In other words, given some pattern described by the _definition_ of the macro,
we want to match the contents of the _invokation_ of the macro.
Spelling: invocation
src/macro-expansion.md
Outdated
that non-terminal. Then, the macro parser proceeds in parsing as normal.

For more information about the macro parser's implementation, see the comments
in `src/libsyntax/ext/tt/macro_parser.rs`.
link to repo
BTW, there's a very interesting discussion about hygiene and proc-macro in rust-lang/rust#45934
@nikomatsakis I updated the chapter (a lot). I think I have addressed your comments. Let me know. Thanks! Also, copying this here, because the comment above is "outdated":
Ok, so it looks like
@federicomenaquintero Would you be interested in filling in some of the TODOs? I want to learn how they all work, but I don't have the bandwidth in the near future...
Yes, it does
src/macro-expansion.md
Outdated
`$mvar` is called a _metavariable_. Unlike normal variables, rather than binding
to a value in a computation, a metavariable binds _at compile time_ to a tree of
_tokens_. A _token_ zero or more symbols that together have some meaning. For
A token zero or more symbols that together have some meaning.
This sentence is not grammatical and I'm not quite sure how to fix it. =) In particular, I don't think of a token as "zero or more symbols" (and it's sort of unclear to me what you mean by symbol, which in parsing terminology is often used to mean the union of token and nonterminal).
I think I would maybe say something like this:
"A token is a single "unit" of the grammar, such as an identifier (e.g., `print`) or punctuation (e.g., `=>`). Token trees result from paired parentheses-like characters (`(...)`, `[...]`, and `{...}`) -- they include the open and close and all the tokens in between (we do require that parentheses-like characters be balanced)."
but it doesn't seem like the best either :)
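To make the token vs. token tree distinction concrete, here is a rough sketch. Everything in it (`TokenTree`, `closer`, `build_trees`) is a toy model invented for this comment, not the types rustc actually uses, and it assumes the input has already been split into tokens.

```rust
// Toy model only: names and shapes here are assumptions for illustration.
#[derive(Debug, PartialEq)]
enum TokenTree {
    // A single token, such as an identifier or a piece of punctuation.
    Token(String),
    // An opening delimiter plus every tree up to its matching close.
    Delimited(char, Vec<TokenTree>),
}

// Returns the closing delimiter that pairs with `open`.
fn closer(open: char) -> char {
    match open {
        '(' => ')',
        '[' => ']',
        _ => '}',
    }
}

// Groups a flat token list into token trees.
// Returns `None` if the delimiters are not balanced.
fn build_trees(tokens: &[&str]) -> Option<Vec<TokenTree>> {
    // Each entry holds the delimiter that opened it (`None` at top level)
    // and the trees collected inside it so far.
    let mut stack: Vec<(Option<char>, Vec<TokenTree>)> = vec![(None, Vec::new())];
    for &tok in tokens {
        match tok {
            "(" | "[" | "{" => {
                stack.push((tok.chars().next(), Vec::new()));
            }
            ")" | "]" | "}" => {
                let (open, trees) = stack.pop()?;
                let open = open?; // a close at top level is unbalanced
                if closer(open) != tok.chars().next()? {
                    return None; // mismatched pair, e.g. `( ]`
                }
                stack.last_mut()?.1.push(TokenTree::Delimited(open, trees));
            }
            _ => stack.last_mut()?.1.push(TokenTree::Token(tok.to_string())),
        }
    }
    let (open, trees) = stack.pop()?;
    if open.is_some() {
        return None; // something was opened but never closed
    }
    Some(trees)
}

fn main() {
    // Two top-level trees: the token `print` and one delimited group.
    println!("{:?}", build_trees(&["print", "(", "foo", "=>", "bar", ")"]));
}
```

The point of the sketch is only that a delimited group counts as one tree, which is why a `tt` metavariable can bind to a whole `(...)` group.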
This is really nice. I left a few nits.
src/macro-expansion.md
Outdated
The process of expanding the macro invocation into the syntax tree
`println!("{}", foo)` and then expanding that into a call to `Display::fmt` is
called _macro expansion_, it is the topic of this chapter.
Nit: the word `it` is not needed here
In the analogy of a regex parser, `tts` is the input and we are matching it
against the pattern `ms`. Using our examples, `tts` could be the stream of
tokens containing the inside of the example invocation `print foo`, while `ms`
might be the sequence of token (trees) `print $mvar:ident`.
tying back to the example is 💯
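For readers who want to see the `tts` against `ms` idea in code, here is a deliberately simplified sketch. It is not the NFA-based algorithm in `macro_parser.rs`: it handles only literal tokens and `ident` metavariables, with no repetition and no backtracking. `Matcher`, `is_ident`, and `match_tokens` are hypothetical names made up for this illustration.

```rust
use std::collections::HashMap;

// Toy matcher pieces: a literal token or an `ident` metavariable.
enum Matcher {
    Literal(&'static str),
    IdentVar(&'static str),
}

// Very rough identifier check, good enough for the example.
fn is_ident(tok: &str) -> bool {
    let mut chars = tok.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {
            chars.all(|c| c.is_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

// Matches the input `tts` against the pattern `ms`, left to right.
// On success, returns what each metavariable was bound to.
fn match_tokens<'a>(
    ms: &[Matcher],
    tts: &[&'a str],
) -> Option<HashMap<&'static str, &'a str>> {
    if ms.len() != tts.len() {
        return None;
    }
    let mut bindings = HashMap::new();
    for (m, &tt) in ms.iter().zip(tts) {
        match m {
            // A literal must appear verbatim in the input.
            Matcher::Literal(lit) if *lit == tt => {}
            // An `ident` metavariable binds to one identifier token.
            Matcher::IdentVar(name) if is_ident(tt) => {
                bindings.insert(*name, tt);
            }
            _ => return None,
        }
    }
    Some(bindings)
}

fn main() {
    // `ms` models `print $mvar:ident`; `tts` models `print foo`.
    let ms = [Matcher::Literal("print"), Matcher::IdentVar("mvar")];
    println!("{:?}", match_tokens(&ms, &["print", "foo"]));
}
```

With `ms` modelling `print $mvar:ident` and `tts` modelling `print foo`, the match succeeds and `mvar` is bound to `foo`; an input such as `print =>` fails because `=>` is not an identifier.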
@nikomatsakis I updated the paragraph on tokens, as you suggested... I am wondering
I was asking myself this exact question when I wrote the start of the parser chapter. Should we add a small note up the top saying we assume people know how a basic recursive descent parser works and what tokenizing/lexical analysis is? The idea being this is a book about There is already loads of good quality material on basic parsers on the internet, a couple paragraphs at the top of the chapter probably wouldn't be able to do it justice.
I agree that we shouldn't try to teach parsing here, but given that I don't expect most people to know basic parsing, I worry that it would discourage contributions... Perhaps we can
What do you think?
Erg... Sorry, I fat-fingered the "close and comment" button.... Updated my post above
Yes, I'll see what I can do.
Sounds like a good idea. We could say something like this:
It's essentially recursive descent 101, but you could tie all of this back to EDIT: @mark-i-m this conversation probably belongs in #13, so I'm moving it over there.
I think I agree with both of you. I don't think we want a lot of introductory material; a few links don't hurt, but not too much. But I think there's a third way, though it may take some iteration to get there: To some extent, I think you can serve both audiences by doing a kind of "walk through" of the code. In other words, e.g. to explain tokenizing, we might point to the token data structure and give some source showing how it would be divided into tokens (we can always link to wikipedia or something too). This way, if you know what a token is, you learn about the Rust-specific parts of it. If you don't know what a token is, you can just understand it as this Rust data structure and later learn about the more general form. Similarly I imagine we can say something like "Rust has a recursive-descent parser" (where we link to wikipedia) and then walk through how it would parse some small example, showing a few key functions (e.g., the one that parses a type). If you're not familiar with recursive descent, this will basically give you the idea, but if you are, then you'll learn about the names of key concepts in the code.
I just went through this code to implement `?` macro repetition, so I thought I could take a stab at the chapter 😄
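For context, a small sketch of what `?` repetition looks like from the user's side. The macro name `add_opt` is made up for illustration, and the example assumes a toolchain recent enough to accept the `?` repetition operator in `macro_rules!`.

```rust
// `$( ... )?` matches its contents zero or one times.
// Hypothetical macro: adds an optional second operand to the first.
macro_rules! add_opt {
    ($a:expr $(, $b:expr)?) => {
        $a $(+ $b)?
    };
}

fn main() {
    // The optional group is absent here, so only `$a` is emitted.
    println!("{}", add_opt!(1));
    // The optional group matches `, 2`, so `+ 2` is emitted as well.
    println!("{}", add_opt!(1, 2));
}
```

Unlike `*` and `+`, the `?` operator takes no separator, since there is at most one repetition.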