Start macro expansion chapter #26

mark-i-m · 2018-01-25T21:57:30Z

I just went through this code to implement `?` macro repetition, so I thought I could take a stab at the chapter 😄

mark-i-m · 2018-01-26T00:00:12Z

@nikomatsakis I don't really know anything about hygiene, proc macros, or custom derive, but I added a bit about macros-by-example, and left TODOs for the rest...

nikomatsakis

This is great! Thanks =) I left some small suggestions

nikomatsakis · 2018-01-26T13:58:33Z

src/macro-expansion.md

+
+Macro expansion happens during parsing. `rustc` has two parsers, in fact: the
+normal Rust parser, and the macro parser. During the parsing phase, the normal
+Rust parser will call into the macro parser when it encounters a macro. The


can you be more precise about what a reference to a macro is? e.g., do you mean a macro invocation, like `foo!(...)`?

Also, is it really called from the parser? I thought there was a second phase that came after parsing, but maybe I'm going to learn something here =)

Let me verify that :)

@nikomatsakis

Ok, so it looks like

- The macro parser is called from `macro_rules::compile` (https://github.com/rust-lang/rust/blob/ca9cf3594ab25d2809ac576dfc9defb8e87b45b8/src/libsyntax/ext/tt/macro_rules.rs#L184), which transforms a macro invocation into a syntax extension.
- `macro_rules::compile` is called from `librustc_resolve/macros.rs` and `librustc_resolve/build_reduced_graph.rs`... which I'm guessing is doing name resolution? Does this run after the parser?

nikomatsakis · 2018-01-26T13:59:28Z

src/macro-expansion.md

+normal Rust parser, and the macro parser. During the parsing phase, the normal
+Rust parser will call into the macro parser when it encounters a macro. The
+macro parser, in turn, may call back out to the Rust parser when it needs to
+bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to


here, you mean when the macro is trying to parse the contents of the macro invocation against one of the macro arms?
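
For readers following along, a minimal sketch of what binding a metavariable looks like from the user's side. The `double` macro is hypothetical and not part of the chapter; it only illustrates that an `expr` fragment is parsed as one complete Rust expression while the invocation is matched against the macro arm.

```rust
// Hypothetical macro, for illustration only. `$e:expr` asks for a complete
// Rust expression, so the whole `1 + 2` is bound as a single unit while the
// invocation is matched against this arm.
macro_rules! double {
    ($e:expr) => {
        $e * 2
    };
}

fn main() {
    // If `$e` were pasted back token by token, this would read `1 + 2 * 2`
    // and evaluate to 5. Because `$e` was parsed as an expression first,
    // it stays grouped and the result is 6.
    assert_eq!(double!(1 + 2), 6);
    println!("double!(1 + 2) = {}", double!(1 + 2));
}
```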

nikomatsakis · 2018-01-26T14:00:17Z

src/macro-expansion.md

+Basically, the macro parser is like an NFA-based regex parser. It uses an
+algorithm similar in spirit to the [Earley parsing
+algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro parser is
+defined in `src/libsyntax/ext/tt/macro_parser.rs`.


can we make links into GH here (master branch)? this at least allows us to detect if those links rot

nikomatsakis · 2018-01-26T14:00:23Z

src/macro-expansion.md

+Rust parser will call into the macro parser when it encounters a macro. The
+macro parser, in turn, may call back out to the Rust parser when it needs to
+bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to
+be explained. The code for macro expansion is in `src/libsyntax/ext/tt/`.


can we make links into GH here (master branch)? this at least allows us to detect if those links rot

nikomatsakis · 2018-01-26T14:01:18Z

src/macro-expansion.md

+bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to
+be explained. The code for macro expansion is in `src/libsyntax/ext/tt/`.
+
+### The macro parser


as a meta-comment, I think it's a good idea to start out with some kind of concrete example and walk it through. For example:

Imagine we have a macro

`macro_rules! foo { ($metavariable:tt) => { ... } }`

now you can reference this example from the text below
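
One way such a running example might be fleshed out, as a sketch only: the comment above leaves the macro body as `...`, so the `stringify!` body below is an assumption chosen to make the bound token tree visible.

```rust
// The matcher is the one from the comment above; the body is a hypothetical
// stand-in that turns whatever `$metavariable` bound into a string.
macro_rules! foo {
    ($metavariable:tt) => {
        stringify!($metavariable)
    };
}

fn main() {
    // A single identifier is one token tree.
    assert_eq!(foo!(bar), "bar");

    // A delimited group is also a single token tree, whatever it contains,
    // so this invocation matches the same arm.
    let group = foo!((a b c));
    println!("{}", group);
}
```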

nikomatsakis · 2018-01-26T14:01:59Z

src/macro-expansion.md

+parse different types of metavariables, such as `ident`, `block`, `expr`, etc.,
+the macro parser must sometimes call back to the normal Rust parser.
+
+Interestingly, both definitions and invokations of macros are parsed using the


Spelling: invocations

nikomatsakis · 2018-01-26T14:02:43Z

src/macro-expansion.md

+_using the macro parser itself_.
+
+When the compiler comes to a macro invokation, it needs to parse that
+invokation. This is also known as _macro expansion_. The same NFA-based macro


Spelling: invocation

nikomatsakis · 2018-01-26T14:03:10Z

src/macro-expansion.md

+When the compiler comes to a macro invokation, it needs to parse that
+invokation. This is also known as _macro expansion_. The same NFA-based macro
+parser is used that is described above. Notably, the "pattern" (or _matcher_)
+used is the first token tree extracted from the rules of the macro _definition_.


this is where the running example would be really handy

nikomatsakis · 2018-01-26T14:03:46Z

src/macro-expansion.md

+parser is used that is described above. Notably, the "pattern" (or _matcher_)
+used is the first token tree extracted from the rules of the macro _definition_.
+In other words, given some pattern described by the _definition_ of the macro,
+we want to match the contents of the _invokation_ of the macro.


Spelling: invocation

nikomatsakis · 2018-01-26T14:03:58Z

src/macro-expansion.md

+that non-terminal. Then, the macro parser proceeds in parsing as normal.
+
+For more information about the macro parser's implementation, see the comments
+in `src/libsyntax/ext/tt/macro_parser.rs`.


link to repo

federicomenaquintero · 2018-01-26T18:43:17Z

BTW, there's a very interesting discussion about hygiene and proc-macro in rust-lang/rust#45934

mark-i-m · 2018-01-26T20:43:50Z

@nikomatsakis I updated the chapter (a lot). I think I have addressed your comments. Let me know. Thanks!

Also, copying this here, because the comment above is "outdated":

> Also, is it really called from the parser? I thought there was a second phase that came after parsing, but maybe I'm going to learn something here =)

Ok, so it looks like

- The macro parser is called from `macro_rules::compile` (https://github.com/rust-lang/rust/blob/ca9cf3594ab25d2809ac576dfc9defb8e87b45b8/src/libsyntax/ext/tt/macro_rules.rs#L184), which transforms a macro invocation into a syntax extension.
- `macro_rules::compile` is called from `librustc_resolve/macros.rs` and `librustc_resolve/build_reduced_graph.rs`... which I'm guessing is doing name resolution? Does this run after the parser?

mark-i-m · 2018-01-27T06:13:25Z

@federicomenaquintero Would you be interested in filling in some of the TODOs? I want to learn how they all work, but I don't have the bandwidth in the near future...

nikomatsakis · 2018-01-29T14:08:21Z

> Does this run after the parser?

Yes, it does

nikomatsakis · 2018-01-29T14:14:07Z

src/macro-expansion.md

+
+`$mvar` is called a _metavariable_. Unlike normal variables, rather than binding
+to a value in a computation, a metavariable binds _at compile time_ to a tree of
+_tokens_. A _token_ zero or more symbols that together have some meaning. For


> A token zero or more symbols that together have some meaning.

This sentence is not grammatical and I'm not quite sure how to fix it. =) In particular, I don't think of a token as "zero or more symbols" (and it's sort of unclear to me what you mean by symbol, which in parsing terminology is often used to mean the union of token and nonterminal).

I think I would maybe say something like this:

"A token is a single "unit" of the grammar, such as an identifier (e.g., `print`) or punctuation (e.g., `=>`). Token trees result from paired parentheses-like characters (`(...)`, `[...]`, and `{...}`): they include the open and close delimiters and all the tokens in between (we do require that parentheses-like characters be balanced)."

but it doesn't seem like the best either :)
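
To make the token versus token tree distinction concrete, here is a hedged sketch. The `count_tts!` macro is hypothetical (it is not from the chapter or from `rustc`); it counts how many token trees an invocation contains by matching one `tt` at a time.

```rust
// Hypothetical helper: counts the token trees in its input by peeling off
// one `tt` per recursive step.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    // Two identifiers: two token trees.
    assert_eq!(count_tts!(print foo), 2);

    // Multi-character punctuation such as `=>` is a single token here.
    assert_eq!(count_tts!(a => b), 3);

    // Each delimited group counts as one token tree, regardless of how many
    // tokens sit between the open and close delimiters.
    assert_eq!(count_tts!(print (a b c) [d] {e f}), 4);

    println!("all counts matched");
}
```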

nikomatsakis

This is really nice. I left a few nits.

nikomatsakis · 2018-01-29T14:14:48Z

src/macro-expansion.md

+
+The process of expanding the macro invocation into the syntax tree
+`println!("{}", foo)` and then expanding that into a call to `Display::fmt` is
+called _macro expansion_, it is the topic of this chapter.


Nit: the word it is not needed here

nikomatsakis · 2018-01-29T14:15:46Z

src/macro-expansion.md

+In the analogy of a regex parser, `tts` is the input and we are matching it
+against the pattern `ms`. Using our examples, `tts` could be the stream of
+tokens containing the inside of the example invocation `print foo`, while `ms`
+might be the sequence of token (trees) `print $mvar:ident`.


tying back to the example is 💯
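
For reference, a sketch of a macro whose matcher is the `print $mvar:ident` pattern quoted in the diff. The macro name `printer` and the exact body are assumptions; the diff hunk only shows the matcher, the invocation contents `print foo`, and the expansion `println!("{}", foo)`.

```rust
// Assumed definition: one arm whose matcher (`ms` in the diff's terms) is the
// token sequence `print $mvar:ident`.
macro_rules! printer {
    (print $mvar:ident) => {
        println!("{}", $mvar)
    };
}

fn main() {
    let foo = 3;
    // The contents of this invocation, `print foo`, play the role of `tts`:
    // `print` matches literally and `foo` is bound to `$mvar`.
    printer!(print foo);
}
```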

mark-i-m · 2018-01-29T17:25:28Z

@nikomatsakis I updated the paragraph on tokens, as you suggested... I am wondering:

- Is this a term we should add to the glossary?
- Should the parsing chapter go into more detail about lexing and parsing or rely on external sources for the basics? If it went into more detail, maybe we can just assume that the reader knows about tokens from that chapter and omit the paragraph from this chapter altogether?

Michael-F-Bryan · 2018-01-30T13:10:04Z

> Should the parsing chapter go into more detail about lexing and parsing or rely on external sources for the basics?

I was asking myself this exact question when I wrote the start of the parser chapter. Should we add a small note at the top saying we assume people know how a basic recursive descent parser works and what tokenizing/lexical analysis is? The idea being this is a book about rustc internals, not an introduction to parsing.

There is already loads of good-quality material on basic parsers on the internet; a couple of paragraphs at the top of the chapter probably wouldn't be able to do it justice.

mark-i-m · 2018-01-30T15:31:34Z

I agree that we shouldn't try to teach parsing here, but given that I don't expect most people to know basic parsing, I worry that it would discourage contributions... Perhaps we can:

- Add a high-level overview of the algorithm and point to a few solid resources for learning in detail
- Give some key term definitions
- Tie them all back to the code

What do you think?

mark-i-m · 2018-01-30T15:35:47Z

Erg... Sorry, I fat-fingered the "close and comment" button.... Updated my post above

federicomenaquintero · 2018-01-30T16:09:36Z

> @federicomenaquintero Would you be interested in filling in some of the TODOs? I want to learn how they all work, but I don't have the bandwidth in the near future...

Yes, I'll see what I can do.

Michael-F-Bryan · 2018-01-31T07:32:04Z

> What do you think?

Sounds like a good idea. We could say something like this:

Rust syntax is specified by a grammar (link) which is essentially a list of rules where each rule specifies how a piece of the language is written (e.g. a crate contains multiple items, an item may be a function declaration, a function has a bunch of statements, a statement is a ...), with each rule being written in terms of other rules or terminals (the base case, typically tokens).

Generally speaking, for each grammar rule there will be one parser method. In this way we can translate a token stream into an AST by recursively calling the appropriate method.

It's essentially recursive descent 101, but you could tie all of this back to rustc by inspecting a sample code snippet (e.g. an if statement) and then showing what would be called when parsing it.
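
As a rough illustration of "one parser method per grammar rule", here is a toy recursive descent parser. Everything in it (the two-rule grammar, `Token`, `Expr`, `Parser`, `parse`) is made up for this sketch and is far simpler than the real `rustc` parser.

```rust
// Toy grammar, assumed for illustration:
//   expr := term ('+' term)*
//   term := NUM | '(' expr ')'

#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Num(i64),
    Plus,
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    // Look at the current token without consuming it.
    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.pos).copied()
    }

    // Consume and return the current token, if any.
    fn bump(&mut self) -> Option<Token> {
        let token = self.peek();
        if token.is_some() {
            self.pos += 1;
        }
        token
    }

    // One method for the `expr` rule.
    fn parse_expr(&mut self) -> Result<Expr, String> {
        let mut lhs = self.parse_term()?;
        while self.peek() == Some(Token::Plus) {
            self.bump();
            let rhs = self.parse_term()?;
            lhs = Expr::Add(Box::new(lhs), Box::new(rhs));
        }
        Ok(lhs)
    }

    // One method for the `term` rule; it recurses into `parse_expr`
    // for parenthesized sub-expressions.
    fn parse_term(&mut self) -> Result<Expr, String> {
        match self.bump() {
            Some(Token::Num(n)) => Ok(Expr::Num(n)),
            Some(Token::LParen) => {
                let inner = self.parse_expr()?;
                match self.bump() {
                    Some(Token::RParen) => Ok(inner),
                    other => Err(format!("expected `)`, found {:?}", other)),
                }
            }
            other => Err(format!("expected a number or `(`, found {:?}", other)),
        }
    }
}

// Parse a whole token stream, rejecting leftover tokens.
fn parse(tokens: Vec<Token>) -> Result<Expr, String> {
    let mut parser = Parser { tokens, pos: 0 };
    let expr = parser.parse_expr()?;
    if parser.pos != parser.tokens.len() {
        return Err(format!("unexpected trailing token {:?}", parser.peek()));
    }
    Ok(expr)
}

fn main() {
    // Token stream for `1 + (2 + 3)`.
    let tokens = vec![
        Token::Num(1),
        Token::Plus,
        Token::LParen,
        Token::Num(2),
        Token::Plus,
        Token::Num(3),
        Token::RParen,
    ];
    let ast = parse(tokens).expect("the example input is well-formed");
    println!("{:?}", ast);
}
```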

EDIT: @mark-i-m this conversation probably belongs in #13, so I'm moving it over there.

nikomatsakis · 2018-01-31T16:15:47Z

> I agree that we shouldn't try to teach parsing here, but given that I don't expect most people to know basic parsing, I worry that it would discourage contributions... Perhaps we can

I think I agree with both of you. I don't think we want a lot of introductory material; a few links don't hurt, but not too much. But I think there's a third way, though it may take some iteration to get there: To some extent, I think you can serve both audiences by doing a kind of "walk through" of the code.

In other words, e.g. to explain tokenizing, we might point to the token data structure and give some source showing how it would be divided into tokens (we can always link to wikipedia or something too). This way, if you know what a token is, you learn about the Rust-specific parts of it. If you don't know what a token is, you can just understand it as this Rust data structure and later learn about the more general form.
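
A sketch of what "point to the token data structure and show how some source is divided into tokens" could look like. The `Token` enum and `tokenize` function below are simplified stand-ins invented for this illustration, not the actual types in `rustc`; for instance, keywords are lumped in with identifiers and all punctuation is single-character.

```rust
// Simplified, hypothetical token type (not rustc's).
#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    Number(String),
    Punct(char),
}

// Split source text into tokens, skipping whitespace.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_alphabetic() || c == '_' {
            // Identifier (or keyword): letters, digits, and underscores.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    text.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(text));
        } else if c.is_ascii_digit() {
            // Integer literal: a run of ASCII digits.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_digit() {
                    text.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Number(text));
        } else {
            // Anything else is treated as one-character punctuation.
            tokens.push(Token::Punct(c));
            chars.next();
        }
    }
    tokens
}

fn main() {
    for token in tokenize("let x = 42;") {
        println!("{:?}", token);
    }
}
```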

Similarly I imagine we can say something like "Rust has a recursive-descent parser" (where we link to wikipedia) and then walk through how it would parse some small example, showing a few key functions (e.g., the one that parses a type). If you're not familiar with recursive descent, this will basically give you the idea, but if you are, then you'll learn about the names of key concepts in the code.

Start macro expansion chapter

1627505

mark-i-m force-pushed the macros branch from d45fc2c to 1627505 · January 25, 2018 23:21

mark-i-m mentioned this pull request Jan 25, 2018

"macro expansion" #15

Closed

Add a bit about macro expansion

4992b47

mark-i-m changed the title ~~[WIP] Start macro expansion chapter~~ Start macro expansion chapter Jan 25, 2018

Oops rename

ba3dd18

nikomatsakis requested changes Jan 26, 2018

Updated macros to address Niko's comments

858dfdf

nikomatsakis reviewed Jan 29, 2018

nikomatsakis requested changes Jan 29, 2018

Rewrite 'tokens' para...

dee42c1

Corrected relationship of macro and rust parsers

82da67a

mark-i-m closed this Jan 30, 2018

mark-i-m reopened this Jan 30, 2018

Michael-F-Bryan mentioned this pull request Jan 31, 2018

"The parser" #13

Open

8 tasks

nikomatsakis approved these changes Jan 31, 2018

nikomatsakis merged commit b4b2b0d into rust-lang:master Jan 31, 2018

mark-i-m deleted the macros branch May 23, 2018 16:42
