How to go about including pegen code in CPython? #7

lysnikolaou · 2020-03-09T18:18:44Z

How would you go about moving all the pegen code into cpython? What I have done so far is the following.

Copy the pegen directory into cpython/Parser (and remove all the stuff that's not needed).
Move pegen.c and parse_string.c from cpython/Parser/pegen into cpython/Parser.
Move pegen.h and parse_string.h from cpython/Parser/pegen into cpython/Include.
Generate two new functions that call run_parser_* and can be called from C.
Add a parse.h in cpython/Include that exports both the new generated functions.
Generate the parser into Parser/parse.c.

All the names are just dummy names that I didn't give much thought, but do you think something like this would work?

gvanrossum · 2020-03-09T23:18:20Z

Yeah, that's mostly reasonable, and it's good that we're thinking about this now.

One suggestion: I propose to move (copy) the pegen parser generator (for generating both Python and C) into the Tools directory, rather than following the 30-year-old example of Parser/pgen.

In that move, I would also propose to rename either the top-level directory (the repo root, corresponding to a specific subdirectory of Tools) or the actual module (pegen/pegen), so that there isn't the horrible ambiguity of what "pegen" refers to. Maybe we should keep "pegen" as the module name (else too many imports need to be changed) and put all the code in Tools/peg_generator?

I would move everything except the C code, which has has little or no business living in the same tree, and the "storyN" directories, which belong to my blog posts only. I would also probably weed out most of the data files (except the ones that are used by tests). We can weed out more later if we feel there is cruft.

I would then copy the hand-written .c and .h files into cpython/Parser and cpython/Include, as you indicate.

I would also copy the grammar (data/simpy.gram) into cpython/Grammar, renaming it to python.gram.

Then we need the two run functions like you say, and a module that allows calling these from Python (the functions themselves can be called from C).

I would like to come up with slightly better names for most files at runtime -- "parse" or "parser" is too generic, and "pegen" already is over-used.

Also, let's not overthink this -- we can start with some less-than-perfect scheme and iterate.

gvanrossum · 2020-03-18T16:49:38Z

Are you working on this (or any other aspect of the integration)? If not, I'd like to give it a try.

lysnikolaou · 2020-03-18T18:43:18Z

I'm not currently working on this, but was also thinking of starting some of these days. Did you start with this? Can I help out in any way?

gvanrossum · 2020-03-18T19:58:44Z

I am working on getting the code generator copied into Tools. I've almost got it working. Will push to a new branch when I am happy with that step, hopefully later today. Be safe!

gvanrossum · 2020-03-18T23:42:46Z

Pushed to branch "radical". I only copied stuff that's part of the parser generator or the support for the generated C code, and scripts and tests. I moved the .c and .h files to peg_parser/, so they are no longer mixed in with the generator (but the support is still in pegen/, since it's used by the generator to read the grammar). I renamed simpy.gram to python.gram, and renamed two other data files from .txt to .py. I tested most Makefile targets, and they all seem to work except for the former simpy targets (which I've renamed to test_local and test_global). I'll see if I can fix those next. Once that all is done I'll see if I can actually generate an extension module that works with 3.9 (currently it still requires 3.8 :-).

gvanrossum · 2020-03-19T04:16:27Z

In the process of moving to 3.9 I was reminded that the 3.9 grammar has very recently changed -- slice_ty no longer exists (python#9605).

gvanrossum · 2020-03-20T03:55:37Z

the 3.9 grammar has very recently changed

Fixed in the 'radical' branch.

Copy the pegen directory into cpython/Parser (and remove all the stuff that's not needed).

Move pegen.c and parse_string.c from cpython/Parser/pegen into cpython/Parser.

Move pegen.h and parse_string.h from cpython/Parser/pegen into cpython/Include.

[Added]

Generate the parser into Parser/parse.c.

These are done too (though slightly differently). I will make a PR and merge this into the 'pegen' branch as a single commit. [Done.]

gvanrossum · 2020-03-20T04:30:04Z

Next up: item 6.

lysnikolaou · 2020-03-24T18:50:15Z

Is it okay to occasionally merge python/cpython:master into pegen? I think, that the Windows and MacOS tests were fixed upstream, so we might be able to turn those on again if we do so.

gvanrossum · 2020-03-24T19:32:28Z

Is it okay to occasionally merge python/cpython:master into pegen?

Absolutely! Go ahead.

gvanrossum · 2020-03-27T22:16:17Z

Everything here's done now. ~~I'll open a new issue to track enabling the new parser using a command-line flag and/or environment variable.~~ [UPDATE: We already have #4 and #5]

lysnikolaou changed the title ~~How to go about including pegen code in cpython?~~ How to go about including pegen code in CPython? Mar 9, 2020

gvanrossum closed this as completed Mar 18, 2020

This comment has been minimized.

Sign in to view

gvanrossum reopened this Mar 18, 2020

gvanrossum mentioned this issue Mar 20, 2020

Merge steps 1-3 and 6 into 'pegen' branch #8

Merged

gvanrossum closed this as completed Mar 27, 2020

gvanrossum mentioned this issue Apr 17, 2020

Discussion about nicer error messages #84

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to go about including pegen code in CPython? #7

How to go about including pegen code in CPython? #7

lysnikolaou commented Mar 9, 2020

gvanrossum commented Mar 9, 2020

gvanrossum commented Mar 18, 2020

lysnikolaou commented Mar 18, 2020

gvanrossum commented Mar 18, 2020

gvanrossum commented Mar 18, 2020

This comment has been minimized.

gvanrossum commented Mar 19, 2020

gvanrossum commented Mar 20, 2020 •

edited

Loading

gvanrossum commented Mar 20, 2020 •

edited

Loading

lysnikolaou commented Mar 24, 2020

gvanrossum commented Mar 24, 2020

gvanrossum commented Mar 27, 2020 •

edited

Loading

How to go about including pegen code in CPython? #7

How to go about including pegen code in CPython? #7

Comments

lysnikolaou commented Mar 9, 2020

gvanrossum commented Mar 9, 2020

gvanrossum commented Mar 18, 2020

lysnikolaou commented Mar 18, 2020

gvanrossum commented Mar 18, 2020

gvanrossum commented Mar 18, 2020

This comment has been minimized.

gvanrossum commented Mar 19, 2020

gvanrossum commented Mar 20, 2020 • edited Loading

gvanrossum commented Mar 20, 2020 • edited Loading

lysnikolaou commented Mar 24, 2020

gvanrossum commented Mar 24, 2020

gvanrossum commented Mar 27, 2020 • edited Loading

gvanrossum commented Mar 20, 2020 •

edited

Loading

gvanrossum commented Mar 20, 2020 •

edited

Loading

gvanrossum commented Mar 27, 2020 •

edited

Loading