Serialized ATN const array generates horrible machine code fix #3009

xTachyon · 2020-12-18T05:57:09Z

Hello. I recently found a way to see how much space each symbol takes in an executable. To my surprise, I found that the winner for most space taken is Lexer::Initializer::Initializer, with more than 100kb of code.
The interesting code for that function looks like:
_serializedATN = { <big constant array> }
In my case, the ATN has about 8000 values of uint16.
While gcc and clang do the right think and place the array in .rdata(aka 16 kb), and then memcpy() it in the vector; msvc does not.
At a language level, that code:

creates an std::initializer_list of those value;
calls operator= on the vector with that std::initializer_list object.
Msvc does exactly that, plus the part where it sets each value individually in code. Example of how 2 uint16 get set:

00007FF6F687083D  mov         eax,3  
00007FF6F6870842  mov         word ptr [rsp+0A0h],ax  
00007FF6F687084A  mov         eax,608Bh  
00007FF6F687084F  mov         word ptr [rsp+0A2h],ax

Each uint16 gets a fantastic 13 bytes of x86 code to be set up. 13 bytes x 8000 values = 104000 bytes = 104kb. That's where my huge function size was coming for. You could say that's the compiler fault that it inlines this and you'd be right, but until msvc gets this fixed I think the sensible things to do is to generate C++ code that doesn't confuse any of the major compilers.

Now, that's what I was trying to do here, but I'm not sure that how I modified is the best way that can be done(I have minimal experience with java and .stg files). What I did is to remove the <if (rest(model.segments))> which was always false in my case and always generate "segments", and now it always generates code like:

static const uint16_t serializedATNSegment0[] = { <big constant array> };
_serializedATN.insert(_serializedATN.end(), serializedATNSegment0,
    serializedATNSegment0 + sizeof(serializedATNSegment0) / sizeof(serializedATNSegment0[0]));

which works as expected on every compiler. This change makes my executable 84 kb smaller, and I'm pretty sure that 16kb data + memcpy() is faster than 100 kb of x86 code.

There are few more horrible things in what antlr generates for C++, like duplicating that ATN numbers in both the lexer and parser cpps, but the compilers seem to figure that out and not generate duplicate code. This is the most important change that I could find at this moment.

xTachyon · 2021-01-11T20:29:24Z

Hello. Is there something wrong with my PR? Should I modify anything?

xTachyon · 2021-03-01T23:07:02Z

So, is this project dead?

ericvergnaud · 2021-03-02T01:34:02Z

@mike-lischke any view on this proposed change?

mike-lischke · 2021-03-02T07:52:18Z

It's always a risk to conclude just because something doesn't affect you (or doesn't happen to you etc) means it doesn't affect anyone else. The template code is necessary to split huge ATN code arrays into pieces that are still valid. You probably have a pretty simple grammar, where no splitting is necessary, but try one of the more serious grammars like C++ or, even better, MySQL and see what comes out with your patch. Can the generated files still be compiled?

If you really know more "horrible" things about the C++ code generation I'm eager to hear them. Maybe you can open a bug report for them here (please prefix its title with [C++] for me to find it quicker)? The mentioned "ATN code duplication" however is not an issue. Lexer and parser both have an own ATN, which differ from each other.

xTachyon · 2021-03-02T12:41:01Z

What I did is I made sure that it always generates generates split ATN code arrays, instead of the particular case where it generates one inline. And I checked on every major compiler that it generates sane code for amd64.

Maybe I'm missing your point? How would this impact badly other usages? Because the old version does impact badly mine.

And this is a real issue because it generates suboptiomal and oversized code that results in performance losses. It will behave correctly, sure, but my opinion is that we better do something about it.

I could try one of the bigger grammars to see if it still compiles though, if that's what you mean.

mike-lischke · 2021-03-03T07:32:58Z

Oh, somehow I misread the patch, but stg code is not one of my favourite.

You have to fix the conflict yet before this can be merged.

@parrt This is ready to be merged, once the conflict is resolved.

parrt · 2021-03-03T20:30:35Z

@mike-lischke @ericvergnaud looks like we have a build failure, but in Swift...should we pull this C++ PR in anyway?

ericvergnaud · 2021-03-04T01:36:39Z

@parrt Yes we should
(we seem to have sporadic failures caused I guess by unreliable connectivity between GitHub and the Mac at my home)

mike-lischke · 2021-03-04T07:56:38Z

Ugh, Github uses your pesonal machine @ericvergnaud? That's an odd set up and I don't know if I would rely on it if I were the repo owner.

ericvergnaud · 2021-03-04T08:03:30Z

@mike-lischke it's the only way we found to build on Mac at an affordable cost. Linux and Windows CI remain in the cloud.

ericvergnaud · 2021-03-04T08:04:33Z

(it's not my personal machine, it's a dedicated one, funded by our favorite BDFL)

xTachyon added 2 commits December 18, 2020 06:48

always generate static const array first for atn

f33a0f3

added myself to contributors

249ecf3

xTachyon marked this pull request as ready for review December 18, 2020 06:06

mike-lischke approved these changes Mar 3, 2021

View reviewed changes

xTachyon force-pushed the bettercpp branch from 96b6392 to 249ecf3 Compare March 3, 2021 10:54

Merge branch 'master' into bettercpp

76975c1

parrt merged commit 0f8bddc into antlr:master Mar 4, 2021

xTachyon deleted the bettercpp branch March 4, 2021 11:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Serialized ATN const array generates horrible machine code fix #3009

Serialized ATN const array generates horrible machine code fix #3009

xTachyon commented Dec 18, 2020 •

edited

Loading

xTachyon commented Jan 11, 2021

xTachyon commented Mar 1, 2021

ericvergnaud commented Mar 2, 2021

mike-lischke commented Mar 2, 2021

xTachyon commented Mar 2, 2021 •

edited

Loading

mike-lischke commented Mar 3, 2021 •

edited

Loading

parrt commented Mar 3, 2021

ericvergnaud commented Mar 4, 2021

mike-lischke commented Mar 4, 2021

ericvergnaud commented Mar 4, 2021

ericvergnaud commented Mar 4, 2021

Serialized ATN const array generates horrible machine code fix #3009

Serialized ATN const array generates horrible machine code fix #3009

Conversation

xTachyon commented Dec 18, 2020 • edited Loading

xTachyon commented Jan 11, 2021

xTachyon commented Mar 1, 2021

ericvergnaud commented Mar 2, 2021

mike-lischke commented Mar 2, 2021

xTachyon commented Mar 2, 2021 • edited Loading

mike-lischke commented Mar 3, 2021 • edited Loading

parrt commented Mar 3, 2021

ericvergnaud commented Mar 4, 2021

mike-lischke commented Mar 4, 2021

ericvergnaud commented Mar 4, 2021

ericvergnaud commented Mar 4, 2021

xTachyon commented Dec 18, 2020 •

edited

Loading

xTachyon commented Mar 2, 2021 •

edited

Loading

mike-lischke commented Mar 3, 2021 •

edited

Loading