Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialized ATN const array generates horrible machine code fix #3009

Merged
merged 3 commits into from
Mar 4, 2021

Conversation

xTachyon
Copy link
Contributor

@xTachyon xTachyon commented Dec 18, 2020

Hello. I recently found a way to see how much space each symbol takes in an executable. To my surprise, I found that the winner for most space taken is Lexer::Initializer::Initializer, with more than 100kb of code.
The interesting code for that function looks like:
_serializedATN = { <big constant array> }
In my case, the ATN has about 8000 values of uint16.
While gcc and clang do the right think and place the array in .rdata(aka 16 kb), and then memcpy() it in the vector; msvc does not.
At a language level, that code:

  • creates an std::initializer_list of those value;
  • calls operator= on the vector with that std::initializer_list object.
    Msvc does exactly that, plus the part where it sets each value individually in code. Example of how 2 uint16 get set:
00007FF6F687083D  mov         eax,3  
00007FF6F6870842  mov         word ptr [rsp+0A0h],ax  
00007FF6F687084A  mov         eax,608Bh  
00007FF6F687084F  mov         word ptr [rsp+0A2h],ax  

Each uint16 gets a fantastic 13 bytes of x86 code to be set up. 13 bytes x 8000 values = 104000 bytes = 104kb. That's where my huge function size was coming for. You could say that's the compiler fault that it inlines this and you'd be right, but until msvc gets this fixed I think the sensible things to do is to generate C++ code that doesn't confuse any of the major compilers.

Now, that's what I was trying to do here, but I'm not sure that how I modified is the best way that can be done(I have minimal experience with java and .stg files). What I did is to remove the <if (rest(model.segments))> which was always false in my case and always generate "segments", and now it always generates code like:

static const uint16_t serializedATNSegment0[] = { <big constant array> };
_serializedATN.insert(_serializedATN.end(), serializedATNSegment0,
    serializedATNSegment0 + sizeof(serializedATNSegment0) / sizeof(serializedATNSegment0[0]));

which works as expected on every compiler. This change makes my executable 84 kb smaller, and I'm pretty sure that 16kb data + memcpy() is faster than 100 kb of x86 code.

There are few more horrible things in what antlr generates for C++, like duplicating that ATN numbers in both the lexer and parser cpps, but the compilers seem to figure that out and not generate duplicate code. This is the most important change that I could find at this moment.

@xTachyon xTachyon marked this pull request as ready for review December 18, 2020 06:06
@xTachyon
Copy link
Contributor Author

Hello. Is there something wrong with my PR? Should I modify anything?

@xTachyon
Copy link
Contributor Author

xTachyon commented Mar 1, 2021

So, is this project dead?

@ericvergnaud
Copy link
Contributor

@mike-lischke any view on this proposed change?

@mike-lischke
Copy link
Member

It's always a risk to conclude just because something doesn't affect you (or doesn't happen to you etc) means it doesn't affect anyone else. The template code is necessary to split huge ATN code arrays into pieces that are still valid. You probably have a pretty simple grammar, where no splitting is necessary, but try one of the more serious grammars like C++ or, even better, MySQL and see what comes out with your patch. Can the generated files still be compiled?

If you really know more "horrible" things about the C++ code generation I'm eager to hear them. Maybe you can open a bug report for them here (please prefix its title with [C++] for me to find it quicker)? The mentioned "ATN code duplication" however is not an issue. Lexer and parser both have an own ATN, which differ from each other.

@xTachyon
Copy link
Contributor Author

xTachyon commented Mar 2, 2021

What I did is I made sure that it always generates generates split ATN code arrays, instead of the particular case where it generates one inline. And I checked on every major compiler that it generates sane code for amd64.

Maybe I'm missing your point? How would this impact badly other usages? Because the old version does impact badly mine.

And this is a real issue because it generates suboptiomal and oversized code that results in performance losses. It will behave correctly, sure, but my opinion is that we better do something about it.

I could try one of the bigger grammars to see if it still compiles though, if that's what you mean.

@mike-lischke
Copy link
Member

mike-lischke commented Mar 3, 2021

Oh, somehow I misread the patch, but stg code is not one of my favourite.

You have to fix the conflict yet before this can be merged.

@parrt This is ready to be merged, once the conflict is resolved.

@parrt
Copy link
Member

parrt commented Mar 3, 2021

@mike-lischke @ericvergnaud looks like we have a build failure, but in Swift...should we pull this C++ PR in anyway?

@ericvergnaud
Copy link
Contributor

@parrt Yes we should
(we seem to have sporadic failures caused I guess by unreliable connectivity between GitHub and the Mac at my home)

@parrt parrt merged commit 0f8bddc into antlr:master Mar 4, 2021
@mike-lischke
Copy link
Member

Ugh, Github uses your pesonal machine @ericvergnaud? That's an odd set up and I don't know if I would rely on it if I were the repo owner.

@ericvergnaud
Copy link
Contributor

@mike-lischke it's the only way we found to build on Mac at an affordable cost. Linux and Windows CI remain in the cloud.

@ericvergnaud
Copy link
Contributor

(it's not my personal machine, it's a dedicated one, funded by our favorite BDFL)

@xTachyon xTachyon deleted the bettercpp branch March 4, 2021 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants