
Out of memory in parser #3799

Closed
oandregal opened this issue Dec 4, 2017 · 18 comments
Assignees
dmsnell
Labels
[Feature] Parsing · [Priority] High · [Type] Bug
Milestone
Merge Proposal

Comments

@oandregal
Member

oandregal commented Dec 4, 2017

When I try to load a very old and large post from my blog, I get these errors:

[screenshot: parser-line-96]

As far as I've tested, this fails randomly at line 96, 349, or 372 of the parser when I reload the page.

I've tried increasing the maximum file size allowed for the network, but it didn't solve the issue. For context, if I deactivate the Gutenberg plugin, the post loads and shows its content.

Not sure what other details I can share for debugging; just ping me and I'll provide whatever you need.

@oandregal added the [Feature] Parsing and [Type] Bug labels Dec 4, 2017
@oandregal
Member Author

cc @aduth @dmsnell @pento as you did recent work on that part of the code.

@aduth added the [Priority] High label Dec 4, 2017
@dmsnell
Member

dmsnell commented Dec 4, 2017

Can you share the source of the post?

@dmsnell self-assigned this Dec 4, 2017
@oandregal
Member Author

oandregal commented Dec 4, 2017

@dmsnell you can find it here. Expect broken links, non-existent media, and the like. I haven't touched it since I discovered the bug, so we can reproduce and debug what happened.

@dmsnell
Member

dmsnell commented Dec 4, 2017

Hm. Not that big of a document. I'll start thinking about this - thanks!

@pento
Member

pento commented Dec 5, 2017

I can reproduce similar excessive memory usage with Gutenberg on this post:

  • Hello World post, Gutenberg disabled: 4.1MB
  • Hello World post, Gutenberg enabled: 4.2MB
  • This post, Gutenberg disabled: 4.3MB
  • This post, Gutenberg enabled: 46.6MB

@dmsnell
Member

dmsnell commented Dec 5, 2017

I believe I know why it's occurring, and it's probably because of how we're handling those freeform HTML blocks: we build a list of each character in the parser and then later join them into a single string.

This was a fear of mine when converting to the nested parser with our move back to the idea of tokens (necessary), but I'm not sure yet how to fix this.
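The pattern described above can be sketched as follows. This is a hypothetical minimal illustration, not the actual Gutenberg PEG grammar, and the function names are made up. Accumulating one array element per character carries per-element overhead far beyond the text's own size, while remembering offsets and taking a single substring needs only one allocation:

```javascript
// Illustrative sketch only; these are not real Gutenberg APIs.

// What the parser effectively did: collect every character of a freeform
// HTML run into an array, then join. Memory scales with the array's
// per-element overhead, not just the length of the text.
function collectFreeformCharByChar(html) {
  const chars = [];
  for (let i = 0; i < html.length; i += 1) {
    chars.push(html[i]); // one array slot per character
  }
  return chars.join('');
}

// The cheaper alternative: track where the freeform run starts and ends,
// then take one substring at the end.
function collectFreeformBySlice(html, start, end) {
  return html.slice(start, end); // a single allocation for the whole run
}
```

Both return the same string for the same run; the difference is purely in how many intermediate allocations the engine makes while parsing a multi-megabyte post.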

@oandregal
Member Author

A follow-up to this in Gutenberg 2.0: the post loads and the memory issue is gone.

Not sure if it's related to the fix or not, but when the post loads it shows the raw HTML and the text is grey:

[screenshot: post-load]

The console shows an error:

[screenshot: post-load-console]

I've attached the error log.

When I convert this to blocks, the conversion completes, but some paragraphs and items within lists are stripped.

@mcsf
Contributor

mcsf commented Jan 15, 2018

For the record, the mismatch that fails the validation is based on &nbsp; (non-breaking space) characters around tags:

[screenshot: screen shot 2018-01-15 at 14.43.32]

In contrast, pasting the raw content works, though it's very slow (several seconds on a 3.1 GHz i7).
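A minimal sketch of this class of mismatch (the strings here are invented for illustration, not taken from the actual post): if the saved content stores the &nbsp; entity while the re-serialized block contains the literal non-breaking space character U+00A0, a byte-for-byte validation comparison fails even though both forms render identically:

```javascript
// Illustrative strings only.
const savedContent = 'Hello&nbsp;<em>world</em>'; // entity form
const reserialized = 'Hello\u00A0<em>world</em>'; // literal U+00A0

console.log(savedContent === reserialized); // false: the validator sees a diff

// Normalizing the entity to the literal character makes them compare equal.
const normalized = savedContent.replace(/&nbsp;/g, '\u00A0');
console.log(normalized === reserialized); // true
```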

@karmatosed added this to the Merge Proposal milestone Jan 25, 2018
@oandregal
Member Author

I can confirm that @aduth's #4591 fixed the issue: convert to blocks works. Thanks!

I have found some issues with a couple of lists when they are converted to Gutenblocks, but I'll report those in a separate issue. The source of the error is likely badly formatted HTML in the original post.

@mtias
Member

mtias commented Mar 1, 2018

Thanks for the follow-ups, @nosolosw. :)

@oandregal
Member Author

oandregal commented Nov 16, 2018

I've been battle-testing Gutenberg again with this post I wrote 10 years ago. I've tried to convert it to blocks using Gutenberg 4.4:

  • Open the post in Gutenberg; the whole text is embedded in the classic block.
  • Check the document stats: it's 10,605 words long.
  • Convert it to blocks.
  • Check the document stats: it's now 8,269 words long. One of the last sections was entirely gone (everything following "Una panorámica general de las herramientas").

cc @dmsnell @mcsf @mtias @aduth @youknowriad

Going to see if I can bisect when this started happening. Will post any follow-ups here.

@oandregal
Member Author

Bisected from v4.0.0-rc.1: it looks like the content loss may have started happening in 489eb79 cc @iseulde

Note that before that commit, Gutenberg reported 9,981 words after converting the old post to blocks (fewer than the 10,605 in the original), but I couldn't find any noticeable content loss.

@ellatrix
Member

@nosolosw could you report a new issue with it?

@oandregal
Member Author

@iseulde created at #12029

@mcsf
Contributor

mcsf commented Nov 19, 2018

Thanks for battle-testing and dog-fooding, @nosolosw. Out of curiosity, since this issue was originally about the parser, have you noticed any issues with calling the server-side parser on that post of yours? Because since #10463 we no longer skip the parser (as fixed in #4591) but rather use the improved parser implementation added in #8083.

@oandregal
Member Author

oandregal commented Nov 20, 2018

I haven't been able to get familiar with all of Gutenberg's components in my few weeks on the project, so I'm not sure what server-side parsing means here. I'm happy to try if I'm given some steps to reproduce.

@oandregal
Member Author

oandregal commented Nov 20, 2018

Oh, now that I've had some coffee, it occurred to me that by server-side parsing you meant checking whether the content within the classic block in Gutenberg is the same as the content in the classic editor. I've just compared both, and yes, the content is equal.

cc @mcsf

@mcsf
Contributor

mcsf commented Nov 20, 2018

Oh, now that I've had some coffee

😄

it occurred to me that by server-side parsing you meant checking whether the content within the classic block in Gutenberg is the same as the content in the classic editor. I've just compared both, and yes, the content is equal.

@nosolosw, actually I meant whether PHP's gutenberg_parse_blocks ever acts up with your largest posts. At its simplest, this means: when loading the front-end for a post, does anything ever fail? In more detail, this could mean examining the output of gutenberg_parse_blocks (called in do_blocks) for anything unexpected.


8 participants