HTML API: Parse doctypes and set full parser quirks mode correctly #7195

Closed
Changes from 22 commits
52 commits
42be2f6
Stop on DOCTYPE tokens in next_token
sirreal Aug 14, 2024
d41d9b3
Handle doctype tokens in html5lib-tests
sirreal Aug 14, 2024
17aff1d
Add get_doctype_name method to tag processor
sirreal Aug 14, 2024
6d73db8
Add missing-whitespace-before-doctype-name test
sirreal Aug 14, 2024
7550d10
Allow gotos in tag processor
sirreal Aug 14, 2024
2003f1d
Add get_compat_mode method
sirreal Aug 14, 2024
a36ceeb
WIP parsing doctypes
sirreal Aug 14, 2024
150d18c
Handle system doctype
sirreal Aug 14, 2024
16df2cd
Handle public-id only + whitespace
sirreal Aug 14, 2024
7217cc7
lint
sirreal Aug 14, 2024
b5c5cfa
Update html5lib-tests to get doctype name, publicid, systemid
sirreal Aug 14, 2024
deafee4
Parsing doctypes and handling quirks mode correctly
sirreal Aug 16, 2024
d359997
Fix logic error when parsing public and system identifiers
sirreal Aug 16, 2024
8d22b60
Disable pre-body failing whitespace text test
sirreal Aug 16, 2024
cda6f58
Fix multiline quote
sirreal Aug 16, 2024
5aeadf2
Scaffold doctype info class
sirreal Aug 16, 2024
601f53b
Return DOCTYPE info class from get_doctype_info method
sirreal Aug 16, 2024
edba9e1
Move quirks detect to get_compatibility_mode method
sirreal Aug 16, 2024
8250bcb
Remove get_compat_mode method from processor class
sirreal Aug 16, 2024
fee4b70
Always return string on doctype attribute lookups
sirreal Aug 16, 2024
8ea4451
Update tests to use get_doctype_info function
sirreal Aug 16, 2024
2ffff8d
Update tests to use doctype info
sirreal Aug 16, 2024
edac23f
Update test ticket number
sirreal Aug 16, 2024
f38fe1c
Add "quirks mode" to "anything else" initial mode
sirreal Aug 16, 2024
65cca88
Move doctype contents parsing into doctype_info
sirreal Aug 19, 2024
621afad
Better comments and naming
sirreal Aug 19, 2024
364d348
Improve more documentation in comments
sirreal Aug 19, 2024
7578082
Add more information to the class doc block
sirreal Aug 19, 2024
a68d5ca
Determing compat mode on initial doctype parse
sirreal Aug 19, 2024
78e8c64
Add more info about compatibility mode property strings
sirreal Aug 19, 2024
ef734be
Refactor doctype info class to use from_html factory
sirreal Aug 20, 2024
2a4807e
Fix equals alignment lint
sirreal Aug 20, 2024
995129b
Make doctype info properties public and add more documentation
sirreal Aug 20, 2024
8e68dd6
Update full parser compat mode from doctype handling
sirreal Aug 20, 2024
c59993a
Add readonly notes to doctype info properties
sirreal Aug 20, 2024
3bda3f5
Add newline normalization and null byte replacement
sirreal Aug 20, 2024
dd9cb57
Update tests to use direct property access
sirreal Aug 20, 2024
122e393
Fix off-by-one error on minimum length
sirreal Aug 20, 2024
bab37e5
Update missing doctype name test to use null
sirreal Aug 20, 2024
aa1912f
Move DOCTYPE tests to specific file
sirreal Aug 20, 2024
271681f
Fix lint
sirreal Aug 21, 2024
2a424e9
Merge branch 'trunk' into html-api/full-parser-doctype-quirks-mode-ha…
sirreal Aug 21, 2024
f76e4eb
Remove redundant extra argument in html5lib test helper
sirreal Aug 21, 2024
907987e
Check for undefined doctype identifiers in html5lib test trees
sirreal Aug 21, 2024
452a98c
Remove test default argument that can't be used
sirreal Aug 21, 2024
c44a0b6
Remove test arguments from removed dataProvider
sirreal Aug 21, 2024
4ab8c64
Update doctype comments
sirreal Aug 21, 2024
0cb32e9
Final pass on documentation comments
sirreal Aug 21, 2024
84c1faa
Documentating and naming updates.
dmsnell Aug 22, 2024
3c4aa1d
Add optimization for normative HTML DOCTYPE declaration.
dmsnell Aug 22, 2024
fe6cac9
Merge remote-tracking branch 'upstream/trunk' into html-api/full-pars…
dmsnell Aug 22, 2024
3db7230
Merge remote-tracking branch 'upstream/trunk' into html-api/full-pars…
dmsnell Aug 23, 2024
1 change: 1 addition & 0 deletions phpcs.xml.dist
@@ -262,6 +262,7 @@
		in the parsing, and distance the code from its standard. -->
	<rule ref="Generic.PHP.DiscourageGoto.Found">
		<exclude-pattern>/wp-includes/html-api/class-wp-html-processor\.php</exclude-pattern>
		<exclude-pattern>/wp-includes/html-api/class-wp-html-tag-processor\.php</exclude-pattern>
	</rule>

	<!-- Exclude sample config from modernization to prevent breaking CI workflows based on WP-CLI scaffold.
269 changes: 269 additions & 0 deletions src/wp-includes/html-api/class-wp-html-doctype-info.php
@@ -0,0 +1,269 @@
<?php
/**
 * HTML API: WP_HTML_Doctype_Info class
 *
 * @package WordPress
 * @subpackage HTML-API
 * @since 6.7.0
 */

/**
 * Core class used by the HTML API processor to represent a DOCTYPE declaration.
 *
 * @since 6.7.0
 */
class WP_HTML_Doctype_Info {
	/**
	 * The name of the DOCTYPE.
	 *
	 * @var string|null
	 */
	private $name;

	/**
	 * The public identifier of the DOCTYPE.
	 *
	 * @var string|null
	 */
	private $public_identifier;

	/**
	 * The system identifier of the DOCTYPE.
	 *
Member

I'm just guessing but I bet a lot could be added to these properties and this class to educate people on the value (or danger) of these parts of the DOCTYPE. Just a comment saying something like "This exists mostly to be able to represent and copy HTML from one document to another, but it's best to use only the standard <!DOCTYPE html> declaration for HTML."

just doodling

	 * @var string|null
	 */
	private $system_identifier;

	/**
	 * Whether the DOCTYPE token force-quirks flag is set.
	 *
	 * @var bool
	 */
	private $force_quirks_flag;
Member

perhaps $forces_quirks_flag might align better with the nuance? this declaration forces a document into quirks mode

Member Author

I'd prefer to keep this $force_quirks_flag which aligns with the spec.

Member Author

I've re-read some of your other related comments and I think I agree after all. I'm going to re-work some of this to establish the resulting quirks mode on construction instead of re-running the algorithm in the get_quirks_mode function.


	/**
	 * Constructor.
	 *
	 * The arguments to this constructor correspond to the "DOCTYPE token" as defined in the
	 * HTML specification.
	 *
	 * > DOCTYPE tokens have a name, a public identifier, a system identifier,
	 * > and a force-quirks flag.
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#tokenization
	 *
	 * @since 6.7.0
	 *
	 * @param string|null $name              The name of the DOCTYPE.
	 * @param string|null $public_identifier The public identifier of the DOCTYPE.
	 * @param string|null $system_identifier The system identifier of the DOCTYPE.
	 * @param bool        $force_quirks_flag Whether the DOCTYPE token force-quirks flag is set.
	 */
	public function __construct(
		?string $name,
		?string $public_identifier,
		?string $system_identifier,
		bool $force_quirks_flag
	) {
Member

I don't love this constructor even though it's probably fine with @access private

Did you consider moving the parsing code inside of it? We could replace this with a static creator method like createFromString( $html, $at, $length ) and parse everything within this class.

Member

moving the parse of the token into this class could help resolve the force-quirks-flag information visibility problem as well.

when constructing, as soon as force-quirks-flag is activated, we can stop parsing and set $indicates_document_mode = 'quirks'; and so the four properties can be fully: name, public identifier, system identifier, indicated document mode.

Member Author

I moved the parsing into the constructor, made it access-private, and added some notes that it should really be constructed from a doctype token's get_modifiable_text.

The parser now expects what's "in" the doc type, and assumes that whatever it is passed is correct.

These classes become quite closely coupled. That may be OK at this time given the narrow usefulness of this class (it's practically internal) and the documentation.

Curious to hear your thoughts.

Member

instead of tightly coupling these and recreating some of the pains that Core already has, such as the $attr attribute list from HTML in wp_kses_hair(), I think it would be worth it to double-parse the DOCTYPE tag in its entirety. We can pass the full token into the class and reparse, as parsing the DOCTYPE declaration is much easier than parsing a tag name, and should only happen at most once during the parse of a document, even if multiple tokens appear (since we can ignore the successive tokens).

Member Author

I almost passed the full token and then changed direction after looking at get_modifiable_text.

  • get_modifiable_text handles newline normalization and null bytes. This isn't a big deal to duplicate.
  • If there were pending updates from set_modifiable_text, we'd miss them because we'd be getting the "raw" token.

Thinking about that more, I think it's OK to miss updates because I don't think there should be any updates. Changing the doctype seems very dangerous because it has the potential to alter the document's compat mode which could affect how it's parsed.

I do have concerns about the failure mode. What happens if we don't get a valid doctype string? We can return a quirks mode doctype instance with all empty strings, but it would be nice to indicate somehow that the provided string wasn't a valid doctype. Would it be OK to throw an exception in that case given this constructor is @access private and we only expect to ever receive a valid DOCTYPE string from one of the processor classes?

Member

thankfully we don't currently support setting modifiable text for doctype declarations.

Member

> I do have concerns about the failure mode

it seems like if we can't parse it then we can't make any judgement about the document mode because it's not a real DOCTYPE declaration token. returning null and ignoring it seems fitting.

		$this->name              = $name;
		$this->public_identifier = $public_identifier;
		$this->system_identifier = $system_identifier;
		$this->force_quirks_flag = $force_quirks_flag;
	}
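
The thread above converges on two ideas: decide the indicated mode once, at construction, and return null when the input is not a DOCTYPE token at all. The following is a minimal sketch of that shape, not the code in this diff. `Sketch_Doctype_Info`, `from_doctype_token()`, and `$indicated_compatibility_mode` are hypothetical names, and the sketch assumes it receives the full token text. Public and system identifiers, and every quirks rule beyond the name check, are deliberately left out.

```php
<?php
/**
 * Hypothetical sketch: a DOCTYPE record built by a static factory that
 * returns null when the input is not a DOCTYPE token.
 */
final class Sketch_Doctype_Info {
	/** @var string|null Lowercased DOCTYPE name, or null when missing. */
	public $name;

	/** @var string Either "quirks" or "no-quirks" in this abbreviated sketch. */
	public $indicated_compatibility_mode;

	private function __construct( ?string $name, bool $force_quirks_flag ) {
		$this->name = $name;

		// Decide the mode once, at construction, instead of on every lookup.
		$this->indicated_compatibility_mode =
			( $force_quirks_flag || 'html' !== $name ) ? 'quirks' : 'no-quirks';
	}

	public static function from_doctype_token( string $html ): ?self {
		$length = strlen( $html );

		// The shortest possible token is `<!DOCTYPE>`, which is 10 bytes.
		if (
			$length < 10 ||
			0 !== substr_compare( $html, '<!DOCTYPE', 0, 9, true ) ||
			'>' !== $html[ $length - 1 ]
		) {
			return null;
		}

		// Everything between `<!DOCTYPE` and the closing `>`.
		$inner = trim( (string) substr( $html, 9, -1 ), " \t\n\f\r" );
		if ( '' === $inner ) {
			// A missing name sets the force-quirks flag.
			return new self( null, true );
		}

		// The name runs up to the first whitespace byte.
		$name = strtolower( substr( $inner, 0, strcspn( $inner, " \t\n\f\r" ) ) );

		return new self( $name, false );
	}
}
```

With this shape a caller checks for null once and otherwise reads plain properties; no force-quirks flag needs to be exposed.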

	/**
	 * Gets the name of the DOCTYPE.
	 *
	 * @since 6.7.0
	 *
	 * @return string The name of the DOCTYPE.
	 */
	public function get_name(): string {
Member

I almost think it would be more valuable to represent the absence of these identifiers, just as get_attribute() does. the doctype declaration can have an empty name and it can lack a name, I guess. maybe this is more about the system and public identifier.

if I were writing a method to copy the token from a source document into a normalized output, however, I would want to know the difference.

I also find it unlikely that userspace code is going to be doing much more with this than possibly copying HTML. I would like to see what people want or expect to do with the information, but it's not like we see people parsing DOCTYPE tokens already using the existing tools available

		return $this->name ?? '';
	}
Member

given that this basically serves as a record class with an intelligent constructor, do you see much value in hiding these behind private?

what if the properties were all public and set upon instantiation, including empty strings when not set?

Member Author

I don't think it's OK to set these properties, that's one reason to use getters.

Member Author

There's also the null/"" difference in some of the quirks mode rules, but that's less relevant since I extracted the quirks mode algorithm to another method.

Member

it's not okay to set them, but also seems needless to recreate state-oriented interfaces that do nothing. I know "once public always public" occurs here, but I also see these as being closed to extension or updates.

we can also document them saying "don't change these" but I don't know why someone would when the class doesn't do anything else. it doesn't support serialization and it shouldn't be re-parsing itself to change the suggested document mode.

in essence, they are read only unless someone changes them and re-reads them, and I'm not sure what the point of that would be.
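
To illustrate why the null/"" distinction raised in these threads matters for copying a DOCTYPE into normalized output, here is a rough sketch. `Sketch_Doctype_Record`, `sketch_quote_identifier()`, and `sketch_serialize_doctype()` are hypothetical names and not part of this PR. The sketch assumes null means "missing" and an empty string means "present but empty".

```php
<?php
/**
 * Hypothetical record with public properties that are read-only by convention.
 * A null identifier means "missing"; '' means "present but empty".
 */
final class Sketch_Doctype_Record {
	/** @var string|null */
	public $name;

	/** @var string|null */
	public $public_identifier;

	/** @var string|null */
	public $system_identifier;

	public function __construct( ?string $name, ?string $public_identifier, ?string $system_identifier ) {
		$this->name              = $name;
		$this->public_identifier = $public_identifier;
		$this->system_identifier = $system_identifier;
	}
}

function sketch_quote_identifier( string $identifier ): string {
	// An identifier parsed from a double-quoted literal cannot contain `"`,
	// and one parsed from a single-quoted literal cannot contain `'`.
	$quote = false === strpos( $identifier, '"' ) ? '"' : "'";

	return $quote . $identifier . $quote;
}

function sketch_serialize_doctype( Sketch_Doctype_Record $doctype ): string {
	$html = '<!DOCTYPE';

	if ( null !== $doctype->name ) {
		$html .= ' ' . $doctype->name;
	}

	if ( null !== $doctype->public_identifier ) {
		$html .= ' PUBLIC ' . sketch_quote_identifier( $doctype->public_identifier );
		if ( null !== $doctype->system_identifier ) {
			$html .= ' ' . sketch_quote_identifier( $doctype->system_identifier );
		}
	} elseif ( null !== $doctype->system_identifier ) {
		$html .= ' SYSTEM ' . sketch_quote_identifier( $doctype->system_identifier );
	}

	return $html . '>';
}
```

If the getters collapsed null into an empty string, the first two calls in a round trip such as `<!DOCTYPE html>` and `<!DOCTYPE html PUBLIC "">` could not be told apart.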


	/**
	 * Gets the public identifier of the DOCTYPE.
	 *
	 * @since 6.7.0
	 *
	 * @return string The public identifier of the DOCTYPE.
	 */
	public function get_public_identifier(): string {
		return $this->public_identifier ?? '';
	}

	/**
	 * Gets the system identifier of the DOCTYPE.
	 *
	 * @since 6.7.0
	 *
	 * @return string The system identifier of the DOCTYPE.
	 */
	public function get_system_identifier(): string {
		return $this->system_identifier ?? '';
	}

	/**
	 * Determines the resulting document compatibility mode for this DOCTYPE.
	 *
	 * When a DOCTYPE appears in the appropriate place in a document, its contents determine
	 * the compatibility mode of the document. This implements an algorithm described in the
	 * HTML standard for handling a DOCTYPE token in the "initial" insertion mode.
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#the-initial-insertion-mode
	 *
	 * @since 6.7.0
	 *
	 * @return string A string indicating the resulting quirks mode. One of "quirks",
	 *                "limited-quirks", or "no-quirks".
	 */
	public function get_compatibility_mode(): string {
		/*
		 * > A system identifier whose value is the empty string is not considered missing for the
		 * > purposes of the conditions above.
		 */
		$system_identifier_is_missing = null === $this->system_identifier;

		/*
		 * > The system identifier and public identifier strings must be compared to the values
		 * > given in the lists above in an ASCII case-insensitive manner. A system identifier whose
		 * > value is the empty string is not considered missing for the purposes of the conditions above.
		 */
		$public_identifier = null === $this->public_identifier ? '' : strtolower( $this->public_identifier );
		$system_identifier = null === $this->system_identifier ? '' : strtolower( $this->system_identifier );

		/*
		 * > [If] the DOCTYPE token matches one of the conditions in the following list, then set
		 * > the Document to quirks mode:
		 */

		// > The force-quirks flag is set to on.
		if ( $this->force_quirks_flag ) {
			return 'quirks';
		}

		// > The name is not "html".
		if ( 'html' !== $this->name ) {
			return 'quirks';
		}

		// > The public identifier is set to…
		if (
			'-//w3o//dtd w3 html strict 3.0//en//' === $public_identifier ||
			'-/w3c/dtd html 4.0 transitional/en' === $public_identifier ||
			'html' === $public_identifier
		) {
			return 'quirks';
		}

		// > The system identifier is set to…
		if ( 'http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd' === $system_identifier ) {
			return 'quirks';
		}
		/*
		 * All of the following conditions depend on matching the public identifier.
		 * If the public identifier is missing or empty, none of the following conditions will match.
		 */
		if ( '' === $public_identifier ) {
			return 'no-quirks';
		}

		// > The public identifier starts with…
		if (
			str_starts_with( $public_identifier, '+//silmaril//dtd html pro v0r11 19970101//' ) ||
			str_starts_with( $public_identifier, '-//as//dtd html 3.0 aswedit + extensions//' ) ||
			str_starts_with( $public_identifier, '-//advasoft ltd//dtd html 3.0 aswedit + extensions//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.0 level 1//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.0 level 2//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.0 strict level 1//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.0 strict level 2//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.0 strict//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.0//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 2.1e//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 3.0//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 3.2 final//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 3.2//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html 3//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html level 0//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html level 1//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html level 2//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html level 3//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html strict level 0//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html strict level 1//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html strict level 2//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html strict level 3//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html strict//' ) ||
			str_starts_with( $public_identifier, '-//ietf//dtd html//' ) ||
			str_starts_with( $public_identifier, '-//metrius//dtd metrius presentational//' ) ||
			str_starts_with( $public_identifier, '-//microsoft//dtd internet explorer 2.0 html strict//' ) ||
			str_starts_with( $public_identifier, '-//microsoft//dtd internet explorer 2.0 html//' ) ||
			str_starts_with( $public_identifier, '-//microsoft//dtd internet explorer 2.0 tables//' ) ||
			str_starts_with( $public_identifier, '-//microsoft//dtd internet explorer 3.0 html strict//' ) ||
			str_starts_with( $public_identifier, '-//microsoft//dtd internet explorer 3.0 html//' ) ||
			str_starts_with( $public_identifier, '-//microsoft//dtd internet explorer 3.0 tables//' ) ||
			str_starts_with( $public_identifier, '-//netscape comm. corp.//dtd html//' ) ||
			str_starts_with( $public_identifier, '-//netscape comm. corp.//dtd strict html//' ) ||
			str_starts_with( $public_identifier, "-//o'reilly and associates//dtd html 2.0//" ) ||
			str_starts_with( $public_identifier, "-//o'reilly and associates//dtd html extended 1.0//" ) ||
			str_starts_with( $public_identifier, "-//o'reilly and associates//dtd html extended relaxed 1.0//" ) ||
			str_starts_with( $public_identifier, '-//sq//dtd html 2.0 hotmetal + extensions//' ) ||
			str_starts_with( $public_identifier, '-//softquad software//dtd hotmetal pro 6.0::19990601::extensions to html 4.0//' ) ||
			str_starts_with( $public_identifier, '-//softquad//dtd hotmetal pro 4.0::19971010::extensions to html 4.0//' ) ||
			str_starts_with( $public_identifier, '-//spyglass//dtd html 2.0 extended//' ) ||
			str_starts_with( $public_identifier, '-//sun microsystems corp.//dtd hotjava html//' ) ||
			str_starts_with( $public_identifier, '-//sun microsystems corp.//dtd hotjava strict html//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 3 1995-03-24//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 3.2 draft//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 3.2 final//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 3.2//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 3.2s draft//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 4.0 frameset//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html 4.0 transitional//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html experimental 19960712//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd html experimental 970421//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd w3 html//' ) ||
			str_starts_with( $public_identifier, '-//w3o//dtd w3 html 3.0//' ) ||
			str_starts_with( $public_identifier, '-//webtechs//dtd mozilla html 2.0//' ) ||
			str_starts_with( $public_identifier, '-//webtechs//dtd mozilla html//' )
		) {
			return 'quirks';
		}

		// > The system identifier is missing and the public identifier starts with…
		if (
			$system_identifier_is_missing && (
				str_starts_with( $public_identifier, '-//w3c//dtd html 4.01 frameset//' ) ||
				str_starts_with( $public_identifier, '-//w3c//dtd html 4.01 transitional//' )
			)
		) {
			return 'quirks';
		}

		/*
		 * > Otherwise, [if] the DOCTYPE token matches one of the conditions in
		 * > the following list, then set the Document to limited-quirks mode.
		 */

		// > The public identifier starts with…
		if (
			str_starts_with( $public_identifier, '-//w3c//dtd xhtml 1.0 frameset//' ) ||
			str_starts_with( $public_identifier, '-//w3c//dtd xhtml 1.0 transitional//' )
		) {
			return 'limited-quirks';
		}

		// > The system identifier is not missing and the public identifier starts with…
		if (
			! $system_identifier_is_missing && (
				str_starts_with( $public_identifier, '-//w3c//dtd html 4.01 frameset//' ) ||
				str_starts_with( $public_identifier, '-//w3c//dtd html 4.01 transitional//' )
			)
		) {
			return 'limited-quirks';
		}

		return 'no-quirks';
	}
}
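
For readers skimming the long identifier list, the decision order in `get_compatibility_mode()` can be condensed into a standalone sketch. This is an abbreviated illustration, not a replacement: `sketch_compatibility_mode()` is a hypothetical name, only a handful of the identifier prefixes are included, and `strncmp()` stands in for `str_starts_with()` so the snippet does not assume PHP 8 or the WordPress polyfill.

```php
<?php
/**
 * Abbreviated sketch of the "initial" insertion mode DOCTYPE rules.
 * Only a few of the identifier prefixes from the full list are included.
 */
function sketch_compatibility_mode( ?string $name, ?string $public_id, ?string $system_id, bool $force_quirks ): string {
	if ( $force_quirks || 'html' !== $name ) {
		return 'quirks';
	}

	// An empty system identifier is present, not missing; only null is missing.
	$system_is_missing = null === $system_id;

	// Identifiers are compared ASCII case-insensitively.
	$public = strtolower( $public_id ?? '' );

	$starts_with = static function ( string $prefix ) use ( $public ): bool {
		return 0 === strncmp( $public, $prefix, strlen( $prefix ) );
	};

	// Exact match and prefix match that always indicate quirks mode.
	if ( 'html' === $public || $starts_with( '-//w3c//dtd html 3.2//' ) ) {
		return 'quirks';
	}

	// HTML 4.01 Frameset/Transitional: the system identifier decides the mode.
	if (
		$starts_with( '-//w3c//dtd html 4.01 frameset//' ) ||
		$starts_with( '-//w3c//dtd html 4.01 transitional//' )
	) {
		return $system_is_missing ? 'quirks' : 'limited-quirks';
	}

	if (
		$starts_with( '-//w3c//dtd xhtml 1.0 frameset//' ) ||
		$starts_with( '-//w3c//dtd xhtml 1.0 transitional//' )
	) {
		return 'limited-quirks';
	}

	return 'no-quirks';
}
```

The HTML 4.01 branch is the one place where the same public identifier yields two different modes, which is why the class tracks whether the system identifier is missing rather than only its string value.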
29 changes: 20 additions & 9 deletions src/wp-includes/html-api/class-wp-html-processor.php
@@ -459,6 +459,24 @@ private function bail( string $message ) {
		throw $this->unsupported_exception;
	}

	/**
	 * Sets the document compatibility mode (quirks or no-quirks) based on a DOCTYPE declaration.
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#parser-cannot-change-the-mode-flag
	 *
	 * @since 6.7.0
	 */
	private function update_document_mode_from_doctype(): void {
		$doctype = $this->get_doctype_info();
		if ( null === $doctype ) {
			return;
		}

		if ( 'quirks' === $doctype->get_compatibility_mode() ) {
Member
dmsnell Aug 19, 2024

this reads a bit awkward to me. also one of the reasons I find direct property access helpful. the DOCTYPE token doesn't specify a document's compatibility mode - it can only suggest it when none is already provided or forced, right?

so we might say "this token indicates quirks mode" or "this token indicates no-quirks mode"

if ( 'quirks' === $doctype->indicated_compatability_mode ) { }

but on this note do we want to constrain this function to only operate on a DOCTYPE token? I would imagine this could lead to people needing to try and push their own doctype declaration into existing HTML, when they could set this instead

$processor->set_compatability_mode( WP_HTML_Tag_Processor::QUIRKS_MODE );
$processor->set_compatability_mode( WP_HTML_Tag_Processor::NO_QUIRKS_MODE );

Although this does in a sense expose an opportunity for a foot-gun, for someone to change the compatibility mode randomly, it doesn't force new rules onto the Tag Processor that I don't think fully exist there. we're not going to have a token in the fragment case you mention, so maybe we analyze the parsed DOCTYPE in the HTML Processor and then change modes in the Tag Processor via method call.

we can even _doing_it_wrong() if setting the compatibility mode after having already set it, which will send an error in debug modes but won't take down the site.

			$this->state->document_mode = WP_HTML_Processor_State::QUIRKS_MODE;
		}
	}
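
The review suggestion above, setting the mode through a method call and warning when it is set twice, could look roughly like the following. This is a sketch under stated assumptions: `Sketch_Processor`, its constants, and their string values are hypothetical, and `trigger_error()` stands in for WordPress's `_doing_it_wrong()` so the snippet runs outside WordPress.

```php
<?php
/**
 * Hypothetical sketch of a compatibility-mode setter that can only be
 * applied once. A second call raises a notice and leaves the mode unchanged.
 */
final class Sketch_Processor {
	const NO_QUIRKS_MODE = 'no-quirks-mode';
	const QUIRKS_MODE    = 'quirks-mode';

	/** @var string */
	private $compatibility_mode = self::NO_QUIRKS_MODE;

	/** @var bool */
	private $mode_was_set = false;

	public function set_compatibility_mode( string $mode ): bool {
		if ( self::QUIRKS_MODE !== $mode && self::NO_QUIRKS_MODE !== $mode ) {
			return false;
		}

		if ( $this->mode_was_set ) {
			// In WordPress this would be a `_doing_it_wrong()` call, which
			// reports in debug mode without taking down the site.
			trigger_error( 'The compatibility mode was already set.', E_USER_NOTICE );
			return false;
		}

		$this->compatibility_mode = $mode;
		$this->mode_was_set       = true;

		return true;
	}

	public function get_compatibility_mode(): string {
		return $this->compatibility_mode;
	}
}
```

In this arrangement the HTML Processor would inspect the parsed DOCTYPE and call the setter, so fragment parsing, where no DOCTYPE token exists, could set the mode the same way.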

	/**
	 * Returns the last error, if any.
	 *
@@ -1076,19 +1094,12 @@ private function step_initial(): bool {
			 * > A DOCTYPE token
			 */
			case 'html':
				$contents = $this->get_modifiable_text();
				if ( ' html' !== $contents ) {
					/*
					 * @todo When the HTML Tag Processor fully parses the DOCTYPE declaration,
					 *       this code should examine the contents to set the compatability mode.
					 */
					$this->bail( 'Cannot process any DOCTYPE other than a normative HTML5 doctype.' );
				}

				$this->update_document_mode_from_doctype();
				/*
				 * > Then, switch the insertion mode to "before html".
				 */
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
				$this->insert_html_element( $this->state->current_token );
				return true;
		}
