Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EZP-29289: Migrating ezxmltext with invalid name or id attributes #47

Merged
merged 2 commits into from
Jun 21, 2018

Conversation

vidarl
Copy link
Member

@vidarl vidarl commented Jun 15, 2018

This PR fixes EZP-29289

It is not 100% bulletproof though;

  • It checks if first character in the attribute's value is white listed. It does not check the remaining characters. It would be possible to fix that, but it would require yet another XPath pass I believe.
  • Also, the whitelist may not be 100% complete

But I think this is good enought for now

@vidarl vidarl changed the base branch from master to move_xsl_from_kernel June 15, 2018 12:09
@vidarl vidarl changed the base branch from move_xsl_from_kernel to master June 15, 2018 12:12
@vidarl vidarl requested review from adamwojs and andrerom June 15, 2018 12:15
'disable-id-value-check',
null,
InputOption::VALUE_NONE,
'Disable the check for non-validating id/name values. This might increase execution time on large databases'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decrease? As in it might speeds up execution by skipping this right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


/*
// For debugging
$records = $logHandler->getRecords();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Member Author

@vidarl vidarl Jun 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For convenience, I would prefer it to remain there, and this is in a test so I thought it might be okay.

But I may also remove it if needed

Copy link
Contributor

@andrerom andrerom Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we instead mock, that would cover the need for that afaik.

Copy link
Contributor

@andrerom andrerom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 besides the two nitpicks above

@@ -17,6 +17,8 @@
use eZ\Publish\API\Repository\LocationService;
use eZ\Publish\API\Repository\Values\Content\ContentInfo;
use eZ\Publish\API\Repository\Values\Content\Location;
use Monolog\Handler\TestHandler;
use Monolog\Logger;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test/null logger available in PSR package to avoid relying on implementation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a null logger, but no test logger in PSR

Copy link
Contributor

@andrerom andrerom Jun 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you could mock LoggerInterface here as well right? or similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added mock as suggested and testes that the log messages are actually correct.

It got a bit more complicated that you might suspected as the log message in some cases contained ids generated randomly by XSLT...

$this->expectedLogWarningWithWildcard[] = $line;
$loggerStub->expects($this->at($logNo++))
->method('warning')
->with($this->callback(array($this, 'validatelogWarningWithWildcard')));
Copy link
Contributor

@andrerom andrerom Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure what the * log lines do here, but might be relevant:
This could have been also the following to avoid using stageful $this->expectedLogWarningWithWildcard property:

->with($this->callback(function($logMessage) use($expectedLogLines) {
    $this->assert(...);

    return true;
}));

Copy link
Contributor

@andrerom andrerom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, but might be improvement potential on the callback usage.

@vidarl
Copy link
Member Author

vidarl commented Jun 20, 2018

Fixed callback usage as suggested. Thanx

/**
* @var []
*/
private $expectedLogWarningWithWildcard = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you can remove this now then ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

continue;
}
if (strpos($expectedLogLine, '*') !== false) {
//$this->expectedLogWarningWithWildcard[] = $line;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@andrerom
Copy link
Contributor

andrerom commented Jun 20, 2018

This looks great now to me.
@adamwojs or @kmadejski, could one of you have a look and review? ;)

$newValue = 'rewrite_' . $node->attributes->getNamedItem('id')->nodeValue;
$newValue = preg_replace("/[^$whitelist]/", '_', $newValue);
$node->attributes->getNamedItem('id')->nodeValue = $newValue;
if ($this->logger !== null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this condition. That's why we using NullLogger 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Previosly, the logger could actually be null, but that is not the case anymore.

Fixed.

@@ -163,6 +163,34 @@ protected function reportNonUniqueIds(DOMDocument $document, $contentFieldId)
}
}

protected function ValidateAttributeValues(DOMDocument $document, $contentFieldId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CS: Please use camelCase for method names

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -320,7 +348,7 @@ protected function checkEmptyEmbedTags(DOMDocument $inputDocument)
* @param null|int $contentFieldId
* @return string
*/
public function convert(DOMDocument $inputDocument, $checkDuplicateIds = false, $contentFieldId = null)
public function convert(DOMDocument $inputDocument, $checkDuplicateIds = false, $checkIdValues = false, $contentFieldId = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Missing checkIdValues parameter in docblock

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!. Fixed


$whitelist = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_-';
$replaceStr = '';
$nodes = $xpath->query("//*[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a comment explaining whats going on here 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe... explanation given...

@vidarl
Copy link
Member Author

vidarl commented Jun 21, 2018

@adam: All comments accommodated I believe.. Please have a look

I'll rebase PR once you are done

* #1 not starting with a..z or '_'
* #2 not a..z, '0..9', '_' or '-' after 1st character
* So, no xpath v2 to our disposal...
* 1st line : we check the 1st char(substring) in id, converts it to 'a' if it in whitelist(translate), then check if it string now starts with 'a'(starts-with), then we invert result(not)
Copy link
Member

@kmadejski kmadejski Jun 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this comment! 🙂

@vidarl vidarl merged commit 63b05cd into master Jun 21, 2018
@vidarl vidarl deleted the fix_ezp29289 branch June 21, 2018 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants