
fix: response values encoding to support compatibility with XML when generating the results #404

Merged
merged 2 commits into from
Nov 6, 2024

Conversation

kilatib
Contributor

@kilatib kilatib commented Oct 28, 2024

Allow special HTML symbols TR-6347

Description

This PR decodes special HTML symbols in response values before they are transformed into an XML object, to avoid the following error:

SimpleXMLElement::__construct(): <value>160\u00b4\b<\/value>
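
The failure can be reproduced in isolation. The sketch below assumes the reported value is JSON-escaped, so that it decodes to "160", an acute accent (U+00B4), and a backspace (U+0008); the `canParse` helper is hypothetical and is not part of the PR.

```php
<?php
declare(strict_types=1);

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

/**
 * Hypothetical helper: reports whether SimpleXMLElement accepts the given XML string.
 */
function canParse(string $xml): bool
{
    try {
        new SimpleXMLElement($xml);
        return true;
    } catch (Exception $e) {
        // SimpleXMLElement throws when the string is not well-formed XML.
        return false;
    }
}

// Assumed decoding of the reported value: "160", U+00B4, U+0008.
$withControlChar = "<value>160\u{00B4}\u{0008}</value>";
// The same value without the backspace character.
$withoutControlChar = "<value>160\u{00B4}</value>";

var_dump(canParse($withControlChar));
var_dump(canParse($withoutControlChar));
```

U+0008 is outside the set of characters that XML 1.0 allows, so the parser rejects the first string no matter how the markup characters are escaped.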

Note

These changes have not been fully tested, which is why this is a draft.

@kilatib kilatib requested review from wazelin and poyuki October 28, 2024 15:23
@kilatib kilatib marked this pull request as ready for review October 30, 2024 14:01
@kilatib kilatib force-pushed the fix/TR-6347/allow-special-html-symbols branch 2 times, most recently from f9dcdf8 to ff02fe6 Compare October 30, 2024 14:53
Member

@wazelin wazelin left a comment


Escaping the XML when it's already built and stored is too late.
Please escape the value here instead:

return htmlspecialchars((string)$value, ENT_XML1, 'UTF-8');
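
As a rough illustration of what that call does and does not cover, the sketch below applies `htmlspecialchars()` with `ENT_XML1` to a sample value. The sample string is made up for illustration and does not come from the PR.

```php
<?php
declare(strict_types=1);

// Sample value with markup characters and a backspace (U+0008).
$value = "1 < 2 & 3\u{0008}";

// ENT_XML1 selects XML 1 escaping rules; '<' and '&' become entities.
$escaped = htmlspecialchars($value, ENT_XML1, 'UTF-8');

echo bin2hex($escaped), PHP_EOL;

// The control character is still present after escaping.
var_dump(str_contains($escaped, "\u{0008}"));
```

The markup characters are escaped, but the control character passes through unchanged, so entity escaping alone does not make this value valid XML 1.0.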

Also, please consider applying something like this instead of the HTML-entity-based solution.

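// Character range allowed by XML 1.0 (the Char production),
// mirroring isInCharacterRange in Go's encoding/xml.
function isInCharacterRange(int $char): bool
{
    return $char == 0x09
        || $char == 0x0A
        || $char == 0x0D
        || ($char >= 0x20 && $char <= 0xD7FF)
        || ($char >= 0xE000 && $char <= 0xFFFD)
        || ($char >= 0x10000 && $char <= 0x10FFFF);
}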

function xmlSpecialChars(string $value): string
{
    $result = '';

    $last = 0;
    $length = strlen($value);
    $i = 0;

    while ($i < $length) {
        $r = mb_substr(substr($value, $i), 0, 1);
        $width = strlen($r);
        $i += $width;
        switch ($r) {
            case '"':
                $esc = '&#34;';
                break;
            case "'":
                $esc = '&#39;';
                break;
            case '&':
                $esc = '&amp;';
                break;
            case '<':
                $esc = '&lt;';
                break;
            case '>':
                $esc = '&gt;';
                break;
            case "\t":
                $esc = '&#x9;';
                break;
            case "\n":
                $esc = '&#xA;';
                break;
            case "\r":
                $esc = '&#xD;';
                break;
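            default:
                // mb_ord() returns false for an invalid byte sequence; treat that as out of range.
                $code = mb_ord($r, 'UTF-8');
                if ($code === false || !isInCharacterRange($code) || ($code === 0xFFFD && $width === 1)) {
                    $esc = "\u{FFFD}";
                    break;
                }

                continue 2;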
        }
        $result .= substr($value, $last, $i - $last - $width) . $esc;
        $last = $i;
    }
    return $result . substr($value, $last);
}

The solution was copied from the Go sources: https://go.dev/src/encoding/xml/xml.go#L1916.
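
For comparison, the same idea can be sketched more compactly with a regular expression. This is not the code merged in the PR: `sanitizeForXml` is a hypothetical name, and the character class simply restates the XML 1.0 character range (`#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]`).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the PR): replaces every code point outside
 * the XML 1.0 character range with U+FFFD, then escapes markup characters.
 */
function sanitizeForXml(string $value): string
{
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        "\u{FFFD}",
        $value
    );

    // preg_replace() returns null for invalid UTF-8 input; this sketch falls back to an empty string.
    return htmlspecialchars($clean ?? '', ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

$escaped = sanitizeForXml("160\u{00B4}\u{0008} < 200");

// The sanitized value can now be embedded in an element and parsed.
$xml = new SimpleXMLElement('<value>' . $escaped . '</value>');
echo bin2hex((string) $xml), PHP_EOL;
```

Unlike the function above, this sketch does not turn tabs and newlines into numeric character references, so it is closer to a minimal safety net than to a port of the Go routine.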

@kilatib kilatib force-pushed the fix/TR-6347/allow-special-html-symbols branch 2 times, most recently from 8dfdb0f to d93c7e0 Compare November 6, 2024 08:55
@kilatib kilatib requested a review from wazelin November 6, 2024 08:55
@kilatib
Contributor Author

kilatib commented Nov 6, 2024

Thank you for the advice. This part now appears to be working well.

@kilatib kilatib requested a review from wazelin November 6, 2024 11:34
@kilatib
Contributor Author

kilatib commented Nov 6, 2024

Replaced with the suggested solution.

@wazelin wazelin changed the base branch from master to develop November 6, 2024 11:52
@wazelin wazelin closed this Nov 6, 2024
@wazelin wazelin reopened this Nov 6, 2024
@kilatib kilatib requested a review from wazelin November 6, 2024 12:15

public function testProcessSpecialCharsetWithoutError(): void
{
$xml = ('<?xml version="1.0" encoding="UTF-8"?>
Member


I'm not sure you need the parentheses here.

</itemResult>
</assessmentResult>
');
$this->assertNotNull(Utils::findExternalNamespaces(sprintf($xml, Utils::valueAsString("160\b\u{0008}"))));
Member


Suggested change
- $this->assertNotNull(Utils::findExternalNamespaces(sprintf($xml, Utils::valueAsString("160\b\u{0008}"))));
+ $this->assertNotNull(Utils::findExternalNamespaces(sprintf($xml, Utils::valueAsString("160\u{0008}"))));
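
One plausible reason for this suggestion (the review does not state it) is that PHP double-quoted strings have no `\b` escape sequence, so `"\b"` is a literal backslash followed by `b`, while `"\u{0008}"` already produces the backspace character. A short sketch comparing the two literals:

```php
<?php
declare(strict_types=1);

// "\b" is not an escape sequence in PHP: it stays as backslash + "b".
$original = "160\b\u{0008}";
// The suggested literal contains only "160" and the backspace character.
$suggested = "160\u{0008}";

echo bin2hex($original), PHP_EOL;
echo bin2hex($suggested), PHP_EOL;
```

Both strings contain the control character under test, so dropping `\b` would keep the test's intent while removing two characters that play no part in it.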

@kilatib kilatib force-pushed the fix/TR-6347/allow-special-html-symbols branch from 000c35c to f43572f Compare November 6, 2024 12:46
@kilatib kilatib force-pushed the fix/TR-6347/allow-special-html-symbols branch from f43572f to 314a30a Compare November 6, 2024 12:47
@kilatib kilatib requested a review from wazelin November 6, 2024 12:56
@wazelin wazelin changed the title fix: allow html symbols in xml fix: response values encoding to support compatibility with XML when generating the results Nov 6, 2024
Member

@wazelin wazelin left a comment


  • New code is covered by tests (if applicable)
  • Tests are running successfully (old and new ones) on my local machine (if applicable)
  • New code is respecting code style rules
  • New code is respecting best practices
  • New code is not subject to concurrency issues (if applicable)
  • Feature is working correctly on my local machine (if applicable)
  • Acceptance criteria are respected
  • Pull request title and description are meaningful
  • Pull request's target is not master

@wazelin wazelin merged commit 6508bc9 into develop Nov 6, 2024
5 of 6 checks passed
@wazelin wazelin deleted the fix/TR-6347/allow-special-html-symbols branch November 6, 2024 13:22