Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utf8JsonWriter.WriteString do not honor escape=false #28567

Closed
ycrumeyrolle opened this issue Jan 30, 2019 · 8 comments
Closed

Utf8JsonWriter.WriteString do not honor escape=false #28567

ycrumeyrolle opened this issue Jan 30, 2019 · 8 comments
Labels
area-System.Text.Json json-functionality-doc Missing JSON specific functionality that needs documenting
Milestone

Comments

@ycrumeyrolle
Copy link

When writing an unescaped value like "jwt+secevent", the Utf8JsonWriter is honoring the contract by not escaping the value, as illustrated in the following test:

        [Fact]
        public void JsonWriter_UnescapedValue()
        {
            var output = new FixedSizedBufferWriter(100);
            var jsonUtf8 = new Utf8JsonWriter(output);
            jsonUtf8.WriteStringValue("jwt+secevent", false);
            jsonUtf8.Flush();

            string actualStr = Encoding.UTF8.GetString(output.Formatted);
            Assert.Equal(@"""jwt+secevent""", actualStr); 
            // actualStr == "jwt+secevent"
        }

When writing an unescaped value, the Utf8JsonWriter is not honoring the contract by escaping the value:

        [Fact]
        public void JsonWriter_UnescapedProperty()
        {
            var output = new FixedSizedBufferWriter(100);
            var jsonUtf8 = new Utf8JsonWriter(output);
            jsonUtf8.WriteStartObject();
            jsonUtf8.WriteString("unescaped", "jwt+secevent", false);
            jsonUtf8.WriteEndObject();
            jsonUtf8.Flush();

            string actualStr = Encoding.UTF8.GetString(output.Formatted);
            Assert.Equal(@"{""unescaped"":""jwt+secevent""}", actualStr); 
            // actualStr == "{"unescaped":"jwt\u002bsecevent"}"
        }

In this test, the '+' character is escaped to '\u002b'.

The expected behavior is to have an unescaped string when it is a JSON value and a value of a JSON property.

@ycrumeyrolle
Copy link
Author

ycrumeyrolle commented Jan 30, 2019

I haven't seen the comment:
/// The value is always escaped
It is inconsistent to always escape a value from a property, but not from an array.

@ycrumeyrolle
Copy link
Author

In addition, I do not understand the reason that the '+' is escaped. Same remark for '&', '<', '>' and '`'.
This characters are legit (I assume...)

@ahsonkhan
Copy link
Member

WriteString(string propertyName, string value, bool escape) <= the escape bool only applies to the property name, not to the value.

The value is always escaped

Right. This is by design. Usually, property names are known/constant values, so the bool escape applies to the property name only, not to the value. We decided not to provide an additional bool escapeValue argument as it didn't seem useful for common uses and complicated the api surface.

It is inconsistent to always escape a value from a property, but not from an array.

That's a good point. I don't know if there is a strong use case for writing unescaped string arrays, so for consistency, we should consider removing the escape argument on WriteStringValue.

In addition, I do not understand the reason that the '+' is escaped. Same remark for '&', '<', '>' and '`'.
This characters are legit (I assume...)

These are common Html characters that need to be escaped since JSON can end up embedded within html web pages.

cc @GrabYourPitchforks

@stephentoub
Copy link
Member

the escape bool only applies to the property name, not to the value.

Seems like the parameter should be renamed then to avoid any ambiguity? e.g. escape => escapeName

@ycrumeyrolle
Copy link
Author

ycrumeyrolle commented Feb 5, 2019

Right. This is by design. Usually, property names are known/constant values, so the bool escape applies to the property name only, not to the value. We decided not to provide an additional bool escapeValue argument as it didn't seem useful for common uses and complicated the api surface.

Can't we imagine a new property on the JsonWriterOptions like IgnoreValueEscaping ?
It won't change the surface API, and may provide more performance if the values are safe.
Nevertheless I'd rather prefer to have the control over each property. This would just add a bool to 9 methods:

        public void WriteString(string propertyName, string value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(string propertyName, ReadOnlySpan<char> value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(string propertyName, ReadOnlySpan<byte> utf8Value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(ReadOnlySpan<char> propertyName, string value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(ReadOnlySpan<char> propertyName, ReadOnlySpan<char> value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(ReadOnlySpan<char> propertyName, ReadOnlySpan<byte> utf8Value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(ReadOnlySpan<byte> utf8PropertyName, string value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(ReadOnlySpan<byte> utf8PropertyName, ReadOnlySpan<char> value, bool escapeName = true, bool escapeValue = true);
        public void WriteString(ReadOnlySpan<byte> utf8PropertyName, ReadOnlySpan<byte> utf8Value, bool escapeName = true, bool escapeValue = true);

These are common Html characters that need to be escaped since JSON can end up embedded within html web pages.

The security consideration is no clear to me. Utf8JsonWriter a low-level component. You cannot assume how is consumed the produced JSON. And escaping the JSON is not a very efficient way to protect against XSS.
And if I would like to escape, may be that HTML escaping is not enough. URL escaping, CSS escaping, JavaScript escaping, Attributes escaping....
IMHO, this JSON writer should only do JSON escaping as defined by the RFC.

@ccnxpt
Copy link

ccnxpt commented Feb 10, 2019

I just tried Utf8JsonWriter.WriteString(name, value, false), then

Original value = "春夏秋冬" (12bytes)
Escaped value = "\u6625\u590f\u79cb\u51ac" (24bytes)

Is this by design? Too long and unreadable.

@ahsonkhan ahsonkhan self-assigned this Jun 4, 2019
@steveharter steveharter self-assigned this Jul 1, 2019
@steveharter
Copy link
Member

Closing this as https://github.com/dotnet/corefx/issues/37192 addresses this by passing a custom escaper to the writer.

@taadis
Copy link

taadis commented Sep 27, 2019

it work!!!

object jsonObject = new { symbol = @"~`!@#$%^&*()_-+={}[]:;'<>,.?/ " };
string aJsonString = Newtonsoft.Json.JsonConvert.SerializeObject(value: jsonObject);
string bJsonString = System.Text.Json.JsonSerializer.Serialize(
     value: jsonObject,
     options: new System.Text.Json.JsonSerializerOptions
     { 
           Encoder = System.Text.Encodings.Web.JavaScriptEncoder.UnsafeRelaxedJsonEscaping
      });
Assert.AreEqual(expected: aJsonString, actual: bJsonString);

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 3.0 milestone Feb 1, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Text.Json json-functionality-doc Missing JSON specific functionality that needs documenting
Projects
None yet
Development

No branches or pull requests

7 participants