Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal json converters cannot function independently of their original options #50205

Open
NinoFloris opened this issue Mar 24, 2021 · 14 comments

Comments

@NinoFloris
Copy link
Contributor

Internal json collection converters for instance assume JsonClassInfo.ElementInfo is not null, however it can actually be null if the converter has been pre-aquired from another options instance.

JsonClassInfo elementClassInfo = state.Current.JsonClassInfo.ElementClassInfo!;

ElementInfo returns null if ElementType is null and ElementType will only be filled under specific circumstances.

When we start at the point of our custom converter's Read() method (see below) we get the following steps to an NRE:

  • pre-aquired JsonConverter.Read will initialize its own state (readstack) with among others a call to options.GetOrAddClassForRootType(type)
  • This will then do a resolve for the type and its converter in the constructor of JsonClassInfo(type).
  • Converter, which is resolved back again to the custom converter here will have ClassType.None because all custom converters do.
  • State is now initialized with ElementType = null because it didn't fall into the ClassType.Collection arm
  • Call to OnTryRead commences with state that will never return anything but null for JsonClassInfo.ElementInfo
  • NRE thrown
using System;
using System.Collections.Immutable;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace StjRepro
{
    public enum Cases
    {
        One,
        Two,
        Three,
        Four
    }

    class JsonImmutableArrayConverter<T> : JsonConverter<ImmutableArray<T>>
    {
        JsonConverter<ImmutableArray<T>> _originalConverter;

        public JsonImmutableArrayConverter()
            => _originalConverter = (JsonConverter<ImmutableArray<T>>)new JsonSerializerOptions().GetConverter(typeof(ImmutableArray<T>));

        public override ImmutableArray<T> Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
            // We're passing the current options and not the default options because we do want our value converters to work
            // in this example, from strings to the enum 'Cases'.
            => _originalConverter.Read(ref reader, typeToConvert, options);

        public override void Write(Utf8JsonWriter writer, ImmutableArray<T> value, JsonSerializerOptions options)
        {
            throw new NotSupportedException();
        }
    }

    class JsonImmutableArrayConverter : JsonConverterFactory
    {
        public override bool CanConvert(Type typeToConvert)
            => typeToConvert.IsGenericType && typeToConvert.GetGenericTypeDefinition() == typeof(ImmutableArray<>);

        public override JsonConverter CreateConverter(Type typeToConvert, JsonSerializerOptions options)
            => (JsonConverter)Activator.CreateInstance(typeof(JsonImmutableArrayConverter<>).MakeGenericType(typeToConvert.GenericTypeArguments[0]));
    }

    class Program
    {
        static void Main(string[] args)
        {
            var options = new JsonSerializerOptions();
            options.Converters.Add(new JsonImmutableArrayConverter());
            options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase));
            JsonSerializer.Deserialize<ImmutableArray<Cases>>(@"[""one"",""two"",""three"",""four""]", options);
        }
    }
}
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Text.Json.Serialization.Converters.IEnumerableDefaultConverter`2.GetElementConverter(JsonClassInfo elementClassInfo)
   at System.Text.Json.Serialization.Converters.IEnumerableDefaultConverter`2.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, TCollection& value)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonResumableConverter`1.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at StjRepro.JsonImmutableArrayConverter`1.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](JsonConverter jsonConverter, Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](Utf8JsonReader& reader, Type returnType, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializer.Deserialize[TValue](ReadOnlySpan`1 json, Type returnType, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializer.Deserialize[TValue](String json, JsonSerializerOptions options)
   at StjRepro.Program.Main(String[] args)

The reason for pulling out an 'original' converter like this is because some code paths in a custom converter should just be able to default to existing converters, merely wrapping over them.

The only way I see to hack around this without a proper fix is passing a specially crafted options instance into the framework converter that has this custom converter removed, it will resolve the correct JsonClassInfo and includes the required custom value converters but in terms of perf (and usability) it seems far from ideal.

        public override ImmutableArray<T> Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        {
            var optionsDiff = new JsonSerializerOptions(options);
            JsonConverter factory = null;
            foreach (var converter in optionsDiff.Converters)
            {
                if (converter.GetType() == typeof(JsonImmutableArrayConverter))
                    factory = converter;
            }
            if (factory != null)
                optionsDiff.Converters.Remove(factory);

            // We're passing the current options and not the default options because we do want our value converters to work
            // in this case from strings to the enum 'Cases'.
            return _originalConverter.Read(ref reader, typeToConvert, optionsDiff);
        }

If there is some other method by which to achieve what I want, I'd be glad to use it. In any case I think its good to add some documentation on how to call into the framework converters from a custom converter, this has been something I've wanted to do (and done to various degrees of success) multiple times now.

/cc @layomia

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Text.Json untriaged New issue has not been triaged by the area owner labels Mar 24, 2021
@ghost
Copy link

ghost commented Mar 24, 2021

Tagging subscribers to this area: @eiriktsarpalis, @layomia
See info in area-owners.md if you want to be subscribed.

Issue Details

Internal json collection converters for instance assume JsonClassInfo.ElementInfo is not null, however it can actually be null if the converter has been pre-aquired from another options instance.

JsonClassInfo elementClassInfo = state.Current.JsonClassInfo.ElementClassInfo!;

ElementInfo returns null if ElementType is null and ElementType will only be filled under specific circumstances.

When we start at the point of our custom converter's Read() method (see below) we get the following steps to an NRE:

  • pre-aquired JsonConverter.Read will initialize its own state (readstack) with among others a call to options.GetOrAddClassForRootType(type)
  • This will then do a resolve for the type and its converter in the constructor of JsonClassInfo(type).
  • Converter, which is resolved back again to the custom converter here will have ClassType.None because all custom converters do.
  • State is now initialized with ElementType = null because it didn't fall into the ClassType.Collection arm
  • Call to OnTryRead commences with state that will never return anything but null for JsonClassInfo.ElementInfo
  • NRE thrown
using System;
using System.Collections.Immutable;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace StjRepro
{
    public enum Cases
    {
        One,
        Two,
        Three,
        Four
    }

    class JsonImmutableArrayConverter<T> : JsonConverter<ImmutableArray<T>>
    {
        JsonConverter<ImmutableArray<T>> _originalConverter;

        public JsonImmutableArrayConverter()
            => _originalConverter = (JsonConverter<ImmutableArray<T>>)new JsonSerializerOptions().GetConverter(typeof(ImmutableArray<T>));

        public override ImmutableArray<T> Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
            // We're passing the current options and not the default options because we do want our value converters to work
            // in this example, from strings to the enum 'Cases'.
            => _originalConverter.Read(ref reader, typeToConvert, options);

        public override void Write(Utf8JsonWriter writer, ImmutableArray<T> value, JsonSerializerOptions options)
        {
            throw new NotSupportedException();
        }
    }

    class JsonImmutableArrayConverter : JsonConverterFactory
    {
        public override bool CanConvert(Type typeToConvert)
            => typeToConvert.IsGenericType && typeToConvert.GetGenericTypeDefinition() == typeof(ImmutableArray<>);

        public override JsonConverter CreateConverter(Type typeToConvert, JsonSerializerOptions options)
            => (JsonConverter)Activator.CreateInstance(typeof(JsonImmutableArrayConverter<>).MakeGenericType(typeToConvert.GenericTypeArguments[0]));
    }

    class Program
    {
        static void Main(string[] args)
        {
            var options = new JsonSerializerOptions();
            options.Converters.Add(new JsonImmutableArrayConverter());
            options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase));
            JsonSerializer.Deserialize<ImmutableArray<Cases>>(@"[""one"",""two"",""three"",""four""]", options);
        }
    }
}
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Text.Json.Serialization.Converters.IEnumerableDefaultConverter`2.GetElementConverter(JsonClassInfo elementClassInfo)
   at System.Text.Json.Serialization.Converters.IEnumerableDefaultConverter`2.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, TCollection& value)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonResumableConverter`1.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at StjRepro.JsonImmutableArrayConverter`1.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](JsonConverter jsonConverter, Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](Utf8JsonReader& reader, Type returnType, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializer.Deserialize[TValue](ReadOnlySpan`1 json, Type returnType, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializer.Deserialize[TValue](String json, JsonSerializerOptions options)
   at StjRepro.Program.Main(String[] args)

The reason for pulling out an 'original' converter like this is because some code paths in a custom converter should just be able to default to existing converters, merely wrapping over them.

The only way I see to hack around this without a proper fix is passing a specially crafted options instance into the framework converter that has this custom converter removed, it will resolve the correct JsonClassInfo and includes the required custom value converters but in terms of perf (and usability) it seems far from ideal.

        public override ImmutableArray<T> Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        {
            var optionsDiff = new JsonSerializerOptions(options);
            JsonConverter factory = null;
            foreach (var converter in optionsDiff.Converters)
            {
                if (converter.GetType() == typeof(JsonImmutableArrayConverter))
                    factory = converter;
            }
            if (factory != null)
                optionsDiff.Converters.Remove(factory);

            // We're passing the current options and not the default options because we do want our value converters to work
            // in this case from strings to the enum 'Cases'.
            return _originalConverter.Read(ref reader, typeToConvert, optionsDiff);
        }

If there is some other method by which to achieve what I want, I'd be glad to use it. In any case I think its good to add some documentation on how to call into the framework converters from a custom converter, this has been something I've wanted to do (and done to various degrees of success) multiple times now.

/cc @layomia

Author: NinoFloris
Assignees: -
Labels:

area-System.Text.Json, untriaged

Milestone: -

@layomia
Copy link
Contributor

layomia commented Mar 24, 2021

@NinoFloris thanks! I haven't fully grokked this yet but @eiriktsarpalis and I discussed potential issues related to this just yesterday. I'll take a look.

@layomia layomia added this to the 6.0.0 milestone Mar 24, 2021
@layomia layomia self-assigned this Mar 24, 2021
@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Mar 24, 2021
@eiriktsarpalis
Copy link
Member

@NinoFloris I bumped into the very same issue yesterday and agree with your analysis. I couldn't really work around it without making changes to System.Text.Json internals so we need to fix the issue in the general case.

I don't think relying on a "default" options instance is a good solution, since while it may (in some cases) ensure that JsonClassInfo.ElementClassInfo is populated, that might not be compatible with the element converter being used by the active options instance.

Fundamentally this stems from weakness in the JsonClassInfo design, which is assumed to be converter-agnostic but whose shape depends on the current converter being used. I don't know what the best solution would be here, but it will probably require a bit of refactoring :-)

@NinoFloris
Copy link
Contributor Author

@NinoFloris I bumped into the very same issue yesterday and agree with your analysis

That's such an unlikely coincidence, amusing! ^_^

I don't think relying on a "default" options instance is a good solution

Evidently it's a hack to get what I want, though I'm not sure what your suggested alternative is? The converter override and caching process won't suddenly give me the behavior I want — getting the converter that would have been resolved if not for the custom one overriding it and getting cached — without api additions.

To 'quickly' fix the JsonClassInfo issue, I've considered a few options:

  • JsonClassInfo gets augmented with an extra constructor taking a JsonConverter, the calling converter can then flow itself down all the way starting from stack.Initialize, fixing the incorrect ClassType determination. The problem I see is these JsonClassInfo's get cached as well as far as I understand, so this won't work reliably without another fix...
  • To fix the caching issue, would it make sense to cache on not just type but a composite key of JsonClassInfo.ClassType and type? With that in place the cache can slowly fill with permutations when these edge cases show up without having instances get in the way of one another.

Obviously when going down the composite key route the slower lookup will have some unknown effect on overall perf — I'm expecting tiny? — which still needs careful analysis.

If we expect it to be unfruitful it might make sense to keep most as-is and to push resolution of these differences into JsonClassInfo itself, have it function uniformly regardless of ClassType, say starting at a certain ClassType and synthesizing data for other ClassTypes on demand, not quite sure what this would mean for JsonClassInfo.ClassType and its uses though.

Anything else seems like it would require a larger overhaul of the internals, do you have any suggestions?

Either way thanks for the quick reply!

@layomia
Copy link
Contributor

layomia commented Mar 25, 2021

A workaround here based your original approach is to create and cache a separate options instance, populated with the JsonStringEnumConverter, for the original/default ImmutableArray<T> converter to use (see https://dotnetfiddle.net/tbnB9f for full program):

class JsonImmutableArrayConverter<T> : JsonConverter<ImmutableArray<T>>
{
    JsonSerializerOptions _originalOptions;
    JsonConverter<ImmutableArray<T>> _originalConverter;

    public JsonImmutableArrayConverter()
    {
	_originalOptions = new JsonSerializerOptions()
	{
            Converters = { new JsonStringEnumConverter(JsonNamingPolicy.CamelCase) }
	};
	_originalConverter = (JsonConverter<ImmutableArray<T>>)_originalOptions.GetConverter(typeof(ImmutableArray<T>));
    }

    public override ImmutableArray<T> Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => _originalConverter.Read(ref reader, typeToConvert, _originalOptions);

    public override void Write(Utf8JsonWriter writer, ImmutableArray<T> value, JsonSerializerOptions options)
    {
        _originalConverter.Write(writer, value, _originalOptions);
    }
}

Then, the enum converter doesn't need to be specified at the root call to the serializer:

static void Main(string[] args)
{
    var options = new JsonSerializerOptions();
    options.Converters.Add(new JsonImmutableArrayConverter());
    //options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase));
            
    ImmutableArray<Cases> deserialized = JsonSerializer.Deserialize<ImmutableArray<Cases>>(@"[""one"",""two"",""three"",""four""]", options);
			
    string serialized = JsonSerializer.Serialize(deserialized, options);
    Console.WriteLine(serialized);
}

Still investigating the issue here, but generally, multiple options instances shouldn't be passed arbitrarily as inputs to converters or the serializer. The mitigation here might be for the serializer to detect and guard against unsupported options patterns, and throw meaningful exceptions.

In any case I think its good to add some documentation on how to call into the framework converters from a custom converter, this has been something I've wanted to do (and done to various degrees of success) multiple times now.

Yes, we'll provide some documentation for this. We can add it to this page - https://docs.microsoft.com/dotnet/standard/serialization/system-text-json-converters-how-to?pivots=dotnet-5-0#error-handling. cc @tdykstra

@layomia
Copy link
Contributor

layomia commented Mar 25, 2021

The reason for pulling out an 'original' converter like this is because some code paths in a custom converter should just be able to default to existing converters, merely wrapping over them.

In principle, I do agree that custom converters should be able to compose nicely with framework/STJ internal converters. I'm just curious about this specific scenario. What motivated the use of a custom converter here? Why not just:

static void Main(string[] args)
{
    var options = new JsonSerializerOptions();
    options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase));
            
    ImmutableArray<Cases> deserialized = JsonSerializer.Deserialize<ImmutableArray<Cases>>(@"[""one"",""two"",""three"",""four""]", options);
    Console.WriteLine(JsonSerializer.Serialize(deserialized, options));
}

@NinoFloris
Copy link
Contributor Author

NinoFloris commented Mar 25, 2021

@layomia obviously this is not a tenable solution, the ImmutableArrayConverter should not have to know about all Ts (and the right type of converter) for the entire application. It's also common enough to create converters that could get shipped as a package like this https://github.com/Tarmil/FSharp.SystemTextJson.

The mitigation here might be for the serializer to detect and guard against unsupported options patterns, and throw meaningful exceptions.

Proper compositionality is important, doing the easy thing here seems like it undermines the custom converter model of STJ a lot (and it's already — no doubt with good intentions — much more limited than the internal converters).

What motivated the use of a custom converter here? Why not just:

ImmutableArray is slow to serialize due to it falling into the IEnumerable path, yet as we like immutability we have many instances where we use it, so we have a converter that optimizes writes (via GetMemory and MemoryMarshal.TryGetArray which is entirely safe), but leaves reads as-is. I've also had cases around discriminated unions, custom collections or other functorial types where I just want to fall back to the framework converters when some pattern match or condition is hit, most of the times this is in the read path, as going from a reader to arbitrary .net objects is a pain I try to avoid (at least without the public metadata apis that were talked about).

Anyway, I have at least somewhat optimized the 'hacky fix' I showed in my original comment by caching the diffed options on an instance field. We can then do an Object.ReferenceEquals on it against the options passed into Read. If they don't match we diff the passed in instance but otherwise we can move on straight away.

@layomia
Copy link
Contributor

layomia commented Mar 26, 2021

The value converter model based on directly deriving from JsonConverter<T> was really intended for simple/primitive types. The limitation with the JsonConverter<T>.Read method is that it doesn't have a ReadStack state parameter to properly flow type metadata (same issue with writing). We've talked a few times about a richer converter model for objects and collections which is based on passing state around (e.g. #2259). This is the ideal direction for composability, and it would be great to get to this soon.

A couple of workarounds have been mentioned so I hope for now your scenario is unblocked. In the meantime, we're evaluating adding a safe-guard in the code that bridges the JsonConverter<T>.Read method call with the internal logic for objects/collections to ensure that the state metadata we initialized is correct:

ReadStack state = default;
state.Initialize(typeToConvert, options, supportContinuation: false);

It may look like this:

ReadStack state = default;
state.Initialize(typeToConvert, options, supportContinuation: false);
if (state.Current.JsonPropertyInfo.ConverterBase != this)
{
    throw new InvalidOperationException();
}
TryRead(ref reader, typeToConvert, options, ref state, out T? value);

The original repro would fail with this exception. This exception may help uncover more interesting patterns/dependencies that our callers have.

cc @steveharter


We'll look into improving the performance of (de)serializing immutable collections. They are currently based on the CreateRange pattern which involves allocating a temporary collection. Using the builder pattern has better perf (https://dotnetfiddle.net/d5n55f).

@NinoFloris
Copy link
Contributor Author

Thanks!

We've talked a few times about a richer converter model for objects and collections which is based on passing state around

I'll be sure to subscribe to #36785 :)

We'll look into improving the performance of (de)serializing immutable collections. They are currently based on the CreateRange pattern which involves allocating a temporary collection. Using the builder pattern has better perf (https://dotnetfiddle.net/d5n55f).

Nice to see some love for deserializing into immutable collections! For us writing them fast is important as we have http read heavy workloads. Hopefully you could look into using MemoryMarshal as well for ImmutableArray. This is 100% binary compatible as long as you error out or fall back to IEnumerable when TryGetArray returns false. In the highly unlikely event ImmutableArray suddenly starts using native memory ;)

@layomia
Copy link
Contributor

layomia commented Jul 23, 2021

We don't have work planned to address this for .NET 6.0, we should consider in 7.

@eiriktsarpalis
Copy link
Member

One possible fix is to have the internal converters encapsulate their corresponding JsonTypeInfo metadata. We don't do it currently since converters are meant to be decoupled from JsonSerializerOptions and any metadata it may carry. Our built-in converters however are special, since they are instantiated using internal converter factories. As such, every instance is tightly coupled to the JsonSerializerOptions instance that originated it, and this bug is a consequence of us not observing this invariant.

Tagging @krwq who might be interested in this topic.

@eiriktsarpalis eiriktsarpalis modified the milestones: 7.0.0, Future May 10, 2022
@eiriktsarpalis
Copy link
Member

We won't have time to work on this for .NET 7, moving to Future.

@eiriktsarpalis eiriktsarpalis modified the milestones: Future, 8.0.0 Jul 8, 2022
@krwq
Copy link
Member

krwq commented Sep 28, 2022

We should tackle this in conjunction with #63791 and #54189

@gregsdennis
Copy link
Contributor

As part of a recent AOT-compatibility update, Json.More.Net now includes several JsonSerializerOptions extensions that help working around this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants