Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use System.Text.Json #1075

Merged
merged 35 commits into from
Sep 17, 2021
Merged

Use System.Text.Json #1075

merged 35 commits into from
Sep 17, 2021

Conversation

bart-degreed
Copy link
Contributor

@bart-degreed bart-degreed commented Sep 7, 2021

This PR replaces the Newtonsoft.Json serialization library used by JADNC with System.Text.Json (STJ).

Fixes #664.
Fixes #999.
Fixes #1077.
Fixes #1078.

Aside from the actual conversion, this PR also contains several prerequisite refactorings and bug fixes, which I tried to put in the preceding commits.

Refactorings

  • Breaking: Removed Serialization.Client.Internal (proposed in Optimize IIdentifiable to ResourceObject conversion #1028)
  • Breaking: Removed undocumented ?nulls and ?defaults query string support (proposed in Support System.Text.Json #664 (comment))
  • Always use a string value for ID in request bodies. Newtonsoft would auto-convert, but STJ does not.
  • Pascal casing in STJ means setting the naming policies to null, which should not crash, so added integration tests for pascal casing.
  • Optimized lookups in the resource graph by using dictionaries instead of iterating over collections. This is now heavily used when deserializing request bodies. We need the resource graph to lookup attribute CLR types and feed that to the deserializer. In contrast to Newtonsoft, STJ does not attempt to infer the CLR type for a JSON string.
  • Breaking: Made IResourceContextProvider.GetResourceContext() throw when not found; added TryGetResourceContext() that returns null when not found.
  • Breaking: Merged IResourceContextProvider into IResourceGraph
  • Several changes to the data structures in JsonApiDotNetCore.Serialization.Objects namespace. These objects represent the low-level JSON:API object shape. While technically a breaking change, we don't expect anyone to depend on these directly.
    • Renamed RelationshipEntry to RelationshipObject, Error to ErrorObject
    • Merged ErrorDocument and AtomicOperationsDocument into Document. An observable change is that we used to put meta at the top for successful responses, but at the bottom for error responses. We now always put meta at the end. The JSON:API spec leaves room for different interpretations about meta order in the different object types, so this is a judgment call. meta is typically used for comments and secondary information, which we believe should not be transmitted before the more important information. We moved the jsonapi object to the top, which enables clients to vary parsing the rest of the body, based on the JSON:API version and extensions.
    • Simplified error objects, so they are similar to the other serialization objects. This means no default instances, constructors (exception: ErrorObject) or conditional serialization logic. And explicit names to overrule naming conventions. And annotations to skip serialization when null.
    • Added missing members from JSON:API v1.1 spec: ErrorDocument.Meta, ErrorLinks.Type, ErrorSource.Header, ResourceIdentifierObject.Meta
    • Normalized collection types
    • Updated documentation: Link to v1.1 of JSON:API spec instead of copy/pasted text.
    • Removed inheritance in these objects. STJ has no way to specify JSON element order, it always uses the property declaration order. What's worse, it puts base members at the end, which is impractical. Flattening the inheritance tree enables us to control JSON element order.
    • Consistently set IgnoreCondition (the equivalent of Newtonsofts NullValueHandling) on all properties, so we don't need to override global options anymore.
  • Fill error.source.header in error response for Accept, Content-Type and If-Match headers.
  • Breaking: Renamed "total-resources" meta key to "total" because that's what Ember.js expects it to be named (see https://guides.emberjs.com/release/models/handling-metadata/)
  • Don't add JsonStringEnumConverter to the list of converters in global options in tests anymore, because it hides the problem when we forget to explicitly put it on a serialized object.

Conversion to STJ

Switching to STJ enabled us to remove some workarounds for async calls and oddities around date/time and floating-point numbers handling, but also introduced several limitations. For low-level reading and writing, STJ provides Utf8JsonReader and Utf8JsonWriter. They have methods like ReadStartObject, ReadInt32, WriteString, WriteEndObject etc. At a higher level, there's JsonSerializer, a serializer/deserializer for object graphs. It heavily delegates to JsonConverter instances to process parts of the tree. In the endless list of unsupported features compared to Newtonsoft, the answer is almost always "create a custom converter". This sounds doable at first, but it turns out the position reporting for errors becomes invalid when custom converters come into play. Over time, something has gone terribly wrong! As explained here, converters were never designed to be used like that. There exist two kinds of converters: the public ones for primitive values (the leaves in a JSON tree: int, string, DateTime etc.) and the internal ones for intermediate nodes, such as List, Dictionary, Object etc. These internal converters have their own internal location tracking (the position in the text stream, as well as the depth and the JSON path, such as $.data[0].attributes.firstName) using an internal ReadStack type. Now as long as the whole tree is read by internal converters, they pass along their ReadStack and everything is fine. But when a custom converter is invoked somewhere in the middle of the tree, the ReadStack cannot be passed along and any subsequent internal converters can no longer access it. The reported position is then what the ReadStack contains, which is only a part of the path, so incorrect. The lack of correct position information, combined with the terrible error messages is a big downside that we cannot solve (other than discarding the whole JsonSerializer/Converter model and using only Utf8JsonReader, which we quickly discarded as an option). The best we can do here is first try to minimize the use of custom converters. A converter cannot be read-only or write-only, so we keep separate settings for reading and writing internally, which are copies of the basic settings, but with different converters added to them. (Note: although the trick of cloning options works (see #664 (comment)), it makes the serializer 100x slower compared to Newtonsoft, so a no-go.) Another way to reduce the pain is by not throwing from a converter when unable to read due to an unknown resource type, an invalid attribute value, etc. So instead we put a sentinel value in the tree, which we'll detect in a post-processing phase (there we know the location in the tree), and throw from there.

Reading the JSON:API data element, which can be null, an object, or an array of objects requires the use of a custom converter. Likewise, the JSON:API dictates some rules when to write "data:null" vs when to omit it entirely, which can only be solved using custom converters. Instead of a base class, this is now a readonly struct (we never want it to be null), which has an object Value property, along with SingleData/ManyData calculated properties.

Another challenge is that type may occur after attributes in a JSON body. But we need to know the type and find it in the resource graph before we can convert the attribute values to CLR objects. Therefore we read the subtree twice: in the first pass, we extract the type, then in the second pass we handle everything else.

We found that the built-in JsonStringEnumConverter from STJ has various problems, so we use the one from Macross in our tests. For the library itself, the issues don't apply. Likewise, we use code from https://github.com/J0rgeSerran0/JsonNamingPolicy to support casing conventions other than the built-in pascal case and camel case. Same story for TimeSpan, which is completely unsupported until .NET 6.

Conclusion

I've gone back and forth several times on aborting the whole STJ effort. Initially, I hoped to get rid of most post-processing and instead validate in-place using converters. But the limited built-in validation capabilities, combined with the useless error reporting make that a no-go, so we'll still need heavy post-processing. What's worse, in many bug reports and feature requests on GitHub, the owners are hesitant to address them, saying: "we'll wait to see if more people are asking for this". But they have locked all conversations, which makes it impossible to even upvote! I explored if moving to .NET 6 would take away some pain, but it won't and developers are very upset about it (https://devblogs.microsoft.com/dotnet/try-the-new-system-text-json-source-generator/#comment-9821, dotnet/runtime#31619 (comment)). During my investigations, I've read numerous users reporting to cancel their migration efforts. I sincerely hope the many problems will be addressed soon. They killed Newtonsoft innovation and replaced it with something that's full of bugs and limitations.

On the plus side, STJ eliminates the NuGet package dependency hell around conflicting Newtonsoft versions.

Performance

The snapshots below were taken using #1023 on my Windows 10 laptop running Kestrel on a Release build (without the debugger attached) of the JsonApiDotNetCoreExample project with a small amount of seed data in a PostgreSQL database. Both libraries are slow on first calls because they initialize their caches, so I've done multiple requests and picked one that looks representative. The percentages are what they amount to in the entire request duration, excluding the time spent in database calls.

Measurement results for GET http://localhost:14140/api/v1/todoItems?include=owner,assignee,tags:
    Newtonsoft.Serialize ..........................  0:00:00:00.0003092 ...  12.99%
    JsonSerializer.Serialize ......................  0:00:00:00.0001114 ...   5.26%

Measurement results for POST http://localhost:14140/api/v1/todoItems:
    Newtonsoft.Deserialize ........................  0:00:00:00.0000974 ...   7.16%
    JsonSerializer.Deserialize ....................  0:00:00:00.0000598 ...   4.69%
    Newtonsoft.Serialize ..........................  0:00:00:00.0000457 ...   3.36%
    JsonSerializer.Serialize ......................  0:00:00:00.0000215 ...   1.69%

QUALITY CHECKLIST

  • Changes implemented in code
  • Complies with our contributing guidelines
  • Adapted tests
  • Documentation updated
  • N/A: Created issue to update Templates: {ISSUE_NUMBER}

@codecov
Copy link

codecov bot commented Sep 7, 2021

Codecov Report

Merging #1075 (2e0c811) into master (69bb428) will increase coverage by 0.18%.
The diff coverage is 94.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1075      +/-   ##
==========================================
+ Coverage   88.48%   88.66%   +0.18%     
==========================================
  Files         256      248       -8     
  Lines        7024     7086      +62     
==========================================
+ Hits         6215     6283      +68     
+ Misses        809      803       -6     
Impacted Files Coverage Δ
...NetCoreExample/Controllers/NonJsonApiController.cs 0.00% <0.00%> (ø)
src/Examples/JsonApiDotNetCoreExample/Startup.cs 0.00% <0.00%> (ø)
...tCore/Configuration/ServiceCollectionExtensions.cs 100.00% <ø> (+15.38%) ⬆️
src/JsonApiDotNetCore/Configuration/TypeLocator.cs 86.48% <0.00%> (ø)
...rrors/ResourcesInRelationshipsNotFoundException.cs 100.00% <ø> (ø)
...NetCore/Serialization/Objects/RelationshipLinks.cs 100.00% <ø> (ø)
...iDotNetCore/Serialization/Objects/ResourceLinks.cs 100.00% <ø> (ø)
src/JsonApiDotNetCore/Queries/QueryLayer.cs 87.23% <50.00%> (ø)
...ore/Serialization/Objects/AtomicOperationObject.cs 80.00% <50.00%> (+5.00%) ⬆️
...NetCore/Resources/Internal/RuntimeTypeConverter.cs 88.57% <60.00%> (-11.43%) ⬇️
... and 109 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 69bb428...2e0c811. Read the comment docs.

@bart-degreed bart-degreed force-pushed the system-text-json branch 2 times, most recently from 446034b to 572a521 Compare September 7, 2021 15:59
Bart Koelman added 15 commits September 8, 2021 09:36
…en not found; added TryGetResourceContext() that returns null
Note we need JsonDateTimeOffsetFormatSpecifier now, because STJ never tries to infer the CLR type from JSON values between quotes, while Newtonsoft does. So Newtonsoft would convert both values to date/time, effectively hiding the textual difference that was always there.
…erly handles self-referencing EF Core objects when enabling reference tracking, as opposed to Newtonsoft.
…ns into account, which is unneeded because we only care about whether there's a diff, not so much what that diff looks like. And we don't expect self-references here (it would have crashed in the past, and will now too).
…sts, because we already use indenting in TestableStartup, so this is no longer needed.
@bart-degreed bart-degreed force-pushed the system-text-json branch 4 times, most recently from bca9e8b to 85e2abe Compare September 10, 2021 09:10
@bart-degreed bart-degreed force-pushed the system-text-json branch 2 times, most recently from 0fb80f0 to 070c98b Compare September 10, 2021 10:23
Bart Koelman added 3 commits September 14, 2021 10:26
- Added various converters to steer JsonSerializer in the right direction
- JsonApiDotNetCore.Serialization.Objects
  - Removed inheritance in JsonApiDotNetCore.Serialization.Objects, so we're in control of element write order
  - Moved "meta" to the end in all types (it is secondary information)
  - Consistently set IgnoreCondition on all properties, so we don't need to override global options anymore
Bart Koelman added 2 commits September 14, 2021 14:16
By default, this produces:
```
The JSON value could not be converted to JsonApiDotNetCore.Serialization.Objects.SingleOrManyData`1[JsonApiDotNetCore.Serialization.Objects.ResourceObject]. Path: $.data | LineNumber: 3 | BytePositionInLine: 11.
```
which is totally unhelpful. Because this is so likely to hit users, we special-case here to produce a better error.
@maurei maurei merged commit c933e8b into master Sep 17, 2021
@maurei maurei deleted the system-text-json branch September 17, 2021 15:18
bart-degreed pushed a commit that referenced this pull request Sep 20, 2021
* Removed temporary workaround to start postgres service in cibuild (#1083)

* Restored docs for HasManyThrough, which was removed in #1037 (#1084)

* Use System.Text.Json (#1075)

* Removed Serialization.Client.Internal

* Removed undocumented ?nulls and ?defaults query string support

* Refactor: use interpolated strings instead of concatenation

* Updated tests to use string value for IDs; centralized pseudo-constants

* Added tests for pascal casing

* Optimized attribute/relationship lookups

* Breaking: Made IResourceContextProvider.GetResourceContext() throw when not found; added TryGetResourceContext() that returns null

* Optimized resource graph lookups

* Breaking: Merged IResourceContextProvider into IResourceGraph

* Switched to STJ in assertions

Note we need JsonDateTimeOffsetFormatSpecifier now, because STJ never tries to infer the CLR type from JSON values between quotes, while Newtonsoft does. So Newtonsoft would convert both values to date/time, effectively hiding the textual difference that was always there.

* Switched to STJ in rendering exception stack traces

* Switched to STJ in rendering CLR objects as part of tracing. STJ properly handles self-referencing EF Core objects when enabling reference tracking, as opposed to Newtonsoft.

* Switched to STJ in attribute change tracking. This used to take options into account, which is unneeded because we only care about whether there's a diff, not so much what that diff looks like. And we don't expect self-references here (it would have crashed in the past, and will now too).

* Switched to STJ in Microservices example

* Removed re-indent of response body on HTTP status code mismatch in tests, because we already use indenting in TestableStartup, so this is no longer needed.

* Use STJ naming convention on special-cased code paths

* Renamed RelationshipEntry to RelationshipObject, Error to ErrorObject

* Fix broken test in cibuild

* Fixed broken tests in cibuild due to different line endings

* Package updates

* Refactor serialization objects
- Simplified error objects, so they are similar to the other serialization objects. This means no default instances, constructors (exception: ErrorObject) or conditional serialization logic. And explicit names to overrule naming conventions. And annotations to skip serialization when null.
- Added missing members from JSON:API v1.1 spec: ErrorDocument.Meta, ErrorLinks.Type, ErrorSource.Header, ResourceIdentifierObject.Meta
- Normalized collection types
- Updated documentation: Link to v1.1 of JSON:API spec instead of copy/pasted text

* Merged ErrorDocument and AtomicOperationsDocument into Document
Bugfix: jsonapi/version was missing in error responses

* Fill error.source.header where applicable

* Breaking: Renamed "total-resources" meta key to "total" because thats what Ember.js expects it to be named (see https://guides.emberjs.com/release/models/handling-metadata/)

* Removed unneeded StringEnumConverter usage. Also removed it from the defaults for tests, because that hides the problem when we forget to put it on a member that needs it.

* Use configured STJ options for null/default value inclusion
Bugfix: do not break out of method on first attribute

* Fixed data type in json request body

* Added missing type, which is a required element

* Converted core code to use System.Text.Json
- Added various converters to steer JsonSerializer in the right direction
- JsonApiDotNetCore.Serialization.Objects
  - Removed inheritance in JsonApiDotNetCore.Serialization.Objects, so we're in control of element write order
  - Moved "meta" to the end in all types (it is secondary information)
  - Consistently set IgnoreCondition on all properties, so we don't need to override global options anymore

* Updated documentation

* Fixed broken example-generation. Set launchBrowser to true, so it shows sample data on F5.

* Inlined properties on serializable objects

* Add test for incompatible ID value.
By default, this produces:
```
The JSON value could not be converted to JsonApiDotNetCore.Serialization.Objects.SingleOrManyData`1[JsonApiDotNetCore.Serialization.Objects.ResourceObject]. Path: $.data | LineNumber: 3 | BytePositionInLine: 11.
```
which is totally unhelpful. Because this is so likely to hit users, we special-case here to produce a better error.

* Removed misplaced launchsettings.json

* Review feedback: use base class instead of static helper

* Update ROADMAP.md

* Post-merge fixes

* Replace NewtonsoftDataContractResolver with JsonSerializerDataContractResolver

* Small cleanups

Co-authored-by: maurei <maurits.moeys@gmail.com>
bart-degreed pushed a commit that referenced this pull request Dec 3, 2021
* Removed Serialization.Client.Internal

* Removed undocumented ?nulls and ?defaults query string support

* Refactor: use interpolated strings instead of concatenation

* Updated tests to use string value for IDs; centralized pseudo-constants

* Added tests for pascal casing

* Optimized attribute/relationship lookups

* Breaking: Made IResourceContextProvider.GetResourceContext() throw when not found; added TryGetResourceContext() that returns null

* Optimized resource graph lookups

* Breaking: Merged IResourceContextProvider into IResourceGraph

* Switched to STJ in assertions

Note we need JsonDateTimeOffsetFormatSpecifier now, because STJ never tries to infer the CLR type from JSON values between quotes, while Newtonsoft does. So Newtonsoft would convert both values to date/time, effectively hiding the textual difference that was always there.

* Switched to STJ in rendering exception stack traces

* Switched to STJ in rendering CLR objects as part of tracing. STJ properly handles self-referencing EF Core objects when enabling reference tracking, as opposed to Newtonsoft.

* Switched to STJ in attribute change tracking. This used to take options into account, which is unneeded because we only care about whether there's a diff, not so much what that diff looks like. And we don't expect self-references here (it would have crashed in the past, and will now too).

* Switched to STJ in Microservices example

* Removed re-indent of response body on HTTP status code mismatch in tests, because we already use indenting in TestableStartup, so this is no longer needed.

* Use STJ naming convention on special-cased code paths

* Renamed RelationshipEntry to RelationshipObject, Error to ErrorObject

* Fix broken test in cibuild

* Fixed broken tests in cibuild due to different line endings

* Package updates

* Refactor serialization objects
- Simplified error objects, so they are similar to the other serialization objects. This means no default instances, constructors (exception: ErrorObject) or conditional serialization logic. And explicit names to overrule naming conventions. And annotations to skip serialization when null.
- Added missing members from JSON:API v1.1 spec: ErrorDocument.Meta, ErrorLinks.Type, ErrorSource.Header, ResourceIdentifierObject.Meta
- Normalized collection types
- Updated documentation: Link to v1.1 of JSON:API spec instead of copy/pasted text

* Merged ErrorDocument and AtomicOperationsDocument into Document
Bugfix: jsonapi/version was missing in error responses

* Fill error.source.header where applicable

* Breaking: Renamed "total-resources" meta key to "total" because thats what Ember.js expects it to be named (see https://guides.emberjs.com/release/models/handling-metadata/)

* Removed unneeded StringEnumConverter usage. Also removed it from the defaults for tests, because that hides the problem when we forget to put it on a member that needs it.

* Use configured STJ options for null/default value inclusion
Bugfix: do not break out of method on first attribute

* Fixed data type in json request body

* Added missing type, which is a required element

* Converted core code to use System.Text.Json
- Added various converters to steer JsonSerializer in the right direction
- JsonApiDotNetCore.Serialization.Objects
  - Removed inheritance in JsonApiDotNetCore.Serialization.Objects, so we're in control of element write order
  - Moved "meta" to the end in all types (it is secondary information)
  - Consistently set IgnoreCondition on all properties, so we don't need to override global options anymore

* Updated documentation

* Fixed broken example-generation. Set launchBrowser to true, so it shows sample data on F5.

* Inlined properties on serializable objects

* Add test for incompatible ID value.
By default, this produces:
```
The JSON value could not be converted to JsonApiDotNetCore.Serialization.Objects.SingleOrManyData`1[JsonApiDotNetCore.Serialization.Objects.ResourceObject]. Path: $.data | LineNumber: 3 | BytePositionInLine: 11.
```
which is totally unhelpful. Because this is so likely to hit users, we special-case here to produce a better error.

* Removed misplaced launchsettings.json

* Review feedback: use base class instead of static helper
bart-degreed pushed a commit that referenced this pull request Dec 3, 2021
* Removed Serialization.Client.Internal

* Removed undocumented ?nulls and ?defaults query string support

* Refactor: use interpolated strings instead of concatenation

* Updated tests to use string value for IDs; centralized pseudo-constants

* Added tests for pascal casing

* Optimized attribute/relationship lookups

* Breaking: Made IResourceContextProvider.GetResourceContext() throw when not found; added TryGetResourceContext() that returns null

* Optimized resource graph lookups

* Breaking: Merged IResourceContextProvider into IResourceGraph

* Switched to STJ in assertions

Note we need JsonDateTimeOffsetFormatSpecifier now, because STJ never tries to infer the CLR type from JSON values between quotes, while Newtonsoft does. So Newtonsoft would convert both values to date/time, effectively hiding the textual difference that was always there.

* Switched to STJ in rendering exception stack traces

* Switched to STJ in rendering CLR objects as part of tracing. STJ properly handles self-referencing EF Core objects when enabling reference tracking, as opposed to Newtonsoft.

* Switched to STJ in attribute change tracking. This used to take options into account, which is unneeded because we only care about whether there's a diff, not so much what that diff looks like. And we don't expect self-references here (it would have crashed in the past, and will now too).

* Switched to STJ in Microservices example

* Removed re-indent of response body on HTTP status code mismatch in tests, because we already use indenting in TestableStartup, so this is no longer needed.

* Use STJ naming convention on special-cased code paths

* Renamed RelationshipEntry to RelationshipObject, Error to ErrorObject

* Fix broken test in cibuild

* Fixed broken tests in cibuild due to different line endings

* Package updates

* Refactor serialization objects
- Simplified error objects, so they are similar to the other serialization objects. This means no default instances, constructors (exception: ErrorObject) or conditional serialization logic. And explicit names to overrule naming conventions. And annotations to skip serialization when null.
- Added missing members from JSON:API v1.1 spec: ErrorDocument.Meta, ErrorLinks.Type, ErrorSource.Header, ResourceIdentifierObject.Meta
- Normalized collection types
- Updated documentation: Link to v1.1 of JSON:API spec instead of copy/pasted text

* Merged ErrorDocument and AtomicOperationsDocument into Document
Bugfix: jsonapi/version was missing in error responses

* Fill error.source.header where applicable

* Breaking: Renamed "total-resources" meta key to "total" because thats what Ember.js expects it to be named (see https://guides.emberjs.com/release/models/handling-metadata/)

* Removed unneeded StringEnumConverter usage. Also removed it from the defaults for tests, because that hides the problem when we forget to put it on a member that needs it.

* Use configured STJ options for null/default value inclusion
Bugfix: do not break out of method on first attribute

* Fixed data type in json request body

* Added missing type, which is a required element

* Converted core code to use System.Text.Json
- Added various converters to steer JsonSerializer in the right direction
- JsonApiDotNetCore.Serialization.Objects
  - Removed inheritance in JsonApiDotNetCore.Serialization.Objects, so we're in control of element write order
  - Moved "meta" to the end in all types (it is secondary information)
  - Consistently set IgnoreCondition on all properties, so we don't need to override global options anymore

* Updated documentation

* Fixed broken example-generation. Set launchBrowser to true, so it shows sample data on F5.

* Inlined properties on serializable objects

* Add test for incompatible ID value.
By default, this produces:
```
The JSON value could not be converted to JsonApiDotNetCore.Serialization.Objects.SingleOrManyData`1[JsonApiDotNetCore.Serialization.Objects.ResourceObject]. Path: $.data | LineNumber: 3 | BytePositionInLine: 11.
```
which is totally unhelpful. Because this is so likely to hit users, we special-case here to produce a better error.

* Removed misplaced launchsettings.json

* Review feedback: use base class instead of static helper
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants