Deserializing Polymorphic Types with System.Text.Json

TL;DR: If you are considering System.Text.Json converters to solve serialization issues with polymorphic DTOs, it is possible, but you will be coupling systems to an implementation detail: the type discriminator key, which must be known to both the client and the server of that API.

System.Text.Json

System.Text.Json is the JSON serializer that the .NET team developed and integrated into the framework (the corefx repository). Compared to the Newtonsoft serializer, its main features relate to performance: it takes advantage of the System.Memory types, such as Span<T> and Memory<T>. It also has a simpler design, and it has been the default serializer since ASP.NET Core 3.0, so the dependency on Newtonsoft.Json is no longer required for JSON serialization.

The performance gains came at a cost: some features of Newtonsoft are not available in System.Text.Json out of the box, such as polymorphic serialization. According to the team that manages this namespace, polymorphic serialization was left out, and the reasons given so far are not entirely clear. The good news is that the issue is open and the community has been asking for this feature for some time. Not everything is lost, though: similar to Newtonsoft converters, System.Text.Json includes a type that lets us change the JSON that is written and read during serialization, the JsonConverter<T> type.

In this post I'm going to drill down into the use case (not the real one, but very similar) that brought this need to my application. Then I will present the solution I implemented and some of my conclusions about this approach. I hope it helps the reader.

The Use Case

Very recently I found the need to have polymorphic types in my ASP.NET Core 3.1 Web API contracts, due to some business requirements. It is not a very popular design choice, but it makes the API simpler to use than splitting it across different controllers and action methods that each deal with one concrete type. Consider the following model:

The use case is a mobile application that needs up-to-date stock information for the warehouse operator, so that he can read the current state of the stocks.

A warehouse stock instance holds the information about the stocks that need to be refilled in some stores and the criticality level each store is currently facing.

Each store has a set of department stocks to be resolved, and each department has its own products.

Making this data available in a cohesive way requires providing all of it in one call to the API. Splitting it into different calls, one per department, with different APIs for different types of resources, would consume development time and would require the client application to manage all the different calls. Instead, the client application executes a single HTTP GET operation that returns a Stocks instance with everything inside it.

As we see in the image, the polymorphic types are concrete definitions of store products. They are reused across different use cases, not only here. Instead of having one enumerable for each type definition, we take advantage of the OOP polymorphism support of the server and client application languages to keep everything in the same container of objects.

The Problem

Correctly serializing this graph of objects means making sure that the exact instance type is preserved during serialization, so that it is available during deserialization to recover the exact properties and their values. This is where we came across the limitation. When we serialize a graph of objects using the System.Text.Json JsonSerializer, the type that was used to generate the JSON is not written anywhere in it, and JsonSerializerOptions does not have any property that explicitly allows us to serialize polymorphic graphs of objects. That matters because a specific instance of the Product type can now be represented by different type definitions.

To understand this problem, let's see an example:
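A minimal sketch of the behaviour, using hypothetical `Product`, `FreshProduct` and `DepartmentStocks` types (illustrative names, not necessarily the ones used in the repository). Because the list is declared with the base type, the serializer only sees the members declared on `Product`:

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model types, named only for this illustration.
public class Product
{
    public string Name { get; set; }
}

public class FreshProduct : Product
{
    public int ShelfLifeDays { get; set; }
}

public class DepartmentStocks
{
    public List<Product> Products { get; set; } = new List<Product>();
}

public static class ProblemDemo
{
    public static string SerializeSample()
    {
        var departmentStocks = new DepartmentStocks();
        departmentStocks.Products.Add(new FreshProduct { Name = "Milk", ShelfLifeDays = 7 });

        // The elements are declared as Product, so ShelfLifeDays is not written
        // and nothing in the JSON says that the element was a FreshProduct.
        return JsonSerializer.Serialize(departmentStocks);
    }
}
```

The resulting JSON contains `Name` but neither `ShelfLifeDays` nor any hint of the concrete type, which is exactly the information deserialization would need.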

If you want to try out this code, check my GitHub repository; you can find it in the unit tests. There it is possible to see that different types of products are being added to our departmentStocks. To serialize such an object using the System.Text.Json namespace, the documentation suggests using one of the JsonSerializer methods, which allow us to serialize an instance of an object in three different flavors:

  1. Serialize an instance of object using a Utf8JsonWriter
  2. Serialize an instance of object to a Stream
  3. Serialize an instance of object just with the instance of the object
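As a rough sketch, the three flavors map to the following `JsonSerializer` calls. The `SerializeFlavors` class and its method names are hypothetical helpers written for this illustration; the asynchronous stream overload is used because it is the one available in .NET Core 3.x.

```csharp
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class SerializeFlavors
{
    // 1. Write through a Utf8JsonWriter that wraps a stream.
    public static string WithWriter<T>(T value)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            JsonSerializer.Serialize(writer, value);
        } // disposing the writer flushes it into the stream
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    // 2. Write directly to a Stream.
    public static async Task<string> WithStreamAsync<T>(T value)
    {
        using var stream = new MemoryStream();
        await JsonSerializer.SerializeAsync(stream, value);
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    // 3. Pass only the instance and get the JSON back as a string.
    public static string WithInstance<T>(T value) => JsonSerializer.Serialize(value);
}
```

All three produce the same JSON; they differ only in where the output goes.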

Despite the different ways available to serialize this information to JSON, System.Text.Json does not know how to do polymorphic serialization. According to the GitHub repository issue, the main reasons are performance and security concerns around type name handling. After a long discussion on GitHub and many arguments presented, it looks like the .NET team has opened that issue with some relevant priority, which is nice.

The Solution

For now, until the official solution becomes available, custom extensions can be implemented to provide such features, and converters are one example. An important JsonSerializer parameter is the options object, which lets us configure how the serialization behaves, for instance by adding a JsonConverter. In fact, the JsonConverter is part of our solution.
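To illustrate only the registration mechanism, the sketch below adds the built-in `JsonStringEnumConverter` to a `JsonSerializerOptions` instance. The `StockCriticality` enum and the `ConverterRegistration` class are hypothetical; a custom polymorphic converter would be registered in the same way.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical enum, used only to show a converter changing the output.
public enum StockCriticality { Low, High }

public static class ConverterRegistration
{
    public static string SerializeWithConverter(StockCriticality value)
    {
        var options = new JsonSerializerOptions();
        // Converters in this list are consulted before the built-in handling.
        options.Converters.Add(new JsonStringEnumConverter());
        return JsonSerializer.Serialize(value, options);
    }

    // Without the converter, enums are written as numbers.
    public static string SerializeWithDefaults(StockCriticality value) =>
        JsonSerializer.Serialize(value);
}
```

With the converter registered, `StockCriticality.High` is written as the string `"High"`; with the default options it is written as the number `1`.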

JsonConverter implementation

To make the serialization of any graph of objects take concrete type information into account, the simplest solution is to explicitly write a marker into the JSON that tells the deserializer which concrete type was serialized. That way we get a cohesive serialize and deserialize library to deal with our API models. This approach also keeps the C# type definitions clean, because the types themselves do not carry redundant information about their exact type.

To have a JsonConverter that is reusable for all of our type definitions with polymorphic relationships, our implementation, which inherits from JsonConverter<T>, is generic, and one instance can be created for each polymorphic relationship in the application. The only things the client of this type definition can define are the type discriminator marker that is going to be written in the JSON and the possible values it can take. All the other behaviour is common, independently of the objects being serialized.

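using System;
using System.Collections.Generic;
using System.Text.Json.Serialization;

public class PolymorphicJsonConverter<T> : JsonConverter<T> where T : class
{
    // Maps a discriminator value found in the JSON to a resolver for the concrete type.
    private Dictionary<string, Func<string, Type>> typeResolvers;
    public string TypeDescriminator { get; } // name of the discriminator JSON property

    public PolymorphicJsonConverter(Dictionary<string, Func<string, Type>> typeResolver, string descriminator)
    {
        this.typeResolvers = typeResolver;
        this.TypeDescriminator = descriminator;
    }
}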

JsonConverter contains three methods to override:

  1. CanConvert: evaluates whether the converter knows how to handle the type passed as the method parameter, including concrete types that inherit from T.
  2. Read: gives access to a Utf8JsonReader to read the JSON from the point where CanConvert returned true, which means the property whose type matched.
  3. Write: gives access to a Utf8JsonWriter to write to the stream the JSON that represents that instance.

CanConvert

The CanConvert implementation is straightforward, because the BCL reflection API gives us a method on the Type class that evaluates whether a variable of type T (the generic parameter of the implementation) can be assigned from a variable of another type.

public override bool CanConvert(Type typeToConvert)
{
    return typeof(T).IsAssignableFrom(typeToConvert);
}
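A small sketch of what that check accepts and rejects, assuming hypothetical `Product` and `FreshProduct` types and a standalone `CanConvertCheck` helper that mirrors the expression used in the override:

```csharp
using System;

// Hypothetical hierarchy for illustration.
public class Product { }
public class FreshProduct : Product { }

public static class CanConvertCheck
{
    // Same expression as in the converter, with T passed explicitly.
    public static bool CanConvert<T>(Type typeToConvert) =>
        typeof(T).IsAssignableFrom(typeToConvert);
}
```

Here `CanConvert<Product>(typeof(FreshProduct))` and `CanConvert<Product>(typeof(Product))` are both true, while `CanConvert<Product>(typeof(string))` is false, so a converter built for the base type also claims every derived type.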

Read

The Read method is responsible for deserializing JSON that was previously serialized with System.Text.Json using the same converters. It is important to use the same mechanism to serialize and deserialize the JSON representation of our object graph; that way we can make sure deserialization will not break between parties.

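public override T Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
{
    // Utf8JsonReader is a struct: this copy keeps the position at the start of the object.
    var beginnerReader = reader;
    if (reader.TokenType != JsonTokenType.StartObject)
    {
        throw new JsonException();
    }

    // The first property of the object must be the type discriminator.
    if (!reader.Read()
            || reader.TokenType != JsonTokenType.PropertyName
            || reader.GetString() != TypeDescriminator)
    {
        throw new JsonException();
    }

    // Move to the discriminator value, which must be a string.
    if (!reader.Read() || reader.TokenType != JsonTokenType.String)
    {
        throw new JsonException();
    }

    var typeDescriminatorValue = reader.GetString();

    // Deserialize (a helper of this converter) reads the whole object from the
    // saved start position using the concrete type mapped to the discriminator.
    T baseClass = Deserialize(ref beginnerReader, typeDescriminatorValue);

    if (beginnerReader.TokenType != JsonTokenType.EndObject)
    {
        throw new JsonException();
    }

    // Advance the caller's reader to the end of the object that was consumed.
    reader = beginnerReader;

    return baseClass;
}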
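The Read override above relies on two things: copying the `Utf8JsonReader` struct to remember where the object starts, and a lookup from discriminator value to concrete type. The following standalone sketch shows both under stated assumptions: the `productType` property name, the `fresh` value, the model types and the `DiscriminatorReadSketch` class are all hypothetical, and default serializer options are used to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical model types for illustration.
public class Product { public string Name { get; set; } }
public class FreshProduct : Product { public int ShelfLifeDays { get; set; } }

public static class DiscriminatorReadSketch
{
    private const string DiscriminatorName = "productType"; // assumed name

    private static readonly Dictionary<string, Type> KnownTypes = new Dictionary<string, Type>
    {
        ["fresh"] = typeof(FreshProduct), // assumed discriminator value
    };

    public static Product Read(string json)
    {
        var reader = new Utf8JsonReader(Encoding.UTF8.GetBytes(json));
        if (!reader.Read() || reader.TokenType != JsonTokenType.StartObject)
            throw new JsonException("Expected a JSON object.");

        // Struct copy: 'start' stays at StartObject while 'reader' peeks ahead.
        var start = reader;

        if (!reader.Read()
            || reader.TokenType != JsonTokenType.PropertyName
            || reader.GetString() != DiscriminatorName)
            throw new JsonException("The discriminator must be the first property.");

        if (!reader.Read() || reader.TokenType != JsonTokenType.String)
            throw new JsonException("The discriminator value must be a string.");

        if (!KnownTypes.TryGetValue(reader.GetString(), out Type concreteType))
            throw new JsonException("Unknown discriminator value.");

        // Re-read the whole object as the concrete type; the discriminator
        // property has no matching member, so it is ignored by default.
        return (Product)JsonSerializer.Deserialize(ref start, concreteType);
    }
}
```

Given `{"productType":"fresh","Name":"Milk","ShelfLifeDays":7}`, the method returns a `FreshProduct` with both properties populated.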

When reading JSON to deserialize a polymorphic graph of objects, the main difference from the default implementation is the type discriminator present in the JSON about to be deserialized. The main consideration is that the names being used are type safe, so that we can make sure deserialization succeeds. To support that, the converter needs to maintain a structure that maps the two things: discriminator and type information.

        protected virtual void SetAllKnownTypes()
        {
            var allTypes = FindDerivedTypes(typeof(T).Assembly, typeof(T));

            foreach (var type in allTypes)
            {
                this.typeResolvers.Add(type.FullName, (descriminator) => type);
            }
        }

To make sure that the user does not insert wrong mappings, this implementation loads all the well-known types that derive from the base type used as the type parameter of the converter instance. This way we can make sure of two things:

  1. The types used are the correct ones, available in the concrete application implementation.
  2. The discriminator map will not break because a type it refers to does not exist.

But there are some limitations. The types involved must be the same on both ends of this mechanism. This can be a limitation if you are enabling communication between two systems that are not owned by the same entity, because code must then be shared.

Write

Writing the object graph representation to JSON consumes memory, and we want to keep that to a minimum, because it impacts the performance of our application and can make it more expensive to run. Creating the JSON tokens in the correct order, with the type discriminator at the beginning of the object's representation, without paying an extra price in memory, is a must. To achieve that, I used Span<T>, which lets me walk through an array of bytes and slice it without parsing a string and, as a consequence, creating other strings. The algorithm works as follows:

  1. Serialize the instance with System.Text.Json.JsonSerializer.SerializeToUtf8Bytes, using the concrete type information, and create a Span over that byte array.
  2. Create the discriminator representation (including the opening brace) that will be prepended to that JSON object representation.
  3. Allocate the memory needed to hold the object together with the discriminator (lines 112 and 113).
  4. Copy the discriminator (lines 115 to 117) into that memory.
  5. Append the object's JSON data, without its opening brace, to the resulting span (lines 119 to 121).
  6. Write the result to the JSON writer that is driving the serialization.
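The steps above can be sketched as follows. This is not the repository's code: `DiscriminatorWriteSketch`, its method names and the model types are hypothetical, default serializer options are assumed (so the payload is not indented), and the final step re-parses the bytes with `JsonDocument` because that works on .NET Core 3.x. On .NET 6 and later, `Utf8JsonWriter.WriteRawValue` could write the bytes directly instead.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical model types for illustration.
public class Product { public string Name { get; set; } }
public class FreshProduct : Product { public int ShelfLifeDays { get; set; } }

public static class DiscriminatorWriteSketch
{
    public static byte[] PrependDiscriminator(object value, string name, string discriminatorValue)
    {
        // Step 1: serialize with the runtime type so derived members are included.
        ReadOnlySpan<byte> payload = JsonSerializer.SerializeToUtf8Bytes(value, value.GetType());
        if (payload.Length < 2 || payload[0] != (byte)'{')
            throw new JsonException("Only JSON objects can carry a discriminator.");

        // Step 2: opening brace plus "name":"value"; a comma follows only
        // when the original object has members ("{}" is two bytes long).
        string separator = payload.Length > 2 ? "," : "";
        byte[] prefix = Encoding.UTF8.GetBytes(
            "{" + JsonSerializer.Serialize(name) + ":" + JsonSerializer.Serialize(discriminatorValue) + separator);

        // Step 3: one allocation for the final document (payload loses its own brace).
        byte[] result = new byte[prefix.Length + payload.Length - 1];

        // Steps 4 and 5: copy the prefix, then the payload without its first byte.
        prefix.CopyTo(result, 0);
        payload.Slice(1).CopyTo(result.AsSpan(prefix.Length));
        return result;
    }

    // Step 6: hand the combined bytes to the writer driving the serialization.
    public static void WriteTo(Utf8JsonWriter writer, byte[] json)
    {
        using JsonDocument document = JsonDocument.Parse(new ReadOnlyMemory<byte>(json));
        document.RootElement.WriteTo(writer);
    }
}
```

Slicing the span avoids creating intermediate strings for the payload; the only string work is building the short discriminator prefix.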

Conclusion

Serialization is a concern that we should implement as simply as possible, to minimize coupling to frameworks and tweaks for specific use cases and to keep hot paths in the application cheap. Serialization should be agnostic to the implementation of any framework and should be as independent as possible of specifics such as application needs. With this in mind we can have a truly reusable API between services, one that is technology and framework agnostic. The mechanism I present here did help me solve a problem with use cases similar to this one, because System.Text.Json allows me to extend it in my own way. That does not mean it is the solution for this problem, or even that the API contract should be designed around polymorphic DTOs. With this solution I'm already adding some constraints, so if you are considering it for your use case, read this:

  1. The type discriminator's existence: the deserializing application must know how the serialization happened and must know about the type discriminator.
  2. The type discriminator's position: the deserializing application must know that the first element of the JSON representation of the object is now a discriminator rather than the object's first property.
  3. High coupling to the type name: the type names of API DTOs should not be exposed; they should be considered an implementation detail. What matters is the public fields of the DTO that are part of the API. Using a discriminator coupled to the type name ties my services to that name, and that brings coupling between systems that we do not want and should not need to create.
  4. Performance: in most use cases serialization is on the hot path, especially in a solution based on a distributed architecture. That means the boundaries of each service should be as simple and small as possible, so that the serialization of messages does not consume too much CPU or memory, which should instead be spent where the money is: on the domain.

