Leaving the confines of your own process's safe little memory space is always a potentially painful moment when you're coding up an app. Whether it's receiving data from the outside world, passing a message over RabbitMQ to another in-house service, or writing an audit trail that needs to be accessible for the next 20 years, there are a bunch of considerations to take into account when you hit the joys of serialization and deserialization.
Mirror, mirror on the wall
First up on the list is whether or not to use reflection. For those of you who aren't aware, reflection is a way of inspecting the properties of your .net code at runtime, and various serialization libraries take advantage of this to try and make your life easier.
The best known example of this is Json.NET - which will take any .net object and try and create a Json representation of it - or vice versa.
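As a rough sketch of what that looks like from F#, assuming the Newtonsoft.Json (Json.NET) package is referenced. The Greeting type is made up for the example:

```fsharp
// Assumes the Newtonsoft.Json (Json.NET) package is referenced.
open Newtonsoft.Json

// Hypothetical message type, invented for this example.
type Greeting =
    { Recipient : string
      Times : int }

let original = { Recipient = "world"; Times = 3 }

// Reflection discovers the record's public properties at run time.
let json = JsonConvert.SerializeObject(original)

// The same .net type doubles as the "schema" on the way back in.
let roundTripped = JsonConvert.DeserializeObject<Greeting>(json)
```

Nothing in the Greeting type mentions JSON at all; the library works everything out from the type's shape.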
This is a common technique, but it does have a few problems.
The "schema" here is actually the underlying .net type that you are serializing. This can be great for quick-to-implement communication between .net services, which can share a "messages" dll containing the type.
Unfortunately, this also causes merry havoc in any scenario where your messages might outlive a version of your messaging dll: saving these objects to a document store, for example, or running two versions of a service at once (required for seamless deployment) connected to the same messaging bus.
Because .net will only allow you to have one version of an assembly loaded at once, you can't easily build a way of deserializing the old format.
Then there's the JsonConvert.SerializeObject method itself: it accepts a plain object, which means it will take any .net object you want to throw at it.
Unfortunately, the number of .net objects it can actually serialize is quite a bit more restricted than "any .net object you want to throw at it". And it has no way of telling you that until runtime, when it will just throw an exception, which can be pretty painful.
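One way to see this for yourself is an object that refers back to itself, which Json.NET rejects under its default settings. This is only a sketch: the Node type is hypothetical, and the Newtonsoft.Json package is assumed to be referenced.

```fsharp
// Assumes the Newtonsoft.Json (Json.NET) package is referenced.
open Newtonsoft.Json

// Hypothetical type whose instances can point back at themselves.
type Node() =
    member val Next : Node = Unchecked.defaultof<Node> with get, set

let node = Node()
node.Next <- node // a perfectly legal object graph

// This compiles without complaint; the problem only appears when it runs.
let outcome =
    try
        Ok (JsonConvert.SerializeObject(node))
    with :? JsonSerializationException as ex ->
        Error ex.Message
```

With default settings, outcome ends up as an Error carrying the exception message. The compiler had no way of warning you.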
When to use?
Only use reflection-based serialization in situations where you know the serialized representation will be transient, and where you can test in advance that your serialization works at runtime. Additionally, be very suspicious of using reflection-based serialization in any scenario where anything outside the .net ecosystem will need to access the data.
Reasonable uses:
- Inter-process communication in distributed systems
- Message bus communications (only if you can guarantee the messages are transient)
Best avoided for:
- Permanent persistence
- Defining APIs to be used from outside .net
If you are doing reflection based serialization in .net, you want to use FsPickler. It covers binary, json, bson and xml serialization in a single library, is faster than Json.NET and successfully serializes more types than Json.NET. What's not to like?
Attribute all the things!
As well as just trying to guess how to serialize things with reflection, .net also has the Serializable attribute, which allows you to serialize to a variety of formats using the System.Runtime.Serialization namespace. In theory this has a few advantages over the raw reflection technique (you can mark specific fields not to be serialized, for example), but to be blunt, if you're going to go to this amount of effort you may as well go for one of the safer options below.
When to use?
When you're using a Microsoft library that requires you to.
Safety with (type) class
If you happen to be working in F#, then member constraints allow you to try a more flexible and type safe way of expressing serialization - in exchange for a little more work.
As an example, the Chiron library allows you to do things like this:
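A minimal sketch of the pattern, assuming Chiron's 6.x API (the json builder, Json.write, Json.read, Json.serialize and friends) and that the Chiron package is referenced. The OuterRecord and InnerRecord names match the discussion that follows; their fields and the JSON key names are invented for illustration.

```fsharp
// Assumes the Chiron package (6.x API) is referenced.
open Chiron

type InnerRecord =
    { Name : string }

    // Tells Chiron how to write an InnerRecord.
    static member ToJson (x : InnerRecord) = json {
        do! Json.write "inner_name" x.Name
    }

    // Tells Chiron how to read one back; the argument only selects the type.
    static member FromJson (_ : InnerRecord) = json {
        let! name = Json.read "inner_name"
        return { Name = name }
    }

type OuterRecord =
    { Id : int
      Inner : InnerRecord }

    static member ToJson (x : OuterRecord) = json {
        do! Json.write "id" x.Id
        // Nested value: resolved via InnerRecord.ToJson at compile time.
        do! Json.write "payload" x.Inner
    }

    static member FromJson (_ : OuterRecord) = json {
        let! id = Json.read "id"
        let! inner = Json.read "payload"
        return { Id = id; Inner = inner }
    }

let original = { Id = 42; Inner = { Name = "example" } }

// Object -> Json value -> string
let text = original |> Json.serialize |> Json.format

// string -> Json value -> object
let roundTripped : OuterRecord = text |> Json.parse |> Json.deserialize
```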
A few interesting things to note here. Firstly, as you've probably guessed, the magic of telling Chiron how to serialize and deserialize things happens in the ToJson and FromJson methods. What might not be so obvious is that if these methods are not implemented with the correct signature, then Json.serialize will not compile when fed the erroneous object. Which does wonders for eliminating runtime errors!
Also, if you look carefully at the To and From methods you'll see that there's no need for the fields in the Json and the .net object to have the same name. In fact, you can apply whatever logic you want within them, allowing you to match a specific schema, deserialize different versions of the serialized data, or just generally make the serialized version prettier to use from other places.
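To make the "whatever logic you want" point concrete, here is a small sketch, again assuming Chiron's 6.x API. The RetryPolicy type and the "timeout_seconds" key are hypothetical: the .net side holds a TimeSpan, while the Json carries a plain number of seconds under a different name.

```fsharp
// Assumes the Chiron package (6.x API) is referenced.
open System
open Chiron

// Hypothetical type, invented for this example.
type RetryPolicy =
    { Timeout : TimeSpan }

    // Written out as a number of seconds, under a schema-friendly key.
    static member ToJson (x : RetryPolicy) = json {
        do! Json.write "timeout_seconds" x.Timeout.TotalSeconds
    }

    // Converted back into a TimeSpan on the way in.
    static member FromJson (_ : RetryPolicy) = json {
        let! (seconds : float) = Json.read "timeout_seconds"
        return { Timeout = TimeSpan.FromSeconds(seconds) }
    }

let policy = { Timeout = TimeSpan.FromSeconds(30.0) }

let policyText = policy |> Json.serialize |> Json.format

let policyBack : RetryPolicy = policyText |> Json.parse |> Json.deserialize
```

Consumers outside .net never need to know what a TimeSpan is; they just see a number.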
You do, of course, have to actually write the From and To methods, which is definitely more work. Having said that, it's not quite as bad as it sounds: as you can see from the OuterRecord type with its InnerRecord field, you can nest the To and From methods nicely, and the type system will check that your object is serializable all the way down. Nice.
When to use?
Any time when you might have to persist data between versions, or process data during the serialization/deserialization process - and you can specify your data types in F#.
Fleece and Chiron both implement these techniques. Fleece is a more established library that has been tested for longer, but I have had some performance issues with it in libraries with a lot of types that implement To and From methods. Chiron is a little bit… cutting edge in age, but has always been fast and reliable for me so far.
We currently use Fleece in one of our projects, and have helped out with improving the testing of Chiron as it looks like a promising alternative.
Edit: Eirik points out in the comments that FsPickler (mentioned above) also has a mechanism for defining type-safe (de)serialization using Picklers. I couldn't possibly guess where the library got its name.
All the rest
I'm sure that there are other ways of tackling these problems. For example, for our customer-facing "business" APIs we're moving to specifying our API using hand-crafted XSD and WSDL files and then autogenerating the code behind them. This assumes, of course, that you have some reason to be using SOAP. But if you do, it works an awful lot better than trying to autogenerate the schema from the code, a path that's led me to worlds of pain both as the consumer and producer of the schemas.
But the routes above are the most common paths that I've come across as a .net developer, and I've often discovered them being used in the "wrong" places. Retroactively having to version reflection-based APIs is a particular pain point that I'd recommend avoiding if you possibly can!