Serialization is the process of converting an object into a form that can be persisted to a storage medium or transported across a communication layer. In the .NET world, the easiest way to do this is with the binary serializer, but even the powerful binary serializer has its pitfalls if it is not used properly. I'll try to cover a few things to watch for when using binary serialization in .NET.
A short overview and a few references
Any discussion of serialization in the .NET context will sooner or later mention the BinaryFormatter, which is used to serialize an object into a stream of bytes. It is good to know that BinaryFormatter persists the type information of the object being serialized inside the byte stream, along with the type information of the sub-objects it contains. There are a few articles that cover this subject in more detail.
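To make the overview a bit more concrete, here is a minimal round-trip sketch. It assumes the classic .NET Framework, where BinaryFormatter is available without extra configuration (newer .NET versions mark it obsolete). The Customer and Address types and the RoundTripDemo helper are hypothetical names invented for this illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;

// Hypothetical types used only for this illustration.
[Serializable]
public class Address
{
    public string City { get; set; }
}

[Serializable]
public class Customer
{
    public string Name { get; set; }
    public Address Home { get; set; }   // sub-object, serialized with its parent
}

public static class RoundTripDemo
{
    // Serializes the whole object graph into a byte array.
    public static byte[] Serialize(object graph)
    {
        using (var stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, graph);
            return stream.ToArray();
        }
    }

    // Rebuilds the object graph from the bytes produced by Serialize.
    public static object Deserialize(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        {
            return new BinaryFormatter().Deserialize(stream);
        }
    }

    public static void Main()
    {
        var original = new Customer
        {
            Name = "Ann",
            Home = new Address { City = "Zagreb" }
        };

        byte[] bytes = Serialize(original);
        var copy = (Customer)Deserialize(bytes);
        Console.WriteLine(copy.Home.City);

        // Type names are written into the payload as strings, so a crude
        // text search over the bytes should find the sub-object's type name.
        string text = Encoding.ASCII.GetString(bytes);
        Console.WriteLine(text.Contains("Address"));
    }
}
```

The text search at the end is only a rough way to see that type information travels with the data. It is not something you would do in production code.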
Clean it up
- Always flush the output buffer of the stream you are using, so that you get all the data out of the buffer.
- Close and dispose the stream, so that you are sure no unnecessary data has been left behind.
You are in control
- Remember that you control the serialization flow. By default, serialization automatically persists the entire object state, but you can use various mechanisms to change that. Some of them are NonSerialized, ISerializable and IDeserializationCallback, so take a good look at them if you need to control the flow. There is a good article about the flow control here.
Watch out for surrogates
- This is a little-known feature of serialization. Technically speaking, a serialization surrogate is a class that implements the ISerializationSurrogate interface. The interface consists of two members: GetObjectData and SetObjectData. To put it simply, surrogates are used when one class needs to control the serialization of another. You can read a few sections on this subject in one of the MSDN articles and here.
Circular references can make you suffer
- If you have an object with a circular reference (ordinary C# object references), you need to know that binary serialization will properly handle the serialization of every reference. One exception to this rule is LLBLGen entities that have circular references (though not in all cases). In such scenarios you may end up with a large amount of data, because the circularly referenced object is, for some reason (probably custom serialization code inside the LLBLGen framework), serialized more than once.
- One tip regarding the stream-to-byte-array conversion: if you are using a MemoryStream, use the MemoryStream.ToArray method so you don't get extra unused bytes. If ToArray is used instead of GetBuffer, the returned array may be up to 30-40% smaller. The downside is that ToArray creates a new byte array, while GetBuffer returns a reference to the stream's internal array. WCF and Web services may require the smaller data footprint.
- You may run into trouble with value types, interfaces and boxing. Read more on this subject here (in the comments area).
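The stream-handling tips above (flush, dispose, and ToArray rather than GetBuffer) can be sketched roughly as follows. This is only an illustration: the StreamTips class and its method names are made up for this post, and a StreamWriter stands in for any buffered writer sitting on top of a stream. The exact amount of slack in the internal buffer depends on how the MemoryStream happened to grow, so treat the sizes as indicative.

```csharp
using System;
using System.IO;

// Hypothetical helper class, used only for this illustration.
public static class StreamTips
{
    // Returns the underlying stream length before and after flushing the writer.
    public static long[] FlushDemo()
    {
        using (var stream = new MemoryStream())
        using (var writer = new StreamWriter(stream))
        {
            writer.Write("hello");
            long before = stream.Length;   // data still sits in the writer's buffer
            writer.Flush();
            long after = stream.Length;    // now the bytes have reached the stream
            return new[] { before, after };
        }                                  // both objects are disposed here
    }

    // Returns the sizes of the arrays produced by ToArray and GetBuffer.
    public static int[] BufferSizes()
    {
        using (var stream = new MemoryStream())
        {
            // Two writes, so the internal buffer has to grow at least once.
            stream.Write(new byte[100], 0, 100);
            stream.Write(new byte[200], 0, 200);

            byte[] exact = stream.ToArray();   // new array, trimmed to the bytes written
            byte[] raw = stream.GetBuffer();   // internal buffer, may have an unused tail
            return new[] { exact.Length, raw.Length };
        }
    }

    public static void Main()
    {
        long[] flush = FlushDemo();
        Console.WriteLine("before flush: {0}, after flush: {1}", flush[0], flush[1]);

        int[] sizes = BufferSizes();
        Console.WriteLine("ToArray: {0} bytes, GetBuffer: {1} bytes", sizes[0], sizes[1]);
    }
}
```

If you do use GetBuffer to avoid the copy, remember to send only the first stream.Length bytes of the returned array.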