Mar 6 2011

Fluent XML Serialization–Introduction

Category: FluentlyXML | Matt @ 09:14

The System.Xml.XmlSerializer class enables .NET applications to serialize/deserialize most types to and from XML using only a few lines of code.  This is a great capability and provides an easy API for simple persistence and interoperability scenarios.  As a developer, you have some degree of control over the XML that’s generated, but the process is mostly rigid and not easy to extend or customize.  There are also numerous “gotchas” around XML serialization, such as the inability to serialize IDictionary types, the inability to serialize and deserialize interfaces, and no support for the concept of “identity” when deserializing object graphs.  Usually one can find a way around these limitations, but on a recent project I found that the pain of working around them was too great to bear.  Out of that pain was born a new flexible XML serialization framework that overcomes the limitations of the XmlSerializer class.  Read on to find out more.

XmlSerializer’s Abilities

XmlSerializer is a useful class that all .NET developers should be at least somewhat familiar with.  Using it, you can easily transform most types to XML and back again, as this example from MSDN illustrates:

private void SerializeObject(string filename)
{
   Console.WriteLine("Writing With Stream");

   XmlSerializer serializer =
      new XmlSerializer(typeof(OrderedItem));
   OrderedItem i = new OrderedItem();
   i.ItemName = "Widget";
   i.Description = "Regular Widget";
   i.Quantity = 10;
   i.UnitPrice = (decimal) 2.30;
   i.Calculate();

   // Create a FileStream to write with.
   Stream writer = new FileStream(filename, FileMode.Create);
   // Serialize the object, and close the stream.
   serializer.Serialize(writer, i);
   writer.Close();
}

You can convert almost any object to XML with a single line of code thanks to this extension method:

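using System.IO;
using System.Xml.Serialization;

public static class XmlExtensions
{
    // Serializes any XmlSerializer-compatible object to an XML string.
    // Usage: string xml = myObject.ToXml();
    public static string ToXml(this object obj)
    {
        // Use the runtime type so derived types serialize correctly.
        var serializer = new XmlSerializer(obj.GetType());

        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, obj);
            return writer.ToString();
        }
    }
}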

This is great for simple persistence scenarios, for making simple configuration settings files, and for providing interoperability with other systems.  It also performs well.  The first time you construct a new XmlSerializer for a type, a custom serializer is emitted that can process the type without using reflection each time.  When you use the simple XmlSerializer(Type) constructor, this custom serializer is cached for the life of the application domain, and you can pre-generate a serialization assembly (for example, with the Sgen.exe tool) if you want to avoid the performance hit the first time you serialize a new type each time your app runs.

XmlSerializer’s Inabilities

XmlSerializer is really quite useful for basic scenarios.  However, like many things in the .NET BCL, XmlSerializer was not built with extensibility in mind.  To customize how it serializes a type, you have two options: decorate the type with the XML serialization attributes (XmlElement, XmlAttribute, XmlIgnore, and so on) or implement the IXmlSerializable interface and write your own serialization logic.  If you try to use attributes to customize the behavior of XmlSerializer, you’ll quickly find that your control is very limited.  You can ignore or rename properties and perform other such simple transformations, but that’s about it.  The attribute-based approach also requires that you dirty up your objects with attributes, which some people consider a violation of the Single Responsibility Principle.  If you want to customize something that isn’t supported by the very limited set of attributes, you’re out of luck unless you want to implement IXmlSerializable, and at that point you’re basically on your own for serializing your type.
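To make the attribute-based option concrete, here is a minimal sketch of the kind of renaming and ignoring that the attributes do allow.  The CatalogItem type and its members are hypothetical, invented only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, used only to illustrate the attributes.
[XmlRoot("item")]
public class CatalogItem
{
    // Written as an XML attribute named "id" instead of a child element.
    [XmlAttribute("id")]
    public int Id { get; set; }

    // Written as a <title> element instead of <Name>.
    [XmlElement("title")]
    public string Name { get; set; }

    // Skipped entirely, both when serializing and when deserializing.
    [XmlIgnore]
    public string InternalNotes { get; set; }
}

public static class AttributeDemo
{
    public static string Serialize(CatalogItem item)
    {
        var serializer = new XmlSerializer(typeof(CatalogItem));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, item);
            return writer.ToString();
        }
    }
}
```

Renaming, reshaping into attributes, and ignoring are roughly where this approach tops out; anything beyond that means implementing IXmlSerializable by hand.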

Another weakness of XmlSerializer is that it has no concept of object identity.  When deserializing an object graph, XmlSerializer will create a new instance for each object in the graph.  It has no way of knowing that an object might appear in multiple locations in the XML, so a single shared instance comes back as several distinct copies.
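The identity problem is easy to reproduce.  In this sketch, the Bar class is a stand-in defined here for illustration (an assumption, not the real type from our project), and the same instance is placed in an array twice before a round trip through XmlSerializer.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Stand-in type for illustration only.
public class Bar
{
    public int BarId { get; set; }
}

public static class IdentityDemo
{
    // Serializes the array to an XML string and reads it straight back.
    public static Bar[] RoundTrip(Bar[] bars)
    {
        var serializer = new XmlSerializer(typeof(Bar[]));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, bars);
            using (var reader = new StringReader(writer.ToString()))
            {
                return (Bar[])serializer.Deserialize(reader);
            }
        }
    }
}

// Usage:
//   var shared = new Bar { BarId = 1 };
//   var result = IdentityDemo.RoundTrip(new[] { shared, new Bar { BarId = 2 }, shared });
//   // result[0] and result[2] carry the same BarId but are different instances.
```

The data survives the round trip, but the fact that the first and third slots pointed at one object does not.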

XmlSerializer also doesn’t support serializing object graphs that contain cycles.  You can use the XmlIgnore attribute to ignore properties that cause cycles, but that property will also be ignored when deserializing the object graph, which means you’ll have to manually rebuild properties in the object graph after XmlSerializer finishes.
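Here is a rough sketch of that workaround.  The TreeItem type and the RestoreParents helper are hypothetical names made up for this example.  Without the XmlIgnore attribute, serializing a parent whose child points back at it fails with an InvalidOperationException; with it, the back-reference has to be rebuilt by hand after deserialization.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, used only for illustration.
public class TreeItem
{
    public TreeItem()
    {
        Children = new List<TreeItem>();
    }

    public string Name { get; set; }

    // Ignored to break the parent/child cycle; it is lost in the XML.
    [XmlIgnore]
    public TreeItem Parent { get; set; }

    public List<TreeItem> Children { get; set; }
}

public static class CycleWorkaround
{
    public static TreeItem RoundTrip(TreeItem root)
    {
        var serializer = new XmlSerializer(typeof(TreeItem));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, root);
            using (var reader = new StringReader(writer.ToString()))
            {
                var copy = (TreeItem)serializer.Deserialize(reader);
                // XmlSerializer left every Parent null, so fix them up manually.
                RestoreParents(copy, null);
                return copy;
            }
        }
    }

    // Walks the tree and points each child back at its parent.
    private static void RestoreParents(TreeItem node, TreeItem parent)
    {
        node.Parent = parent;
        foreach (var child in node.Children)
        {
            RestoreParents(child, node);
        }
    }
}
```

This is manageable for a simple tree, but every type with a back-reference needs its own fix-up pass, which is exactly the kind of busywork that adds up on a large object graph.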

Introducing Fluently-XML

One of the many projects I’m working on is a cost modeling system known as InGauge.  InGauge uses XML to provide interoperability with other cost modeling tools.  We’re dealing with very complex object graphs that must be correctly serialized and deserialized in order to maintain the integrity of the data as it passes from one system to another.  Our team found that the limitations of XmlSerializer proved to be too painful to work around, so we created a custom XML serialization framework that gave us the control and flexibility we needed.  Fluently-XML (that’s what I’m calling this framework) provides the same basic serialization/deserialization capabilities as .NET’s XmlSerializer right out of the box, but it also sports a fluent domain-specific language (DSL) that can be used to customize the serialization and/or deserialization process for any type.  It gives us complete control over how our object graphs are handled in a lightweight manner.  Here are a few of the things it currently provides beyond XmlSerializer:

Object Identity

[Test]
public void Deserialization_respects_object_identity()
{
    var bar1 = new Bar { BarId = 1 };
    var bar2 = new Bar { BarId = 2 };
    
    var bars = new[] { bar1, bar2, bar1 };
    
    string xml = _serializer.Serialize(bars).ToString();
    
    var deserializedBars = _deserializer.Deserialize<Bar[]>(xml);
    
    //This passes!
    Assert.That(deserializedBars[0], Is.SameAs(deserializedBars[2]));
}

Serializing Proxies

[Test]
public void Proxied_types_can_be_serialized_and_deserialized()
{
    var generator = new ProxyGenerator();
    var bar = (Bar)generator.CreateClassProxy(typeof(Bar));
    bar.Name = "Test!";

    string xml = _serializer.Serialize(bar).ToString();
    
    //Even though it's a proxied type, it still gets serialized 
    //as the correct underlying type!
    Assert.That(xml, Is.StringContaining("<Bar>"));
    
    var deserializedBar = _deserializer.Deserialize<Bar>(xml);
    
    //We can deserialize the XML back to a normal bar!
    Assert.That(deserializedBar.Name, Is.EqualTo("Test!"));
}

Cycles and Parent/Child Relationships

[Test]
public void Cycles_and_parent_child_relationships_are_supported()
{
    var parent = new Foo { ID=1, Name = "Parent" };
    var child = new Foo { ID=2, Name = "Child" };
    parent.Children.Add(child);
    child.Parent = parent;

    string xml = _serializer.Serialize(parent).ToString();
    
    //The parent is serialized by its ID!
    Assert.That(xml, Is.StringContaining("<Parent>1</Parent>"));
    
    var deserializedParent = _deserializer.Deserialize<Foo>(xml);
    
    //Even though it was serialized as a single integer, 
    //the Parent property is still deserialized correctly!
    Assert.That(deserializedParent.Children[0].Parent, Is.SameAs(deserializedParent));
}

Coming up next…

This is the first of hopefully many posts on this framework.  We’ll start in the next post by looking at the fluent DSL that can be used to customize the serialization/deserialization process and how it’s used to build up custom serialization behavior at runtime. 
