Thursday, 18 July 2013

Slimming down your JSON

Newtonsoft JSON.Net is the JSON serialization library that is so good, Microsoft use it over their own.
While converting an existing project from Linq to SQL to EF5 Code First, I hit an issue with the Unit Tests, which use test objects serialized to XML files as the basis of the tests.  This upset the XML Serializer as collections in EF are ICollection<T> as opposed to Linq to SQL's EntitySet<T> – and the XML Serializer can’t handle interfaces.

JSON.Net to the rescue - fortunately it can handle interfaces so I chose to convert the test data to serialized Json instead. This was a relatively trivial task, accomplished by the below Linqpad script (if you're not using Linqpad, stop reading this blog and go and download it now).

void Main()
{
       var filePath = @"C:\users\Alan\Downloads\";
       var filename = "OrderItemTestData.json";
       var deSerializationMode =2;
       //1 for Json deserialize to object, 2 for cast (use for derived types of abstract classes where return type is the base class).
       var update=false;

var result = LoadData<List<OrderItemBase>>(filePath + filename,deSerializationMode);

       var updated = JsonConvert.SerializeObject(result, Newtonsoft.Json.Formatting.Indented,
       new JsonSerializerSettings() { ReferenceLoopHandling = ReferenceLoopHandling.Ignore, 
TypeNameHandling = TypeNameHandling.All, 
NullValueHandling= NullValueHandling.Ignore});
      
       updated.Dump();
if(update){  
       File.WriteAllText(filePath + filename,updated);
}
}

This simply takes the XML files, deserializes it to the source objects (that’s all LoadData does), and then reserializes it to JSON.
This solved my immediate problem but I was surprised to see the size of the file, in some cases 3 times larger than the equivalent XML file. But JSON is less characters, so how is that possible? A simple comparison of the files shows the problem:
Excerpt from XML file: 
  <Company>
    <Id>201</Id>
    <Name>Company 201</Name>
    <Code>foo</Code>
    <PlaquePrice>5</PlaquePrice>
    <FreeSampleCount>10</FreeSampleCount>
    <CompanyDispensers>
      <CompanyDispenser>
        <Dispenser>
          <CompanyId>101</CompanyId>
        </Dispenser>
      </CompanyDispenser>
    </CompanyDispensers>
  </Company>
Excerpt from Json File:
 "$type": "MyCompany.Model.Customer, MyCompany.Model",
      "IsInternal": false,
      "AvailableWorkFlows": {
        "$type": "System.Collections.Generic.List`1[[System.String, mscorlib]], mscorlib",
        "$values": [
          "StandardWorkFlow"
        ]
      },
      "BillingAddress": null,
      "BillingContact": null,
      "PrimaryContact": null,
      "ActiveUsers": {
        "$type": "MyCompany.Model.User[], MyCompany.Model",
        "$values": []
      },
      "Id": 201,
      "Name": "Company 201",
      "Code": "foo",
      "RecordUpdate": "0001-01-01T00:00:00",
      "RecordCreate": "0001-01-01T00:00:00",
      "Orders": {
        "$type": "System.Collections.Generic.List`1[[MyCompany.Model.Order, MyCompany.Model]], mscorlib",
        "$values": []
      },
      "Products": {
        "$type": "System.Collections.Generic.List`1[[MyCompany.Model.Product, MyCompany.Model]], mscorlib",
        "$values": []
      },
      "Users": {
        "$type": "System.Collections.Generic.List`1[[MyCompany.Model.User, MyCompany.Model]], mscorlib",
        "$values": []
      },
      "IsValid": true
    } 

In this case, the XML file comes out at 1617 characters, the Json equivalent is 48833 characters
The JSON.Net serializer has gone over the objects and serialized every property, even ones that were null or default for their type and also empty collections. This can be easily solved by setting the appropriate properties on the serializer:
new JsonSerializerSettings() { ReferenceLoopHandling = ReferenceLoopHandling.Ignore, 
          TypeNameHandling = TypeNameHandling.All, 
          NullValueHandling= NullValueHandling.Ignore, 
          DefaultValueHandling = DefaultValueHandling.Ignore}
 Setting NullValueHandling and DefaultValueHandling to Ignore solves the problem of properties that are at the default for their type, such as datetimes, and null properties. However, this still leaves us with all of the collections, which are initialised in the object’s constructor to new List<T>.
By default, Json.Net can’t be instructed to ignore these empty lists, because ignoring them may not be the correct action in everyone’s case. In ours it is, so we need to tell Json.Net that it is okay to ignore them to reduce our file size.
To do this we need to create a custom DefaultContractResolver, the code is below:
public class IgnoreEmptyCollectionsContractResolver : DefaultContractResolver
{
    public new static readonly IgnoreEmptyCollectionsContractResolver Instance = new IgnoreEmptyCollectionsContractResolver();

  protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
  {
    JsonProperty property = base.CreateProperty(member, memberSerialization);

    if ((property.PropertyType.Name.Contains("IEnumerable") || property.PropertyType.Name.Contains("ICollection")) && property.PropertyType.GenericTypeArguments.Count() == 1)
    {
      property.ShouldSerialize = instance =>
         {
         try{
              var cnt = instance.GetType().GetProperty("Count").GetValue(instance,null);
              return (int)cnt > 0;
              }
         catch(NullReferenceException){
         return false;}
         };
    }
    return property;
  }
}
 The <catchyName>IgnoreEmptyCollectionsContractResolver</catchyName> simply checks if the current property is an ICollection or IEnumerable and that it has a single generic argument. It then checks the Count property and instructs Json.Net to serialize/deserialize that property depending on whether or not count is greater than 0. I’m sure this can be done a lot neater, but it solves my problem.
We then simply instruct Json.Net to use this as part of the JsonSerializerSettings object:
new JsonSerializerSettings() { ReferenceLoopHandling = ReferenceLoopHandling.Ignore, 
    TypeNameHandling = TypeNameHandling.All, 
    NullValueHandling= NullValueHandling.Ignore, 
    DefaultValueHandling = DefaultValueHandling.Ignore, 
    ContractResolver = new IgnoreEmptyCollectionsContractResolver()});
 Now the serialized Json looks like this:
{
      "$type": "MyCompany.Model.Customer, MyCompany.Model",
      "FreeSampleCount": 10,
      "PlaquePrice": 5.0,
      "AllDispensers": {
        "$type": "MyCompany.Model.Dispenser[], MyCompany.Model",
        "$values": []
      },
      "Id": 201,
      "Name": "Company 201",
      "Code": "foo",
      "IsValid": true
    }
 The total size has dropped from nearly 50000 characters to 1495, a much more acceptable size.

I hope this is of use to someone. If the resolver can be done in a better way, use the comments.

Monday, 1 July 2013

A simple WebCache Helper

As mentioned in most of my previous posts, the main project I work on is due to move to Azure in the future. Among the many gems of Azure is their caching infrastructure, which can either be hosted on a dedicated worker role or instructed to use spare memory on your web roles.
More information and pricing for Azure Caching can be found at http://www.windowsazure.com/en-us/services/caching/

I fully intend to make use of Azure’s in-built caching when we get there, but I can’t wait to start implementing some sort of caching and I don’t want to have to do a big find-replace in the code when we do get there, so I wrote a simple WebCacheHelper which provides easy access to caching anywhere in the application but is also easy to replace when I move to Azure.

The code is below.
    public static class 
        WebCacheHelper
    {
        public static T TryGetFromCache<T>(string cacheName, string itemKey) where T:class
        {
            if (HttpContext.Current == null || HttpContext.Current.Session == null) { return null; }
 
            var sessionResult = TryGetFromSessionCache<T>(cacheName, itemKey);
            if (sessionResult != null){return sessionResult;}
 
            var applicationResult = TryGetFromApplicationCache<T>(cacheName, itemKey);
            return applicationResult;
        }
 
        public static T TryGetFromCache<T>(string cacheName, string itemKey,CachingLevel cachingLevel) where T:class
        {
            if (HttpContext.Current == null || HttpContext.Current.Session == null) { return null; }
 
            switch (cachingLevel)
            {
                case CachingLevel.Session:
                    return TryGetFromSessionCache<T>(cacheName, itemKey);
                case CachingLevel.Application:
                    return TryGetFromApplicationCache<T>(cacheName, itemKey);
            }
            return null;
        }
 
        public static void AddToCache(string cacheName, string itemKey, object cacheItem, CachingLevel cachingLevel=CachingLevel.Application)
        {
            if (HttpContext.Current == null || HttpContext.Current.Session == null) { return; }
 
            switch (cachingLevel)
            {
                case CachingLevel.Session: AddToSessionCache(cacheName, itemKey, cacheItem);
                    break;
                case CachingLevel.Application: AddToApplicationCache(cacheName, itemKey, cacheItem);
                    break;
            }
        }
 
        public static void RemoveFromCache(string cacheName, string itemKey)
        {
            if (HttpContext.Current == null || HttpContext.Current.Session == null) { return; }
 
            RemoveFromApplicationCache(cacheName,itemKey);
            RemoveFromSessionCache(cacheName,itemKey);
        }
 
        public static void RemoveFromCache(string cacheName, string itemKey, CachingLevel cachingLevel)
        {
            if (HttpContext.Current == null || HttpContext.Current.Session == null) { return; }
 
            switch (cachingLevel)
            {
                case CachingLevel.Session: RemoveFromSessionCache(cacheName, itemKey);
                    break;
                case CachingLevel.Application: RemoveFromApplicationCache(cacheName, itemKey);
                    break;
            }
        }
 
        private static string GetCacheKey(string cacheName, string itemKey)
        {
            return cacheName + "-" + itemKey;
        }
 
        #region Session Cache
 
        private static void AddToSessionCache(string cacheName, string itemKey, object cacheItem)
        {
            HttpContext.Current.Session[GetCacheKey(cacheName, itemKey)] = cacheItem;
        }
 
        private static void RemoveFromSessionCache(string cacheName, string itemKey)
        {
            var result = HttpContext.Current.Session[GetCacheKey(cacheName, itemKey)];
            if (result != null)
            {
                HttpContext.Current.Session.Remove(GetCacheKey(cacheName, itemKey));
            }
        }
        
        public static T TryGetFromSessionCache<T>(string cacheName, string itemKey) where T:class
        {
            var result = HttpContext.Current.Session[GetCacheKey(cacheName, itemKey)];
            if (result == null)
            {
                return null;
            }
            return (T) result;
        }
 
        #endregion
 
        #region Application Cache
 
        private static void AddToApplicationCache(string cacheName, string itemKey, object cacheItem)
        {
            HttpContext.Current.Application[GetCacheKey(cacheName, itemKey)] = cacheItem;
        }
 
        private static void RemoveFromApplicationCache(string cacheName, string itemKey)
        {
            var result = HttpContext.Current.Application[GetCacheKey(cacheName, itemKey)];
            if (result != null)
            {
                HttpContext.Current.Application.Remove(GetCacheKey(cacheName, itemKey));
            }
        }
 
        public static T TryGetFromApplicationCache<T>(string cacheName, string itemKey) where T:class
        {
            var result = HttpContext.Current.Application[GetCacheKey(cacheName, itemKey)];
            if (result == null)
            {
                return null;
            }
            return (T) result;
        }
 
        #endregion
 
    }

As you can see there is nothing clever going on here, there is a single AddToCache method for adding data to the cache, with a default parameter for cachingLevel so the calling code can override the default application-level caching to cache at the session level.

 There are a couple of RemoveFromCache methods, one where the calling code can specify which cache to remove the item from, the other removes the data from whichever cache it resides in.
 There are also two TryGet methods for retrieving from a specified caching level or from any that has a matching key.

CachingLevel is just an enum with items for Session and Application
     public enum CachingLevel
    {
        Application,
        Session
    }
I also use the following static string class to save having magic strings peppered through the application
    public static class WebCacheKeys    {
        public const string Users = "Users";
    }
I can just add strings as required and change the backing value if I need to without having to make a large number of changes elsewhere. If I had IOC available, I probably wouldn't have made this static and would've abstracted the functionality behind an ICacheHelper interface. The way I look at it, this is the next best thing in terms of ability to make changes in the future. When I move to Azure, I'll post the Azure-centric version of this helper.