Thursday, June 26, 2008

Implementing a .NET SortedKeyedCollection

I've been looking for a solution for this one for a while. The .NET Framework v2 provides a generic KeyedCollection<TKey, TItem>, that stores objects using a key which is a property on the object being stored, which is a fantastic concept. However, the objects are stored in a standard dictionary, in the order in which they were inserted, and there's no provision in the class for sorting the collection.

So I turned to the web for a solution - surely, someone somewhere must have tried to do the same thing as me. As it turns out, they had. This wasn't the only such example, but unfortunately, the advice given was always the same - add a Sort method, which is clunky, and forces the implementor to make an extra step to get their collection sorted; or use SortedList<TKey, TItem> or SortedDictionary<TKey, TItem>. These collections are full featured, containing all sorts of helpful methods and properties. I see similar arguments made when suggesting people use List<T> for pretty much any collection. All well and good for 90% of developers, but what do you do about the 10% who are API developers?

The reason why these classes are not so good for API developers is that these helpful classes are not easily extended - few, if any, of the methods or properties are virtual. Microsoft's own recommendation for API developers and developers of third-party libraries is that they use the Collection<T>, KeyedCollection<TKey, TItem>, and ReadOnlyCollection<T>, as these classes deliberately provide virtual methods which can be overridden to provide customisation.

So, here is my implementation of a SortedKeyedCollection<TKey, TItem>, which extends the KeyedCollection and sorts the objects at the point at which they are inserted. I'm adding this code to Google Code, more as a safe repository than to start an open source project - the project's just a shell at the moment, seeing as I can't get an SVN connection to it from work.

   1:  using System.Collections.Generic;
   2:  using System.Collections.ObjectModel;
   3:   
   4:  public abstract class SortedKeyedCollection<TKey, TItem> : KeyedCollection<TKey, TItem>
   5:  {
   6:      protected virtual IComparer<TKey> KeyComparer
   7:      {
   8:          get 
   9:          {
  10:              return Comparer<TKey>.Default;
  11:          }
  12:      }
  13:   
  14:      protected override void InsertItem(int index, TItem item)
  15:      {
  16:          int insertIndex = index;
  17:   
  18:          for (int i = 0; i < Count; i++)
  19:          {
  20:              TItem retrievedItem = this[i];
  21:              if (KeyComparer.Compare(GetKeyForItem(item), GetKeyForItem(retrievedItem)) < 0)
  22:              {
  23:                  insertIndex = i;
  24:                  break;
  25:              }
  26:          }
  27:   
  28:          base.InsertItem(insertIndex, item);
  29:      }
  30:  }

The InsertItem override does most of the work - if you're adding a new object, it intercepts that call, and works out the index that you should be inserting the object, based on its key. By default, it uses the Comparer<TKey>.Default, so string keys are compared using the default (case-insensitive) comparer, and other primitives such as ints and decimals are compared in a similar fashion. By providing a virtual property KeyComparer, I've allowed the user of the class to provide their own IComparer<TKey>, should they want to use a specific comparer, for instance if the key is a custom class.

2 comments:

Unknown said...

First of all, thank you for your posting.

However, by your implementation, I have to compare the new item with all the existed ones, EVERY TIME I want to insert a new item.

Suppose I want to insert 1000 items, the performance is highly doubted. From this point of view, maybe a manual call of sort after inserting all items will be better.

David Keaveny said...

Well, we could probably maintain an internal array of keys, and use Array.BinarySearch<T> on that, but we'd have to do some performance testing to see how that measures up.

Changing the implementation for adding multiple items is a little harder, as since the class inherits from KeyedCollection, there are relatively few methods to override. I could shadow the Add methods, but it's not quite so tidy.