Scenario

What the doohickey am I talking about? Well, if you’re a SharePoint Developer that is tasked with using SharePoint to house referential metadata, you used to have 1 option. Prior to SharePoint Server 2010, SharePoint Lists were typically used to store this information. By using Lists as the datastore for referential information, we were able to easily provide end users with the ability to manage this information without having to spend any time creating a UI for them. This also enabled the end users to create and manage values which could then be consumed by various applications whether it be basic SharePoint List Forms via Lookup columns, InfoPath forms or custom Web Parts and ASP.NET forms.

SharePoint Server 2010 takes this concept to the next level by introducing the Managed Metadata Service that enables users ranging from Site Administrators to Enterprise Information Architects the ability to manage this information at an enterprise level. However, this blog post will not be used to delve into the features of the Managed Metadata Service. To learn more about the features of Managed Metadata Service and why you would want to use a metadata and taxonomy store in the first place, please see here. In this blog post, I will be discussing query performance between the two options (SharePoint Lists vs. Managed Metadata Service Term Stores).

Testing

So how did I run these tests… First I set out to create a simple Windows Form application as my testing vehicle. In this application, I have a variety of fields and buttons that let me toggle back and forth between querying a specific SharePoint list for values via CAML and being able to query the MMS Term Store for specific terms.

Here’s some sample code that I used to access the goodies:

List CAML Query

string queryString = "<Where><And><Eq><FieldRef Name=\"Platforms\" /><Value Type=\"LookupMulti\">" + 
InputTerm.Text + "</Value></Eq><Eq><FieldRef Name=\"Status\" /><Value Type=\"Choice\">Active</Value></Eq></And></Where></Query>";
bool found = false;
using (SPSite site = new SPSite("https://sharepoint/")) 
{  
   using (SPWeb web = site.OpenWeb())         
   {          
      SPList productsList = web.Lists["Products"];                 
      SPQuery query = new SPQuery();                 
      query.Query = queryString;                 
      SPListItemCollection products = productsList.GetItems(query);
      if (products.Count > 0)                 
      {
         found = true;
      }
      foreach (SPListItem product in products)
      {
         Results.Text += product["Title"].ToString() + "\n";
         Results.Text += found.ToString() + "\n";         
      }
   }
}

MMS Query  

using (SPSite site = new SPSite(siteURL))
{
   TaxonomySession session = new TaxonomySession(site);
   TermCollection terms = session.GetTerms(InputTerm.Text, true);
   foreach (Term term in terms)
   {
      Results.Text += term.Name + "\n";
      if (term.Terms.Count > 0)
      {
         found = true;
         foreach(Term childTerm in term.Terms)
         {
            Results.Text += childTerm.Name + "\n";
         }
      }
   }
}

As a side note, looking at the sample code above, we can see that it is much easier to query for a Term from a MMS Term Store when compared to querying a SPList via CAML because, simply, there’s no CAML involved! You may have also noticed that there’s no need to access the SPWeb objects, so that saves us some memory as well.

The results of these tests were performed against two datasets of relative size. In the first round of testing, I had an SPList with about 40 List Items up against a Term Store with also about 40 Terms. The result set size was configured for 10 each. In the 2nd round of testing, I had increased the number of objects in each container to almost 500 each with a result set size of 100.

Test Results

Small Query Sets
Query Objects =< 40
Result Set Size = 10

Medium Query Sets
Query Objects =< 500
Result Set Size = 100

Summary

As you can see from the test results above, retrieving metadata via the MMS is faster, uses less memory and the code is simpler in comparison to performing CAML lookups on lists. Query times against the MMS seem to be stable/unchanged when the datasets are increased from 40 to almost 500 objects. The CAML query times against lists with similar number of objects were roughly doubled or more on result sets of 100. So CAML query times increased as the list size increased vs. MMS query times did not increase with an increase in the number of terms in the term store. What do you think? 🙂

Advertisements