Why I love SSIS’ xml task

You too love it, right?? Yes I know, SSIS is such a great tool that’s pretty easy and fast to learn. And, XML’s been a revolution in the way data is being stored, transmitted and consumed. Ultimately, SSIS and XML make a cool combo, don’t they??

In this post, I would like to explore how SSIS’ XML task component could be used to compute Null Ratio Percentage as part of data source profiling and cleansing. I chose this topic because I had seen a lot of blogs out there that were using script task instead to do this job, because of some constraints with namespaces, I guess. I do not say that this is wrong, but just that SSIS being such a popular tool, any developer who is not into VB/ VC++ should be able to use it. I myself being an Object Oriented Guy (Object Oriented Programming + Programming Guy = Object Oriented Guy), find it a bit difficult to straight  away write VC++ code. So just to give those ‘other developers’ a seamless experience, here is my attempt.  So, Ready, let’s dive in…

Here is what my demonstration package looks like

Address Line 2 Profiling and XML Task container


Yes, I set up a Null Ratio Data profiling task for Addressline2 column in Adventure Work’s Person table. This as you know creates a .xml ouput, which looks like

xml version=”1.0″?>
<DataProfile xmlns:xsd=”http://www.w3.org/2001/XMLSchema&#8221; xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance&#8221; xmlns=”http://schemas.microsoft.com/sqlserver/2008/DataDebugger/”&gt;
<DataSources>  …  </DataSources>
<DataProfileInput> ..  </DataProfileInput>
<ColumnNullRatioProfile ProfileRequestID=”NullRatioReq” IsExact=”true”>
RowCount=”19614″ />
<Column Name=”AddressLine2″ SqlDbType=”NVarChar” MaxLength=”60″ Precision=”255″ Scale=”255″ LCID=”1033″ CodePage=”0″ IsNullable=”true” StringCompareOptions=”0″ />

Our interest is the total RowCount; 19614 and the NullCount which is 19252. Based on these results, I would like to compute the Null Ratio as 19252/19614 which is 98%+, yup, pretty good.

Next comes the XML Task, and shown below is a detailed task screen grab

XML Task Editor SSISHere is what the XML task does

  • Read the output .xml of the data profiling task
  • Execute the following XPath query against the profile output to get Null Count value


Note: If you had noticed, the *[local-name()=..] is the trick to read nodes using default namespaces.

  • Populate the Null Count value into a package variable for later use

Yup, the third is a script task, and I know, I know, I myself am using a script task here, but just to show you guys what the variable’s value is after the execution, and here it is

XPath Result : 19252

XPath Result : 19252

Cool, so we have what we need. Similarly we could even read the Row Count value and compute the Null Ratio Percentage. However, I would leave to it to you further, to compute the ratio. By the way, here is the XPath expression to read Row Count though,


So, how do you plan to do the Null Ratio Percentage further?? Write to me.. Till I catch you again, Adios..


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s