Using the Pivot transformation in SQL Server Integration Services

Using the Pivot transformation in SQL Server Integration Services

Date: Nov 20, 2009

Pivoting is a useful way to transform data in SQL Server Integration Services (SSIS) -- but it's hardly intuitive.

Related videos:

Saving time with text data sources in SSIS

Using package configurations in SQL Server Integration Services (SSIS)


Read the full transcript from this video below

Using the Pivot transformation in SQL Server Integration Services

Eric Johnson: Hello. Welcome to this screencast on Using
the Pivot Transformation on SSIS. My name is Eric Johnson.
I am a SQL server MVP and a consultant with Consortio
Services, in Colorado Springs. Today we are going to look
at the pivot transformation because it is one of those tasks
in SSIS that you will often find that you have a use for, but
it is very non-intuitive, I guess is the best way to say it, to
configure it. So I am going to walk you through an example
and show you how it works.

To get started, the first thing we need is a data flow. I have
is a nice blank SSIS package that I have added nothing to,
and I want to drop a new data flow task onto our SSIS
package, and we are going to go into there. I have a couple
of different tables set up, and I will show you the layout of
those tables, as I link them. I am going to start with a OLE
DB data source, and double click it. As you can see, this
allows me to define my properties of that OLE DB connection.
We are going to click New, and when this come up, were going
to click My Local Host, which is ejohnsonmobile, and the
Adventurework data base. Click OK. It is a Table or View. I
have a table in here that is called 'Questions.' That is going
to be our source data.

If I preview this, you will see this is 4 questions that were
asked of 3 different individuals. You can see they were
each asked for their name, height, favorite color, and age.
Also, this is the Survey ID column, so where you see survey
ID 1, this was all one person, so it relates back to the Eric
Johnson record, 2 relates to the Josh Jones record, and so
on. What we want to do is take this data and pivot it around
so we have a favorite color, an age, a name and a height
column that contains the answer they gave. When we are
done, we want 3 rows, 1 for each survey that was
conducted. That is our source data.

I am going to click OK to accept that. I am going to drag
down to the bottom of my toolbox and pick an OLE DB
destination. Basically, for the OLE DB destination, it is
going to yell at us about input columns, but we will fix that
in a second. We are going to go to a table in the same
database, but this time called 'Questions Pivot', basically
just a pivoted version of that table. As you can see, it has
an ID, name, height, favorite color, and age. Those are the
3 columns that we are going to go ahead and populate.
Since we have not done column mappings, this is not
going to let me save these changes until we have some
incoming data, so we are going to come back to this
transformation in a minute, but I just wanted you to
see how that table looks. For now I will click Cancel.

We are just going to drop a pivot transformation task onto
our data flow, and it is just called 'Pivot,' and just drop it in
the middle. We will take the output from the source data and
feed it into the pivot. Now we are going to go in and actually
configure the pivot. This is where things get just a little bit
tricky, as the pivot is not exactly, as I said earlier,
straightforward.

We are going to double-click the Pivot Transformation, and
that is going to open up a fairly typical editor window. If you
have worked with SSIS, this will look fairly similar to any of
the other transformations. First thing we have to do is go to
the Import Columns" tab and tell it which columns we want
available to our pivot. We do not want ID, because we really
do not care, that is just an identity column for the data in that
table, so we are just going to ignore that one. I do want
another Question and Answer, and Survey ID. Here is where
it gets a little bit tricky. You have to come over to Input and
Output Properties. Here, you are going to see a whole bunch
of listings of columns; input, output, and the error output,
again, very similar to other transformations. We actually have
to go into these properties and edit each of the columns
individually in order to pivot them.

There are a few things we need to change and the first thing is
the option for the pivot. If you come down here, you will see a
property called Pivot Usage; this can be a 0, 1, 2, or 3. 0 means
that the column is not pivoted. 1 means it is part of the key that
identifies the row. 2 means the column is the pivot column, and
a column will be created from that row of data. Lastly, is where
the values from where the new columns are coming from, as
far as the data is concerned, as it rests now.

This first column, Question, this is the column that we are
actually going to pivot, and Question is going to become our
columns in our new table. We will set this to a 2. Answer, this
is where our values come from, and this is option 3. This is
all documented in BooksOnline if you need to look up these
options. I did run through them quickly, but we will just keep
going. Question 2, Answer 3. Survey ID is part of the set that
identifies our key, so we are going to make that a 1. If you
had other columns that you wanted to just pass through and
use, you could set them to 0. In our case, we do not, so that
is enough.

Step 2: You need to configure your pivot output. As you see,
there are no output columns. For the output columns, we are
going to need to add one output column for every column
that is in our destination. I am just going to click Add Column
a few times. Actually, I am going to need 5 columns in order
to support my data. The first column is going to contain survey
ID, so I am going to call it ID. The next column will end up
containing the name of the person surveyed. The next column
we will use as age. The next column we will use as favorite
color, I will just call it FavColor. The last column we will go
ahead and use as height.

This is not quite done, we have to map the lineage from the
input columns to the output columns. The survey ID is going
to be read from the value of SurveyID. In order to do that,
you will see we have a Source Column value. What we need
to do is to make sure that Source Column value has the
lineage ID of the column that it is reading from. If I click up
to SurveyID in our input, you will see it has a lineage ID of
26. I come back down to my output; I need to tell it my source
column is 26. For name, age, favorite color, and height, these
are all sourced from the answer, that is where the value for
this is coming from, the Answer column. If you look in Answer,
it has a lineage of 23. For name and age, favorite color, and
height, we put 23.

Last but not least, we have to tell these 4 columns how they
know they are pulling the correct value from the source data.
That is what the pivot key value property is for. For name, we
have to tell it that the pivot key value is Name. We will take a
look at the source data before we run this, but that is telling it
that when it finds Name in the pivot column, which we have
already told it that Question the pivot key. When it finds
Name and that pivot key, it pulls its data from the Answer
column. Same for age, so the pivot key value for age was
Age. Favorite color, it was actually Favorite Color with a
space. Height was Height.

What that basically means is search the pivot column, which
is Question, for the value Height, and then pull the answer,
that is why the source is 23, which is the Answer column and
the pivot key value is Height. That should pretty much do it. I
have got all that done. I am going to go ahead and click OK,
now I have my pivot.

I am going to go ahead and join the pivot up to the OLE DB
destination, which we configured earlier. We are going to
come back in and reselect the Questions pivot, again, we
want to go into ID, Name, Height, Favorite color, and Age.
We now have those columns, because we have accomplished
the pivot. You will see, here is Name, Height, and Age.
Favorite Color did not map because I mapped it funny, I
typed FavColor. We want to use SurveyID in the ID
column of our destination table.

I am going to go ahead and click OK. It is saying,
'Truncation may occur because the name column has
a length of 100, but in the data source it is less than that.'
That is just because the source column in our pivot came
in from the Answer column, which was 100, so all the
output columns inherited the exact same data type length.
The data base name is only 50, the Height is 10, Favorite
color is 50, and Age is an integer.

Last but not least, to show this working, I am going to put a
couple of data viewers on. Go to the Data Viewer tab. If you
do not know how to use data viewers, they are really convenient,
you should really check those out, and it might be a topic we
talk about at a later time in another screencast. I am going to
add a grid data viewer and I am going to add a grid data viewer
after the pivot. What this will do is stop our package and allow
me to see what the data looks like in-flight before it gets
written to the destination.

I am just going to right click Package1, select Execute, and it
will just take one second to validate, then it should start running.
You can see I already got this window that has popped up in my
face, and I am going to rearrange some of the windows so you
can see where we are at. We have read the data from the
source and retrieved the 12 rows. This should look very familiar;
this is the same data we previewed earlier, that is our source
data. We are going to go ahead and let the package continue
by clicking the green arrow. Here is the data after it has been
pivoted. I will just put these two side-by-side. You can see I
now have one row for SurveyID 1, which was these 4 rows previously,
and its, name Eric Johnson, Age, Favorite Color and as I scroll over,
Height; same thing for this Survey 2, and Survey 3. I have taken these
12 rows of data, collapsed them to 3 rows of data, and put the various
values into 2 columns. That has basically pivoted the data. If I were to
let this play all the way through, I might get an error, depending on
how it functions with the truncation warning. In fact, it did not, it went
ahead and finished successfully. That is how the data now looks in
the new Questions pivot table and this is how it looked in the
source table.

That is basically it. We have gone through and pivoted our data.
Like I said, it is a fairly manual process. There is a lot out there on
it, if you want to look up some manual references. Hopefully this
gave you a little bit of information and insight as to how to get
this done. Thanks a lot.

More on Microsoft SQL Server Integration Services (SSIS)

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: