Part of any data mining project is learning about and understanding the nature
of your data. By leveraging controls from Office Web Components (OWC), the
DSV Designer provides the functionality to explore your data in four different
views. By right-clicking a DSV table and selecting Explore Data, you can view
your data as a table, pivot table, simple charts, and a pivot chart. By default, the
Explore Data component will sample 5,000 points of your data. The option buttons
in the upper left of the Explore Data window allow you to change this setting
to a maximum of 20,000 points, due to a limitation of the OWC controls.
The tabular views allow you to do a simple exploration of your data. Clever
use of the pivot table will allow you to get a better understanding of the data
by arranging, slicing, and aggregating your data in different ways. For example,
by exploring a pivot chart on the Customers table, you can find the average
Age and its standard deviation by using the Bedrooms column we created
previously. (See Figure 3.8.) This is possible because we are exploring the DSV
table and not the actual source table as it is in the data. We can explore Named
Queries in the DSV in precisely the same manner.
The graphical exploration offers a page of simple column, pie, and bar
charts plus a pivot chart view. Using the simple charts you can see histograms
and pies of various attributes side by side. If your data is continuous, the chart
divides the continuous range into 10 buckets. The pivot chart, on the contrary,
provides a wealth of graphing controls to analyze your data, from your standard
line, bar, scatter, column, and pie charts, to more exotic types such as
doughnut and radar charts, as shown in Figure 3.9.
The pivot table and chart have many configuration options to help you
analyze your data in different ways. Many of these are available through the
context-sensitive Command and Options dialog box, from the Context menu,
or from embedded toolbars. Virtually every aspect of the tables and charts can
be modified, either by graphically selecting the object or by using the selection
box on the General tab of the dialog. Describing the full feature set of the OWC
could easily fill another book and mastering the OWC controls for best value
will take some practice, but with experience you will be able to manipulate the
controls to find exactly the right view for you. Additionally, the pivot table and
chart are linked, so you can switch back and forth, make edits, and see how the
change affected the other view.

Figure 3.8 Exploring data with the pivot chart

Figure 3.9 Radar chart showing Age by Bedrooms and Home Ownership
One additional feature of the pivot chart that is important for data exploration is graphical named query generation. By clicking the Named Query button on the toolbar, you can use elements of the chart to define a named query. For instance you could select only those homeowners with one bedroom and renters with four or more on the chart and add them to the query. This named query becomes like any other and can be used as a source for exploring data.
Note: Although the Explore Data window looks like other document windows, it
is, in fact, a tool window like the Solution Explorer and Properties windows. By
right-clicking the Window tab you can change the Explore Data window into a
floating or dockable window. You can also open up many Explore Data
windows on different DSV tables to display charts and tables side by side.
Click here to return to the complete list of book excerpts from Chapter 3, 'Using SQL Server 2005 data mining,' from the book Data Mining with SQL Server 2005.