In Chapter 2, we described how to train a model using an INSERT INTO statement.
Using the tools to train models on the server is called processing. Analysis
Services Data Mining can process all the models in a structure
in parallel on a single data read. It does this by creating a compressed cache of
the data, which is then used to train each of the models in the structure. This functionality
requires several processing options to control exactly what is processed
and when, and how to clean up after you're done. The mechanism is described in
more detail in Chapter 13.
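To make the idea of "one data read, many models" concrete, here is a minimal Python sketch. It is not how Analysis Services implements processing; the functions build_cache and train_all, the JSON-plus-zlib cache format, and the two toy "trainers" are all assumptions made for this illustration.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor


def build_cache(rows):
    """Read the source rows exactly once and keep a compressed copy."""
    payload = json.dumps(list(rows)).encode("utf-8")
    return zlib.compress(payload)


def train_all(cache, trainers):
    """Train every model from the shared cache, one worker per model."""
    rows = json.loads(zlib.decompress(cache).decode("utf-8"))
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, rows) for name, fn in trainers.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for real mining algorithms.
def count_rows(rows):
    return len(rows)


def mean_age(rows):
    return sum(r["age"] for r in rows) / len(rows)


if __name__ == "__main__":
    source = [{"age": 30}, {"age": 40}, {"age": 50}]
    cache = build_cache(source)  # single pass over the source data
    print(train_all(cache, {"RowCount": count_rows, "MeanAge": mean_age}))
```

The point of the sketch is only that the source is consumed once by build_cache, while each trainer afterwards works from the cached copy.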
Note: Before processing a newly created or edited structure or model, you
must first send the object to the server. In immediate mode, simply saving your
work deploys the object. However, in offline mode, you must first deploy the
project. To do so, select Deploy Solution from the Build menu. When you use the
default settings, deploying the project will also cause any objects in the project
to be processed.
Mining models and structures can be in one of three processing states:
processed, partially processed, and unprocessed. A processed object is completely
processed and ready to use. Partially processed is an ambiguous state that
indicates that part of the object is processed and other parts are not. This may
be acceptable for your circumstances. For example, if you have a mining
structure with several mining models and process only one of those models,
the structure is then partially processed. Unprocessed means that the object
contains no data at all.
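The relationship between the three states can be sketched in a few lines of Python. This is a simplified model based only on the description above, not the server's actual logic; the State enum and the structure_state function are names invented for this example, and a structure with no models is assumed to count as unprocessed.

```python
from enum import Enum


class State(Enum):
    UNPROCESSED = "unprocessed"
    PARTIALLY_PROCESSED = "partially processed"
    PROCESSED = "processed"


def structure_state(model_states):
    """Summarize a structure's state from the states of its models."""
    states = list(model_states)
    # Assumption: a structure without models is treated as unprocessed.
    if states and all(s is State.PROCESSED for s in states):
        return State.PROCESSED
    if any(s is not State.UNPROCESSED for s in states):
        return State.PARTIALLY_PROCESSED
    return State.UNPROCESSED


if __name__ == "__main__":
    # One of two models processed: the structure is partially processed.
    print(structure_state([State.PROCESSED, State.UNPROCESSED]).value)
```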
The processing options for Mining Structures and Mining Models are as
follows:
Process Full: Process Full causes the object to be completely reprocessed
from the source data. When this option is sent to a mining structure,
the structure is processed and then each model within is processed in
parallel. When sent to a model, Process Full reads the source data only if the
structure has not been processed.
Process Default: Processing an object with Process Default causes the
server to do whatever it takes to bring the object to a fully processed
state. For example, if the object is already processed, the server performs
no action. If you edit a model within a structure and then send
Process Default to the structure, the server processes that one model
without rereading the source data.
Unprocess: Unprocess causes the object to be completely unprocessed,
dropping all data associated with that object. Sending this command to
a structure causes any caches to be cleared and contained models to be
unprocessed.
Process Structure: Process Structure is only valid on a mining structure
and causes the structure to read and cache the source data without processing
the contained models. Executing subsequent Process Full and
Process Default commands on the models will process information
from this cache.
Process Clear Structure: Using this option on a structure causes the
structure to drop any cached source data while leaving the contained
models processed. This greatly reduces the disk footprint of your mining
structure at the cost of having to reread the data on the next process
command. Additionally, drill-through functionality on any contained
models will be disabled until the models are reprocessed.
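The effect of these options on the cache and on the contained models can be summarized with a small simulation. The following Python sketch is a toy model of the descriptions above, not an Analysis Services API: the ToyStructure class, its method names, and the model names Tree and Cluster are hypothetical, and drill-through as well as structures without models are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStructure:
    """Toy stand-in for a mining structure; not an Analysis Services API."""
    models: dict = field(default_factory=dict)  # model name -> processed?
    cache_loaded: bool = False                  # is the source data cached?
    source_reads: int = 0                       # times the source was read

    def _read_source(self):
        self.source_reads += 1
        self.cache_loaded = True

    def process_full(self):
        # Reprocess everything from the source data.
        self._read_source()
        for name in self.models:
            self.models[name] = True

    def process_full_model(self, name):
        # Read the source only if the structure has no cache yet.
        if not self.cache_loaded:
            self._read_source()
        self.models[name] = True

    def process_default(self):
        # Do only the work still needed to reach a fully processed state.
        pending = [n for n, done in self.models.items() if not done]
        if not pending:
            return
        if not self.cache_loaded:
            self._read_source()
        for name in pending:
            self.models[name] = True

    def unprocess(self):
        # Drop the cache and unprocess every contained model.
        self.cache_loaded = False
        for name in self.models:
            self.models[name] = False

    def process_structure(self):
        # Read and cache the source without processing any model.
        self._read_source()

    def process_clear_structure(self):
        # Drop the cache but leave the models processed.
        self.cache_loaded = False


if __name__ == "__main__":
    s = ToyStructure(models={"Tree": False, "Cluster": False})
    s.process_structure()
    s.process_full_model("Tree")
    s.process_default()
    print(s.models, "source reads:", s.source_reads)
```

In this run the source is read once by process_structure, and both models are then processed from the cache.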
Processing the MovieClick Mining Structure
Here, we will process the MovieClick Mining Structure.
In Immediate mode:
1. Save your structure by clicking the Save button on the toolbar.
2. Select Process Mining Structure and All Models from the Mining Model
menu, or click the Process button on the Designer toolbar.
3. Click Run in the processing dialog.
In Offline mode:
1. Select Deploy Solution from the Build menu. By default, deploying
the solution will process all objects.
2. If the default setting has been changed so that deployment no longer
processes objects, deploy the solution and then follow the instructions
for Immediate mode.
At this point, the Processing Progress dialog will appear, providing status
information for the processing operation. When the process is complete, you
can view details about each step, including the processing time.
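A processing request can also be expressed as an XMLA Process command instead of being issued from the designer. The following Python sketch only builds the command text and does not connect to a server. The database ID MovieClickDB is hypothetical (the text does not name the database), the structure ID is assumed to match the structure name, and the mapping from the option names used in this chapter to XMLA Type values is an assumption to verify against the product documentation.

```python
import xml.etree.ElementTree as ET

XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"
ET.register_namespace("", XMLA_NS)

# Assumed mapping from the option names in the text to XMLA Type values.
PROCESS_TYPES = {
    "Process Full": "ProcessFull",
    "Process Default": "ProcessDefault",
    "Unprocess": "ProcessClear",
    "Process Structure": "ProcessStructure",
    "Process Clear Structure": "ProcessClearStructureOnly",
}


def build_process_command(database_id, structure_id, option, model_id=None):
    """Return an XMLA Process command for a structure or one of its models."""
    def tag(name):
        return "{%s}%s" % (XMLA_NS, name)

    root = ET.Element(tag("Process"))
    obj = ET.SubElement(root, tag("Object"))
    ET.SubElement(obj, tag("DatabaseID")).text = database_id
    ET.SubElement(obj, tag("MiningStructureID")).text = structure_id
    if model_id is not None:
        # Target a single model inside the structure.
        ET.SubElement(obj, tag("MiningModelID")).text = model_id
    ET.SubElement(root, tag("Type")).text = PROCESS_TYPES[option]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "MovieClickDB" is a hypothetical database ID.
    print(build_process_command("MovieClickDB", "MovieClick", "Process Full"))
```

Passing a model_id narrows the command to one model, which corresponds to sending a processing option to a model rather than to the whole structure.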
This excerpt is from Chapter 3, 'Using SQL Server 2005 data mining,' of the book Data Mining with SQL Server 2005.