Definition

data preprocessing

Data preprocessing is any processing performed on raw data to prepare it for a later processing step. Commonly used as a preliminary step in data mining, it transforms the data into a format that can be processed more easily and effectively for the user's purpose, for example as input to a neural network. Tools and methods used for preprocessing include: sampling, which selects a representative subset from a large population of data; transformation, which manipulates raw data to produce a single input; denoising, which removes noise from data; normalization, which rescales values to a common range or form so that they can be compared and processed consistently; and feature extraction, which pulls out specified data that is significant in some particular context.
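As a rough illustration, several of these methods can be sketched in a few lines of Python using only the standard library. The function names, the moving-average smoother, the min-max rescaling, and the choice of summary features are all assumptions made for this example, not part of the definition; normalization is taken here in its usual preprocessing sense of rescaling values to a common range.

```python
import random
import statistics


def sample(data, k, seed=0):
    """Sampling: draw k items at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(data, k)


def denoise(values, window=3):
    """Denoising: smooth values with a centered moving average.

    Near the edges the window is truncated to the values available.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def min_max_normalize(values):
    """Normalization: rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def extract_features(values):
    """Feature extraction: reduce a series to a few summary numbers."""
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
    }
```

In practice these steps are usually chained, for instance `extract_features(min_max_normalize(denoise(readings)))`, and the right order and parameters depend on the data and on the downstream model.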

In a customer relationship management (CRM) context, data preprocessing is a component of Web mining. Web usage logs may be preprocessed to extract meaningful sets of data called user transactions, which consist of groups of URL references. User sessions may be tracked to identify the user, the Web sites requested and their order, and the length of time spent on each one. Once these have been pulled out of the raw data, they yield more useful information that can be put to the user's purposes, such as consumer research, marketing, or personalization.
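The session-tracking step described above might look something like the following sketch. It assumes the log has already been parsed into `(user_id, timestamp, url)` records with the user identified, and it starts a new session after 30 minutes of inactivity; the record layout, the function names, and the timeout value are illustrative assumptions rather than anything the definition prescribes.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity; an assumed cutoff


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Group log records into per-user sessions.

    records: iterable of (user_id, unix_timestamp, url) tuples.
    Returns {user_id: [session, ...]}, where each session is a list of
    (url, seconds_on_page) pairs in request order. The time on the last
    page of a session cannot be measured from the log, so it is None.
    """
    by_user = defaultdict(list)
    for user, ts, url in records:
        by_user[user].append((ts, url))

    sessions = {}
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        user_sessions, current = [], []
        for ts, url in hits:
            # A long gap since the previous request closes the session.
            if current and ts - current[-1][0] > timeout:
                user_sessions.append(current)
                current = []
            current.append((ts, url))
        if current:
            user_sessions.append(current)
        sessions[user] = [_with_durations(s) for s in user_sessions]
    return sessions


def _with_durations(session):
    """Attach time spent on each page, taken as the gap to the next request."""
    result = []
    for i, (ts, url) in enumerate(session):
        if i + 1 < len(session):
            result.append((url, session[i + 1][0] - ts))
        else:
            result.append((url, None))
    return result
```

Each resulting session corresponds roughly to a user transaction in the sense used above: an ordered group of URL references, with the time spent on each page where the log allows it to be estimated.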

This was last updated in September 2005
