BACKGROUND IMAGE: iSTOCK/GETTY IMAGES
PowerPivot is a Microsoft Excel add-in that lets you incorporate business intelligence (BI) into the Excel environment. You can create calculated columns and measures based on Data Analysis Expressions (DAX), a new formula language designed specifically for analyzing data in related tables. DAX formulas are similar to traditional Excel formulas but have far more functions, and while Excel formulas limit you to working with ranges of data, DAX lets you work with much larger data sets.
To create a calculated column or measure, you define a DAX formula that incorporates the functions, operators and column references necessary to carry out the expression’s logic. You must start the formula with an equal (=) sign and enclose all column names in brackets, as in [SalesAmount]. If you’re referencing a table other than the one in which you’re creating the column or measure, precede the column name with the table name, as in FactInternetSales[SalesAmount].
The DAX language supports operators for performing mathematical operations, such as addition (+) and subtraction (-); operators for comparing values, such as greater than (>) and lesser than (<); a concatenation operator (&) to connect two string values together; and the logical operators double ampersand (&&) to create an AND test condition and a double pipe (||) to create an OR condition.
Using DAX formulas for calculated columns
When you create a calculated column, you must define a DAX expression. In the following example, I define a DAX expression that includes both column references and arithmetic operators:
=FactInternetSales[SalesAmount] + FactInternetSales[TaxAmt] + FactInternetSales[Freight]
Notice that I precede the expression with an equal sign and use the addition operator to add the SalesAmount, TaxAmt and Freight columns together. Figure 1 shows the calculated column (TotalSales1) and its expression.
Because I reference columns in the same table in which I’m also creating the calculated column, I could have omitted the table name, as in the following example:
=[SalesAmount] + [TaxAmt] + [Freight]
Notice how much this simplifies the expression and, as you can see in Figure 2, the new column (TotalSales2) returns the same data as TotalSales1.
DAX supports more than 130 functions. The principles for using each function are the same; that is, when you call the function, you include one or more arguments (separated with commas) enclosed in parentheses. In the following DAX expression, I use the RELATED function to create a calculated column that retrieves data from the DimCustomer table:
=RELATED(DimCustomer[FirstName]) & " " & RELATED(DimCustomer[LastName])
For each RELATED function, I pass in only one argument -- the name of the target table and column. You can use this function to retrieve data from any table related to the one in which you’re creating the calculated column. I then use the concatenation operator to join the first name, a space (in quotes), and last name. Figure 3 shows the new calculated column (FullName) and the DAX formula, shown in the formula bar above the PivotTable.
Creating measures with DAX formulas
When used in a PivotTable, a measure is a column that includes data automatically aggregated across multiple dimensions - that is, across a PivotTable’s columns, rows and slicers. For instance, in the following DAX expression, I define a measure (in the FactInternetSales table) that adds together the values in the SalesAmount column:
After I include the column name as an argument to the SUM function, I can add the measure (TotalCost) to the Values section of my PivotTable, as shown in Figure 4.
The TotalCost measure provides totals for each product and for each region -- and for each product across all regions. If you were to scroll to the bottom, you would find totals for each region and for all regions and products.
A limitation of the SUM function in DAX and Excel is that you can include only one column as an argument. However, DAX provides a set of functions that let you include expressions with multiple columns. For example, in the following DAX expression, I use the SUMX function to add together the values returned by an expression that finds the difference between two column amounts:
=SUMX(FactInternetSales, [SalesAmount] - [TotalProductCost])
The SUMX function takes two arguments. The first is the name of the table that contains the source data, and the second is the function’s expression -- the SalesAmount value minus the TotalProductCost value. Figure 5 shows how the measure (NetCost) aggregates the data.
The measure aggregates data across different products and regions and provides totals for each category. The following DAX expression also returns the net sales amount, but only for those products with a Current status:
=SUMX(FILTER(FactInternetSales, RELATED(DimProduct[Status]) = "Current"), [SalesAmount] - [TotalProductCost])
The first argument of the SUMX function now includes the FILTER and RELATED functions. The FILTER function takes two arguments. The first is the source table and the second is the filter condition. In this case, the condition specifies that the measure should aggregate data only for those products in the DimProduct table whose Status value equals Current. Notice that I use the RELATED function to reference the DimProduct table. Figure 6 shows how the measure (CurrentSales) aggregates the data.
The examples I’ve shown here are only a sample of the types of formulas you can create with the DAX language in PowerPivot. You can analyze data based on date ranges, create conditions that further refine the data or include data taken from multiple tables. Before, performing these types of calculations would have required an in-depth knowledge of relational and online analytical processing data structures. With DAX, you can incorporate sophisticated analytics and comprehensive BI into your Excel environment, turning your spreadsheet world into a wide-open frontier.
ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books, articles and training materials related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. Check out his blog, Slipstream.