Retrieve XML data values with XQuery in SQL Server 2005

When you want to access specific XML data values -- instead of XML as a single data type value -- the XQuery is the scripting language you're looking for. Microsoft designed XQuery specifically to access XML data in SQL Server 2005, allowing you to retrieve XML columns, variables and parameters. In this tip, you'll learn how to use two XQuery methods within your T-SQL statements: Value() to retrieve a single XML value and Query() to retrieve multiple XML values.

When Microsoft released SQL Server 2005, it introduced a new data type: XML. As with other SQL Server data types, you can use the XML data type to define columns, variables and parameters for stored procedures and user-defined functions, and you can access XML data in its entirety -- that is, as a single value -- just like you would an XML document. However, as is the case with an XML document, there might be times when you want to...

access only specific values within the XML data. And that's where XQuery comes in.

XQuery is a powerful scripting language designed specifically to access XML data. SQL Server 2005 supports a subset of the XQuery language that lets you access values within an XML column, variable or parameter. You can use XQuery within your Transact-SQL statements by calling methods supported by the XML data type. Two of these methods -- value() and query() -- are particularly useful for retrieving specific elements from your XML data.

In this article, I introduce you to both methods and provide examples of each. Note that the article assumes you are already familiar with T-SQL and XML in SQL Server.

The XML value() method

When you call an XML method within your Transact-SQL statement, you specify the XML column, variable or parameter name, followed by a period, the method name and an XQuery expression within parentheses. For example, to call the value() method on an XML column named XmlInfo, you would use the following syntax:

XmlInfo.value(<XQuery expression>)

The XML value() method returns a scalar (single) value of the type specified when you call the method. The value() method takes two arguments. The first identifies the element and value to retrieve, and the second specifies the data type of the returned value. Let's look at an example to demonstrate how this works. The following statement retrieves data from the JobCandidateID and Resume columns in the HumanResources.JobCandidate table (part of the AdventureWorks sample database):

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
(/ns:Resume/ns:Name/ns:Name.Last)[1]',
'nvarchar(30)'
) AS LastName
FROM HumanResources.JobCandidate

The second column in the SELECT list (Resume) is defined with the XML data type. As a result, you can call any of the XML methods. In this case, I call the value() method and pass in two arguments. Each argument is enclosed in single quotes and they are separated with a comma. Now let's take a closer look at the first argument.

The first argument is divided into two parts and separated by a semicolon. The first part declares a namespace and assigns it to the "ns" alias. You must specify the namespace for any typed XML column. A typed column is one that is associated with a specific schema. In SQL Server Management Studio, you can determine the schema associated with an XML document by retrieving the entire value from the XML column and then clicking on the returned value. The XML document opens in a separate window, as shown in Figure 1. Notice that the schema is defined as an attribute of the root element.

Determining the schema associated with an XML document.
Figure 1: Determining the schema associated with an XML document.
(Click on image for enlarged view.)

The schema listed in the XML document is the one you use as your namespace when calling the value() method. If the XML column is untyped (not associated with a schema), you do not need to declare a namespace, and you can omit the first part of the argument.

The second part of the argument identifies the XML element containing the value you want to retrieve.

The element is essentially a path name separated by forward slashes. When working with a typed XML column, you must precede each node in the path name with the namespace alias, followed by a colon. In this case, I precede each node with the "ns:" preface. Refer again to Figure 1 and where you can see this element near the top of the document.

Notice that the path name is enclosed in parentheses and is followed by [1]. Because the value() method can return scalar values only, you must specify [1] after the closing parenthesis to ensure only one instance of the element can be returned, even if there is only one instance in the XML document. After you specify the path, you must specify the data type. The SQL statement will now retrieve the last name of each job candidate and return it as data type NVARCHAR.

Like any language, XQuery supports a wide variety of functions. For example, the following statement uses the concat function (in the second part of the first argument) to concatenate the first name with the last name:

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName
FROM HumanResources.JobCandidate

The concat function takes multiple arguments that are separated by commas. Each argument is the value you want to concatenate. Notice that the first and third arguments use the same path construction as the previous example. Whenever you refer to an element, you must use the complete path. For information about the concat function and all of the XQuery functions, see Microsoft SQL Server 2005 Books Online.

The XML query() method

Although the value() method is handy for retrieving a single value from an XML column, you'll often want to retrieve multiple values. In these cases, you should use the XML query() method. The query() method takes only one argument and returns the specified elements as XML. For example, the following SQL statement returns the Education elements and their child nodes for each job candidate:

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
/ns:Resume/ns:Education'
) AS Education
FROM HumanResources.JobCandidate

The query() method's argument is enclosed in single quotes and separated into two parts. Again, you must identify a namespace when retrieving data from a typed column. You declare the namespace in the same way you declared it for the value() method. After you declare the namespace, specify which elements to retrieve. In this case, the Education elements and all their child nodes are returned, as shown in Figure 2.

Once the namespace is declared, specify which XML elements to retrieve.
Figure 2: Once the namespace is declared, specify which XML elements to retrieve. (Click on image for enlarged view.)

You can achieve the same results by casting the second part of the argument as a FLWOR (pronounced flower) expression, which, by definition, is made up of "for, let, where, order by, and return" clauses. Note, however, that SQL Server currently does not support the let clause.

The XML query() method in the following example uses the "for" and "return" clauses to retrieve the Education elements and their child nodes:

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education
return $ed'
)AS Education
FROM HumanResources.JobCandidate

As you can see, the FOR clause includes the $ed variable. You do not have to explicitly declare this variable. Using it in the FOR clause is enough. The variable is used to iterate through the Education elements. (You can use any name for the variable, as long as you follow SQL Server naming conventions.) The RETURN clause then references the $ed variable. As a result, it returns each Education element and its child nodes.

Of course, there's no reason to use a FLWER expression simply to retrieve an element and its child nodes. However, you can include an expression in the path name of your FOR clause (in brackets) that limits the results returned. For example, the following statement limits the results to those Education elements that contain an Edu.Level value of "Bachelor":

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education[ns:Edu.Level="Bachelor"]
return $ed'
) AS Education
FROM HumanResources.JobCandidate

Notice I use an equal comparison operator (=) to compare the Edu.Level element value to the string value. This forms a Boolean expression that must evaluate to true in order for the element to be included in the results. XQuery supports a variety of operators for creating Boolean expressions. You can view the supported operators in Microsoft SQL Server 2005 Books online.

Rather than specify a Boolean expression in the FOR clause, you can define the same logic in a WHERE clause, as in the following example:

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education
where $ed/ns:Edu.Level="Bachelor"
return $ed'
) AS Education
FROM HumanResources.JobCandidate

In this case, the FOR clause includes only a simple path, and the WHERE clause includes the Boolean logic. Notice that the path name uses the $ed variable to point to the correct Education element that the FOR clause is iterating through.

Putting this logic in the WHERE clause makes it easier to read and write your code, particularly if you use complex Boolean expressions. For example, the next statement uses the "and" logical operator in the WHERE clause to limit the results to business majors with bachelor's degrees:

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName,
Resume.query(('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education
where $ed/ns:Edu.Level="Bachelor" and $ed/ns:Edu.Major="Business"
return $ed'
) AS Education
FROM HumanResources.JobCandidate

The two path names are joined together with the "and" operator. As a result, both conditions must evaluate to true in order for the Education element to be returned.

Now let's look at an example that includes an ORDER BY clause. The following statement retrieves employment information related to technology and production:

 SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)'
) AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $emp in /ns:Resume/ns:Employment
where $emp/ns:Emp.FunctionCategory = "Production" or
$emp/ns:Emp.FunctionCategory = "Technology"
order by $emp/ns:Emp.EndDate descending
return $emp'
) AS Employment
FROM HumanResources.JobCandidate

As in the preceding examples, the statement uses a FOR clause to identify the desired elements (in this case, Employment) and the WHERE clause to limit the results. Notice, however, that the statement also includes an ORDER BY clause. You can use this clause to sort your results based on a specified node. In this case, I'm sorting the results based on the Emp.EndDate element, and I'm sorting the information in descending order. Accordingly, the most recent employment information will appear at the top of the list.

As you can see, the value() and query() methods can be quite useful when retrieving XML data. However, what I've shown you here only scratches the surface. XQuery is a powerful language that allows you to write complex queries that return XML data in exactly the way you need that data to return. Because of the rich variety of functions, operators and expressions supported by XQuery, you can retrieve just about any element and attribute stored in your XML columns, variables and parameters – in just about any format necessary. Again, be sure to refer to SQL Server 2005 Books Online for additional information about XQuery and for more examples that demonstrate how it works.

ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. You can find more information at  http://www.rhsheldon.com.

This was first published in April 2008

Dig deeper on XML in SQL Server

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close