Problem solve Get help with specific problems with your technologies, process and projects.

XML support in SQL Server 2000

A detailed introduction to the three ways you can use XML in SQL Server 2000, with lots of code samples.

Microsoft came up with SQL Server version 2000 only 18 months after its immediate predecessor -- SQL Server 7. Still, the new and improved version has many exciting and long awaited features: arrays, support for larger integers, and improved administration and monitoring tools, to name a few. I wouldn't be the only one to admit that native XML support in SQL Server 2000 is one of the revolutionary steps towards making the software more appropriate for the web environment.

You can interact with SQL Server 2000 with XML in several different ways:

  1. extensions to Transact-SQL
  2. integration of SQL Server and Internet Information Server (IIS)
  3. using XML Updategrams

Let's step back for a minute and think about the reasons why XML support is such a big hit. First, all data is enclosed in a string, so passing it among the various tiers of applications is very efficient. Secondly, XML is supported by all platforms, again due to the fact that all data elements are presented as a string. In addition, it is very easy to apply an XSL (eXtensible Stylesheet Language) template to XML and convert the data into HTML, which is the basis of all web pages.

In this article I'd like to discuss some of the XML extensions to Transact-SQL that let you retrieve relational data in a XML format. SQL Server 2000 allows you to include a FOR XML clause with SELECT statements, which returns a XML hierarchy instead of a recordset. In addition, transact-SQL supports the OPENXML function which lets you read an XML document, convert it into a rowset, and manipulate the data as needed. Since each of these transact-SQL extensions are quite extensive, this article will only cover the FOR XML clause of the SELECT statement. Future articles will discuss the OPENXML function, integration of IIS and SQL Server, and the Updategrams.

The FOR XML clause has three mutually exclusive modes: AUTO, RAW and EXPLICIT. The first two modes are very easy to use, yet do not offer much flexibility. Let's look at the AUTO mode first. The AUTO mode shapes the query results into an XML hierarchy consisting of the table name as the parent tag and the column names as attributes. For example, if we were to retrieve the top row of customers table in the Northwind database using this query...

SELECT TOP 1 * FROM customers FOR XML AUTO

...we'd get the following results:

<customers
CustomerID="ALFKI" 
CompanyName="Alfreds Futterkiste" 
ContactName="Maria Anders" 
ContactTitle="Sales Representative" 
Address="Obere Str. 57" 
City="Berlin" 
PostalCode="12209" 
Country="Germany" 
Phone="030-0074321" 
Fax="030-0076545"/>

If we wish to return column names as elements instead of attributes we can override the default behavior by specifying the ELEMENTS option, as follows:

SELECT TOP 1 * FROM customers FOR XML AUTO, ELEMENTS

Results:

<customers>
<CustomerID>ALFKI</CustomerID>
<CompanyName>Alfreds Futterkiste</CompanyName>
<ContactName>Maria Anders</ContactName>
<ContactTitle>Sales Representative</ContactTitle>
<Address>Obere Str. 57</Address>
<City>Berlin</City>
<PostalCode> 12209</PostalCode>
<Country>Germany</Country>
<Phone> 030-0074321</Phone>
<Fax> 030-0076545</Fax>
</customers>

The AUTO mode works very well with the single-table queries, however, most of the queries will refer to multiple tables. In this case the AUTO mode returns each table name as a parent tag for the columns that appear in the table. The drawback is that the nesting of the XML structure might not be exactly what you expect. The AUTO mode simply treats the first table mentioned in the query as the parent of the second table, the second table will be the parent of the third, etc. Let's examine the following query:

SELECT TOP 1 OrderDate, ShipCity, ShipCountry, ContactName, ContactTitle,
ProductID, UnitPrice
FROM orders INNER JOIN customers ON Orders.CustomerID = Customers.CustomerID
INNER JOIN [order details] ON [order details].OrderID = orders.OrderID
WHERE orders.customerid = 'AROUT'
FOR XML AUTO

Results:

<orders OrderDate="1996-11-15T00:00:00" ShipCity="Colchester" ShipCountry="UK">
<customers ContactName="Thomas Hardy" ContactTitle="Sales Representative">
  <order_x0020_details ProductID="24" UnitPrice="3.6000"/>
</customers>
</orders>

Notice that the Orders tag is a parent of Customers, and Customers tag is a parent of Order Details (XML will automatically add an encoded representation of special characters, such as the space between the words "order" and "details"). Now if we change the query slightly by altering the order in which the columns (and tables) were mentioned, we'll get a very different result:

SELECT TOP 1  ContactName, ContactTitle, OrderDate, ShipCity, ShipCountry,
ProductID, UnitPrice
FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
INNER JOIN [order details] ON [order details].OrderID = orders.OrderID
WHERE orders.customerid = 'AROUT'
FOR XML AUTO

Results:

<Customers ContactName="Thomas Hardy" ContactTitle="Sales Representative">
<Orders OrderDate="1996-11-15T00:00:00" ShipCity="Colchester" ShipCountry="UK">
  <order_x0020_details ProductID="24" UnitPrice="3.6000"/>
</Orders>
</Customers>

Now the Customers tag appears to be the parent of the Orders tag, which in turn is the parent of [order details].

Another drawback of using FOR XML AUTO is that it does not support GROUP BY clause and aggregate functions. If you try executing a query similar to the following you'll get an error:

SELECT AVG(quantity) FROM [order details]
FOR XML AUTO

Results:

Server: Msg 6821, Level 16, State 1, Line 1
GROUP BY and aggregate functions are currently not supported with FOR XML AUTO.

You can get somewhat "creative" if you still want to use FOR XML AUTO with aggregate functions, for instance the above query could be rewritten to return the average quantity using FOR XML AUTO as follows:

SELECT TOP 1 average_quantity = (
SELECT AVG(quantity) FROM [order details])
FROM [order details]
FOR XML AUTO

Results:

<order_x0020_details average_quantity="23"/>

But in general it's better to resort to RAW or EXPLICIT mode if you need to use GROUP BY clause or aggregate functions.

The RAW mode is useful if you don't really care about the table names in your output - you just want the data in one long XML string. The RAW mode simply appends the tag to the output and suppresses all table names. For instance, have a look at the top customer record in the Northwind database with the RAW mode:

SELECT TOP 1 * FROM customers FOR XML RAW

Results:

<row 
 CustomerID="ALFKI" 
 CompanyName="Alfreds Futterkiste" 
 ContactName="Maria Anders" 
 ContactTitle="Sales Representative" 
 Address="Obere Str. 57" 
 City="Berlin" PostalCode="12209" 
 Country="Germany" 
 Phone="030-0074321" 
 Fax="030-0076545"/>

The results are similar with multiple tables, for example:

SELECT TOP 1  ContactName, ContactTitle, OrderDate, ShipCity, ShipCountry,
ProductID, UnitPrice
FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
INNER JOIN [order details] ON [order details].OrderID = orders.OrderID
WHERE orders.customerid = 'AROUT'
FOR XML RAW

Results:

<row 
 ContactName="Thomas Hardy" 
 ContactTitle="Sales Representative" 
 OrderDate="1996-11-15T00:00:00" 
 ShipCity="Colchester" 
 ShipCountry="UK" ProductID="24" 
 UnitPrice="3.6000"/>

Note that the RAW mode does NOT support the ELEMENTS option, but does support aggregate functions and the GROUP BY clause. The following example retrieves the average quantity in RAW mode:

SELECT average_quantity = AVG(quantity) FROM [order details]
FOR XML RAW

Results:

<row average_quantity="23"/>

Still, the majority of your queries will require more control over the XML structure of the output. The middle tier components or the front end programs will expect the XML output in a particular format, not in a jumbled up mess. So although helpful and easy to use, the AUTO and RAW mode will not be your friend in most cases. If you need to have complete control of the output's XML structure you'll have to resort to the EXPLICIT mode.

The EXPLICIT mode builds a universal table (in memory) that contains the information about how the XML document needs to be built. That means, the universal table will know whether an entity is to be presented as an attribute or an element and the hierarchy in which the tags are to be nested. In addition, the EXPLICIT mode lets you read XML data which might already be contained in your relational tables and even generate the XML schema for your output.

For More Information

  • What do you think about this tip? E-mail us at tdichiara@techtarget.com with your feedback.
  • The Best SQL Server Web Links: tips, tutorials, scripts, and more.
  • Have an SQL Server tip to offer your fellow DBA's and developers? The best tips submitted will receive a cool prize--submit your tip today!
  • Ask your technical SQL Server questions--or help out your peers by answering them--in our live discussion forums.
  • Check out our Ask the Experts feature: Our SQL, Database Design, Oracle, SQL Server, DB2, metadata, and data warehousing gurus are waiting to answer your toughest questions.

Dig Deeper on XML in SQL Server

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close