FOR XML a la Mode –a study on Modes, especially using For XML Path Mode to Concatenate Data

Hello folks, Julie here with the as promised weekly Datachix blog—

Hup Hup—did you say just say  “Julie” here?

Yes I did.

You specifically said last week that you and Audrey would blog weekly and ALTERNATELY.  You posted last week. Isn’t it Audrey’s turn?

Yes, but Audrey is fiercely battling Reporting Services at a new client site–she can’t post this week.   Also, I think there were Ninjas…. and maybe some Bear-Sharks……… she’s really really….. busy.

Fine.

 Hello folks, Julie here with the as promised weekly Datachix blog.  😉   Recently a buddy of mine left a comment for us which went like this:

 Can you post an example of putting data from multiple rows into a single column?

 For example you have this:

Cat | meow

Cat | eat

Dog | bark

Dog | fetch

 

And you want to display it as this:

Cat | meow, eat

Dog | bark, fetch

It’s hard to find a good example of this online.

Thanks, Ben

 I didn’t have time to think it all the way through at the time and this is not something I know how to do from memory, but knew that Itzik Ben-Gan posted a great way to do it way back when, and I remembered that Audrey just had done a post which referenced the method.  I wanted to give Ben (my buddy, not Itzik) an answer quickly so I quickly referred him to the dynamic pivot post Audrey has written.  The answer is in her post, but her post also encompasses a lot more than just concatenating the values of a column.  So now I’ve had time to focus more specifically on concatenating the data.

Here  is the reader’s digest version answer to Ben’s question:

First the Setup:

use tempdb

go

if exists (select 1 from sysobjects where xtype = ‘u’ and name = ‘tempanimals’)

drop table tempanimals

create table tempanimals

(animal varchar(20) null, verb varchar(20)null)

insert into tempanimals

values( ‘cat’, ‘eat’),

(‘cat’, ‘meow’),

( ‘dog’, ‘bark’),

(‘dog’, ‘fetch’)

Here’s the result of select * from tempanimals

Figure 1

 

A quick, here’s ONE (not THE one, just one of many) answer, cut and paste, I’ll learn how it works later, answer to how to get this :

And you want to display it as this:

Cat | meow, eat

Dog | bark, fetch

is this Proc, which would accept the name of the animal you want to display as an input parameter and return the data in the desired format: 

CREATE PROC  GetAnimalVerbsbyAnimal (@animal varchar(20) )

AS

BEGIN

DECLARE @string varchar(max)

      SET @string = (SELECT stuff((SELECT ‘, ‘ + verb  AS [text()]

      FROM tempanimals

      WHERE animal = @animal FOR XML PATH(”)),1,2,”))

 SELECT DISTINCT animal, @string verbs FROM tempanimals WHERE animal = @animal

  END

  —-examples of using

  exec GetAnimalVerbsbyAnimal ‘cat’

Figure 2

Stop here if a quick answer is all you need.

———————————–

But you want a play by play don’t you?  How did that just happen?

 If you are new to the for XML expression inside SQL Server you might be wondering the following:

  • How does an XML expression concatenate the data?
  • Why aren’t there any tags in my result, shouldn’t XML create tags??, What kind of cockadoody XML is this?!!
  • How important is that “Path” thing?
  • Ditto on that “[text () ]” thing?
  • How important is that “Stuff” thing?
  •  Is this even SQL anymore, or Pure Wizardry?

If you break it down, the only thing that you might recognize if you are new to XML inside SQL Server, is the abbreviation XML itself (and maybe Stuff, Stuff is old).

SELECT STUFF((SELECT ‘, ‘ + verb  AS [text()]

      FROM tempanimals

      WHERE animal = @animal FOR XML PATH(”)),1,2,”)

If you’re like me, (and for your sake I kind of hope that’s not true on most matters, but in deciphering SQL it’s fine), you have to start highlighting parts of the SQL and running it in pieces, then combine your efforts with lots of research and sooner or later you figure out what is going on.   I’ve been intrigued by XML for a while now and have used it in my work, but  I still feel like I have only just begun to harness its utility. 

The first thing to know in this case is that For XML has four Modes, and Path is one of them.

 Raw, Auto, Explicit, Path  — modes of T-SQL For XML expressions.

Here are the nice Microsoft definitions:  ClickMe

Here are my definitions, along with some background:

  • Raw =  Simple XML.  One Line per row, the default tag value for a row element is <row></row> .  No hierarchy without writing nested XML and even then it’s clumsy.  Vanilla XML.

  • Auto =  (I pride myself on having a decent vocabulary.  The Microsoft definition of For XML Auto sent me straight to the online dictionary to define the keyword Heuristics.) For XML Auto means Heuristics are applied to create a document with nesting when it makes sense.  Which basically means “wing it based on your experience” the engine is telling us–  “Trust us, we’re going to make for you the bestest, nestedest XML we can based on everything we might ever know, and some things we don’t know but that sound good.  Love Microsoft”. 

    • The default behavior of Auto mode is for it to use the table names as the tag values for row elements and the column names as attributes of that row element.  It also creates a hierarchy for you, based on the relationships defined between tables.  So if you join tables in your sql statement, the Auto Mode will attempt to create that relationship in the XML as well.   I’ve used the Auto mode with success before; the one thing I didn’t like was that it would generate duplicate data inside the XML output in the case of more than two nested levels of data (ie I was creating a document from more than two related tables).

  • Explicit.  Explicit is intimidating.  It’s hard. 🙁   It gives developers sad faces.  It also gives the developer the most control over the shape of the document.  I’ve started learning explicit because I want to avoid the duplicate data sometimes created with auto mode.  I’ll post about that later, because although hard, I think it’s worth learning.  I’ve managed to get my XML nested properly and without duplicates using it and that gave me a happy face.  🙂

  • Path.  Path is supposed to be the happy median of Auto and Explicit.  Easier than explicit but more precise than Auto.  It gets its name and many of its conventions from the XPath Language.  Path Mode was added to SQL Server in 2005 edition.  Very shortly thereafter, people started using it to concatenate data from result sets (in other words–doing exactly what Ben is asking about).  This trick even has a cute nickname, the XML BlackBox Method.  Like Raw, the default value for a row element is <row></row> ; unlike raw however, the default behavior of Path when it comes to columns is to make column names elements as well, instead of attributes.  (this is important to remember for know how the blackbox trick works)

First let’s look at the data in its various basic forms (I’m going to skip Explicit for now):

Figure 5

 

The bottom 3 results hardly seem readable.  But just click on the blue text in your results pane and voila!  A new window opens with the XML formatted .  As you look at the results, be sure to notice the element tag values and the differences between how each mode renders the data.

(Note:  for this screenshot, I’ve pasted them all into the same results window, but each would open separately in the real world):

 

  What happens with Explicit?

See?  Explicit is hard 🙁  Let’s just forget about Explicit for now.

  So that’s all very good knowledge to have Julie, but back to our original question:  how do we get this:

Cat | meow, eat

I mean, all of those results have ugly tags in them <uglytags>.

Remember the [text()] statement in the original SQL that worked? 

SELECT verb  AS [text()]

      FROM tempanimals

      WHERE animal = @animal FOR XML PATH

The [text()] is a directive utilizing the concept of “nodes” specific to XPath.  In fact, I’ll save you some trouble and tell you that if you try and use [text()] with any of the other modes, you will get an error message. It tells the engine “this node (the verb element)  will be plain text, no need for tags”.     So in this example it will eliminate the <verb></verb> tags.  That is an important part in getting the results like we want them.

So let’s try the basic statement with the new trick:

Hup hup—–wait — doesn’t the Path Mode also wrap each element (row) in a <row> tag?  YES it does and that is what this empty string at the end is doing:

SELECT verb  AS [text()]

      FROM tempanimals

      WHERE animal = @animal FOR XML PATH (”) 

The empty string (”) is overwriting the default vaue of <row></row> with an empty string.  It therefore eliminates the <row> tags for each row.  Using [text()] in conjunction with (”) we have eliminated all of the tags!

So running this we get:

Results:

 Eatmeow

Hmmm.  We have made a lot of progress.  We’ve got text concatenated and removed all the tags.  But now of course we need to format it.  this:  EatMeow  is not quite what we want the finished result to look like.

Audrey’s post  really covers the STUFF function well, so I’ll refer you to it if you need to brush up on your “Stuffing”

But in a nutshell, we’re going to add a comma and a space to every value  in the select statement and then take out the very first one with the Stuff Function.

From

Eatmeow

to

 eat, meow

By adding the lighter text as seen here:

SELECT STUFF ((SELECT  ‘, ‘ + verb  AS [text()]

      FROM tempanimals

      WHERE animal = @animal FOR XML PATH(”))  ,1,2,”)

So the answer to the above questions.

  • How does an XML expression concatenate the data?
  • Answer:  because it takes the results of a query and places it into one xml object. 
  • Why aren’t there any tags in my result, shouldn’t XML create tags??, What kind of cockadoody XML is this?!!
  • Answer:  We eliminated the tags by using the node test [text ()] to eliminate the subelement row tags <verb> and expicitly named the root element (‘’) to eliminate the <row> tags.
    • (and This is “cockadoody- For- XML- Path” XML btw)
  • How important is that “Path” thing?
  • Answer:  Path is a Mode of For XML.
  • Ditto on that “[text () ]” thing?
  • Answer [text()] is the node test of Path Mode that eliminates each beginning and ending row  tags <verb>
  • How important is that “Stuff” thing?
  • Answer: here the Stuff function is eliminating the extra comma and space at the very beginning of our concatenated data.
  • Is this even SQL anymore, or Pure Wizardry?

Answer:  It’s SQL

—but I may or may not be a Wizard.

 

——————————————————————

Links to more Wizardry References Regarding For XML:

 http://pratchev.blogspot.com/2008/03/row-concatenation-with-for-xml-path.html

http://msdn.microsoft.com/en-us/library/ms345137.aspx

http://www.projectdmx.com/tsql/rowconcatenate.aspx#BBXML

Published by

Julie Smith

One half of the Datachix.com. ETL person. This picture is the best of me ever taken. Don't be too sad when you meet me for real :)

7 thoughts on “FOR XML a la Mode –a study on Modes, especially using For XML Path Mode to Concatenate Data”

  1. Damn Julie, this XML keyword opens up a whole new world of possibitity, I over looked it last time and didn’t realize what it could actually do, very nice!

  2. Thanks Ben. Thanks for asking the question. I think XML is very awesome. Please spread the word and help me evangelize on the subject. Let me know what you come up with in your work!

  3. for more control and obfuscation…

    select
    (
    (select case when (ROW_NUMBER() over (order by c.verb)) = 1 then c.verb else ‘, ‘ +c.verb end as ‘text()’ from tempanimals c for xml path(”))
    ) as verbs

Leave a Reply