Debugging SSIS Variables Part Two–The Locals Window

From the book of Bloggering, Chapter 59 Verse 11: Thou shalt humbly accepteth better solutions to thine posits as they appeareth in thine Comments Section.   Yea verily, although thou may fearest change, thou shalt learneth new and better coding in this manner and be pleasing in the sight of thine employer.

I had a blog post several weeks ago about debugging the values of SSIS Variables. I shared the method I have always used, which was simply (actually maybe it’s not simple at all) adding a script task into the control flow and placing a message box in the control flow, displaying the value of the variable. Continue reading Debugging SSIS Variables Part Two–The Locals Window

Adventures in MDX – Sets

Oy, Audrey has violated the Book of Bloggering! She failed to post on her designated Tuesday, and fellow Datachix Julie had to step in with back-to-back posts. (Reason #81 why I heart her.) In my defense, dear readers, I’m neck deep in work at a new client. And while the Book of Bloggering dictates the alternating schedule, the Book of BI Consulting Chapter 17, Verse 12 says: “Thou shalt keep thy client happy at all costs. Regardless of disgruntled users, tight deadlines, or processes in need of improvement, the BI Consultant shall deliver, and deliver well.” As much as I love to write blog posts, my Wine and Kindle budgets demand a regular paycheck.

So, with no further ado, I present to you my long-overdue post. It’ll be the second in what will hopefully become an MDX series. Today, we’ll be taking a look at Sets. Building upon the last post, Adventures in MDX – Tuples, we’re still getting a handle on the structure and concepts around querying a cube. But, I pinky-promise you; we WILL eventually begin writing some pretty darn cool queries.

As I’ve mentioned before, I’m just not good at MDX. No excuses… I’m just not. Rather than jumping in and memorizing functions and complex structures, I’m trying to train my over-saturated brain to comprehend how the data is structured, and subsequently, how to get said data out of the cube and into a result set. The first step was to understand Tuples. A quick recap: A Tuple is a data point – the intersection of all of the dimensions at a certain place. Imagine you have a nice, big, freshly baked cake. Maybe a chocolate layer cake with chocolate ganache. I’m just spit-balling… choose any flavor you like. This is our cube proxy. Anyway, stick a toothpick into that cake. The spot where the point of the toothpick stops: Tuple. It’s a single point in our cake, the intersection of eggs, flour, chocolate, sugar, butter, etc.

Now, take a knife and cut into the cake. You’ve defined a set. It’s a collection of Tuples. Cool, huh? Now, stop cutting! I have to get through a basic set before we cut (SELECT) a whole slice of data out of that decadent, delicious cake. As I mentioned before, I’m using the Microsoft SQL Server 2008 MDX Step-by-Step book as my primary resource. Much credit to these guys for their excellent tome.

The best way to illustrate a SET is to build up a SELECT statement in MDX. So, that’s just what we’re going to do.

Defining a set gives you a lot of power over the way that your result is presented to you as well as what’s included in it. In most cases, you’re going to define what’s on COLUMNS and ROWS, the first two of 128 possible axes you can define. I’d love to talk to the sadistic you-know-what at Microsoft that thought it would be funny to try to make my brain fry by encouraging me to even attempt to visualize how a result set would be presented on 128 different axes. It’s okay, though, somebody probably failed to tell him that SSMS only allows you to return two axes in a result set. If you try to define a third, PAGE, for those of you keeping track at home, you’ll get an error message instead of results. Ha! Take that, Mr. Microsoft Over-Achiever!

Anyway, let’s start building us a SELECT statement in Management Studio. First, make it as basic as possible:

SELECT
FROM [Adventure Works];

You get this:

Um, okay, that’s nice. 80 million dollars. That tells me… nothing useful. But, we have a syntactically correct MDX query, so I’m not complaining.

TANGENT:

By the way, what does that ~80 million represent? Reseller Sales Amount. Why? Because it’s the default measure for the Adventure Works cube. How do we know? Open up BIDS. Open the Analysis Services Database, Adventure Works. Open the Adventure Works cube, and go to the Cube Structure tab. Right-click on the Adventure Works cube in the Measures section (top-left corner) and select Properties. There’s a defaultmember property. It says Reseller Sales Amount. There you go.

END TANGENT

But, we can do better. Let’s define a set that will give us column headers:

SELECT
{
	 ([Sales Territory].[Sales Territory Country].[Australia])
	,([Sales Territory].[Sales Territory Country].[Canada])
	,([Sales Territory].[Sales Territory Country].[Germany])
	,([Sales Territory].[Sales Territory Country].[United Kingdom])
	,([Sales Territory].[Sales Territory Country].[United States])
} ON COLUMNS
FROM [Adventure Works];

That thing up there in the SELECT clause? A SET! Note that it’s enclosed in curly brackets ({}). Yes, I know they’re called braces. I call them curly brackets. More descriptive. Also note that each thing between the commas is a Tuple. Therefore, Collection of Tuples! There is an important, nay, vital rule that is being followed here: When I explicitly name a dimension in my tuple, each of the tuples in the set references the same hierarchy. Now, I don’t have to define the SAME LEVEL of the hierarchy in all of my tuples. I can do something like this:

SELECT
{
	 ([Sales Territory].[Sales Territory Country].[Australia])
	,([Sales Territory].[Sales Territory Country].[Canada])
	,([Sales Territory].[Sales Territory Country].[Germany])
	,([Sales Territory].[Sales Territory Country].[United Kingdom])
	,([Sales Territory].[Sales Territory Country].[United States])
	,([Sales Territory].[Sales Territory Country])
} ON COLUMNS
FROM [Adventure Works];

Cool. But, I can’t reference two different hierarchies from the Sales Territory dimension in one set. Check this out:

SELECT
{
	 ([Sales Territory].[Sales Territory Country].[Australia])
	,([Sales Territory].[Sales Territory Country].[Canada])
	,([Sales Territory].[Sales Territory Country].[Germany])
	,([Sales Territory].[Sales Territory Country].[United Kingdom])
	,([Sales Territory].[Sales Territory Country].[United States])
	,([Sales Territory].[Sales Territory Region].[Northeast])
} ON COLUMNS
FROM [Adventure Works];

Ooh, error. Back to the cake analogy… This would be sort of like starting to cut into the cake, and then picking up the knife and stabbing it into another part of the cake. You wouldn’t expect a clean slice, and the same goes for the query. It just doesn’t know how to pull this data back. By the same token, I can’t reference two different hierarchies either. Really, why would you do this to your lovely chocolate ganache, anyway?

Okay, there’s more we can do with these column headers that are being returned. We can define a more detailed tuple. Maybe I want to see why people bought products in Australia. Watch this:

SELECT
{
	 ([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Quality])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Price])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Magazine Advertisement])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Review])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Manufacturer])
} ON COLUMNS
FROM [Adventure Works];

Sweet. Remember the rules from the Tuple episode? When a tuple is defined, every single dimension is actually represented in the query, even if you don’t explicitly name it. It defines the tuple members used according to the Other Three Rules*: Default Member, then (All) Members, then First Member. Before, the Sales Reason dimension was accounted for, but it was using the (All) Members rule because a Default Member isn’t defined. This time around, we’re telling the query exactly which members from the Sales Reason dimension to return, as well as which order to return them in. I could go on. I could define this tuple out to my heart’s content. BUT, there is one big rule to follow: The Set requires that the dimensions are given in the same order in every tuple. The following query will return an error:

SELECT
{
	 ([Sales Reason].[Sales Reason].[Quality], [Sales Territory].[Sales Territory Country].[Australia])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Price])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Magazine Advertisement])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Review])
	,([Sales Territory].[Sales Territory Country].[Australia], [Sales Reason].[Sales Reason].[Manufacturer])
} ON COLUMNS
FROM [Adventure Works];

Again, this query is like stabbing your knife into the cake and expecting to come out with a beautiful slice. MDX likes clean cuts. So, it wants consistently defined tuples. Humor it.

Okay. Remember how I told you to quit after making the first cut into your cake? Go ahead, make the second cut. I’ll wait…… Oh good, you’re back. Hey, you have a little icing on your chin. Right there. No, there. To the left. There you go, got it. So, you cut twice (asked for two Sets) and ended up with a nice piece of cake (Data) didn’t you? Awesome. Let’s continue to wring the life out of this analogy and look at the MDX.

SELECT
{
	 ([Sales Territory].[Sales Territory Country].[Australia])
	,([Sales Territory].[Sales Territory Country].[Canada])
	,([Sales Territory].[Sales Territory Country].[Germany])
	,([Sales Territory].[Sales Territory Country].[United Kingdom])
	,([Sales Territory].[Sales Territory Country].[United States])
} ON COLUMNS
,
{
	 ([Date].[Calendar Year].[CY 2005])
	,([Date].[Calendar Year].[CY 2006])
	,([Date].[Calendar Year].[CY 2007])
} ON ROWS
FROM [Adventure Works];

Okay, so what’s this doing? Well, it’s saying, “Hey, MDX, I want you to go out and find the Reseller Sales Amount. Then, I want you to break it down for me. I want column headers that show the Countries I’ve specified. Then, I want row headers that show the years 2005 – 2007. Finally, I want the portion of the overall Reseller Sales Amount in a cell at the intersection of the Country and the Year.”

I said that we weren’t going to get into functions yet, but I do have one little thing I want to close with. The Members function. This is sort of like the “SELECT *” of MDX. You can tag a “.Members” onto the end of a [Dimension].[Hierarchy].[Level] reference (or even a [Dimension].[Hierarchy] reference) inside a tuple. I’m going to re-write the COLUMN set to return pretty much the same data, but with less carpal-tunnel syndrome.

SELECT
{
	 ([Sales Territory].[Sales Territory Country].Members)
} ON COLUMNS
,
{
	 ([Date].[Calendar Year].[CY 2005])
	,([Date].[Calendar Year].[CY 2006])
	,([Date].[Calendar Year].[CY 2007])
} ON ROWS
FROM [Adventure Works];

Check that out. It even gives us members we didn’t know to ask for, including an (All) Members summary. This function is great for a couple of reasons: 1) You don’t have to type so much. 2) If you don’t know all of the hierarchy members, you don’t have to go look them up.  And, if the members change down the road, you’re not slogging through MDX queries manually updating them. 

Alright, so maybe we’re not to awesome-ness yet, but you have to admit, not too shabby. There are about a bazillion other things you can do with these sets, and we’ll get to them. But for now, let’s take a break and enjoy the lovely piece of cake… er, data we’ve created.

Query on, my friends.

*Other Three Rules because the Three Rules are strictly reserved for references to Isaac Asimov’s I, Robot and the Foundation series. If you’ve only seen he movie, for the love of all that is good in this world, go read the books. While you’re at it, go read Starship Troopers by Robert Heinlein. That book got the short end of the movie stick too. Seriously, Denise Richards? Denise Richards?!? They should have made her shave her head to stay true to the story.

T-SQL Tuesday #13 – Make Nice with the Business

This month’s T-SQL Tuesday asks the following question:  What issues have you had in interacting with the business to get your job done?   First, much thanks to Steve Jones (Blog | Twitter) for hosting this month’s event. 

As the old joke goes, this job would be great if it weren’t for the users.  (insert rimshot here) Seriously folks, my job title has “Business” in it, so I’d better be able to figure out what the user is asking for.  Being a BI Consultant is one part developer, one part psychologist, one part archaeologist, and two parts translator.  If the user asks for a report, that’s great.  Actually, that’s more than I often get.  Sometimes I get vague, conflicting requests.  Sometimes I get requests that just confuse me. 

Take my Top Five Favorite requests from end users: 

5) “There’s this report that John used to produce in 1997.  It was great.  It took him 12 days to put it together, but it had everything I needed.  I want that, only sooner.”

It’s a shame that John checked himself into a mental facility in 1999 and is now spending his days creating lovely landscape paintings with non-toxic watercolor paint and taking meds on a rigorous schedule.  It’s also a shame that no one can remember what the report looked like, only that it was really good. 

4) “I want to slice and dice the data however I want.”

Not bad.  At least they’re referring to being able to filter the data in some way.  The scary part of this request:  “However I want”.  Do you want to slice the data by what color shirt you were wearing on that day?  Let’s narrow this down a bit. 

3) “I’m not sure what I want, but I’ll know it when I see it.” 

–Sobbing–  You’ll know it when you see it?  Okay.  Okay… let’s extend that deadline. 

2) “The data should be sexy.” 

Sexy?  Let’s define sexy.  I think beautifully-structured, properly normalized, and well-performing data is sexy.  Is this what you meant?  No?  Wow, I thought we were on the same page here. 

And my all time favorite user request….

1) “I want it to be like an iPhone.  You know, an Apple feel to it.”

What, you want the data to wear a black turtleneck?  You want to be able to swipe and pinch the data? 

Users, I love you.  You keep my mortgage paid and my kids in shoes.  Truly, if it weren’t for you, I’d be out of a job.  But, sometimes, you make for good happy hour stories. 

Just this past week, I ran into a situation that BI Consultant nightmares are made of.  Let me set the stage: 

I’ve been at a new client for about 3 weeks.  Let’s just say I’m not exactly the resident expert yet.  It’s a very large client, with a very challenging data environment.  It’s the beginning of the month, which means that end-of-month and some quarterly reports are due.  Most of these processes have not been automated yet.  Read:  We’re copying query results into Excel and e-mailing them.  The one guy who is the resident expert is on vacation.  I get an e-mail asking for a report that I’ve never seen before. 

I take a deep breath, gather myself, and respond, “Yes, I’ll get that to you.” 

I make a quick phone call to the guy who’s on vacation, get a bit of info, and create the queries to run the report.  I slap that data into Excel, e-mail it out, pay myself on the back, and go home.  Everything looks great from my end. 

Next morning, 7:24 AM, an e-mail is delivered to my inbox.  To paraphrase, it said, “I don’t trust these numbers.  There’s a huge variance in the 3Q numbers that we can’t explain.  I have a meeting at 9:30 about these results, and I’d like to definitively say whether they’re correct.” 

Crap.  I’m not even through my first cup of coffee yet.  I top off my coffee, and begin my investigation.  This is the archaeologist part of my job.  On the surface is a report that isn’t making sense to the business.  My job is to dig backwards until I either come up with an explanation or prove that the data is correct.  No easy task, considering that I honestly don’t know where much of this data is sourced from. 

Step 1:  Verify Your Own Work – First, I opened up the query I ran to produce the report.  Key point here.  I saved it.  I save everything.  My first move was to verify syntax.  Did I do something stupid like join a table to itself or create a funky WHERE condition?  Did I accidentally paste something into Excel improperly?  (Tangent:  This is why automation is a Good Thing.  Eliminates human error.)  Nope, All quiet on the Western Front. 

Step 2:  Verify the Data Load – This data was sourced from a report database that is populated via an SSIS package.  Luckily, the guy who wrote the package sits a few rows over from me.  I check in with him, and he confirms that nothing has changed since the last load.  I ask for the source files anyway so that I have some outlets for additional research. 

Meanwhile, my Key2 Consulting compadre, Josh Robinson (Blog), is doing something really cool to help me out… He pulls the data into Excel and fires up PowerPivot.  Using the graphing functionality he’s got with the tool, he can point out anomalies in the data by different dimensions to try to narrow down exactly where we’re seeing the suspect data.  I was writing manual PIVOT statements in T-SQL, which was much less efficient than what he was doing.  Lesson Learned:  PowerPivot ain’t just for end-users.  It’s a great diagnostic tool. 

Step 3:  Verify the Source Files – I take a look at the source files.  Ha!  There!  The source data has the same disparity that the business users are complaining about.  This is good news.  This is a lead.  Now, I just have to find out who created these files. 

Step 4:  Find the Source File Owner – I make some calls, do some checking, and voila!  I have a name and a phone number.  It’s a very large company, so he’s halfway across the country, but I’ll still be able to get in touch with him. 

Step 5:  Contact Source File Owner – I call the guy who creates the source file.  He doesn’t know me from Eve.  After the requisite introductions, I ask about the change in the data.  He responds, “Oh yeah, we changed the way we’re pulling this piece of data.  You should see a huge increase in the number of gizmos from this month to that month.”  I thank him profusely, and move on. 

Step 6:  Wrap it all up – I make a courtesy call to the woman who is probably drumming her fingers on the table waiting for the data.  Then, I write up an e-mail with our findings, and I send it out to anyone who might care.

Wait, you thought we were done?  No, we’re not done.  Let’s back up a bit.  Yes, we explained the questionable data.  But a good archaeologist knows to dig just a little bit more.  I ask the business users, “Hey, so that source data changed, and the change was applied to this month but not that month.  Are we okay with that?”  This lead to another round of conversations.  Ultimately, we decided to keep the data as-is and note the reason for the change.  The point is, if you want to try to make friends with your business users, you answer the questions they have, and then try to think of the ones they haven’t asked yet.  They’ll love you for it.

FOR XML a la Mode –a study on Modes, especially using For XML Path Mode to Concatenate Data

Hello folks, Julie here with the as promised weekly Datachix blog—

Hup Hup—did you say just say  “Julie” here?

Yes I did.

You specifically said last week that you and Audrey would blog weekly and ALTERNATELY.  You posted last week. Isn’t it Audrey’s turn?

Yes, but Audrey is fiercely battling Reporting Services at a new client site–she can’t post this week.   Also, I think there were Ninjas…. and maybe some Bear-Sharks……… she’s really really….. busy.

Fine.

 Hello folks, Julie here with the as promised weekly Datachix blog.  😉   Recently a buddy of mine left a comment for us which went like this:

 Can you post an example of putting data from multiple rows into a single column?

 For example you have this:

Cat | meow

Cat | eat

Dog | bark

Dog | fetch

 

And you want to display it as this:

Cat | meow, eat

Dog | bark, fetch

It’s hard to find a good example of this online.

Thanks, Ben

  Continue reading FOR XML a la Mode –a study on Modes, especially using For XML Path Mode to Concatenate Data