Personal Note: I wrote the bulk of this post last night. Before I read Chris Webb’s blog post about the future of SSAS and MDX. I almost didn’t post this after reading that. But you know what? Screw you guys, I’m learning MDX anyway. I hate the idea that Microsoft would potentially remove aspects of the BI stack because the learning curve is too high. If I can’t keep up, then put me out of a job. Don’t dumb-down the functionality. That being said, I think there’s a lot of value in understanding the language that accesses any data store. It forces you to think about the internal structure of what you’re working with, and therefore, I see value in learning MDX either way. (But really… I was so disheartened after reading about some of the coming changes. You MS peeps had better know what you’re doing!) On to the original post:
My favorite movie is My Fair Lady. I love Audrey Hepburn. I love the Pygmalion story. Quick aside: I used to tell people that my parents named me after her. They didn’t, and the true story is convoluted. My mom loved the name Audrey Dalton (my middle name is Dalton), which was the name of a movie star. My great-great grandmother was Dalton Harris, and she thought it would be cool to name me after her and the actress. Then she met my dad. His mom’s name was Audrey. (She went by her middle name, Geraldine, which I never understood… but I digress.) Anyway, when I was born, she told everyone that I was named for my paternal grandmother and my maternal great-great grandmother. When, secretly, I was just named after an actress with a name she liked. I’m glad she told me this. (Audrey Dalton was a total hottie.)
<– Audrey Dalton, HOTTIE (courtesy of Ballybane Enterprise Centre http://www.bbec.ie/blog/?p=708)
This week, I’ve decided to start digging into MDX. There are three reasons for this:
1) It’s PASS Summit week. While I’m not there, I’m trying to get into the spirit of things by learning something new.
2) I’m just not good at MDX. There is no excuse for this.
3) I’m gearing up for my MCITP exam in Business Intelligence 2008. I hear rumor that there are MDX questions.
So anyway, I feel a lot like Eliza Doolittle this week. If you’re not familiar with the story, she is the subject of a bet between Henry Higgins and Colonel Pickering. They bet that Prof. Higgins can’t pass her off as a Lady (with a capital “L”) in a year. She’s just a lowly flower girl, complete with cockney accent. In order to refine her, he has to teach her how to speak again. It’s her own language, but she has to learn how to use it in a totally unfamiliar way. Instead of saying, “In ‘artford, ‘ereford, and ‘ampshire, ‘urricanes ‘ardly h-ever ‘appen”, she has to learn to say, “In Hartford, Hereford, and Hampshire, hurricanes hardly ever happen”. (Swear to cheesus, I haven’t hit IMDB yet… I really love this movie) Same words, same meaning, totally different accent.
Rather than a flower girl trying to sound like a Lady, I’m a T-SQL girl trying to sound like an MDX Lady. Or something like that. You know what I mean.
<– T-SQL Flower Girl
To get started, I picked up Microsoft SQL Server 2008 MDX Step by Step (by Brian C. Smith, C. Ryan Clay, and Hitachi Consulting). I’m starting with the basics, so right now I’m in “SELECT * FROM” territory. Or, “SELECT FROM ” territory, since we’re talking MDX.
Transitioning from T-SQL to MDX is not easy. The syntax is just familiar enough to me to trip me up. I keep catching myself trying to equate a query against a cube to a query against a relational data store. It’s not the same, and it has been tough for me to wrap my head around it. But, “I washed my face and ‘ands before I come, I did”, so I think I’m ready to get started.
So far, I’ve learned about one important concept: Tuples. The point of this blog post is to force myself to regurgitate what I’ve learned, because to paraphrase something Jen McCown (Blog | Twitter) said the other day, you don’t really know something until you’ve taught it. True that. Please keep reading, but also read a book by an expert. I’ve been happy with the Step by Step book so far.
Wait… one more silly analogy. Writing T-SQL is a bit like cutting out paper dolls. It can be complex, but it’s just two dimensional space. Writing MDX is like chiseling a hole into a big rock at a specific point. It’s n-dimensional space. While a bit goofy, this visualization has really helped me draw a line between T-SQL and MDX.
Tuples (as Translated by Me)
A tuple is basically the identifying characteristics of a cell inside a cube. Really, a data point inside a cube. Say I have three dimensions, Actor, Movie, and Year. Say I have a Measure Group that includes Budget Amount. Say I wanted to find the cell, or data point, identifying the budget for the movie My Fair Lady starring Audrey Hepburn that came out in 1964. I’d look at the attribute-hierarchies Audrey Hepburn, My Fair Lady, and 1964. (Hey, I didn’t say it was a well designed cube!) Those identifying characteristics of the cell are the tuple, which would be formatted something like this in MDX;
,[Movie].[My Fair Lady]
Another way to look at it is in terms of math. I always swore that geometry and algebra were pointless in high school. Well, Mr. Smith, you were right. I’m about to talk axes. (axises? axii?) Each of my attribue-hierarchies and my measure group make up an axis within my cube. Don’t even try to visualize a 4-dimensional cube. I did, and it made my head hurt when I ran out of 3-dimensional space. Let’s label each axis:
[Actor].[Audrey Hepburn] = a1
[Movie].[My Fair Lady] = a2
[Year]. = a3
[Measures].[Budget Amount] = a4
Now, if I want to identify the point that is the intersection, my notation would look something like this: (a1, a2, a3, a4). I also imagine four lines (in 2-dimensional space, all intersecing one another at one point. That point is my tuple.
The MDX syntax for my query looks like this:
FROM [Pretend Movie Cube]
,[Movie].[My Fair Lady]
It would return one value: $17,000,000
Some Key Points:
1) Every attribute-hierarchy gets an axis, NOT every Dimension. So, if I had two attribute-hierarchies within my Actor dimension, Audrey Hepburn and Rex Harrison, they all have an axis. I could actually reference the same Dimension multiple times like so:
,[Actor].[Rex Harrison] <–Henry Higgins!
,[Movie].[My Fair Lady]
2) Measures each get an axis. They are treated differently at design time, but for the purposes of seeking out that one cell or set of cells, it’s treated just the same as an attribute hierarchy.
3) Analysis Services allows you to be lazy. You can define what’s called a Partial Tuple, leaving out some axis references. But… it’s going to try to figure out where on that missing axis you were headed. It’s going to go in this order:
1 – Default member (defined at design time)
2 – (All) member –remember that Measures don’t have an (All) member
3 – First member
Am I getting this? Have I missed the boat? Close, but no cigar? Any other cliche suggesting I don’t know what I’m talking about?