On Blogs – Think like a Journalist

Today, I was reading a blog post.  The article was published by a group that I would consider reliable and reputable.  It was on a topic that I have passing familiarity with, and would like to be better at.  I’m not going to name the post nor the topic, because that’s not the point.  Here’s what went down…

1130 Hours: Read post.  Surprised at the absolutes declared in the article.

1135 Hours: Read it again.  Think that either I’ve missed some really basic lessons on this topic, or that maybe the article has provided some less than ideal guidance.

1209 Hours: E-mail a friend who I know is an expert on the subject.  Ask him to read it and let me know if he thinks it is right.

1300 Hours: Get response from generous and patient friend.  His e-mail (which was longer than the blog post) explains in very clear terms that there are problems with not only the guidance that the article provides, but also how some of the fundamental concepts were represented.

1420 Hours: Go find another post on the same topic by another trusted expert.  Read it and confirm for the second time that my original suspicions about the article were correct.

1430 Hours: Pat myself on the back for knowing just enough about the topic to realize that it seemed off in the first place.

1445 Hours: Go back to blog post to write comment that maybe the article could use a second look.  See that someone has already done that.  Decide not to pile on.

Now, I’ve been watching a lot of The Newsroom and House of Cards lately, both of which have characters who are journalists.  I think that reading blog posts, articles, and books is a lot like being a journalist.  One source is not enough.  If you’re hearing something new or something that contradicts what you think you know, don’t take the article at face value.  Go find a second source, and make sure that the second source didn’t use your first source as their source.  (Caveat:  I do have a list of absolutely trusted writers.  But it is my list, and is based on a lot of factors.  Okay, fine.  I’ll share one.  His name rhymes with Tall Candle.)  Also, if you’re finding conflicting advice, don’t be afraid to ask questions.  Any writer worth her salt is willing to accept some peer review.

And yes, if you’re wondering, I felt very MacKenzie McHale for all of 3 seconds. 

??????????????????????????? (photo from: http://www.fanpop.com/clubs/the-newsroom-2012/images/33579445/title/mackenzie-mchale-photo)

I imagined myself, headset on, shouting, “We’re not going live with this until we confirm a second source!”  Then I remembered that no one but me was consuming this information.  Oh well.

If you’re on the other side of the keyboard and are writing an article you plan to send out into the world, here are a few guidelines to live by:

1) Unless you are 110% sure that your guidance applies 100% of the time, don’t speak in absolutes.  “It depends” is a running joke for a reason.

2) Find someone you trust to tech edit for you.  Heck, find two people.

3) Remember that there are a lot of young database professionals out there who are reading your work in order to figure out how to do their jobs.  Don’t take that lightly.

4) If someone comments on your article and says you’re wrong, engage with them.  They’ve taken the time to read and comment on your work.  Granted, there are trolls out there, but a thoughtful comment demands a thoughtful response.  And keep an open mind.

5) If you’re preaching something that goes against conventional thought, take the time to post links to opposing views.  Help your reader make an informed choice about which advice to follow.

6) Encourage your reader to do their own research with the information you’ve provided.

7) If you discover that you’ve presented bad information, correct the article.  Own it.

All that being said, the proliferation of online resources has made us all better.  Don’t be afraid to put your research and opinions out there.  Just research, verify, and test.  And look for that second source before going live with the scoop. 

Blog on, my friends…

–Audrey

Discovery… Not Just for Explorers

In the scramble to get hands on keyboards in a development shop, one of the most important phases of development often gets short shrift.  That phase is discovery.  I was lucky enough to spend a few weeks with one of our clients working through a formal discovery phase, and was reminded how valuable this time and effort can be.  It also got me thinking… How can any developer, regardless of project role, gather the information that normally comes out of a discovery effort?

Discovery Who? 

So what is Discovery, exactly?  Discovery is the part of development where you answer a few key and seemingly basic questions about the work you think you want to do.  These questions include:  Why? What? Who? When?  There are a few key outputs of a discovery cycle, including:

  • Problem statement – a short, concise explanation of what’s in need of improvement/enhancement/rework
  • Solution statement – a short, concise statement about what’s going to be done about the problem
  • Scope – a list of things that will be done, as well as a list of things that won’t be done
  • Assumptions – any assumptions that have been made about environment, people, technology, complexity, etc. in regards to the project
  • Risks – a list of things that can derail the project, which might include company direction, technology changes, dependent projects, or that “simple” external system that you need to integrate but haven’t had a chance to see yet
  • High-level time estimates – a high level SWAG (Stupid Wild-Ass Guess) at how long it should take to complete the work, or a time window based on other external factors
  • High-level resource needs – a high level list of people (or skill sets) and technology resources (hardware, software) that are needed to complete the project

Notably absent is one thing:  How?  That’s because the how is design, not discovery.  Resist the overwhelming urge to start designing the solution before you fully define the problem.  This is the part that I struggle with every time I’m asked to help out with the discovery phase.

A Developer’s Life

Discovery information will come at you in one of two ways:  either you’ll be asked to answer the above questions and help define the project, or someone might come to you with all or part of this information and expect you to be involved in the subsequent phases of the project.  In both cases, it is absolutely vital that you (Yes, you… the dashing figure reading this post) feel comfortable with the outputs of discovery before you start designing or developing anything.  This is known (in my head) as Sneaky Discovery.   (Credit to Julie for her invention, the Sneaky Proof of Concept… I’ve adapted the concept for Discovery) It ain’t as evil as it sounds, I promise.  Let’s talk about this one for a bit.

I know… it can be uncomfortable to question a project’s intent or setup, especially when you haven’t been asked to.  But bear with me for a moment.  I’m not asking you to put on your fedora w/optional press pass tucked into the band and annoy the ever loving crap out of the people around you while sporting your best Humphrey Bogart voice.  I’m just asking you to take 30-60 minutes and make sure that you feel good about the work you’re about to do.  Let’s look at a dramatic reenactment of a common situation:

 

Scene: John, a diligent data professional is at his desk, working on the latest product release.  Enter Manager (Mike).

Manager: Hey! Bob!  How are you?

John: It’s, um, it’s John.  Not Bob.  I’m good.  What can I do for you?

Mike:  Good, Bob, good.  Glad to hear it.  I just walked out of a Very Important Meeting with Very Important People, and I have a new set of tasks for you.  We’re going to speed up Database X.  Also, I have these 17 new reports that need to be developed… um, somewhere.  Let me check my notes… Ah, there we are!  I’ll e-mail this right over to you so you can get started.  How long do you think this will take?  Can you have it by next Friday?

John takes a deep breath, and time stops.  Enter Mysterious Announcer (MA).

MA:  Right now, John is faced with a career conundrum.  His fate is in his hands, even though he doesn’t realize it right now.  He can either nod and say “Yessir, right on that, sir” and make his manager feel temporarily happy about his decision to hire John, or he can ask a few questions that might make his manager feel a bit uncomfortable in the short term, but will save the company thousands of dollars in the long term.  Let’s see how John handles it.

Time restarts. John turns to his Manager.

John:  Gee, Mike.  That’s a lot to absorb.  I have a few questions to ask before I commit to a time estimate.  Mind if we sit down and talk through the project together?

Mike: (uncertain) Sure?  I think we can do that.

Angels sing, bells ring, beautiful and intelligent women begin to notice John’s unconventional good looks and charming, yet adorably shy, personality

 

Friends, this is your moment to shine.  When faced with a situation like this, it’s up to you to do your own version of Sneaky Discovery and make sure that you’re not diving into the shallow end head-first.

When John gets a chance to talk to his manager, here are a few basic questions he should be asking:

1)      Why are we doing this?  What is the problem we’re trying to solve here?  (Problem Statement)

2)      What’s the solution that we’re going with?  Is it the only one available?  Why did we pick this one? (Solution Statement)

3)      What are the things that we’re going to do, and how does each of them help solve the problem we’re fixing?  What are we NOT doing? (Scope)

4)      Any assumptions about technology direction, complexity, or anything else that led us to the next Friday deadline? (Assumptions)

5)      Anything I should know that can cause us to get off track? (Risks)

6)      How long did you assume this project would take?  Have you made any commitments about delivery?  (High-level time)

7)      Who else is working on this? Do we have a dedicated development environment? (High-level resources)

8)      BONUS QUESTIONS #1: Who is impacted by this change?  Is it okay if I check in with them?

9)      BONUS QUESTIONS #2: How are we testing this?  How are we deploying it?

This conversation will probably take less than an hour for a narrow scope of work.  It does two things:  1) you step into the project with more information than you had initially and, 2) you show your manager that you want to understand the work you’re doing.  Good managers appreciate this.  If you’ve a manager who doesn’t appreciate your questions and desire to understand the work you’re doing, you’ve got a lot to think about, my friend.

Now, the worst case scenario is that John’s manager can’t answer these questions.  What to do then?  In a perfect world, you pull the comically large brake and hit the giant red STOP! button.  Now, it is up to you to decide how to handle the situation.  I’ll tell you how I normally handle stuff like this, because it happens… often.

 

Manager: Um, I don’t really know why we’re doing this.  The CTO asked for it, and I’m here to get it done.  That’s why I’ve been the vice president of middle management for 15 years, damn it!

Audrey:  Okay.  I can do this work.  However, let me lay out some of the risks for you.  We don’t understand why we’re doing it, and we might make poor design decisions because we don’t see the big picture.  Also, without a clear understanding of what problem we’re solving, I don’t know how to tell you that we’re done.

Manager: Okay, works for me.

Audrey:  Can you put that in an e-mail for me?

 

If I feel likely to get burned, I put my concerns/feedback in e-mail.  If I trust my manager to hear me, I tell him in person.  The point is that I never dive into design or development until I’ve either gained broad understanding of the project or at least told someone that I’m working under less than ideal circumstances.

As long as we’re all sharing, let me tell you… we’ve all been there.  We’ve all taken the task, done the work, and delivered something without really knowing why.  It happens.  The trick is to know when you’re dealing with a tough situation and how to mitigate risk to you as well as your team.

You’ll likely find that it is up to you to help flesh out those Assumptions and Risks, and to make suggestions about Scope.  As the in the trenches guy, you’re probably aware of issues that your manager isn’t.  Don’t be shy about sharing that information.  Trust me, ‘tis better to air the dirty laundry at the beginning of the project than to grumble at the end because you “just knew” that this issue was going to come up.

An Aside on Scope

We’ve all heard of it, and if you’ve worked for any amount of time in this business, you’ve lived it.  The dreaded scope creep.  Many times, scope creep happens because discovery was left wanting.  Or, the team isn’t disciplined about sticking to the original scope.  The pain of scope creep is never totally avoided, but it can be minimized.  Foremost, make sure that you have some scope to work with.  Remember how Manager Mike said that he wanted to “speed up Database X”?  Wow, that’s a broad statement.  He also wants to do it pretty quickly.  This means that you probably can’t re-architect the entire system or purchase shiny new hardware to solve this problem.  You might end up with something like this in your scope statement:

Task:  Speed up Database X

In Scope:
1. Tune top 10 worst performing queries
2. Examine index maintenance plan and add key missing indexes
3. Archive old data that is no longer used and remove from primary database
 
Out of Scope:
1. Change logical design of the database
2. Upgrade hardware
3. Rework data access layer

On a highly regulated team, this may end up in a formal document.  If you’re like many developers and things run a little more fast and loose in your shop, there is nothing wrong with shooting off an e-mail or having a conversation about what can be done in the time you have.  You could say, “Mike, we have a week and a half to do this work.  Let’s focus on low-hanging fruit and non-invasive changes.  When we have more time, we can talk about more aggressive changes”.  (Managers love idioms like “low-hanging fruit”)  He might say, “Great!”, or he might come back with, “So if we wanted to address your out of scope items, how long would that take?”  Both are acceptable responses, and you should be ready for them.  Point is, put some boundaries around the work you’re doing, and set expectations about how much can really be done with the time and budget provided.  This allows you to be the guy who never says, “No”, but also protects you from unreasonable requirements.

Final Thoughts

One last thought on discovery within the development cycle.  As developers, we want to know everything up front.  We want details, and we want them now.  Here’s the thing… it ain’t gonna happen.  Take a deep breath, accept the unknowable, and focus your Sneaky Discovery efforts on getting some clarity on why you’re doing what you’re doing, what you’re doing (and not doing), and what might throw the project off.  The rest will shake out during design and development (and possibly testing), and that’s okay.  We’re never going to get it perfectly right, but asking a few of the right questions at the right time can get everyone comfortable with how the rest of the project is going to go.

Now, get out there, get proactive, discover some stuff, and get vocal about how your work fits into the larger picture!  I’m rooting for you!

–Audrey

Great way to begin 2013! (Hint: it involves a TLA)

Quick, happy announcement… My fellow Datachix, Julie Smith, has received the SQL Server MVP award for 2013!  Also, I’ve been renewed as a SQL Server MVP for 2013.  What does this mean?  Yeah… 100% of the people who blog on this site (all two of us) are MVPs!

mvp

So, Julie, while I am not the first to congratulate you, let me be the first to do it on our blog. 🙂

On a personal note, I am grateful and humbled to be a part of the MVP community for another year.  Having experienced 12 months of the program, I’m honored that Microsoft saw fit to include me for another year.

As a bonus, here’s a picture of Julie being awesome:

Wizard

Rock on, my friend!

–Audrey

p.s. Note to self:  It’s been a while since you blogged about anything, Audrey.  Get your act together and post more often!  (Nodding…)

#Meme15 – Why do you blog?

I have a confession to make… Blogging is hard. Staring at a blank page, cursor blinking in that impatient, foot-tapping way, crappy blog ideas spewing from my desperate mind like oil at Spindletop circa 1901.

Pictured: Audrey having crap ideas

“I know… I’ll blog about my cat. I’ll, um, equate my cat’s love of shiny things to my love of foreign key constraints. You know, because foreign keys are shiny. Wait? What? I’m an idiot.” So, when Jason Strate (B | T) proposed the idea of #meme15, I was all for it. Tell me what to blog about? Save me from myself? Sign me up!

This week’s questions are:

1) Why did you start blogging?

2) Why do you currently blog?

Question 1: Why did you start blogging?

Let’s go way back to January 2010. Julie Smith (B | T) and I were talking and we said, “Hey! Let’s do a blog! It’ll be fun! We can write funny stories, be irreverent, and amuse ourselves with our oodles of witty”. I’m pretty sure there was wine involved. We didn’t think much about networking or career development or even education. We just thought it would be fun to do. Some technical blogs can be, shall we say… dry. We name no names. We both like the idea of making data fun. “Hey”, we thought, “we crack ourselves up regularly. Maybe we can crack someone else up”. That’s it. Almost 2 years later, we’re still blogging. We love it. And our reasons for it have evolved. Which brings us to…

Question 2: Why do you currently blog?

Easy. I can name all my reasons in three words: Me. You. Us. Oh, you want details? Gosh, you’re demanding. Okay, twist my arm. I’ll elaborate.

1) I blog for myself.

This is an important principle to me. I blog because I enjoy it. My first rule of writing is: Amuse Yourself. If something amuses me, I like to share the fun. I think that as humans, we’re all storytellers. We want to know that there are people willing to listen to our stories, and what is a blog post but a story? Sure, it might not be Shakespeare, but it’s still my story. It could be called narcissism to say that I want people to hear what I have to say, and that’s okay. Anyone who tells you that they’re 100% altruistic is probably trying to recruit you into their cult. Don’t believe them. They’re going to make you wear burlap robes and ugly running shoes and refer to their leader as Supreme Ultimate Bob. And burlap is so 1990’s.

Trust me. I'm only thinking of you.

Point is, on the day that I don’t get a rush from hitting “Publish” on a post is the day I stop blogging.

p.s. That’s my Gavin, showing off his new winking skills. Watch out ladies!

2) I blog for you.

It’s sappy. I know. But it’s true. Absolutely NOTHING in my professional life makes me happier than having someone tell me that a blog post I wrote (or a presentation I did) helped them do their job better. The idea that something I put out there made somebody else look smart… that’s so cool.

Dramatic reenactment: Me hugging our readers

Photo courtesy of Stuart Miles http://www.freedigitalphotos.net/images/view_photog.php?photogid=2664

This is part of the evolution of blogging. When we started, I honestly didn’t think anyone was paying attention. Then, I had a few people come up to me and say that they’d read my post on Random Topic X, and that they had a question or that they’d applied my solution. It was a little scary to realize that people occasionally pay attention to my ramblings. I realized that I had a responsibility to be as correct as I knew how, and to continue to share what I learn. And it is so SATISFYING to share information. Try it. You’ll like it.

3) I blog for us.

By us, I mean the SQL Community. A community like ours requires care and feeding. I’ve benefitted in a thousand ways from this loose network of passionate people, and if I’m going to take from the community, I feel an obligation to give something back. If everyone read blog posts but no one ever wrote them, then we’d all be reading the same Microsoft Support articles about Installing SQL Server 6.5 over and over again. Boring. Blogging allows me to contribute to this awesome community while letting my Geek Flag fly.

What? I'm a geek, not an artist.

Plus, I get to have cool conversations with interesting people. Like I said, “Sign me up”.

Blog on, my friends…

–Audrey

SQL University – Entity-Relationship Modeling (Part I)

Hey there! Welcome back! Or just Welcome, whichever applies. This week at SQL University (organized by SQLChicken himself, Jorge Segarra) we’re discussing data modeling. On Monday, we talked Conceptual Modeling. Today, we’re going to talk about Entity-Relationship Modeling. Now, fair warning, this is a BIG topic. People spend their entire careers working with ER Diagrams, and there is a plethora of material out there for you to peruse. Also, today I’m going to focus on OLTP, or transactional, data models. There is an entire discipline focused on OLAP, or dimensional data models (data warehousing), but I firmly believe that you’ve got to understand how to build an OLTP model before you tackle an OLAP model. I’m going to hit the high points, hopefully give you a few things to think about, and and direct you to some materials (and people) that can help round out the topic. Obviously, this isn’t a comprehensive list, but it’s a really good starting point.

Resources

Community Members:

  • Karen Lopez (B|T)
  • Louis Davidson (B|T)
  • Thomas LeBlanc (B|T)

A note: Not only do these three people write about data modeling, they also present regularly. Keep an eye on their blogs for great content and information about where you can go watch them teach in person.

User Group:

PASS Data Architecture Virtual Chapter

Books:

Case*Method: Entity Relationship Modelling by Richard Barker

This book is the one I received as course material in my very first database design class in 1997. I still use it. In fact, it’s sitting beside me right now so I can reference it as I work on this post. It’s out of print, but you can still find new and used versions for sale online. First published in 1990, it’s my personal definitive guide to purist modeling. It says Oracle on the cover, and you’re going to see COBOL code inside, but trust me… you’ll learn from this book.

Pro SQL Server 2008 Relational Database Design and Implementation by Louis Davidson, et al

Louis does an amazing job of teaching relational data modeling as well as couching his work in the context of SQL Server. If you’re a SQL Server professional (or want to be), pick up this book.

Data Model Patterns: Conventions of Thought by David C. Hay

Once upon a time, I used to think that all work I did had to be original. Now, I’m older and wiser. Mr. Hay provides an entire book of data model patterns for different industries and types of data. It’s a great reference to see how to do it right, and to get some ideas for your own database.

INTRODUCTION

So, what is an Entity-Relationship Diagram (ERD)? It is: A visual representation of entities, the relationships between entities, and the attributes that describe them.

At the end of the day, an ERD has three major components:

  • Entity – a thing/object that is significant enough that we need to know information about it
  • Relationship – an association between two entities
  • Attribute – something that describes the entity or the state of the entity

At this point, it would be easy to say, “Oh, I get it… tables, foreign keys, and columns!” Resist that urge. There is a lot of value in getting your mind away from technology and database platform for a while and thinking in terms of an ERD. Why? Well, your focus isn’t really on defining a database yet. I mean, we all know how this story ends (spoiler alert: with a database), but let’s let this plot thicken for a bit before we read the back page. What you’re doing at this point is defining the business. If you do it right, someone should be able to look at your ERD of Widgets, Inc. and with a little understanding of notation be able to say, “Ah, I see what they do for a living.”

Also, Entity-Relationship Diagram <> Database. Yes, what you model with an ERD will probably end up in a database, but I think that unless your database is pretty narrow in scope, you’ve probably got multiple ERD’s in a single database. Or conversely, you might have many databases that reflect a single ERD. It’s why Entity-Relationship Diagrams are referred to as “logical models”. You’re not defining the physical structure of the database; you’re defining the logical structure. The physical structure has a whole other set of factors to consider.

ENTITIES

As I mentioned above, an entity is a thing. Looking at Thesaurus.com, there are a few other synonyms for entity that apply in this context: Individual, Item, Material, Something, Stuff

And an entity doesn’t have to be a physical thing. It can be a concept or idea. It’s a NOUN. Remember that if nothing else sinks in today: entities are nouns.

As with all things, there are some guidelines you should follow. Building on the conceptual model we started with the last post, let’s start looking at an Entity-Relationship Diagram that models Country. (The nations, not the music.)

1) An entity is one thing, and that one thing should be defined by only one entity.

The entity COUNTRY is one thing, a single concept. If I follow this rule, I shouldn’t have a second entity named NATION, because that overlaps with COUNTRY.

2) An entity should have a name that represents the type of thing, not an instance of the thing.

This can be a tough one. Basically, your entity name should be high-level enough that it doesn’t limit how you can describe it later through relationships and attributes. For example, my entity is named COUNTRY, not UNITED STATES. COUNTRY is the type of thing, UNITED STATES is an instance of that thing.

3) An entity must be uniquely identifiable.

If my entity isn’t uniquely identifiable, then it isn’t a single thing. It’s multiple things, and in the world of data modeling, trying to cram two things into one thing is a bad thing. As you’re making your list of entities, ask yourself: Is there something about each occurrence of this entity that differentiates it from every other occurrence? That something is your natural key, or business key.

Aside: Next time you see my fellow Datachix Julie Smith, ask her about business keys. Then go get some popcorn and enjoy the rant. Take your time; she’ll still be going strong when you get back. 🙂

4) Entities are singular.

Why? Because they represent one thing! For some reason, there’s a lot of debate about this. Just because your entity has multiple instances of that thing – COUNTRY has United States, Canada, Mexico, etc. – doesn’t mean that you’re talking about Countries. You’re defining an entity that contains instances of COUNTRY. Thus, singular.

Notation

Entities should be represented by a rounded rectangle with the entity name in ALL CAPS inside of it. If it helps, make note of a few instances of the entity. If you know the natural key already, you can note that too.

So, let’s list some entities and occurrences related to this Country ERD:

The next step is to define the relationships between these entities.

VERY IMPORTANT CAVEAT: I’m using this subject area for illustrative purposes. My goal is to keep things as simple as possible, so I know that there are a lot of other things I might do with a Country ERD, and that there are some business rules I’ve oversimplified.

RELATIONSHIPS

What is a relationship? It is: A named, meaningful relationship between two entities.

Each relationship has two sides, and there are characteristics of those sides that need to be defined. If you can’t define the relationship, then you might not have one. It’s okay – there are always other fish in the sea.

1) Name – this is commonly known as a “verb phrase”. Where an entity is a noun, your relationships are verbs. Try to come up with meaningful verb phrases. If you’re stretching and end up with 42 relationships with verb phrases like, “is related to”, then you might be forcing it.

Let’s try one: COUNTRY and STATE/PROVINCE

Verb Phrase: A STATE/PROVINCE is contained within a COUNTRY, and a COUNTRY is made up of STATE/PROVINCE

2) Cardinality (how many) – there are three kinds of relationships: One-to-Many, Many-to-Many, and One-to-One

If we refine our verb phrase, we can add information about cardinality. In this case, we have a One-to-Many relationship between STATE/PROVINCE and COUNTRY

A STATE/PROVINCE is contained within one COUNTRY, and a COUNTRY is made up of one or more STATE/PROVINCE

3) Optionality – this tells us whether there is a mandatory relationship or not. Some entities must be related, others are optional. Let’s refine the verb phrase again:

A STATE/PROVINCE must be contained within one COUNTRY, and a COUNTRY may be made up of one or more STATE/PROVINCE

Notation

There are many different notations, but I prefer what is known as Crow’s Foot notation because it is simple and I’m comfortable with it. But, use what works for you. Just be consistent.

I like this notation because it tells you a lot about the relationship in a very concise way. Reading it can take a little getting used to, but here’s what you do: Starting with the entity and following the relationship line that comes out of it through to the related entity, say…

“[ENTITY NAME] [optionality] [name] [cardinality] [ENTITY NAME]” where optionality is denoted by solid (mandatory) or dashed (optional) line and cardinality is denoted by crow’s foot (many) or a straight-line (one) connector.

Such as, “STATE/PROVINCE must be contained within one COUNTRY”. I know it’s a lot to take in, but practice a little, and it’ll make sense. I promise.

What you’ve defined with a relationship is a business rule. If you’re lucky enough to have business rules in advance, that’ll take you a long way towards defining your relationships.

ATTRIBUTES

Let’s talk attributes. Attributes are a detail that helps to describe the entity or the state of the entity.

If I’m looking at my COUNTRY entity, I might have many ways to describe it and its state

  • Name
  • Population
  • Sovereign?
  • Established Date
  • Government Type
  • Currency
  • National Symbol
  • Language

These are potential attributes for my entity. Why do I say potential? Well, because until we start filling out this model, we’re not really sure whether they fit as attributes or not. Attributes have rules:

1) Attributes should depend on the key.

So far, I’ve defined my natural (or business) key of Country as Country Code. Each attribute needs to be examined and the question asked: Does this attribute define that key and nothing else?

2) Attributes should be atomic.

This means that an attribute serves as one descriptor of the entity. It can’t represent two things. Say I had an attribute called LanguagePercent that contained information like “English-60%” and “Spanish-40%”. This needs to be handled differently. (Come back on Friday when we talk about ERD refinement to see how)

3) Attributes should not repeat.

Taking the Language attribute, let’s think of this in real-world terms: Some countries have a single, official national language (I’m looking at you, France), but others have multiple recognized languages. Belgium has three: Dutch, French, and German. I don’t want to have to add attributes to my entity like Language1, Language2, Language3. There is a solution: LANGUAGE isn’t an attribute after all. It’s an entity, with a relationship to COUNTRY. How about that? We just refined our list of entities.

What’s really cool about this is that COUNTRY and LANGUAGE actually have a Many-to-Many relationship once you break it out and start working through your verb phrase:

“COUNTRY must communicate via one or more LANGUAGE, and a LANGUAGE may be the communication method for one or more COUNTRY”

Many-to-many relationships are valid business/real world relationships, but they don’t fly within a database. On Friday when we talk ERD refinement, we’re going to have to address this.

4) Attributes should not describe other attributes.

What if I had two attributes, National Symbol and Has National Symbol? Well, the information contained in National Symbol is going to be dependent on whether there is one. We don’t want that. Our attributes should only define the entity.

Given this, we have the start of an Entity-Relationship Diagram for Country.

We’re going to stop here for today. For my homework, I’m going to go through the rest of the entities, define the relationships between them, and then define attributes. If you’re feeling frisky, try fleshing out the model on your own. On Friday, we’ll compare notes then start the very important refinement process. This is where we take a first draft of an ERD, and see how we can improve it. Data Modeling is an iterative process. No one creates a perfect model on the first try – that’s why we’ll spend Friday learning about how to methodically refine our ERD.

Thanks for reading, and I hope to see you again on Friday.

–Audrey

SQL University – Conceptual Modeling

Welcome to SQL University’s Fall 2011 Semester week of Data Modeling.  I’m thrilled to be a part of this great endeavor again, and a giant thank you to the man behind the project, Jorge Segarra (Blog|Twitter).  I hope you enjoyed Julie Smith’s (Blog|Twitter) SSIS Deployment posts last week, and are looking forward to Stacia Misner’s (Blog|Twitter) upcoming week of MDX.  I can’t even tell you how cool it is to be scheduled between these two amazing women.

So… Data Modeling!  Well, not quite yet. I want to take a step back and talk Conceptual  Modeling. Let me explain why. Every good IT person I know is a born problem-solver. It’s one of the reasons we love what we do. We are presented with a problem and get our thrills from finding a solution. But here’s the thing… We’re so enthusiastic about finding a solution that we occasionally jump right past understanding the problem at hand and just start doing stuff. In our world, the results of solutions to undefined problems can vary from the happy accident of an appropriate fix to a snowballing A Simple Plan-esque disaster.

I can assure you, they’re contemplating where to bury the server they just destroyed. It all started with dropping that index.

Continue reading

A Call to Arms: Ladies, I’m Looking at You!

I’m helping to organize SQL Saturday #89 in Atlanta. This morning, I looked to see where we were with speakers and submissions. Running through the list, I saw we had 32 speakers. Awesome. Scrolled through again. We had three women. Not awesome.

As of right now, we’re at 35 speakers and 3 of them are women. That’s 8.6% for those keeping track at home. That’s less than 1 in 10 speakers. I just spoke at SQL Saturday #77 in Pensacola, and was the only female speaker. This is not good. This is not the trend I want to see. (By the way, thank you to Melissa Coates (Blog | Twitter), Jen Underwood (Blog | Twitter), and Sarah Barela (Blog | Twitter) for submitting!)

I have a very strong opinion about objectivity in our community. Pick me based on the value of my content, not based on my gender. I’ve said a few times, publicly, that if we just leave smart women to their own devices, gender diversity will work itself out. Wow… That’s just not happening.

What is happening? Was I wrong? Where are we missing the boat? Where are the smart women who are database professionals, and why aren’t they clamoring to submit sessions? You know what’s the worst? I tried to think of someone I could call and say, “Submit already”! I couldn’t think of anyone. That made me angry. Really angry. Ask Aaron Nelson (Blog | Twitter). Poor guy had to listen to me rant (a lot) today. On reflection, I’m mostly angry with myself. I don’t have anyone in mind? I haven’t thought through who I should be encouraging to submit an abstract? Shame on me.

So, I’m stepping off my standard-issue Gender Neutral Soapbox and putting out a call to arms. Ladies, (yes, you… I’m talking to you), submit a session to a SQL Saturday. Get involved! Put yourself out there. Is it scary? Yes. It’s also fun, exhilarating, and rewarding. The first time someone e-mails you to tell you that your presentation taught them something useful, you’ll be hooked. I promise.

Here’s my commitment to any woman who’s trying to decide if she wants to try her hand at presenting and is on the fence: I will be there for you. I’ll review content, help you practice, give you pep talks, whatever you need. There are too few of us out there. Get involved. Help prove me right – that we’re just as capable, just as passionate, and just as ambitious as the boys.

–Audrey

audreydhammonds@gmail.com

On Epiphanies, Friends, and Career Direction

On Monday, May 25, I woke up to a realization.  I don’t like my job!  I’d like to say it was startling or sudden, but that’s just not true.  But the epiphany wasn’t that I was unhappy doing the work I was doing, it was that I suddenly realized that I could not do it anymore.  Not wouldn’t… couldn’t.  Some switch flipped and I knew with absolute certainty that I wasn’t going to create enough change in my environment to make myself love my job. 

So, while I drove into work, I thought about my epiphany.  I asked myself questions: 

1) Why don’t I like my job? 
2) What am I going to do about it?
3) Am I losing my mind; have I become irrational? 

The answers, respectively, were: 

1) I don’t love Business Intelligence.  It’s a great field, and I thought it was the natural progression of my career, but I just don’t love it.  I am surrounded by people that do, and that just highlights the fact that I’m not passionate about BI the way they are.  That, coupled with the fact that I’ve been in a very tough consulting environment for a few months where I regularly do work that is neither fulfilling nor educational made for a bad combination. 
2) I’m going to change something. 
3) Surprisingly, no. 

Okay, so what to do?  I e-mailed a few trusted SQL friends and told them about my epiphany.  I know that every one of them probably said, “duh, Audrey” as they read the e-mail.  They all intuitively knew that I wasn’t doing something I loved and were all supportive and encouraging of my decision to figure out what to do next.  My dear and always supportive friend Aaron Nelson (Blog | Twitter) pointed out that his company was interviewing candidates for a database developer position over the next few days.  Now, I’ve known Aaron for a while, and he loves his job.  I’ve met his boss and co-worker, and they’re cool guys.  AND… one of my very best friends in the whole world, Ted Hughes, is an application developer over there.  (Tangent:  I know what you’re thinking… I’m friends with an application developer?  I am.  He’s brilliant and awesome, and if I ever need to hide a body, he’s the guy I’m calling.  Partly because I know that he has a truck and proper tools, but also because I know he’d show up. Oh, and he writes good SQL.  That’s how we became friends in the first place.  :End Tangent)

Also, my poor husband got this text from me on Monday:  “Had an epiphany. Don’t love BI. Getting a new job.”  To his credit, he just said, “Do whatever you need to do to be happy”.  Yay, Jeremy.

Long story short, epiphany on Monday, interviewed, loved what I heard about the position, offered position, accepted position, turned in notice on Wednesday.  My last day at Key2 Consulting is June 14.  My first day at ista is June 20.  Yes, I’m taking 3 days off.  Mike, my new boss, will appreciate me not being a frazzled, stressed, overwhelmed mess on my first day.  I will probably hang on to 1 or 2 of those characteristics even with the time off, but I haven’t decided which yet. 

Key2 Consulting is the company I’m leaving.  I love this company.  I pursued Key2 when I started to look at a career in BI Consulting.  Brian Thomas is an amazing boss, and more importantly, an amazing person.  I have friends here that I’m sorry to leave.  I have no regrets about coming to work at Key2, and I’ve truly enjoyed being a part of this family.  But, I’m moving in one direction and Key2 is moving in another.  Loving the people you work with is important, but loving the work you’re doing is more so.  For anyone who IS passionate about BI, this is a quality group of people. 

I know what you’re thinking:  Okay Audrey, you got a new job.  Big whoop.  What’s your point?  I do have a point (actually a few).

What I’ve learned about my career and life in general

1) Don’t be afraid to try new directions.  Also don’t be afraid to admit when they haven’t panned out. 
2) If you’re unhappy where you’re at, try to change where you’re at.  If you can’t do that, find a new place to be.
3) Friends are so important.  My friends validated and supported what appeared on the surface to be an impulsive decision.  But, they’d listened to me over the past few months, and each one of them knew exactly what to say.  A couple of them even had opportunites to offer up. 
4) Forget career ladders and job titles.  Figure out what aspect of this crazy career choice of ours you love and go get paid to do it. 
5) Sometimes you just KNOW.  When you do, ACT.  This doesn’t just apply to our careers.  This applies to life in general.  How often do we feel certain about what to do next?  When one of those rare moments comes along, go with it.

To my friends who read my rambly, strange e-mail that Monday and responded with kindness and encouragement… Thank you.  You know who you are, and I hope you all know that you’re awesome.  Hugs all around. 

–Audrey

Cage Match I: Anti-Joins

Before we get started on the Cage Match, a quick shout-out to my blogging partner and fellow Datachix, Julie Smith (Blog | Twitter).  She’ll be speaking at SQL Rally (May 11-13).  I’ve seen her slides, and man, are you guys in for a treat.  Attend her session for some SSIS and Datachix-y Awesomeness!

 On to your regularly scheduled blog post…

One of the coolest things about writing and presenting is every time I present, I end up learning something new. Either someone in the audience points out something I hadn’t thought of, or a question spawns a whole new line of thinking. Both scenarios are very cool.

Recently, I presented on EXCEPT. I love this concept, because it is very “natural language-y”. It’s simple and effective, and I use it all the time. I had a few comments, mostly around performance and other patterns that can be used to achieve the same results. As a result of these comments, I decided that it was time for the inaugural Datachix Cage Match. So, without further ado, I present to you….

CAGE MATCH I: ANTI-JOINS!

Anti-Joins are used when we want a result set that represents this idea: Give me everything from one result set that doesn’t exist in another result set.

The Players:

NOT IN (The Old Man) – This is probably the first form of anti-join we all learn. It’s easy to understand, and it works. It’s developed a reputation as a resource-eater, and common wisdom is to avoid it.

SELECT Title, LastName 
FROM dbo.ImportBookONE 
WHERE Title +'--'+ LastName NOT IN  
  (SELECT Title +'--'+ LastName FROM dbo.ImportBookTWO );

LEFT JOIN (The Ninja) – This is the pattern that looks least like an Anti-Join. It’s an outer join coupled with a WHERE clause eliminating any rows that didn’t show results in the right query. It can be confusing if seen in the wild, because it isn’t immediately apparent that it’s an anti-join. It’s a favorite of savvy database developers.

SELECT o.Title, o.LastName 
FROM dbo.ImportBookONE o 
  LEFT JOIN dbo.ImportBookTWO t   
    ON o.Title = t.Title 
      AND o.LastName = t.LastName 
WHERE t.Title IS NULL   
  AND t.LastName IS NULL;

NOT EXISTS (The Existentialist) – NOT EXISTS has been around for a long time, and seems to get overlooked pretty often as an Anti-Join option. It comes right out and asks the age-old question, “Does what I’m looking for even exist”?

SELECT Title, LastName 
FROM dbo.ImportBookONE o 
WHERE NOT EXISTS   
  (SELECT t.Title, t.LastName
  FROM dbo.ImportBookTWO t
  WHERE t.Title = o.Title     
    AND t.LastName = o.LastName );

EXCEPT (The New Kid) – EXCEPT has been available since SQL Server 2005, and is a cousin to UNION. It’s the one I presented on, and is a personal favorite. (But I promise we’ll stick to facts for this cage match)

SELECT Title, LastName 
FROM dbo.ImportBookONE   
  EXCEPT 
SELECT Title, LastName 
FROM dbo.ImportBookTWO;

The Setup:

The cage match will consist of 4 rounds:

Round 1: iHeartBooks Match – I take the EXCEPT I wrote for the presentation I did and rewrite the Anti-Join using all 4 options

Round 2: Mini Match – 4 contestants, 10 and 8 row tables, respectively

Round 3: Big Match – 4 contestants, 332,500 rows in one table, 321,416 rows in the second

Round 4: Main Event Match – 4 contestants, 6,760,676 rows in one table, and 6,250,625 rows in the second, with a 2-part business key thrown in for good measure

The Measurements:

CPU Time – The amount of CPU time per query (in milliseconds)

Reads – The number of logical disk reads per query

Duration – The total amount of processing time for the query

Percent of Batch – The percentage of the batch taken up by the query (when all 4 anti-joins are run in a single batch) – this doesn’t really tell us much besides giving us rough comparative numbers, but it’s interesting so I’m keeping it

ROUND 1: iHeartBooks Match

Using the same database I presented with, I run the variations on Anti-Join. Below are the queries competing in Round 1:

LEFT JOIN NOT EXISTS
SELECT o.Title, o.LastName
FROM dbo.ImportBookONE o
LEFT JOIN dbo.ImportBookTWO t
 ON o.Title = t.Title
  AND o.LastName = t.LastName
WHERE t.Title IS NULL
 AND t.LastName IS NULL;
SELECT Title, LastName
FROM dbo.ImportBookONE o
WHERE  NOT EXISTS
(
SELECT t.Title, t.LastName
FROM dbo.ImportBookTWO t
WHERE t.Title = o.Title
AND t.LastName = o.LastName
);
EXCEPT NOT IN
SELECT Title, LastName
FROM dbo.ImportBookONE
EXCEPT
SELECT Title, LastName
FROM dbo.ImportBookTWO;
SELECT Title, LastName
FROM dbo.ImportBookONE
WHERE Title +'--'+ LastName NOT IN
(
SELECT Title +'--'+ LastName
FROM dbo.ImportBookTWO
);

OUTCOME:

CONTESTANT CPU READS DURATION Percent of Batch
LEFT JOIN 0 51 40 25
NOT EXISTS 15 51 6 25
EXCEPT 0 51 7 25
NOT IN 0 51 0 25

Hmmm… Not very decisive. Every query had the same number of logical reads? That’s no fun! We’re working with very few rows, and we really end up with no discernable difference between each contestant. On to Round 2!

ROUND 2: Mini Match

Now, we’re going to see if this logical reads thing holds with another set of queries. This time, we’ve got 2 very small tables, one with 10 dates and one with 8 dates. The Anti-Join will return 2 rows.

LEFT JOIN NOT EXISTS
SELECT m.CageMatchDate
FROM ExceptCageMatchTiny m
LEFT JOIN ExceptCageMatchTinySub s
ON m.CageMatchDate = s.CageMatchDate
WHERE s.CageMatchDate IS NULL;
SELECT CageMatchDate
FROM ExceptCageMatchTiny m
WHERE NOT EXISTS
(
SELECT CageMatchDate
FROM ExceptCageMatchTinySub s
WHERE m.CageMatchDate = s.CageMatchDate
);
EXCEPT NOT IN
SELECT CageMatchDate
FROM ExceptCageMatchTiny
EXCEPT
SELECT CageMatchDate
FROM ExceptCageMatchTinySub;
SELECT CageMatchDate
FROM ExceptCageMatchTiny
WHERE CageMatchDate NOT IN
  (SELECT CageMatchDate
  FROM ExceptCageMatchTinySub);

OUTCOME:

CONTESTANT CPU READS DURATION Percent of Batch
LEFT JOIN 0 31 5 19
NOT EXISTS 16 31 6 18
EXCEPT 0 31 7 45
NOT IN 15 31 6 18

So what have our first two rounds of the cage match taught us? Well, if you’re working with very small data sets, you probably won’t see much difference in performance. Look at the differential between overall duration – almost nothing. This is a great example of the trap that we as database developers tend to fall into. We work with small sample databases during development, and it allows us to be lazy. Every option looks similar, and we go with what’s familiar instead of what’ll scale. Watch what happens in Rounds 3 & 4. Also, a great example of why only looking at Percent of Batch in SSMS can be misleading… 45% of the batch was EXCEPT, even though CPU time registered 0 and overall Duration was only 7? If you’re interested in seeing a complete picture of what’s happening on a server when you hit F5 in SSMS, take the time to kick off Profiler.

ROUND 3: Big Match

Ah, now the pros are entering the cage! We’re working with over 300,000 rows in each of our two tables, and there might be some blood spilled (finally!).

LEFT JOIN NOT EXISTS
SELECT m.CageMatchDate
FROM ExceptCageMatchMain m
LEFT JOIN ExceptCageMatchSecondary s
ON m.CageMatchDate = s.CageMatchDate
WHERE s.CageMatchDate IS NULL;
SELECT CageMatchDate
FROM ExceptCageMatchMain m
WHERE NOT EXISTS
(
SELECT CageMatchDate
FROM ExceptCageMatchSecondary s
WHERE m.CageMatchDate = s.CageMatchDate
);
EXCEPT NOT IN
SELECT CageMatchDate
FROM ExceptCageMatchMain
EXCEPT
SELECT CageMatchDate
FROM ExceptCageMatchSecondary;
SELECT CageMatchDate
FROM ExceptCageMatchMain
WHERE CageMatchDate NOT IN
 (SELECT CageMatchDate
 FROM ExceptCageMatchSecondary)

OUTCOME:

CONTESTANT CPU READS DURATION Percent of Batch
LEFT JOIN 389 1912 436 7
NOT EXISTS 390 1912 372 7
EXCEPT 624 1912 705 12
NOT IN 1262 1001148 880 74

Ah, the Old Man’s starting to show some wear. Logical reads for NOT IN went through the roof, and if you look at the Execution Plan, you can see why:

Poor guy had to spool out to a temporary table to hold the rows. None of the other contestants ran into this, and it absolutely killed NOT IN. I think he might be on his last leg. Very interesting that LEFT JOIN, NOT EXISTS, and EXCEPT are still showing the same number of logical reads. I was surprised by this.

ROUND 4: The Main Event

Finally, the round you’ve all been waiting for! A chance to see a no holds barred showdown between each of the contestants. So far, NOT IN has fallen behind, but LEFT JOIN, NOT EXISTS, and EXCEPT have both held up well. In this round, we’re working with two tables containing well over 6 million rows each. Just to make it more challenging, we’ve set up a two-part key to evaluate. Let’s see the final queries and results:

LEFT JOIN NOT EXISTS
SELECT m.CageMatchAlpha,  m.CageMatchNumber
FROM ExceptCageMatchGinormous m
LEFT JOIN ExceptCageMatchGinormousSub s
ON m.CageMatchAlpha = s.CageMatchAlpha
AND m.CageMatchNumber = s.CageMatchNumber
WHERE s.CageMatchAlpha IS NULL
AND s.CageMatchNumber IS NULL;
SELECT CageMatchAlpha, CageMatchNumber
FROM ExceptCageMatchGinormous m
WHERE NOT EXISTS
(
SELECT CageMatchAlpha, CageMatchNumber
FROM ExceptCageMatchGinormousSub s
WHERE m.CageMatchAlpha = s.CageMatchAlpha
AND m.CageMatchNumber = s.CageMatchNumber
);
EXCEPT NOT IN
SELECT CageMatchAlpha, CageMatchNumber
FROM ExceptCageMatchGinormous
EXCEPT
SELECT CageMatchAlpha, CageMatchNumber
FROM ExceptCageMatchGinormousSub;
SELECT CageMatchAlpha, CageMatchNumber
FROM ExceptCageMatchGinormous
WHERE CageMatchAlpha+'-'
  +CAST(CageMatchNumber as varchar(10))
NOT IN
 (SELECT CageMatchAlpha+'-'
  +CAST(CageMatchNumber as varchar(10))
 FROM ExceptCageMatchGinormousSub);

OUTCOME:

CONTESTANT CPU READS DURATION Percent of Batch SSMS Run Time
LEFT JOIN 12446 30844 13029 8 0:14
NOT EXISTS 11483 30840 13917 7 0:13
EXCEPT 19456 30848 19988 10 0:20
NOT IN 60216 20363994 116674 76 1:52

Aha! Finally, we see some real difference in performance. The Old Man (NOT IN) has been knocked out cold. But, surprisingly, the other 3 contestants held up surprisingly well. EXCEPT started to show some cracks in the millions-of-rows arena and lagged behind LEFT JOIN and NOT EXISTS. I think I’m going to have to call it a tie between The Ninja and The Existentialist, with The New Kid in a close third.

This post was fun to research and fun to write. But there’s a moral to the story besides “NOT IN = Bad and LEFT JOIN = Good”. If you’re in a development environment, think about the size of your development data set versus what you expect your production data to look like. (Yeah, you heard me… Capacity Planning) Develop using best practices, and avoid lazy pitfalls (Like NOT IN). And most importantly, take the time to try different options. I mean, who hasn’t dreamed of refereeing a cage match?

Query on, my friends….

–Audrey

Five Places to Document Before You Win the Lottery

When a project starts to go sideways and you’re cursing Naive Past You who signed off on the “aggressive” time estimates as you feed quarters into the vending machine in hopes that powdered donuts and honey buns won’t give you a heart attack before you hit the deadline, the last thing you want to talk about is documentation. Non-developers, if you value your life, observe the developer and her surroundings before you ask whether the documentation is updated. If you see any of the following…

  • Pyramid of Diet Coke or Mountain Dew cans
  • Pile of takeout containers (half-eaten because there’s no time)
  • Wild, bloodshot eyes
  • Caffeine-induced shakes
  • Sleeping bag and/or pillow

 …Back away slowly. DO NOT, under any circumstances, ask the developer if the documentation is up to date. You may lose a valuable body part before you can wedge yourself behind the file cabinet.

I love the beginning of a project. It’s full of resolutions and promises about how this time it’s going to be different. A quick run-down of statements I’ve heard (and said):

  1. “We’re building time into the schedule to keep the documentation up to date”
  2. “All of my time estimates will include some buffer for documentation”
  3. “QA will not sign off without documentation”
  4. “We’ve made accurate documentation part of the critical path”
  5. “We’ll document as we go”

What ends up happening? I can sum it up in 4 words: “We’ll fix it later”. But we never do. Then we leave and the next poor soul comes in blind and realizes that the documentation is either non-existent or woefully out of date. He gets to spend the next few weeks being indignant and self-righteous because the documentation is bad. He quickly forgets the undocumented tangle of code he left behind at his last job.

So, what exactly is documentation? Is it a necessary evil, a sacrifice to the development lifecycle gods? Is it yet another bit of unproductive administration work you have to do to check all the boxes on the project plan? Is it some antiquated legacy of waterfall methods? No. It is none of these things. Documentation, when done well, is a conversation between you and the next guy who comes along and looks at your work. You might not be there to explain why you had to do that weird thing in that stored procedure. You might not be able to tell the story behind the funky flat table you wish your name wasn’t associated with. You might not be able to sit down over lunch and talk about the wacky business rules you’re trying to enforce. But, your documentation will be there. More than anything else, this is your legacy.

Am I proposing that you create exhaustive documentation explaining every decision and line of code? No. That’s totally unrealistic, and I believe that it’s what gets us into trouble. Overwhelmed by the goal of perfect documentation, we end up not writing any documentation at all. I fall into the good enough camp in my approach. Even then, it can be a struggle to keep up. In that spirit, I have a few must-haves and guidelines for what and when to document. Also, I’m sort of a function over form kind of girl. If it works, do it. If it’s pretty, well, that’s nice too. Of course, I’m a developer. If you’re a DBA, your list will probably look different. But, I recommend the same basic approach. What are my must-haves and what are my triggers for a bit of explanatory prose? Or, you know, poetry… but if you have time to write documentation in iambic pentameter, you probably need a more challenging job.

Must-Haves

Data Dictionary – If nothing else, get the basics down:

  1. Entities and their purpose
  2. Attributes and their definition
  3. Business names for entities and attributes
  4. Explanation or examples of what data can end up in a column

Source to Target Mapping – if you’re moving data around, make a map

  1. Where does the data come from?
  2. Where is it going?
  3. What are the stops along the way?

Anything Manual – if you have to do anything out of process or manually, write it down

  1. Bob from Accounting asked you to run a “quick little report” for him in 2009. Somehow, you’re still running it every Monday
  2. You manually kick off that pesky job every Thursday evening instead of scheduling it
  3. You had to (gasp!) manually update data in the database

Other Guidelines

In-Line Documentation – (Stored Procedures/Functions/Scripts/etc.)

  1. Set up a nice little template inside any procedure/function/script that tells: Author/Date/Purpose/Major Revisions
  2. If you did something unconventional inside some encapsulated code, make a note of why you went this direction
  3. If some piece of code started out pretty and then grew horns and a tail and bought a pitchfork from Evil Processes R Us, explain what happened
  4. If you’re working around some limitation of the application/database/etc., say how and why

In Case I Win the Lottery Documentation (aka, In Case I Get Hit by a Bus)

    Imagine that you walk out the door this afternoon, and for some wonderful (not getting hit by a bus) reason, you never step foot inside your office again. You never log into the network again, and no one knows how to get in touch with you. What do only you know? Trust me, if you’ve been with an organization for any length of time, I can guarantee that you have some proprietary knowledge. This can be the least formal of all. Just make a list of things you hope someone else knows how to do when you’re sitting in Tahiti sipping fruity drinks, ogling the very cute cabana boy as you ask him to bring you yet another extra towel.In our little SQL Server community, we talk a lot about being part of the community. (I know… totally Meta) Documentation is a great way to be a good citizen inside our world. Chances are someone you know, or someone who knows someone you know, will be coming into an organization after you leave. Helping the next guy ramp up more quickly and avoid your early stumbles is just about the best you can do to support your fellow database professional.