Can you trust your foreign key? (Or, feature #821 I didn’t already know about SQL Server)

Hey friends! Long time, no see. I know, you’re wondering where I’ve been. Lately, I’ve had the luxury of taking some time to work on Microsoft certifications. I just got through 70-448, Microsoft SQL Server 2008, Business Intelligence Development and Maintenance. Now, I’m working on 70-433, Microsoft SQL Server 2008, Database Development. I’ll admit… I thought the Database Development exam would be a no-brainer. Heck, I’ve been doing this for years. Much to my chagrin, I’ve learned a few things I should have already known. (Isn’t that how it always ends up?)

Anyway, I was working my way through the Self-Paced Training Kit, and stumbled across one thing that I hadn’t known before and am so excited about that I want to share it with all 12 of our readers. Here we go…

First, because good bloggers give credit where credit is due, major props to the team that wrote the training kit for exam 70-433. They include: Tobias Thernstrom, Ann Weber, Mike Hotek, and GrandMasters. It’s a well-written book, and they obviously snuck some things in there that were more about good design and development and less about answering test questions. Kudos, gentlemen and lady! I’m basically re-writing something that you’ve already covered in your book, but I think it’s really, really cool and want everyone to see it regardless of whether they’ve bought the book or not. Given that, buy the book, dear readers. Even if you’re not studying for the exam. I promise you’ll learn something new.

Okay, to the point… Do me a favor, get on your local database and run this query. Go ahead, I’ll wait for you.

SELECT name, is_not_trusted

FROM sys.foreign_keys;

Good to have you back. I missed you. I stared pensively into the horizon while I awaited your return. I even wrote a poem and a folk song. Sorry, I digress… one too many rom-coms lately. Did you see any “1” values in the is_not_trusted column? Did you even know that foreign keys could be trustworthy? Nope, neither did I. What does it mean? It means that your foreign key hasn’t been verified by the system. How does this happen? Well, remember that optional clause called WITH CHECK | NOCHECK when you create a foreign key constraint? Yup. That did it.

So why does this matter? Well, it actually has an effect on your query execution plan in some cases. You know, that Query Optimizer is pretty darn smart. Let’s look at an example from the trusty old AdventureWorks database. I’m using AdventureWorks2008R2, but it should work with the older AdventureWorks databases. The scenario is this: I want to know if I have any sales orders that have invalid customers. CustomerID is a NOT NULL column in Sales.SalesOrderHeader. I know that if I count all the rows in Sales.SalesOrderHeader and the number of rows returned in the query we’re about to run, I should get the same number of rows back each time. But, sometimes rogue values slip in, especially when the database design has been refined over time. This could happen for any number of reasons: constraints that were added after the fact, legacy data, disabled constraints, etc.

Run this query, but turn on Include Actual Execution Plan (Ctrl+M) before you do.

SELECT soh.*

FROM Sales.SalesOrderHeader soh

WHERE EXISTS (SELECT * FROM Sales.Customer c WHERE soh.CustomerID = c.CustomerID);

Note that I’m using WHERE EXISTS rather than an IN clause with a subquery. This is because in this business scenario, I just want a boolean result (true or false), and I want it to run FAST.

Check out the execution plan. Note that the Customer table was never accessed. Why? Because that foreign key is trusted! Since it was verified on creation, we know that no rogue CustomerID’s snuck into the Sales.SalesOrderHeader table. It doesn’t even need to look at it.

Now, let’s muck with the foreign key and make it un-trusted. We’re disabling the foreign key with this statement. Books Online has a good article about what this means.

ALTER TABLE Sales.SalesOrderHeader

NOCHECK CONSTRAINT FK_SalesOrderHeader_Customer_CustomerID;

Verify that the foreign key is disabled by checking sys.foreign_keys again:

SELECT name, is_disabled, is_not_trusted

FROM sys.foreign_keys

WHERE name = ‘FK_SalesOrderHeader_Customer_CustomerID’;

If we run our query again, we see a completely different execution plan. Now, the query optimizer has to go look at the Sales.Customer table to get us an answer:

SELECT soh.*

FROM Sales.SalesOrderHeader soh

WHERE EXISTS (SELECT * FROM Sales.Customer c WHERE soh.CustomerID = c.CustomerID);

The execution plan had to change because SQL Server cannot guarantee that a CustomerID wasn’t entered while the foreign key constraint was disabled.

Here’s where it gets interesting. Enable the foreign key by executing the following:

ALTER TABLE Sales.SalesOrderHeader

CHECK CONSTRAINT FK_SalesOrderHeader_Customer_CustomerID;

Check out your sys.foreign_keys table again.

SELECT name, is_disabled, is_not_trusted

FROM sys.foreign_keys

WHERE name = ‘FK_SalesOrderHeader_Customer_CustomerID’;

What? It’s enabled, but it’s still not trusted! If we execute our query, we’re still going to get the execution plan that looks at Sales.Customer. Why? Well, that CHECK keyword up there just said to enable the foreign key, it didn’t say verify it. We have to issue this statement (Of course, if an invalid CustomerID snuck in while your FK was disabled, this ALTER is going to fail):

ALTER TABLE Sales.SalesOrderHeader

WITH CHECK —> this clause will make your FK trustworthy again

CHECK CONSTRAINT FK_SalesOrderHeader_Customer_CustomerID;

If we run our query now, we’ll get the sleeker, more efficient plan, because now our foreign key is enabled and trusted.

SELECT soh.*

FROM Sales.SalesOrderHeader soh

WHERE EXISTS (SELECT * FROM Sales.Customer c WHERE soh.CustomerID = c.CustomerID);

Cool, huh? The point is, this one teensy-tiny flag in the foreign key metadata makes a huge difference in how the query optimizer handles your query. It might not be much, but why not make sure that you can get as many trusted foreign keys as possible? You might just end up looking like a rock star for improving performance without having to modify any actual queries.

Now if I could just figure out put to get a is_not_trusted flag on people…

Why You Still Count as a Person if You Have Used Microsoft Access—(or How Old Are You, Julie? (or the overuse of parenthesis in early 21st century Bloggery))

( )

(A Short blog post for DYFHID.  This is like, eight years later than I had hoped, but I do like to keep my promises.  )

Yes, people rag on MS Access. Yes it’s a tippy tool.  YES, you really SHOULD wean yourself off of Access and move to SQL Server as quickly as possible if you like this crazy mixed up data business (and making money). 

But let me share my Holly Hobby Sewing Machine story with you.  (really Julie, Holly Hobby?  How old are you?) 

When I was a kid I had a toy Holly Hobby sewing machine.  It was made of thin plastic parts, glued together with finger paste.  It couldn’t do much.  If you tried to sew through more than 2 kleenex depth worth of fabric, or tried to go too fast, or tried to do any fancy curves or zig-zags, it would complain loudly and break in some dramatic way.   But –using it I learned the absolute basics of sewing and sewing machines.  Thread the needle. Thread the bobbin.   Push the pedal to sew the fabric together.  Avoid Puckering. (tee hee, you said pucker).   I quickly realized I needed a better tool and moved on to a domestic sewing machine designed for grownups with denim, and even eventually supported myself as a costumer using super duper industrial machines capable of sewing diamonds to kryptonite at the speed of light.

There was a similar evolution in my technical career.   I spent the first four years of my career at a business where the primary data was stored in a COBOL application.  The reporting and data extraction were all done through ODBC links to —(drumroll please) MS ACCESS 97. 

It was tippy.  It didn’t scale.  It was located on my local machine.  It used way too many staging tables, because its memory couldn’t process the data from the application without freezing and dying.  If you tried to do anything fast or fancy, it would complain and break, just like my toy sewing machine had done 15 years previously.  But—using it I did learn the absolute basics of sewing database programming and design.  Create a table.  Create another related table.  Write queries.  (Access does not in any way prevent the design of properly normalized data models)

 I forged on learning how to write decent (albeit tippy) databases and got quite a lot of work done with them, considering.  I did realize that I needed to make the switch to the heavy duty Server Technology and paid for my own training in SQL Server, then through stubborn perseverance got a job using it.  Several jobs later, here I am.  I haven’t seen the inside of Access in years.

Here is my point (and yes thank you Ellen DeGeneres, I do have one) (**are you quoting the title of Ellen DeGeneres’ 120 year old book?, SERIOUSLY Julie HOW OLD ARE YOU?)

I am not a bad developer because I started with Access.  It didn’t cause brain damage.  I was a good Access programmer and I am a good SQL Server developer because I have a fairly logical, process oriented mind.  No matter what tool I am using, I still have to process my data logically and efficiently.  I still have to listen to my users, formulate a plan for how to realize their dreams and implement that plan, using the best tools at my disposal. 

A person can have a lot of technical knowledge about SQL Server and still miss the mark on a project through any number of missteps (I.T. projects are fraught with peril—never forget that).  Never underestimate the importance of being smart and attentive.  You can’t fix stupid with a better tool.

(Now if you’ll excuse me I’m off (to lunch) at Piccadilly.)