by Julie Smith
I’m back from the MVP Summit and processing a staggering number of thoughts. Not only am I processing the brain dump of NDA technical goodness of exciting new stuff coming to the product, but a number of personal reflections. How lucky I am to have been somehow deemed worthy of this honor. How fortunate the SQL MVPs are in the number of sessions, pampering and genuine respect we receive from the SQL Server Product team. How lucky I am to be a part of one of the closest technical communities out there. The SQL Community calls itself family because we truly have a network of close, warm friendships. It dawned on me recently that I could go practically anywhere in the world (the world!) and find a colleague I have met from being a part of the SQL Community. I am a happy lady right now (in spite of the double ear infection).
Another benefit to you, my readers from my attendance to the Summit is a new motivation to continue sharing my knowledge with the community. That said, let me be the millionth prodigal blogger to apologize for my lack of blogging. Insert every lame but true excuse here: I’ve been busy, I’ve had things come up in my personal life, all true, but I know you want to hear from me.
So from the back logs of my “things I wanted to blog about” I picked a random DQS tidbit: How to import domain values and synonyms into DQS from Excel. Get ready for a wee bit of technical content!
When I first started working with DQS and saw the ability to import values from Excel, I used the function, but only to import the actual values. I did not realize until PASS Summit this year that you could also import the synonyms. I was asked if you could do so during a presentation on DQS, and luckily Rakesh Parida from Microsoft was in the audience and was able to field the question.
Let me do a small intro to DQS so that you have context. In Data Quality Services, the equivalent to a database which we are all familiar with is called a Knowledge Base. Knowledge Bases contain Domains, which are the equivalent to columns or attributes in a database.
In a domain, you can create domain values, which are the expected values for an attribute in the data. The beauty of DQS is that we can also create relationships or synonyms between domain values. As an example, if I have a list of vending machine suppliers for my company, I know that I should expect to see Coca Cola and Pepsi. But as an information worker in my company, I also know that sometimes I see Coke instead of Coca Cola. Coke is valid in my domain, but it should be corrected to Coca Cola so that at the end of the month, I know for certain how much I spent with the one company Coca Cola. Coca Cola and Coke are both valid domain values, but Coca Cola in my example would be the leading value.
So I’m going to show you how to import both domain values and their synonyms into a DQS Knowledge Base from Excel. Here is the data in Excel. The first column are the leading values, or the correct value for the domain. The Second column contains values possible for the domain, but that should be corrected to the leading value. Keep in mind that if a row in my spreadsheet has a value for both Corrected and Synonym, then DQS is going to insert two Domain values, one for each, but with the second having a correct to value.
Let’s set up a quick Knowledge Base and a Domain with datatype String for the Vendor:
- From the DQS Client, click New Knowledge Base.
- Type in “Vendors” for the Name.
- Leave None as the choice in the Create Knowledge Base from text box.
- Click Next
To create the Domain:
- Click the Create a Domain button in Domain Management.
- Once the Domain properties modal box comes up, type in Vendors for the Domain name. Leave the other values as is. This will give you a string datatype Domain.
Once the Domain is created, you will see the tabs for various properties of the domain above. We are going to be in the Domain Values Tab:
From here, click on the third button from the right (the Import Values Button) and select “Import valid values from Excel”. That will bring up the Import Values dialog box. Select your workbook and sheet names:
be sure to click the box beside “Use first row as header” and click ok. You’ll get a logging dialog to let you know how it went.
Now you will see your values from Excel available as values in your domain. Not only did you import the values, but the leading values. So when we run actual data through this KB in a cleansing project, the input of “Coke” will return as the having a corrected value of “Coca Cola”.
I hope this was a useful demo of loading your Knowledge Bases more quickly and I look forward to another wonderful year of SQL Server Community.