by Julie Smith
Originally Published 9/24/2017, updated daily.
Microsoft makes major improvements via monthly and sometimes weekly releases to Power BI. In my time working on Power BI projects at Innovative Architects, I have found that the only way to stay on top of the frenetic pace of Power BI’s improvements is to closely follow its blog and other social media feeds.
This page is a Power BI report (how meta, right?) built off the content from Microsoft’s Power BI blog. I have set it up to refresh daily. Please let me know in the comments if you feel that the data has become stale. As part of building this report, I have added some curated slicers (things like PBI Service, PBI Desktop, Gateways, and Connectors) and blogging content tags such as Contest or Webinar. Keep in mind that this is my best effort and there is no guarantee on the accuracy of my tagging. I provide this as my gift to you “as is,” with no express or implied warranty.
I’ve been working with both Power BI and Azure SQL Database for the past nine months. One advantage to using Azure SQL Database with Power BI is that there is no need for a gateway, personal or enterprise. The data refreshes every 15 minutes, period. You do not need to schedule a refresh or configure anything beyond the initial connection settings: server, database, username, and password.
In planning for future SQL Saturday presentations in Chattanooga, Pensacola, and Atlanta, I definitely wanted to show the happy green path that I feel is now a viable pure-cloud solution for data warehousing. No problem, I’ll just use AdventureWorksDW, the data warehouse sample database of our beloved AdventureWorks bicycle shop, right?
Not so fast. There is no data warehouse sample database for Azure SQL like there is for the transactional AdventureWorks:
(Go here for full instructions on deploying AdventureWorksLT from the Azure portal.)
Today, Microsoft announced the April Power BI Desktop updates, and they include something my colleagues at Innovative Architects and I have long been awaiting: formatting for table reports!
While it was understandable that initially Power BI wanted to focus on visualizations first and foremost, the display for pure tabular data in Power BI was lackluster. There was no control over any of the following:
Colors of any element in a table (font, background color, title, totals). Nada.
See the example below, created in Power BI Desktop: a sad little table showing sales figures for the year 1998 in Northwind.
MVP Summit is always an amazing event. This year was no exception. It’s one part boot camp, one part super-secret secret-telling time, and one part family reunion. Along with that, we get cool swag (like the utterly amazing Data Platform jackets Jennifer Moser hooked us up with this year), interesting conversations, and time with the guys & gals who build the products we’ve bet our careers on. Needless to say, I was happy to be there.
This year was also a little different, and I want to talk about that for a minute. There has been a lot of buzz since Satya Nadella took the helm at Microsoft that things were going to be Different. That product teams were going to align, that they’d be smarter about how they build software, and that they’d move faster than they ever have before. I have to be honest… I thought it was all marketing hype. Until last week.
The very first thing I noticed on Monday morning was that the level of transparency was through the roof. As a person who builds software for a living, I know that we all err on the side of pretending like we have all the answers and that our process is bulletproof. That was not the message from anyone on the Microsoft team last week. While it is always awesome to hear about what’s new on the technical side of things, there was another level of value coming out of the talks. Honesty. A willingness to fail. Engagement that was real. Actual two-way conversations.
One of the things I love to do during presentations is take a lot of notes. Along with the obligatory talking points and feature notes, I like to write down things that are said by the presenters that resonate. I cannot share the exact quotes because of NDA rules, but I have been given permission to share the gist of what I learned. Because I spend way too much time on Imgur, I’m including memes to illustrate my points.
Don’t be afraid to fail. Failing, and failing fast, gets you to the good stuff.
Sometimes, you have to admit that you’re doing something totally new and that you might not already be an expert. This is okay. Go learn it, then you can build it.
There’s a lot of new stuff coming at us. Embrace it. It ain’t going away.
Applaud the person who points out that things aren’t on the right track. She’s the one who is unafraid. (And as Mr. Herbert taught us, fear is the mind-killer)
Experiment. Try something different. Be willing to fail and then try again. It’s science.
In all seriousness, hearing these kinds of messages from the most venerable software development organization in our business was inspiring. It made me feel like going home and taking a few risks. It made me feel like we were all in this together. Data and data management are moving at an insane pace these days. Always changing, always moving forward. Keeping up is overwhelming on a good day. When the experts at Microsoft say, “We’re learning right along with you. We’ll get this,” it is empowering.
My point is, the technical stuff was great. The product positioning information was helpful. But my real takeaway last week was that… well, let me share one little story…
I was in a meeting about a (NDA – sorry, y’all) thing. The presenter threw out some concepts and thoughts about the thing. I raised my hand and said, “I think I have a use case for you. Let me run you through a scenario that one of my clients has.” After I explained what I needed, I asked, “So, how would you solve this problem?”. The response? “I don’t know yet. But I think we can solve it together. Let’s stay in touch and see if we can come up with some good ideas.”
And that’s it right there. I went to a session about a topic where Microsoft didn’t have the answer yet. They still got in front of us and talked about where they were, what their goals were, and what they were doing to move forward. And when we had ideas or real-world problems to solve, they engaged. They asked us for help. Not “help”, as in, “fill out this survey for us; we promise we’ll do something with your feedback”. We were treated as peers and as people on the ground who had real value to add to the conversation. It was a little bit amazing.
And you know what? It’s working. They’re doing more, faster. They’re innovating in a way that big companies aren’t supposed to be able to do. I’m excited about where we’re headed.
So in short, thank you to Microsoft, the MVP Summit organizers, and everyone who makes our experience as MVPs special. It was an awesome week.
I have been working with Microsoft’s shiny new Azure data integration tool, Azure Data Factory. ADF was made generally available on August 12th.
ADF is available on the Azure portal, and you can use it to create pipelines that move data to and from other cloud-based data stores and on-premises data stores using Data Management Gateways.
There is a lot of documentation and info about ADF online. If you are brand new to it, I’d recommend starting here with the learning path from Microsoft. ADF had been in preview since 2014, and one caution I’d give you is that the domain-specific JSON used by ADF went through a major rewrite in mid-July. So if you find a post from before that, understand that any of the JSON in it will be old and will need to be translated. In my experience, the JSON editor on the portal attempts to translate it for you. There is also a GitHub site with a translator.
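To give you a feel for what that domain-specific JSON looks like, here is roughly what a post-rewrite (v1) dataset definition for an Azure SQL table looks like. The names here are hypothetical, purely for illustration:

```json
{
  "name": "AzureSqlCustomerTable",
  "properties": {
    "type": "AzureSqlTable",
    "linkedServiceName": "AzureSqlLinkedService",
    "typeProperties": {
      "tableName": "dbo.Customer"
    },
    "availability": {
      "frequency": "Hour",
      "interval": 1
    }
  }
}
```

Every table you move gets a description along these lines, which is exactly why a script that generates them for you saves so much time.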
Reza Rad is sharing a lot of great content as well on his blog.
This post is going to be a targeted, “skip to the chase” post with a script to help you speed up your JSON descriptions of SQL tables. As I mentioned, ADF is ALL JSON, ALL THE TIME. Continue reading →
Summary: The BimlScript methods of Business Intelligence Markup Language (Biml) will only work (i.e., access SQL metadata) with SQL Server versions 2005 and higher. This article briefly tells the story of how Innovative Architects worked around this limitation for one of our projects and successfully tricked Biml by creating system views in SQL Server 2000. The DDL script for the views is found in a link at the end of the post.
I was happy to co-present a session with Rob Volk (@SQL_R) at this week’s Atlanta BI User Group meeting, entitled “Harvesting XML Data from the Web with Power Query and Curl.” The demo gods were not with me on my grand finale demo that night, however. I had spent the demo building a Power Query function, and when I tried to invoke it against a list of values, I got a failure which I couldn’t resolve that night. Of course, as soon as I opened the spreadsheet the next day I immediately saw the problem, which I will share here, as I think it is probably something people will encounter frequently as they start to work with Power Query.
What the Function Did:
Here’s the setup: www.SQLSaturday.com contains a page for every SQL Saturday and, if it’s available, the schedule for the event. Atlanta’s last event was this month and was SQL Saturday #285; hence, its schedule is located at http://sqlsaturday.com/285/schedule.aspx. Any other SQL Saturday event number works the same way. If I want to use Power Query to view this data as a page, I would enter that URL as the source in Power Query:
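Before wrapping it up as a Power Query function, the URL pattern itself can be sketched in a few lines. This is Python, purely to illustrate the parameterization; the function in the talk was written in M:

```python
def schedule_url(event_number):
    """Build the schedule URL for a SQL Saturday event number.

    Follows the pattern described above: every event's schedule
    lives at sqlsaturday.com/<number>/schedule.aspx.
    """
    return f"http://sqlsaturday.com/{event_number}/schedule.aspx"
```

Calling `schedule_url(285)` produces the Atlanta schedule URL; swapping in any other event number works the same way, which is exactly what the Power Query function parameterizes.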
So two days ago I posted this. It’s a way to generate an SSIS expression for use in an Incremental Load’s Conditional Split. A friend had pointed out that this pattern was not the best, as NULL handling is not always as easy as replacing the NULL with what you might consider a safe value. I also got a very thoughtful comment on the post from a lovely gentleman expressing the same concern. So, obsessed, I went back to tinkering. I came up with ANOTHER expression (and consequently another T-SQL generator for it). I like this one a little better, as it seems to me that it performs what is asked without introducing the risk of replacing NULL values. So folks, please read this, use it, bash it up, and let me know what you think.
Here’s the new (to me, sure someone had already figured this out) NULL Handling expression for DELTA rows, using the column Color as an example:
This does NOT break the Conditional Split if there are NULLs. There can be NULLs in the source, the destination, or both, and it does not break the pipeline. I love that.
How to read it from the left:
The whole expression evaluates to TRUE, and the row is split into the Delta path, when either the yellow portion or the green and blue portions together evaluate to TRUE.
The yellow-highlighted portion asks: is either side NULL while the other is not? If yes, it evaluates to TRUE.
The green-highlighted section asks: are both sides NOT NULL? If yes, the blue-highlighted section asks: are they unequal? If yes, it evaluates to TRUE.
If both sides are non-NULL yet equal, or if both sides are NULL, the condition is not met and the row is ignored. Just like we want it to be.
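If you want to convince yourself the logic behaves correctly, the same predicate can be modeled outside of SSIS. Here is a small Python sketch, with None standing in for NULL; this is an illustration of the logic, not the SSIS expression itself:

```python
def is_delta(source, dest):
    # First clause: exactly one side is NULL -> changed row.
    if (source is None) != (dest is None):
        return True
    # Second and third clauses: both sides present and unequal -> changed row.
    if source is not None and dest is not None and source != dest:
        return True
    # Both NULL, or both present and equal -> ignore the row.
    return False
```

Only the two “no change” cases, both NULL or both present and equal, come back False; every combination involving a real difference routes the row down the Delta path.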
Now, the SQL to generate the whole expression can be datatype-agnostic. I love that too. Here is the SQL to generate the whole concatenated shebang for all of your columns:
Click on the SQL below to get a copy/paste version 🙂
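If you would rather see the shape of the generator’s output than run the SQL, here is a hypothetical Python sketch of the same idea: loop over the column names and emit one NULL-safe clause per column. The column list and the Dest_ prefix for destination columns are assumptions for illustration only:

```python
def delta_expression(columns, dest_prefix="Dest_"):
    # Emit one NULL-safe comparison clause per column
    # and OR all of the clauses together.
    clauses = []
    for col in columns:
        s, d = col, dest_prefix + col
        clauses.append(
            f"((ISNULL({s}) && !ISNULL({d})) || "
            f"(!ISNULL({s}) && ISNULL({d})) || "
            f"(!ISNULL({s}) && !ISNULL({d}) && {s} != {d}))"
        )
    return "\n|| ".join(clauses)
```

For a single column named Color, this produces one clause with the three parts described above; for a whole table, you get one such clause per column, all ORed together, ready to paste into the Conditional Split.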
Happy 2014! Happy to report that Audrey and I were both renewed as SQL Server MVPs today! To celebrate I’m publishing a really long blog post.
This is a MONSTER long post. The main point of this post is to give you some T-SQL code which can be run against the INFORMATION_SCHEMA views of a SQL Server destination table to spit out a complex SSIS expression. While that was the point of the post, I felt I also needed to provide context for what I was trying to do, or what you might be doing, when the need for such SQL arises.
Also, I had Audrey read it over, and she scolded me for a sad lack of chuckles. So I’m adding in some random chuckle-y interludes. Enjoy them. Or skip them completely. Chuckle interludes are indicated by the face of Chuck Norris for easy recognition.
One of the most common scenarios encountered while ETL’ing is the Incremental Load: determining whether a source row already exists in the target database (often your warehouse) and, if it does, whether it has changed since the last time it was loaded. The pseudo code goes like this: Continue reading →
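For anyone skimming, one common shape of that incremental-load logic can be sketched in Python. The key-based dictionary lookup and the dict-per-row representation are assumptions for illustration, not the exact pattern from the full post:

```python
def incremental_load(source_rows, target_rows, key):
    """Classify source rows as inserts or updates against the target.

    source_rows / target_rows: lists of dicts representing rows;
    key: the name of the business key column.
    """
    target_by_key = {row[key]: row for row in target_rows}
    inserts, updates = [], []
    for row in source_rows:
        existing = target_by_key.get(row[key])
        if existing is None:
            inserts.append(row)      # not in target yet: insert
        elif existing != row:
            updates.append(row)      # present but changed: update
        # present and unchanged: ignore the row
    return inserts, updates
```

The unchanged rows are simply skipped, which is the whole point: only new and changed rows flow on to the warehouse.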