Advanced DAX: Problem, Design, Solution

Thanks for watching my latest webinar on DAX. If you missed the webinar you can watch the recording here:

http://pragmaticworks.com/Training/Details/Advanced-DAX-Problem-Design-Solution

I didn’t have a lot of questions from this webinar but I will answer a the ones I did have.

Question: What is the difference between the Find function and the Search function.

Answer: The FIND function is case-sensitive and the search function is case-insensitive. In the webinar I used the FIND function in my first demo but really this could have been replaced with the SEARCH function since I was not concerned with case sensitivity.

Question: Where did you find the School Grade demo that you used in your webinar?

Answer: The data I used in the webinar can be found at the link below:

http://schoolgrades.fldoe.org/reports/#2014

Question: Might there be an upcoming webinar regarding DAX as it pertains to SSAS cubes rather than Power BI?

Answer: I am not aware of any webinars upcoming that will be specific to Tabular Cubes. It is important to point out that the examples used in this webinar are also applicable with Tabular SSAS Cubes since it uses the same backend engine as Power BI Desktop (xVelocity). If you implement these solutions in SSAS Tabular, the consumption of the data would be achieved through excel and then the necessary filters and slicers would be applied there instead of the report view provided in Power BI Desktop.

Thanks again!

Advanced DAX Webinar

Hey Everyone,

Thanks for attending the free 1 hour DAX Webinar by Pragmatic Works. In this blog I am going to focus on the primary question that I saw come up over and over again in the chat logs. The webinar can be found here:

  • How come the totals for Forecast and Dynamic are the same?
  • The Grand Total is different as well.
  • Are we sure all those totals are right? for The MLS data?

The totals that are generated in DAX are usually an area that cause confusion, especially if you are still learning about filter context. Totals do NOT simply add up the values that appear in the rows on the report. This is confusing because most of the time it looks like this is exactly what is occurring in the background.

If the Totals cell in your table does not sum the rows then how does it come up with a total?? Great question! The total cell also executes the DAX formula defined in your measure. Let’s start with a visual, one from the webinar that prompted the questions leading to this post. Notice in this picture you can clearly see that the “TOTAL” is not the sum of the rows. The values represented in the total row look completely wrong, however they are not.

SNAGHTML1a38ee9e

Remember this question? How come the totals for Forecast and Dynamic are the same? Let’s try to answer this question now. We can begin with Forecast YTD Sales.

Forecast YTD Sales is a simple calculation in the model. It is simply Prior YTD Sales * 2. In the screenshot below I have added the calculated measure Prior YTD Sales to the table so we can visualize exactly how the total cell is being calculated.

The total value is $19,582,120 which is $9,791,060 * 2.

image

Remember that in our original table there was only data up until June of 2008. Therefore, why is the total row taking Prior YTD Sales from December? Why is the total calculation taking the value of $9,791,060 from the Prior YTD Sales measure instead of $3,037,501 (June)? The table visual only displayed data up until June but this was because we don’t have any sales past June of 2008. The total row is looking at the entire 2008 year and therefore gets the Prior YTD Sales value as of the last date in 2008 ($9.791,060). Although this may not be the desired behavior it is the correct behavior.

How come the totals for Forecast and Dynamic are the same?

Well all of that is great, but that still doesn’t explain why the Forecast YTD Sales has the same value as our Dynamic Measure at the total row. The Dynamic Measure calculation always takes YTD Sales unless there are no sales on the last day of the current month in which case it will take the Forecast YTD Sales. This was explained in the webinar so please revisit that if you don’t remember the why behind this calculation.

IF(
    LastSaleDate <> LastDayOfMonth,
    [Forecast YTD Sales],
    [YTD Sales])

At the total row the calculation would read like this:

IF(
    “6/20/2008” <> “12/31/2008”,
    [Forecast YTD Sales],
    [YTD Sales])

Based on this interpretation of the measure we can see that the last day a sale took place in our model does not equal the last date in our model for the current year. That means that the total ROW is going to display the measure [Forecast YTD Sales]. This explains why we see the following values in our total row:

image

Once again understanding the total row is not easy, especially if you are transitioning to DAX from working with excel. If the total row seems incorrect it’s important to validate the calculated measure at the total level.

Thanks!

Advanced TSQL Takeover

Thanks for joining the 3 hour webinar I did for Pragmatic Works on Advanced TSQL. There were, as could be expected, many questions regarding the webinar so in this blog post I will try to answer as many of those questions as I can.

The TSQL Scripts used in the class can be found HERE: Please note that for one of the examples you will need to create and load the MicrosoftStockHistory example and I have provided a script to do just that.

Common Table Expressions

Question: Why does a common table expression begin with a semi-colon?

Answer: The WITH keyword is used for other purposes in TSQL other than just CTEs so to avoid confusion any statements prior to the CTE must be terminated using a semi-colon.

Question: What are the performance differences between CTEs and Temp tables?

Answer: Unfortunately CTEs and Temp Tables have entirely different use case scenarios so this is like comparing apples to oranges. CTEs are logical tables used to simplify code, essentially they are like views. Temp tables on the other hand are physically materialized on disk in the temp db. They also have column statistics and can be indexed for optimizing your TSQL Code. The answer to this question is really much more in depth then what I have provided here but this is the really really short answer.

Question: Is there a performance difference with CTE and Derived tables?

Answer: Rarely will you see a performance difference here, I have always seen identical plans personally when comparing performance. However, I called and talked to the smartest DBA I know, Bradley Ball. There are some situations with self-referencing CTEs where the SQL Server optimizer can generate a bad SQL Plan due to the Cardinality Estimator not being able to read the statistics of the underlying table. This is the same behavior that you would see if you were nesting views on top of views. In those situations a Derived table or a Temp Table would be a better option.

Window Functions: Framing

Question: When using Range for the frame could the duplicates values be eliminated with the Partition By Clause?

Answer: Range can generate unusual results when used to generate running totals. The unusual results occur when the Order By clause within a partition can contain duplicate values therefore duplicating the “running total” for multiple cells. The fix here is to either use the ROWS framing option or to choose a column for the Order BY clause that is unique.

Question: Can you call LAG(LAG(ClosePrice)) to get the close price from 2 days previous?

Answer: The default behavior of lag is to go back one record. The default behavior can be overridden by using the optional parameter [offset] when writing your lag function. For example the code here would look like: LAG(ClosePrice, 2) OVER (ORDER BY DATE)

Question: Is the performance gain of framing window functions only applicable for the aggregate window functions, i.e. not important for the other window functions?

Answer: The performance gain can be achieved also with ranking and analytical window functions. Anytime an order by clause is required we can specify ROWS instead of the default behavior of RANGE. We showed a demo of using ROWS with the Last_Value analytical window function in the webinar.

Question: Will there be a bathroom break?

Answer: I don’t plan on making it three hours without taking a break! Smile

Working with XML in SQL Server

Question: Can you query a XML File?

Answer: Absolutely. Querying an XML file is very easy to do and this is also referred to as Shredding XML. I will try to write a specific blog walking through how to do this in the coming weeks. Please note that I didn’t have enough time to show this in our extended 3 hour presentation but in the Pragmatic Works Advanced TSQL course this is covered.

Question: Can you create element centric XML but include some columns as Attributes?

Answer: This can be done easily by using the XML Path mode. XML Path generates element centric XML by default, this is in contrast to XML Auto and Raw which generates attribute centric XML. With XML Path you can convert any of the columns into attributes by simply giving it an alias name preceded by the @ symbol. For example FirstName as [@FirstName].

On-Demand Training

Question: Can we see a syllabus of the on demand courses and what material is presented in each class/section without signing up for a trial?

Answer: Please visit http://pragmaticworks.com/Training/On-Demand-Training, once there scroll down to the bottom of the page and then click on one of the individual class names to get the class outline and other information.

Question: Does the on-demand training include the ability to ask questions when something isn’t clear?

Answer: Yes, you can send questions to our training director directly. Another great feature of our On-Demand training platform is that it comes with two free hours of virtual mentoring. This is one on one time with a Pragmatic Works consultant.

How to get Line Item Detail information with Totals in T-SQL (Percent of Parent).

Hi All,

This blog post is a follow up to a question I received when I gave my Advanced TSQL Webinar for Pragmatic Works. If you haven’t seen that yet and would like to see it you can view the webinar here: 

Question: How can we get SubTotals in TSQL using the CTE method you mentioned?

In my webinar I showed how to get totals and sub totals in the same result set as line item detailed information (See screenshot below). The method I used involved using the OVER clause and it kept the SQL very clean and easy to read. Unfortunately this method is not the best performing option available and that is because the over clause without framing uses disk. (I have pasted the TSQL example with the over clause at the bottom of this blog post for comparison and reference purposes.)

image

Sub Totals in SQL with CTE method:

First of all can I just preface this by saying I love CTEs? (Common Table Expressions). Let’s jump right in and write some code! For this example I’m going to be using the AdventureWorks2012 database but this should work with all versions of Adventure Works.

Step 1) – Create CTE with total information: 

WITH TOTALS AS
(
    SELECT CustomerID, SUM(TotalDue) AS Total
    FROM Sales.SalesOrderHeader
    GROUP BY CustomerID
)
SELECT * FROM Totals ORDER BY CustomerID

 

Results:

image

Step 2: Create an SQL query with line item detail information.

SELECT CustomerID, SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader

image

Step 3: Join them together!

WITH TOTALS AS
(
    SELECT CustomerID, SUM(TotalDue) AS Total
    FROM Sales.SalesOrderHeader
    GROUP BY CustomerID
)
SELECT soh.CustomerID, SalesOrderID, OrderDate, TotalDue, Total
FROM Sales.SalesOrderHeader soh
JOIN Totals t
    ON t.CustomerID = soh.CustomerID

Final Result:

image

As I mentioned above you can get the same results using the OVER Clause in TSQL. I have pasted the code below for that example:

SELECT
    CustomerID, 
    SalesOrderID, 
    OrderDate, 
    TotalDue, 
    SUM(TotalDue) OVER(Partition By CustomerID) AS CustomerTotal
FROM    
    Sales.SalesOrderHeader

Final Thoughts: This method will generally perform better than simply using the over clause method, but it takes a more code and work. If the over clause function gets the job done and performance is not an issue I would recommend using that method to keep the code simpler and easier to read!

Thanks for looking!

Advanced T-SQL Webinar / Free Training

Thank You!

First of all thank you to everyone who attended my T-SQL presentation on 5/10/2016. We had over 1400 people registered for the training and 871 people joined and listened in, that’s a lot of people! I also received a lot of good and encouraging feedback so thank you for that as well.

Resources

The number one question I received is will the T-SQL scripts be available for download. Of course they will. If you want to download the T-SQL scripts that I used for my presentation you can find those here:

Free Recording:

The full one hour webinar that I did for Pragmatic Works on 5/11/2016 can be found here:

Questions and Answers:

I just received the list of questions from the webinar so I will get this section updated in the next couple days. I wanted to go ahead and post this blog so you could have access to the SQL Scripts.

Question:

Les Said: BTW, best presentation EVER!!!! Very clear and straight to the point in each case. Congrats!!!

Answer:

Thank you!, Best question ever!

Question:Recursive CTEs in SQL

Ken Asked: Should anchor be unique? What happens if there are two rows returned in the Anchor member?

Answer:

Hey Ken, the anchor member does not need to be unique here. In our example we were specifying that the CEO is at the first level or Level 0. If there are multiple members in the Anchor then multiple members would show up at Level 0.

Question: Pivot in SQL

Travis asked: Can you do multiple columns, such as minutes and cost?

Answer:

Travis I believe you are asking about the pivot example that I showed in my webinar. The answer is yes, you can definitely do a double pivot or pivot on multiple columns. I will write a blog on how to do this in the next week, so please check back!

Question: Merge in SQL vs. SSIS Update

Vineet asked: How does the merge statement compare to updates done in SSIS.

Answer:

The only native built in update capability that we have in SSIS is the OLE DB Command so I assume that this question is in regards to comparing the merge with the OLE DB Command. The merge pattern will perform light years better than doing updates in SSIS using the OLE DB Command. The merge pattern I showed in the webinar is a very popular design pattern used for loading data warehouses.

Question: SubTotals with CTEs in SQL Server

Vineet asked: Can you share CTE Examples on your blog.

Answer:

Hi again, you were not the only one to ask this question as I have also received a few emails requesting this example. I will write a blog post with an example on this in the next week, so please check back!

Question: What SQL Books would you recommend?

Answer:

There are a lot of great SQL Books out there. I will stick with one author here. I have three books by Itzik Ben-Gan.

  • T-SQL Fundamentals
  • T-SQL Querying
  • T-SQL Programming

MDX CALCULATION WIZARD – 12 Awesome MDX Calculations

12 MDX Calculations for your SSAS Cube:

If you have suddenly found yourself in a situation of having to improve the functionality of your cube with MDX Calculations but you don’t know MDX then you have come to right place!  Learning MDX can be a difficult task and one that can take some time, and time may not be something you have!

MDX Calculation Builder Wizard

Well I have great news for you! BIxPress by PragmaticWorks has an MDX Calculation Builder that will help you improve your cube functionality without having to learn MDX! The Wizard will walk you through a series of simple steps and then generate the MDX code for you and add it to your SSAS Solution and voila!

FREE TRIAL HERE: BIxPress Free Trial

Posted below you can find the different calculations and sets included in the wizard:

image

Thanks for looking!

MDX ISEMPTY function to return products with no sales.

Have you ever wondered how can I return all my non selling products in MDX? Well, if you found your way to this post hopefully you have! Fortunately this is actually quite easy to do.

ISEMPTY FUNCTION

The ISEMPTY function returns a BOOLEAN value. If the cell evaluated is empty the value of TRUE is returned and if it is not empty then the value of FALSE is returned. This is a great function to nest inside other functions. Let me show you how we can use this function along with the filter function in MDX.

MDX FILTER FUNCTION

Here is a simple MDX query that returns [Internet Sales] on columns and [Products] on rows.

image

In this query we can quickly see we definitely have products that have never had internet sales. Now how can we return only the products that have never had sales? Well there are many ways to do this of course but I want to show how to use the Filter Function along with the ISEMPTY function for this example.

image

Here I use the set of members returned from [Product].[Product].Children as the first parameter in the filter function. For the second parameter in the filter function I use the ISEMPTY function on [Measure].[Internet Sales Amount]. This means that only products from parameter 1 that have no sales will be returned in the final result set. In the result set above we can see that only products without sales are being returned in the final result set. Easy right? Smile 

As always thanks for looking!