Unexpected Totals in DAX (Part 2)

In a previous blog post we discussed how to replace the total row with a blank value, primarily to eliminate confusion. You can find the original post here: Unexpected Totals in DAX (Part 1)

In this post we want to go a step further and replace the total row with our own DAX calculation.

image_thumb5

Our goal is to replace the value of 8,691 with the value of 1,005. The value of 1,005 is the total number of new homes listed on the market in 2016, which in our case is the last year or current year of the data set we are looking at. This is slightly more challenging though, because our date table goes to the year 2018. But just wait, we will get to that shortly.

Let’s take a look at the steps to solve this problem:

  1. Identify if the calculation is at the total row, we will use HASONEVALUE as we did in the previous blog post.
  2. Determine the MAX year in the data set with homes on market, not the last year in your date table, but the last year with actual homes listed on the market.
  3. Write a calculation that returns the New Homes on Market for the last year in the data set.

I am going to create a new measure so we can look at the two measures side by side.

image

I have modified the original DAX calculation, here we first check to see if we are at the total row using HASONEVALUE, if we are at the total row then we return blank. If this doesn’t make sense, please stop and go back to Part 1 where I cover this in detail.

  • A = Check to see if the current filter context has one value, if not, then we are at total row.
  • B = If there is more than one year in the filter context, replace with a blank value.

Determine the last year with homes listed.

Now that the total row has been identified, it’s time to author a DAX formula that returns the total homes listed for the last year in our data. The last year with homes listed in our data is 2016, therefore we need to write a DAX formula that reads like this:

Return the number of new homes listed in 2016.

Now, technically we can’t write the year 2016 in our formula because we know that this would no longer work once we move into the next year and we need our DAX formula to be dynamic (automated) and change with the years. Here is our first attempt at getting the MAX year.

image

  • A = Return the MAX year in the Filter Context
  • B = Return the MAX year from the variable at the total row.

image

For demo and validation purposes I am displaying the results of the variable in the total row (The max year), here we can see that the results are maybe not what we would have expected. The year 2018 is the last year in our date table but not the last year that new homes were listed. One way to get the last year with homes listed is to use the function LASTNONBLANK.

LASTNONBLANK

For the sake of brevity I won’t cover LASTNONBLANK here, but I will do a separate blog series on semi-additive measures. Let’s rewrite the DAX formula:

image

Using LASTNONBLANK we now get the last year that we had new homes listed. See results below:

image

Perfect! The hard part is over, now we simply count the total rows where the listing date of the home equals the year 2016.

image

Here is the final result:

image

As always, thanks for reading and I hope this helped!

Unexpected Totals in DAX (Part 1)

DAX is an awesome language and when paired with properly created relationships, DAX can add significant analytical value to your data models with minimum effort. But, you already knew that! So why are we here? Well, sometimes, DAX can produce results that are unexpected and this is usually very noticeable when looking at the totals row.

image

So why do we get unexpected results at the total row? Why doesn’t the total row simply return the sum of all the rows in a specified column? The simple answer is your DAX calculation is also computed for the total row and operates within the contexts of the total row. Let’s take a look at the previous screenshot.

In this example, the total row is returning all homes listed for all years in the filter context, this is in fact the sum of all the rows in the column. However, you may expect to only see the homes newly listed on the market for the latest time period, in this case that would be how many homes were listed in 2016. The result of 8,691 homes listed in the total row is not wrong or incorrect, it depends on what you are specifically looking for, it could definitely be unexpected depending on your analytical needs. If you do not wish to see the total of all homes ever listed in the current filter context then you have a couple of options available to you.

There are generally 3 ways you could address incorrect / unexpected totals.

  1. Pretend the problem doesn’t exist. (ProTip: Don’t ignore problems.)
  2. Identify when the DAX calculation is being evaluated for the total row and return a BLANK value.
  3. Identify when the DAX calculation is being evaluated for the total row and perform a different calculation.

In this blog post you are going to learn how to return blank to eliminate any confusion. In a future blog post you will learn how to use DAX to change the value of the total row.

HASONEVALUE function in DAX

In this simple example, the total row is the sum of all the rows, but this is not what we want. What we would want to show at the total row is how many homes have recently been listed for sale, not the total of all homes that have ever been listed. The first thing we must do is identify if we are at a total row. The way I do this is by using the function HASONEVALUE.

MSDN Definition:

HASONEVALUE: Returns TRUE when the context for the columnName has been filtered down to one distinct value only. Otherwise FALSE.

In the screenshot below I created a new measure called “Totals” and in this measure the function HASONEVALUE is used to correctly identify which row is the total row. This works because at the total level the filter context is all years, so the function returns FALSE.

image

The final step is to now use conditional logic and replace the Total row with a blank value.

BLANK() function in DAX

New Homes on Market (2)=
IF(
    HASONEVALUE(‘Date'[Year]),
    [New Homes on Market],
   BLANK())

SNAGHTMLde25f4f

Now the total row is no longer confusing or misleading. In part two of this series I am going to take this a step further and show how to use DAX to return your expected results at the total row instead of just returning blank!

Further Reading: Rob Collie over at PowerPivotPro has a great blog post on HASONEVALUE vs. ISFILTERED vs. HASONEFILTER:
https://powerpivotpro.com/2013/03/hasonevalue-vs-isfiltered-vs-hasonefilter/

Thanks!

Quick Tips – Mapping Geography Data in Power BI

One challenge of working with data of a geographical nature is that sometimes, it can be mapped incorrectly. In this quick post I want to give you a couple of tips that will help you to reduce, if not eliminate, incorrect mappings of your data!

There are a few different methods you can use to try to solve the issue of incorrect mapping of geographical data.

  1. Use hierarchies in your map visuals, hierarchies store relationships between attributes and can help with mapping a lot. A geography hierarchy might look something like the following: Country –> State –> City – > Zip
  2. Use data categorization. Sometimes a state can share the name with a city or a country. I remember years ago when I heard on the news that Georgia was under attack, that was pretty concerning for me since Florida is very close to Georgia ha ha. Of course, the state of Georgia was not under attack, it was the country! We can use data categorization in Power BI to specify that a column is a city, state, zip, or country.
  3. Remove ambiguity, for example, instead of having a city column, create a new column with the city and state. Then you can assign that new column a data categorization of Place. If you have millions of potential combinations then this may not be feasible within Power BI, this column would have terrible compression and most likely exceed any memory limits. However, this method works great.

Using hierarchies in Power BI to map geography data types

Take a look at the screenshot below, the State/Province of Nord in France is being incorrectly mapped to Lebanon.

image

Fortunately, this one can be solved very easily by using hierarchies. Nord, by itself, is not clear enough for Bing maps, however, if we add the country to the visual as well, then the picture becomes clearer and Nord will be properly mapped to France.

1) Add the Country to the Location. The country should show up above the state in the location list as seen in the following screenshot:

Add Country to Location

After adding the country, the map is at the highest level and you would want to now drill down to show the next level in the hierarchy. In the animated gif below, you will notice that Nord is now being mapped correctly in France!

Visual Drilldown

Using the PLACE data categorization in Power BI to map geographical data

In the following image, you will notice the map visual has been filtered down to the state California in the United States and therefore only the cities that exist in California should be displayed. Yet, the map visual is a little confused, and this happens because multiple states could have the same city name.

image

To solve this confusion you want to remove the ambiguity here and create a new calculated column with the city and state combined. Next, assign the new column a data category of Place. See demo below:

DataCategoryPlace

Replace the column city in your map visual with the new column city, state. Here is the final result, the cities are now mapped correctly and only cities directly related to California appear on the map:

image

Thanks for checking out this “Quick Tips” blog. Please check out my YouTube channel to find more Power BI related material!

Quick Tips – Quickly Changing Connections in Power BI

Hopefully, you don’t have to go in and change your connection strings often in Power BI, but when you do, you will want to know this little trick.

What’s not covered: Parameters

In this blog I’m not going to discuss parameters, although as a developer, I love that parameters are available for administration. Even better, the use of parameters in Power BI became even more powerful with the recent update to Power BI where parameters can now be changed in the Power BI Service! I will definitely do a blog or series on parameters in the near future.

What is covered: Power Query Editor & Data Source Settings

As a trainer, I come across the issue of needing to change connection string information quite often and this saves me a lot of time. Let’s jump right in. In this example, I have a power bi report with two separate data sources and a total of eight tables.

image

  • The first data source is to an excel file with many different sheets.
  • The second data source is a separate excel file.

Imagine that both excel files get moved from one directory location to another directory location. Typically this would mean that I would need to individually update the connection information for all eight tables in my data model. However, I’m here to show you a better way.

Launch the Power Query Editor

You will be making source connection changes from the Power Query Editor. Launch the Power Query Editor by clicking on “Edit Queries” found on the Home ribbon.

image

This will launch the Power Query Editor. Next, click “Data source settings” found on the Home ribbon.

image

This launches the Data source settings window and you will be able to see all of your connections here. If you had twelve tables all coming from the same excel file or a SQL Server database, ect… then you would only need to change one connection for all twelve tables. You get the idea!

Select the first connection in your list you want to update and then click on Change Source…

SNAGHTML2fc8e1db

This will launch a new window that allows you to quickly make source changes, this window will look slightly different depending on your data source. This example is loading data from an excel file, therefore the following window is provided for making changes:

image

After clicking browse, you can now browse to the new file location. In this example, the file has moved over to the DAX Advanced folder so I want to point to that directory and file, as you see below:

image

Once the file has been selected, click Open. Click OK. Now all seven tables that were pointing to that AdventureWorksDW excel file have been updated at one time. Repeat this process for any remaining data sources that may need updating and you are done!

I hope you enjoyed this Quick Tip!

Power BI–Problem, Design, Solution (Text Search)

 

There is a video that walks you through the solution in depth, please find the video at the end of this post along with the completed solution.

In this installment of the Problem, Design, Solution series we are going to show you how to perform a text search using slicers in Power BI, this simulates a “LIKE” type search. In the following screenshot you can see that when “Tax” is selected all records in the table that have “Tax” anywhere in the record are returned, likewise whenever “IT” is selected from the slicer all records in the table that have IT in them are returned. Hope you enjoy this post!

Solution_LikeSearch

Problem

Normally slicers (visual filters) only filter records that are an exact match, for example, if you select “IT” from the slicer then you would only expect to see records that have IT and only IT in the record, if it does not match IT exactly then those records are not returned. This is the intended behavior of a slicer and in general, this is exactly what we want. However, there are times when you might want to do a “LIKE” type search similar to the like function in T-SQL. In this scenario, we had a client who needed to do this exact thing. The screenshot below is our simulated problem for this example.

In order to bring back all records that contain “IT” you are required to click each occurrence of “IT” in your slicer, of course, this isn’t a big problem if you have a small number of possible combinations like our sample size. In larger environments, this may be an unreasonable expectation!

Problem_LikeSearch

Design

Now that we have discussed both the problem and what we want the ultimate solution to look like, it is time to lay out the design. The design includes a four step process.

Let’s break this down a little bit:

  • The first step in the process is to identify all the unique values that we need to filter on and put those values into their own distinct table, in this case, we are looking at departments (HR, IT, FIN, Audit, and Tax).
  • The second step is to create a measure that uses the distinct departments in our new table as a filter. We will call this measure IsFiltered.
  • The third step is to create a new slicer using the values listed in our disconnected table.
  • The final step is to tie our new measure (IsFiltered) to the specific visual that we want to be affected by our disconnected table of distinct values.

Step 1 – Loading unique values in a distinct table

I would recommend storing this list of values in a file or in a table in a database, for this example I will quickly put these values into a table in my PBIX file. Navigate to the home ribbon and select “Enter Data”, I will name the column Departments and the table Slicer, next I will add the five distinct departments as records for the departments’ column (IT, HR, Tax, FIN, Audit).

Note: This is using the disconnected table design, we are not going to set up a relationship between this new table and any other table in our model.

Step 1_LikeSearch

Step 2 – Creating the IsFiltered measure

This IsFiltered measure is going to give us the “LIKE” type search that we are looking for! Excited yet?? Well, you should be! In this example I am going to use the FIND function, the find function is exactly like the SEARCH function with the key difference being that the FIND function is case-sensitive and the SEARCH function is case-insensitive. The code below will complete step 2 in our design.

IsFiltered =
IF(
SUMX(‘Slicer’,
FIND(
‘Slicer'[Departments],
MAX(‘Department Goals'[Affected Departments]),,0)) > 0,
True“,
False
)

What is this measure doing? The FIND function returns the starting position of the text string we pass in, if the value is greater than 0 then that means that the text exists and this is how we generate a LIKE type search. If the text is not found then nothing is returned and we give that record a default value of 0.

Step 3 – Create a slicer from the disconnected table

Now we can create a slicer from our distinct list of values. At this point the slicer does not filter anything, this is because the filtering occurs through the measure we created in step 2 and needs to be added as a filter to a visual or report in our Power BI Report.

image

Step 4 – Use the IsFiltered measure to filter a visual in the report

The last step in this process is about as easy as step 3, first we need to figure out which visuals we want to be filtered with the “LIKE” type operation and then we need to use the IsFiltered measure as a filter on those visuals. The criteria here will be to only show the records where the result is “True”. Remember the DAX calculation from above? If a value was found, then a number higher than 0 is returned and we flag the record as “True”!

Finally, I am going to add the IsFiltered measure as a filter to my table visualization, this is done in the Fields section of the visualizations pane. See the below image:

Step 2_LikeSearch

 

Solution

With the 4 step design completed, the solution is also complete! Now we can apply a filter that does a text search rather than an exact match, see the image below:

Solution_LikeSearch

Summary

By leveraging the DAX expression language we are able to override the default behavior of slicers and filters in Power BI. Please note that this solution would also work in an SSAS Tabular model, the implementation would be slightly altered as you may be consuming that model from excel rather than Power BI but it would work nonetheless!

Resources

Video of the solution:

Completed Example:

http://tinyurl.com/yd2qon7x

SSIS Performance Tuning

This is my much overdue follow-up blog from the presentation I gave to the New York SQL Server User Group. In this post, I am going to provide some additional resources to supplement the presentation, check out the following blog post on different performance tuning techniques that can be used for SSIS.

SSIS Performance Tuning the For Each Loop. Retain Same Connection.

https://mitchellpearson.com/2015/01/12/ssis-performance-tuning-the-for-each-loop-retain-same-connection/

Use File Cache and the Cache Transform to help mitigate network issues.

https://mitchellpearson.com/2014/06/30/use-file-cache-and-the-cache-transform-to-help-mitigate-network-issues/

Create efficient Range Lookups with the Task Factory Advanced Lookup Transform

https://mitchellpearson.com/2014/06/27/create-efficient-range-lookups-with-the-task-factory-advanced-lookup-transform/

Load thousands of files in SSIS lighting fast! MultiFlatFile Connection Manager.

https://mitchellpearson.com/2014/06/27/load-thousands-of-files-in-ssis-lighting-fast-multiflatfile-connection-manager/

Why are Asynchronous components so bad in SSIS?

http://tinyurl.com/ycnvr5ef

Using cache transforms in SSIS to improve Performance (Part 1)

https://mitchellpearson.com/2014/06/20/using-cache-transforms-in-ssis-to-improve-performance-part-1/

Use the aggregate transform in SSIS to out perform T-SQL!

https://mitchellpearson.com/2014/06/22/use-the-aggregate-transform-in-ssis-to-improve-performance-of-t-sql/

Advanced DAX: Problem, Design, Solution

Thanks for watching my latest webinar on DAX. If you missed the webinar you can watch the recording here:

http://pragmaticworks.com/Training/Details/Advanced-DAX-Problem-Design-Solution

I didn’t have a lot of questions from this webinar but I will answer a the ones I did have.

Question: What is the difference between the Find function and the Search function.

Answer: The FIND function is case-sensitive and the search function is case-insensitive. In the webinar I used the FIND function in my first demo but really this could have been replaced with the SEARCH function since I was not concerned with case sensitivity.

Question: Where did you find the School Grade demo that you used in your webinar?

Answer: The data I used in the webinar can be found at the link below:

http://schoolgrades.fldoe.org/reports/#2014

Question: Might there be an upcoming webinar regarding DAX as it pertains to SSAS cubes rather than Power BI?

Answer: I am not aware of any webinars upcoming that will be specific to Tabular Cubes. It is important to point out that the examples used in this webinar are also applicable with Tabular SSAS Cubes since it uses the same backend engine as Power BI Desktop (xVelocity). If you implement these solutions in SSAS Tabular, the consumption of the data would be achieved through excel and then the necessary filters and slicers would be applied there instead of the report view provided in Power BI Desktop.

Thanks again!