Using R to unzip and process files into Power BI

Posted by

There are a lot of blogs and videos out there on how to use R to enhance the visualization layer of Power BI. In this small series, I want to focus on how R can be used to improve the data importing and data cleansing capabilities of Power BI. Previously, I wrote how you could use the R language in Power BI to export data out to a CSV file. You can view that blog by following this link.

Update: Video embedded at bottom of blog post.

Although, I personally am not an expert when it comes to R, I’m learning more each day, it’s one of the topics I enjoy teaching the most in our Advanced Power BI Boot Camp.

Now it’s time for the required Pirate Picture: ARRRRRRRR. By the way, this image is from a tabletop game I like playing called Merchants & Marauders!!

Image result for pirate arrrrr

Getting Started with R in Power BI

To work with R you will need to have R installed. I use Microsoft R open, link provided below. You can optionally download R studio as well. Both of these are free downloads.

Required: Download the Microsoft R Open

Optional: Download R Studio

Discloser: At the time of this blog post, there is no built-in intellisense in Power BI to help write and debug R. Therefore, I use R Studio to write and test my code! R Studio is not a requirement, it is, in my experience, just very helpful.

UNZ and READ.CSV in R

The problem I want to solve in this blog post is how to unzip a file with Power BI and then process that file. Because I come from an enterprise BI background my first thought is to approach this problem with a tool like SSIS or PowerShell. However, there is an easier way, why else would we be here! Smile

I want to show you how you can use R to unzip and then extract data from a file all in one step. By the way, it’s really, really easy!

Let’s start off by looking at the code, this is all of it!:

image

  1. On line one, we are declaring a variable called data, this variable will store the results of our expression.
  2. On line two we are using read.csv to extract the results from the csv file, Customer Sales, and these results are stored in the variable from step one, please note that this process can not occur until after the file has been unzipped.
  3. On lines three and four we are using UNZ to unzip the file Customer Sale.zip and then return the file Customer Sales.csv.

And that’s it!

Wait, wait, wait…. What is line seven? Well, line seven allows me to test the results in R Studio. This is not needed for the importing data into Power BI Desktop. Oh, that reminds me, we need to import the data into Power BI Desktop!

Importing data using R in Power BI Desktop

As you are already aware, there are a lot of data sources available in Power BI, it shouldn’t surprise you that one of those data sources is R. Now let’s walk through how we can quickly import data from an R script into Power BI Desktop.

From the home ribbon in Power BI, click Get Data –> Other –> R Script. See screenshot below:

image

This will launch the R Script editor. As I previously mentioned, this script editor is simply a text box. Now we simply paste in our code from R studio, or if you happen to be an R aficionado you can just type your code in this box, as seen below:

image

Once we click ok, the Navigator window will open. Since we only declared one variable in our script, only one option will be provided. From here, everything is exactly the same as any other data source.

4_Navigator Window

Ok, now we’re done! I hope you enjoyed this blog post and I look forward to writing more, until next time.

2 comments

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s