Posted by Adam Jacobs

November 10, 2018



Tableau Prep – The Fun, Fast, Effective Way to Clean and Prepare Your Data for Reporting

What is Tableau Prep?

Tableau Prep is a desktop software tool that was introduced this summer. It’s a new offering in the Tableau product family, joining Desktop, Server, Online and Reader.

Tableau Prep is a separate software installation, with its own license key. With the new licensing of Tableau, everyone with Tableau Creator (which includes Desktop) also has access to Tableau Prep.

Tableau Prep is a tool to manipulate your data and get it into the desired format. This process could include:

  • Removing rows
  • Removing columns
  • Adding new columns
  • Filtering
  • Grouping values together (misspellings for example)
  • Aggregating data to a higher level – summarizing sales up to the level of store or city to match target or budget data
  • Joining data
  • Unioning data (combining data that has the same format)
  • Removing blanks
  • Splitting fields (e.g. “Name” is split to “First Name” and “Last Name”)
  • Renaming fields and values

All of this and more can be achieved in Tableau Prep, in an easy-to-understand, visually attractive, point-and-click tool.
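
For readers who think in code, a few of the operations listed above (removing blanks, splitting a field, changing a data type and aggregating to a higher level) can be sketched in plain Python. This is only an illustration of what the steps do, not how Tableau Prep implements them; the sample rows and field names are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"Name": "Ann Lee", "City": "Toronto", "Sales": "120.50"},
    {"Name": "Bob Ray", "City": "Toronto", "Sales": "80.00"},
    {"Name": "Cy Dunn", "City": "Ottawa", "Sales": ""},
    {"Name": "Di Fox", "City": "Ottawa", "Sales": "45.25"},
]


def clean(rows):
    """Remove blanks, split "Name" into two fields, and convert Sales to a number."""
    cleaned = []
    for row in rows:
        if not row["Sales"].strip():
            continue  # remove rows with a blank Sales value
        first, _, last = row["Name"].partition(" ")  # split on the first space
        cleaned.append({
            "First Name": first,
            "Last Name": last,
            "City": row["City"],
            "Sales": float(row["Sales"]),  # change the data type from text to number
        })
    return cleaned


def aggregate_by_city(rows):
    """Summarize Sales up to the City level."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["City"]] += row["Sales"]
    return dict(totals)


cleaned = clean(rows)
city_totals = aggregate_by_city(cleaned)
print(city_totals)
```

In Prep, each of these functions would be a step in the visual flow rather than code you write.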

Tableau Desktop is the white icon on the left. Tableau Prep is the blue icon in the centre.

Why Tableau Prep?

You’ve heard it repeatedly, for years – prepping and cleaning the data takes far longer than the actual dashboard design or reporting project. Everyone needs a tool to prepare, cleanse, transform and manipulate their data. Often, this is done in cumbersome ways inside Excel, but Tableau Prep offers a much easier and more scalable solution.

There are many competing tools to cleanse and prepare data. Leading vendors include Alteryx, Talend and Informatica, and there are also tools tied to a specific environment, like SQL Server Integration Services (Microsoft), Business Objects Designer (SAP) or InfoSphere (IBM). What makes Tableau Prep unique?

First, integration with the Tableau environment. Tableau Prep outputs Tableau extracts by default and contains an option to publish them directly to Tableau Server.

Second, Prep maintains the clean look and feel that characterizes Tableau. Prep offers a very simple suite of tools to manipulate your data. At any point, there are only 7 possible “steps” that can be added to a workflow. Some competing tools offer hundreds of different operators to flatten hierarchies, geocode addresses, perform complex filters and call APIs. Like Tableau Desktop itself, Prep opts for a more minimal approach.

Within those steps, though, there is considerable complexity. A “clean” step can filter, add new columns, remove rows or columns, group values together and perform many other operations. For example, a single clean step can group, filter, change data types, remove fields, trim spaces and add new columns.

Third, Tableau Prep leverages the Tableau Data Engine to run extremely quickly. In our in-office testing, Tableau Prep was able to reshape and write 1.4 billion rows of data locally in under half an hour – a very impressive result.

Tableau Prep – Killer Features

Some of these operations can be achieved in Tableau Desktop, such as creating calculated fields. However, some tools are specific to Prep, such as the fuzzy matching that can be performed on text. Suppose we had our “greatest artists of all time” data table, but it contained some messy data.

Using Tableau Prep’s “group by pronunciation,” anything that would sound the same gets rolled into a single category. Silent Es and doubled letters are easily combined, which would be difficult to achieve with formulas.
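
To give a feel for how grouping by sound can work, here is a minimal sketch that uses the classic Soundex code as a stand-in. It is not necessarily the algorithm Prep uses internally, and the artist spellings are hypothetical. The idea is that each value is reduced to a short phonetic key, and values that share a key fall into the same group.

```python
from collections import defaultdict

# Classic Soundex consonant codes; vowels and y get no code.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the 4-character Soundex key for a word (stand-in phonetic key)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(ch, "")
        if code and code != previous:
            key += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (key + "000")[:4]  # pad or truncate to four characters


def group_by_sound(values):
    """Group values whose phonetic keys match."""
    groups = defaultdict(list)
    for value in values:
        groups[soundex(value)].append(value)
    return dict(groups)


# Hypothetical messy spellings.
groups = group_by_sound(["Madonna", "Madona", "Prince", "Prinse"])
print(groups)
```

Soundex is deliberately crude, so unrelated names can occasionally share a key. That is one reason a tool like Prep shows you the proposed groups for review instead of applying them blindly.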

What about the entries “Bieber, Justin” vs “Justin Bieber”? Thanks to Prep’s “Group – Common Characters” feature, we can combine these into a single entity. Again, this would be extremely difficult to achieve with formulas – even Tableau’s powerful Regular Expression formulas wouldn’t provide an easy solution to this.
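
One common way to match values by shared characters is a character fingerprint: lowercase the text, drop punctuation and whitespace, then keep the sorted set of distinct characters. The sketch below assumes that approach with single characters (n = 1); the function name and the n parameter are illustrative, and Prep's own implementation may differ.

```python
def char_fingerprint(value, n=1):
    """Build a key from the sorted, de-duplicated character n-grams of a value."""
    # Keep only letters and digits so punctuation and spacing are ignored.
    chars = "".join(ch for ch in value.lower() if ch.isalnum())
    grams = {chars[i:i + n] for i in range(len(chars) - n + 1)}
    return "".join(sorted(grams))


key_a = char_fingerprint("Bieber, Justin")
key_b = char_fingerprint("Justin Bieber")
print(key_a, key_b, key_a == key_b)
```

Because word order and the comma no longer matter, both entries produce the same key and can be grouped. The trade-off is that any two values built from the same set of characters will also collide, so the suggested groups still deserve a quick look.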

Previously, all of this needed to be cleaned by hand.

Also, Tableau Desktop has certain limitations on the order in which operations can happen. If you split a field, it cannot be reshaped (pivoted) afterwards. In Tableau Prep, there are no such limitations. Multiple sources can be reshaped, then joined, then unioned, then joined again.
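
As a rough picture of that kind of flow, the sketch below reshapes two sources (columns to rows), joins each to a lookup table and then unions the results. It is plain Python for illustration only; the tables, field names and helper functions are hypothetical and say nothing about how Prep works internally.

```python
def pivot_columns_to_rows(rows, id_field, value_fields, name_field, value_field):
    """Turn one row with several value columns into one row per value column."""
    out = []
    for row in rows:
        for field in value_fields:
            out.append({id_field: row[id_field],
                        name_field: field,
                        value_field: row[field]})
    return out


def inner_join(left, right, key):
    """Keep left rows that have a match in right, adding the right-hand fields."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def union(*tables):
    """Stack tables that share the same layout."""
    return [row for table in tables for row in table]


# Hypothetical sources.
sales_2017 = [{"Store": "A", "Q1": 10, "Q2": 12}]
sales_2018 = [{"Store": "A", "Q1": 11, "Q2": 15}, {"Store": "B", "Q1": 7, "Q2": 9}]
stores = [{"Store": "A", "City": "Toronto"}, {"Store": "B", "City": "Ottawa"}]

long_2017 = pivot_columns_to_rows(sales_2017, "Store", ["Q1", "Q2"], "Quarter", "Sales")
long_2018 = pivot_columns_to_rows(sales_2018, "Store", ["Q1", "Q2"], "Quarter", "Sales")

# Reshape first, then join, then union: the order is up to you.
result = union(inner_join(long_2017, stores, "Store"),
               inner_join(long_2018, stores, "Store"))
print(len(result))
```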

Tableau Prep – What’s Next?

Tableau Prep is still a very young tool. Recent updates have included the PDF data connector (one of my favorite tools in Desktop) and the software will continue to add new functionality, including the possibility of new steps (e.g. rows to columns instead of only columns to rows).

There are a few limitations of Prep to be aware of, although many will probably be addressed in future releases:

  • Prep lacks the full range of connectors that are available in Desktop. Prep connects to many common data sources like Oracle, MySQL and DB2. However, many of the convenient web-app integrations from Desktop are missing: Google Sheets, Salesforce, Dropbox, SharePoint Lists, QuickBooks and so on.
  • Prep doesn’t handle spatial files. Tableau Desktop has made great strides in handling spatial files, supporting shapefiles, KMLs, geodatabases and recently adding spatial joins. None of this is available in Prep yet. Prep doesn’t even support the geographic roles that are a standard part of Desktop.
  • Prep doesn’t write back to databases. Some data prep tools allow you to write the results of the process directly to a database as a table. This isn’t available in Tableau Prep, which currently outputs only Tableau extracts or text files.
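
If you do need the results in a database today, one possible workaround is to have Prep write a text (CSV) file and load it with a small script afterwards. The sketch below assumes a CSV with a header row and uses Python's built-in csv and sqlite3 modules. The table name, column names and sample data are hypothetical, and a production database would need its own driver and proper column types.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_file, connection, table):
    """Create a table from the CSV header and insert every data row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote identifiers so column names with spaces still work.
    columns = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    quoted_table = '"' + table.replace('"', '""') + '"'
    connection.execute(f"CREATE TABLE {quoted_table} ({columns})")
    connection.executemany(
        f"INSERT INTO {quoted_table} VALUES ({placeholders})", reader)
    connection.commit()


# Stand-in for a CSV file written by a Prep output step (hypothetical data).
prep_output = io.StringIO("Artist,Plays\nPrince,120\nMadonna,95\n")

connection = sqlite3.connect(":memory:")
load_csv_into_sqlite(prep_output, connection, "artists")
rows = connection.execute(
    'SELECT "Artist", "Plays" FROM artists ORDER BY "Artist"').fetchall()
print(rows)
```

Note that every value arrives as text in this sketch, since CSV files carry no type information.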

Tableau Prep offers a flexible, fast and fun way to prep your data for presentation. Like all Tableau products, there are plenty of videos explaining the process and offering walkthroughs on the Tableau website.

(Note the “Tableau Prep” section is the 2nd one after “Getting Started”)

Additionally, Unilytics offers 2-day training courses on Tableau Prep delivered onsite or at our Toronto facility. Please contact us if you would like to drastically reduce your data cleansing time and create scalable solutions for your data.
