Posted by Adam Jacobs
August 24, 2018
Points and Polygons – together at last. But how? Read on…
Tableau has made amazing progress on spatial data over the past year or two. In Tableau 9.2, we got the ability to add new maps through Mapbox, which opened up a world of more detailed and creative backgrounds like satellite or live traffic. Another big leap came in Tableau 10.2, when users could connect directly to spatial files. This opened the entire universe of “shapefiles” to users, who could now bring in boundaries and points that were created in GIS software tools.
In Tableau 10.4, shapefile support was updated to allow for not only shapes and points, but also lines; see our blog post that summarizes the new features and shows an example with the road and rail network of Toronto: New-mapping-features-tableau-10-4/
Up to this point, though, Tableau still could not link two shapefiles together. Working with spatial files requires a different type of logic than standard tables in SQL or spreadsheets. Instead of matching tables on a key, we match on geographic traits. There are join types that have no equivalent in tables: one shape might contain another, it might intersect another, it might be closest to another.
(This document from ESRI, the leading GIS software provider, explains the join types in more detail.)
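To make "matching on geographic traits" concrete, here is a minimal sketch of the simplest such test: does a point fall inside a shape? It uses the classic ray-casting approach in plain Python. The function name and the sample coordinates are hypothetical, chosen only for illustration; this is not how Tableau or any GIS package implements the test internally.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order; the ring is closed implicitly


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count the edges crossed by a horizontal ray from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


# Hypothetical data: a square "town" and two "wells"
town = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon((1.0, 1.0), town))  # True
print(point_in_polygon((5.0, 1.0), town))  # False
```

An odd number of crossings means the point is inside the shape. A spatial join applies a test like this to every pair of rows instead of comparing key values.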
Prior to Tableau 2018.2, there was no alternative but to use other tools to perform these linkages. Typically, a GIS suite like ArcMap, MapInfo or QGIS would be used to open the files, perform the spatial join, and then export a new version with the join completed. Such operations could also be done in a database that supports spatial operations, such as Microsoft SQL Server or Oracle, or with a general-purpose programming language like Python using a library like GeoPandas.
However, all these approaches add some friction to the process of advanced analytics in Tableau. Spatial join has been on the Tableau roadmap for a while; it was even demonstrated at the Tableau conference last year. Now, with the latest release (see all the details here), spatial join is a reality.
Right now, spatial join is limited to intersection. More complex spatial joins, like finding the closest point, aren’t available (yet?). Still, this allows us to supercharge our interactive Tableau maps and add new insights.
Here’s some data I’ve taken from the Nova Scotia Open Data portal. Open data sites from the government (at any level – federal, municipal, state/provincial) are a wonderful source, especially for geographic and map data. I’ve selected two datasets: the boundaries of municipalities, and the locations of defunct oil wells in the province.
You can imagine some basic questions that would be asked with this data, like:
Let’s look at the municipal boundaries. These are more complex features; oil wells are just represented as dots, but towns are shapes. When we preview the table in Tableau, we see that each row represents a city/town, and each has a “geometry” column that says polygon (or “multipolygon” – some cities have multiple pieces that aren’t all connected).
The shapefile for oil wells is structured the same way: each row is a well, with an ID, company and location. In the “geometry” column, we have “point” instead of polygon.
To join data, we must have a common field. What field do these files have in common? Only Geometry. But we can’t ask Tableau to join the usual way – a point will never be equal to a shape.
In the new version, the geometries don’t have to be equal. Tableau detects that these are spatial fields and offers a new option in the join: “intersects”.
As with all joins, we can choose left, right, inner and outer. In this case, every well is within a municipality; however, not every municipality has a well. If we put the boundaries in first and select left join, we’ll see all the cities; here we’ve selected inner join, so the cities with no wells won’t be included in the result.
You could imagine a situation where there were wells offshore, in no municipality. In that case, you might want to select right join, or perhaps full outer. Otherwise, the offshore wells would be removed by a left or inner join with municipalities.
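The effect of each join type can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the town and well names are invented, rectangles stand in for real municipal boundaries, and `spatial_join` is a hypothetical helper, not a Tableau function. Municipalities are the left table and wells are the right table, as in the example above.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

# Hypothetical data: rectangles stand in for municipal boundaries.
municipalities: Dict[str, Box] = {
    "Town A": (0, 0, 4, 4),
    "Town B": (5, 0, 9, 4),  # contains no wells
}
wells: Dict[str, Tuple[float, float]] = {
    "W1": (1, 1),
    "W2": (3, 2),
    "W3": (20, 20),  # "offshore": inside no municipality
}


def intersects(box: Box, pt: Tuple[float, float]) -> bool:
    """True if the point lies inside or on the edge of the rectangle."""
    min_x, min_y, max_x, max_y = box
    x, y = pt
    return min_x <= x <= max_x and min_y <= y <= max_y


def spatial_join(how: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Join municipalities (left) to wells (right) on intersection."""
    matches = [
        (m, w)
        for m, box in municipalities.items()
        for w, pt in wells.items()
        if intersects(box, pt)
    ]
    rows: List[Tuple[Optional[str], Optional[str]]] = list(matches)
    if how in ("left", "outer"):
        # Keep municipalities that matched no well.
        matched = {m for m, _ in matches}
        rows += [(m, None) for m in municipalities if m not in matched]
    if how in ("right", "outer"):
        # Keep wells that matched no municipality.
        matched = {w for _, w in matches}
        rows += [(None, w) for w in wells if w not in matched]
    return rows


for how in ("inner", "left", "right", "outer"):
    print(how, spatial_join(how))
```

In this toy data, only the right and outer joins keep the offshore well W3, and only the left and outer joins keep the empty Town B.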
Tableau has had the ability to make “dual axis” maps for some time. This allows us to put dots on top of shapes for the purposes of visualization; however, it does not actually join the two datasets. If we wanted to know the number of wells per municipality, that wasn’t possible; there was no link between the data sources. Furthermore, the dual axis approach only worked if all the data was combined into a single file. In this case, that wouldn’t be possible – the data is split across two files.
But spatial join changes that! To get the number of wells per city, I write a simple formula: COUNTD([Well ID]). (In this case, there are no duplicate well IDs, so I suppose it could be just COUNT([Well ID]), but COUNTD is more reliable).
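The difference between the two aggregations is easy to see outside Tableau. The sketch below mimics COUNT and COUNTD per municipality in plain Python; the rows are invented, and the duplicated record is a hypothetical case added only to show where the two counts diverge.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical output of the spatial join: (municipality, well ID) rows.
# "W2" appears twice to simulate a duplicated source record.
joined_rows: List[Tuple[str, str]] = [
    ("Town A", "W1"),
    ("Town A", "W2"),
    ("Town A", "W2"),
    ("Town C", "W4"),
]


def count_wells(rows: List[Tuple[str, str]]) -> Dict[str, Tuple[int, int]]:
    """Return {municipality: (row count, distinct well count)}."""
    all_ids = defaultdict(list)
    for municipality, well_id in rows:
        all_ids[municipality].append(well_id)
    # len(ids) behaves like COUNT; len(set(ids)) behaves like COUNTD.
    return {m: (len(ids), len(set(ids))) for m, ids in all_ids.items()}


print(count_wells(joined_rows))
# {'Town A': (3, 2), 'Town C': (1, 1)}
```

With clean data the two numbers match, which is why either formula works for this dataset.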
Now we can add the number of wells per municipality to the map:
You can view and download the workbook here to see how this dashboard was created:
Think of all the objects that don’t have addresses: trees, fire hydrants, garages, parking spaces, highway exits, street lamps, mines… Spatial join will allow us to quickly align these with boundaries like city borders, postal codes or custom regions.
Hopefully future versions of Tableau will expand the mapping capabilities even further. Enjoy the new functionality and please reach out to us for any questions about mapping in Tableau.