Connecting Tableau directly to a data source such as SQL, Excel or a CRM system may seem more efficient, but the results are often not optimal. There are often performance and analytical advantages to instead create an OLAP (Online Analytical Processing) database, such as a Data Mart or Data Warehouse that is specifically designed for reporting and data visualization.
Bringing data into the data warehouse and organizing it using ETL (Extract, Transform, Load) functions is critical in preparing the data for reporting and data visualization. Fundamental to our approach is to separate the ETL for data import and data integration into three distinct operations, or tiers, thus making them independent. This creates a solution that is flexible and scalable to support inevitable reporting enhancements.
The first tier, copies data from the original source systems into a special purpose database designed to bring source data together in a single location. This is called the “landing database” and the data processing that occurs between the source system and the landing database is called “landing ETL”.
Once the data consolidation is complete, the second tier introduces an additional set of transformation scripts that apply business rules to the landed data and makes it suitable for end-user consumption. This is called “integration ETL”. This makes the data ideally suited for reporting and data visualization and may be of a variety of possible structures, including an enterprise data warehouse (EDW), analytical data store, etc. The integration ETL is designed to properly transform the data into the relevant data warehouse architecture.
The third tier creates the BI Tool Data Source, which is the extracted or exported data, or specialized versions of the data created to optimize for performance or to support advanced visuals in Tableau or PowerBI. This allows for data visualization to be done autonomously from the underlying data warehouse or to prepare data for the unique tasks that data visualization and reporting require for analytical purposes.
The rationale of the three-tier approach is to abstract data consolidation from the application of the business rules to the data. This ensures that any changes in source systems, or changes to business rules will not incur excessive effort to integrate. Abstracting the landing and integration layers insulates each layer from changes in the other layers. This greatly reduces the maintenance required to keep a data warehouse up to date with systems or procedural changes within the organization.
Besides minimizing upkeep, another benefit of the 3-tier data architecture is that different business rules can be applied to the same landing data for different end uses. For example, one set of integration ETL can summarize source data from raw transactions up to organizational summaries, while another set of integration ETL can populate a database for data visualization and analytics, and yet another set of ETL can populate a different database for integration with other systems. In all three examples, only one set of landing ETL is required, despite the varying data repositories the data ends up in.
We can help you create the data models, policies, rules, or standards that govern which data is collected, and how it is stored, arranged, integrated, and used for optimal reporting.
|Do you lack the time or technical expertise to incorporate multiple, disparate data sets into Tableau?
For advice or assistance, leave your email.