Connection Management: the Datasource Pane and Data Tab
The Datasource pane is essential for creating reports in Tableau. This section looks at the Datasource pane and the Data tab, both of which are likely to appear in the exam. Each offers options that can change the data connection type or modify the data coming into the report for analysis, for example by setting a field to a more appropriate data type.
The Datasource Window
This window allows users to view, edit, add, remove, and combine data sources present in a workbook. The window is generally structured as follows:
- Metadata grid
- Results pane
The number of records is displayed along the top of the Results pane. If the data contains fewer than 1,000 records, all of them are displayed; otherwise, the display is limited to a sample of 1,000 rows.
Some key features are worth identifying in this window, as listed after the figure:
Figure 1.23: Tableau Data Source page
Refresh button:
Add connection:
Relate/join tables:
Union tables:
Extend tables:
Connection type:
Data source filters:
Results pane settings cog:
The Data Tab
When at least one data connection is set within a workbook, the Data tab at the top of the screen contains multiple options for managing sources. (This tab is always accessible, regardless of which tab or sheet in the workbook the user is currently on.)
Options in this tab are as follows:
Figure 1.24: Data tab options
Take some time to go over these options and explore what happens. It is good to familiarize yourself with what these options mean and how they work in a workbook setting.
Making New Connections in Existing Workbooks
Several data sources can be used in a single Tableau workbook to suit specific analytic requirements. They can be used separately or combined through methods such as joins, unions, relationships, and blends.
Joins use a key identifier or linking field to join two tables together, while unions stack data tables on top of each other.
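The difference between the two can be sketched outside Tableau in a few lines of Python. This is only an illustration of the concepts, not how Tableau implements them; the table contents and the helper names `inner_join` and `union` are hypothetical.

```python
# Hypothetical tables, each represented as a list of dictionaries (one per row).
orders = [
    {"order_id": 1, "customer_id": "C1", "sales": 120.0},
    {"order_id": 2, "customer_id": "C2", "sales": 75.5},
    {"order_id": 3, "customer_id": "C1", "sales": 40.0},
]
customers = [
    {"customer_id": "C1", "region": "East"},
    {"customer_id": "C2", "region": "West"},
]
orders_next_month = [
    {"order_id": 4, "customer_id": "C2", "sales": 60.0},
]


def inner_join(left, right, key):
    """Match rows from two tables on a shared key field (adds columns)."""
    lookup = {}
    for row in right:
        lookup.setdefault(row[key], []).append(row)
    # Keep only left rows that have at least one match on the right.
    return [{**l, **r} for l in left for r in lookup.get(l[key], [])]


def union(*tables):
    """Stack tables that share the same columns on top of each other (adds rows)."""
    return [dict(row) for table in tables for row in table]


joined = inner_join(orders, customers, "customer_id")
stacked = union(orders, orders_next_month)

print(len(joined), sorted(joined[0]))  # same row count as orders, extra column
print(len(stacked))                    # more rows, same columns
```

A join widens the table by bringing in the `region` column, while a union lengthens it by appending the next month's rows.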
Relationships are a newer, more flexible model for connecting two or more tables. Blends allow the user to establish a connection between a detailed table and an aggregated table.
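As a rough illustration of the blending idea, the sketch below aggregates a detailed table to the level of a linking field and then attaches the result to a second table. It assumes a simplified view of blending; the data and the `blend` helper are hypothetical and do not represent Tableau's internals.

```python
from collections import defaultdict

# Hypothetical primary table: one row per region.
targets = [
    {"region": "East", "target": 150.0},
    {"region": "West", "target": 100.0},
    {"region": "North", "target": 80.0},
]
# Hypothetical secondary table: detailed, transaction-level rows.
sales = [
    {"region": "East", "sales": 120.0},
    {"region": "East", "sales": 40.0},
    {"region": "West", "sales": 75.5},
]


def blend(primary, secondary, link, measure):
    """Aggregate the secondary table per linking value, then attach to primary rows."""
    totals = defaultdict(float)
    for row in secondary:
        totals[row[link]] += row[measure]
    # Primary rows with no match in the secondary table receive None.
    return [{**row, measure: totals.get(row[link])} for row in primary]


blended = blend(targets, sales, link="region", measure="sales")
for row in blended:
    print(row)
```

Every row of the primary table is kept, and the detailed rows are summarized before they are combined, so the row count of the primary table does not change.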
These connections will be explained later in this book.
Replacing Data Sources
There may be occasions when the original data source is outdated, or has been improved and moved to a new location. If a visualization has already been built on a data source, a user can replace that source by first adding the new data source to the workbook.
Once this has been done, right-click on the old data source and select Replace Data Source.
There will be an option to select the current data source and then the replacement. Once this is done, the workbook should automatically update to use the new data source. If the data has the same structure as before, perhaps with a new column added, the replacement should go smoothly and all charts will carry over to the new source.
However, it is important to note that if a field has been changed, or a field that was used is no longer in the data, the replacement can break the workbook.
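The risk described above comes down to a comparison of field names. The following minimal sketch, using hypothetical field lists and a hypothetical `missing_fields` helper, shows the kind of check a user could make by hand before replacing a source.

```python
def missing_fields(used_fields, replacement_fields):
    """Return the fields the workbook uses that the replacement source lacks."""
    return sorted(set(used_fields) - set(replacement_fields))


# Hypothetical fields used by the workbook's charts.
used_in_workbook = ["Order Date", "Region", "Sales"]

# Replacement with the same fields plus a new column: nothing is missing.
new_source_extended = ["Order Date", "Region", "Sales", "Discount"]

# Replacement where "Region" was renamed: charts using it would break.
new_source_renamed = ["Order Date", "Territory", "Sales"]

print(missing_fields(used_in_workbook, new_source_extended))
print(missing_fields(used_in_workbook, new_source_renamed))
```

An empty result suggests the replacement should go smoothly; any field listed is one that the existing charts would no longer be able to find.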
Notes, Caveats, and Unsupported Data Sources
Uncommon connectors may require a few additional steps to establish a connection.
In some cases, a specific driver needs downloading that does not come pre-packaged with Tableau Desktop; for example, for a connection to SAP HANA (a database), a user will need to have a driver installed, as shown here:
Figure 1.25: SAP HANA connection error
Drivers convert information between each end of the connection so applications can speak to each other; they can be thought of as translators at each end of the connection.
For caveats on multidimensional sources such as cubes, please see the previous sections regarding database connections.
Web-Based Data Access: Drivers and APIs
Servers such as those described previously are made accessible through drivers; as discussed, some of these come pre-packaged with the Tableau software when it is downloaded, and others must be downloaded by the user. ODBC and JDBC drivers let a user access a database system using SQL. Tableau can also connect directly to data on the internet that is exposed through Application Programming Interfaces (APIs) rather than through such drivers; this is done using Web Data Connectors. The details of Web Data Connectors are not required for the exam.
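Conceptually, reading data from an API means turning a structured response (often JSON) into flat rows and columns that an analysis tool can use. The sketch below illustrates that step only; it is not the Web Data Connector API, and the payload, field names, and `to_rows` helper are all hypothetical.

```python
import json

# Hypothetical JSON payload standing in for a response from a web API.
payload = """
{"results": [
  {"id": 1, "name": "Widget", "price": {"amount": 9.99, "currency": "USD"}},
  {"id": 2, "name": "Gadget", "price": {"amount": 24.5, "currency": "EUR"}}
]}
"""


def to_rows(raw):
    """Flatten nested JSON records into table-like rows (one dict per row)."""
    records = json.loads(raw)["results"]
    return [
        {
            "id": record["id"],
            "name": record["name"],
            "price_amount": record["price"]["amount"],
            "price_currency": record["price"]["currency"],
        }
        for record in records
    ]


rows = to_rows(payload)
for row in rows:
    print(row)
```

Unlike a SQL query sent through a driver, the shape of the data here is decided by the API, so the nested `price` object has to be flattened into separate columns.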
Multidimensional Systems
There are also multidimensional systems, which are structured and queried using different technologies from relational databases. The technical details of these data source types are not expected for the exam, but it is useful to have an awareness of them for comparison against relational sources.
Data in multidimensional systems is pre-aggregated with particular business questions in mind. For example, the developer of an OLAP cube (a common type of multidimensional system) may create a summary view of profits at a country level, rather than having a user query billions of records at the transaction level to calculate this higher-level value outside of the source. As the user can query these aggregations directly with algorithms optimized to do so, multidimensional databases are much quicker to query. However, they are far less flexible to use than relational sources in Tableau: useful features such as data source filters, level of detail calculations and forecasting are not available when using cube sources.
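The trade-off described above can be illustrated with a small sketch. It assumes a toy set of transactions and a hypothetical `build_summary` helper; real OLAP cubes are far more sophisticated, but the principle of computing a summary once and querying it directly is the same.

```python
from collections import defaultdict

# Hypothetical transaction-level data (a real source could hold billions of rows).
transactions = [
    {"country": "France", "profit": 10.0},
    {"country": "France", "profit": 5.5},
    {"country": "Japan", "profit": 7.0},
]


def build_summary(rows):
    """Pre-aggregate profit per country, as a cube developer might do up front."""
    summary = defaultdict(float)
    for row in rows:
        summary[row["country"]] += row["profit"]
    return dict(summary)


# Computed once, ahead of time.
profit_by_country = build_summary(transactions)

# Later queries read the stored aggregate instead of rescanning every transaction.
france_profit = profit_by_country["France"]
print(france_profit)
```

The lookup is fast because the work was done in advance, but the summary can only answer the question it was built for: a breakdown by city, for example, is not recoverable from `profit_by_country` alone.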
Only the following multidimensional sources are supported in Tableau at the time of writing:
Figure 1.26: Supported sources available in Tableau