Data Integration – Frequently Asked Questions

Everything you need to know about connecting Tableau to your data and cloud solutions

Confused by data integration? Don’t be. Tableau experts reveal all about data connection and integration

Data integration may seem complex, but Tableau gives you a range of options for deploying and connecting securely to your data sources. We’ve answered your most frequently asked questions about data integration, native data connectors, and third-party partners to help you connect, integrate, and extend the functionality of your Tableau platform.

We’ve invested a lot in developing our partner relationships to give you great functionality however you choose to connect to Tableau.

Learn how to:

  • Connect to all of your data – no matter where it resides
  • Move from on-premises to cloud
  • Choose between live data or extracts
  • Integrate with a variety of third-party tools

Amazon Web Services

My strategic cloud platform is AWS and I want to move my on-premises Tableau Server instance to the cloud. What are my deployment options and how do I know which is right for me?

There are two options. You can deploy Tableau Server to an AWS EC2 cluster within your own virtual private cloud (VPC) and manage it yourself, or you can use Tableau Cloud. 

There’s no difference in the user experience or functionality between these options, but Tableau Cloud will eliminate any administrative overheads of deployment, scaling, backup and recovery, and upgrades.

Running your own Tableau Server cluster will give you greater control over any integrations and is better if you have restrictive security policies requiring analytics to be run from your VPC, or if your data sources are not cloud services.

What are the deployment best practices for hosting Tableau Server in our AWS VPC?

Together with AWS, we defined a Quickstart template for Tableau on EC2, and this is the simplest way to get started. This reference deployment uses AWS CloudFormation templates to provide a starting point for either a single-node deployment or a resilient multi-node deployment integrated with cloud load balancing.

Customers with larger Tableau Server environments will typically use Tableau Advanced Management to move the following services off the server instance and onto external cloud services. This simplifies management, backup, and recovery, and improves scalability:

  • Keys used for Extract Encryption at Rest can be managed using AWS KMS instead of the local KMS
  • The Tableau metadata repository can be moved to an external AWS RDS Postgres instance
  • An external file store can be used to store extracts. This allows you to scale storage capacity independently of analytics processing power, reduces the processing overhead on Tableau Server required to maintain multiple copies, and enables you to use cloud snapshots for backup and recovery.

What AWS data services can Tableau connect to and what are the typical use cases for each Amazon service?

Tableau can connect live and via extracts to most AWS data services and third-party services that our customers typically run in AWS.

If the data source you would like to connect to is not currently listed here then check the connector gallery for third-party developed connectors. Alternatively, consider using the generic JDBC connector for structured sources or the Web Data Connector if you need to pull data from a web service that exposes a REST API.

Amazon services can be grouped into three broad buckets of use cases:

  • Cloud data warehouses like Redshift

Redshift scales well for both throughput and concurrency. Use live connectivity from Tableau whenever possible to offload processing from your Tableau environment into the cloud data warehouse.

  • Data lake centric services like Athena and occasionally EMR Presto

These are best for data exploration when you need relatively infrequent access to large volumes of data hosted in S3. They scale well for high analytical throughput but are not designed for high concurrency. Most users connect live for data exploration but use extracts for specific secondary data sets.

  • Relational database services like Aurora and RDS

These are primarily used for transaction processing, and analytics will typically be a secondary function. Tableau can query these live for data exploration but most customers use extracts to ensure large analytical queries don’t impact performance of primary transaction processing use cases.
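
As a rough illustration, the guidance for the three buckets above can be captured in a small lookup helper. This is only a sketch: the category labels, the `Recommendation` type, and the `recommend_connection` function are hypothetical names invented for this example and are not part of any Tableau or AWS API.

```python
from dataclasses import dataclass

# Hypothetical category labels for this sketch (not Tableau or AWS identifiers).
WAREHOUSE = "cloud_data_warehouse"      # e.g. Redshift
DATA_LAKE = "data_lake_query_service"   # e.g. Athena, EMR Presto
TRANSACTIONAL = "relational_database"   # e.g. Aurora, RDS


@dataclass(frozen=True)
class Recommendation:
    mode: str    # "live" or "extract"
    reason: str


def recommend_connection(category: str, exploring: bool = False) -> Recommendation:
    """Suggest a starting connection mode for one of the three source buckets."""
    if category == WAREHOUSE:
        # Warehouses scale for throughput and concurrency, so push the work down.
        return Recommendation("live", "offload processing to the cloud data warehouse")
    if category == DATA_LAKE:
        if exploring:
            return Recommendation("live", "ad hoc exploration of large volumes in S3")
        return Recommendation("extract", "secondary data set on a service not designed for high concurrency")
    if category == TRANSACTIONAL:
        if exploring:
            return Recommendation("live", "occasional data exploration")
        return Recommendation("extract", "keep large analytical queries off the transaction workload")
    raise ValueError(f"unknown source category: {category!r}")


if __name__ == "__main__":
    for bucket in (WAREHOUSE, DATA_LAKE, TRANSACTIONAL):
        print(bucket, "->", recommend_connection(bucket))
```

Treat the result as a starting point only; data volumes, refresh needs, and query costs in your own environment may point to a different choice.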

Snowflake

What makes the partnership between Tableau and Snowflake so unique?

As a leading data analytics company, we pride ourselves on making smart investments to stay ahead of the curve. While we offer the best data visualisation capabilities around, Snowflake’s enterprise cloud data warehouse is an excellent place to store your data. We partner with Snowflake and actively invest in research and development so our customers can get the best from both technologies.

When you pair Tableau with Snowflake, you can take advantage of some great features, such as enhanced security around roles and permissions, different sized data warehouses for different use cases and roles, and the use of external tables.

We’ve also developed dashboards together so you can monitor your Snowflake environment and keep track of compute costs, performance, and user adoption.

What’s the benefit of using Tableau on top of my Snowflake enterprise cloud data warehouse?

With Tableau, you can put the power of data in your users’ hands, and it works really well with Snowflake. Tableau allows you to visualise data so that even non-technical users can analyse it and gain insights quickly across a variety of use cases. When you start to make the shift to a more data-driven culture, user adoption will go through the roof.

That’s where Snowflake comes in. It’s scalable, user friendly, and can handle a high volume of concurrent users. Features such as automatic Scale Out of compute resources or Scale Up for bigger compute nodes mean you benefit from virtually unlimited storage, and can scale your data infrastructure up or down quickly in line with demand.

Can I connect semi-structured data stored in Snowflake to Tableau?

Yes. Snowflake supports structured and semi-structured data in multiple formats such as JSON, XML, Parquet, Avro, and ORC, and lets you join multiple data sets together to achieve better performance. It does this by storing semi-structured data in a data type called VARIANT.

Tableau can read and visualise VARIANT data in dashboards through Custom SQL or Initial SQL, making Snowflake and Tableau a powerful combination. Tableau can also access Snowflake features such as Time Travel and other table functions in the same way.

Snowflake can handle large volumes of structured and semi-structured data very quickly and without impacting performance, making it one of the most scalable solutions around.
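
To make the Custom SQL route more concrete, here is a minimal sketch that generates a query flattening a JSON array held in a VARIANT column, using Snowflake's path notation and `LATERAL FLATTEN`. The table name `raw_orders`, the column name `payload`, the JSON field names, and the helper `variant_custom_sql` are all assumptions made for illustration; the resulting text is what you would paste into a Custom SQL dialog.

```python
import re

# Only allow simple identifiers so the helper cannot be used to inject SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def variant_custom_sql(table: str, variant_column: str) -> str:
    """Return Custom SQL that flattens a JSON array held in a VARIANT column.

    Assumes (hypothetically) that each VARIANT value looks like
    {"order_id": "...", "items": [{"sku": "...", "quantity": 1}, ...]}.
    """
    for name in (table, variant_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple SQL identifier: {name!r}")
    return (
        "SELECT\n"
        f"    src.{variant_column}:order_id::STRING AS order_id,\n"
        "    item.value:sku::STRING AS sku,\n"
        "    item.value:quantity::NUMBER AS quantity\n"
        f"FROM {table} src,\n"
        f"    LATERAL FLATTEN(input => src.{variant_column}:items) item"
    )


if __name__ == "__main__":
    print(variant_custom_sql("raw_orders", "payload"))
```

Casting each path to a concrete type (`::STRING`, `::NUMBER`) gives Tableau ordinary typed columns to work with, one row per array element.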

Databricks

We have huge volumes of data in Databricks. How can I prepare it for use in Tableau?

Databricks is a great place to store vast quantities of data, but it’s not a database, so you may be wondering how to tap into that data lake to make it actionable. The solution’s Lakehouse architecture merges the best of data lakes and data warehouses into a single, secure platform.

The Lakehouse is where the magic happens. The simple user interface looks just like a file system and makes all your structured, unstructured, and semi-structured data accessible. It uses clusters to process that data, and one of the great things about cloud is that you can scale your clusters to whatever size you need to get your data ready for analysis.

How can I connect to Databricks data in Tableau?

Connecting Databricks data in Tableau is simple when you use the native Tableau Databricks connector. All you need to do is find the cluster endpoint in your Databricks Lakehouse and add the endpoint name in Tableau to connect it. Hit the ‘check database’ button and search for tables. Data can be pulled into the same interface from either the data warehouse or the data lake.

If you’re connecting large data sets it’s better to use a live connection than extracts, especially if your source has fresh data coming in every day – if you’re streaming or batch processing data for example.

Can I make Databricks data actionable for both business users and data analysts?

Yes, Databricks is really good at making data accessible for business users, while giving your data analysts and scientists the level of functionality they’d expect from a business intelligence platform.

The user-friendly Data Lakehouse makes all of your business data actionable and accessible quickly, but combining Databricks and Tableau can really help you take visualisation to the next level so even your least tech-savvy users can self-serve and start pulling their own reports. It’s also super easy to schedule Tableau to refresh the data behind existing visualisations, so your users can see the latest figures without having to manually pull a new report every time.

MuleSoft

How do MuleSoft and Tableau fit together in Salesforce's Customer 360?

MuleSoft is the glue holding together all your applications. Leveraging re-usable APIs, pre-built connectors, and integration templates, you can free data from across your whole enterprise and deliver it in near real-time for analysis in Tableau.

Tap into that fully connected ecosystem of applications, databases, and backend systems like SAP, Workday, Oracle, or custom mainframe applications to get a single source of the truth for analysis. Tableau also makes data sharing and self-service analytics simple. No more black boxes and silos – you can truly analyse all your data and empower anyone on your team to make insight-based decisions and share data across the whole organisation.

How can MuleSoft help me to unlock all my data for fast analytics in Tableau?

MuleSoft creates a single source of truth. There are several ways to make data available for analytics in Tableau.

Direct ways to provide data:

  • Hyper API allows you to build Hyper files through MuleSoft and publish them directly to Tableau Server or Tableau Cloud
  • Web Data Connector can call any API exposed via MuleSoft
  • MuleSoft Composer allows you to connect apps and data natively in the Salesforce UI and build a no-code data pipeline

Indirect/layered ways to make data available:

  • Unlock 360-degree visibility by using MuleSoft to unify your data in Salesforce. You can then leverage native Tableau connectors to unlock insights
  • MuleSoft can write data to the data warehouse of your choice. Using specialised Tableau connectors, you can connect to those data warehouses out of the box

How can I use MuleSoft to transform insights into action?

MuleSoft and Tableau help you to access more data and make it available for analysis. Data is displayed in interactive, user-friendly dashboards that make it easy to spot trends and anomalies. Leveraging MuleSoft APIs that you can call directly from Tableau, you can take action when you identify an insight on your dashboard, all without switching apps.

For example, by automating next steps, workflows, and events, you can create an opportunity in your CRM platform by simply clicking on the relevant data point on your dashboard, or you can close or escalate a support case that has been lost in your ticketing system in one click by identifying outliers on a scatter graph.
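
One way a dashboard could trigger such an API is by opening a parameterised URL for the selected data point, for example through a URL action. The sketch below only shows how such a URL might be assembled. The endpoint `https://api.example.com/support/cases/escalate` and the parameter names `caseId` and `priority` are hypothetical; a real integration would use whatever API you expose through MuleSoft, along with its authentication.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, standing in for an API exposed through MuleSoft.
ESCALATE_CASE_ENDPOINT = "https://api.example.com/support/cases/escalate"


def build_action_url(endpoint: str, case_id: str, priority: str) -> str:
    """Build the URL a dashboard action could open for the selected mark."""
    # urlencode escapes spaces and reserved characters in the field values.
    query = urlencode({"caseId": case_id, "priority": priority})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    print(build_action_url(ESCALATE_CASE_ENDPOINT, "00012345", "high"))
```

Encoding the values matters because field values taken from a dashboard can contain spaces or characters that are reserved in URLs.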

Oracle

How can I access an Oracle database from Tableau? How can I get the best of both technologies?

To access an Oracle database, you need to deploy the Oracle JDBC driver (https://help.tableau.com/current/pro/desktop/en-gb/examples_oracle.htm) into Tableau and create the connection. You can then set up the data source as a live connection or as an extract, and access the data in schemas and tables.

Live data is queried directly against your Oracle database and returns the data for use in Tableau. This data will always be as up-to-date as it would be in Oracle.

Extracts are snapshots of data optimised for aggregation and loaded into Tableau to be recalled for visualisation. Tableau’s engine runs the queries so Oracle isn’t impacted, but you will need to schedule when your extracts refresh.

How do I connect to Oracle Exadata?

Tableau and Oracle Exadata combine the best products on the market to give you the highest possible performance.

You don’t need to do any additional configuration to connect to Oracle Exadata. Tableau will connect in exactly the same way as any other Oracle database, and you can connect using either live data or extracts depending on your requirements.

Oracle Exadata is the highest performing database available on the market, and Tableau makes the most of its performance capability by running analytical queries directly on the Oracle Exadata engine.

Can I connect to Oracle Autonomous Data Warehouse (ADW)?

Yes, you can connect Tableau to Oracle ADW. If you’re working with Tableau Cloud (https://www.oracle.com/a/ocom/docs/tableauonline6-2_adw.pdf), you need to take into account some networking requirements. As Oracle ADW only runs on Oracle Cloud Infrastructure (OCI), you’ll need an Oracle Compute machine in your OCI with Oracle Client library installed.

You can then deploy and configure Tableau Bridge (https://www.tableau.com/products/tableau-bridge), which communicates with Tableau Cloud from behind your OCI firewall and can handle both scheduled extract refreshes and live queries.

With Tableau Bridge in place, you’ll have a secure connection to Oracle ADW.

SAP

Historically, integrating SAP with third-party tools has been challenging. How well does Tableau integrate with SAP?

At Tableau, we understand the importance of integrating seamlessly with SAP data. Our products are HANA-certified, and we continuously provide new enablement assets relating to SAP connection to make sure we can meet your needs today and as they evolve.

We regularly capture feedback from customers on desired functionality and collaborate closely with SAP product management to bring these requirements to life. This makes a real difference when it comes to hierarchy support or data source performance. After all, SAP data sources play a major role in our customers’ success, and we aim to deliver the best possible user experience.

SAP offers myriad products from HR business applications such as SuccessFactors all the way to data warehousing with SAP BW. Which products does Tableau connect to?

There are several connectors that allow you to connect to SAP products. The two most used connectors are the SAP HANA and SAP Business Warehouse connectors. Both allow you to connect live or via extracts. When connecting live, you have the benefit of leveraging hierarchy support or SAP permissions for Row-Level Security. There’s also a dedicated SuccessFactors connector and the option to connect to additional SAP systems via OData.

SAP best practices advise connecting to SAP products that are built for analytical workloads, such as SAP Business Warehouse, to avoid overloading transactional systems such as S/4HANA.

Is it possible to connect directly to SAP BW/4HANA, and if so which connector should I use?

You can use either the SAP HANA or the Business Warehouse connector for BW/4HANA. The difference between the two is that the Business Warehouse connector uses MDX to connect to Business Explorer (BEx) Queries, whereas the HANA connector uses SQL to connect to HANA calculation views.

We recommend using the HANA connector where possible because of its higher performance. HANA makes it a breeze to visually analyse your way through financial, procurement or sales data, and for that reason developing the HANA connector is one of our key priorities here at Tableau.

Cloudera

How can Tableau connect to my Cloudera Hadoop? Are there any advantages of one type of connection over the other?

Tableau can connect to Cloudera Hadoop via Impala or Hive as a live or extract-based connection. Because Hive is a batch-oriented process, it works best with extracts. Therefore, for data exploration or business use cases, where fast access to dashboards is critical, Impala may be the better choice.

Whichever way you connect, however, Tableau allows you to bring aggregated data into memory for smart dashboard design. This speeds up queries and lets dashboards guide users towards live, more granular information, giving them a great user experience.

How does Tableau leverage Cloudera’s technical capabilities?

To analyse large volumes of data, Tableau live queries your Cloudera clusters. This leverages Cloudera’s inherent scalability and quickly turns large volumes of data into valuable insights for users, whether you’re connecting by Hive or Impala. It can also handle Hive columns containing XML elements, by allowing you to un-nest using HiveQL. That means you don’t have to ask questions directly in Cloudera to make use of its features.
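
As an illustration of un-nesting XML with HiveQL, the sketch below pairs a query built on Hive's `xpath` functions with a small local function that mimics the rows it would return. The table `orders_raw`, the column `order_xml`, and the XML layout are hypothetical, and using the query in a Custom SQL connection is one assumed way of running it from Tableau.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# HiveQL sketch using Hive's xpath UDFs; table and column names are hypothetical.
UNNEST_ORDERS_HIVEQL = """
SELECT
    xpath_string(o.order_xml, '/order/@id') AS order_id,
    item_sku
FROM orders_raw o
LATERAL VIEW explode(xpath(o.order_xml, '/order/items/item/sku/text()')) skus AS item_sku
""".strip()


def unnest_locally(order_xml: str) -> List[Tuple[str, str]]:
    """Mimic the rows the HiveQL above would return for a single XML value."""
    root = ET.fromstring(order_xml)
    order_id = root.get("id", "")
    # One output row per <sku> element, repeating the order id each time.
    return [(order_id, sku.text or "") for sku in root.findall("./items/item/sku")]


SAMPLE_ORDER = (
    '<order id="A-1"><items>'
    "<item><sku>SKU-1</sku></item><item><sku>SKU-2</sku></item>"
    "</items></order>"
)

if __name__ == "__main__":
    print(UNNEST_ORDERS_HIVEQL)
    print(unnest_locally(SAMPLE_ORDER))
```

The local function exists only to show the shape of the result: one row per nested element, which is the flat form a dashboard can work with.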

Tableau makes it easy to visually analyse your data using drag and drop capabilities, so it’s more accessible to business users who don’t have a technical background. That means more users can query large volumes of data, driving self-service and a more data-driven culture at your organisation.

Google Cloud Platform (GCP)

My strategic platform is Google Cloud Platform (GCP) and I’d like to move my Tableau installation to the cloud. What are my deployment options?

The simplest way to consume Tableau in the cloud is using Tableau Cloud. However, if you’ve already looked at this and ruled it out, there are a number of “self-deploy” options. We have a strong technical alliance relationship with Google and have defined two reference deployments for deploying Tableau to Google Compute Engine. These are the simplest way to get started.

We also support deployment standards built on Google Kubernetes Engine and are currently working to define a GCP specific deployment reference architecture. See how to deploy Tableau in a container to get started today.

Which Google cloud data services can Tableau connect to and what are the typical use cases?

The most common Google data services that users connect to are:

  • Google BigQuery: for most use cases use live connectivity to offload query processing and leverage the scalability of cloud. Use extracts for specific secondary data sets to control query costs or to provide low latency response for highly used mission critical dashboards.
  • Google Cloud SQL: the primary use case for this is transactional processing. Tableau can query these live for data exploration, but most customers use extracts to ensure large analytical queries don’t impact transaction processing performance.
  • Google Sheets: this provides an easy way to rapidly analyse datasets that are maintained by end users.
  • Google Analytics and Google Ads: these are used to analyse advertising ROI and impact.

What are my options for authentication and security for my cloud data sources?

Cloud native data sources use OAuth for authentication. Embed shared credentials in a published data-source connection or use individual credentials. Embedding shared credentials is the best option if you want to use extracts or you would like to enable Ask Data (Natural Language Query) to do full semantic indexing so that users can ask unstructured questions from a non-restricted data source. If you would like to use individual credentials, follow the instructions here to configure a service specific OAuth consent screen and to enable Tableau to store credentials for each user.

Microsoft Azure

We use Microsoft Azure Active Directory as our identity and access management service. Can we use it to authenticate with our Tableau Server?

Yes, Tableau allows you to authenticate with Azure Active Directory as your identity provider. Tableau uses the SAML (Security Assertion Markup Language) protocol to leverage your existing identity provider setup.

You can set up authentication in three easy steps:

  • Add Tableau Server as an external application in your Azure Active Directory environment
  • Configure the SAML connection on Tableau Server
  • Sign in to Tableau Server with your standard credentials

It’s super convenient and users can connect to data sources such as Azure Synapse or Azure Databricks using single sign-on.
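
Before configuring the SAML connection, it can help to check that the metadata file exported from your identity provider contains what you expect. The following is a minimal sketch, using only the standard SAML 2.0 metadata namespace, that pulls out the entity ID and single sign-on endpoints. The sample metadata, the `idp.example.com` URLs, and the `summarize_idp_metadata` helper are hypothetical and heavily simplified; real metadata also carries signing certificates.

```python
import xml.etree.ElementTree as ET
from typing import Dict

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
NS = {"md": MD_NS}

# Hypothetical, heavily trimmed metadata used only for this example.
SAMPLE_METADATA = """
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
                  entityID="https://idp.example.com/tenant-1234/">
  <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
        Location="https://idp.example.com/tenant-1234/saml2"/>
  </IDPSSODescriptor>
</EntityDescriptor>
""".strip()


def summarize_idp_metadata(xml_text: str) -> Dict[str, object]:
    """Extract the entity ID and SSO endpoints from SAML IdP metadata."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MD_NS}}}EntityDescriptor":
        raise ValueError("expected a SAML EntityDescriptor document")
    services = root.findall("md:IDPSSODescriptor/md:SingleSignOnService", NS)
    if not services:
        raise ValueError("metadata has no IdP single sign-on endpoint")
    return {
        "entity_id": root.get("entityID"),
        # Map each SAML binding to the URL users are redirected to.
        "sso_endpoints": {s.get("Binding"): s.get("Location") for s in services},
    }


if __name__ == "__main__":
    print(summarize_idp_metadata(SAMPLE_METADATA))
```

A quick check like this catches a wrong or truncated metadata file before it causes sign-in failures for users.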

We’re moving our infrastructure to the cloud. How difficult is it to move our Tableau Server onto Microsoft Azure?

Moving your Tableau environment from on-premises to the cloud is super easy. Tableau is vendor-agnostic and works closely with every cloud technology provider.

There are two ways to do it:

  • Use our Azure Quickstart template to set up a single Linux or Microsoft Azure Virtual Machine with Tableau Server
  • Deploy the Tableau Server on an Azure Virtual Machine yourself

Tableau technology is the same on-premises as it is in the cloud, so you can easily restore your Tableau Server backup on your newly created cloud virtual machine. Migrating to cloud is a popular trend right now, so we’ve done everything we can to make it simple.

Azure SQL Synapse is our primary data warehouse. Can we query it with Tableau?

Yes. The Tableau platform includes a variety of different connectors from Microsoft and other vendors. These connectors include multiple Azure services, such as Azure Data Lake Storage Gen2 or Azure SQL Database.

Our newest Azure SQL Synapse Analytics connector allows you to query your data warehouse via either a live connection or extracts. And by using Azure Active Directory as your identity provider, your users can simply authenticate through single sign-on. They log in once and can start querying your data.
