Transcript Tableau_8.0 By Brave Word
By Hari Poluru
History
Heritage
• The company traces its roots to academic research in Stanford University's Department of Computer Science between 1997 and 2002.
• Founded in Seattle in 2003.
• Patented technology named VizQL.
• Growing revenue: about $72M in 2011.
Organization
• The management team consists of former professors and PhD scholars from Stanford and leaders from venture capital firms.
• About 600 employees.
• Incorporated and headquartered in the US.
• 8 offices in 5 countries.
• Customers and partners on all continents.
• Tableau's products have been incorporated into the product suites of multiple independent software vendors, including Oracle for its Oracle Essbase Visual Explorer product.
All About
Fast Time-to-Value
With traditional OLAP, constructing cubes is time-consuming and requires expert skills. This process can take months, and sometimes over a year. In addition, the cube must be constructed before it can be calculated, a process which itself can take hours. All of this must occur before any analysis or reporting can be performed, that is, before the user sees answers to their questions.
Because the data can be loaded in memory, creating an analysis in Tableau takes seconds. There is no pre-definition of what is a dimension: any data is available as a dimension and any data is available as a measure.
The time spent implementing Tableau goes into locating data and deciding which analysis is interesting or relevant to the business question. Typically, this takes less time than building a cube.
Why?
Easy to Use
The entire end-user experience in Tableau is driven by the "click." End users enjoy using Tableau because it works the way their mind does. Each time they want to review the data sliced a new way, they simply click on the data they want to evaluate. Because Tableau can operate in memory, with each click all data and measures are recalculated to reflect the selection. Users can go from high-level aggregates (e.g., a roll-up of margin on all products in a specific line) to individual records (e.g., which order was that?) in a click, without pre-defining the path to the individual record.
Powerful
Because queries and calculations can be performed in memory, they are extremely quick. In addition, when the in-memory feature is in use, the performance of Tableau is not constrained by the speed of the underlying source. Even if the underlying data is stored in a system with poor query performance (for instance, a text file), performance stays high because the data can be loaded in memory. Tableau also compresses data as it is stored in memory, allowing large amounts of data to be held.
Patented VizQL technology makes Tableau faster still.
What data can I analyze with Tableau?
Your data needs to be in a database, spreadsheet or structured text format before you can analyze it with Tableau. Databases include relational databases and multidimensional OLAP databases. The specific databases your copy of Tableau can connect to depend on your purchase options. To see which data sources your copy of Tableau can connect to, select Data > Connect to Data. Any data source that is not supported by your version of Tableau is grayed out. Contact Tableau to upgrade your database accessibility options.
Tableau can connect to many relational data sources, and two OLAP/cube data sources: Oracle Essbase and Microsoft Analysis Services. Tableau queries the data using standard drivers and query languages (like SQL and MDX) and presents a visual analysis of the data.
Types
• Tableau Server provides a browser-based visual analytics tool. With just a few clicks, users can publish dashboards and reports with current data, automatically customized to the needs of everyone across the organization. It deploys in minutes, and users can produce thousands of reports without the need for IT services.
• Tableau worker is an additional Tableau machine used in a distributed environment to support more browsing and searching activity, handle a large number of tasks, and refresh extracts. A worker is a dedicated machine that runs several background task processes.
• Tableau Desktop is a desktop data visualization application (a report development tool) that analyzes virtually any type of structured data and produces highly interactive graphs, dashboards, and reports in minutes. After a quick installation, developers can connect to virtually any data source, from spreadsheets to data warehouses, and display information in multiple graphic perspectives.
• Tableau Reader is a free viewing application that lets anyone read and interact with packaged workbooks created by Tableau Desktop.
• Tableau Public is a free service for creating and sharing data visualizations on the web. Users can share data on websites, blogs, and social media such as Facebook and Twitter.
n-Tier Client-Server Architecture
Data Layer and Data Connectors
Data Layer
One of the fundamental characteristics of Tableau is that it supports your choice of data architecture. Tableau does not require your data to be stored in any single system, proprietary or otherwise.
Data Connectors
Tableau provides two modes for interacting with data: live connection or in-memory. Users can switch between a live and an in-memory connection as they choose.
Live connection
Tableau’s data connectors leverage your existing data infrastructure by sending dynamic SQL or MDX statements directly to the source database rather than importing all the data. This means that if you’ve invested in a fast, analytics-optimized database like Vertica, you can gain the benefits of that investment by connecting live to your data
In-memory
Tableau offers a fast, in-memory Data Engine that is optimized for analytics. You can connect to your data and then, with one click, extract your data to bring it in-memory in Tableau. Tableau's Data Engine fully utilizes your entire system to achieve fast query response on hundreds of millions of rows of data on commodity hardware.
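The difference between the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for both the source database and the extract, and the table, the query, and the function names are hypothetical. It does not show how Tableau's Data Engine works internally.

```python
import sqlite3

QUERY = "SELECT SUM(amount) FROM sales"


def make_source():
    """Create a small stand-in source database (hypothetical data)."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO sales VALUES (?, ?)", [("East", 100.0), ("West", 80.0)]
    )
    source.commit()
    return source


def live_total(source):
    """Live connection: every question is sent to the source as a query."""
    return source.execute(QUERY).fetchone()[0]


def take_extract(source):
    """In-memory mode: copy the data once, then query the local copy."""
    extract = sqlite3.connect(":memory:")
    source.backup(extract)  # snapshot of the source at this moment
    return extract
```

A live query always reflects the current source, while an extract keeps answering from its snapshot until it is refreshed.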
Server
Tableau Server has its own Apache web server, which is configured automatically during installation.
Tableau authorization is handled by the following:
Roles and permissions:
Define specific capabilities that users can or cannot perform on certain objects in Tableau. A role is a set of permissions that administrators can use as-is or customize.
Licensing and user rights:
Control the maximum set of permissions that a user can have.
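One way to picture how the two mechanisms combine is as a set intersection: the role grants permissions and the license caps them. The sketch below is a minimal illustration; the role names, license levels, and permission strings are hypothetical and are not Tableau's actual identifiers.

```python
# Hypothetical permission names, roles, and license levels, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"view"},
    "interactor": {"view", "filter", "export"},
    "publisher": {"view", "filter", "export", "publish"},
}

LICENSE_CAPS = {
    "viewer_license": {"view", "filter"},
    "full_license": {"view", "filter", "export", "publish"},
}


def effective_permissions(role, license_level):
    """Return what a user may do: the role's permissions capped by the license."""
    return ROLE_PERMISSIONS[role] & LICENSE_CAPS[license_level]
```

For example, a "publisher" role under the hypothetical "viewer_license" ends up with only the "view" and "filter" permissions.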
Server Components
Tableau Server Components
The work of Tableau Server is handled with the following server processes:
Application Server
Application Server processes (wgserver.exe) handle browsing and permissions for the Tableau Server web and mobile interfaces.
VizQL Server
Once a view is opened, the client sends a request to the VizQL process (vizqlserver.exe). The VizQL process then sends queries directly to the data source, returning a result set that is rendered as images and presented to the user. Each VizQL Server has its own cache that can be shared across multiple users.
Data Server
The Tableau Data Server lets you centrally manage and store Tableau data sources. It also maintains metadata from Tableau Desktop.
Backgrounder
The backgrounder refreshes scheduled extracts and manages other background tasks.
Gateway/ Load Balancer
The Gateway is the primary Tableau Server, which routes requests to the other components. Requests that come in from the client first hit the gateway server and are routed to the appropriate process. If multiple processes are configured for any component, the Gateway acts as a load balancer and distributes the requests among those processes.
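The routing role of the Gateway can be sketched as follows. This is a toy model, not Tableau's implementation: the text only says requests are distributed, so the round-robin policy is an assumption, and the process identifiers are hypothetical.

```python
from itertools import cycle


class Gateway:
    """Toy gateway: routes each request to one process of the named component."""

    def __init__(self, processes):
        # `processes` maps a component name to its list of process identifiers.
        # Round-robin is an assumed policy; the source does not name one.
        self._rotations = {name: cycle(procs) for name, procs in processes.items()}

    def route(self, component):
        """Return the process that should handle the next request."""
        return next(self._rotations[component])


gateway = Gateway({
    "vizql": ["vizqlserver-1", "vizqlserver-2"],
    "application": ["wgserver-1"],
})
```

With two VizQL processes configured, successive calls to `gateway.route("vizql")` alternate between them, while a component with a single process always receives its own requests.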
How it Works?
The diagram describes how Tableau works between the client's web browser, the web server(s), and Tableau Server.
1. A user visits the webpage, and the browser sends a request to the web server.
2. The web server sends a request to Tableau Server.
3. Tableau Server checks the authentication of the request and replies to the web server.
4. The web server processes the reply, constructs the view's URL, and passes it to the web browser.
5. The browser sends a request to Tableau Server using that URL.
6. Tableau Server checks the browser's request, which must be redeemed within three minutes of being issued, and sends back the final URL for the embedded view.
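The three-minute redemption rule in the last step can be sketched as a small ticket store. This is a toy stand-in for the server side of the flow, not Tableau's implementation: the class, the word "ticket", and the one-time-use behavior are assumptions made for illustration.

```python
import secrets
import time

TICKET_LIFETIME_SECONDS = 180  # three minutes, as stated in the steps above


class TicketStore:
    """Toy model of issuing and redeeming a short-lived token (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._issued = {}  # ticket -> (username, time issued)

    def issue(self, username):
        """Server replies to the web server with a fresh ticket for this user."""
        ticket = secrets.token_urlsafe(16)
        self._issued[ticket] = (username, self._clock())
        return ticket

    def redeem(self, ticket):
        """Accept the ticket once, and only inside the time window."""
        entry = self._issued.pop(ticket, None)
        if entry is None:
            return None  # unknown or already redeemed
        username, issued_at = entry
        if self._clock() - issued_at > TICKET_LIFETIME_SECONDS:
            return None  # expired
        return username
```

Passing the clock in as a parameter keeps the expiry rule easy to exercise without waiting three minutes.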
Tableau Desktop
Tableau Desktop is used to create graphs, reports, and dashboards and to publish reports and dashboards to the Tableau portal for business use. Its capabilities are separated into three tiers:
• Visually Analyze Data
• Build Interactive Dashboards
• Share and Interact
Tableau workspace:
The Tableau Desktop workspace contains the Data window, view cards, toolbar, worksheet, workspace controls, dimensions area, measures area, sets, parameters, status bar, and worksheet bar.
Workbooks and Sheets: A workbook can contain worksheets and dashboards.
Worksheets - A worksheet is where you build views of data by dragging and dropping fields onto shelves.
Workbooks - A workbook contains one or more worksheets or dashboards and holds all of your work. Workbooks let you organize, save, and share results.
Dashboards - A dashboard is a combination of several worksheets that you can arrange for presentation or to monitor the data.
Connect to Data
To start analyzing your data, connect Tableau Desktop to a data source, which can be, for example, an Excel workbook, a Teradata table, or an Oracle data warehouse. After connecting, the data fields become available in the Data window for building your reports. You can perform many actions with your data, as listed below:
• Supporting multiple data sources
• Connecting to a custom SQL query
• Editing the connection
• Renaming the connection
• Duplicating the connection
• Replacing the data source
• Exporting the connection
• Refreshing the data
• Closing the connection
• Connecting to multiple tables
• Creating table joins
• Adding tables to the Data window
Connect to Data
Once the server information / credentials are provided, the user can select a schema to connect to and pick a table.
Through the multiple tables options, it is possible to create a join between two tables and use them both in a single report. Tableau also provides a custom SQL option for freeform queries to retrieve data exactly as the user wants it. This is useful in cases that require operations to be done on the data before importing the dataset for reporting.
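The kind of freeform query a custom SQL connection might hold can be sketched as below. This is an illustration only: SQLite stands in for the source database, and the tables, columns, and values are hypothetical.

```python
import sqlite3

# Hypothetical tables standing in for two tables picked from a schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 250.0), (2, 11, 100.0), (3, 10, 50.0);
    INSERT INTO customers VALUES (10, 'West'), (11, 'East');
""")

# Join the two tables and aggregate before the data reaches the report.
CUSTOM_SQL = """
    SELECT c.region, SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""

rows = conn.execute(CUSTOM_SQL).fetchall()
```

Here the join and the aggregation happen in the database, so the report receives one row per region rather than one row per order.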
Tableau also allows connection to pre-existing OLAP cubes created by a variety of tools.
Connections to Tableau's native .tde extract files can also be set up. TDE extracts are Tableau's method of caching data locally, so as to reduce database hits for frequently accessed reports. They also allow Tableau workbooks to be packaged as standalone reports that can be distributed for viewing without a connection to the data source. TDE extracts can be refreshed manually through Tableau Desktop or scheduled for periodic refresh through the enterprise edition.
Data sources supported by Tableau
• Actian Vectorwise 1.5 and later
• Aster Data nCluster 4.5 and later
• Cloudera Hadoop Distribution via Hive (CDH3u1, which includes Hive .71, or later)
• Firebird 2.0 or later
• Greenplum 4.x and later
• IBM DB2 (9.1 or later) for Linux, UNIX, and Windows
• MapR Distribution for Apache Hadoop
• Microsoft Access 2003 or later
• Microsoft SQL Server Analysis Services 2000, 2005, 2008, 2008R2, 2012 (Multi-dimensional mode only)
• Microsoft PowerPivot 2008 (whether or not published in Sharepoint)
• Microsoft Excel 2003 or later
• Microsoft SQL Server 2000, 2005, 2008, 2008R2, or 2012
• Microsoft Windows Azure Marketplace DataMarket
• MySQL 4.0 or later
• Netezza release 4.6 or later
• OData
• Oracle Hyperion Essbase 11.1.1
• Oracle Database 10.x or later
• ParAccel Analytic Database (PADB) version 3 and later
• PostgreSQL 7.0 or later
• Progress OpenEdge 10.2B patch 4 or later
• SAP NetWeaver Business Warehouse 7.00 with SP20+ recommended. Also requires SAP GUI for Windows 7.20 Client
• Sybase IQ 15 and later
• Teradata 6 or later
• Text files, comma delimited format
• Vertica 4.x or later
• Also, many databases that are ODBC Version 3.0 compliant
Data Source Relationships
Select Data > Connect to Data to connect to at least two data sources. The data sources are shown in a drop-down list at the top of the Data window.
The Data window color codes the primary and secondary data sources with a colored bar down the left side.
Data Source Relationships
Relationships are automatically created based on field names; however, you can define custom relationships by selecting Data > Relationships.
In the Relationships dialog box, select the primary data source in the drop-down at the top of the dialog box. Then select the secondary data source. Any automatic relationships are shown, or you can select Custom and then click Add to relate fields from each data source.
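The idea of name-based automatic relationships plus user-defined ones can be sketched as below. This is a simplified illustration: exact name matching is an assumption about the rule, and the function and field names are hypothetical.

```python
def automatic_relationships(primary_fields, secondary_fields):
    """Pair fields that carry the same name in both data sources."""
    secondary = set(secondary_fields)
    return [(name, name) for name in primary_fields if name in secondary]


def add_custom_relationship(relationships, primary_field, secondary_field):
    """Relate two differently named fields, as a user would with Custom > Add."""
    pair = (primary_field, secondary_field)
    if pair in relationships:
        return relationships
    return relationships + [pair]
```

For instance, two sources that both contain a "State" field would be related on it automatically, while "Order Date" and "Date" would need a custom relationship.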
Dimensions
Tableau treats any field containing qualitative, categorical information as a dimension.
For example, text / date fields would classify as dimensions.
In simplest terms, a dimension can be thought of as a column header in a relational table: a context provider for data.
Measures
Tableau treats any field containing quantitative, countable values as a measure, for example, revenue or number of units sold.
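The dimension/measure split described above can be sketched as a simple type check. This is a deliberately simplified rule for illustration (numeric columns become measures, everything else becomes a dimension); it is not Tableau's actual classification logic, and the column names and values are hypothetical.

```python
from datetime import date
from numbers import Number


def classify_field(values):
    """Return "measure" for numeric columns and "dimension" for everything else."""
    non_null = [v for v in values if v is not None]
    # bool is a Number subclass in Python, so it is excluded explicitly.
    if non_null and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in non_null
    ):
        return "measure"
    return "dimension"


columns = {
    "Region": ["East", "West"],
    "Order Date": [date(2013, 1, 5), date(2013, 2, 9)],
    "Revenue": [1200.0, 860.5],
    "Units Sold": [12, 9],
}
roles = {name: classify_field(values) for name, values in columns.items()}
```

Text and date columns land on the dimension side, and revenue and unit counts land on the measure side, matching the examples in the text.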
Calculated Fields
Calculated fields are custom dimensions or measures created by the user. These fields are typically derived from existing fields imported from the source data.
For example, suppose Tableau imports two dimensions named City and Country. A user can create a dimension that displays both in a single column by creating a calculated field that concatenates the two dimensions.
It is also possible to create compound measures using calculated fields.
Total revenue = no. of units sold x price per unit
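Both examples above, the concatenated dimension and the compound measure, can be sketched in Python. This only illustrates the idea of deriving fields from existing ones; it is not Tableau's formula syntax, and the field names and sample rows are hypothetical.

```python
rows = [
    {"City": "Seattle", "Country": "USA", "Units Sold": 3, "Price per Unit": 20.0},
    {"City": "Toronto", "Country": "Canada", "Units Sold": 5, "Price per Unit": 12.5},
]


def add_calculated_fields(row):
    """Return a copy of the row with one derived dimension and one derived measure."""
    derived = dict(row)
    # Calculated dimension: City and Country shown in a single column.
    derived["City, Country"] = f'{row["City"]}, {row["Country"]}'
    # Calculated measure: total revenue = units sold x price per unit.
    derived["Total Revenue"] = row["Units Sold"] * row["Price per Unit"]
    return derived


enriched = [add_calculated_fields(row) for row in rows]
```

The original fields are left untouched, which mirrors how a calculated field sits alongside the imported fields rather than replacing them.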
Calculated Fields (contd.)
Right-click in the dimension / measure pane and select Calculated Field. Tableau validates expressions automatically.
Report Area
Report Area (contd.)
The Tableau report area consists of several “shelves” where the data elements (dimensions, measures, sets, etc) can be dragged and dropped.
Dragging and dropping dimensions into the column or row shelf will create a basic table.
Dropping a measure into the text shelf will display the fact data grouped by the context of the dimension in the row or column shelf.
Dropping dimensions in both row and column shelves will generate a crosstab.
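What a crosstab computes can be sketched as grouping a measure by a pair of dimension members. The sketch assumes the measure is summed, and the records and field names are hypothetical.

```python
from collections import defaultdict


def crosstab(records, row_field, column_field, measure):
    """Sum `measure` for every (row member, column member) pair."""
    cells = defaultdict(float)
    for record in records:
        cells[(record[row_field], record[column_field])] += record[measure]
    return dict(cells)


records = [
    {"Region": "East", "Year": 2011, "Sales": 100.0},
    {"Region": "East", "Year": 2012, "Sales": 150.0},
    {"Region": "West", "Year": 2011, "Sales": 80.0},
    {"Region": "East", "Year": 2011, "Sales": 20.0},
]
table = crosstab(records, "Region", "Year", "Sales")
```

Each key of `table` corresponds to one cell of the grid: the first element is the member on the row shelf and the second is the member on the column shelf.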
Report Area (contd.)
Basic Crosstab layout
Sets
Tableau sets are used to work with specific elements of data sets.
The three main uses of a set are:
• Create a subset of the data: Select one or more dimension members that are of interest to you. For example, sort a field and select only cities on the west coast with populations greater than 500,000, or manually select outliers that appear in a scatter plot. Refer to "Example - A Set Containing a Subset" for more information.
• Create unique encodings: Combine dimension members to create unique encodings. For example, create a set that combines market and product, and then color-encode a data view using the combined members. Refer to "Example - A Set Containing Unique Encodings" for more information.
• Save filters for later use: Once you have created a filter, you can save the filter as a set and use it in all of the worksheets in a workbook. This saves you from having to recreate the filter every time you want to use it.
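The first and third uses above can be sketched with a plain Python set of dimension members. The city rows and population figures below are made up for illustration, and the helper name is hypothetical.

```python
# Illustrative rows; the population figures are invented for the example.
cities = [
    {"City": "Seattle", "Coast": "West", "Population": 620000},
    {"City": "Portland", "Coast": "West", "Population": 590000},
    {"City": "Eugene", "Coast": "West", "Population": 157000},
    {"City": "Boston", "Coast": "East", "Population": 625000},
]

# The set holds dimension members, here city names, chosen by a condition.
large_west_coast = {
    row["City"]
    for row in cities
    if row["Coast"] == "West" and row["Population"] > 500_000
}


def in_set(row, members):
    """True when the row's City is a member of the saved set."""
    return row["City"] in members


# Reusing the saved set as a filter, without restating the condition.
subset = [row for row in cities if in_set(row, large_west_coast)]
```

Once the members are saved, any other view can be restricted with `in_set` alone, which is the point of saving a filter as a set.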
Filters
Filters are used to restrict datasets at the query level.
They affect the elements of the “where” predicate clause in an SQL query against the data source.
Filters in Tableau are of three types:
- Static filters on a report
- Dynamic / View filters or prompts
- Relative filters
Filters
Static filters affect the WHERE clause of the query. They are used to restrict the elements retrieved in the report.
Filters can be made global, that is, made to affect the entire report instead of a single worksheet.
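How a filter can end up in the WHERE clause is sketched below. This is an illustration under assumptions, not the SQL Tableau actually generates: SQLite stands in for the data source, and the function, table, and column names are hypothetical.

```python
import sqlite3


def build_query(table, filters):
    """Build a parameterized SELECT whose WHERE clause encodes the filters.

    `filters` maps a column name to the list of values to keep.
    Table and column names are trusted here; only values are parameterized.
    """
    clauses, params = [], []
    for column, values in filters.items():
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    return f"SELECT * FROM {table}{where}", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", 2011, 100.0), ("West", 2011, 80.0), ("East", 2012, 150.0)],
)
sql, params = build_query("sales", {"region": ["East"], "year": [2012]})
rows = conn.execute(sql, params).fetchall()
```

Because the restriction is part of the query, rows that fail the filter are never retrieved from the source in the first place.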
Hierarchies
Tableau supports hierarchies for drilling analysis.
To create a hierarchy, simply drag a dimension over another dimension, and a popup to create a hierarchy will appear.
While this can be achieved with any two or more arbitrary dimensions, it is generally good practice to restrict hierarchies to elements that will allow simplification and clarity in report presentation.
For example, date / time dimensions form natural hierarchies with Year / Quarter / Month / Day / Time.
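Drilling through a date hierarchy amounts to aggregating the same measure at successively finer levels. The sketch below illustrates this with hypothetical sales data; the Time level is omitted for brevity, and summing is an assumed aggregation.

```python
from collections import defaultdict
from datetime import date


def quarter(d):
    """Quarter number (1 to 4) of a date."""
    return (d.month - 1) // 3 + 1


# Each level maps a date to its path down the hierarchy.
LEVELS = {
    "Year": lambda d: (d.year,),
    "Quarter": lambda d: (d.year, quarter(d)),
    "Month": lambda d: (d.year, quarter(d), d.month),
    "Day": lambda d: (d.year, quarter(d), d.month, d.day),
}


def aggregate(sales, level):
    """Sum sales amounts at the requested level of the date hierarchy."""
    totals = defaultdict(float)
    for day, amount in sales:
        totals[LEVELS[level](day)] += amount
    return dict(totals)


sales = [
    (date(2012, 1, 15), 100.0),
    (date(2012, 2, 3), 50.0),
    (date(2012, 5, 20), 70.0),
]
```

Calling `aggregate(sales, "Year")` gives one total, and switching the level to "Quarter" or "Month" splits that total into finer groups without changing the underlying data.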
Hierarchies
Groups
Tableau allows the creation of groups, which consolidate elements of a dimension.
For example, Jan - Mar can be grouped as a “Winter” season. Measures and calculations then roll up according to the consolidated group rather than the individual dimension elements.
Groups are often used to create parents for hierarchical relationships within the same dimension.
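The roll-up behavior of a group can be sketched as a mapping from members to group names. The "Winter" group comes from the example above; the sales figures and the function name are hypothetical, and summing is an assumed aggregation.

```python
from collections import defaultdict

# Members of the "Winter" group from the example; other months stay ungrouped.
MONTH_GROUPS = {"Jan": "Winter", "Feb": "Winter", "Mar": "Winter"}


def rollup_by_group(monthly_sales, groups):
    """Sum a measure by group; ungrouped members keep their own name."""
    totals = defaultdict(float)
    for month, amount in monthly_sales.items():
        totals[groups.get(month, month)] += amount
    return dict(totals)


totals = rollup_by_group(
    {"Jan": 10.0, "Feb": 20.0, "Mar": 30.0, "Apr": 5.0}, MONTH_GROUPS
)
```

The three grouped months collapse into a single "Winter" total, while April, which belongs to no group, is reported on its own.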
Groups
Right click a dimension and select “Create Group”.
Aggregations
The “Customize” option opens the formula window for the user to enter a custom calculation.