Tableau integration with Spark SQL and basic data analysis with Tableau
Steps for Tableau integration with Spark SQL and basic data analysis:
================================================
- Start the Spark SQL Thrift server on the NameNode [the Spark SQL server node]: /opt/spark/sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=10001
- Download and install Tableau 10 from https://www.tableau.com/products [14-day trial version]
- Download and install the Tableau driver for Spark SQL [macOS package]: https://downloads.tableau.com/drivers/mac/TableauDrivers.dmg
- Open Tableau and connect to Spark SQL.
- Enter the NameNode IP as the server and 10001 as the port [the port set in the first step above]
- Select Type as ‘SparkThriftServer’
- Select Authentication as ‘Username and password’
- Enter the username ‘hive’ [the same as in hive-site.xml]
- Enter the password ‘hive@123’ [the same as in hive-site.xml]
- Search for and select the database name in the ‘Select Schema’ dropdown. [This is the Parquet database that the Spark jobs created.]
- Search for and select the table names.
- Drag and drop the table into the area on the right-hand side.
- Go to the Sheet 1 tab, which already exists.
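- Change the data type of the measures to ‘Number (decimal)’
- For each measure on the left side, create a calculated field and do the required calculation. For example, to convert bytes to Gbps, the calculated field would be ‘[Bytebuffer]*8/3600/1000/1000/1000’ [*8 converts bytes to bits, /3600 averages over the seconds in one hour, and the three /1000 steps scale bits to gigabits].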
- Now drag and drop the dimensions and measures [including the calculated fields you created] to the right-hand side.
- Choose a graph type by clicking ‘Show Me’ in the top-right corner.
- Once the graphs for the various metrics/tables are finalised, create a dashboard from the menu.
- Drag and drop the finalised graphs onto the dashboard.
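
To sanity-check the bytes-to-Gbps calculated field from the steps above outside Tableau, the same arithmetic can be sketched in Python. This is only an illustration: the function name bytes_to_gbps and the sample value are made up here, and it assumes each [Bytebuffer] value is a byte count accumulated over one hour, which is what the division by 3600 suggests.

```python
SECONDS_PER_HOUR = 3600  # assumption: each byte count covers one hour


def bytes_to_gbps(byte_count, interval_seconds=SECONDS_PER_HOUR):
    """Average rate in Gbps for byte_count bytes moved over interval_seconds.

    Mirrors the Tableau expression [Bytebuffer]*8/3600/1000/1000/1000.
    """
    bits = byte_count * 8                        # bytes -> bits
    bits_per_second = bits / interval_seconds    # average over the interval
    return bits_per_second / 1000 / 1000 / 1000  # bits/s -> gigabits/s


if __name__ == "__main__":
    # Hypothetical sample: 450 GB transferred in one hour.
    print(bytes_to_gbps(450_000_000_000))
```

If the source table is aggregated over a different window (for example, per minute), pass that window in seconds as interval_seconds and change the 3600 in the Tableau calculated field to match.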