Tableau integration with Spark SQL and basic data analysis with Tableau

Steps for Tableau integration with Spark SQL and basic data analysis:
================================================

  1. Start the Spark SQL Thrift server on the NameNode [the Spark SQL server node]: /opt/spark/sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=10001
  2. Download & install Tableau 10 from the site: https://www.tableau.com/products [14-day trial version]
  3. Download & install the Tableau driver for Spark SQL [macOS package]: https://downloads.tableau.com/drivers/mac/TableauDrivers.dmg
  4. Open Tableau & connect to Spark SQL.
  5. Provide the server as the NameNode IP & the port as 10001 [as in step 1 above]
  6. Select Type as ‘SparkThriftServer’
  7. Select Authentication as ‘Username and password’
  8. Provide the username as ‘hive’ [the same as in hive-site.xml]
  9. Provide the password as ‘hive@123’ [the same as in hive-site.xml]
  10. Search for & select the database name in the ‘Select Schema’ dropdown. [This is the same Parquet database that the Spark jobs created.]
  11. Search for & select the table names.
  12. Drag & drop the table to the right-hand side area.
  13. Go to the ‘Sheet 1’ tab, which is already created.
  14. Change the data type of the measures to ‘Number (decimal)’
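  15. For each measure on the left side, create a calculated field & do the required calculation. Example: to convert bytes to Gbps, the calculated field would be ‘[Bytebuffer]*8/3600/1000/1000/1000’ [*8 converts bytes to bits, /3600 averages over an hour, which assumes each value is a one-hour byte total, and /1000/1000/1000 converts bits to gigabits]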
  16. Now drag and drop the dimensions & measures [including the calculated fields created above] to the right-hand side.
  17. Select the graph type of your choice by clicking ‘Show Me’ in the top right corner.
  18. Once the graphs for the various metrics/tables are finalised, create a dashboard from the menu.
  19. Drag and drop the finalised graphs onto the dashboard.
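
To sanity-check the calculated field from step 15 outside Tableau, the same arithmetic can be sketched in Python. This is only an illustration: the function name bytes_to_gbps and the interval_seconds parameter are hypothetical, and the 3600-second default assumes each value is the byte total for one hour, as the formula in step 15 suggests.

```python
def bytes_to_gbps(byte_count, interval_seconds=3600):
    """Convert a byte total for one interval into an average rate in Gbps.

    Mirrors [Bytebuffer]*8/3600/1000/1000/1000 from step 15.
    interval_seconds is an assumption: 3600 means a one-hour total.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    bits = byte_count * 8                      # bytes -> bits
    bits_per_second = bits / interval_seconds  # average over the interval
    return bits_per_second / 1_000_000_000     # bits/s -> gigabits/s (decimal)


if __name__ == "__main__":
    # 450 GB transferred in one hour averages out to 1 Gbps.
    print(bytes_to_gbps(450_000_000_000))  # prints 1.0
```

If the Spark jobs aggregate bytes over a different window, only the interval changes; the Tableau formula would need its /3600 term adjusted in the same way.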
