Monitoring performance and sizing clusters at scale is a real challenge in the big data ecosystem. Here are a few tools that help:

1. Metricbeat: Run Metricbeat on each of the cluster nodes and visualise the stats using Elasticsearch/Kibana. It works well for many components, such as Docker, Kubernetes, KVM, Elasticsearch, Kafka and Logstash.
https://www.elastic.co/guide/en/beats/metricbeat/current/index.html

2. Dr. Elephant: Mainly for performance monitoring and tuning of Hadoop clusters and Spark jobs.
https://github.com/linkedin/dr-elephant

3. ElasticHQ / Rally: ElasticHQ monitors Elasticsearch indexing and query performance at scale.
http://www.elastichq.org/index.html
Rally is for benchmarking and sizing Elasticsearch.
https://www.elastic.co/blog/announcing-rally-benchmarking-for-elasticsearch

4. Sparklens from Qubole: For profiling and sizing Spark jobs alone, Sparklens is a good choice too.
https://github.com/qubole/sparklens

5. ...
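For the Metricbeat option, here is a minimal sketch of the kind of search body you could send to Elasticsearch to get average CPU usage per node. It assumes the default `metricbeat-*` index pattern, the Metricbeat system module being enabled, and the field names `host.name` and `system.cpu.total.pct` (older Metricbeat versions name the host field differently). The function name `build_cpu_query` and its parameters are made up for illustration.

```python
import json


def build_cpu_query(minutes=15, max_hosts=50):
    """Build a search body: average CPU usage per host over the last `minutes`.

    Field names are assumptions based on the Metricbeat system module;
    adjust them to match the mapping in your own cluster.
    """
    return {
        "size": 0,  # only aggregations are needed, not the raw documents
        "query": {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
        "aggs": {
            "per_host": {
                "terms": {"field": "host.name", "size": max_hosts},
                "aggs": {
                    "avg_cpu": {"avg": {"field": "system.cpu.total.pct"}}
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into Kibana Dev Tools or sent
    # with any HTTP client to <your-es-host>/metricbeat-*/_search
    print(json.dumps(build_cpu_query(), indent=2))
```

The same query shape should work for other Metricbeat metrics by swapping the field inside the `avg` aggregation.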