Live performance monitoring of distributed/cluster nodes using Metricbeat
Live performance monitoring is a critical part of the big data ecosystem. There are many ways to do it; however, I find Metricbeat to be one of the best choices.
Metricbeat is typically run in conjunction with the EK (Elasticsearch and Kibana) stack. It is particularly useful for platform components (Kubernetes, Docker, fluentd, hyperkube, etcd, HAProxy, etc.), for which Metricbeat has built-in monitoring support.
Metricbeat also collects system statistics such as CPU, memory, I/O, and network stats and pushes them into Elasticsearch. From Elasticsearch we can visualise them using Kibana.
In my experiment, I set up Elasticsearch and Kibana on an isolated system (192.168.1.2) and Metricbeat on all the cluster nodes that require performance statistics monitoring.
Usage
- Download the tarball to all the cluster nodes: https://github.com/Indu-sharma/Utilities/blob/master/metricbeat.tar.gz
- Extract the tarball into a folder.
- Point Metricbeat to Elasticsearch/Kibana and change the ES indices.
- Start the Metricbeat service on all the cluster nodes.
- Open the Kibana UI and configure visualisations and dashboards.
- From the Visualize tab, create a Time Series -> Visual Builder visualisation and publish graphs such as CPU and memory usage by process to the dashboards.
- Kibana UI Samples:
A. Top Hosts with CPU usage on cluster and Top Processes with CPU usage:
B. Top Hosts with RSS memory usage on cluster and Top Processes with RSS memory usage:
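The "point Metricbeat to Elasticsearch/Kibana and change the ES indices" step can be sketched as follows. This is a minimal sketch, not the configuration shipped in the tarball: the helper `render_metricbeat_config`, the index prefix `cluster-metrics`, and the default ports 9200 and 5601 are assumptions for illustration. The setting names (`output.elasticsearch.hosts`, `output.elasticsearch.index`, `setup.template.name`, `setup.template.pattern`, `setup.kibana.host`) are standard Metricbeat options; newer releases may need further settings, such as disabling index lifecycle management, before a custom index name takes effect.

```python
def render_metricbeat_config(es_host, kibana_host, index_prefix, period="10s"):
    """Return a minimal metricbeat.yml body as a string.

    es_host and kibana_host are "host:port" strings; index_prefix is a
    hypothetical custom index name used instead of the default one.
    """
    lines = [
        "metricbeat.modules:",
        "- module: system",
        f"  period: {period}",
        "  metricsets: [cpu, memory, network, diskio, process]",
        "",
        "output.elasticsearch:",
        f'  hosts: ["{es_host}"]',
        # A custom index also requires the template name and pattern below.
        f'  index: "{index_prefix}-%{{+yyyy.MM.dd}}"',
        "",
        f'setup.template.name: "{index_prefix}"',
        f'setup.template.pattern: "{index_prefix}-*"',
        "",
        "setup.kibana:",
        f'  host: "{kibana_host}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 192.168.1.2 is the Elasticsearch/Kibana host from the experiment above;
    # the ports are the usual defaults and are assumed here.
    print(render_metricbeat_config("192.168.1.2:9200", "192.168.1.2:5601",
                                   "cluster-metrics"))
```

Generating the file from one function keeps the settings identical on every cluster node; the output can be written to `metricbeat.yml` in the extracted folder before the service is started.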
The above statistics are live and can be monitored continuously. Beyond the defaults, customisations can be made to automatically detect and monitor the statistics of all Spark/MR jobs by dynamically updating their PID values in Elasticsearch with the actual strings as provided in the file
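One possible way to approach the PID-based customisation is sketched below. It is an illustration under stated assumptions, not the author's actual mechanism: the helpers `match_job_pids` and `pid_filter_query`, the marker strings `SparkSubmit` and `YarnChild`, and the field name `system.process.pid` (which differs between Metricbeat versions) are all assumptions. The idea is to match running processes against job-specific strings, then build an Elasticsearch `terms` query that restricts the process metrics to those PIDs.

```python
from typing import Dict, Iterable, List, Tuple


def match_job_pids(processes: Iterable[Tuple[int, str]],
                   markers: Iterable[str]) -> Dict[str, List[int]]:
    """Group PIDs by the first marker string found in their command line."""
    marker_list = list(markers)
    matched: Dict[str, List[int]] = {marker: [] for marker in marker_list}
    for pid, cmdline in processes:
        for marker in marker_list:
            if marker in cmdline:
                matched[marker].append(pid)
                break  # a process is assigned to one marker only
    return matched


def pid_filter_query(pids: Iterable[int],
                     field: str = "system.process.pid") -> dict:
    """Build an Elasticsearch terms query selecting documents for the PIDs.

    The default field name is an assumption; check the index mapping of the
    Metricbeat version in use.
    """
    return {"query": {"terms": {field: sorted(pids)}}}


if __name__ == "__main__":
    # Hypothetical process listing as (pid, command line) pairs.
    sample = [
        (101, "java -cp /opt/spark/jars/* org.apache.spark.deploy.SparkSubmit app.py"),
        (202, "java org.apache.hadoop.mapred.YarnChild 10.0.0.5 41234"),
        (303, "/usr/sbin/sshd -D"),
    ]
    jobs = match_job_pids(sample, ["SparkSubmit", "YarnChild"])
    print(jobs)
    print(pid_filter_query(jobs["SparkSubmit"]))
```

Re-running the matching step periodically would keep the PID list current as jobs start and finish.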