小小小幸运 posted on 2022-11-25 16:00:12

Using Prometheus to monitor ES deployed with docker compose

https://img2022.cnblogs.com/other/3034537/202211/3034537-20221125101845611-1016084365.jpg
Requirements

Collect ES metrics, then display them and alert on them.
Current state


[*]ES is installed via docker compose
[*]The K8S cluster in the same environment already runs Prometheus, AlertManager, and Grafana
Approach

Reuse the existing monitoring stack: Prometheus monitors ES.
https://img2022.cnblogs.com/other/3034537/202211/3034537-20221125101845828-1599852029.svg
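One possible way to let an existing Prometheus reach an exporter that runs outside the K8S cluster is file-based service discovery. The sketch below only generates the target file. It assumes, hypothetically, that the docker compose host is reachable at `192.0.2.10`, that the exporter listens on port `9114`, and that the Prometheus instance has a `file_sd_configs` entry pointing at the generated file. None of these details come from the original post, and the `service` and `deploy` label names are made up for illustration.

```python
import json


def build_file_sd(targets, labels):
    """Build a document in the JSON format used by Prometheus file_sd_configs:
    a list of target groups, each with "targets" and optional "labels"."""
    return [{"targets": list(targets), "labels": dict(labels)}]


# Assumption: 192.0.2.10 is a placeholder for the docker compose host,
# and 9114 is the port where elasticsearch_exporter is exposed.
target_groups = build_file_sd(
    ["192.0.2.10:9114"],
    {"service": "elasticsearch", "deploy": "docker-compose"},
)

if __name__ == "__main__":
    # Print the JSON so it can be redirected into the file Prometheus watches.
    print(json.dumps(target_groups, indent=2))
```

Prometheus re-reads such files when they change, so adding another ES host would only require regenerating the file rather than editing the main configuration.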
The implementation is as follows:
Collection side: elasticsearch_exporter

The metrics that can be monitored are:
| Name | Type | Cardinality | Help |
| --- | --- | --- | --- |
| elasticsearch_breakers_estimated_size_bytes | gauge | 4 | Estimated size in bytes of breaker |
| elasticsearch_breakers_limit_size_bytes | gauge | 4 | Limit size in bytes for breaker |
| elasticsearch_breakers_tripped | counter | 4 | tripped for breaker |
| elasticsearch_cluster_health_active_primary_shards | gauge | 1 | The number of primary shards in your cluster. This is an aggregate total across all indices. |
| elasticsearch_cluster_health_active_shards | gauge | 1 | Aggregate total of all shards across all indices, which includes replica shards. |
| elasticsearch_cluster_health_delayed_unassigned_shards | gauge | 1 | Shards delayed to reduce reallocation overhead |
| elasticsearch_cluster_health_initializing_shards | gauge | 1 | Count of shards that are being freshly created. |
| elasticsearch_cluster_health_number_of_data_nodes | gauge | 1 | Number of data nodes in the cluster. |
| elasticsearch_cluster_health_number_of_in_flight_fetch | gauge | 1 | The number of ongoing shard info requests. |
| elasticsearch_cluster_health_number_of_nodes | gauge | 1 | Number of nodes in the cluster. |
| elasticsearch_cluster_health_number_of_pending_tasks | gauge | 1 | Cluster level changes which have not yet been executed |
| elasticsearch_cluster_health_task_max_waiting_in_queue_millis | gauge | 1 | Max time in millis that a task is waiting in queue. |
| elasticsearch_cluster_health_relocating_shards | gauge | 1 | The number of shards that are currently moving from one node to another node. |
| elasticsearch_cluster_health_status | gauge | 3 | Whether all primary and replica shards are allocated. |
| elasticsearch_cluster_health_timed_out | gauge | 1 | Number of cluster health checks timed out |
| elasticsearch_cluster_health_unassigned_shards | gauge | 1 | The number of shards that exist in the cluster state, but cannot be found in the cluster itself. |
| elasticsearch_clustersettings_stats_max_shards_per_node | gauge | 0 | Current maximum number of shards per node setting. |
| elasticsearch_filesystem_data_available_bytes | gauge | 1 | Available space on block device in bytes |
| elasticsearch_filesystem_data_free_bytes | gauge | 1 | Free space on block device in bytes |
| elasticsearch_filesystem_data_size_bytes | gauge | 1 | Size of block device in bytes |
| elasticsearch_filesystem_io_stats_device_operations_count | gauge | 1 | Count of disk operations |
| elasticsearch_filesystem_io_stats_device_read_operations_count | gauge | 1 | Count of disk read operations |
| elasticsearch_filesystem_io_stats_device_write_operations_count | gauge | 1 | Count of disk write operations |
| elasticsearch_filesystem_io_stats_device_read_size_kilobytes_sum | gauge | 1 | Total kilobytes read from disk |
| elasticsearch_filesystem_io_stats_device_write_size_kilobytes_sum | gauge | 1 | Total kilobytes written to disk |
| elasticsearch_indices_active_queries | gauge | 1 | The number of currently active queries |
| elasticsearch_indices_docs | gauge | 1 | Count of documents on this node |
| elasticsearch_indices_docs_deleted | gauge | 1 | Count of deleted documents on this node |
| elasticsearch_indices_docs_primary | gauge | | Count of documents with only primary shards on all nodes |
| elasticsearch_indices_fielddata_evictions | counter | 1 | Evictions from field data |
| elasticsearch_indices_fielddata_memory_size_bytes | gauge | 1 | Field data cache memory usage in bytes |
| elasticsearch_indices_filter_cache_evictions | counter | 1 | Evictions from filter cache |
| elasticsearch_indices_filter_cache_memory_size_bytes | gauge | 1 | Filter cache memory usage in bytes |
| elasticsearch_indices_flush_time_seconds | counter | 1 | Cumulative flush time in seconds |
| elasticsearch_indices_flush_total | counter | 1 | Total flushes |
| elasticsearch_indices_get_exists_time_seconds | counter | 1 | Total time get exists in seconds |
| elasticsearch_indices_get_exists_total | counter | 1 | Total get exists operations |
| elasticsearch_indices_get_missing_time_seconds | counter | 1 | Total time of get missing in seconds |
| elasticsearch_indices_get_missing_total | counter | 1 | Total get missing |
| elasticsearch_indices_get_time_seconds | counter | 1 | Total get time in seconds |

...
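To make the metrics listed above more concrete, here is a minimal sketch that reads a few lines in the Prometheus text format and derives the cluster color from `elasticsearch_cluster_health_status`. The sample text is hand-written for illustration: the cluster name `es-docker` is hypothetical, and the `cluster` and `color` labels reflect how the exporter is assumed to expose this metric (one series per color, with value 1 for the current one), which matches the cardinality of 3 shown for it. The parser handles only this simple subset of the format, not label values that contain commas or escaped quotes.

```python
from collections import defaultdict

# Hand-written sample, NOT captured from a real exporter.
SAMPLE = """\
# HELP elasticsearch_cluster_health_status Whether all primary and replica shards are allocated.
# TYPE elasticsearch_cluster_health_status gauge
elasticsearch_cluster_health_status{cluster="es-docker",color="green"} 0
elasticsearch_cluster_health_status{cluster="es-docker",color="red"} 0
elasticsearch_cluster_health_status{cluster="es-docker",color="yellow"} 1
elasticsearch_cluster_health_unassigned_shards{cluster="es-docker"} 3
"""


def parse_metrics(text):
    """Parse a simple subset of the Prometheus text format into
    {metric_name: [(labels_dict, value), ...]}."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name, _, rest = series.partition("{")
        labels = {}
        if rest:
            for pair in rest.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        samples[name].append((labels, float(value)))
    return dict(samples)


def cluster_color(samples):
    """Return the color whose status series has value 1, or None."""
    for labels, value in samples.get("elasticsearch_cluster_health_status", []):
        if value == 1:
            return labels.get("color")
    return None


metrics = parse_metrics(SAMPLE)
color = cluster_color(metrics)
unassigned = metrics["elasticsearch_cluster_health_unassigned_shards"][0][1]
```

An alert on this data would typically fire when the color is `red`, or when it stays `yellow` with unassigned shards for longer than some chosen period. In the setup described here that logic would live in Prometheus alerting rules rather than in a script.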
Display side: based on Grafana
