Skip to content

Huawei Cloud Search Service CSS for Elasticsearch

Collect monitoring Metrics for Huawei Cloud Search Service CSS for Elasticsearch

Configuration

Install Func

It is recommended to activate the Guance integration - extension - DataFlux Func (Automata): all prerequisites are automatically installed, please continue with the script installation

If you deploy Func on your own, refer to Self-deployed Func

Install Script

Note: Please prepare a qualified Huawei Cloud AK in advance (for simplicity, you can directly grant global read-only permission ReadOnlyAccess)

To synchronize the monitoring data of Huawei Cloud Search Service CSS for Elasticsearch, we install the corresponding collection script: access the web service of func and enter 【Script Market】-【Details】, search by css keyword, and install 「Guance Integration (Huawei Cloud-CSS)」(ID: guance_huaweicloud_css)

After clicking 【Install】, input the corresponding parameters: Huawei Cloud AK, SK, Huawei Cloud account name.

Click 【Deploy Startup Script】, and the system will automatically create a Startup script set, and automatically configure the corresponding startup script.

After the script installation is complete, find the script 「Guance Integration (Huawei Cloud-CSS)」 in the "Development" section of Func, expand and modify this script. Find collector_configs and monitor_configs respectively and edit the content under region_projects. Change the region and Project ID to the actual region and Project ID, then click Save and Publish.

In addition, see the corresponding automatic trigger configuration in the 「Management / Automatic Trigger Configuration」. Click 【Execute】, and it will be executed immediately without waiting for the scheduled time. After a while, you can view the execution task records and corresponding logs.

Verification

  1. In 「Management / Automatic Trigger Configuration」, confirm whether the corresponding tasks have the corresponding automatic trigger configurations. You can also check the corresponding task records and logs to see if there are any abnormalities.
  2. On the Guance platform, in 「Infrastructure - Resource Catalog」, check if there is asset information.
  3. On the Guance platform, in 「Metrics」, check if there are corresponding monitoring data.

Metrics

Configure Huawei Cloud CSS Metrics. More Metrics can be collected through configuration Huawei Cloud CSS Metric Details

Instance Monitoring Metrics

The instance performance monitoring Metrics for Huawei Cloud Search Service CSS for Elasticsearch are shown in the following table. For more Metrics, please refer to Table 1

Metric ID Metric Name Metric Meaning Value Range Monitoring Period (Raw Metric)
status Cluster Health Status This Metric is used to statistically measure the status of the monitored object. 0,1,2,3;
0: The cluster is 100% available.
1: The data is complete, but some replicas are missing. High availability is somewhat weakened, posing risks, so please pay attention to the cluster situation promptly.
2: Data is missing, and anomalies will occur when using the cluster.
3: Cluster status not obtained.
1 minute
indices_count Index Count Number of indices in the CSS cluster. ≥ 0 1 minute
total_shards_count Shard Count Number of shards in the CSS cluster. ≥ 0 1 minute
primary_shards_count Primary Shard Count Number of primary shards in the CSS cluster. ≥ 0 1 minute
coordinating_nodes_count Coordinating Node Count Number of coordinating nodes in the CSS cluster. ≥ 0 1 minute
data_nodes_count Data Node Count Number of data nodes in the CSS cluster. ≥ 0 1 minute
SearchRate Average Query Rate Query QPS, the average number of query operations per second performed by the cluster. ≥ 0 1 minute
IndexingRate Average Indexing Rate Ingest TPS, the average number of indexing operations per second performed by the cluster. ≥ 0 1 minute
IndexingLatency Average Indexing Latency Average time required to complete an indexing operation on a shard. ≥ 0 ms 1 minute
SearchLatency Average Query Latency Average time required to complete a search operation on a shard. ≥ 0 ms 1 minute
avg_cpu_usage Average CPU Usage Average CPU utilization across all nodes in the CSS cluster. 0-100% 1 minute
avg_mem_used_percent Average Used Memory Ratio Average ratio of memory used across all nodes in the CSS cluster. 0-100% 1 minute
disk_util Disk Utilization This Metric is used to statistically measure the disk usage of the object. 0-100% 1 minute
avg_load_average Average Node Load Value Average value of the 1-minute average queue length across all nodes in the CSS cluster within the operating system. ≥ 0 1 minute
avg_jvm_heap_usage Average JVM Heap Usage Average JVM heap memory usage across all nodes in the CSS cluster. 0-100% 1 minute
sum_current_opened_http_count Current Opened HTTP Connections Sum of Http connections opened and not yet closed across all nodes in the CSS cluster. ≥ 0 1 minute
avg_thread_pool_write_queue Average Number of Queued Tasks in Write Queue Average number of queued tasks in the write thread pool across all nodes in the CSS cluster. ≥ 0 1 minute
avg_thread_pool_search_queue Average Number of Queued Tasks in Search Queue Average number of queued tasks in the search thread pool across all nodes in the CSS cluster. ≥ 0 1 minute
avg_thread_pool_force_merge_queue Average Number of Queued Tasks in ForceMerge Queue Average number of queued tasks in the force merge thread pool across all nodes in the CSS cluster. ≥ 0 1 minute
avg_thread_pool_write_rejected Average Number of Rejected Tasks in Write Queue Average number of rejected tasks in the write thread pool across all nodes in the CSS cluster. ≥ 0 1 minute
avg_jvm_old_gc_count Average JVM Old Generation GC Count Average cumulative value of the number of times "old generation" garbage collection has run across all nodes in the CSS cluster. ≥ 0 1 minute
avg_jvm_old_gc_time Average JVM Old Generation GC Time Average cumulative value of the time spent executing "old generation" garbage collection across all nodes in the CSS cluster. ≥ 0 ms 1 minute
avg_jvm_young_gc_count Average JVM Young Generation GC Count Average cumulative value of the number of times "young generation" garbage collection has run across all nodes in the CSS cluster. ≥ 0 1 minute
avg_jvm_young_gc_time Average JVM Young Generation GC Time Average cumulative value of the time spent executing "young generation" garbage collection across all nodes in the CSS cluster. ≥ 0 ms 1 minute

Objects

The Object data structure for the collected Huawei Cloud Search Service CSS for Elasticsearch can be seen in 「Infrastructure - Resource Catalog」

{
  "measurement": "huaweicloud_css",
  "tags": {
    "RegionId"                   : "cn-north-4",
    "project_id"                 : "xxxxxxx",
    "enterpriseProjectId"        : "",
    "instance_id"                : "xxxxxxx-xxxxxxx-xxxxxxx-00001",
    "instance_name"              : "css-3384",
    "publicIp"                   : "xxxxx",
    "status"                     : "100",
    "endpoint"                   : "192.168.0.100:9200",
  },
  "fields": {
    "vpc_id"                     : "3dda7d4b-aec0-4838-a91a-28xxxxxxxx",
    "subnetId"                   : "xxxxx",
    "securityGroupId"            : "xxxxxxx",
    "datastore"                           : "{\"supportSecuritymode\": false, \"type\": \"elasticsearch\", \"version\": \"7.6.2\"}",
    "instances"                           : "[{\"azCode\": \"cn-east-3a\", \"id\": \"95f61e90-507b-48d4-8ac5-53dcefd155a3\", \"ip\": \"192.168.0.140\", \"name\": \"css-test-ess-esn-1-1\", \"specCode\": \"ess.spec-kc1.xlarge.2\", \"status\": \"200\", \"type\": \"ess\", \"volume\": {\"size\": 40, \"type\": \"HIGH\"}}]",
    "publicKibanaResp"                    : "xxxx",
    "elbWhiteList"                        : "xxxx",
    "updated"                             : "2023-06-27T07:35:29",
    "created"                             : "2023-06-27T07:35:29",
    "bandwidthSize"                       : "100",
    "actions"                             : "REBOOTING",
    "tags"                                : "xxxx",
    "period"                              : true, 
  }
}

Descriptions of some parameters are as follows:

Parameter Name Description
status Cluster status value
updated Last modified time of the cluster, ISO8601 format
bandwidthSize Public bandwidth, unit: Mbit/s
actions Current actions of the cluster
period Whether it is a subscription cluster

Meanings of the values for status (cluster status value):

Value Description
100 Creating
200 Available
303 Unavailable

Meanings of the values for actions (current actions of the cluster):

Value Description
REBOOTING Rebooting
GROWING Scaling up
RESTORING Restoring the cluster
SNAPSHOTTING Creating a snapshot

Meanings of the values for period:

Value Description
true Subscription-billed cluster
false Pay-as-you-go billed cluster

Note: Fields in tags and fields may change with subsequent updates.

Hint: The value of tags.instance_id is the cluster ID, which serves as a unique identifier.

Feedback

Is this page helpful? ×