Collector "Alibaba Cloud - CloudMonitor" Configuration Manual

Before reading this document, please read:

Cloud Integration

Tip

Before using this collector, you must install "Integration Core" and its accompanying third-party dependencies.

Tip

This collector is multi-threaded, with five threads enabled by default. To change the thread pool size, set the environment variable COLLECTOR_THREAD_POOL_SIZE.

1. Configuration Structure

The configuration structure of this collector is as follows:

Field	Type	Required	Description
`targets`	list	Required	List of CloudMonitor collection target configurations. The logical relationship between multiple configurations in the same namespace is "AND".
`targets[#].namespace`	str	Required	The CloudMonitor namespace to collect, for example `'acs_ecs_dashboard'`. See the appendix for the complete list.
`targets[#].metrics`	list	Required	List of CloudMonitor metric names to collect. See the appendix for the complete list.
`targets[#].metrics[#]`	str	Required	Metric name pattern. Supports `"NOT"` and wildcard matching. Normally, the logical relationship between multiple patterns is "OR". When `"NOT"` is included, the logical relationship between multiple patterns is "AND". See below for details.
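
As a rough illustration of the structure in the table above, the following sketch checks a collector_configs dictionary before it is handed to the collector. The function name validate_collector_configs is hypothetical and not part of the collector. It only encodes the field types and "Required" markers listed in the table, plus the assumption (taken from the examples in this document) that each entry of targets is a dict.

```python
def validate_collector_configs(configs):
    """Return a list of problems found in a collector_configs dict (hypothetical helper)."""
    errors = []
    targets = configs.get('targets')
    if not isinstance(targets, list):
        return ["'targets' is required and must be a list"]

    for i, target in enumerate(targets):
        if not isinstance(target, dict):
            errors.append(f'targets[{i}] must be a dict')
            continue
        if not isinstance(target.get('namespace'), str):
            errors.append(f'targets[{i}].namespace is required and must be a str')
        metrics = target.get('metrics')
        if not isinstance(metrics, list):
            errors.append(f'targets[{i}].metrics is required and must be a list')
            continue
        for j, pattern in enumerate(metrics):
            if not isinstance(pattern, str):
                errors.append(f'targets[{i}].metrics[{j}] must be a str')
    return errors


good = {'targets': [{'namespace': 'acs_ecs_dashboard', 'metrics': ['CPUUtilization']}]}
bad = {'targets': [{'namespace': 'acs_ecs_dashboard', 'metrics': 'CPUUtilization'}]}

print(validate_collector_configs(good))
print(validate_collector_configs(bad))
```

A check like this catches the common mistake of passing a single metric name as a string instead of a list.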

2. Configuration Examples

Specifying Specific Metrics

Collect the two ECS metrics CPUUtilization and concurrentConnections.

collector_configs = {
    'targets': [
        {
            'namespace': 'acs_ecs_dashboard',
            'metrics'  : ['CPUUtilization', 'concurrentConnections'],
        },
    ],
}

Wildcard Matching Metrics

Metric names can use the * wildcard for matching.

In this example, the following metrics will be collected:

Metrics named CPUUtilization.
Metrics whose names start with CPU.
Metrics whose names end with Connections.
Metrics whose names contain Conn.

collector_configs = {
    'targets': [
        {
            'namespace': 'acs_ecs_dashboard',
            'metrics'  : ['CPUUtilization', 'CPU*', '*Connections', '*Conn*'],
        },
    ],
}
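
To make the "OR" relationship between patterns concrete, here is a minimal sketch that approximates the wildcard matching with Python's standard fnmatch module. The collector's actual matching code is not shown in this document, so treat this as an assumption about its behavior. The list of available metric names is illustrative only (CPUCreditUsage and ConnCount are made-up names).

```python
from fnmatch import fnmatchcase

def match_any(metric_name, patterns):
    """A metric is selected if it matches at least one pattern ("OR")."""
    return any(fnmatchcase(metric_name, pattern) for pattern in patterns)

patterns = ['CPUUtilization', 'CPU*', '*Connections', '*Conn*']

# Illustrative metric names, not a real namespace listing.
available = ['CPUUtilization', 'CPUCreditUsage', 'concurrentConnections',
             'ConnCount', 'DiskReadBPS']

selected = [name for name in available if match_any(name, patterns)]
print(selected)
```

Note that fnmatchcase is case-sensitive and also gives special meaning to ? and [...], which this document does not mention for the collector.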

Excluding Specific Metrics

Placing "NOT" at the beginning of the list excludes every metric matched by the patterns that follow it.

In this example, the following metrics will not be collected:

Metrics named CPUUtilization.
Metrics whose names start with CPU.
Metrics whose names end with Connections.
Metrics whose names contain Conn.

collector_configs = {
    'targets': [
        {
            'namespace': 'acs_ecs_dashboard',
            'metrics'  : ['NOT', 'CPUUtilization', 'CPU*', '*Connections', '*Conn*'],
        },
    ],
}

Multi-step Filtering for Required Metrics

The same namespace can be specified multiple times, and the metric names will be filtered step by step from top to bottom.

In this example, the metric names are filtered as follows:

Select all metrics whose names contain CPU.
From the results of the previous step, exclude metrics named CPUUtilization.

collector_configs = {
    'targets': [
        {
            'namespace': 'acs_ecs_dashboard',
            'metrics'  : ['*CPU*'],
        },
        {
            'namespace': 'acs_ecs_dashboard',
            'metrics'  : ['NOT', 'CPUUtilization'],
        },
    ],
}
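
The step-by-step filtering can be sketched as follows. This is an approximation under stated assumptions, not the collector's own code: matching is done with the standard fnmatch module, a leading 'NOT' turns the remaining patterns into exclusions, and targets for the same namespace are applied in order so that each one narrows the previous result. The helper names and the list of available metrics are hypothetical.

```python
from fnmatch import fnmatchcase

def apply_target(candidates, patterns):
    """Apply one target's patterns to the current candidate list."""
    if patterns and patterns[0] == 'NOT':
        excluded = patterns[1:]
        # Keep a metric only if it matches none of the excluded patterns.
        return [m for m in candidates
                if not any(fnmatchcase(m, p) for p in excluded)]
    # Keep a metric if it matches at least one pattern.
    return [m for m in candidates
            if any(fnmatchcase(m, p) for p in patterns)]

def filter_metrics(available, targets, namespace):
    """Run every target of the namespace in order, top to bottom."""
    candidates = list(available)
    for target in targets:
        if target['namespace'] == namespace:
            candidates = apply_target(candidates, target['metrics'])
    return candidates

collector_configs = {
    'targets': [
        {'namespace': 'acs_ecs_dashboard', 'metrics': ['*CPU*']},
        {'namespace': 'acs_ecs_dashboard', 'metrics': ['NOT', 'CPUUtilization']},
    ],
}

# Illustrative metric names only.
available = ['CPUUtilization', 'CPUCreditBalance', 'HostCPUIdle', 'DiskReadBPS']
result = filter_metrics(available, collector_configs['targets'], 'acs_ecs_dashboard')
print(result)
```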

Configuring Filters (Optional)

This collector script supports user-defined filters, allowing users to filter target resources based on object attributes. The filter function returns True or False.

True: The target resource should be collected.
False: The target resource should not be collected.

# Example: enable a filter based on the InstanceId and RegionId attributes. The configuration format is as follows:

def filter_instance(instance, namespace='acs_ecs_dashboard'):
    '''
    Collect metrics only for instances i-xxxxxa and i-xxxxxb in region cn-hangzhou.
    '''
    instance_id = instance['tags'].get('InstanceId')
    region_id = instance['tags'].get('RegionId')
    # True: collect this resource; False: skip it.
    return instance_id in ['i-xxxxxa', 'i-xxxxxb'] and region_id in ['cn-hangzhou']

from integration_core__runner import Runner
import integration_alibabacloud_monitor__main as main

@DFF.API('AlibabaCloud-Monitor Collection', timeout=3600, fixed_crontab="*/5 * * * *")
def run():
    Runner(main.DataCollector(account, collector_configs, filters=[filter_instance])).run()

Tip

When multiple filters are configured under the same namespace, all filters must be satisfied for the data to be reported.
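
The "all filters must be satisfied" rule can be pictured with the sketch below. It splits the earlier example into two separate filters and combines them with all(). The passes_all helper is hypothetical and only mimics the described behavior; how the collector invokes filters internally is not documented here.

```python
def filter_by_instance_id(instance, namespace='acs_ecs_dashboard'):
    '''Keep only instances i-xxxxxa and i-xxxxxb.'''
    return instance['tags'].get('InstanceId') in ['i-xxxxxa', 'i-xxxxxb']

def filter_by_region(instance, namespace='acs_ecs_dashboard'):
    '''Keep only instances in cn-hangzhou.'''
    return instance['tags'].get('RegionId') in ['cn-hangzhou']

def passes_all(instance, filters):
    '''Hypothetical helper: a resource is kept only if every filter returns True.'''
    return all(f(instance) for f in filters)

filters = [filter_by_instance_id, filter_by_region]

kept = {'tags': {'InstanceId': 'i-xxxxxa', 'RegionId': 'cn-hangzhou'}}
dropped = {'tags': {'InstanceId': 'i-xxxxxa', 'RegionId': 'cn-beijing'}}

print(passes_all(kept, filters))
print(passes_all(dropped, filters))
```

In a real script the two functions would be passed together, as in filters=[filter_by_instance_id, filter_by_region].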

3. Data Reporting Format

After data is successfully synchronized, you can view the data in the "Metrics" section of Guance.

Take the following collector configuration as an example:

collector_configs = {
    'targets': [
        {
            'namespace': 'acs_ecs_dashboard',
            'metrics'  : ['CPUUtilization'],
        },
    ],
}

An example of the reported data is as follows:

{
  "measurement": "aliyun_acs_ecs_dashboard",
  "tags": {
    "instanceId": "i-xxxxx",
    "userId"    : "xxxxx"
  },
  "fields": {
    "CPUUtilization_Average": 1.23,
    "CPUUtilization_Maximum": 1.23,
    "CPUUtilization_Minimum": 1.23
  }
}

Tip

All metric values are reported as floats.
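
The naming pattern visible in the example above (measurement prefixed with aliyun_, field names built as metric name plus statistic, values cast to float) can be sketched like this. The build_point function is hypothetical, and the shape of the input datapoint (a flat dict with timestamp, dimension keys, and Average/Maximum/Minimum) is an assumption about what a DescribeMetricLast datapoint looks like, not something this document specifies.

```python
STATISTICS = ('Average', 'Maximum', 'Minimum')

def build_point(namespace, metric_name, datapoint):
    '''Hypothetical helper: turn one raw datapoint into the reported format.'''
    # Everything that is neither a statistic nor the timestamp is treated as a tag.
    tags = {key: str(value) for key, value in datapoint.items()
            if key not in STATISTICS and key != 'timestamp'}
    # Every statistic becomes "<metric>_<statistic>" and is cast to float.
    fields = {f'{metric_name}_{stat}': float(datapoint[stat])
              for stat in STATISTICS if stat in datapoint}
    return {
        'measurement': f'aliyun_{namespace}',
        'tags': tags,
        'fields': fields,
    }

raw = {
    'timestamp': 1682062333000,
    'instanceId': 'i-xxxxx',
    'userId': 'xxxxx',
    'Average': 1.23,
    'Maximum': 2,   # integer in the raw data, reported as 2.0
    'Minimum': 1,
}

point = build_point('acs_ecs_dashboard', 'CPUUtilization', raw)
print(point)
```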

4. Interaction with Custom Object Collectors

When other custom object collectors (such as ECS or RDS) are running in the same DataFlux Func, this collector automatically tries to match the tags.instanceId field of its own data against the tags.name field of the custom objects.

Because the CloudMonitor collector needs the custom object information to be available before it can perform this matching, it is generally recommended to place CloudMonitor collectors at the end of the collector list, for example:

    # Create collectors
    collectors = [
        aliyun_ecs.DataCollector(account, common_aliyun_configs),
        aliyun_rds.DataCollector(account, common_aliyun_configs),
        aliyun_slb.DataCollector(account, common_aliyun_configs),
        aliyun_oss.DataCollector(account, common_aliyun_configs),
        aliyun_monitor.DataCollector(account, monitor_collector_configs), # CloudMonitor-type collectors are generally placed last.
    ]

When a match succeeds, every tag of the matched custom object except name is added to the tags of the monitoring data. This makes it possible, for example, to filter CloudMonitor metric data by instance name. The specific effect is as follows:

Assume the original data collected by CloudMonitor is as follows:

{
  "measurement": "aliyun_acs_ecs_dashboard",
  "tags": {
    "instanceId": "i-001",
    "{...}"     : "{...}"
  },
  "fields": {
    "{...}"     : "{...}"
  }
}

Meanwhile, the custom object data collected by the Alibaba Cloud ECS collector is as follows:

{
  "measurement": "aliyun_ecs",
  "tags": {
    "name"      : "i-001",
    "InstanceId": "i-001",
    "RegionId"  : "cn-hangzhou",
    "{...}"     : "{...}"
  },
  "fields": {
    "{...}"     : "{...}"
  }
}

Then, the final reported CloudMonitor data will be as follows:

{
  "measurement": "aliyun_acs_ecs_dashboard",
  "tags": {
    "instanceId": "i-001",
    "RegionId"  : "cn-hangzhou",
    "{...}"     : "{...}"
  },
  "fields": {
    "{...}"     : "{...}"
  }
}
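
The matching and tag merging described in this section can be sketched as follows. The merge_object_tags function is hypothetical and only mirrors the documented behavior: match tags.instanceId against tags.name, then copy every tag except name. Which side wins when both carry the same tag key is not stated in this document; the sketch assumes the monitoring data's own tags are kept.

```python
def merge_object_tags(monitor_point, custom_objects):
    '''Hypothetical helper: enrich a CloudMonitor point with custom object tags.'''
    # Index custom objects by tags.name for lookup.
    by_name = {obj['tags']['name']: obj for obj in custom_objects}
    matched = by_name.get(monitor_point['tags'].get('instanceId'))
    if matched is None:
        return monitor_point  # no match: report the point unchanged

    # Copy every tag of the matched object except "name".
    extra = {k: v for k, v in matched['tags'].items() if k != 'name'}
    merged = dict(monitor_point)
    # Assumption: on a key conflict, the monitoring data's own tag wins.
    merged['tags'] = {**extra, **monitor_point['tags']}
    return merged

monitor_point = {
    'measurement': 'aliyun_acs_ecs_dashboard',
    'tags': {'instanceId': 'i-001'},
    'fields': {'CPUUtilization_Average': 1.23},
}
ecs_object = {
    'measurement': 'aliyun_ecs',
    'tags': {'name': 'i-001', 'InstanceId': 'i-001', 'RegionId': 'cn-hangzhou'},
    'fields': {},
}

merged = merge_object_tags(monitor_point, [ecs_object])
print(merged['tags'])
```

Following the rule literally, InstanceId from the custom object is copied as well; the abbreviated example above would show it under the "{...}" placeholder.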

5. CloudMonitor API Call Count Explanation

Alibaba Cloud CloudMonitor applies a free quota to some API calls (currently, 1 million query API calls per month are free, and calls beyond the quota are charged at 0.12 yuan per 10,000 calls). The DescribeMetricLast API used by this collector counts toward this quota. The following explains in detail how many calls this script set makes:

1. Determining whether the free quota will be exceeded when collecting multiple monitoring items for multiple resources:

This collector uses DescribeMetricLast (which queries the latest monitoring data of a specified monitoring item) to fetch one monitoring item for multiple resources in a single request. One request covers up to 1000 resources; beyond that, pagination is used. Examples of the request count:

If the account has 1000 ECS resources and 1 monitoring item (CPUUtilization) is collected, 1 request is needed.
If the account has 1000 ECS resources and 2 monitoring items (CPUUtilization and DiskReadBPS) are collected, 2 requests are needed (one request per monitoring item).
If the account has 1001 ECS resources and 1 monitoring item (CPUUtilization) is collected, 2 requests are needed (the resources exceed 1000, so pagination is used).
If the account has 1001 ECS resources and 2 monitoring items (CPUUtilization and DiskReadBPS) are collected, 4 requests are needed.
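
The four cases above follow one rule: requests per run = number of monitoring items × ceil(number of resources / 1000). The sketch below encodes that rule and projects it onto a month. The function names are hypothetical. The monthly projection assumes a run every 5 minutes (as in the crontab of the filter example) and a 30-day month, and the cost estimate assumes the overage is billed linearly at 0.12 yuan per 10,000 calls, which may differ from the actual billing granularity. Only DescribeMetricLast calls are counted.

```python
import math

PAGE_SIZE = 1000            # resources covered by one DescribeMetricLast request
FREE_CALLS_PER_MONTH = 1_000_000
PRICE_PER_10K_CALLS = 0.12  # yuan, for calls beyond the free quota

def estimate_requests(resource_count, metric_count, page_size=PAGE_SIZE):
    '''Requests needed for one collection run in one namespace.'''
    pages = math.ceil(resource_count / page_size)
    return metric_count * pages

def estimate_monthly_calls(requests_per_run, runs_per_hour=12, days=30):
    '''Assumes a fixed schedule, e.g. every 5 minutes for 30 days.'''
    return requests_per_run * runs_per_hour * 24 * days

def estimate_monthly_cost(monthly_calls):
    '''Assumes linear billing of the calls beyond the free quota.'''
    overage = max(0, monthly_calls - FREE_CALLS_PER_MONTH)
    return overage / 10_000 * PRICE_PER_10K_CALLS

per_run = estimate_requests(resource_count=1001, metric_count=2)
monthly = estimate_monthly_calls(per_run)
print(per_run, monthly, estimate_monthly_cost(monthly))
```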

2. View the actual call count in the task execution log:

The collector counts the API calls made during each task execution and writes the totals to the log, for example:

[2023-04-21 15:32:13.194] [+0ms] The [1] account collection is completed, total execution time [274 ms], during which API was called [2 times]
[2023-04-21 15:32:13.194] [+0ms] Detailed calls are as follows:
[2023-04-21 15:32:13.194] [+0ms] -> metrics.aliyuncs.com/?Action=DescribeMetricMetaList: 1 time
[2023-04-21 15:32:13.194] [+0ms] -> metrics.aliyuncs.com/?Action=DescribeMetricLast: 1 time

Warning

Because CloudMonitor API calls are subject to a free quota, it is recommended to configure only the monitoring items you need. Broad wildcard matching can select many more metrics than intended and cause additional costs.

Precautions

Common Errors and Solutions

The number of collected instances does not match the actual number of instances

Cause: The instance is shut down.

Solution:

Start the instance.

X. Appendix

Please refer to the Alibaba Cloud official documentation: