Collector "Tencent Cloud - Cloud Monitor" Configuration Manual¶
Before reading this document, please read:
Tip
Before using this collector, you must install "Integration Core" and its accompanying third-party dependencies.
Tip
Before collecting Tencent Cloud Cloud Monitor data, you must first configure the custom object collector for the corresponding product.
Tip
This collector runs multi-threaded by default, with five threads. To change the thread pool size, set the environment variable COLLECTOR_THREAD_POOL_SIZE.
1. Configuration Structure¶
The configuration structure of this collector is as follows:
Field | Type | Required | Description |
---|---|---|---|
regions | list | Required | List of regions for cloud monitoring data collection |
regions[#] | str | Required | Region ID, for example ap-shanghai. See the appendix for the complete list |
targets | list | Required | List of cloud monitoring object configurations. Multiple configurations under the same namespace are combined with "AND" |
targets[#].namespace | str | Required | The namespace for cloud monitoring data collection, for example QCE/CVM. See the appendix for the complete list |
targets[#].metrics | list | Required | List of cloud monitoring metric names. See the appendix for the complete list |
targets[#].metrics[#] | str | Required | Metric name pattern; supports "NOT" and wildcard matching. Normally, multiple patterns are combined with "OR". When "NOT" is included, they are combined with "AND". See below for details |
2. Configuration Examples¶
Specify Specific Metrics¶
Collect the two metrics WanOuttraffic and WanOutpkg from QCE/CVM:
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['WanOuttraffic', 'WanOutpkg'],
        }
    ]
}
Wildcard Matching Metrics¶
Metric names can use the * wildcard for matching.
In this example, the following metrics will be collected:
- The metric named WanOutpkg
- Metrics starting with Wan
- Metrics ending with Outpkg
- Metrics containing Out
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['WanOutpkg', 'Wan*', '*Outpkg', '*Out*']
        }
    ]
}
Exclude Specific Metrics¶
Adding "NOT" at the beginning of the list excludes the metrics that follow.
In this example, the following metrics will not be collected:
- The metric named WanOutpkg
- Metrics starting with Wan
- Metrics ending with Outpkg
- Metrics containing Out
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['NOT', 'WanOutpkg', 'Wan*', '*Outpkg', '*Out*']
        }
    ]
}
Multiple Filters for Specified Metrics¶
The same namespace can be specified multiple times, and metrics will be filtered sequentially from top to bottom.
In this example, the following filtering steps are performed on the metric names:
1. Select all metrics containing Out in their names.
2. From the previous result, exclude the metric named WanOutpkg.
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['*Out*']
        },
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['NOT', 'WanOutpkg']
        }
    ]
}
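The matching rules from the examples above can be sketched in a few lines of Python. This is an illustrative sketch only, not the collector's actual implementation: the helper names `select_metrics` and `apply_targets` are hypothetical, and it assumes that `*` behaves like the shell-style wildcard provided by the standard library's `fnmatch` module.

```python
from fnmatch import fnmatchcase

def select_metrics(metric_names, patterns):
    """Apply one 'metrics' list to a list of metric names (hypothetical helper)."""
    if patterns and patterns[0] == 'NOT':
        excluded = patterns[1:]
        # "AND": keep a metric only if it matches none of the excluded patterns
        return [m for m in metric_names
                if not any(fnmatchcase(m, p) for p in excluded)]
    # "OR": keep a metric if it matches at least one pattern
    return [m for m in metric_names
            if any(fnmatchcase(m, p) for p in patterns)]

def apply_targets(metric_names, targets, namespace):
    """Filter sequentially, top to bottom, through every target of the namespace."""
    result = list(metric_names)
    for target in targets:
        if target['namespace'] == namespace:
            result = select_metrics(result, target['metrics'])
    return result

# Sample metric names taken from the QCE/CVM table in this document
available = ['WanOutpkg', 'WanOuttraffic', 'LanOutpkg', 'CpuLoadavg']
targets = [
    {'namespace': 'QCE/CVM', 'metrics': ['*Out*']},
    {'namespace': 'QCE/CVM', 'metrics': ['NOT', 'WanOutpkg']},
]
selected = apply_targets(available, targets, 'QCE/CVM')
print(selected)
```

Running a configuration through a sketch like this before deploying it can help confirm that a wildcard does not select more metrics than intended.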
Configure Filters (Optional)¶
This collector script supports custom filters, allowing users to filter target resources based on object attributes. The filter function returns True or False.
- True: The target resource should be collected.
- False: The target resource should not be collected.
The attributes available for filtering are the same as those in the object data of the corresponding products: cloud server (CVM), cloud database (CDB, Redis, MongoDB), load balancer (CLB), object storage (COS), and others. For details, please refer to the Tencent Cloud Custom Object Collector documentation.
# Example: enable a filter based on the object's name and RegionId attributes
def filter_instance(instance, namespace='QCE/COS'):
    '''
    Collect metrics for objects named smart-xxxxa or smart-xxxxb whose RegionId is ap-nanjing
    '''
    instance_name = instance['name']
    region_id = instance['RegionId']
    if instance_name in ['smart-xxxxa', 'smart-xxxxb'] and region_id in ['ap-nanjing']:
        return True
    return False

from integration_core__runner import Runner
import integration_tencentcloud_monitor__main as main

def run():
    # account and collector_configs must be defined elsewhere in the script
    Runner(main.DataCollector(account, collector_configs, filters=[filter_instance])).run()
Tip
When multiple filters are configured under the same namespace, all filters must be satisfied for the data to be reported.
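The "all filters must be satisfied" rule can be illustrated with a short sketch. The two filter functions and the `passes_all_filters` helper below are hypothetical, as are the sample instances; how the collector itself dispatches filters per namespace is not shown in this document, so the sketch only demonstrates the AND combination.

```python
def filter_by_name(instance, namespace='QCE/CVM'):
    # Hypothetical filter: keep objects whose name starts with "smart-"
    return instance['name'].startswith('smart-')

def filter_by_region(instance, namespace='QCE/CVM'):
    # Hypothetical filter: keep objects located in ap-nanjing
    return instance['RegionId'] in ['ap-nanjing']

def passes_all_filters(instance, filters):
    # Data is reported only when every filter returns True
    return all(f(instance) for f in filters)

filters = [filter_by_name, filter_by_region]
instances = [
    {'name': 'smart-a', 'RegionId': 'ap-nanjing'},   # passes both filters
    {'name': 'smart-b', 'RegionId': 'ap-shanghai'},  # fails the region filter
    {'name': 'other-c', 'RegionId': 'ap-nanjing'},   # fails the name filter
]
kept = [i['name'] for i in instances if passes_all_filters(i, filters)]
print(kept)
```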
3. Data Collection Instructions¶
Cloud Product Configuration Information¶
Product Name | Namespace | Dimension | Description |
---|---|---|---|
Cloud Server | QCE/CVM | InstanceId | vm_uuid, vmUuid, uuid, and InstanceId in object data are all recognized as InstanceId |
Cloud Database MySQL | QCE/CDB | InstanceId, InstanceType | |
Object Storage Monitoring | QCE/COS | BucketName | |
Public Load Balancer Monitoring | QCE/LB_PUBLIC | vip | The Address field in object data is recognized as vip |
Private Load Balancer Monitoring | QCE/LB_PRIVATE | vip, vpcId | |
Cloud Database Redis | QCE/REDIS_MEM | InstanceId | Currently only Redis instance monitoring is supported; node monitoring is not yet supported |
Cloud Database MongoDB | QCE/CMONGO | InstanceId | Currently only MongoDB instance monitoring is supported; replica set and node monitoring are not yet supported |
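The field aliases mentioned in the Description column can be pictured as a per-namespace renaming step. The sketch below is an assumption about how such a mapping might look, not the collector's real code: `DIMENSION_ALIASES` and `normalize_dimensions` are hypothetical names, and the sample IP address is a documentation placeholder.

```python
# Hypothetical alias table built from the Description column above
DIMENSION_ALIASES = {
    'QCE/CVM': {'vm_uuid': 'InstanceId', 'vmUuid': 'InstanceId', 'uuid': 'InstanceId'},
    'QCE/LB_PUBLIC': {'Address': 'vip'},
}

def normalize_dimensions(namespace, object_data):
    """Rename aliased fields to the dimension name used by the namespace."""
    aliases = DIMENSION_ALIASES.get(namespace, {})
    normalized = {}
    for key, value in object_data.items():
        # setdefault keeps the first value seen if two aliases map to one name
        normalized.setdefault(aliases.get(key, key), value)
    return normalized

cvm = normalize_dimensions('QCE/CVM', {'vmUuid': 'i-xxx', 'name': 'web'})
clb = normalize_dimensions('QCE/LB_PUBLIC', {'Address': '203.0.113.10'})
print(cvm)
print(clb)
```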
Monitoring Metric Configuration Information¶
Note
Currently, the collector only supports collecting instance-level metrics. It is recommended that users configure according to the metric names corresponding to each namespace.
QCE/CVM¶
Metric Name (MetricName) | Metric Description |
---|---|
WanInpkg | Public Inbound Packet Count |
WanIntraffic | Public Inbound Bandwidth |
WanOutpkg | Public Outbound Packet Count |
WanOuttraffic | Public Outbound Bandwidth |
AccOuttraffic | Public Outbound Traffic |
BaseCpuUsage | Base CPU Usage |
CpuLoadavg | CPU One-Minute Average Load |
CPUUsage | CPU Utilization |
Cpuloadavg5m | CPU Five-Minute Average Load |
Cpuloadavg15m | CPU Fifteen-Minute Average Load |
CvmDiskUsage | Disk Utilization |
LanInpkg | Private Inbound Packet Count |
LanOutpkg | Private Outbound Packet Count |
LanIntraffic | Private Inbound Bandwidth |
LanOuttraffic | Private Outbound Bandwidth |
MemUsage | Memory Utilization |
MemUsed | Memory Usage |
TcpCurrEstab | TCP Connection Count |
TimeOffset | UTC Time and NTP Time Difference on Sub-Machine |
GpuMemTotal | GPU Memory Total |
GpuMemUsage | GPU Memory Utilization |
GpuMemUsed | GPU Memory Usage |
GpuPowDraw | GPU Power Usage |
GpuPowLimit | GPU Power Limit |
GpuPowUsage | GPU Power Utilization |
GpuTemp | GPU Temperature |
GpuUtil | GPU Utilization |
QCE/CDB¶
Metric Name (MetricName) | Metric Description |
---|---|
BytesReceived | Inbound Traffic |
BytesSent | Outbound Traffic |
Capacity | Disk Usage Space |
ComCommit | Commit Count |
ComDelete | Delete Count |
ComInsert | Insert Count |
ComReplace | Replace Count |
ComRollback | Rollback Count |
ComUpdate | Update Count |
ConnectionUseRate | Connection Utilization |
CpuUseRate | CPU Utilization |
CreatedTmpDiskTables | Disk Temporary Table Count |
CreatedTmpFiles | Temporary File Count |
CreatedTmpTables | Memory Temporary Table Count |
HandlerCommit | Internal Commit Count |
HandlerReadRndNext | Read Next Row Request Count |
HandlerRollback | Internal Rollback Count |
InnodbBufferPoolPagesFree | InnoDB Free Page Count |
InnodbBufferPoolPagesTotal | InnoDB Total Page Count |
InnodbBufferPoolReadRequests | InnoDB Buffer Pool Pre-Read Page Count |
InnodbBufferPoolReads | InnoDB Disk Read Page Count |
InnodbCacheHitRate | InnoDB Cache Hit Rate |
InnodbCacheUseRate | InnoDB Cache Utilization |
InnodbDataReads | InnoDB Total Read Count |
InnodbDataWrites | InnoDB Total Write Count |
InnodbDataWritten | InnoDB Write Count |
InnodbNumOpenFiles | Current InnoDB Open Table Count |
InnodbOsFileReads | InnoDB Disk Read Count |
InnodbOsFileWrites | InnoDB Disk Write Count |
InnodbOsFsyncs | InnoDB fsync Count |
InnodbRowLockTimeAvg | InnoDB Average Row Lock Time (Milliseconds) |
InnodbRowLockWaits | InnoDB Row Lock Wait Count |
InnodbRowsDeleted | InnoDB Row Delete Count |
InnodbRowsInserted | InnoDB Row Insert Count |
InnodbRowsRead | InnoDB Row Read Count |
InnodbRowsUpdated | InnoDB Row Update Count |
IOPS | Input/Output Operations Per Second (or Read/Write Count) |
KeyBlocksUnused | Key Cache Unused Block Count |
KeyBlocksUsed | Key Cache Used Block Count |
KeyCacheHitRate | MyISAM Cache Hit Rate |
KeyCacheUseRate | MyISAM Cache Utilization |
KeyReadRequests | Key Cache Read Block Count |
KeyReads | Disk Read Block Count |
KeyWriteRequests | Block Write to Key Buffer Count |
KeyWrites | Block Write to Disk Count |
LogCapacity | Log Usage |
MasterSlaveSyncDistance | Master-Slave Delay Distance |
MaxConnections | Maximum Connection Count |
MemoryUseRate | Memory Utilization |
MemoryUse | Memory Usage |
OpenFiles | Open File Count |
OpenedTables | Opened Table Count |
Qps | Queries Per Second |
Queries | Total Query Count |
QueryRate | Query Ratio |
RealCapacity | Disk Usage Space |
SecondsBehindMaster | Master-Slave Delay Time |
SelectCount | Select Count |
SelectScan | Full Table Scan Count |
SlaveIoRunning | IO Thread State |
SlaveSqlRunning | SQL Thread State |
SlowQueries | Slow Query Count |
TableLocksImmediate | Immediate Table Lock Count |
TableLocksWaited | Table Lock Wait Count |
ThreadsConnected | Current Connection Count |
ThreadsCreated | Created Thread Count |
ThreadsRunning | Running Thread Count |
Tps | Transactions Per Second |
VolumeRate | Disk Utilization |
InnodbDataRead | InnoDB Read Count |
QCE/COS¶
Metric Name (MetricName) | Metric Description |
---|---|
StdReadRequests | Standard Storage Read Requests |
StdRetrieval | Standard Data Retrieval |
StdWriteRequests | Standard Storage Write Requests |
IaRetrieval | Infrequent Data Retrieval |
IaWriteRequests | Infrequent Storage Write Requests |
IaReadRequests | Infrequent Storage Read Requests |
NlWriteRequests | Nl Write Requests |
NlRetrieval | Nl Retrieval |
CdnOriginTraffic | CDN Origin Traffic |
InternetTraffic | Internet Outbound Traffic |
InternalTraffic | Internal Outbound Traffic |
InboundTraffic | Total Upload Traffic (Internet and Internal) |
QCE/LB_PRIVATE¶
Metric Name (MetricName) | Metric Description |
---|---|
ClientConnum | Active Connections from Client to LB |
ClientInactiveConn | Inactive Connections from Client to LB |
ClientConcurConn | Concurrent Connections from Client to LB |
ClientNewConn | New Connections from Client to LB |
ClientInpkg | Inbound Packets from Client to LB |
ClientOutpkg | Outbound Packets from Client to LB |
ClientAccIntraffic | Inbound Traffic from Client to LB |
ClientAccOuttraffic | Outbound Traffic from Client to LB |
ClientOuttraffic | Outbound Bandwidth from Client to LB |
ClientIntraffic | Inbound Bandwidth from Client to LB |
DropTotalConns | Dropped Connection Count |
InDropBits | Dropped Inbound Bandwidth |
OutDropBits | Dropped Outbound Bandwidth |
InDropPkts | Dropped Inbound Packets |
OutDropPkts | Dropped Outbound Packets |
IntrafficVipRatio | Inbound Bandwidth Utilization |
OuttrafficVipRatio | Outbound Bandwidth Utilization |
UnhealthRsCount | Health Check Exception Count |
QCE/LB_PUBLIC¶
Metric Name (MetricName) | Metric Description |
---|---|
ClientConnum | Active Connections from Client to LB |
ClientInactiveConn | Inactive Connections from Client to LB |
ClientConcurConn | Concurrent Connections from Client to LB |
ClientNewConn | New Connections from Client to LB |
ClientInpkg | Inbound Packets from Client to LB |
ClientOutpkg | Outbound Packets from Client to LB |
ClientAccIntraffic | Inbound Traffic from Client to LB |
ClientAccOuttraffic | Outbound Traffic from Client to LB |
ClientIntraffic | Inbound Bandwidth from Client to LB |
ClientOuttraffic | Outbound Bandwidth from Client to LB |
DropTotalConns | Dropped Connection Count |
IntrafficVipRatio | Public Inbound Bandwidth Utilization (May Not Exist) |
InDropBits | Dropped Inbound Bandwidth |
InDropPkts | Dropped Inbound Packets |
OuttrafficVipRatio | Public Outbound Bandwidth Utilization (May Not Exist) |
OutDropBits | Dropped Outbound Bandwidth |
OutDropPkts | Dropped Outbound Packets |
UnhealthRsCount | Health Check Exception Count |
QCE/REDIS_MEM¶
Metric Name (MetricName) | Metric Description |
---|---|
CpuUtil | CPU Utilization |
CpuMaxUtil | Node Maximum CPU Utilization |
MemUsed | Memory Usage |
MemUtil | Memory Utilization |
MemMaxUtil | Node Maximum Memory Utilization |
Keys | Total Key Count |
Expired | Key Expiration Count |
Evicted | Key Eviction Count |
Connections | Connection Count |
ConnectionsUtil | Connection Utilization |
InFlow | Inbound Flow |
InBandwidthUtil | Inbound Flow Utilization |
InFlowLimit | Inbound Flow Throttling Trigger |
OutFlow | Outbound Flow |
OutBandwidthUtil | Outbound Flow Utilization |
OutFlowLimit | Outbound Flow Throttling Trigger |
LatencyAvg | Average Execution Latency |
LatencyMax | Maximum Execution Latency |
LatencyRead | Read Average Latency |
LatencyWrite | Write Average Latency |
LatencyOther | Other Command Average Latency |
Commands | Total Requests |
CmdRead | Read Requests |
CmdWrite | Write Requests |
CmdOther | Other Requests |
CmdBigValue | Large Value Requests |
CmdKeyCount | Key Request Count |
CmdMget | Mget Request Count |
CmdSlow | Slow Queries |
CmdHits | Read Request Hits |
CmdMiss | Read Request Miss |
CmdErr | Execution Error |
CmdHitsRatio | Read Request Hit Rate |
QCE/CMONGO¶
Metric Name (MetricName) | Metric Description |
---|---|
Reads | Read Request Count |
Updates | Update Request Count |
Deletes | Delete Request Count |
Counts | Count Request Count |
Success | Success Request Count |
Commands | Command Request Count |
Qps | Requests Per Second |
Delay10 | Requests with Latency Between 10 - 50 Milliseconds |
Delay50 | Requests with Latency Between 50 - 100 Milliseconds |
Delay100 | Requests with Latency Above 100 Milliseconds |
ClusterConn | Cluster Connection Count |
Connper | Connection Utilization |
ClusterDiskusage | Disk Utilization |
4. Data Reporting Format¶
After data is synchronized normally, you can view the data in the "Metrics" section of Guance.
Take the following collector configuration as an example:
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['WanOutpkg']
        }
    ]
}
The reported data example is as follows:
{
  "measurement": "tencentcloud_QCE/CVM",
  "tags": {
    "InstanceId": "i-xxx"
  },
  "fields": {
    "WanOutpkg_max": 0.005
  }
}
Tip
All metric values will be reported as float type.
Tip
This collector collects the WanOutpkg metric data under the QCE/CVM namespace (Namespace). For details, see the Data Collection Instructions table.
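The shape of a reported point can be sketched as follows. This is a minimal illustration derived only from the example above: `build_point` is a hypothetical helper, and the `_max` suffix is copied from the sample data (which statistics the collector actually reports is not stated in this document).

```python
def build_point(namespace, metric_name, dimensions, value, stat='max'):
    """Assemble a point in the same shape as the reported data example."""
    return {
        'measurement': f'tencentcloud_{namespace}',
        'tags': dict(dimensions),
        # All metric values are reported as float
        'fields': {f'{metric_name}_{stat}': float(value)},
    }

point = build_point('QCE/CVM', 'WanOutpkg', {'InstanceId': 'i-xxx'}, 0.005)
print(point)
```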
5. Interaction with Custom Object Collectors¶
When other custom object collectors (such as CVM) are running in the same DataFlux Func, this collector supplements its data with fields from them, based on the dimension information in the Data Collection Instructions. For example, for CVM it uses the InstanceId field returned by cloud monitoring data to look for the custom object whose tags.name field matches.
Because the custom object information must be obtained before the cloud monitoring collector can use it, it is generally recommended to place the cloud monitoring collector at the end of the list, for example:
# Create collectors
collectors = [
    tencentcloud_cvm.DataCollector(account, common_tencentcloud_configs),
    tencentcloud_monitor.DataCollector(account, tencentcloud_monitor_configs)  # Cloud monitoring collector is usually placed last
]
When a match succeeds, the fields in the custom object's tags are added to the tags of the cloud monitoring data, which makes it possible, for example, to filter cloud monitoring metric data by instance name. The effect is as follows:
Assume the original data collected by cloud monitoring is as follows:
{
  "measurement": "tencentcloud_QCE/CVM",
  "tags": {
    "InstanceId": "i-xxx"
  },
  "fields": { "..." }
}
At the same time, the custom object data collected by the Tencent Cloud CVM collector is as follows:
{
  "measurement": "tencentcloud_cvm",
  "tags": {
    "name"           : "i-xxx",
    "InstanceType"   : "c6g.xxx",
    "PlatformDetails": "xxx",
    "{...}"
  },
  "fields": { "..." }
}
Then, the final reported cloud monitoring data is as follows:
{
  "measurement": "tencentcloud_QCE/CVM",
  "tags": {
    "name"           : "i-xxx",
    "InstanceId"     : "i-xxx",   // Original field from cloud monitoring
    "InstanceType"   : "c6g.xxx", // Field from custom object CVM
    "PlatformDetails": "xxx",     // Field from custom object CVM
    "{Other fields omitted}"
  },
  "fields": { "Content omitted" }
}
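The tag supplementation described in this section can be sketched as a lookup followed by a dictionary merge. The function `enrich_tags` is a hypothetical illustration, not the collector's implementation; in particular, letting the original cloud monitoring tags win when a key exists on both sides is an assumption.

```python
def enrich_tags(monitor_point, custom_objects, dimension='InstanceId'):
    """Copy tags from the matching custom object into a monitoring point."""
    dimension_value = monitor_point['tags'].get(dimension)
    if dimension_value is None:
        return monitor_point
    for obj in custom_objects:
        # Match the dimension value against tags.name of the custom object
        if obj['tags'].get('name') == dimension_value:
            merged_tags = dict(obj['tags'])
            # Assumption: original monitoring tags take precedence on conflict
            merged_tags.update(monitor_point['tags'])
            return {**monitor_point, 'tags': merged_tags}
    return monitor_point  # No match: the point is left unchanged

monitor_point = {
    'measurement': 'tencentcloud_QCE/CVM',
    'tags': {'InstanceId': 'i-xxx'},
    'fields': {'WanOutpkg_max': 0.005},
}
cvm_objects = [{
    'measurement': 'tencentcloud_cvm',
    'tags': {'name': 'i-xxx', 'InstanceType': 'c6g.xxx', 'PlatformDetails': 'xxx'},
    'fields': {},
}]
enriched = enrich_tags(monitor_point, cvm_objects)
print(enriched['tags'])
```

The sketch returns a new point and leaves its inputs unmodified, so the same custom objects can be reused for every monitoring point.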
6. Cloud Monitoring API Call Count Explanation¶
Tencent Cloud Cloud Monitor applies a free quota to some API calls. This collector uses the GetMonitorData API to request monitoring data, which is one of the free-quota APIs: each primary account has a free quota of 1 million requests per month, and requests beyond that are charged at 0.25 yuan per 10,000 requests. Once the free quota is used up, the API can no longer be called until you manually enable "API Request Pay-As-You-Go". The call counts of this script set are explained in detail below:
1. Will collecting multiple monitoring items for multiple resources exceed the free quota?
This collector uses GetMonitorData (which queries the latest monitoring data of a specified monitoring item). One request returns the data of one monitoring item for multiple resources (up to 10; beyond that, pagination is used). Examples of request counts:
- 10 CVM resources and 1 monitoring item (CpuUsage): 1 request is needed;
- 10 CVM resources and 2 monitoring items (CpuUsage and BaseCpuUsage): 2 requests are needed (one request per monitoring item);
- 11 CVM resources and 1 monitoring item (CpuUsage): 2 requests are needed (more than 10 resources, so pagination is used);
- 11 CVM resources and 2 monitoring items (CpuUsage and BaseCpuUsage): 4 requests are needed.
2. View the actual call count in the task execution log:
The collector counts the API calls made during each task execution. The statistics can be viewed in the log, for example:
[2023-04-24 19:02:02.359] [+1156ms] The [1]th account collection is completed, it took [1155 milliseconds], and [2] API calls were made during this period.
[2023-04-24 19:02:02.360] [+0ms] Detailed calls are as follows
[2023-04-24 19:02:02.360] [+0ms] -> monitor.tencentcloudapi.com/?Action=DescribeBaseMetrics: 1 time
[2023-04-24 19:02:02.565] [+0ms] -> monitor.tencentcloudapi.com/?Action=GetMonitorData: 1 time
Warning
Given that cloud monitoring API calls have a free quota, it is recommended that users configure monitoring items as needed to avoid additional costs caused by wildcards.
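The request counts from the examples in this section follow a simple formula: one request per monitoring item per page of up to 10 resources. The sketch below encodes that rule; the function names and the 30-day month are assumptions for illustration, and only GetMonitorData calls are counted (other calls that appear in the log, such as DescribeBaseMetrics, are ignored).

```python
import math

MAX_RESOURCES_PER_REQUEST = 10       # From this section: up to 10 resources per request
FREE_REQUESTS_PER_MONTH = 1_000_000  # From this section: free quota per primary account

def requests_per_run(resource_count, metric_count):
    """GetMonitorData requests needed for one collection run."""
    pages = math.ceil(resource_count / MAX_RESOURCES_PER_REQUEST)
    return pages * metric_count

def requests_per_month(resource_count, metric_count, runs_per_day, days=30):
    """Rough monthly total, assuming a fixed number of runs per day."""
    return requests_per_run(resource_count, metric_count) * runs_per_day * days

# 11 CVM resources, 2 monitoring items, one run per minute (1440 runs per day)
monthly = requests_per_month(11, 2, runs_per_day=24 * 60)
print(monthly, monthly <= FREE_REQUESTS_PER_MONTH)
```

An estimate like this can be compared against the call counts in the task execution log to check whether a wildcard pattern selects more monitoring items than expected.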
Precautions¶
Task Error Conditions and Solutions¶
HTTPClientError: An HTTP Client raised an unhandled exception: SoftTimeLimitExceeded()
Reason: The task ran for too long and timed out.
Solution:
- Appropriately increase the timeout setting for the task (e.g., @DFF.API('Execute Collection', timeout=120, fixed_crontab="* * * * *"), which sets the task timeout to 120 seconds).
[TencentCloudSDKException] code:InvalidParameterValue message:cannot find metricName=xxx configure
Reason: Tencent Cloud does not support collecting this metric (a metric may be listed in the Tencent Cloud documentation but not actually be supported).
Solution:
- Refer to the Monitoring Metric Configuration Information in this document to configure valid metric names.
[TencentCloudSDKException] code:InvalidParameterValue message: xxxxx does not belong to the developer ....
Reason: While cloud monitoring data was being collected for a product under the account, the product instance had already been released, so the API returned an error. This error can be ignored.
X. Appendix¶
Tencent Cloud Cloud Monitor¶
Please refer to the official Tencent Cloud documentation: