AWS ELB
Collect ELB metrics and log information.
Configuration
Install Func
We recommend activating Guance Integration - Extensions - DataFlux Func (Automata). All prerequisites are then installed automatically, and you can proceed to script installation.
If you deploy Func manually, refer to Manual Deployment of Func.
Install Script
Note: Prepare the required Amazon AK in advance (for simplicity, you can grant the global read-only permission `ReadOnlyAccess`).
Activate Script in Automata
- Log in to the Guance console.
- Click the [Integration] menu and select [Cloud Account Management].
- Click [Add Cloud Account], select [AWS], and fill in the required information. If the cloud account has already been configured, skip this step.
- Click [Test]. After the test succeeds, click [Save]. If the test fails, check that the configuration information is correct and test again.
- In the [Cloud Account Management] list, find the added cloud account and click it to open its details page.
- On the cloud account details page, click the [Integration] button, find `AWS ELB` under the `Not Installed` list, click [Install], and complete the installation in the pop-up dialog.
Manual Script Activation
- Log in to the Func console, click [Script Market], enter the Guance script market, and search for `integration_aws_elb`, `integration_aws_applicationelb`, `integration_aws_networkelb`, and `integration_aws_gatewayelb`.
- Click [Install], then enter the corresponding parameters: AWS AK ID, AK Secret, and account name.
- Click [Deploy Startup Script]. The system automatically creates the `Startup` script set and configures the corresponding startup scripts.
- After activation, the corresponding automatic trigger configuration appears in [Manage / Automatic Trigger Configuration]. Click [Execute] to run it once immediately instead of waiting for the scheduled time. After a short while, you can check the task execution records and the corresponding logs.
Verification
- In [Manage / Automatic Trigger Configuration], confirm that the task has a corresponding automatic trigger configuration, and check the task records and logs for exceptions.
- In Guance, check whether asset information exists in [Infrastructure / Custom].
- In Guance, check whether the corresponding monitoring data exists in [Metrics].
Metrics
After Amazon CloudWatch is configured, the default measurement sets are as follows. More metrics can be collected through configuration:
Amazon CloudWatch Application Load Balancer Metric Details
Amazon CloudWatch Network Load Balancer Metric Details
Amazon CloudWatch Gateway Load Balancer Metric Details
Amazon CloudWatch Classic Load Balancer Metric Details
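As a rough illustration of how the metrics below are addressed in CloudWatch, the following sketch builds the parameters of a `GetMetricStatistics` request for one load balancer type. It does not describe how the Guance scripts query CloudWatch. The helper name `build_metric_query` and its defaults are hypothetical, and the code makes no network call.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch namespaces used by the four load balancer types.
NAMESPACES = {
    "application": "AWS/ApplicationELB",
    "network": "AWS/NetworkELB",
    "gateway": "AWS/GatewayELB",
    "classic": "AWS/ELB",
}


def build_metric_query(lb_type, metric_name, dimensions,
                       statistic="Sum", period=60, minutes=5, now=None):
    """Return a dict shaped like the parameters of GetMetricStatistics.

    Hypothetical helper for illustration only; nothing is sent to AWS.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": NAMESPACES[lb_type],
        "MetricName": metric_name,
        # CloudWatch expects dimensions as a list of Name/Value pairs.
        "Dimensions": [{"Name": name, "Value": value}
                       for name, value in sorted(dimensions.items())],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,
        "Statistics": [statistic],
    }


query = build_metric_query(
    "application",
    "RequestCount",
    {"LoadBalancer": "app/load-balancer-name/1234567890123456"},
)
print(query["Namespace"], query["MetricName"], query["Statistics"])
```

If boto3 is installed and credentials are configured (both assumptions), a dict of this shape could be passed as `client.get_metric_statistics(**query)` on a CloudWatch client.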
Application Load Balancer Metrics
| Metric | Description |
|---|---|
| ActiveConnectionCount | The total number of concurrent active TCP connections from clients to the load balancer and from the load balancer to targets. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ClientTLSNegotiationErrorCount | The number of TLS connections initiated by clients that failed to establish a session with the load balancer due to a TLS error. Possible causes include a cipher or protocol mismatch, or the client closing the connection because it could not verify the server certificate. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ConsumedLCUs | The number of load balancer capacity units (LCUs) used by the load balancer. You are charged for the number of LCUs used per hour. For more information, see Elastic Load Balancing Pricing. Reporting criteria: Always reported. Statistics: All. Dimensions: `LoadBalancer` |
| DesyncMitigationMode_NonCompliant_Request_Count | The number of requests that do not comply with RFC 7230. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| DroppedInvalidHeaderRequestCount | The number of requests from which the load balancer removed HTTP headers with invalid header fields before routing the request. The load balancer removes these headers only if the `routing.http.drop_invalid_header_fields.enabled` attribute is set to true. Reporting criteria: Non-zero value. Statistics: All. Dimensions: `AvailabilityZone, LoadBalancer` |
| ForwardedInvalidHeaderRequestCount | The number of requests with invalid HTTP header fields that were routed by the load balancer. The load balancer forwards requests with these headers only if the `routing.http.drop_invalid_header_fields.enabled` attribute is set to false. Reporting criteria: Always reported. Statistics: All. Dimensions: `AvailabilityZone, LoadBalancer` |
| GrpcRequestCount | The number of gRPC requests processed over IPv4 and IPv6. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Minimum, Maximum, and Average all return 1. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTP_Fixed_Response_Count | The number of successful fixed-response actions. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTP_Redirect_Count | The number of successful redirect actions. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTP_Redirect_Url_Limit_Exceeded_Count | The number of redirect actions that could not be completed because the URL in the response location header was larger than 8K. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_3XX_Count | The number of HTTP 3XX redirection codes generated by the load balancer. This count does not include any response codes generated by targets. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_4XX_Count | The number of HTTP 4XX client error codes generated by the load balancer. This count does not include any response codes generated by targets. Client errors are generated if the request is malformed or incomplete. Targets do not receive these requests, except when the load balancer returns an HTTP 460 error code. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Minimum, Maximum, and Average all return 1. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_5XX_Count | The number of HTTP 5XX server error codes generated by the load balancer. This count does not include any response codes generated by targets. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Minimum, Maximum, and Average all return 1. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_500_Count | The number of HTTP 500 error codes generated by the load balancer. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_502_Count | The number of HTTP 502 error codes generated by the load balancer. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_503_Count | The number of HTTP 503 error codes generated by the load balancer. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| HTTPCode_ELB_504_Count | The number of HTTP 504 error codes generated by the load balancer. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| IPv6ProcessedBytes | The total number of bytes processed by the load balancer over IPv6. This count is included in ProcessedBytes. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| IPv6RequestCount | The number of IPv6 requests received by the load balancer. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Minimum, Maximum, and Average all return 1. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| NewConnectionCount | The total number of new TCP connections established from clients to the load balancer and from the load balancer to targets. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| NonStickyRequestCount | The number of requests for which the load balancer chose a new target because it could not use an existing sticky session. For example, the request was the first request from a new client and no stickiness cookie was presented, a stickiness cookie was presented but did not specify a target registered with this target group, the stickiness cookie was malformed or expired, or an internal error prevented the load balancer from reading the stickiness cookie. Reporting criteria: Stickiness is enabled on the target group. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ProcessedBytes | The total number of bytes processed by the load balancer over IPv4 and IPv6 (HTTP headers and HTTP payload). This count includes traffic to and from clients and Lambda functions, as well as traffic from identity providers (IdPs) if user authentication is enabled. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| RejectedConnectionCount | The number of connections that were rejected because the load balancer reached its connection limit. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| RequestCount | The number of requests processed over IPv4 and IPv6. This metric is incremented only for requests where the load balancer node was able to select a target. Requests that are rejected before a target is selected are not reflected in this metric. Reporting criteria: Always reported. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, TargetGroup, LoadBalancer` |
| RuleEvaluations | The number of rules processed by the load balancer, given a request rate averaged over 1 hour. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer` |
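Several rows above note that Minimum, Maximum, and Average all return 1 while Sum is the useful statistic. A minimal sketch of why, assuming (as a simplified model, not a statement about CloudWatch internals) that each request is recorded as one data point with the value 1:

```python
from statistics import mean

# Hypothetical period in which the load balancer handled 250 requests.
# Modeling assumption: every request contributes one data point of value 1.
datapoints = [1] * 250

stats = {
    "Sum": sum(datapoints),       # total number of requests in the period
    "Minimum": min(datapoints),   # every data point is 1
    "Maximum": max(datapoints),   # every data point is 1
    "Average": mean(datapoints),  # the mean of identical values is that value
}
print(stats)
```

Under this model only Sum changes with traffic volume, which is why it is the statistic to chart or alert on for count metrics.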

The AWS/ApplicationELB namespace includes the following metrics for targets.
| Metric | Description |
|---|---|
| HealthyHostCount | The number of targets that are considered healthy. Reporting criteria: Reported when health checks are enabled. Statistics: The most useful statistics are Average, Minimum, and Maximum. Dimensions: `TargetGroup, LoadBalancer`; `TargetGroup, AvailabilityZone, LoadBalancer` |
| HTTPCode_Target_2XX_Count, HTTPCode_Target_3XX_Count, HTTPCode_Target_4XX_Count, HTTPCode_Target_5XX_Count | The number of HTTP response codes generated by targets. This count does not include any response codes generated by the load balancer. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Minimum, Maximum, and Average all return 1. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer`; `TargetGroup, LoadBalancer`; `TargetGroup, AvailabilityZone, LoadBalancer` |
| RequestCountPerTarget | The average number of requests received by each target in the target group. You must specify the target group using the TargetGroup dimension. This metric does not apply if the target is a Lambda function. Reporting criteria: Always reported. Statistics: The only valid statistic is Sum. Note that this represents the average, not the sum. Dimensions: `TargetGroup`; `AvailabilityZone, TargetGroup, LoadBalancer` |
| TargetConnectionErrorCount | The number of unsuccessful connection attempts between the load balancer and targets. This metric does not apply if the target is a Lambda function. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer`; `TargetGroup, LoadBalancer`; `TargetGroup, AvailabilityZone, LoadBalancer` |
| TargetResponseTime | The time elapsed, in seconds, from when the request leaves the load balancer until a response is received from the target. This is equivalent to the `target_processing_time` field in the access logs. Reporting criteria: Non-zero value. Statistics: The most useful statistics are Average and pNN.NN (percentiles). Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer`; `TargetGroup, LoadBalancer`; `TargetGroup, AvailabilityZone, LoadBalancer` |
| TargetTLSNegotiationErrorCount | The number of TLS connections initiated by the load balancer that failed to establish a session with the target. Possible causes include a cipher or protocol mismatch. This metric does not apply if the target is a Lambda function. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer`; `TargetGroup, LoadBalancer`; `TargetGroup, AvailabilityZone, LoadBalancer` |
| UnHealthyHostCount | The number of targets that are considered unhealthy. Reporting criteria: Reported when health checks are enabled. Statistics: The most useful statistics are Average, Minimum, and Maximum. Dimensions: `TargetGroup, LoadBalancer`; `TargetGroup, AvailabilityZone, LoadBalancer` |

The AWS/ApplicationELB namespace includes the following metrics for target group health. For more information, see Target Group Health.
| Metric | Description |
|---|---|
| HealthyStateDNS | The number of zones that meet the DNS healthy state requirements. Statistics: The most useful statistic is Min. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |
| HealthyStateRouting | The number of zones that meet the routing healthy state requirements. Statistics: The most useful statistic is Min. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |
| UnhealthyRoutingRequestCount | The number of requests routed using the routing failover action (fail open). Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |
| UnhealthyStateDNS | The number of zones that do not meet the DNS healthy state requirements and are therefore marked as unhealthy in DNS. Statistics: The most useful statistic is Min. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |
| UnhealthyStateRouting | The number of zones that do not meet the routing healthy state requirements, so the load balancer distributes traffic to all targets in the zone, including unhealthy targets. Statistics: The most useful statistic is Min. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |

The AWS/ApplicationELB namespace includes the following metrics for Lambda functions registered as targets.
| Metric | Description |
|---|---|
| LambdaInternalError | The number of requests to Lambda functions that failed due to internal issues with the load balancer or AWS Lambda. To get the error reason code, check the `error_reason` field in the access logs. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `TargetGroup`; `TargetGroup, LoadBalancer` |
| LambdaTargetProcessedBytes | The total number of bytes processed by the load balancer for requests to and responses from Lambda functions. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer` |
| LambdaUserError | The number of requests to Lambda functions that failed due to issues with the Lambda function. For example, the load balancer does not have permission to invoke the function, the load balancer receives JSON from the function that is malformed or missing required fields, or the request body or response size exceeds the maximum size of 1 MB. To get the error reason code, check the `error_reason` field in the access logs. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `TargetGroup`; `TargetGroup, LoadBalancer` |

The AWS/ApplicationELB namespace includes the following metrics for user authentication.
| Metric | Description |
|---|---|
| ELBAuthError | The number of times user authentication could not be completed because an authentication action was misconfigured, the load balancer could not establish a connection with the IdP, or the load balancer could not complete the authentication flow due to an internal error. To get the error reason code, check the `error_reason` field in the access logs. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ELBAuthFailure | The number of times user authentication could not be completed because the IdP denied user access or the authorization code was used multiple times. To get the error reason code, check the `error_reason` field in the access logs. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ELBAuthLatency | The time elapsed, in milliseconds, to query the IdP for the ID token and user information. If one or more of these operations fail, this is the time to failure. Reporting criteria: Non-zero value. Statistics: All statistics are meaningful. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ELBAuthRefreshTokenSuccess | The number of times the load balancer successfully refreshed user claims using a refresh token provided by the IdP. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ELBAuthSuccess | The number of successful authentication actions. This metric is incremented at the end of the authentication workflow, after the load balancer has retrieved the user identity claims from the IdP. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ELBAuthUserClaimsSizeExceeded | The number of times the configured IdP returned user claims larger than 11K bytes. Reporting criteria: Non-zero value. Statistics: The only meaningful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |

Network Load Balancer Metrics
| Metric | Description |
|---|---|
| ActiveFlowCount | The total number of concurrent flows (or connections) from clients to targets. This metric includes connections in the SYN_SENT and ESTABLISHED states. TCP connections are not terminated on the load balancer, so a client with an open TCP connection to a target counts as one flow. Reporting criteria: Always reported. Statistics: The most useful statistics are Average, Maximum, and Minimum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ActiveFlowCount_TCP | The total number of concurrent TCP flows (or connections) from clients to targets. This metric includes connections in the SYN_SENT and ESTABLISHED states. TCP connections are not terminated on the load balancer, so a client with an open TCP connection to a target counts as one flow. Reporting criteria: Non-zero value. Statistics: The most useful statistics are Average, Maximum, and Minimum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ConsumedLCUs | The number of load balancer capacity units (LCUs) used by the load balancer. You are charged for the number of LCUs used per hour. Reporting criteria: Always reported. Statistics: All. Dimensions: `LoadBalancer` |
| ConsumedLCUs_TCP | The number of load balancer capacity units (LCUs) used by the load balancer for TCP. You are charged for the number of LCUs used per hour. Reporting criteria: Non-zero value. Statistics: All. Dimensions: `LoadBalancer` |
| NewFlowCount | The total number of new flows (or connections) established from clients to targets during the period. Reporting criteria: Always reported. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| NewFlowCount_TCP | The total number of new TCP flows (or connections) established from clients to targets during the period. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| PeakPacketsPerSecond | The highest average packet rate (packets processed per second), calculated every 10 seconds during the sampling window. This metric includes health check traffic. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Maximum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ProcessedBytes | The total number of bytes processed by the load balancer, including TCP/IP headers. This count includes traffic to and from targets, minus health check traffic. Reporting criteria: Always reported. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ProcessedBytes_TCP | The total number of bytes processed by TCP listeners. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ProcessedPackets | The total number of packets processed by the load balancer. This count includes traffic to and from targets, as well as health check traffic. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| TCP_Client_Reset_Count | The total number of reset (RST) packets sent from clients to targets. These resets are generated by clients and then forwarded by the load balancer. Reporting criteria: Always reported. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| TCP_ELB_Reset_Count | The total number of reset (RST) packets generated by the load balancer. Reporting criteria: Always reported. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| TCP_Target_Reset_Count | The total number of reset (RST) packets sent from targets to clients. These resets are generated by targets and then forwarded by the load balancer. Reporting criteria: Always reported. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| UnhealthyRoutingFlowCount | The number of flows (or connections) routed using the routing failover action (fail open). Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
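The PeakPacketsPerSecond definition above can be read as "take the average packet rate of each 10-second interval, then report the highest one". The following sketch shows that arithmetic on made-up numbers. The function name and the sample packet counts are hypothetical, and the real metric is computed by AWS, not by client code.

```python
def peak_packets_per_second(packet_counts, interval_seconds=10):
    """Return the highest per-interval average packet rate.

    packet_counts holds the number of packets seen in each consecutive
    interval of interval_seconds within one sampling window.
    """
    if not packet_counts:
        raise ValueError("at least one interval is required")
    return max(count / interval_seconds for count in packet_counts)


# Hypothetical packet totals for six consecutive 10-second intervals.
window = [1200, 900, 4500, 3000, 800, 1000]
peak = peak_packets_per_second(window)
print(peak)  # 4500 packets / 10 s = 450.0 packets per second
```

A short burst therefore dominates the reported value even when the window as a whole was quiet, which is why Maximum is the statistic to watch for this metric.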

Gateway Load Balancer Metrics
| Metric | Description |
|---|---|
| ActiveFlowCount | The total number of concurrent flows (or connections) from clients to targets. Reporting criteria: Non-zero value. Statistics: The most useful statistics are Average, Maximum, and Minimum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ConsumedLCUs | The number of load balancer capacity units (LCUs) used by the load balancer. You are charged for the number of LCUs used per hour. Reporting criteria: Always reported. Statistics: All. Dimensions: `LoadBalancer` |
| HealthyHostCount | The number of targets that are considered healthy. Reporting criteria: Reported when health checks are enabled. Statistics: The most useful statistics are Maximum and Minimum. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |
| NewFlowCount | The total number of new flows (or connections) established from clients to targets during the period. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| ProcessedBytes | The total number of bytes processed by the load balancer. This count includes traffic to and from targets, but excludes health check traffic. Reporting criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: `LoadBalancer`; `AvailabilityZone, LoadBalancer` |
| UnHealthyHostCount | The number of targets that are considered unhealthy. Reporting criteria: Reported when health checks are enabled. Statistics: The most useful statistics are Maximum and Minimum. Dimensions: `LoadBalancer, TargetGroup`; `AvailabilityZone, LoadBalancer, TargetGroup` |
Classic Load Balancer Metrics¶
| Metric | Description |
|---|---|
BackendConnectionErrors |
The number of unsuccessful connection attempts between the load balancer and registered instances. Because the load balancer retries connections when an error occurs, this count can exceed the request rate. Note that this count also includes all connection errors related to health checks. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Note that Average, Minimum, and Maximum are reported per load balancer node and are generally not useful. However, the difference between the minimum and maximum (or peak to average, average to trough) can be used to determine if a load balancer node is behaving abnormally. Example: Suppose your load balancer has 2 instances each in us-west-2a and us-west-2b, and a connection attempt to one instance in us-west-2a results in a backend connection error. The sum value for us-west-2a includes these connection errors, while the sum value for us-west-2b does not. Therefore, the sum value for the load balancer equals the sum value for us-west-2a. |
DesyncMitigationMode_NonCompliant_Request_Count |
[HTTP listener] The number of requests that do not comply with the RFC 7230 standard. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Dimensions: LoadBalancer``AvailabilityZone, LoadBalancer |
HealthyHostCount |
The number of healthy instances registered with the load balancer. Newly registered instances are considered healthy after passing their first health check. If cross-zone load balancing is enabled, the number of healthy instances is calculated across all availability zones for the LoadBalancerName dimension. Otherwise, it is calculated per availability zone. Reporting Criteria: There are registered instances. Statistics: The most useful statistics are Average and Maximum. These statistics are determined by the load balancer nodes. Note that some load balancer nodes may consider an instance unhealthy for a short period while other nodes consider it healthy. Example: Suppose your load balancer has 2 instances each in us-west-2a and us-west-2b, and one instance in us-west-2a is unhealthy while there are no unhealthy instances in us-west-2b. For the AvailabilityZone dimension, us-west-2a averages 1 healthy and 1 unhealthy instance, and us-west-2b averages 2 healthy and 0 unhealthy instances. |
HTTPCode_Backend_2XX, HTTPCode_Backend_3XX, HTTPCode_Backend_4XX, HTTPCode_Backend_5XX |
[HTTP listener] The number of HTTP response codes generated by registered instances. This count does not include any response codes generated by the load balancer. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average are all 1. Example: Suppose your load balancer has 2 instances each in us-west-2a and us-west-2b, and a request sent to one instance in us-west-2a results in an HTTP 500 response. The sum value for us-west-2a includes these error responses, while the sum value for us-west-2b does not. Therefore, the sum value for the load balancer equals the sum value for us-west-2a. |
HTTPCode_ELB_4XX |
[HTTP listener] The number of HTTP 4XX client error codes generated by the load balancer. Client errors are generated if the request is malformed or incomplete. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average are all 1. Example: Suppose your load balancer has us-west-2a and us-west-2b enabled, and a client request contains a malformed request URL. The result may be an increase in client errors across all availability zones. The sum value for the load balancer is the sum of the values for each availability zone. |
HTTPCode_ELB_5XX |
[HTTP listener] The number of HTTP 5XX server error codes generated by the load balancer. This count does not include any response codes generated by registered instances. This metric is reported if there are no healthy instances registered with the load balancer, or if the request rate exceeds the capacity of the instances or the load balancer (spillover). Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average are all 1. Example: Suppose your load balancer has us-west-2a and us-west-2b enabled, and instances in us-west-2a have high latency and respond slowly to requests. As a result, the surge queue for the load balancer nodes in us-west-2a fills up, and clients receive 503 errors. If us-west-2b continues to respond normally, the sum value for the load balancer will equal the sum value for us-west-2a. |
Latency |
[HTTP listener] The total time elapsed, in seconds, from when the load balancer sends the request to the registered instance until the instance starts sending response headers. [TCP listener] The total time elapsed, in seconds, for the load balancer to successfully establish a connection with the registered instance. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Average. Maximum can be used to determine if some requests take significantly longer than the average time. Note that Minimum is generally not useful. Example: Suppose your load balancer has 2 instances each in us-west-2a and us-west-2b, and a request sent to one instance in us-west-2a has high latency. The average value for us-west-2a will be higher than the average value for us-west-2b. |
RequestCount |
The number of requests completed or connections made during the specified period (1 or 5 minutes). [HTTP listener] The number of requests received and routed, including HTTP error responses from registered instances. [TCP listener] The number of connections made to registered instances. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1. Example: Suppose your load balancer has 2 instances each in us-west-2a and us-west-2b, and 100 requests are sent to the load balancer. 60 requests are sent to us-west-2a, with each instance receiving 30 requests, and 40 requests are sent to us-west-2b, with each instance receiving 20 requests. For the AvailabilityZone dimension, us-west-2a has a total of 60 requests, and us-west-2b has a total of 40 requests. For the LoadBalancerName dimension, there is a total of 100 requests. |
SpilloverCount |
The total number of requests rejected because the surge queue was full. [HTTP listener] The load balancer returns an HTTP 503 error code. [TCP listener] The load balancer closes the connection. Reporting Criteria: Non-zero value. Statistics: The most useful statistic is Sum. Note that Average, Minimum, and Maximum are reported per load balancer node and are generally not useful. Example: Suppose your load balancer has us-west-2a and us-west-2b enabled, and instances in us-west-2a have high latency and respond slowly to requests. As a result, the surge queue for the load balancer nodes in us-west-2a fills up, causing spillover. If us-west-2b continues to respond normally, the sum value for the load balancer will be the same as the sum value for us-west-2a. |
SurgeQueueLength |
The total number of requests (HTTP listener) or connections (TCP listener) waiting to be routed to healthy instances. The maximum size of the queue is 1024. When the queue is full, additional requests or connections are rejected. For more information, see SpilloverCount. Reporting Criteria: Non-zero value. Statistics: The most valuable statistic is Maximum, as it represents the peak of queued requests. Using the Average statistic in conjunction with Minimum and Maximum can determine the range of queued requests. Note that Sum is not useful. Example: Suppose your load balancer has us-west-2a and us-west-2b enabled, and instances in us-west-2a have high latency and respond slowly to requests. As a result, the surge queue for the load balancer nodes in us-west-2a fills up, likely increasing response times for clients. If this continues, the load balancer may spill over (see the SpilloverCount metric). If us-west-2b continues to respond normally, the maximum for the load balancer will be the same as the maximum for us-west-2a. |
UnHealthyHostCount |
The number of unhealthy instances registered with the load balancer. An instance is considered unhealthy if it exceeds the unhealthy threshold configured for health checks. Unhealthy instances are reconsidered healthy after meeting the healthy threshold configured for health checks. Reporting Criteria: There are registered instances. Statistics: The most useful statistics are Average and Minimum. These statistics are determined by the load balancer nodes. Note that some load balancer nodes may consider an instance unhealthy for a short period while other nodes consider it healthy. Example: See HealthyHostCount. |
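The RequestCount example above can be checked with a few lines of Python. This is only an illustrative sketch: the instance IDs are made up, and the per-instance counts are the ones from the example (30 per instance in us-west-2a, 20 per instance in us-west-2b). It shows why Sum is the useful statistic for each dimension.

```python
from collections import defaultdict

# Hypothetical per-instance request counts from the RequestCount example:
# (availability zone, instance id, requests). Instance ids are invented.
samples = [
    ("us-west-2a", "i-a1", 30),
    ("us-west-2a", "i-a2", 30),
    ("us-west-2b", "i-b1", 20),
    ("us-west-2b", "i-b2", 20),
]


def sum_by_zone(rows):
    """Sum request counts per availability zone (the AvailabilityZone dimension)."""
    totals = defaultdict(int)
    for zone, _instance, count in rows:
        totals[zone] += count
    return dict(totals)


per_zone = sum_by_zone(samples)
# Summing across zones gives the value seen for the LoadBalancerName dimension.
total = sum(per_zone.values())

print(per_zone)  # {'us-west-2a': 60, 'us-west-2b': 40}
print(total)     # 100
```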
Load Balancer Metric Dimensions¶
To filter metrics for Application / Network / Gateway load balancers, use the following dimensions.
| Dimension | Description |
|---|---|
| AvailabilityZone | Filter metric data by availability zone. |
| LoadBalancer | Filter metric data by load balancer. Specify the load balancer as follows: app/load-balancer-name/1234567890123456 (the end part of the load balancer ARN). |
| TargetGroup | Filter metric data by target group. Specify the target group as follows: targetgroup/target-group-name/1234567890123456 (the end part of the target group ARN). |
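As a rough illustration of "the end part of the ARN", the sketch below derives the LoadBalancer and TargetGroup dimension values from full ARNs. The helper name `dimension_from_arn` and the sample ARNs (region, account ID, resource names) are assumptions made for this example and are not part of the integration scripts.

```python
def dimension_from_arn(arn: str) -> str:
    """Return the CloudWatch dimension value for a load balancer or target group ARN.

    ARNs have the form arn:partition:service:region:account-id:resource.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6:
        raise ValueError(f"not a valid ARN: {arn!r}")
    resource = parts[5]
    if resource.startswith("loadbalancer/"):
        # LoadBalancer dimension: everything after "loadbalancer/".
        return resource[len("loadbalancer/"):]
    if resource.startswith("targetgroup/"):
        # TargetGroup dimension: the resource part, including "targetgroup/".
        return resource
    raise ValueError(f"unsupported resource type: {resource!r}")


# Placeholder ARNs for illustration only.
lb_arn = "arn:aws:elasticloadbalancing:us-west-2:123456789012:loadbalancer/app/load-balancer-name/1234567890123456"
tg_arn = "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/target-group-name/1234567890123456"

print(dimension_from_arn(lb_arn))  # app/load-balancer-name/1234567890123456
print(dimension_from_arn(tg_arn))  # targetgroup/target-group-name/1234567890123456
```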
To filter metrics for Classic load balancers, use the following dimensions.
| Dimension | Description |
|---|---|
| AvailabilityZone | Filter metric data by availability zone. |
| LoadBalancerName | Filter metric data by the specified load balancer. |
Logging¶
ELB logs are not directly exposed for collection. You need to store the logs in an S3 bucket first, and then use Lambda to fetch the log data and report it to the platform.
Application ELB Log Activation¶
- Select the corresponding load balancer and enter the details page
- In the details page, click "Actions", then select "Edit Load Balancer Attributes"
- Under the "Monitoring" category, enable "Access Logs" and select the corresponding S3 bucket.

- Save
Lambda Configuration¶
Refer to: Lambda Fetch S3 Log Data
Log Pipeline¶
grok(_, "%{NOTSPACE:protocal} %{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:LoadBalancer} %{IP:client}:%{POSINT:client_port} %{IP:target}:%{POSINT:target_port} %{NUMBER:request_processing_time} %{NUMBER:target_processing_time} %{NUMBER:response_processing_time} %{NUMBER:elb_status_code} %{NUMBER:target_status_code} %{NUMBER:received_bytes} %{NUMBER:sent_bytes} \"%{NOTSPACE:http_method} %{NOTSPACE:uri} HTTP/%{NUMBER:http_version}\" \"%{DATA:user_agent}\" %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol} %{NOTSPACE:target_group_arn} \"%{NOTSPACE:x_amzn_trace_id}\" \"%{DATA:domain_name}\" \"%{DATA:chosen_cert_arn}\" %{NUMBER:matched_rule_priority} %{TIMESTAMP_ISO8601:request_creation_time} \"%{WORD:actions_executed}\" \"%{DATA:redirect_url}\" \"%{DATA:error_reason}\" \"%{IP:target_host}:%{POSINT:target_host_port_list}\" \"%{NUMBER:target_status_code_list}\" \"%{DATA:classification}\" \"%{DATA:classification_reason}\" %{NOTSPACE:conn_trace_id}")
status = "info"
cast(elb_status_code, "int")
if elb_status_code >= 400 {
    status = "error"
}
add_key("aws_log_type","access_log")
add_key(status,status)
aws_json=load_json(aws)
lambda_func_data = aws_json["invoked_function_arn"]
add_key(s3_bucket,aws_json["s3"]["bucket"])
grok(lambda_func_data, "arn:aws:lambda:%{DATA:region}:%{NUMBER:account_id}:function:%{NOTSPACE:lambda_func}")
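To make the post-processing steps of the pipeline easier to follow, here is a rough Python equivalent of two of them: deriving `status` from `elb_status_code`, and splitting the Lambda function ARN into `region`, `account_id`, and `lambda_func`. It is a sketch for reading purposes only, not the pipeline runtime; the function names and the sample ARN are assumptions, and the regular expression only approximates the grok patterns (`\d+` stands in for `NUMBER`, `\S+` for `NOTSPACE`).

```python
import re

# Approximates the grok pattern applied to invoked_function_arn.
# Like the pipeline, it only matches ARNs that start with "arn:aws:lambda:".
LAMBDA_ARN = re.compile(
    r"arn:aws:lambda:(?P<region>[^:]*):(?P<account_id>\d+):function:(?P<lambda_func>\S+)"
)


def classify_status(elb_status_code: str) -> str:
    """Mirror the pipeline rule: status codes of 400 and above are errors."""
    return "error" if int(elb_status_code) >= 400 else "info"


def parse_lambda_arn(arn: str) -> dict:
    """Extract region, account_id and lambda_func, or return {} on no match."""
    match = LAMBDA_ARN.match(arn)
    return match.groupdict() if match else {}


print(classify_status("200"))  # info
print(classify_status("503"))  # error
# Placeholder ARN for illustration only.
print(parse_lambda_arn("arn:aws:lambda:us-west-2:123456789012:function:my-func"))
```

Because the pattern is anchored on `arn:aws:lambda:`, an ARN from another partition (for example one starting with `arn:aws-cn:`) would not match in this sketch.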
Object¶
The collected AWS ELB object data structure can be seen in [Infrastructure - Custom]
{
"measurement": "aws_aelb",
"tags": {
"name" : "app/openway/8e8d762xxxxxx",
"RegionId" : "cn-northwest-1",
"LoadBalancerArn" : "arn:aws-cn:elasticloadbalancing:cn-northwest-1:588271xxxxx:loadbalancer/app/openway/8e8d762xxxxxx",
"State" : "active",
"Type" : "application",
"VpcId" : "vpc-2exxxxx",
"Scheme" : "internet-facing",
"DNSName" : "openway-203509xxxx.cn-northwest-1.elb.amazonaws.com.cn",
"LoadBalancerName" : "openway",
"CanonicalHostedZoneId": "ZM7IZAIxxxxxx"
},
"fields": {
"CreatedTime" : "2022-03-09T06:13:31Z",
"ListenerDescriptions": "{JSON data}",
"AvailabilityZones" : "{Availability Zone JSON data}",
"message" : "{Instance JSON data}"
}
}
Note: Fields in `tags` and `fields` may change with subsequent updates.

Tip 1: AWS ELB metrics are divided into four metric sets, one per load balancer type:
- Application ELB corresponds to the metric set `aws_aelb`
- Network ELB corresponds to the metric set `aws_nelb`
- Gateway ELB corresponds to the metric set `aws_gelb`
- Classic ELB corresponds to the metric set `aws_elb`

Tip 2: The `tags.name` value is determined in two ways:
- For Classic Load Balancers, the LoadBalancerName field is used.
- For Application, Network, and Gateway Load Balancers, the end part of the load balancer ARN (LoadBalancerArn) is used.

For example, for a Network Load Balancer whose `LoadBalancerArn` is `arn:aws-cn:elasticloadbalancing:cn-northwest-1:xxxx1335135:loadbalancer/net/k8s-forethou-kodongin-xxxxa46f01/xxxxe75ae81d08c2`, the corresponding `tags.name` is `net/k8s-forethou-kodongin-xxxxa46f01/xxxxe75ae81d08c2`.

Tip 3:
- `fields.message` and `fields.AvailabilityZones` are JSON-serialized strings
- The `tags.State` field indicates the state of the load balancer, with possible values: `active`, `provisioning`, `active_impaired`, `failed` (this field is not available for "classic" type load balancer instances)
- The `tags.Type` field indicates the type of the load balancer, with possible values: `application`, `network`, `gateway`, `classic`
- The `tags.Scheme` field indicates the mode of the load balancer, with possible values: `internet-facing`, `internal`
- The `fields.ListenerDescriptions` field is the list of listeners for this load balancer
- The `fields.AvailabilityZones` field indicates the Amazon Route 53 availability zone information associated with the load balancer
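The rules in Tip 1 and Tip 2 can be summarized in a short Python sketch. The names `MEASUREMENT_BY_TYPE` and `tag_name` are hypothetical and chosen for this example; the collector script itself may implement these rules differently.

```python
# Metric set per load balancer type, as listed in Tip 1.
MEASUREMENT_BY_TYPE = {
    "application": "aws_aelb",
    "network": "aws_nelb",
    "gateway": "aws_gelb",
    "classic": "aws_elb",
}


def tag_name(lb_type: str, load_balancer_name: str, load_balancer_arn: str = "") -> str:
    """Derive tags.name following Tip 2 (hypothetical helper)."""
    if lb_type == "classic":
        # Classic load balancers use LoadBalancerName directly.
        return load_balancer_name
    # Other types use the part of the ARN after "loadbalancer/".
    _, found, suffix = load_balancer_arn.partition("loadbalancer/")
    if not found:
        raise ValueError(f"unexpected load balancer ARN: {load_balancer_arn!r}")
    return suffix


# ARN taken from the Network Load Balancer example above.
arn = (
    "arn:aws-cn:elasticloadbalancing:cn-northwest-1:xxxx1335135:"
    "loadbalancer/net/k8s-forethou-kodongin-xxxxa46f01/xxxxe75ae81d08c2"
)
print(MEASUREMENT_BY_TYPE["network"])  # aws_nelb
print(tag_name("network", "k8s-forethou-kodongin-xxxxa46f01", arn))
```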