Solving Alibaba Cloud API Signature Issues for Billing Analysis¶

The awareness of costs when purchasing multiple public cloud resources compared to buying private hosts is vastly different. Purchasing private hosts represents a one-time investment; whether you use them or not after purchase, how well you use them has no continuous impact on your subsequent investment. On the other hand, purchasing public cloud resources requires constant reminders: although the initial investment may be small, every day that passes incurs daily costs. Therefore, we urgently need methods to clearly view detailed cost expenditures and billing analyses across multiple cloud resources.

Collecting Alibaba Cloud Cost API¶

First, taking Alibaba Cloud billing as an example, if we want to collect and analyze Alibaba Cloud's cost bill information, we need sufficient knowledge of the transaction and billing management APIs. The most challenging part when calling Alibaba Cloud APIs is the API signature (Signature) mechanism. Alibaba Cloud also provides specialized documentation in its general documentation, but having only the signature mechanism documentation can be very challenging for developers with less experience. So, based on this incomplete documentation, how do we proceed with collecting billing costs?

API Request Principle¶

Simply put, calling the Alibaba Cloud API is an HTTP request (most are GET requests, and this is also based on GET requests), just requiring a series of parameters afterward. For example, a request to view snapshots looks like this:

http://ecs.aliyuncs.com/?SignatureVersion=1.0&Format=JSON&Timestamp=2017-08-07T05%3A50%3A57Z&RegionId=cn-hongkong&AccessKeyId=xxxxxxxxx&SignatureMethod=HMAC-SHA1&Version=2017-12-14&Signature=%2FeGgFfxxxxxtZ2w1FLt8%3D&Action=DescribeSnapshots&SignatureNonce=b5046ef2-7b2b-11e7-a3c5-00163e001831&ZoneId=cn-hongkong-b

The required common parameters (parameters needed for all API calls) are:

SignatureVersion # Signature algorithm version, currently 1.0 
Format # Format of the returned message, JSON or XML, default is XML
Timestamp # Request timestamp, UTC time, e.g.: 2021-12-16T12:00:00Z 
AccessKeyId # Account key ID
SignatureMethod # Signature method, currently HMAC-SHA1
Version # Version number, date format, e.g.: 2017-12-14 varies by product
Signature # The hardest part to handle is the signature
SignatureNonce # A unique random number, preventing network attacks. Different requests should use different random numbers.

Apart from Signature, the other parameters are relatively easy to obtain, some even have fixed values. Refer to the Alibaba Cloud documentation for more details. Besides common parameters, specific interface (Action) request parameters are also needed. Each Action interface parameter can be referenced from the corresponding product's interface documentation, such as QuerySettleBill. The Signature is based on both common parameters and interface parameters, so it is more complex.

Constructing Standardized Request Strings¶

Construct dict
In Python, parameters correspond one-to-one using a dict. Create a dict and insert the request parameters.

D = {
    'BillingCycle':str(time.strftime("%Y-%m", time.gmtime())),
    'Action':'QuerySettleBill',
    # 'PageNum':'5',
    'Format':'JSON',
    'Version':'2017-12-14',
    'AccessKeyId':'LTAI5tLumx55Vui4WJwZJneK',
    'SignatureMethod':'HMAC-SHA1',
    'MaxResults' : '300',
    # 'NextToken':"", #?
    'Timestamp':str(time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())),
    'SignatureVersion':'1.0'
    # 'SignatureNonce':str_seed
}

Sorting
Since signatures require uniqueness, including order, the parameters must be sorted by name.

  # Since signatures require uniqueness, including order, the parameters must be sorted by name
        sortedD = sorted(D.items(),key=lambda x: x[0])

URL Encoding
Since standard request strings require UTF-8 character sets, certain non-conforming characters in parameter names and values must be URL-encoded. Specific rules are:

Characters AZ, az, 0~9, as well as “-”, “_”, “.”, “~” are not encoded;
Other characters are encoded in %XY format, where XY is the hexadecimal representation of the ASCII code of the character. For example, English double quotes (“) are encoded as %22;
For extended UTF-8 characters, they are encoded in %XY%ZA… format;
English spaces ( ) are encoded as %20, not plus signs (+).

Note: Generally, libraries that support URL encoding (like java.net.URLEncoder in Java) encode according to the MIME type "application/x-www-form-urlencoded". During implementation, these methods can be used directly, replacing plus signs (+) in the encoded string with %20, asterisks (*) with %2A, and restoring %7E to tilde (~) to get the encoded string described above.

Here, the urllib library in Python is used for encoding:

# Use urllib in Python for encoding
def percentEncode(str):
    res = urllib.parse.quote(str.encode('utf8'), '')
    res = res.replace('+', '%20')
    res = res.replace('*', '%2A')
    res = res.replace('%7E', '~')
    return res

Generating Standardized Request Strings

# Generate standardized request strings
canstring = ''
for k,v in sortedD:
    canstring += '&' + percentEncode(k) + '=' + percentEncode(v)

Constructing the StringToSign¶

The rule is:

StringToSign=
HTTPMethod + “&” +
percentEncode(“/”) + ”&” +
percentEncode(CanonicalizedQueryString)

So in this instance:

        # Generate standardized request string
        stringToSign = 'GET&%2F&' + percentEncode(canstring[1:])

Calculating HMAC Value¶

# access_key_secret
access_key_secret = '<access_key_secret>'
# Calculate HMAC value
h = hmac.new((access_key_secret + "&").encode('utf8'), stringToSign.encode('utf8'), sha1)

Calculating the Signature Value¶

# Calculate the signature value to generate the signature
signature = base64.encodestring(h.digest()).strip()

At this point, the signature has been generated.

Adding the Signature¶

# Add the signature
D['Signature'] = signature

So in this instance, the final request URL is:

# Final API call
url = 'http://business.aliyuncs.com/?' + urllib.parse.urlencode(D)
http://business.aliyuncs.com/?BillingCycle=2021-12&Action=QuerySettleBill&Format=JSON&Version=2017-12-14&AccessKeyId=LTAI5tLumx55Vui4WJwZJneK&SignatureMethod=HMAC-SHA1&MaxResults=300&Timestamp=2021-12-16T12%3A27%3A58Z&SignatureVersion=1.0&SignatureNonce=0.30196531140307337&NextToken=&Signature=zFb4631sSGONvAeWD3xCIovMeoM%3D

Just open the browser and visit the link directly to get the results:

Complete Example¶

import sys, datetime
import time
import json
import urllib
import hmac
from hashlib import sha1
import base64
import random
import requests


# Common parameters required in the request (parameters needed for all API calls)
D = {
    'BillingCycle':str(time.strftime("%Y-%m", time.gmtime())),
    'Action':'QuerySettleBill',
    # 'PageNum':'5',
    'Format':'JSON',
    'Version':'2017-12-14',
    'AccessKeyId':'<AccessKeyId>',
    'SignatureMethod':'HMAC-SHA1',
    'MaxResults' : '300',
    # 'NextToken':"", #?
    'Timestamp':str(time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())),
    'SignatureVersion':'1.0'
    # 'SignatureNonce':str_seed
}
# Current time
now_time = str(time.strftime("%Y-%m-%d", time.gmtime()))

# Link local Datakit
datakit = DFF.SRC('datakit')

# Use the urllib library in Python for encoding
def percentEncode(str):
        res = urllib.parse.quote(str.encode('utf8'), '')
        res = res.replace('+', '%20')
        res = res.replace('*', '%2A')
        res = res.replace('%7E', '~')
        return res


# Get the bill
def getBill():
    # Current record position of the bill
    next_token = ""
    # Loop to get bills and write into DataKit
    for i in range(10000):
        random.seed()
        # Unique random number, used to prevent network replay attacks. Users should use different random values between different requests.
        D["SignatureNonce"] = str(random.random())
        D["NextToken"] = next_token
        # Since signatures require uniqueness, including order, the parameters must be sorted by name
        sortedD = sorted(D.items(),key=lambda x: x[0])
        # Generate standardized request strings
        canstring = ''
        for k,v in sortedD:
            canstring += '&' + percentEncode(k) + '=' + percentEncode(v)
        # Generate standardized request strings
        stringToSign = 'GET&%2F&' + percentEncode(canstring[1:])
        # access_key_secret
        access_key_secret = '<access_key_secret>'
        # Calculate HMAC value
        h = hmac.new((access_key_secret + "&").encode('utf8'), stringToSign.encode('utf8'), sha1)
        # Calculate the signature value to generate the signature
        signature = base64.encodestring(h.digest()).strip()
        # Add the signature
        D['Signature'] = signature
        # Final API call
        url = 'http://business.aliyuncs.com/?' + urllib.parse.urlencode(D)
        # Request Alibaba Cloud billing costs
        print(url)

Displaying Technical Route Selection¶

Architectural Concept¶

Adopting Crontab to schedule Python scripts periodically to acquire Alibaba Cloud billing data and store it in the MySQL storage engine, displaying the billing analysis data through Grafana. You can simplify operations by obtaining corresponding Bills templates via Grafana Dashboards. Below, we expand our technical research based on this architectural concept.

Technical Research¶

In the current open-source visualization field, Grafana is the most popular with the most visualization templates. Kibana is also a good visualization platform. Compared to Grafana, Kibana is better suited for ELK architecture. Based on our needs, using Kibana would not be as suitable. Grafana is an open-source visualization tool that can be used with various data storages. It is a rich replacement for Graphite-web, helping us easily create and edit dashboards. It includes a unique Graphite target parser that makes it easy to edit metrics and functions. Users can create comprehensive charts using smart axis formats (such as lines and points). Additionally, Grafana comes with a built-in alerting engine, allowing users to attach conditional rules to dashboard panels that will trigger alerts sent to selected notification endpoints (such as email, Slack, PagerDuty, custom Webhooks, etc.), which perfectly meets the early warning needs for Alibaba Cloud fees. However, Grafana is designed to analyze and visualize system CPU, memory, disk, and I/O utilization metrics. Grafana does not allow full-text data queries, making the user experience less friendly. After some searching in the open-source community, we found the product "Guance". This product not only covers all the advantages of Grafana and Kibana but also has many unique features. Moreover, it has Serverless online programming scheduling capabilities, solving the pain point of being unable to manage the scheduling of Python scripts to obtain Alibaba Cloud billing data. Additionally, as an open-source commercial product, its UI beauty surpasses Grafana by leaps and bounds. The free quota it provides can meet our needs, and we can also get official product support when there are usage issues.

Technical Comparison¶

	Grafana	Guance
Usage Complexity	Installation and configuration are relatively cumbersome and require additional storage engines	Installed with one command, ready to use within 30 minutes
Documentation Completeness	Grafana's website has complete documentation, but there are fewer Chinese documents, which can be a headache for those who aren't proficient in English.	Has very complete Chinese documentation and a large number of use case guidance courses.
Community Activity	Active community, strong development and maintenance teams, fast version upgrades and iterations.	Commercial product, very active community, strong development and maintenance teams, quick problem-solving, fast version upgrades and iterations.
Function Completeness	Has 54 data sources, 173+ Dashboards, rich dashboard plugins such as heatmaps, line charts, charts, etc., supports simple alerts, etc.	Has over 200 data source integrations, 200+ Dashboards, multiple operating system supports, provides a unified standard DQL query for various data types, unifies management of metric data, log data, APM layer data, infrastructure, containers, middleware, network performance, supports powerful anomaly detection and advanced permission functions, supports complex alert rule configurations, etc.
Development Trend	Market share is on an upward trend, and it continues to rapidly develop and improve.	As a mature commercial product and a leader in the observability domain, market share is on an upward trend, and it continues to rapidly develop and improve.
Performance	Low resource consumption	Unified management, low resource consumption, binary files for efficient transmission, low bandwidth usage.
Serverless Programming	None	Based on Python3.x sandbox environment
Cost	Free	Free
Service	Community assistance	Professional technical team support

Requirements Matching¶

Through the comparison above, we find that using "Guance" can significantly reduce usage costs. Installation, configuration, and management are extremely convenient. In contrast to Grafana, which only serves as a display platform and still depends on external storage engines as data sources, "Guance" collects and manages metric data, log data, APM layer data, infrastructure, containers, middleware, network performance uniformly, making it much more convenient. This reduces the hassle of installing and maintaining storage engines. Also, Grafana lacks complete Chinese documentation, which might be problematic for users not fluent in English. Conversely, "Guance" offers complete Chinese documentation and numerous instructional videos, making it easier to get started with the product and focus on actual requirements. As a commercial product, even when using it for free, "Guance" provides professional technical support and has a large community to exchange insights about product usage. Functionally, it exceeds Grafana with powerful anomaly detection, advanced permission functions, and support for complex alert rule configurations, meeting needs beyond fee analysis effectively. Furthermore, "Guance" allows unified component management through a visual interface, consuming low resources, with binary file data for high-efficiency transmission and low bandwidth usage. Additionally, for aesthetically inclined users, "Guance" offers a minimalist style design that stands out, making it the preferred choice for our needs.

Implementing Cost Management with Guance¶

Deployment Instructions¶

Example Linux version: CentOS Linux release 7.8.2003 (Core)

Collect all Alibaba Cloud billing cost data through a single server.

Prerequisites¶

Install DataKit¶

Before starting to monitor hosts with "Guance", you need to install DataKit first. DataKit is the officially released data collection application, supporting the collection of hundreds of types of data. By configuring data sources, real-time data can be collected, including host, process, container, log, application performance, user visits, and more.

Before installing DataKit, you need to register a "Guance" account first. After registration, log in to the "Guance" workspace to obtain the DataKit installation instructions and deploy the first DataKit.

Obtain Installation Instructions¶

You can log in to the "Guance" workspace, click sequentially on 「Integration」 - 「DataKit」, choose the DataKit installation method as shown below, then copy the 「Installation Instruction」 and execute it on the host.

Installation System: Linux
System Type: X86 amd64
DataWay Address: OpenWay

Execute Installation Instructions on Host¶

Open the command-line terminal tool, log in to the server, and execute the copied 「Installation Instruction」. After successful installation, it will prompt Install Success, and you can view the installation status, manual, and update records of DataKit via the provided link.

Start Using "Guance"¶

After successfully installing DataKit, the host object collector hostobject is already enabled by default. You can directly view the host installed with DataKit under the 「Infrastructure」 - 「Host」 section of the "Guance" workspace, including host status, hostname, operating system, CPU usage rate, MEM usage rate, CPU single-core load, etc. You can also click on the host to view more detailed information about the host.

Install Func Portable Edition¶

System and Environment Requirements¶

The host running DataFlux Func must meet the following conditions:

CPU core count >= 2
Memory capacity >= 4GB
Disk space >= 20GB
Network bandwidth >= 10 Mbps
Operating system Ubuntu 16.04 LTS/CentOS 7.2 or higher
Clean system (after installing the operating system, except for network configuration, no other operations have been performed)
Open 8088 port (the system defaults to using 8088 port, please ensure firewall, security groups, etc., allow 8088 inbound access)
When using external MySQL, the MySQL version must be 5.7 or higher
When using external Redis, the Redis version must be 4.0 or higher

Note: DataFlux Func does not support MacOS, Windows, you can choose to install DataFlux Func in a virtual machine, cloud host

Note: DataFlux Func does not support cluster Redis, for high availability, please choose master-slave

Note: If installing DataFlux Func on Alibaba Cloud ECS and the Alibaba Cloud Shield plugin is enabled, since the cloud shield itself consumes a lot of resources, the system configuration should be appropriately increased

Download Command for Portable Edition¶

/bin/bash -c "$(curl -fsSL https://t.guance.com/func-portable-download)"

Note: All shell commands mentioned in this article can be run directly under the root user, and non-root users need to add sudo

Note: This article only provides the most common operation steps, detailed installation deployment please refer to 「Maintenance Manual」

Execute Automatic Installation Script¶

In the already downloaded dataflux-func-portable directory,
run the following command to automatically configure and finally start the entire DataFlux Func:

Note: Please confirm system requirements and server configurations before installation

Note: DataFlux Func does not support Mac, please copy it to a Linux system and run the installation

sudo /bin/bash run-portable.sh

Using the automatic installation script can achieve quick installation and operation within minutes, with the following automatic configurations:

Running MySQL, Redis, DataFlux Func (including Server, Worker, Beat)
Automatically creating and saving all data under /usr/local/dataflux-func/ directory (including MySQL data, Redis data, DataFlux Func configuration, log files, etc.)
Randomly generating MySQL root user password, system Secret, and saving them in the DataFlux Func configuration file
No password set for Redis
No external access provided for MySQL, Redis

After completion, you can use a browser to access http://{server IP address/domain}:8088 for initialization operations.

Note: If the running environment performance is poor, please confirm all components are successfully started using the docker ps command before accessing (see the following list)

dataflux-func_mysql
dataflux-func_redis
dataflux-func_server
dataflux-func_worker-0
dataflux-func_worker-1-6
dataflux-func_worker-7
dataflux-func_worker-8-9
dataflux-func_beat

Obtain RAM Access Control¶

Log in to the RAM console https://ram.console.aliyun.com/users
Create a new user: Personnel Management - User - Create User
Save or download the AccessKeyID and AccessKey Secret CSV file (configuration file will use it)
User authorization (billing permissions)

Configuration Implementation¶

Log in to DataFlux Function¶

Log in to Func at http://ip:8088 (default admin/admin)

Create Script Sets¶

Enter title/description information

Edit Script¶

Write the script to write billing data into DataKit for report creation preparation.

Complete script as follows:

import sys, datetime
import time
import json
import urllib
import hmac
from hashlib import sha1
import base64
import random
import requests


# Common parameters required in the request (parameters needed for all API calls)
D = {
    'BillingCycle':str(time.strftime("%Y-%m", time.gmtime())),
    'Action':'QuerySettleBill',
    # 'PageNum':'5',
    'Format':'JSON',
    'Version':'2017-12-14',
    'AccessKeyId':'<AccessKeyId>',
    'SignatureMethod':'HMAC-SHA1',
    'MaxResults' : '300',
    # 'NextToken':"", #?
    'Timestamp':str(time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())),
    'SignatureVersion':'1.0'
    # 'SignatureNonce':str_seed
}
# Current time
now_time = str(time.strftime("%Y-%m-%d", time.gmtime()))

# Link local Datakit
datakit = DFF.SRC('datakit')

# Use urllib in Python for encoding
def percentEncode(str):
        res = urllib.parse.quote(str.encode('utf8'), '')
        res = res.replace('+', '%20')
        res = res.replace('*', '%2A')
        res = res.replace('%7E', '~')
        return res


# Get billing
@DFF.API('getBill')
def getBill():
    # Current record position of the bill
    next_token = ""
    # Loop to get bills and write into DataKit
    for i in range(10000):
        random.seed()
        # Unique random number, used to prevent network replay attacks. Users should use different random values between different requests.
        D["SignatureNonce"] = str(random.random())
        D["NextToken"] = next_token
        # Since signatures require uniqueness, including order, the parameters must be sorted by name
        sortedD = sorted(D.items(),key=lambda x: x[0])
        canstring = ''
        for k,v in sortedD:
            canstring += '&' + percentEncode(k) + '=' + percentEncode(v)
        # Generate standardized request strings
        stringToSign = 'GET&%2F&' + percentEncode(canstring[1:])
        # access_key_secret
        access_key_secret = '<access_key_secret>'
        # Calculate HMAC value
        h = hmac.new((access_key_secret + "&").encode('utf8'), stringToSign.encode('utf8'), sha1)
        # Calculate the signature value to generate the signature
        signature = base64.encodestring(h.digest()).strip()
        # Add the signature
        D['Signature'] = signature
        # Final API call
        url = 'http://business.aliyuncs.com/?' + urllib.parse.urlencode(D)
        # Request Alibaba Cloud billing costs
        response = requests.get(url)
        billing_cycle = response.json()["Data"]["BillingCycle"]
        account_id = response.json()["Data"]["AccountID"]
        next_token = response.json()["Data"]["NextToken"]
        if next_token is not None:
            bill = response.json()["Data"]["Items"]["Item"]
            print(bill)
            # Write the daily bill into Guance
            for i in bill:
                print(i["UsageEndTime"])
                time = i["UsageEndTime"].split(" ")[0]
                print(time, now_time)
                if time == now_time:
                    measurement = "aliyunSettleBill"
                    tags = {
                        "BillingCycle": billing_cycle,
                        "AccountID": account_id
                    }
                    fields = {
                        "ProductName":i["ProductName"],
                        "SubOrderId":i["SubOrderId"],
                        "BillAccountID":i["BillAccountID"],
                        "DeductedByCashCoupons":i["DeductedByCashCoupons"],
                        "PaymentTime":i["PaymentTime"],
                        "PaymentAmount":i["PaymentAmount"],
                        "DeductedByPrepaidCard":i["DeductedByPrepaidCard"],
                        "InvoiceDiscount":i["InvoiceDiscount"],
                        "UsageEndTime":i["UsageEndTime"],
                        "Item":i["Item"],
                        "SubscriptionType":i["SubscriptionType"],
                        "PretaxGrossAmount":i["PretaxGrossAmount"],
                        "Currency":i["Currency"],
                        "CommodityCode":i["CommodityCode"],
                        "UsageStartTime":i["UsageStartTime"],
                        "AdjustAmount":i["AdjustAmount"],
                        "Status":i["Status"],
                        "DeductedByCoupons":i["DeductedByCoupons"],
                        "RoundDownDiscount":i["RoundDownDiscount"],
                        "ProductDetail":i["ProductDetail"],
                        "ProductCode":i["ProductCode"],
                        "ProductType":i["ProductType"],
                        "OutstandingAmount":i["OutstandingAmount"],
                        "BizType":i["BizType"], 
                        "PipCode":i["PipCode"],
                        "PretaxAmount":i["PretaxAmount"],
                        "OwnerID":i["OwnerID"],
                        "BillAccountName":i["BillAccountName"],
                        "RecordID":i["RecordID"],
                        "CashAmount":i["CashAmount"],
                    }
                    try:
                        status_code, result = datakit.write_logging(measurement=measurement, tags=tags, fields=fields)
                        print(status_code,result)
                    except:
                        print("Insert failed!")
                else:
                    break
            else:
                continue
            break
        else:
            break

Publish Script¶

Save configuration and Publish

Create Scheduled Task¶

Add an auto-trigger task, Management - Auto Trigger Configuration - New Task. Since the bill is a daily bill, the collection frequency can be set once a day.

View Reported Data¶

Log preview

Create Explorer¶

Import Explorer