Log Streaming
Starts an HTTP server that receives log text data and reports it to Guance Cloud.
The HTTP URL is fixed as /v1/write/logstreaming, that is, http://Datakit_IP:PORT/v1/write/logstreaming
Note: If DataKit is deployed in Kubernetes as a daemonset, it can be accessed as a Service at http://datakit-service.datakit:9529
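As a minimal sketch of what a client of this endpoint might look like, the snippet below builds a request that carries plain-text log lines in its body. The host 127.0.0.1, the port 9529, the POST method, the Content-Type header, and the helper names build_log_request and send are assumptions made for illustration; only the URL path comes from the text above.

```python
import urllib.request

# Fixed path taken from the documentation above.
LOGSTREAMING_PATH = "/v1/write/logstreaming"


def build_log_request(host, port, log_text):
    """Build (but do not send) a request carrying plain-text log lines."""
    url = f"http://{host}:{port}{LOGSTREAMING_PATH}"
    return urllib.request.Request(
        url,
        data=log_text.encode("utf-8"),
        method="POST",  # assumption: logs are submitted with POST
        headers={"Content-Type": "text/plain; charset=utf-8"},  # assumption
    )


def send(request, timeout=5.0):
    """Send the request to a running DataKit and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# 127.0.0.1 and 9529 are placeholders for Datakit_IP and PORT.
req = build_log_request("127.0.0.1", 9529, "first log line\nsecond log line\n")
print(req.get_method(), req.full_url)
```

Calling send(req) only makes sense against a reachable DataKit instance, so the sketch stops at building the request.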
Configuration¶
Collector Configuration¶
Go to the conf.d/log directory under the DataKit installation directory, copy logstreaming.conf.sample and name it logstreaming.conf. An example is as follows:
[inputs.logstreaming]
ignore_url_tags = false
## Threads config controls how many goroutines an agent could start to handle HTTP requests.
## buffer is the size of the worker channel's job buffer.
## threads is the total number of goroutines at running time.
# [inputs.logstreaming.threads]
# buffer = 100
# threads = 8
## Storage config a local storage space on the hard drive to cache trace data.
## path is the local file path used to cache data.
## capacity is total space size(MB) used to store data.
# [inputs.logstreaming.storage]
# path = "./log_storage"
# capacity = 5120
Once configured, restart DataKit.
In Kubernetes, the collector can be turned on by injecting the collector configuration through a ConfigMap.
Support Parameter¶
logstreaming supports adding parameters to the HTTP URL to manipulate log data. The list of parameters is as follows:
- type: Data format; currently only influxdb and firelens are supported.
  - When type is influxdb (/v1/write/logstreaming?type=influxdb), the data itself is in line protocol format (default precision is s). Only the built-in tags are added and no other processing is done.
  - When type is firelens (/v1/write/logstreaming?type=firelens), the data should be multiple logs in JSON format.
  - When this value is empty, the data is split into lines and processed by the Pipeline.
- source: Identifies the source of the data, that is, the measurement of the line protocol, such as nginx or redis (/v1/write/logstreaming?source=nginx).
  - This value is not valid when type is influxdb.
  - Default is default.
- service: Adds a service tag field, such as (/v1/write/logstreaming?service=nginx_service).
  - Defaults to the source parameter value.
- pipeline: Specifies the Pipeline name to apply to the data, such as nginx.p (/v1/write/logstreaming?pipeline=nginx.p).
- tags: Adds custom tags, separated by commas, such as key1=value1 and key2=value2 (/v1/write/logstreaming?tags=key1=value1,key2=value2).
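The parameters above can be combined in a single URL. The following sketch assembles one with Python's standard library; the base URL with 127.0.0.1:9529 is a placeholder, and logstreaming_url is a hypothetical helper, not part of DataKit.

```python
from urllib.parse import urlencode

# Placeholder for http://Datakit_IP:PORT/v1/write/logstreaming
BASE_URL = "http://127.0.0.1:9529/v1/write/logstreaming"


def logstreaming_url(base, data_type=None, source=None, service=None,
                     pipeline=None, tags=None):
    """Append the supported query parameters to the logstreaming URL."""
    params = {}
    if data_type:
        params["type"] = data_type
    if source:
        params["source"] = source
    if service:
        params["service"] = service
    if pipeline:
        params["pipeline"] = pipeline
    if tags:
        # Tags are joined as key=value pairs separated by commas.
        params["tags"] = ",".join(f"{key}={value}" for key, value in tags.items())
    if not params:
        return base
    # Keep "=" and "," unescaped so the result matches the documented form.
    return base + "?" + urlencode(params, safe="=,")


url = logstreaming_url(
    BASE_URL,
    source="nginx",
    service="nginx_service",
    pipeline="nginx.p",
    tags={"key1": "value1", "key2": "value2"},
)
print(url)
```

This sketch does not handle tag keys or values that themselves contain a comma or an equals sign, since they would be ambiguous in the comma-separated form.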
FireLens data source types¶
The log, source, and date fields in this type of data will be treated specially. Data example:
[
{
"date": 1705485197.93957,
"container_id": "xxxxxxxxxxx-xxxxxxx",
"container_name": "nginx_demo",
"source": "stdout",
"log": "127.0.0.1 - - [19/Jan/2024:11:49:48 +0800] \"GET / HTTP/1.1\" 403 162 \"-\" \"curl/7.81.0\"",
"ecs_cluster": "Cluster_demo"
},
{
"date": 1705485197.943546,
"container_id": "f68a9aeb3d64493595e89f8821fa3f86-4093234565",
"container_name": "javatest",
"source": "stdout",
"log": "2024/01/19 11:49:48 [error] 1316#1316: *1 directory index of \"/var/www/html/\" is forbidden, client: 127.0.0.1, server: _, request: \"GET / HTTP/1.1\", host: \"localhost\"",
"ecs_cluster": "Cluster_Demo"
}
]
After extracting the two logs in the list, log will be used as the message field of the data, date will be converted to the time of the log, and source will be renamed to firelens_source.
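The field mapping described above can be illustrated with a short Python sketch. It is not DataKit's implementation: convert_firelens is a hypothetical helper, the output key time is chosen for illustration, and the date value is assumed to be Unix epoch seconds interpreted as UTC.

```python
import json
from datetime import datetime, timezone


def convert_firelens(payload):
    """Apply the log/date/source mapping to a JSON list of FireLens records."""
    converted = []
    for record in json.loads(payload):
        fields = dict(record)  # copy so the parsed input is left untouched
        out = {"message": fields.pop("log", "")}  # log becomes message
        seconds = fields.pop("date", None)
        if seconds is not None:
            # Assumption: date holds Unix epoch seconds.
            out["time"] = datetime.fromtimestamp(seconds, tz=timezone.utc)
        if "source" in fields:
            out["firelens_source"] = fields.pop("source")  # source is renamed
        out.update(fields)  # remaining fields are carried over unchanged
        converted.append(out)
    return converted


sample = json.dumps([
    {
        "date": 1705485197.93957,
        "container_name": "nginx_demo",
        "source": "stdout",
        "log": "example log line",
    }
])
result = convert_firelens(sample)
print(result[0]["message"], result[0]["firelens_source"])
```

How DataKit treats the remaining fields, such as container_name, is not stated above; carrying them over unchanged is an assumption of this sketch.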
Usage¶
- Fluentd uses Influxdb Output doc
- Fluentd uses HTTP Output doc
- Logstash uses Influxdb Output doc
- Logstash uses HTTP Output doc
Simply configure the Output Host as the logstreaming URL (http://Datakit_IP:PORT/v1/write/logstreaming) and add the corresponding parameters.
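For the InfluxDB outputs listed above, the request body is expected to be line protocol with second precision. As a rough sketch of what one such line might look like, the hypothetical helper below formats a single log entry by hand. The measurement name, the tag set, and the use of message as the field key are assumptions for illustration.

```python
def escape_tag(text):
    """Escape commas, equals signs, and spaces in tag keys and values."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(measurement, tags, message, timestamp_s):
    """Format one log entry as a line protocol line with a seconds timestamp."""
    measurement_part = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(
        f",{escape_tag(key)}={escape_tag(value)}"
        for key, value in sorted(tags.items())
    )
    # String field values need backslashes and double quotes escaped.
    field_value = message.replace("\\", "\\\\").replace('"', '\\"')
    return f'{measurement_part}{tag_part} message="{field_value}" {int(timestamp_s)}'


line = to_line_protocol("nginx", {"host": "web 1"}, 'GET "/" 200', 1705485197)
print(line)
```

A body made of such lines would be sent to /v1/write/logstreaming?type=influxdb; in practice the Fluentd or Logstash InfluxDB output plugin produces these lines itself.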
Metric¶
default¶
Uses the source field in the config file; the default is default.
- Tags
Tag | Description |
---|---|
ip_or_hostname | Request IP or hostname. |
service | Service name, taken from the service parameter in the URL. |
- Metrics
Metric | Description | Type | Unit |
---|---|---|---|
message | Message text, present by default. A Pipeline can be used to delete this field. | string | - |
status | Log status. | string | - |