Logback Socket Log Collection Best Practices¶
Introduction¶
In a typical company, a single Guance workspace collects logs from multiple applications, and distinguishing which log comes from which Service is a common challenge. This article explores how to use Pipelines to add a Service tag to logs so that their sources can be told apart.
DataKit supports many ways to collect logs. This article focuses on collecting logs from a Java Spring Boot application over a socket, transmitting them to DataKit through Logback's socket appender. First, the operations team enables the socket collector in DataKit and restarts DataKit. Then developers add an appender to the application's logback-spring.xml file and declare springProperty entries so that the Service name can be written into the logs; they then start the jar, passing in the Service name as a parameter. Finally, developers log in to Guance, create a new Pipeline under the Pipelines tab of the Logs module, and select the Source of the socket collector enabled by the operations team. The logs can then be distinguished by their tags.
The following solution will be implemented from both the development and operations perspectives.
Solution¶
Operations¶
Linux Environment¶
1 Enable Collector¶
Log in to the Linux server where DataKit is deployed and create a logging-socket.conf file in DataKit's conf.d/log directory (by default /usr/local/datakit/conf.d/log/).
The content of the logging-socket.conf file is as follows:
[[inputs.logging]]
  # only two protocols are supported: TCP and UDP
  sockets = [
    "tcp://0.0.0.0:9542",
    # "udp://0.0.0.0:9531",
  ]
  ignore = [""]
  source = "socketdefault"
  service = ""
  pipeline = ""
  ignore_status = []
  character_encoding = ""
  # multiline_match = '''^\S'''
  remove_ansi_escape_codes = false

  [inputs.logging.tags]
    # some_tag = "some_value"
    # more_tag = "some_other_value"
2 Restart DataKit¶
Restart DataKit so that the new collector configuration takes effect; on a systemd-managed host this is typically systemctl restart datakit.
Kubernetes Environment¶
Log in to Guance, go to 【Integration】->【DataKit】->【Kubernetes】, and follow the instructions to install DataKit. The datakit.yaml used for deployment needs two changes: add a logging-socket.conf entry to the ConfigMap, and mount it into the DataKit container.
1 Add Configuration to ConfigMap¶
logging-socket.conf: |-
  [[inputs.logging]]
    # only two protocols are supported: TCP and UDP
    sockets = [
      "tcp://0.0.0.0:9542",
      # "udp://0.0.0.0:9531",
    ]
    ignore = [""]
    source = "pay-socket-service"
    service = ""
    pipeline = ""
    ignore_status = []
    character_encoding = ""
    # multiline_match = '''^\S'''
    remove_ansi_escape_codes = false
    [inputs.logging.tags]
      # some_tag = "some_value"
      # more_tag = "some_other_value"
2 Mount logging-socket.conf¶
- mountPath: /usr/local/datakit/conf.d/log/logging-socket.conf
  name: datakit-conf
  subPath: logging-socket.conf
3 Restart DataKit¶
Reapply the deployment (kubectl apply -f datakit.yaml) and delete the existing DataKit pods so the DaemonSet recreates them with the new configuration.
Parameter Explanation¶
- sockets: Protocols and ports to listen on; only TCP and UDP are supported.
- ignore: File-path filters using glob rules; files matching any filter condition are not collected.
- source: Data source name.
- service: Additional tag; if left empty, it defaults to the value of source.
- pipeline: Path of the Pipeline script, if one is used.
- character_encoding: Character encoding of the log text.
- multiline_match: Multi-line matching pattern.
- remove_ansi_escape_codes: Whether to strip ANSI escape codes (e.g., colored console output); true or false.
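Before wiring up the application, you can verify that the socket collector is reachable by sending a single JSON log line over TCP by hand. The sketch below mimics what the Logback socket appender will later emit (one JSON object per line); the host, port, and field values are assumptions for testing and should be adjusted to your environment.

```python
import json
import socket

def send_test_log(host: str, port: int) -> str:
    """Send one newline-delimited JSON log line over TCP,
    mimicking the output of the Logback socket appender."""
    line = json.dumps({
        "severity": "INFO",
        "source": "pay-socket-source",    # hypothetical test values
        "service": "pay-socket-service",
        "msg": "socket collector smoke test",
    }) + "\n"
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("utf-8"))
    return line

# Example: send_test_log("172.26.0.231", 9542)
```

If the collector is working, the test line should shortly appear in the Guance log explorer under the source configured above.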
Development¶
1 Add Dependencies¶
Add the following dependencies to the project's pom.xml:
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>4.9</version>
</dependency>
2 Add Appender to Log Configuration¶
In this step, the DataKit address, socket port, Source, and Service are defined as external parameters. In the project's logback-spring.xml file, add an appender and declare the springProperty entries datakitHostIP, datakitSocketPort, datakitSource, and datakitService, whose values are taken from the external parameters guangce.datakit.host_ip, guangce.datakit.socket_port, guangce.datakit.source, and guangce.datakit.service.
<?xml version="1.0" encoding="UTF-8"?>
<configuration scan="true" scanPeriod="60 seconds" debug="false">
    <springProperty scope="context" name="datakitHostIP" source="guangce.datakit.host_ip" />
    <springProperty scope="context" name="datakitSocketPort" source="guangce.datakit.socket_port" />
    <springProperty scope="context" name="datakitSource" source="guangce.datakit.source" />
    <springProperty scope="context" name="datakitService" source="guangce.datakit.service" />
    <contextName>logback</contextName>
    <!-- Log root directory -->
    <property name="log.path" value="./logs/order"/>
    <!-- Log output format -->
    <property name="log.pattern" value="%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{20} - [%method,%line] - - %msg%n" />
    <!-- Print logs to the console -->
    <appender name="Console" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${log.pattern}</pattern>
        </encoder>
    </appender>
    ...
    <appender name="socket" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
        <destination>${datakitHostIP:-}:${datakitSocketPort:-}</destination>
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp>
                    <timeZone>Asia/Shanghai</timeZone>
                </timestamp>
                <pattern>
                    <pattern>
                        {
                        "severity": "%level",
                        "source": "${datakitSource}",
                        "service": "${datakitService}",
                        "method": "%method",
                        "line": "%line",
                        "thread": "%thread",
                        "class": "%logger{40}",
                        "msg": "%message\n%exception"
                        }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="Console"/>
        ...
        <appender-ref ref="socket" />
    </root>
</configuration>
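Before pointing the appender at DataKit, it can help to see exactly what it sends. The following throwaway local receiver (not part of the solution itself) listens on a port and prints every newline-delimited JSON line it receives; port 9542 is an assumption matching the collector configuration in this article.

```python
import socketserver
import threading

class LogLineHandler(socketserver.StreamRequestHandler):
    """Print every newline-delimited JSON log line the appender sends."""
    def handle(self):
        for raw in self.rfile:
            print(raw.decode("utf-8").rstrip())

def start_receiver(host: str = "0.0.0.0", port: int = 9542) -> socketserver.TCPServer:
    """Start the receiver in a background thread and return the server handle."""
    server = socketserver.TCPServer((host, port), LogLineHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# Example:
#   server = start_receiver()
#   ... run the app with guangce.datakit.host_ip pointing at this machine ...
#   server.shutdown()
```

Run it on your workstation, point guangce.datakit.host_ip at it, and the exact JSON produced by the pattern above will scroll by, which makes pipeline debugging much easier.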
3 Configure Default Values¶
Add the following configuration to the application.yml file. These default parameters will be passed to Logback.
guangce:
  datakit:
    host_ip: 127.0.0.1 # DataKit address
    socket_port: 9542  # DataKit socket port
    #source: mySource   # If not specified, it will use the source defined by the socket collector
    #service: myService # If not specified, it will use the service defined by the socket collector
4 Run Application¶
1 Linux Environment¶
Execute the following command to start the application. If no parameters are passed, it will use the default values from application.yml.
java -jar pay-service-1.0-SNAPSHOT.jar --guangce.datakit.host_ip=172.26.0.231 --guangce.datakit.socket_port=9542 --guangce.datakit.source=pay-socket-source --guangce.datakit.service=pay-socket-service
2 Kubernetes Environment¶
Add PARAMS to the Dockerfile to receive external parameters.
FROM openjdk:8u292
# Set the container time zone
RUN echo 'Asia/Shanghai' > /etc/timezone
ENV jar pay-service-1.0-SNAPSHOT.jar
ENV workdir /data/app/
RUN mkdir -p ${workdir}
COPY ${jar} ${workdir}
WORKDIR ${workdir}
# PARAMS carries the external --guangce.datakit.* arguments; console output is discarded
# because logs are shipped over the socket appender instead
ENTRYPOINT ["sh", "-ec", "exec java ${JAVA_OPTS} -jar ${jar} ${PARAMS} > /dev/null 2>&1"]
Build and push the image:
docker build -t 172.16.0.238/df-demo/istio-pay:v1 -f DockerfilePay .
docker push 172.16.0.238/df-demo/istio-pay:v1
Write the pay-deployment.yaml file and add the PARAMS environment variable to pass the guangce.datakit.host_ip, guangce.datakit.socket_port, guangce.datakit.source, and guangce.datakit.service values. If these are not passed, the defaults from application.yml are used.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istio-pay-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: istio-pay-pod
  template:
    metadata:
      labels:
        app: istio-pay-pod
    spec:
      containers:
        - env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: DD_AGENT_HOST
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
            - name: PARAMS
              value: "--guangce.datakit.host_ip=$(DD_AGENT_HOST) --guangce.datakit.socket_port=9542 --guangce.datakit.source=pay-socket-source --guangce.datakit.service=pay-socket-service"
          name: pay-container
          image: 172.16.0.238/df-demo/istio-pay:v1
          ports:
            - containerPort: 8091
              protocol: TCP
          resources:
            limits:
              memory: 512Mi
            requests:
              memory: 256Mi
      restartPolicy: Always
      volumes:
        - emptyDir: {}
          name: datadir
5 Configure Pipeline¶
Since the socket appender outputs logs in JSON format, DataKit needs a Pipeline to parse the JSON string. Because source and service are default tags rather than ordinary fields, set_tag must be used to promote them.
Log in to Guance, go to 【Logs】->【Pipelines】, click 【Create Pipeline】, and select the source name socketdefault defined when enabling the Socket collector. Define the parsing rules as follows:
json(_,msg,"message")
json(_,class,"class")
json(_,thread,"thread")
json(_,severity,"status")
json(_,method,"method")
json(_,line,"line")
json(_,source,"source")
json(_,service,"service")
json(_,`@timestamp`,"time")
set_tag(service)
set_tag(source)
default_time(time)
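To illustrate what these rules do, the sketch below emulates the Pipeline in plain Python: it parses one JSON log line, renames fields the way the json() calls do, and promotes source and service to tags the way set_tag() does. This is only a mental model of the transformation, not DataKit's actual implementation.

```python
import json

# JSON key -> Guance field name, mirrored from the json() rules above
FIELD_MAP = {
    "msg": "message",
    "class": "class",
    "thread": "thread",
    "severity": "status",
    "method": "method",
    "line": "line",
    "@timestamp": "time",
}
TAG_KEYS = ("source", "service")  # promoted to tags via set_tag()

def emulate_pipeline(raw_line: str) -> dict:
    """Emulate the Pipeline on one JSON log line: extract and rename
    fields, and promote source/service from fields to tags."""
    doc = json.loads(raw_line)
    fields = {dst: doc[src] for src, dst in FIELD_MAP.items() if src in doc}
    tags = {key: doc[key] for key in TAG_KEYS if key in doc}
    return {"fields": fields, "tags": tags}
```

Feeding in a line produced by the appender yields status/message fields plus source/service tags, which is what the Guance log explorer ultimately displays.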
After testing with log samples and confirming the rules work as expected, click 【Save】. Note that these parsing rules correspond one-to-one to the pattern added in the logback-spring.xml file:
<pattern>
    {
    "severity": "%level",
    "source": "${datakitSource}",
    "service": "${datakitService}",
    "method": "%method",
    "line": "%line",
    "thread": "%thread",
    "class": "%logger{40}",
    "msg": "%message\n%exception"
    }
</pattern>
View Log Files¶
Access the application's endpoints to generate some logs. Then log in to Guance, go to 【Logs】->【Data Collection】, and select pay-socket-source to view the log details. You can see that source and service have been replaced by the externally passed values.