# LOG
In modern IT infrastructure, systems can generate thousands of log events per minute. These logs follow specific formats, typically carry timestamps, and are written by servers to different files such as system logs, application logs, and security logs. Because logs are stored separately on individual server nodes, operations personnel must log in to multiple servers to review them when a failure occurs, which significantly increases the complexity and time cost of troubleshooting.
Faced with massive log data, teams often encounter data management challenges:

- Which logs should be sent to the log management platform in real time?
- Which can be archived for later processing?

If filtering is performed during the collection phase, critical failure information might be missed or valuable data accidentally deleted, creating hidden risks for subsequent problem investigation.
To address these challenges, building a centralized log management platform is crucial. With strong collection capabilities, log data from distributed environments can be uniformly reported to a single workspace, enabling centralized storage, audit monitoring, intelligent alerting, and in-depth analysis. This avoids the data-loss risk of pre-filtering, while a unified search interface and correlation analysis significantly improve fault diagnosis efficiency.
The LOG feature of Guance is designed based on this concept. It transforms originally isolated log data into a "connector" that runs through the entire observability system, enabling operations teams to proactively grasp system status and quickly locate the root cause of problems in emergencies, thereby achieving a shift from reactive response to proactive prevention in operations mode.
## Getting Started

### Collection and Integration
Guance provides flexible log collection solutions through the DataKit collector. You can choose the appropriate method based on your environment:
- Host Log Collection: After installing DataKit on a server, specify the log file path in the collector configuration to collect text-format log files.
- K8s Environment Collection: Deploy DataKit as a DaemonSet in a Kubernetes cluster to automatically collect logs from container standard output (stdout/stderr).
- Receive Third-party Logs: Receive log data from tools like Fluentd, Logstash, and Kafka via HTTP/S or TCP protocols, staying compatible with your existing technology stack.
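For host log collection, the configuration is a small TOML fragment under DataKit's `conf.d` directory. A minimal sketch, assuming the `logging` input and with illustrative paths and source name:

```toml
# Sketch of a DataKit logging collector config (e.g. conf.d/log/logging.conf);
# file paths, source, and pipeline name are illustrative.
[[inputs.logging]]
  # Log files to tail; glob patterns are supported
  logfiles = ["/var/log/nginx/access.log", "/var/log/nginx/error.log"]
  # Value of the `source` field attached to each collected log
  source = "nginx"
  # Optional: Pipeline script applied to each log line
  pipeline = "nginx.p"
```

After editing the configuration, restart DataKit so the collector picks it up.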
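In a Kubernetes cluster, per-workload collection hints can be declared as pod annotations that the DaemonSet-deployed DataKit reads. A hedged sketch, assuming the `datakit/logs` annotation key described in DataKit's docs; the app and service names are illustrative:

```yaml
# Sketch: per-pod log collection hints for DataKit (annotation key and
# JSON schema per DataKit's docs; values are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    datakit/logs: |
      [{
        "source": "my-app",
        "service": "my-app",
        "pipeline": "my-app.p"
      }]
spec:
  containers:
    - name: my-app
      image: my-app:latest
```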
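For third-party integration over HTTP, tools like Fluentd or Logstash POST raw log lines to DataKit's HTTP listener. A minimal sketch that only builds the target URL, assuming DataKit's default port 9529 and a `logstreaming`-style endpoint; treat the path and parameter names as assumptions to verify against your DataKit version:

```python
from urllib.parse import urlencode

# Assumed local DataKit HTTP listener (default port 9529); the endpoint
# path and query parameter names below are assumptions, not guarantees.
DATAKIT = "http://localhost:9529"

def logstreaming_url(source: str, service: str) -> str:
    """Build the URL a Fluentd/Logstash HTTP output would POST log lines to."""
    query = urlencode({"source": source, "service": service})
    return f"{DATAKIT}/v1/write/logstreaming?{query}"

print(logstreaming_url("nginx", "frontend"))
```

Point your log shipper's HTTP output at the resulting URL; the `source` and `service` values become fields on the ingested logs.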
### Processing and Parsing
Guance provides complete log processing capabilities through Pipeline:
- Structured Parsing: Use Grok patterns and regular expressions to extract key fields, such as status codes and timestamps, from raw log text.
- Data Standardization: Convert unstructured log text into a unified, standardized format, laying the foundation for subsequent analysis and querying.
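The two steps above can be sketched as a short Pipeline script. Function names follow DataKit's Pipeline DSL; the Grok pattern and field names are illustrative for an Nginx-style access log:

```
# Sketch of a Guance Pipeline script (illustrative pattern and fields)
grok(_, "%{IPORHOST:client_ip} - - \\[%{HTTPDATE:time}\\] \"%{WORD:method} %{NOTSPACE:url}\" %{INT:status_code} %{INT:bytes}")

# Cast extracted strings to numbers so they can be aggregated later
cast(status_code, "int")
cast(bytes, "int")

# Use the parsed timestamp as the log's event time
default_time(time)
```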
### Query and Analysis
Guance provides powerful query and analysis functions with the Log Explorer:
- Precise Search: Filter and retrieve logs by field to quickly locate target entries.
- Intelligent Analysis: Automatically identify log patterns through clustering analysis, and surface data trends through visual charts.
- Problem Investigation: View a log's surrounding context, and correlate it with the corresponding traces and infrastructure metrics.
- Team Collaboration: Share findings securely across the team with the data Snapshot functionality.
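Beyond the UI filters, the explorer also accepts DQL queries. A hedged sketch of a filtering query in the `L::` (logging) namespace; the source name and field names are illustrative:

```
# Last 20 server-error logs from one source on one host (illustrative)
L::nginx { status_code >= 500, host = 'web-01' } LIMIT 20
```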
### Monitoring and Alerting
Guance provides intelligent log monitoring and alerting capabilities:
- Intelligent Monitoring and Alerting: Create Monitors based on log data for real-time anomaly detection and alert notifications within seconds, ensuring issues are discovered and handled promptly.
- Fine-grained Cost Governance: For high-volume log scenarios, filter out irrelevant logs at the collection end via the Blacklist, and optimize costs at the storage end through Multi-index management and tiered Storage Policies.
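A log monitor periodically evaluates a detection query and alerts when the result crosses a threshold. A sketch of the kind of aggregating DQL such a monitor might run; the source name and field are illustrative:

```
# Count of 5xx logs per host over the detection window (illustrative);
# the monitor would alert when this count exceeds a configured threshold
L::nginx:(COUNT(*)) { status_code >= 500 } BY host
```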
### Security and Compliance
Guance provides a comprehensive log data security control solution:
- Access Control: Use the Data Access functionality to configure data query scopes by member role, achieving fine-grained permission control.
- Sensitive Data Handling: Use Field Display Permissions (sensitive data masking) to mask sensitive information in logs, such as ID numbers, keys, and tokens, ensuring data compliance.
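Field Display Permissions masks values at query time; as a complementary approach, sensitive values can also be scrubbed at collection time in a Pipeline script. A hedged sketch using functions from DataKit's Pipeline DSL; the field names and regex are illustrative:

```
# Keep only the first 6 and last 4 digits of an 18-digit ID number
# (illustrative field name and pattern)
replace(id_number, "(\\d{6})\\d{8}(\\d{4})", "$1********$2")

# Mask the first 64 characters of a token field with asterisks
cover(token, [1, 64])
```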
### Storage and Archiving
Guance provides a comprehensive log storage management solution:
- Multi-index Management: Create multiple log indices, route data to different indices by dimensions such as log source or business attributes, and configure differentiated storage policies for each index.
- Data Forwarding and Archiving: Archive log data long-term to Guance object storage, or forward it in real time to external storage systems, to meet requirements for data backup, auditing, or further processing.
## Quick Start
Select the operation path that matches your current stage:
| Stage | Operation Path | Key Documents |
|---|---|---|
| Initial Integration | Install DataKit and configure log collection | Host Log Collection, K8s Log Collection |
| Data Parsing | Write Pipeline scripts to extract key fields | Pipeline User Guide |
| Query and Analysis | Use the Log Explorer for search and analysis | Log Explorer |
| Monitoring and Alerting | Configure log detection monitors | Log Detection |
| Cost Optimization | Configure blacklist, multi-index, and storage policies | Blacklist, Multi-index, Data Storage Policies |
| Security and Compliance | Configure data access rules and sensitive field masking | Data Access, Field Display Permissions |