Flameshot
Flameshot is a lightweight automated profiling tool running in Sidecar mode. It monitors the resource usage (CPU/Memory) of target processes and automatically triggers underlying Profilers (such as async-profiler) when preset thresholds are reached, enabling non-intrusive on-site snapshot collection.
Core Concepts¶
Operating Mode¶
Flameshot is deployed using the Sidecar Container pattern. It must run in the same Pod as the main business container (Main Container) and have PID namespace sharing enabled.
- Monitor: Flameshot continuously polls the resource levels of target processes within the main container.
- Trigger: When thresholds are met (e.g., CPU > 80%) or an HTTP API request is received, a collection task is triggered.
- Execute: Based on the configured language type (currently supporting Java), it invokes the corresponding Profiler tool to attach to the target process.
- Collect: The generated Profile files (e.g., .jfr) are stored in a shared volume and subsequently uploaded to the data observability center.
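The Monitor and Trigger steps can be pictured as a threshold check on each polled sample. The sketch below is illustrative only and is not Flameshot's actual implementation: the `Sample` type and the `exceeded_thresholds` function are hypothetical names, and it assumes that exceeding any single configured threshold is enough to start a collection.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One polled reading for a target process (hypothetical type)."""
    cpu_percent: float  # may exceed 100 on multi-core hosts
    mem_percent: float
    mem_mb: float


def exceeded_thresholds(sample: Sample, policy: dict) -> list:
    """Return the names of the policy thresholds that the sample exceeds.

    Assumption: a threshold that is absent from the policy is not checked,
    and any single exceeded threshold would trigger a collection.
    """
    checks = {
        "cpu_usage_percent": sample.cpu_percent,
        "mem_usage_percent": sample.mem_percent,
        "mem_usage_mb": sample.mem_mb,
    }
    return [
        name for name, value in checks.items()
        if policy.get(name) is not None and value > policy[name]
    ]


policy = {"cpu_usage_percent": 80, "mem_usage_percent": 80, "mem_usage_mb": 1024}
calm = Sample(cpu_percent=35.0, mem_percent=40.0, mem_mb=512.0)
busy = Sample(cpu_percent=92.5, mem_percent=40.0, mem_mb=512.0)

print(exceeded_thresholds(calm, policy))  # []
print(exceeded_thresholds(busy, policy))  # ['cpu_usage_percent']
```

Returning the list of exceeded thresholds, rather than a bare boolean, makes it easy to log why a collection was started.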
Use Cases¶
- Production Safety Net: Automatically preserve on-site evidence before a service crashes due to CPU spikes or memory leaks.
- Performance Stress Test Analysis: Works alongside stress testing platforms to automatically capture performance hotspots under high load.
Configuration¶
All Flameshot behaviors are controlled via environment variables. Configuration is divided into Global Settings and Profiling Policies.
Global Environment Variables¶
These variables control the basic behavior of the Sidecar container.
| Variable Name | Required | Default Value | Description |
|---|---|---|---|
| FLAMESHOT_DATAKIT_ADDR | Yes | - | DataKit's Profiling data receiving interface address. |
| FLAMESHOT_PROFILING_PATH | Yes | /data | Shared directory path. Used to store tools and generated temporary files; must match the mount path in the main container. |
| FLAMESHOT_MONITOR_INTERVAL | No | 1 | Monitoring polling interval (seconds). |
| FLAMESHOT_LOG_LEVEL | No | info | Log level. Options: debug, info, warn, error. |
| FLAMESHOT_HTTP_LOCAL_IP | Yes | - | The Sidecar's own HTTP service listening host. |
| FLAMESHOT_HTTP_LOCAL_PORT | Yes | 8089 | The Sidecar's own HTTP service listening port. |
| FLAMESHOT_SERVICE | No | - | Overrides the service field configured in FLAMESHOT_PROCESSES. |
| FLAMESHOT_TAGS | No | - | Recommended to include host, pod_name, and pod_namespace, such as: "host: host_name,pod_name:pod_a" |
Profiling Policy Configuration¶
Target monitoring rules are defined via the FLAMESHOT_PROCESSES environment variable. The value must be a standard JSON Array string.
To maintain readability in Kubernetes YAML, it is strongly recommended to use YAML's block scalar syntax (|) for writing the JSON configuration, as shown below:
env:
  # ... other environment variables ...
  - name: FLAMESHOT_PROCESSES
    value: |
      [
        {
          "service": "user-service",
          "language": "java",
          "command": "^java.*user-service\\.jar$",
          "duration": "60s",
          "events": "cpu,alloc",
          "cpu_usage_percent": 80,
          "mem_usage_percent": 80,
          "mem_usage_mb": 1024,
          "tags": [
            "env:prod",
            "version:v1.2"
          ]
        }
      ]
Common Field Descriptions:
- service (String): Service name reported to the observability center.
- language (String): Target process language. Currently supports java.
- command (String): Regular expression to match the process command line.
- duration (String): Duration of a single collection (e.g., 30s, 1m). Note: To avoid execution timeouts, it is recommended not to exceed 5 minutes.
- tags (List): List of custom tags; recommended to include meta-information like env, version.
- cpu_usage_percent (Int): CPU trigger threshold (0-N). Values may exceed 100 in multi-core environments.
- mem_usage_percent (Int): Memory usage percentage trigger threshold (0-100).
- mem_usage_mb (Int): Memory usage absolute value trigger threshold (MB).
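Because the value is JSON embedded in YAML, it can help to sanity-check it before deploying. The following is a minimal sketch using only the Python standard library, not a tool shipped with Flameshot: `parse_duration` and `check_processes` are hypothetical helpers, the duration parser assumes only the `s` and `m` suffixes shown above, unanchored regex search is an assumption about how `command` is matched, and the sample command lines are invented for illustration.

```python
import json
import re

# The JSON text exactly as it would appear inside the YAML block scalar.
# A raw string keeps both backslashes, which JSON then decodes to the regex "\.".
RAW = r'''
[
  {
    "service": "user-service",
    "language": "java",
    "command": "^java.*user-service\\.jar$",
    "duration": "60s",
    "cpu_usage_percent": 80
  }
]
'''


def parse_duration(text: str) -> int:
    """Convert a duration such as "30s" or "1m" to seconds (assumed format)."""
    match = re.fullmatch(r"(\d+)([sm])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * 60 if unit == "m" else value


def check_processes(raw: str) -> list:
    """Parse the JSON array and run basic sanity checks on each rule."""
    rules = json.loads(raw)
    if not isinstance(rules, list):
        raise ValueError("FLAMESHOT_PROCESSES must be a JSON array")
    for rule in rules:
        re.compile(rule["command"])  # raises re.error on an invalid regex
        # The 5-minute limit is only a recommendation; this sketch treats it as an error.
        if "duration" in rule and parse_duration(rule["duration"]) > 300:
            raise ValueError("duration above the recommended 5 minutes")
    return rules


rules = check_processes(RAW)
pattern = re.compile(rules[0]["command"])
print(bool(pattern.search("java -jar /app/user-service.jar")))   # True
print(bool(pattern.search("java -jar /app/order-service.jar")))  # False
print(parse_duration("1m"))  # 60
```

The doubled backslash matters: inside a YAML block scalar the text is passed through literally, so the JSON parser is what turns `\\.` into the regex escape for a literal dot.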
Language Specifics¶
Flameshot invokes different underlying tools depending on the technology stack of the monitored application.
Java Profiling¶
For Java applications, Flameshot includes async-profiler (supporting linux-amd64 / linux-arm64).
Key Configuration Fields (FLAMESHOT_PROCESSES):
- language: Must be set to java.
- events: Supports cpu (CPU cycles), alloc (memory allocation), lock (lock contention), cache-misses, nativemem. Defaults to all.
- jdk_version: (Optional) JDK version used for metadata display.
Notes:
- No reliance on JVM Safepoint; extremely low overhead.
- If using a non-standard JDK image, ensure the Sidecar mounts /tmp or the corresponding Java library path from the main container.
Go Profiling¶
Planned: Integration with the pprof toolchain.
Expected Features:
- Support for Goroutine blocking analysis.
- Support for Heap memory snapshots.
Python Profiling¶
Planned: Integration with non-intrusive tools like py-spy.
Deployment¶
Kubernetes Sidecar Deployment¶
For Flameshot to work correctly, the Pod configuration must meet the following three conditions:
- Shared Process Namespace (shareProcessNamespace: true).
- Shared Storage Volume (EmptyDir).
- Linux Capabilities (SYS_PTRACE).
YAML Example:
apiVersion: v1
kind: Pod
metadata:
  name: java-app-profiled
spec:
  # 1. [Core] Enable PID sharing so Sidecar can see the Java process
  shareProcessNamespace: true
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    # Business Container
    - name: my-app
      image: my-app:latest
      volumeMounts:
        - name: shared-data
          mountPath: /data # Must match Sidecar configuration
    # Flameshot Sidecar
    - name: flameshot
      image: pubrepo.jiagouyun.com/datakit/flameshot:latest
      env:
        - name: FLAMESHOT_PROFILING_PATH
          value: "/data"
        # ... other environment variables ...
      # 2. [Core] Grant ptrace capability
      securityContext:
        capabilities:
          add: ["SYS_PTRACE"]
      # 3. [Core] Mount the same directory
      volumeMounts:
        - name: shared-data
          mountPath: /data
Docker Local Testing¶
If you need to test in a local Docker environment, use the following command to start Flameshot and monitor the target container.
Prerequisites:
- The main container and Flameshot container must share /opt/java/openjdk (or the actual JDK path).
- Use --pid="container:<target_id>" or shared volumes (depending on the specific Docker version).
Test Image: pubrepo.jiagouyun.com/datakit/flameshot:1.85.1-testing_testing-iss-2876
Startup Command Example:
docker run -d \
  --name flameshot-debug \
  --volumes-from <YOUR_JAVA_APP_CONTAINER> \
  -e FLAMESHOT_DATAKIT_ADDR="http://datakit:9529/profiling/v1/input" \
  -e FLAMESHOT_PROCESSES='[{"service":"local-test","command":"java","language":"java","cpu_usage_percent":10}]' \
  pubrepo.jiagouyun.com/datakit/flameshot:1.85.1-testing_testing-iss-2876
API Reference¶
Flameshot provides an HTTP interface allowing users or automated O&M scripts to manually trigger collection tasks.
Manual Triggering¶
Interface Address: GET /v1/profile
Semantic Explanation: This interface is used to generate a Profile dataset on demand, not to retrieve monitoring metrics.
Request Parameters:
| Parameter | Required | Description | Example |
|---|---|---|---|
| pid | One of pid or command | Target Process ID. Takes precedence over command. | 1234 |
| command | One of pid or command | Target process name regex. Used to match the target process. | ^java.*app.jar$ |
| duration | No | Collection duration. Defaults to 30s. | 30s |
| events | No | Collection event types. Defaults to all. | cpu,alloc |
Usage Examples:
- Trigger collection by PID:
- Trigger collection by process name regex:
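Both requests can be sketched with the Python standard library. The example assumes the Sidecar is reachable at `127.0.0.1` on the default port `8089` (`FLAMESHOT_HTTP_LOCAL_PORT`); `build_trigger_url` and `trigger` are hypothetical helper names, the timeout is an arbitrary choice, and because the response format is not described here the body is returned as raw bytes.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address: adjust to FLAMESHOT_HTTP_LOCAL_IP / FLAMESHOT_HTTP_LOCAL_PORT.
BASE_URL = "http://127.0.0.1:8089"


def build_trigger_url(base_url: str, **params: str) -> str:
    """Build a GET /v1/profile URL, percent-encoding regex characters."""
    if "pid" not in params and "command" not in params:
        raise ValueError("either pid or command is required")
    return f"{base_url}/v1/profile?{urlencode(params)}"


def trigger(url: str, timeout: float = 120.0) -> bytes:
    """Send the request and return the raw response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# Trigger collection by PID.
by_pid = build_trigger_url(BASE_URL, pid="1234", duration="30s", events="cpu,alloc")

# Trigger collection by process name regex.
by_command = build_trigger_url(BASE_URL, command="^java.*app.jar$", duration="30s")

print(by_pid)
print(by_command)
```

Calling `trigger(by_pid)` sends the request. Encoding the query string matters for the regex form, since characters such as `^`, `*`, and `$` are not safe to pass unescaped in a URL.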
Troubleshooting¶
- Cannot collect data?
  - Check if shareProcessNamespace: true is enabled in the Pod.
  - Check if the Sidecar has SYS_PTRACE capability.
- File not uploaded?
  - Check if FLAMESHOT_PROFILING_PATH is correctly mounted between the two containers.
  - The system automatically manages file life cycles and will attempt to delete temporary files after collection is complete.
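The SYS_PTRACE check above can be done from inside the Sidecar by reading the effective capability mask in /proc/self/status. This is a sketch based on general Linux behaviour rather than anything Flameshot-specific: CAP_SYS_PTRACE is capability number 19 in the kernel headers, `has_sys_ptrace` is a hypothetical helper, and the two sample masks are made up for illustration.

```python
CAP_SYS_PTRACE = 19  # bit position defined in linux/capability.h


def has_sys_ptrace(status_text: str) -> bool:
    """Return True if the CapEff mask in a /proc/<pid>/status dump has SYS_PTRACE."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)
            return bool(mask & (1 << CAP_SYS_PTRACE))
    raise ValueError("no CapEff line found")


# Sample masks for illustration; on a real system read /proc/self/status instead.
without_ptrace = "Name:\tflameshot\nCapEff:\t00000000a80425fb\n"
with_ptrace = "Name:\tflameshot\nCapEff:\t00000000a80c25fb\n"

print(has_sys_ptrace(without_ptrace))  # False
print(has_sys_ptrace(with_ptrace))     # True
```

Inside the container, the live value can be checked with `has_sys_ptrace(open("/proc/self/status").read())`.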
Changelog¶
0.1.0 (2025-12-17)¶
The first official release of Flameshot, focusing on providing automated profiling capabilities for Java applications in containerized environments.
New Features¶
- Core Architecture:
  - Support for Kubernetes Sidecar Mode deployment, utilizing shared PID namespaces for non-intrusive monitoring.
  - Support for Linux AMD64 and ARM64 multi-architecture execution.
- Language Support:
  - Java: Deep integration with async-profiler, supporting various event collections like CPU, Alloc, Lock, etc.
  - Automatic detection and adaptation to the target container's JDK environment.
- Trigger Mechanism:
  - Threshold Trigger: Support for automatic triggering based on CPU usage (cpu_usage_percent) and memory usage/amount (mem_usage_percent / mem_usage_mb).
  - API Trigger: Provided HTTP interface GET /v1/profile, supporting manual trigger by PID or regex process name matching.
- Data Integration:
  - Support for automatically reporting generated .jfr or flame graph data to DataKit.
  - Support for flexible multi-process monitoring policies and tags (tags) via the FLAMESHOT_PROCESSES environment variable.