Langfuse
Langfuse Overview¶
Langfuse is an open-source observability platform designed for LLM (Large Language Model) applications. Its core capabilities include:
- End-to-end tracing
    - Records the LLM call chain (Prompt → LLM → Output)
    - Supports tracing multi-step, complex workflows
- Metrics monitoring
    - Token usage statistics
    - Request latency monitoring
    - Cost calculation (based on per-model pricing)
- Data annotation and analysis
    - Manual annotation
    - Output quality scoring
    - A/B testing support
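For example, a multi-step workflow can be traced end to end with nested `@observe()` decorators from the Python SDK. The following is a minimal sketch (the function names are illustrative, and it assumes Langfuse credentials are configured as shown later on this page):

```python
from langfuse import observe

@observe()  # recorded as a child span of the enclosing trace
def retrieve_context(question: str) -> str:
    # A hypothetical retrieval step; replace with your own logic.
    return "some retrieved context"

@observe()  # the outermost decorated call becomes the trace root
def answer(question: str) -> str:
    context = retrieve_context(question)
    # A hypothetical LLM call would go here (Prompt -> LLM -> Output).
    return f"answer based on: {context}"

answer("What is Langfuse?")
```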
Langfuse Endpoints¶
| Site Name (Alias) | Domain |
|---|---|
| Hangzhou (CN1) | https://llm-openway.guance.com |
| Ningxia (CN2) | https://aws-llm-openway.guance.com |
| Beijing (CN3) | https://cn3-llm-openway.guance.com |
| Guangzhou (CN4) | https://cn4-llm-openway.guance.com |
| Hong Kong (CN6) | https://cn6-llm-openway.guance.com |
| Oregon (US1) | https://us1-llm-openway.guance.com |
| Frankfurt (EU1) | https://eu1-llm-openway.guance.com |
| Singapore (AP1) | https://ap1-llm-openway.guance.com |
| Jakarta (ID1) | https://id1-llm-openway.guance.com |
| Middle East (ME1) | https://me1-llm-openway.guance.com |
Python Integration¶
Install Dependencies¶
```shell
# Core SDK
pip install langfuse

# Optional: async support (recommended for production)
pip install 'langfuse[async]'

# Development tools (for testing)
pip install pytest langfuse-test
```
Langfuse Integration Notes
- Langfuse supports a large number of LLM integrations. We have verified data ingestion for the following; support for other models has yet to be tested.
    - Dify
    - LangChain
    - Ollama
    - Gemini
    - OpenAI
- In the examples below, `YOUR_LLM_APP_ID` and `YOUR_LLM_APP_TOKEN` correspond to the Langfuse public key and secret key, respectively.
Python SDK Example¶
- Initialize the client
- Verify the connection
```python
from langfuse import Langfuse

# Initialize with constructor arguments
langfuse = Langfuse(
    public_key="YOUR_LLM_APP_ID",
    secret_key="YOUR_LLM_APP_TOKEN",
    host="https://llm-openway.guance.com"
)

# Verify the connection; do not use in production, as this is a synchronous call
if langfuse.auth_check():
    print("Langfuse client is authenticated and ready!")
```
If the connection fails, an error like the following is raised:
```
langfuse.api.resources.commons.errors.unauthorized_error.UnauthorizedError: status_code: 401, body: {}
```
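Alternatively, the credentials can be supplied via environment variables and the client obtained with `get_client()`, which several examples below rely on. A minimal sketch:

```python
import os

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

from langfuse import get_client

# get_client() reads the LANGFUSE_* environment variables set above
langfuse = get_client()
```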
Python Application Examples¶
A few simple examples follow.
Ollama Integration¶
If Ollama is deployed locally, its API calls can be traced through Langfuse:
```python
import os

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# Import after the credentials are set so the Langfuse client picks them up
from langfuse.openai import OpenAI  # drop-in replacement for openai.OpenAI

# Configure the OpenAI client to use http://localhost:11434/v1 as the base URL
client = OpenAI(
    base_url='http://localhost:11434/v1',  # locally deployed Ollama service
    api_key='ollama',  # required, but unused
)

stream = False  # set to True to enable streaming mode

response = client.chat.completions.create(
    # model="llama3.1:latest",
    model="gemma3:4b",  # specify gemma3:4b
    stream=stream,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain how nuclear fusion and nuclear fission work."},
    ]
)

if stream:
    for chunk in response:
        content = chunk.choices[0].delta.content
        if content is not None:
            print(content, end="", flush=True)
else:
    print(response)
```
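In short-lived scripts like the one above, events are sent in the background. Assuming the v3 SDK, a flush before the process exits ensures nothing is dropped:

```python
from langfuse import get_client

# Block until all buffered events have been delivered
get_client().flush()
```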
DeepSeek Integration¶
```python
import os

from langfuse.openai import OpenAI
from langfuse import observe, get_client

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# Your DeepSeek API key (get it from https://platform.deepseek.com/api_keys)
os.environ["DEEPSEEK_API_KEY"] = "YOUR_DEEPSEEK_API_KEY"  # Replace with your DeepSeek API key

client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key=os.getenv('DEEPSEEK_API_KEY'),
)

langfuse = get_client()

@observe()
def my_llm_call(user_input):
    completion = client.chat.completions.create(
        name="story-generator",  # Langfuse-specific: names the generation
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "You are a creative storyteller."},
            {"role": "user", "content": user_input}
        ],
        metadata={"genre": "adventure"},
    )
    return completion.choices[0].message.content

prompt = ("Tell me a short story about a token that got lost on its way "
          "to the language model. Answer in 100 words or less.")

with langfuse.start_as_current_span(name="my-ds-trace") as span:
    # Run your application here
    output = my_llm_call(prompt)

    # Pass additional attributes to the trace
    span.update_trace(
        input=prompt,
        output=output,
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "user@langfuse.com"},
        version="1.0.0"
    )

# Flush events in short-lived applications
langfuse.flush()
```
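Scores can also be attached directly from Python. A sketch of a variant of the `with` block above, assuming the v3 SDK's `span.score_trace()` method (the score name and value are illustrative):

```python
with langfuse.start_as_current_span(name="my-ds-trace") as span:
    output = my_llm_call(prompt)
    # Attach a score to the enclosing trace
    span.score_trace(name="accuracy", value=0.9, comment="comment example")

langfuse.flush()
```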
For more Langfuse integration examples, see here.
JavaScript Integration¶
Below is an example of using the Langfuse JavaScript SDK (v4) to call a locally deployed Ollama service.
Warning
Only v4 of the Langfuse JavaScript SDK is supported, because v4 reports data over the OpenTelemetry protocol, whose data format better conforms to the trace specification.
- Node.js/npm: Node.js (version 18+)
- A locally deployed Ollama service (http://localhost:11434/v1)
- Initialize the project
- Install dependencies
- Set up the .env file (a sketch of these setup steps follows this list)
- Create demo.js
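A minimal sketch of the setup steps, assuming npm. The package list is inferred from the imports in demo.js, and the .env keys from the `process.env` references in the code, so adjust both to your environment:

```shell
# Initialize the project; demo.js uses ES module imports
npm init -y
npm pkg set type=module

# Install dependencies (inferred from the imports in demo.js)
npm install @langfuse/client @langfuse/openai @langfuse/otel \
  @opentelemetry/api @opentelemetry/sdk-trace-node openai dotenv
```

.env:

```
LANGFUSE_PUBLIC_KEY=YOUR_LLM_APP_ID
LANGFUSE_SECRET_KEY=YOUR_LLM_APP_TOKEN
LANGFUSE_BASEURL=https://llm-openway.guance.com
OLLAMA_HOST=localhost:11434
OLLAMA_MODEL=llama3
```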
demo.js
```js
// --- demo.js ---
// An interactive command-line app using Langfuse v4 SDK with OpenTelemetry.

// -----------------------------------------------------------------------------
// PART 1: SETUP (Executed once at the start)
// -----------------------------------------------------------------------------
import 'dotenv/config';
import { observeOpenAI } from "@langfuse/openai";
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { trace, context } from "@opentelemetry/api";
import { LangfuseClient } from "@langfuse/client";
import OpenAI from "openai";
import readline from "node:readline";

// --- Manual OpenTelemetry Provider Setup ---
const langfuseSpanProcessor = new LangfuseSpanProcessor();

const tracerProvider = new NodeTracerProvider({
  // Note: BatchSpanProcessor is used internally by LangfuseSpanProcessor for efficiency
  spanProcessors: [langfuseSpanProcessor],
});

tracerProvider.register();

const lfscore = new LangfuseClient();

console.log("OpenTelemetry provider configured and registered globally.");

const OLLAMA_HOST = process.env.OLLAMA_HOST;

// --- Tracer and OpenAI Client Setup ---
const tracer = trace.getTracer("my-llm-app", "1.0.0");

const openai = new OpenAI({
  baseURL: `http://${OLLAMA_HOST}/v1`,
  apiKey: "ollama",
});

// The tracedOpenAI client will automatically create child spans for any API call
const tracedOpenAI = observeOpenAI(openai);

// set stream mode or not
const streamMode = false;

// --- Readline Interface for CLI ---
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

// -----------------------------------------------------------------------------
// PART 2: APPLICATION LOGIC & INTERACTIVE LOOP
// -----------------------------------------------------------------------------

/**
 * A dedicated shutdown function for the OpenTelemetry provider.
 * Ensures all buffered spans are sent to Langfuse.
 */
async function shutdown() {
  console.log("\nShutting down gracefully...");
  console.log("Flushing remaining traces to Langfuse...");
  await tracerProvider.shutdown();
  console.log("Shutdown complete. Goodbye!");
}

/**
 * The core function that processes a single user prompt.
 * It creates a trace and calls the instrumented OpenAI client.
 * @param {string} userPrompt - The input from the user.
 * @param {string} sessionId - The session ID for this run.
 */
async function processPrompt(userPrompt, sessionId) {
  // 1. Manually create the Root Span for this conversation turn.
  const traceSpan = tracer.startSpan("user-chat-turn");

  // 2. Use context.with to ensure the auto-instrumented LLM call
  //    becomes a child of our manual `traceSpan`.
  await context.with(trace.setSpan(context.active(), traceSpan), async () => {
    try {
      const spanContext = traceSpan.spanContext();
      const traceId = spanContext.traceId;

      console.log("\n--- Trace Details ---");
      console.log(`Trace ID: ${traceId}`);
      console.log(`You can view this trace in Langfuse at: ${process.env.LANGFUSE_BASEURL}/trace/${traceId}`);
      console.log("Raw Span Context:", spanContext);
      console.log("-----------------------");
      console.log("...thinking...");

      // 3. Set Langfuse-specific attributes on the root span.
      traceSpan.setAttributes({
        "langfuse.session.id": sessionId,
        "langfuse.input": userPrompt,
        "langfuse.tags": ["interactive-cli", "ollama"],
        "langfuse.hello": "world",
      });

      // 4. Make the instrumented LLM call. This automatically creates a child span.
      const resp = await tracedOpenAI.chat.completions.create({
        messages: [{ "role": "user", "content": userPrompt }],
        model: process.env.OLLAMA_MODEL || "llama3",
        stream: streamMode,
      });

      let fullResponse = "";
      if (streamMode) {
        process.stdout.write("\nOllama: ");
        for await (const chunk of resp) {
          const content = chunk.choices[0]?.delta?.content || "";
          fullResponse += content;
          process.stdout.write(content); // Write to console without newline
        }
        // Add a newline to the console after the stream is complete
        console.log();
      } else {
        fullResponse = resp.choices[0].message.content;
        console.log(`\nOllama: ${fullResponse}`);
      }

      // 5. Add the final result as the output of the root trace.
      traceSpan.setAttribute("langfuse.output", fullResponse);

      await lfscore.score.create({
        traceId: traceId,
        comment: "comment example",
        sessionId: sessionId,
        name: "accuracy",
        value: 0.9,
        tags: ["tag1", "tag2"],
      });
    } catch (e) {
      console.error("An error occurred:", e.message);
      traceSpan.recordException(e);
      traceSpan.setStatus({ code: 2, message: e.message }); // 2 = ERROR in OTEL
    } finally {
      // 6. End the root span for this turn.
      traceSpan.end();
    }
  });
}

/**
 * The main function to start and manage the interactive loop.
 */
function main() {
  // A unique session ID for this entire chat session
  const sessionId = `cli-session-${Date.now()}`;
  console.log(`Starting chat session: ${sessionId}`);
  console.log('Type your prompt and press Enter. Type "exit" to quit.');

  // This recursive function creates the interactive loop
  const askQuestion = () => {
    rl.question("\n> ", async (prompt) => {
      // Exit condition
      if (prompt.toLowerCase() === "exit") {
        rl.close();
        await shutdown();
        return;
      }

      // Process the user's prompt
      await processPrompt(prompt, sessionId);

      // Ask the next question
      askQuestion();
    });
  };

  // Start the conversation
  askQuestion();
}

// --- Graceful exit on Ctrl+C ---
process.on('SIGINT', async () => {
  rl.close();
  await shutdown();
  process.exit(0);
});

// --- Start the application ---
main();
```
- Run the example:
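Assuming the file layout above, the script can be started with:

```shell
node demo.js
```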
Once the interactive prompt appears, you can start a conversation. Each completed turn emits the corresponding LLM observability data. Change the streamMode variable in the code to switch between streaming and non-streaming responses.
Data Fields¶
Data reported by Langfuse falls into the following categories:
- llm_trace: LLM observability trace data
- score: score data attached to LLM observability traces
llm_trace¶
| Tag & Field | Description | Type | Unit |
|---|---|---|---|
| app_id | The LLM application ID. | string | - |
| app_name | The LLM application name. | string | - |
| completion_tokens | The completion tokens of the current generation. | int (gauge) | count |
| duration | The duration of the current span. | int (gauge) | μs |
| error_message | The error message of the current trace (if an exception was thrown). | string | - |
| error_stack | The call stack of the current trace (if an exception was thrown). | string | - |
| error_type | The error type of the current trace (if an exception was thrown). | string | - |
| input | The input prompt of the current generation. | string | - |
| input_cache_read_tokens | The cached tokens of the current generation. | int (gauge) | count |
| input_tokens | The input tokens of the current generation. | int (gauge) | count |
| llm_provider | The provider of the current generation. | string | - |
| message | The JSON dump of the span. | string | - |
| model_name | The model name of the current generation. | string | - |
| model_parameters | The model parameters of the current generation. | string | - |
| observation_type | The observation type of the current generation. | string | - |
| operation | Same as the span name of the current span. | string | - |
| output | The output of the current generation. | string | - |
| output_tokens | The output tokens of the current generation. | int (gauge) | count |
| prompt_tokens | The prompt tokens of the current generation. | int (gauge) | count |
| reasoning_tokens | The reasoning tokens of the current generation. | int (gauge) | count |
| resource | The span name of the current span. | string | - |
| scope_name | Langfuse SDK name. | string | - |
| scope_version | Langfuse SDK version. | string | - |
| sdk_language | SDK language, such as nodejs/python. | string | - |
| sdk_name | SDK name, such as opentelemetry. | string | - |
| sdk_version | OpenTelemetry SDK version. | string | - |
| service_name | Service name. | string | - |
| session_id | The session ID of the current generation. | string | - |
| span_id | The span ID of the current span. | string | - |
| start | The start time of the current span. | int | timestamp, μs |
| status | The status of the span. | string | - |
| stream_mode | Whether the generation ran in streaming mode. | int (gauge) | - |
| total_tokens | The total tokens of the current generation. | int (gauge) | count |
| trace_id | The trace ID of the current span. | string | - |
| ttft | In streaming mode, the waiting time until the first token. | int (gauge) | μs |
| user_id | The user ID of the current generation. | string | - |
| vendor | The LLM tracing vendor; for Langfuse, it is always langfuse. | string | - |
Besides the fields listed above, custom span attributes whose names start with `langfuse.` are also promoted to top-level fields (with `.` replaced by `_`). For example, we can attach custom attributes to a span like this:
```js
traceSpan.setAttributes({
  "langfuse.session.id": <your-session-id>, // => session_id: <your-session-id>
  "langfuse.input": <user-prompt>,          // => input: <user-prompt>
  "langfuse.tags": ["tag1", "tag2"],        // => tags: values:{string_value:"tag1"} values:{string_value:"tag2"}
});
```
score¶
| Tag & Field | Description | Type | Unit |
|---|---|---|---|
| app_id | The LLM application ID. | string | - |
| app_name | The LLM application name. | string | - |
| comment | The score comment. | string | - |
| name | The score name. | string | - |
| score_data_type | The data type of the score; available types are NUMERIC/BOOLEAN/CATEGORICAL. | string | - |
| score_value | The value of NUMERIC and BOOLEAN scores. | float | - |
| score_value_str | The value of CATEGORICAL scores. | string | - |
| score_observation_id | The observation ID of the score. | string | - |
| trace_id | The trace ID of the score. | string | - |
| config_id | The config ID of the score. | string | - |
| session_id | The session ID of the score. | string | - |
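For reference, a sketch of how these fields are populated from the Python SDK, assuming the v3 client's `create_score()` method (the score names, values, and trace ID are illustrative):

```python
from langfuse import get_client

langfuse = get_client()

# NUMERIC and BOOLEAN scores populate score_value
langfuse.create_score(
    name="accuracy",
    value=0.9,
    trace_id="YOUR_TRACE_ID",   # => trace_id
    session_id="session_abc",   # => session_id
    comment="comment example",  # => comment
    data_type="NUMERIC",        # => score_data_type
)

# CATEGORICAL scores populate score_value_str instead
langfuse.create_score(
    name="quality",
    value="good",
    trace_id="YOUR_TRACE_ID",
    data_type="CATEGORICAL",
)

langfuse.flush()
```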