Langfuse
Introduction to Langfuse
Langfuse is an open-source observability platform designed specifically for LLM (Large Language Model) applications, offering the following core features:
- Full-Link Tracing
    - Records LLM call chains (Prompt → LLM → Output)
    - Supports tracking of multi-step complex workflows
- Metric Monitoring
    - Token usage statistics
    - Request latency monitoring
    - Cost calculation (based on model pricing)
- Data Annotation and Analysis
    - Manual annotation
    - Output quality scoring
    - A/B testing support
Langfuse Access Sites
| Site Name (Alias) | Domain Name |
|---|---|
| Hangzhou (CN1) | https://llm-openway.guance.com |
| Ningxia (CN2) | https://aws-llm-openway.guance.com |
| Beijing (CN3) | https://cn3-llm-openway.guance.com |
| Guangzhou (CN4) | https://cn4-llm-openway.guance.com |
| Hong Kong (CN6) | https://cn6-llm-openway.guance.com |
| Oregon (US1) | https://us1-llm-openway.guance.com |
| Frankfurt (EU1) | https://eu1-llm-openway.guance.com |
| Singapore (AP1) | https://ap1-llm-openway.guance.com |
| Jakarta (ID1) | https://id1-llm-openway.guance.com |
| Middle East (ME1) | https://me1-llm-openway.guance.com |
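The `host` you configure in the SDK (or the `LANGFUSE_HOST` environment variable used in the examples below) must be the endpoint of the site your workspace belongs to. For example, to point at the Oregon (US1) site:

```python
import os

# Select the endpoint from the table above that matches your workspace's site
os.environ["LANGFUSE_HOST"] = "https://us1-llm-openway.guance.com"
```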
Python Integration

Install Dependencies
# Core SDK
pip install langfuse

# Optional: async support (recommended for production environments)
pip install "langfuse[async]"

# Development tools (for testing)
pip install pytest langfuse-test
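To confirm the installation, you can query the installed version with the standard library (a quick check; the printed version depends on your environment):

```python
from importlib.metadata import version

# Print the installed Langfuse SDK version
print(version("langfuse"))
```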
Langfuse Integration Instructions

- Langfuse supports a wide range of LLM integrations. We have currently verified the following data ingestion methods; support for more models is pending further testing.
    - Dify
    - LangChain
    - Ollama
    - Gemini
    - OpenAI
- In the following text, `YOUR_LLM_APP_ID` and `YOUR_LLM_APP_TOKEN` correspond to Langfuse's public key and secret key respectively.
Python SDK Integration Example
Initialize the client, then verify the integration:
from langfuse import Langfuse

# Initialize with constructor arguments
langfuse = Langfuse(
    public_key="YOUR_LLM_APP_ID",
    secret_key="YOUR_LLM_APP_TOKEN",
    host="https://llm-openway.guance.com"
)

# Verify connection; do not use in production, as this is a synchronous call
if langfuse.auth_check():
    print("Langfuse client is authenticated and ready!")
If integration fails, an error similar to the following will occur:
langfuse.api.resources.commons.errors.unauthorized_error.UnauthorizedError: status_code: 401, body: {}
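Alternatively, the credentials and host can be supplied through the standard `LANGFUSE_*` environment variables (as the application examples below do), in which case the constructor needs no arguments:

```python
import os

from langfuse import Langfuse

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# With the environment variables set, no constructor arguments are needed
langfuse = Langfuse()
```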
Python Application Examples
The following are several simple examples.
Ollama Integration
If you have Ollama deployed locally, you can use Langfuse to track Ollama API calls:
import os

from langfuse.openai import OpenAI  # drop-in replacement for the OpenAI client

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# Configure the OpenAI client to use http://localhost:11434/v1 as base URL
client = OpenAI(
    base_url='http://localhost:11434/v1',  # locally deployed Ollama service
    api_key='ollama',  # required by the client, but unused by Ollama
)

stream = False  # set to True to use streaming responses

response = client.chat.completions.create(
    # model="llama3.1:latest",
    model="gemma3:4b",  # specify gemma3:4b
    stream=stream,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the working principles of nuclear fusion and nuclear fission."},
    ]
)

if stream:
    for chk in response:
        content = chk.choices[0].delta.content
        if content is not None:
            print(content, end="", flush=True)
else:
    print(response)
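The wrapped OpenAI client sends trace data in background batches, so a short-lived script should flush before exiting to avoid losing events (the DeepSeek example below does the same):

```python
from langfuse import get_client

# Ensure all buffered trace events are delivered before the process exits
get_client().flush()
```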
DeepSeek Integration

import os

from langfuse import observe, get_client
from langfuse.openai import OpenAI

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# Your DeepSeek API key (get it from https://platform.deepseek.com/api_keys)
os.environ["DEEPSEEK_API_KEY"] = "YOUR_DEEPSEEK_API_KEY"  # Replace with your DeepSeek API key

client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key=os.getenv("DEEPSEEK_API_KEY"),
)

langfuse = get_client()

@observe()
def my_llm_call(prompt):
    completion = client.chat.completions.create(
        name="story-generator",
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "You are a creative storyteller."},
            {"role": "user", "content": prompt},
        ],
        metadata={"genre": "adventure"},
    )
    return completion.choices[0].message.content

with langfuse.start_as_current_span(name="my-ds-trace") as span:
    # Run your application here
    query = "Tell me a short story about a token that got lost on its way to the language model. Answer in 100 words or less."
    output = my_llm_call(query)

    # Pass additional attributes to the span
    span.update_trace(
        input=query,
        output=output,
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "user@langfuse.com"},
        version="1.0.0",
    )

# Flush events in short-lived applications
langfuse.flush()
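Scores can also be attached to this trace from Python, mirroring the JavaScript `score.create` call shown later. A minimal sketch, assuming the v3 SDK's `create_score()` and the `span` object from the block above:

```python
# Hypothetical follow-up to the example above: score the generated story.
# The field names match the `score` data fields documented below.
langfuse.create_score(
    name="accuracy",
    value=0.9,
    trace_id=span.trace_id,  # ID of the trace created in the `with` block above
    data_type="NUMERIC",
    comment="comment example",
)
langfuse.flush()
```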
For more Langfuse integration examples, refer to the official Langfuse documentation.
JavaScript Integration
The following is an example of using the Langfuse JavaScript SDK (v4) to call local Ollama.
Warning

Only v4 of the Langfuse JavaScript SDK is supported, as it reports data over the OpenTelemetry protocol, whose data format is more compliant with trace data specifications.
- Node.js/npm: Node.js (version 18+)
- A locally deployed Ollama service (http://localhost:11434/v1)
- Install dependencies:
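The package list below is inferred from the imports in demo.js; install whatever versions npm resolves:

```shell
npm install dotenv openai @langfuse/openai @langfuse/otel @langfuse/client @opentelemetry/sdk-trace-node @opentelemetry/api
```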
- Set up the `.env` file:
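A sketch of the `.env` file. The variable names come from the references in demo.js; `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` are the standard credentials read by the SDK, and `OLLAMA_HOST` is a host:port value because the code builds the URL as `http://${OLLAMA_HOST}/v1`:

```text
LANGFUSE_PUBLIC_KEY=YOUR_LLM_APP_ID
LANGFUSE_SECRET_KEY=YOUR_LLM_APP_TOKEN
LANGFUSE_BASEURL=https://llm-openway.guance.com
OLLAMA_HOST=localhost:11434
OLLAMA_MODEL=llama3
```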
- Create `demo.js`:
// --- demo.js ---
// An interactive command-line app using Langfuse v4 SDK with OpenTelemetry.

// -----------------------------------------------------------------------------
// PART 1: SETUP (Executed once at the start)
// -----------------------------------------------------------------------------
import 'dotenv/config';
import { observeOpenAI } from "@langfuse/openai";
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { trace, context } from "@opentelemetry/api";
import { LangfuseClient } from "@langfuse/client";
import OpenAI from "openai";
import readline from "node:readline";

// --- Manual OpenTelemetry Provider Setup ---
const langfuseSpanProcessor = new LangfuseSpanProcessor();

const tracerProvider = new NodeTracerProvider({
  // Note: BatchSpanProcessor is used internally by LangfuseSpanProcessor for efficiency
  spanProcessors: [langfuseSpanProcessor],
});
tracerProvider.register();

const lfscore = new LangfuseClient();

console.log("OpenTelemetry provider configured and registered globally.");

const OLLAMA_HOST = process.env.OLLAMA_HOST;

// --- Tracer and OpenAI Client Setup ---
const tracer = trace.getTracer("my-llm-app", "1.0.0");

const openai = new OpenAI({
  baseURL: `http://${OLLAMA_HOST}/v1`,
  apiKey: "ollama",
});

// The tracedOpenAI client will automatically create child spans for any API call
const tracedOpenAI = observeOpenAI(openai);

// set stream mode or not
const streamMode = false;

// --- Readline Interface for CLI ---
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

// -----------------------------------------------------------------------------
// PART 2: APPLICATION LOGIC & INTERACTIVE LOOP
// -----------------------------------------------------------------------------

/**
 * A dedicated shutdown function for the OpenTelemetry provider.
 * Ensures all buffered spans are sent to Langfuse.
 */
async function shutdown() {
  console.log("\nShutting down gracefully...");
  console.log("Flushing remaining traces to Langfuse...");
  await tracerProvider.shutdown();
  console.log("Shutdown complete. Goodbye!");
}

/**
 * The core function that processes a single user prompt.
 * It creates a trace and calls the instrumented OpenAI client.
 * @param {string} userPrompt - The input from the user.
 * @param {string} sessionId - The session ID for this run.
 */
async function processPrompt(userPrompt, sessionId) {
  // 1. Manually create the Root Span for this conversation turn.
  const traceSpan = tracer.startSpan("user-chat-turn");

  // 2. Use context.with to ensure the auto-instrumented LLM call
  //    becomes a child of our manual `traceSpan`.
  await context.with(trace.setSpan(context.active(), traceSpan), async () => {
    try {
      const spanContext = traceSpan.spanContext();
      const traceId = spanContext.traceId;

      console.log("\n--- Trace Details ---");
      console.log(`Trace ID: ${traceId}`);
      console.log(`You can view this trace in Langfuse at: ${process.env.LANGFUSE_BASEURL}/trace/${traceId}`);
      console.log("Raw Span Context:", spanContext);
      console.log("-----------------------");
      console.log("...thinking...");

      // 3. Set Langfuse-specific attributes on the root span.
      traceSpan.setAttributes({
        "langfuse.session.id": sessionId,
        "langfuse.input": userPrompt,
        "langfuse.tags": ["interactive-cli", "ollama"],
        "langfuse.hello": "world",
      });

      // 4. Make the instrumented LLM call. This automatically creates a child span.
      const resp = await tracedOpenAI.chat.completions.create({
        messages: [{ "role": "user", "content": userPrompt }],
        model: process.env.OLLAMA_MODEL || "llama3",
        stream: streamMode,
      });

      let fullResponse = "";
      if (streamMode) {
        process.stdout.write("\nOllama: ");
        for await (const chunk of resp) {
          const content = chunk.choices[0]?.delta?.content || "";
          fullResponse += content;
          process.stdout.write(content); // Write to console without newline
        }
        // Add a newline to the console after the stream is complete
        console.log();
      } else {
        fullResponse = resp.choices[0].message.content;
        console.log(`\nOllama: ${fullResponse}`);
      }

      // 5. Add the final result as the output of the root trace.
      traceSpan.setAttribute("langfuse.output", fullResponse);

      await lfscore.score.create({
        traceId: traceId,
        comment: "comment example",
        sessionId: sessionId,
        name: "accuracy",
        value: 0.9,
        tags: ["tag1", "tag2"],
      });
    } catch (e) {
      console.error("An error occurred:", e.message);
      traceSpan.recordException(e);
      traceSpan.setStatus({ code: 2, message: e.message }); // 2 = ERROR in OTEL
    } finally {
      // 6. End the root span for this turn.
      traceSpan.end();
    }
  });
}

/**
 * The main function to start and manage the interactive loop.
 */
function main() {
  // A unique session ID for this entire chat session
  const sessionId = `cli-session-${Date.now()}`;
  console.log(`Starting chat session: ${sessionId}`);
  console.log('Type your prompt and press Enter. Type "exit" to quit.');

  // This recursive function creates the interactive loop
  const askQuestion = () => {
    rl.question("\n> ", async (prompt) => {
      // Exit condition
      if (prompt.toLowerCase() === "exit") {
        rl.close();
        await shutdown();
        return;
      }

      // Process the user's prompt
      await processPrompt(prompt, sessionId);

      // Ask the next question
      askQuestion();
    });
  };

  // Start the conversation
  askQuestion();
}

// --- Graceful exit on Ctrl+C ---
process.on('SIGINT', async () => {
  rl.close();
  await shutdown();
  process.exit(0);
});

// --- Start the application ---
main();
- Run the example:
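Assuming the file above was saved as demo.js:

```shell
node demo.js
```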
After entering the interactive interface, you can start conversations; each conversation round generates the corresponding LLM observability data. Modify `streamMode` in the code to switch the response mode.
Data Fields
Data reported by Langfuse is divided into the following categories:
- `llm_trace`: LLM observability trace data
- `score`: score data attached to LLM observability trace data
llm_trace
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| app_id | The LLM application ID. | string | - |
| app_name | The LLM application name. | string | - |
| completion_tokens | Completion tokens of the current generation. | int (gauge) | count |
| duration | Duration of the current span. | int (gauge) | μs |
| error_message | Error message of the current trace (if an exception was thrown). | string | - |
| error_stack | Call stack of the current trace (if an exception was thrown). | string | - |
| error_type | Error type of the current trace (if an exception was thrown). | string | - |
| input | Input prompt of the current generation. | string | - |
| input_cache_read_tokens | Cached tokens of the current generation. | int (gauge) | count |
| input_tokens | Input tokens of the current generation. | int (gauge) | count |
| llm_provider | Provider of the current generation. | string | - |
| message | JSON dump of the span. | string | - |
| model_name | Model name of the current generation. | string | - |
| model_parameters | Model parameters of the current generation. | string | - |
| observation_type | Observation type of the current span. | string | - |
| operation | Same as the span name of the current span. | string | - |
| output | Output of the current generation. | string | - |
| output_tokens | Output tokens of the current generation. | int (gauge) | count |
| prompt_tokens | Prompt tokens of the current generation. | int (gauge) | count |
| reasoning_tokens | Reasoning tokens of the current generation. | int (gauge) | count |
| resource | Span name of the current span. | string | - |
| scope_name | Langfuse SDK name. | string | - |
| scope_version | Langfuse SDK version. | string | - |
| sdk_language | SDK language, such as nodejs/python. | string | - |
| sdk_name | SDK name, such as opentelemetry. | string | - |
| sdk_version | OpenTelemetry SDK version. | string | - |
| service_name | Service name. | string | - |
| session_id | Session ID of the current generation. | string | - |
| span_id | Span ID of the current span. | string | - |
| start | Start time of the current span. | int | timestamp (μs) |
| status | Status of the span. | string | - |
| stream_mode | Whether the generation ran in stream mode. | int (gauge) | - |
| total_tokens | Total tokens of the current generation. | int (gauge) | count |
| trace_id | Trace ID of the current span. | string | - |
| ttft | For stream mode, the wait time until the first token. | int (gauge) | μs |
| user_id | User ID of the current generation. | string | - |
| vendor | The LLM tracing vendor; for Langfuse, it is always `langfuse`. | string | - |
In addition to the fields listed above, custom fields prefixed with `langfuse.` in the span are also extracted into top-level fields (`.` is replaced with `_`). For example, we can add custom fields to a span like this:
traceSpan.setAttributes({
    "langfuse.session.id": <your-session-id>, // => session_id: <your-session-id>
    "langfuse.input": <user-prompt>,          // => input: <user-prompt>
    "langfuse.tags": ["tag1", "tag2"],        // => tags: values:{string_value:"tag1"} values:{string_value:"tag2"}
});
score
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| app_id | The LLM application ID. | string | - |
| app_name | The LLM application name. | string | - |
| comment | The score comment. | string | - |
| name | The score name. | string | - |
| score_data_type | Data type of the score; available types are NUMERIC/BOOLEAN/CATEGORICAL. | string | - |
| score_value | Value of a NUMERIC or BOOLEAN score. | float | - |
| score_value_str | Value of a CATEGORICAL score. | string | - |
| score_observation_id | Observation ID of the score. | string | - |
| trace_id | Trace ID of the score. | string | - |
| config_id | Config ID of the score. | string | - |
| session_id | Session ID of the score. | string | - |