
Langfuse

Introduction to Langfuse

Langfuse is an open-source observability platform designed specifically for LLM (Large Language Model) applications, offering the following core features:

  1. Full-Chain Tracing (see the sketch after this list)

    • Records the complete LLM call chain (Prompt → LLM → Output)
    • Supports tracing of multi-step, complex workflows
  2. Metric Monitoring

    • Token usage statistics
    • Request latency monitoring
    • Cost calculation (based on model pricing)
  3. Data Annotation and Analysis

    • Manual annotation functionality
    • Output quality scoring
    • A/B testing support
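
As a quick illustration of full-chain tracing, the minimal sketch below nests two @observe-decorated functions so they appear as parent and child spans of a single trace (the function names and the placeholder LLM call are hypothetical; @observe and get_client are part of the Python SDK v3):

from langfuse import get_client, observe

@observe()  # recorded as a child span of its caller
def build_prompt(topic: str) -> str:
    return f"Write one sentence about {topic}."

@observe()  # recorded as the root span of the trace
def run(topic: str) -> str:
    prompt = build_prompt(topic)
    # ... call your LLM here and return its output ...
    return f"(LLM output for: {prompt})"

run("observability")
get_client().flush()  # send buffered events before the process exits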

Langfuse Access Sites

| Site Name (Alias) | Domain |
| --- | --- |
| Hangzhou (CN1) | https://llm-openway.guance.com |
| Ningxia (CN2) | https://aws-llm-openway.guance.com |
| Beijing (CN3) | https://cn3-llm-openway.guance.com |
| Guangzhou (CN4) | https://cn4-llm-openway.guance.com |
| Hong Kong (CN6) | https://cn6-llm-openway.guance.com |
| Oregon (US1) | https://us1-llm-openway.guance.com |
| Frankfurt (EU1) | https://eu1-llm-openway.guance.com |
| Singapore (AP1) | https://ap1-llm-openway.guance.com |
| Jakarta (ID1) | https://id1-llm-openway.guance.com |
| Middle East (ME1) | https://me1-llm-openway.guance.com |

Python Integration

Install Dependencies

# Core SDK
pip install langfuse

# Optional: async support (recommended for production environments)
pip install "langfuse[async]"

# Development tools (for testing)
pip install pytest langfuse-test
Langfuse Integration Instructions
  • Langfuse supports a wide range of LLM integrations. We have currently verified the following data ingestion methods; support for additional models is pending further testing.

    • Dify
    • LangChain
    • Ollama
    • Gemini
    • OpenAI
  • In the following text, YOUR_LLM_APP_ID and YOUR_LLM_APP_TOKEN correspond to Langfuse's public key and secret key, respectively.

Python SDK Integration Example

  • Initialize the Client

The client can be configured through environment variables:

LANGFUSE_PUBLIC_KEY="YOUR_LLM_APP_ID"
LANGFUSE_SECRET_KEY="YOUR_LLM_APP_TOKEN"
LANGFUSE_HOST="https://llm-openway.guance.com"

or through constructor arguments:

from langfuse import Langfuse

langfuse = Langfuse(
    public_key="YOUR_LLM_APP_ID",
    secret_key="YOUR_LLM_APP_TOKEN",
    host="https://llm-openway.guance.com"
)
  • Integration Verification
from langfuse import Langfuse

# Initialize with constructor arguments
langfuse = Langfuse(
    public_key="YOUR_LLM_APP_ID",
    secret_key="YOUR_LLM_APP_TOKEN",
    host="https://llm-openway.guance.com"
)

# Verify connection, do not use in production as this is a synchronous call
if langfuse.auth_check():
    print("Langfuse client is authenticated and ready!")

If the integration fails, an error similar to the following is raised:

langfuse.api.resources.commons.errors.unauthorized_error.UnauthorizedError: status_code: 401, body: {}
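
To fail fast on bad credentials at startup, the check can be wrapped defensively (a minimal sketch; it treats any failure, including the UnauthorizedError above, as fatal):

try:
    if not langfuse.auth_check():
        raise RuntimeError("Langfuse credentials were rejected")
except Exception as exc:  # covers UnauthorizedError and network errors
    print(f"Langfuse integration failed: {exc}")
    raise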

Python Application Examples

The following are several simple examples.

Ollama Integration

If you have Ollama deployed locally, you can use Langfuse to track Ollama API calls:

import os

from langfuse.openai import OpenAI  # drop-in replacement for openai.OpenAI

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID"
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN"
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# Configure the OpenAI client to use http://localhost:11434/v1 as base url
client = OpenAI(
    base_url='http://localhost:11434/v1',  # locally deployed Ollama service
    api_key='ollama',  # required by the client, but unused by Ollama
)

stream = False  # set to True to stream the response

response = client.chat.completions.create(
    # model="llama3.1:latest",
    model="gemma3:4b",
    stream=stream,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the working principles of nuclear fusion and nuclear fission."},
    ],
)

if stream:
    for chk in response:
        content = chk.choices[0].delta.content
        if content is not None:
            print(content, end="", flush=True)
else:
    print(response)
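
Because a short-lived script like this exits immediately, the buffered trace may never be delivered; flushing before exit avoids that (a small addition mirroring the flush call in the DeepSeek example below):

from langfuse import get_client

get_client().flush()  # block until all buffered events are delivered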

DeepSeek Integration

import os

from langfuse import get_client, observe
from langfuse.openai import OpenAI

os.environ["LANGFUSE_PUBLIC_KEY"] = "YOUR_LLM_APP_ID" 
os.environ["LANGFUSE_SECRET_KEY"] = "YOUR_LLM_APP_TOKEN" 
os.environ["LANGFUSE_HOST"] = "https://llm-openway.guance.com"

# Your DeepSeek API key (get it from https://platform.deepseek.com/api_keys)
os.environ["DEEPSEEK_API_KEY"] = "YOUR_DEEPSEEK_API_KEY"  # Replace with your DeepSeek API key

client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key=os.getenv('DEEPSEEK_API_KEY'),
)

langfuse = get_client()

@observe()
def my_llm_call(input):
    completion = client.chat.completions.create(
        name="story-generator",  # Langfuse-specific argument: generation name
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "You are a creative storyteller."},
            {"role": "user", "content": input}
        ],
        metadata={"genre": "adventure"},  # Langfuse-specific argument: generation metadata
    )
    return completion.choices[0].message.content

with langfuse.start_as_current_span(name="my-ds-trace") as span:
    # Run your application here
    question = "Tell me a short story about a token that got lost on its way to the language model. Answer in 100 words or less."
    output = my_llm_call(question)

    # Pass additional attributes to the trace
    span.update_trace(
        input=question,
        output=output,
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-trace"],
        metadata={"email": "user@langfuse.com"},
        version="1.0.0",
    )

# Flush events in short-lived applications
langfuse.flush()
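
Since LangChain is one of the tested integrations listed earlier, here is a minimal hookup sketch. It assumes the Python SDK's langfuse.langchain.CallbackHandler, the langchain-openai package, and an OPENAI_API_KEY in the environment; the prompt and model are placeholders:

from langfuse.langchain import CallbackHandler
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# The handler reads the LANGFUSE_* environment variables set earlier.
handler = CallbackHandler()

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o-mini")  # any LangChain chat model works here
chain = prompt | llm

# Passing the handler via `callbacks` traces every step of the chain.
result = chain.invoke(
    {"text": "Langfuse records the full Prompt -> LLM -> Output chain."},
    config={"callbacks": [handler]},
)
print(result.content)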

For more Langfuse integration examples, refer to the official Langfuse documentation.

JavaScript Integration

The following is an example of using the Langfuse JavaScript SDK (v4) to call local Ollama.

Warning

Only v4 of the Langfuse JavaScript SDK is supported, because it reports data via the OpenTelemetry protocol, whose format conforms more closely to the trace data specification.

  • Node.js 18+ with npm
  • A locally deployed Ollama service (http://localhost:11434/v1)
  • Install dependencies

    npm install dotenv openai @langfuse/openai @langfuse/otel @opentelemetry/sdk-trace-node @opentelemetry/api @langfuse/client
    
  • Set up the .env file

    # Replace with the keys from your Langfuse project's settings
    LANGFUSE_SECRET_KEY=llm_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    LANGFUSE_PUBLIC_KEY=<YOUR-LLM-APP-ID>
    
    LANGFUSE_BASEURL="https://llm-openway.guance.com"
    
    OLLAMA_HOST=localhost:11434
    OLLAMA_MODEL=<your-model-name> # such as qwen3:1.7b
    
  • Create demo.js

    // --- demo.js ---
    // An interactive command-line app using Langfuse v4 SDK with OpenTelemetry.
    
    // -----------------------------------------------------------------------------
    // PART 1: SETUP (Executed once at the start)
    // -----------------------------------------------------------------------------
    
    import 'dotenv/config';
    import { observeOpenAI } from "@langfuse/openai";
    import { LangfuseSpanProcessor } from "@langfuse/otel";
    import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
    import { trace, context } from "@opentelemetry/api";
    import { LangfuseClient } from "@langfuse/client";
    import OpenAI from "openai";
    import readline from "node:readline";
    
    // --- Manual OpenTelemetry Provider Setup ---
    const langfuseSpanProcessor = new LangfuseSpanProcessor();
    const tracerProvider = new NodeTracerProvider({
        // Note: BatchSpanProcessor is used internally by LangfuseSpanProcessor for efficiency
        spanProcessors: [langfuseSpanProcessor],
    });
    tracerProvider.register();
    
    // Langfuse API client, used later to attach a score to the trace
    const lfscore = new LangfuseClient();
    
    console.log("OpenTelemetry provider configured and registered globally.");
    
    const OLLAMA_HOST = process.env.OLLAMA_HOST;
    
    // --- Tracer and OpenAI Client Setup ---
    const tracer = trace.getTracer("my-llm-app", "1.0.0");
    const openai = new OpenAI({
        baseURL: `http://${OLLAMA_HOST}/v1`,
        apiKey: "ollama",
    });
    // The tracedOpenAI client will automatically create child spans for any API call
    const tracedOpenAI = observeOpenAI(openai);
    
    // Toggle streaming of responses
    const streamMode = false;
    
    // --- Readline Interface for CLI ---
    const rl = readline.createInterface({
        input: process.stdin,
        output: process.stdout,
    });
    
    // -----------------------------------------------------------------------------
    // PART 2: APPLICATION LOGIC & INTERACTIVE LOOP
    // -----------------------------------------------------------------------------
    
    /**
     * A dedicated shutdown function for the OpenTelemetry provider.
     * Ensures all buffered spans are sent to Langfuse.
     */
    async function shutdown() {
        console.log("\nShutting down gracefully...");
        console.log("Flushing remaining traces to Langfuse...");
        await tracerProvider.shutdown();
        console.log("Shutdown complete. Goodbye!");
    }
    
    /**
     * The core function that processes a single user prompt.
     * It creates a trace and calls the instrumented OpenAI client.
     * @param {string} userPrompt - The input from the user.
     * @param {string} sessionId - The session ID for this run.
     */
    async function processPrompt(userPrompt, sessionId) {
        // 1. Manually create the Root Span for this conversation turn.
        const traceSpan = tracer.startSpan("user-chat-turn");
    
        // 2. Use context.with to ensure the auto-instrumented LLM call
        //    becomes a child of our manual `traceSpan`.
        await context.with(trace.setSpan(context.active(), traceSpan), async () => {
            try {
                const spanContext = traceSpan.spanContext();
                const traceId = spanContext.traceId;
    
                console.log("\n--- Trace Details ---");
                console.log(`Trace ID: ${traceId}`);
                console.log(`You can view this trace in Langfuse at: ${process.env.LANGFUSE_BASEURL}/trace/${traceId}`);
                console.log("Raw Span Context:", spanContext);
                console.log("-----------------------");
    
                console.log("...thinking...");
    
                // 3. Set Langfuse-specific attributes on the root span.
                traceSpan.setAttributes({
                    "langfuse.session.id": sessionId,
                    "langfuse.input": userPrompt,
                    "langfuse.tags": ["interactive-cli", "ollama"],
                    "langfuse.hello": "world",
                });
    
                // 4. Make the instrumented LLM call. This automatically creates a child span.
                const resp = await tracedOpenAI.chat.completions.create({
                    messages: [{ "role": "user", "content": userPrompt }],
                    model: process.env.OLLAMA_MODEL || "llama3",
                    stream: streamMode,
                });
    
                let fullResponse = "";
                if (streamMode) {
                    process.stdout.write("\nOllama: ");
                    for await (const chunk of resp) {
                        const content = chunk.choices[0]?.delta?.content || "";
                        fullResponse += content;
                        process.stdout.write(content); // Write to console without newline
                    }
    
                    // Add a newline to the console after the stream is complete
                    console.log();
                } else {
                    fullResponse = resp.choices[0].message.content;
                    console.log(`\nOllama: ${fullResponse}`);
                }
    
                // 5. Add the final result as the output of the root trace.
                traceSpan.setAttribute("langfuse.output", fullResponse);
    
                await lfscore.score.create({
                    traceId: traceId,
                    comment: "comment example",
                    sessionId: sessionId,
                    name: "accuracy",
                    value: 0.9,
                    tags: ["tag1", "tag2"],
                })
            } catch(e) {
                console.error("An error occurred:", e.message);
                traceSpan.recordException(e);
                traceSpan.setStatus({ code: 2, message: e.message }); // 2 = ERROR in OTEL
            } finally {
                // 6. End the root span for this turn.
                traceSpan.end();
            }
        });
    }
    
    /**
     * The main function to start and manage the interactive loop.
     */
    function main() {
        // A unique session ID for this entire chat session
        const sessionId = `cli-session-${Date.now()}`;
        console.log(`Starting chat session: ${sessionId}`);
        console.log('Type your prompt and press Enter. Type "exit" to quit.');
    
        // This recursive function creates the interactive loop
        const askQuestion = () => {
            rl.question("\n> ", async (prompt) => {
                // Exit condition
                if (prompt.toLowerCase() === "exit") {
                    rl.close();
                    await shutdown();
                    return;
                }
    
                // Process the user's prompt
                await processPrompt(prompt, sessionId);
    
                // Ask the next question
                askQuestion();
            });
        };
    
        // Start the conversation
        askQuestion();
    }
    
    // --- Graceful exit on Ctrl+C ---
    process.on('SIGINT', async () => {
        rl.close();
        await shutdown();
        process.exit(0);
    });
    
    // --- Start the application ---
    main();
    
  • Run the example:

    node demo.js
    

    After entering the interactive interface, you can start conversations. After each conversation round, the corresponding LLM observability data is reported. Modify streamMode in the code to switch the response mode.

Data Fields

Data reported by Langfuse is divided into the following categories:

  • llm_trace: LLM observability trace data
  • score: Score data attached to LLM observability trace data

llm_trace

| Field | Description | Type | Unit |
| --- | --- | --- | --- |
| app_id | The LLM application ID. | string | - |
| app_name | The LLM application name. | string | - |
| completion_tokens | The completion tokens of the current generation. | int (gauge) | count |
| duration | The duration of the current span. | int (gauge) | time, μs |
| error_message | The error message of the current trace (if an exception was thrown). | string | - |
| error_stack | The call stack of the current trace (if an exception was thrown). | string | - |
| error_type | The error type of the current trace (if an exception was thrown). | string | - |
| input | The input prompt of the current generation. | string | - |
| input_cache_read_tokens | The cached tokens of the current generation. | int (gauge) | count |
| input_tokens | The input tokens of the current generation. | int (gauge) | count |
| llm_provider | The provider of the current generation. | string | - |
| message | The JSON dump of the span. | string | - |
| model_name | The model name of the current generation. | string | - |
| model_parameters | The model parameters of the current generation. | string | - |
| observation_type | The observation type of the current span. | string | - |
| operation | Same as the span name of the current span. | string | - |
| output | The output of the current generation. | string | - |
| output_tokens | The output tokens of the current generation. | int (gauge) | count |
| prompt_tokens | The prompt tokens of the current generation. | int (gauge) | count |
| reasoning_tokens | The reasoning tokens of the current generation. | int (gauge) | count |
| resource | The span name of the current span. | string | - |
| scope_name | Langfuse SDK name. | string | - |
| scope_version | Langfuse SDK version. | string | - |
| sdk_language | SDK language, such as nodejs/python. | string | - |
| sdk_name | SDK name, such as opentelemetry. | string | - |
| sdk_version | OpenTelemetry SDK version. | string | - |
| service_name | Service name. | string | - |
| session_id | The session ID of the current generation. | string | - |
| span_id | The span ID of the current span. | string | - |
| start | The start time of the current span. | int | timestamp, μs |
| status | The status of the span. | string | - |
| stream_mode | Whether stream mode was used for the generation. | int (gauge) | - |
| total_tokens | The total tokens of the current generation. | int (gauge) | count |
| trace_id | The trace ID of the current span. | string | - |
| ttft | For stream mode, the time to first token. | int (gauge) | time, μs |
| user_id | The user ID of the current generation. | string | - |
| vendor | The LLM tracing vendor; for Langfuse this is always langfuse. | string | - |

In addition to the fields listed above, custom fields prefixed with langfuse. on the span are also extracted into top-level fields (. is replaced with _). For example, we can add custom fields to a span as follows:

traceSpan.setAttributes({
    "langfuse.session.id": <your-session-id>, // => session_id: <your-session-id>
    "langfuse.input": <user-prompt>,          // => input: <user-prompt>
    "langfuse.tags": ["tag1", "tag2"],        // => tags: values:{string_value:"tag1"} values:{string_value:"tag2"}
});
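
The same attributes can be set from Python via the underlying OpenTelemetry span (a sketch, assuming the v3 Python SDK, whose spans are OpenTelemetry spans):

from opentelemetry import trace

from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_span(name="my-trace"):
    otel_span = trace.get_current_span()
    # Keys prefixed with "langfuse." become top-level fields ("." -> "_").
    otel_span.set_attribute("langfuse.session.id", "session_abc")
    otel_span.set_attribute("langfuse.tags", ["tag1", "tag2"])

langfuse.flush()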

score

| Field | Description | Type | Unit |
| --- | --- | --- | --- |
| app_id | The LLM application ID. | string | - |
| app_name | The LLM application name. | string | - |
| comment | The score comment. | string | - |
| name | The score name. | string | - |
| score_data_type | The data type of the score; available types are NUMERIC/BOOLEAN/CATEGORICAL. | string | - |
| score_value | The value of a NUMERIC or BOOLEAN score. | float | - |
| score_value_str | The value of a CATEGORICAL score. | string | - |
| score_observation_id | The observation ID of the score. | string | - |
| trace_id | The trace ID of the score. | string | - |
| config_id | The config ID of the score. | string | - |
| session_id | The session ID of the score. | string | - |
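
A score like the one created in the JavaScript example above can also be attached from Python (a sketch, assuming the v3 client's create_score method; the trace ID is a hypothetical placeholder that must come from your own application):

from langfuse import get_client

langfuse = get_client()  # reads the LANGFUSE_* environment variables

langfuse.create_score(
    trace_id="your-trace-id",  # hypothetical placeholder
    name="accuracy",
    value=0.9,
    data_type="NUMERIC",
    comment="comment example",
)
langfuse.flush()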

