Analysis Dashboard¶
In LLM Monitoring > Analysis Dashboard, you can view charts of key metrics for the selected application, including request volume, error rate, token consumption, response latency, and the usage share of each model, giving you a comprehensive overview of application performance and cost distribution.
Note
If there is no data in the selected time period, the chart area displays a "No data" state. You can adjust the viewing range with the time picker in the upper-right corner.
Core Metrics Overview¶
The top of the dashboard displays key metric statistics for the selected time range:
- Request Count: Total number of requests
- Span Count: Total number of associated span calls
- Request Error Rate: Percentage of failed requests out of all requests
- Total Token Consumption: Total number of tokens consumed by all requests
- Average Response Latency: Average processing time across all requests
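These aggregates are straightforward to reproduce from raw request records. The sketch below is illustrative only: the record structure and field names (`status`, `tokens`, `latency_ms`, `spans`) are assumptions for this example, not the product's actual data model.

```python
# Illustrative request records; field names are assumptions for this sketch.
requests = [
    {"status": "ok", "tokens": 820, "latency_ms": 412, "spans": 3},
    {"status": "error", "tokens": 0, "latency_ms": 95, "spans": 1},
    {"status": "ok", "tokens": 1530, "latency_ms": 980, "spans": 5},
]

request_count = len(requests)
span_count = sum(r["spans"] for r in requests)
error_rate = sum(r["status"] == "error" for r in requests) / request_count * 100
total_tokens = sum(r["tokens"] for r in requests)
avg_latency_ms = sum(r["latency_ms"] for r in requests) / request_count

print(f"Requests: {request_count}, Spans: {span_count}, "
      f"Error rate: {error_rate:.1f}%, Tokens: {total_tokens}, "
      f"Avg latency: {avg_latency_ms:.0f} ms")
```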
Trend Analysis Charts¶
Through multi-dimensional trend charts, you can track changes in application performance and usage trends:
- Request Trend Chart: Shows how request volume is distributed over time, helping you identify usage peaks
- Request Error Rate: Shows how the error rate changes over time so that abnormal fluctuations can be detected quickly
- Request Response Time: Shows latency percentiles (Avg, P75, P90, P99) to help locate long-tail latency issues
- Token Consumption Trend Chart: Shows how token usage is distributed over time, assisting with cost forecasting
- Average Time to First Token (TTFT): Measures first-token latency, reflecting how quickly the model starts responding
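To see why the response-time chart reports percentiles alongside the average, consider a minimal sketch of the standard nearest-rank percentile calculation. The latency samples are made up, and this is not the dashboard's actual implementation, just the general idea behind P75/P90/P99:

```python
import math

# Made-up latency samples in milliseconds, sorted ascending.
latencies_ms = sorted([120, 135, 150, 180, 210, 240, 300, 450, 900, 2400])

def percentile(sorted_values, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n), 1-indexed."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank - 1, 0)]

avg = sum(latencies_ms) / len(latencies_ms)
p75 = percentile(latencies_ms, 75)
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)

# A single slow outlier barely moves the average but dominates P99,
# which is why percentiles expose long-tail latency issues.
print(f"Avg: {avg} ms, P75: {p75} ms, P90: {p90} ms, P99: {p99} ms")
```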
Model and User Analysis¶
Analyze resource usage and performance from the model dimension and user dimension:
- Request Share by Model: A donut chart showing the call distribution across models (e.g., qwen3:1.7b)
- Token Usage Ranking by Model: Compares token consumption across models
- Request Response Analysis by Model (Top 20): A table of detailed performance metrics for each model, such as request count, average/maximum latency, and percentile latencies
- User Request Ranking (Top 10): Identifies high-frequency users
- User Token Consumption Ranking (Top 10): Locates high-cost users to support cost allocation
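The per-model and per-user rankings above are group-by-and-sort aggregations. The following sketch illustrates the idea; the record structure and field names (`model`, `user`, `tokens`) are assumptions for this example, not the product's schema.

```python
from collections import defaultdict

# Illustrative records; field names are assumptions, not the product's schema.
records = [
    {"model": "qwen3:1.7b", "user": "alice", "tokens": 820},
    {"model": "qwen3:1.7b", "user": "bob", "tokens": 1530},
    {"model": "llama3:8b", "user": "alice", "tokens": 640},
]

def top_n(records, key, n=10):
    """Rank total token consumption grouped by the given field, highest first."""
    totals = defaultdict(int)
    for r in records:
        totals[r[key]] += r["tokens"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

print(top_n(records, "model"))  # token usage ranking by model
print(top_n(records, "user"))   # token consumption ranking by user
```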
Further Reading¶
You might also be interested in:
