Monitoring & Analytics
Comprehensive monitoring and analytics tools to track your DeepSeek API usage, performance, and costs in real-time.
Overview
DeepSeek provides powerful monitoring capabilities to help you:
- Track API Usage: Monitor requests, tokens, and response times
- Analyze Performance: Identify bottlenecks and optimization opportunities
- Control Costs: Monitor spending and set budget alerts
- Ensure Reliability: Track uptime and error rates
- Optimize Applications: Gain insights into usage patterns
Dashboard Overview
Real-time Metrics
Access your monitoring dashboard at https://console.deepseek.com/monitoring
Key Metrics Display
- Request Volume: Real-time API request counts
- Response Times: Average and percentile response latencies
- Token Usage: Input and output token consumption
- Error Rates: Success/failure ratios and error types
- Cost Tracking: Real-time spending and projections
Time Range Selection
- Real-time: Live data updates every 30 seconds
- Last Hour: Detailed minute-by-minute breakdown
- Last 24 Hours: Hourly aggregated data
- Last 7 Days: Daily summaries and trends
- Last 30 Days: Weekly and monthly patterns
- Custom Range: Flexible date range selection
Usage Analytics
Request Metrics
Volume Tracking
json
{
"total_requests": 15420,
"successful_requests": 15180,
"failed_requests": 240,
"success_rate": 98.44,
"requests_per_minute": 127.5,
"peak_requests_per_minute": 245
}
Model Usage Distribution
json
{
"model_usage": {
"deepseek-chat": {
"requests": 8500,
"percentage": 55.1,
"tokens": 2150000
},
"deepseek-coder": {
"requests": 4200,
"percentage": 27.2,
"tokens": 980000
},
"deepseek-vision": {
"requests": 2720,
"percentage": 17.7,
"tokens": 1200000
}
}
}
Token Analytics
Token Consumption
- Input Tokens: Tokens sent in requests
- Output Tokens: Tokens generated in responses
- Total Tokens: Combined input and output usage
- Average Tokens per Request: Efficiency metrics
- Token Rate Trends: Usage patterns over time
Token Usage Breakdown
json
{
"token_usage": {
"input_tokens": 1250000,
"output_tokens": 3080000,
"total_tokens": 4330000,
"average_input_per_request": 81.1,
"average_output_per_request": 199.7,
"average_total_per_request": 280.8
}
}
Performance Metrics
Response Time Analysis
- Average Response Time: Mean latency across all requests
- P50 Response Time: Median response time
- P95 Response Time: 95th percentile latency
- P99 Response Time: 99th percentile latency
- Maximum Response Time: Slowest response recorded
Performance Trends
json
{
"performance": {
"average_response_time_ms": 245,
"p50_response_time_ms": 180,
"p95_response_time_ms": 450,
"p99_response_time_ms": 850,
"max_response_time_ms": 1200,
"timeout_rate": 0.02
}
}
Error Monitoring
Error Classification
Error Types
- Authentication Errors: Invalid API keys or permissions
- Rate Limit Errors: Quota exceeded or rate limiting
- Validation Errors: Invalid request parameters
- Server Errors: Internal service issues
- Timeout Errors: Request timeout exceeded
Error Rate Tracking
json
{
"error_analysis": {
"total_errors": 240,
"error_rate": 1.56,
"error_types": {
"authentication": 45,
"rate_limit": 120,
"validation": 35,
"server_error": 25,
"timeout": 15
}
}
}
Error Details
Error Response Format
json
{
"error_id": "err_abc123",
"timestamp": "2025-01-15T10:30:00Z",
"error_type": "rate_limit",
"error_code": "rate_limit_exceeded",
"message": "Rate limit exceeded. Please try again later.",
"request_id": "req_xyz789",
"model": "deepseek-chat",
"user_id": "user_123"
}
Error Trend Analysis
- Error Rate Over Time: Track error patterns
- Error Distribution: Breakdown by error type
- Recovery Time: Time to resolve issues
- Impact Analysis: Affected users and requests
Cost Monitoring
Billing Analytics
Cost Breakdown
json
{
"cost_analysis": {
"total_cost": 127.45,
"currency": "USD",
"billing_period": "2025-01",
"cost_by_model": {
"deepseek-chat": 68.20,
"deepseek-coder": 35.15,
"deepseek-vision": 24.10
},
"cost_by_token_type": {
"input_tokens": 42.30,
"output_tokens": 85.15
}
}
}
Cost Trends
- Daily Spending: Track daily cost patterns
- Monthly Projections: Forecast monthly expenses
- Cost per Request: Efficiency metrics
- Budget Utilization: Progress against set budgets
- Cost Optimization: Recommendations for savings
Budget Management
Budget Alerts
json
{
"budget_settings": {
"monthly_budget": 500.00,
"current_usage": 127.45,
"usage_percentage": 25.49,
"alerts": [
{
"threshold": 50,
"status": "not_triggered"
},
{
"threshold": 80,
"status": "not_triggered"
},
{
"threshold": 100,
"status": "not_triggered"
}
]
}
}
Real-time Monitoring
Live Dashboard
WebSocket Connection
javascript
const ws = new WebSocket('wss://api.deepseek.com/monitoring/live');
ws.onmessage = function(event) {
const data = JSON.parse(event.data);
switch(data.type) {
case 'request_count':
updateRequestCounter(data.count);
break;
case 'response_time':
updateLatencyChart(data.latency);
break;
case 'error_alert':
showErrorAlert(data.error);
break;
case 'cost_update':
updateCostDisplay(data.cost);
break;
}
};
Real-time Metrics
- Live Request Counter: Real-time request volume
- Response Time Graph: Live latency visualization
- Error Rate Monitor: Real-time error tracking
- Cost Meter: Live spending updates
- System Health: Service status indicators
Alerting System
Alert Configuration
json
{
"alerts": [
{
"name": "High Error Rate",
"condition": "error_rate > 5%",
"duration": "5 minutes",
"channels": ["email", "webhook"],
"enabled": true
},
{
"name": "High Latency",
"condition": "p95_response_time > 1000ms",
"duration": "3 minutes",
"channels": ["email", "slack"],
"enabled": true
},
{
"name": "Budget Alert",
"condition": "monthly_cost > 80% of budget",
"channels": ["email"],
"enabled": true
}
]
}
API Monitoring
Programmatic Access
Monitoring API Endpoints
bash
# Get usage statistics
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://api.deepseek.com/monitoring/usage?period=24h"
# Get performance metrics
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://api.deepseek.com/monitoring/performance?period=7d"
# Get cost analysis
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://api.deepseek.com/monitoring/costs?period=30d"
Python SDK Integration
python
from deepseek import DeepSeek
client = DeepSeek(api_key="your-api-key")
# Get usage metrics
usage = client.monitoring.usage(period="24h")
print(f"Total requests: {usage.total_requests}")
print(f"Total tokens: {usage.total_tokens}")
# Get performance data
performance = client.monitoring.performance(period="7d")
print(f"Average response time: {performance.avg_response_time}ms")
# Get cost information
costs = client.monitoring.costs(period="30d")
print(f"Total cost: ${costs.total_cost}")
Custom Metrics
Custom Event Tracking
python
# Track custom events
client.monitoring.track_event(
event_name="user_signup",
properties={
"user_id": "user_123",
"plan": "pro",
"source": "api"
}
)
# Track custom metrics
client.monitoring.track_metric(
metric_name="response_quality",
value=4.5,
tags={
"model": "deepseek-chat",
"user_type": "premium"
}
)
Integration Examples
Grafana Dashboard
Grafana Configuration
json
{
"dashboard": {
"title": "DeepSeek API Monitoring",
"panels": [
{
"title": "Request Volume",
"type": "graph",
"targets": [
{
"expr": "deepseek_requests_total",
"legendFormat": "Total Requests"
}
]
},
{
"title": "Response Time",
"type": "graph",
"targets": [
{
"expr": "deepseek_response_time_p95",
"legendFormat": "95th Percentile"
}
]
}
]
}
}
Datadog Integration
Datadog Metrics
python
from datadog import initialize, statsd
# Initialize Datadog
initialize(api_key='your-datadog-api-key')
# Send custom metrics
statsd.increment('deepseek.requests.total')
statsd.histogram('deepseek.response_time', response_time)
statsd.gauge('deepseek.tokens.used', token_count)
Prometheus Metrics
Metrics Export
python
from prometheus_client import Counter, Histogram, Gauge
# Define metrics
REQUEST_COUNT = Counter('deepseek_requests_total', 'Total requests')
RESPONSE_TIME = Histogram('deepseek_response_time_seconds', 'Response time')
TOKEN_USAGE = Gauge('deepseek_tokens_used', 'Tokens used')
# Update metrics
REQUEST_COUNT.inc()
RESPONSE_TIME.observe(response_time)
TOKEN_USAGE.set(token_count)
Best Practices
Monitoring Strategy
- Set Up Alerts: Configure alerts for critical metrics
- Monitor Trends: Track long-term usage patterns
- Cost Optimization: Regular cost analysis and optimization
- Performance Tuning: Use metrics to optimize application performance
- Capacity Planning: Plan for future usage based on trends
Performance Optimization
- Response Time Monitoring: Track and optimize latency
- Error Rate Analysis: Identify and fix error patterns
- Token Efficiency: Optimize token usage for cost savings
- Rate Limit Management: Monitor and manage rate limits
- Caching Strategy: Implement caching based on usage patterns
Security Monitoring
- Authentication Monitoring: Track authentication failures
- Usage Anomalies: Detect unusual usage patterns
- Access Patterns: Monitor API access patterns
- Rate Limit Violations: Track rate limit violations
- Error Pattern Analysis: Identify potential security issues
Troubleshooting
Common Issues
High Error Rates
- Check authentication credentials
- Verify request parameters
- Monitor rate limits
- Review server status
Performance Issues
- Analyze response time trends
- Check for rate limiting
- Review request patterns
- Optimize request parameters
Cost Overruns
- Review usage patterns
- Optimize token usage
- Implement caching
- Set budget alerts
Support Resources
- Monitoring Documentation: /en/monitoring
- API Reference: /en/api-reference
- Support Portal: https://support.deepseek.com
- Community Forum: https://community.deepseek.com
Next Steps
- Set Up Monitoring: Configure your monitoring dashboard
- Create Alerts: Set up critical alerts for your use case
- Analyze Patterns: Review your usage patterns and trends
- Optimize Performance: Use insights to improve your application
- Plan Capacity: Use data for future capacity planning
Start monitoring your DeepSeek API usage today to ensure optimal performance, cost efficiency, and reliability for your applications.