
API FAQ

Frequently asked questions about the DeepSeek API.

General Questions

What is the DeepSeek API?

The DeepSeek API provides programmatic access to DeepSeek's advanced language models. You can use it to build applications that leverage AI for text generation, conversation, code generation, and more.

How do I get started?

  1. Sign up for a DeepSeek account
  2. Generate an API key from your dashboard
  3. Make your first API call using our SDKs or direct HTTP requests
  4. Explore our documentation and examples

What models are available?

Currently available models include:

  • deepseek-chat: General-purpose conversational AI
  • deepseek-coder: Specialized for code generation and programming tasks
  • deepseek-math: Optimized for mathematical reasoning

Is there a free tier?

Yes, we offer a free tier with limited usage to help you get started. Check our pricing page for current limits and upgrade options.

Authentication & API Keys

How do I authenticate API requests?

Use your API key in the Authorization header:

bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.deepseek.com/v1/chat/completions

Or use the api-key header:

bash
curl -H "api-key: YOUR_API_KEY" \
  https://api.deepseek.com/v1/chat/completions

Where do I find my API key?

  1. Log in to your DeepSeek account
  2. Go to the API Keys section in your dashboard
  3. Create a new key or copy an existing one

Can I regenerate my API key?

Yes, you can regenerate your API key at any time from your dashboard. Note that regenerating will invalidate the old key immediately.

How do I keep my API key secure?

  • Never commit API keys to version control
  • Use environment variables to store keys
  • Rotate keys regularly
  • Restrict key permissions when possible
  • Monitor usage for unusual activity
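As a minimal sketch of the environment-variable approach (the variable name `DEEPSEEK_API_KEY` is a common convention, not a requirement):

```python
import os

def load_api_key() -> str:
    # Read the key from the environment rather than hard-coding it,
    # so it never lands in version control.
    key = os.environ.get("DEEPSEEK_API_KEY")
    if not key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set")
    return key
```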

Rate Limits & Usage

What are the rate limits?

Rate limits vary by plan:

  • Free tier: 10 requests per minute
  • Pro tier: 100 requests per minute
  • Enterprise: Custom limits

How do I handle rate limiting?

Implement exponential backoff when you receive a 429 status code:

python
import time
import random

from openai import RateLimitError  # raised by the OpenAI-compatible SDK on HTTP 429

def make_request_with_backoff(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: waits ~1-2s, ~2-3s, ~4-5s, ...
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)

How is usage calculated?

Usage is measured in tokens:

  • Input tokens: Text you send to the API
  • Output tokens: Text generated by the model
  • Both count toward your usage quota
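Exact counts depend on each model's tokenizer, but a rough rule of thumb (~4 characters per token for English text) is often enough for budgeting. This helper is an illustrative approximation, not the official tokenizer:

```python
def rough_token_count(text: str) -> int:
    # Heuristic only: roughly 4 characters per token for English prose.
    # Use the model's real tokenizer when you need exact numbers.
    return max(1, len(text) // 4)
```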

Can I monitor my usage?

Yes, check your dashboard for real-time usage statistics, including:

  • Total requests made
  • Tokens consumed
  • Rate limit status
  • Billing information

Technical Questions

What's the maximum context length?

Context length varies by model:

  • deepseek-chat: 32,768 tokens
  • deepseek-coder: 16,384 tokens
  • deepseek-math: 8,192 tokens

How do I handle long conversations?

For conversations that exceed context limits:

  1. Summarization: Summarize older messages
  2. Sliding window: Keep only recent messages
  3. Chunking: Break long inputs into smaller pieces

python
def manage_conversation_length(messages, max_tokens=30000):
    # estimate_tokens is a placeholder for your token-counting helper
    total_tokens = estimate_tokens(messages)

    if total_tokens > max_tokens:
        # Keep the system message plus the most recent non-system messages
        system_msgs = [msg for msg in messages if msg['role'] == 'system']
        recent_msgs = [msg for msg in messages if msg['role'] != 'system'][-10:]
        return system_msgs + recent_msgs

    return messages

Can I use streaming responses?

Yes, set stream: true in your request:

python
stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

How do I control response randomness?

Use the temperature parameter:

  • 0.0: Deterministic, focused responses
  • 1.0: Balanced creativity (default)
  • 2.0: Highly creative, more random

json
{
  "model": "deepseek-chat",
  "messages": [{"role": "user", "content": "Write a poem"}],
  "temperature": 1.5
}

What's the difference between temperature and top_p?

  • Temperature: Scales randomness across the full token distribution
  • Top_p: Nucleus sampling - restricts sampling to the smallest set of top-ranked tokens whose cumulative probability reaches p

Use temperature for general creativity control, and top_p for finer control over which tokens are eligible to be sampled at all.
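To make the distinction concrete, here is a toy illustration of what nucleus sampling does. The real filtering happens server-side over the model's full vocabulary; this is purely explanatory:

```python
def nucleus_filter(probs, top_p):
    # Keep the smallest set of highest-probability tokens whose
    # cumulative probability reaches top_p; the model then samples
    # only from this "nucleus".
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept
```

For example, with probabilities `{"the": 0.5, "a": 0.25, "cat": 0.15, "zebra": 0.1}` and `top_p=0.7`, only `"the"` and `"a"` remain candidates.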

Error Handling

Why am I getting a 401 error?

401 errors indicate authentication issues:

  • Invalid or missing API key
  • Expired API key
  • Incorrect header format

Verify your API key and header format:

bash
# Correct format
curl -H "Authorization: Bearer sk-..." \
  https://api.deepseek.com/v1/chat/completions

What does "context length exceeded" mean?

This error occurs when your input plus requested output tokens exceed the model's context limit. Solutions:

  1. Reduce input length
  2. Lower max_tokens parameter
  3. Summarize conversation history
  4. Use a model with larger context window
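A pre-flight check against the documented limits can catch this error before the request is ever sent (the limits below are the values listed earlier on this page):

```python
# Context windows per model, as documented on this page.
CONTEXT_LIMITS = {
    "deepseek-chat": 32_768,
    "deepseek-coder": 16_384,
    "deepseek-math": 8_192,
}

def fits_context(model: str, input_tokens: int, max_tokens: int) -> bool:
    # True if the prompt plus the requested completion fits the window.
    return input_tokens + max_tokens <= CONTEXT_LIMITS[model]
```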

How do I handle network timeouts?

Implement timeout handling in your requests:

python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=frozenset(["POST"]),  # Retry skips POST by default
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("http://", adapter)
session.mount("https://", adapter)

# Always pass an explicit timeout: (connect seconds, read seconds)
response = session.post(url, headers=headers, json=payload, timeout=(5, 60))

Integration & SDKs

Which SDKs are available?

Official SDKs:

  • Python: pip install openai
  • Node.js: npm install openai
  • Go: Community maintained
  • Java: Community maintained

Can I use the OpenAI SDK?

Yes! Our API is compatible with OpenAI's SDK. Just change the base URL:

python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

How do I integrate with my existing OpenAI code?

Minimal changes required:

  1. Update the base URL
  2. Replace your API key
  3. Use compatible model names

python
# Before (OpenAI)
client = OpenAI(api_key="openai-key")

# After (DeepSeek)
client = OpenAI(
    api_key="deepseek-key",
    base_url="https://api.deepseek.com/v1"
)

Can I use curl or other HTTP clients?

Absolutely! Our API is RESTful and works with any HTTP client:

bash
curl -X POST "https://api.deepseek.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Performance & Optimization

How can I improve response speed?

  1. Use streaming: Get partial responses immediately
  2. Optimize prompts: Shorter, clearer prompts
  3. Reduce max_tokens: Limit response length
  4. Choose appropriate model: Use specialized models for specific tasks

Should I cache responses?

Yes, for repeated queries:

python
from functools import lru_cache

@lru_cache(maxsize=100)
def get_completion(prompt, model="deepseek-chat", temperature=0.7):
    # lru_cache keys on the argument tuple, so identical
    # prompt/model/temperature combinations are served from memory
    # instead of repeating the API request.
    # Replace this body with your actual API call, e.g.
    # client.chat.completions.create(...)
    ...

Note that caching only makes sense when returning an identical response is acceptable, so use it with deterministic settings (e.g. temperature 0) or where approximate reuse is fine.

How do I optimize token usage?

  1. Be concise: Remove unnecessary words
  2. Use system messages: Set context once
  3. Implement conversation management: Summarize old messages
  4. Choose appropriate max_tokens: Don't over-allocate

Billing & Pricing

How am I charged?

Billing is based on token usage:

  • Input tokens: Text you send
  • Output tokens: Text generated
  • Different models have different rates

Can I set spending limits?

Yes, configure spending limits in your dashboard to prevent unexpected charges.

What happens if I exceed my quota?

Your requests will be rejected with a quota exceeded error until:

  • Your quota resets (for free tier)
  • You upgrade your plan
  • You purchase additional credits

Do you offer volume discounts?

Yes, enterprise customers can get volume discounts. Contact our sales team for custom pricing.

Security & Privacy

Is my data secure?

Yes, we implement industry-standard security measures:

  • Encryption in transit and at rest
  • Regular security audits
  • Compliance with data protection regulations

Do you store my API requests?

We may temporarily store requests for:

  • Service improvement
  • Abuse prevention
  • Debugging purposes

Data retention policies are detailed in our privacy policy.

Can I use the API for sensitive data?

For sensitive data, consider:

  • Using our enterprise deployment options
  • Implementing additional encryption
  • Reviewing our data processing agreements

Is the API GDPR compliant?

Yes, we are GDPR compliant. See our privacy policy for details on data processing and your rights.

Troubleshooting

My requests are slow. What can I do?

  1. Check your internet connection
  2. Try a different model
  3. Reduce input/output length
  4. Use streaming for better perceived performance
  5. Check our status page for service issues

I'm getting inconsistent responses. Why?

This is normal for AI models. To get more consistent responses:

  • Lower the temperature parameter
  • Use more specific prompts
  • Set a seed parameter (if available)

How do I report bugs or issues?

  1. Check our documentation first
  2. Search existing issues in our community forum
  3. Contact support with detailed information:
    • Request/response examples
    • Error messages
    • Steps to reproduce

Where can I get help?

  • Documentation: Comprehensive guides and examples
  • Community Forum: Ask questions and share solutions
  • Support: Direct help for technical issues
  • Discord: Real-time community chat

Best Practices

Prompt Engineering

  1. Be specific: Clear, detailed instructions
  2. Use examples: Show the desired format
  3. Set context: Use system messages effectively
  4. Iterate: Test and refine your prompts

Error Handling

  1. Implement retries: Handle transient errors
  2. Validate inputs: Check parameters before sending
  3. Log errors: Track issues for debugging
  4. Graceful degradation: Provide fallbacks
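A minimal sketch of graceful degradation (the wrapper and fallback message are illustrative, not part of the API):

```python
def complete_with_fallback(call, fallback="Service temporarily unavailable."):
    # `call` is any zero-argument function that performs the API request.
    # On failure, return a canned fallback instead of crashing the app;
    # a real implementation would also log the exception.
    try:
        return call()
    except Exception:
        return fallback
```

For example: `complete_with_fallback(lambda: client.chat.completions.create(...))`.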

Performance

  1. Batch requests: When possible, combine multiple queries
  2. Use appropriate models: Match model to task
  3. Monitor usage: Track performance and costs
  4. Implement caching: Avoid redundant requests

Security

  1. Secure API keys: Use environment variables
  2. Validate inputs: Sanitize user inputs
  3. Monitor usage: Watch for unusual activity
  4. Regular rotation: Update keys periodically
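For the input-validation point, a simple pre-flight check is often enough (the length cap here is an arbitrary application-level choice, not an API limit):

```python
MAX_PROMPT_CHARS = 100_000  # arbitrary application-level cap, not an API limit

def validate_prompt(prompt):
    # Reject obviously bad input before spending tokens on it.
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return prompt.strip()
```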

Getting Help

Can't find what you're looking for? Contact our support team for personalized assistance.
