
API FAQ

Frequently asked questions about the DeepSeek API.

General Questions

What is the DeepSeek API?

The DeepSeek API provides programmatic access to DeepSeek's advanced language models. You can use it to build applications that leverage AI for text generation, conversation, code generation, and more.

How do I get started?

  1. Sign up for a DeepSeek account
  2. Generate an API key from your dashboard
  3. Make your first API call using our SDKs or direct HTTP requests
  4. Explore our documentation and examples

What models are available?

Currently available models include:

  • deepseek-chat: General-purpose conversational AI
  • deepseek-coder: Specialized for code generation and programming tasks
  • deepseek-math: Optimized for mathematical reasoning

Is there a free tier?

Yes, we offer a free tier with limited usage to help you get started. Check our pricing page for current limits and upgrade options.

Authentication & API Keys

How do I authenticate API requests?

Use your API key in the Authorization header:

bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.deepseek.com/v1/chat/completions

Or use the api-key header:

bash
curl -H "api-key: YOUR_API_KEY" \
  https://api.deepseek.com/v1/chat/completions

Where do I find my API key?

  1. Log in to your DeepSeek account
  2. Go to the API Keys section in your dashboard
  3. Create a new key or copy an existing one

Can I regenerate my API key?

Yes, you can regenerate your API key at any time from your dashboard. Note that regenerating will invalidate the old key immediately.

How do I keep my API key secure?

  • Never commit API keys to version control
  • Use environment variables to store keys
  • Rotate keys regularly
  • Restrict key permissions when possible
  • Monitor usage for unusual activity
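As a minimal sketch of the environment-variable approach (the variable name `DEEPSEEK_API_KEY` is a common convention, not a requirement):

```python
import os

def load_api_key() -> str:
    # Read the key from the environment rather than hard-coding it,
    # so it never lands in version control.
    key = os.environ.get("DEEPSEEK_API_KEY")
    if not key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set")
    return key
```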

Rate Limits & Usage

What are the rate limits?

Rate limits vary by plan:

  • Free tier: 10 requests per minute
  • Pro tier: 100 requests per minute
  • Enterprise: Custom limits

How do I handle rate limiting?

Implement exponential backoff when you receive a 429 status code:

python
import time
import random

from openai import RateLimitError  # raised by the OpenAI-compatible SDK on HTTP 429

def make_request_with_backoff(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: waits ~1-2s, ~2-3s, ~4-5s, ...
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)

How is usage calculated?

Usage is measured in tokens:

  • Input tokens: Text you send to the API
  • Output tokens: Text generated by the model
  • Both count toward your usage quota
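Exact counts depend on each model's tokenizer, but a rough rule of thumb (~4 characters per token for English text) is often enough for budgeting. This helper is an illustrative approximation, not the official tokenizer:

```python
def rough_token_count(text: str) -> int:
    # Heuristic only: roughly 4 characters per token for English prose.
    # Use the model's real tokenizer when you need exact numbers.
    return max(1, len(text) // 4)
```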

Can I monitor my usage?

Yes, check your dashboard for real-time usage statistics, including:

  • Total requests made
  • Tokens consumed
  • Rate limit status
  • Billing information

Technical Questions

What's the maximum context length?

Context length varies by model:

  • deepseek-chat: 32,768 tokens
  • deepseek-coder: 16,384 tokens
  • deepseek-math: 8,192 tokens

How do I handle long conversations?

For conversations that exceed context limits:

  1. Summarization: Summarize older messages
  2. Sliding window: Keep only recent messages
  3. Chunking: Break long inputs into smaller pieces

python
def manage_conversation_length(messages, max_tokens=30000):
    # estimate_tokens is a placeholder for your token-counting helper
    total_tokens = estimate_tokens(messages)

    if total_tokens > max_tokens:
        # Keep the system message plus the most recent non-system messages
        system_msgs = [msg for msg in messages if msg['role'] == 'system']
        recent_msgs = [msg for msg in messages if msg['role'] != 'system'][-10:]
        return system_msgs + recent_msgs

    return messages

Can I use streaming responses?

Yes, set stream: true in your request:

python
stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

How do I control response randomness?

Use the temperature parameter:

  • 0.0: Deterministic, focused responses
  • 1.0: Balanced creativity (default)
  • 2.0: Highly creative, more random

json
{
  "model": "deepseek-chat",
  "messages": [{"role": "user", "content": "Write a poem"}],
  "temperature": 1.5
}

What's the difference between temperature and top_p?

  • Temperature: Scales randomness across the full token distribution
  • Top_p: Nucleus sampling - restricts sampling to the smallest set of top-ranked tokens whose cumulative probability reaches p

Use temperature for general creativity control, and top_p for finer control over which tokens are eligible to be sampled at all.
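To make the distinction concrete, here is a toy illustration of what nucleus sampling does. The real filtering happens server-side over the model's full vocabulary; this is purely explanatory:

```python
def nucleus_filter(probs, top_p):
    # Keep the smallest set of highest-probability tokens whose
    # cumulative probability reaches top_p; the model then samples
    # only from this "nucleus".
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept
```

For example, with probabilities `{"the": 0.5, "a": 0.25, "cat": 0.15, "zebra": 0.1}` and `top_p=0.7`, only `"the"` and `"a"` remain candidates.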

Error Handling

Why am I getting a 401 error?

401 errors indicate authentication issues:

  • Invalid or missing API key
  • Expired API key
  • Incorrect header format

Verify your API key and header format:

bash
# Correct format
curl -H "Authorization: Bearer sk-..." \
  https://api.deepseek.com/v1/chat/completions

What does "context length exceeded" mean?

This error occurs when your input plus requested output tokens exceed the model's context limit. Solutions:

  1. Reduce input length
  2. Lower max_tokens parameter
  3. Summarize conversation history
  4. Use a model with larger context window
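A pre-flight check against the documented limits can catch this error before the request is ever sent (the limits below are the values listed earlier on this page):

```python
# Context windows per model, as documented on this page.
CONTEXT_LIMITS = {
    "deepseek-chat": 32_768,
    "deepseek-coder": 16_384,
    "deepseek-math": 8_192,
}

def fits_context(model: str, input_tokens: int, max_tokens: int) -> bool:
    # True if the prompt plus the requested completion fits the window.
    return input_tokens + max_tokens <= CONTEXT_LIMITS[model]
```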

How do I handle network timeouts?

Implement timeout handling in your requests:

python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=frozenset(["POST"]),  # Retry skips POST by default
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("http://", adapter)
session.mount("https://", adapter)

# Always pass an explicit timeout: (connect seconds, read seconds)
response = session.post(url, headers=headers, json=payload, timeout=(5, 60))

Integration & SDKs

Which SDKs are available?

Official SDKs:

  • Python: pip install openai
  • Node.js: npm install openai
  • Go: Community maintained
  • Java: Community maintained

Can I use the OpenAI SDK?

Yes! Our API is compatible with OpenAI's SDK. Just change the base URL:

python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

How do I integrate with my existing OpenAI code?

Minimal changes required:

  1. Update the base URL
  2. Replace your API key
  3. Use compatible model names

python
# Before (OpenAI)
client = OpenAI(api_key="openai-key")

# After (DeepSeek)
client = OpenAI(
    api_key="deepseek-key",
    base_url="https://api.deepseek.com/v1"
)

Can I use curl or other HTTP clients?

Absolutely! Our API is RESTful and works with any HTTP client:

bash
curl -X POST "https://api.deepseek.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Performance & Optimization

How can I improve response speed?

  1. Use streaming: Get partial responses immediately
  2. Optimize prompts: Shorter, clearer prompts
  3. Reduce max_tokens: Limit response length
  4. Choose appropriate model: Use specialized models for specific tasks

Should I cache responses?

Yes, for repeated queries:

python
from functools import lru_cache

@lru_cache(maxsize=100)
def get_completion(prompt, model="deepseek-chat", temperature=0.7):
    # lru_cache keys on the argument tuple, so identical
    # prompt/model/temperature combinations are served from memory
    # instead of repeating the API request.
    # Replace this body with your actual API call, e.g.
    # client.chat.completions.create(...)
    ...

Note that caching only makes sense when returning an identical response is acceptable, so use it with deterministic settings (e.g. temperature 0) or where approximate reuse is fine.

How do I optimize token usage?

  1. Be concise: Remove unnecessary words
  2. Use system messages: Set context once
  3. Implement conversation management: Summarize old messages
  4. Choose appropriate max_tokens: Don't over-allocate

Billing & Pricing

How am I charged?

Billing is based on token usage:

  • Input tokens: Text you send
  • Output tokens: Text generated
  • Different models have different rates

Can I set spending limits?

Yes, configure spending limits in your dashboard to prevent unexpected charges.

What happens if I exceed my quota?

Your requests will be rejected with a quota exceeded error until:

  • Your quota resets (for free tier)
  • You upgrade your plan
  • You purchase additional credits

Do you offer volume discounts?

Yes, enterprise customers can get volume discounts. Contact our sales team for custom pricing.

Security & Privacy

Is my data secure?

Yes, we implement industry-standard security measures:

  • Encryption in transit and at rest
  • Regular security audits
  • Compliance with data protection regulations

Do you store my API requests?

We may temporarily store requests for:

  • Service improvement
  • Abuse prevention
  • Debugging purposes

Data retention policies are detailed in our privacy policy.

Can I use the API for sensitive data?

For sensitive data, consider:

  • Using our enterprise deployment options
  • Implementing additional encryption
  • Reviewing our data processing agreements

Is the API GDPR compliant?

Yes, we are GDPR compliant. See our privacy policy for details on data processing and your rights.

Troubleshooting

My requests are slow. What can I do?

  1. Check your internet connection
  2. Try a different model
  3. Reduce input/output length
  4. Use streaming for better perceived performance
  5. Check our status page for service issues

I'm getting inconsistent responses. Why?

This is normal for AI models. To get more consistent responses:

  • Lower the temperature parameter
  • Use more specific prompts
  • Set a seed parameter (if available)

How do I report bugs or issues?

  1. Check our documentation first
  2. Search existing issues in our community forum
  3. Contact support with detailed information:
    • Request/response examples
    • Error messages
    • Steps to reproduce

Where can I get help?

  • Documentation: Comprehensive guides and examples
  • Community Forum: Ask questions and share solutions
  • Support: Direct help for technical issues
  • Discord: Real-time community chat

Best Practices

Prompt Engineering

  1. Be specific: Clear, detailed instructions
  2. Use examples: Show the desired format
  3. Set context: Use system messages effectively
  4. Iterate: Test and refine your prompts

Error Handling

  1. Implement retries: Handle transient errors
  2. Validate inputs: Check parameters before sending
  3. Log errors: Track issues for debugging
  4. Graceful degradation: Provide fallbacks
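A minimal sketch of graceful degradation (the wrapper and fallback message are illustrative, not part of the API):

```python
def complete_with_fallback(call, fallback="Service temporarily unavailable."):
    # `call` is any zero-argument function that performs the API request.
    # On failure, return a canned fallback instead of crashing the app;
    # a real implementation would also log the exception.
    try:
        return call()
    except Exception:
        return fallback
```

For example: `complete_with_fallback(lambda: client.chat.completions.create(...))`.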

Performance

  1. Batch requests: When possible, combine multiple queries
  2. Use appropriate models: Match model to task
  3. Monitor usage: Track performance and costs
  4. Implement caching: Avoid redundant requests

Security

  1. Secure API keys: Use environment variables
  2. Validate inputs: Sanitize user inputs
  3. Monitor usage: Watch for unusual activity
  4. Regular rotation: Update keys periodically
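For the input-validation point, a simple pre-flight check is often enough (the length cap here is an arbitrary application-level choice, not an API limit):

```python
MAX_PROMPT_CHARS = 100_000  # arbitrary application-level cap, not an API limit

def validate_prompt(prompt):
    # Reject obviously bad input before spending tokens on it.
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return prompt.strip()
```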

Getting Help

Can't find what you're looking for? Contact our support team for personalized assistance.
