DeerFlow Low-Cost API Access: Best Practices for Self-Hosted AI Agents

March 1, 2026

DeerFlow is a powerful open-source Super Agent framework built on LangGraph and LangChain, capable of helping you create fully functional AI assistants. Unlike ordinary chatbots, DeerFlow has its own file system, sandbox execution environment, and long-term memory capabilities, enabling it to truly assist you in completing complex multi-step tasks.

Introduction

DeerFlow (Deep Exploration and Efficient Research Flow) is an AI Agent framework open-sourced by ByteDance. Upon its release on February 28, 2026, DeerFlow 2.0 quickly topped GitHub Trending, showcasing its strong community appeal.

The core features of this framework lie in its scalability and completeness. It is not just a chat interface; it is a comprehensive AI workstation:

  • Skills System: Built-in skills for research, report generation, slide creation, web generation, image and video generation, and more.
  • Sub-Agents: The main agent can dynamically generate sub-agents to process complex tasks in parallel.
  • Sandbox: Each task runs in an isolated Docker container, ensuring safety and control.
  • Memory: Long-term memory across sessions that remembers your preferences and work habits.
  • Tools: A complete set of tools including web search, file operations, and Bash execution.

DeerFlow is compatible with any LLM provider that implements the OpenAI protocol, allowing you the flexibility to choose the most cost-effective model service.

Defapi is an emerging LLM API aggregation platform whose main selling point is pricing at half the official rates. For developers who use AI heavily, this can reduce costs significantly.

Why Choose Defapi?

  1. Cost Advantage: The same models at half the official price.
  2. Strong Compatibility: Fully compatible with the OpenAI API protocol, without the need for code modifications.
  3. Multi-Model Support: Supports various mainstream models like GPT, Claude, and Gemini.
  4. Stable and Reliable: Provides enterprise-level service guarantees.

Method 1: Defapi

Step 1: Obtain API Key

Visit https://defapi.org to register an account and obtain your API Key.

Step 2: Configure Environment Variables

Create a .env file in the root directory of your project:

DEFAPI_API_KEY="dk-xxxxxxxxxxxxxxxx"
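
The config examples below reference the key as `$DEFAPI_API_KEY` rather than hard-coding it. As a rough illustration of how such a `$VAR` reference can be resolved against the environment (the exact substitution mechanism inside DeerFlow may differ), consider:

```python
import os

def resolve_env_ref(value: str) -> str:
    """Resolve a "$VAR"-style config value against the process environment.

    Illustrative only -- DeerFlow's actual substitution logic may differ.
    """
    if value.startswith("$"):
        name = value[1:]
        resolved = os.environ.get(name)
        if resolved is None:
            raise KeyError(f"environment variable {name} is not set")
        return resolved
    return value  # literal values pass through unchanged

# Example: the config line `api_key: $DEFAPI_API_KEY`
os.environ["DEFAPI_API_KEY"] = "dk-xxxxxxxxxxxxxxxx"
print(resolve_env_ref("$DEFAPI_API_KEY"))
```

Keeping the key in the environment (or a `.env` file excluded from version control) avoids committing secrets alongside `config.yaml`.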

Step 3: Edit config.yaml

models:
  - name: gpt-4o-mini
    display_name: GPT-4o Mini (Defapi)
    use: langchain_openai:ChatOpenAI
    model: openai/gpt-4o-mini
    api_key: $DEFAPI_API_KEY
    base_url: https://api.defapi.org
    max_tokens: 4096
    temperature: 0.7
    supports_vision: true
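
Because Defapi implements the OpenAI protocol, the request DeerFlow ultimately sends looks like a standard OpenAI chat-completions call. The sketch below builds (but does not send) such a request with Python's standard library; the `/v1/chat/completions` path is the standard OpenAI route and is assumed here, not confirmed from Defapi's documentation.

```python
import json
import urllib.request

API_KEY = "dk-xxxxxxxxxxxxxxxx"  # placeholder key

# Standard OpenAI-protocol chat request body, using the model
# identifier from the config above.
payload = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Build the request without sending it; sending requires a valid key.
req = urllib.request.Request(
    "https://api.defapi.org/v1/chat/completions",  # assumed standard path
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
print(req.full_url)
print(req.get_header("Authorization")[:10])  # Bearer dk-
```

In practice DeerFlow constructs this request for you via `langchain_openai:ChatOpenAI`; the point is that nothing provider-specific is needed beyond `base_url` and the key.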

Supported Models List

Defapi offers a rich selection of models:

Model               Identifier                    Features
GPT-4o Mini         openai/gpt-4o-mini            High cost-performance ratio, supports vision
GPT-4o              openai/gpt-4o                 Powerful and balanced, supports vision
Claude Sonnet 4.5   anthropic/claude-sonnet-4.5   Strong programming abilities
Gemini 2.0 Flash    google/gemini-2.0-flash       Fast speed

Method 2: OpenAI Official API

If you prefer using official services, you can directly configure OpenAI.

Configure config.yaml

models:
  - name: gpt-4o
    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
    max_tokens: 4096
    temperature: 0.7
    supports_vision: true

Set the environment variable:

export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxx"
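
Before starting DeerFlow it can help to sanity-check each model entry. The required-field list below is inferred from the examples in this guide, not from an official DeerFlow schema:

```python
# Fields that every model entry in this guide includes; treat this as a
# guideline inferred from the examples, not an official schema.
REQUIRED = ("name", "use", "model", "api_key")

def validate_model_entry(entry: dict) -> list:
    """Return the list of missing required fields (empty means OK)."""
    return [key for key in REQUIRED if key not in entry]

entry = {
    "name": "gpt-4o",
    "use": "langchain_openai:ChatOpenAI",
    "model": "gpt-4o",
    "api_key": "$OPENAI_API_KEY",
}
print(validate_model_entry(entry))               # []
print(validate_model_entry({"name": "broken"}))  # ['use', 'model', 'api_key']
```

A check like this catches typos (e.g. `api-key` instead of `api_key`) before they surface as opaque startup errors.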

Method 3: DeepSeek

DeepSeek has gained attention for its outstanding reasoning capabilities and very competitive pricing.

Through Official API

models:
  - name: deepseek-v3
    display_name: DeepSeek V3
    use: langchain_openai:ChatOpenAI
    model: deepseek/deepseek-v3
    api_key: $DEEPSEEK_API_KEY
    base_url: https://api.deepseek.com
    max_tokens: 4096
    supports_thinking: true
    when_thinking_enabled:
      extra_body:
        thinking:
          type: enabled

Through Defapi (at half price)

models:
  - name: deepseek-v3
    display_name: DeepSeek V3
    use: langchain_openai:ChatOpenAI
    model: deepseek/deepseek-v3
    api_key: $DEFAPI_API_KEY
    base_url: https://api.defapi.org
    max_tokens: 4096
    supports_thinking: true
    when_thinking_enabled:
      extra_body:
        thinking:
          type: enabled
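
Conceptually, the `when_thinking_enabled` block contributes extra request parameters (`extra_body`) only when thinking mode is on. A minimal sketch of that merge logic, written for illustration rather than taken from DeerFlow's source:

```python
def apply_thinking(model_cfg: dict, thinking_enabled: bool) -> dict:
    """Merge `when_thinking_enabled` overrides into request kwargs.

    Illustrative of the config semantics above; DeerFlow's real merge
    logic may differ.
    """
    kwargs = {"model": model_cfg["model"], "max_tokens": model_cfg["max_tokens"]}
    if thinking_enabled and model_cfg.get("supports_thinking"):
        kwargs.update(model_cfg.get("when_thinking_enabled", {}))
    return kwargs

cfg = {
    "model": "deepseek/deepseek-v3",
    "max_tokens": 4096,
    "supports_thinking": True,
    "when_thinking_enabled": {"extra_body": {"thinking": {"type": "enabled"}}},
}
print(apply_thinking(cfg, True)["extra_body"])     # {'thinking': {'type': 'enabled'}}
print("extra_body" in apply_thinking(cfg, False))  # False
```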

Method 4: Anthropic Claude

Claude is well-known for its excellent programming skills and safety features.

Configuration Example

models:
  - name: claude-3-5-sonnet
    display_name: Claude 3.5 Sonnet
    use: langchain_anthropic:ChatAnthropic
    model: claude-3-5-sonnet-20241022
    api_key: $ANTHROPIC_API_KEY
    max_tokens: 8192
    supports_vision: true

Through Defapi (at half price)

models:
  - name: claude-sonnet-4.5
    display_name: Claude Sonnet 4.5 (Defapi)
    use: langchain_openai:ChatOpenAI
    model: anthropic/claude-sonnet-4.5
    api_key: $DEFAPI_API_KEY
    base_url: https://api.defapi.org
    max_tokens: 8192
    supports_vision: true

Method 5: Novita AI

Novita AI provides a rich set of open-source models at highly competitive prices.

Configuration Example

models:
  - name: novita-deepseek-v3.2
    display_name: Novita DeepSeek V3.2
    use: langchain_openai:ChatOpenAI
    model: deepseek/deepseek-v3.2
    api_key: $NOVITA_API_KEY
    base_url: https://api.novita.ai/openai
    max_tokens: 4096
    temperature: 0.7
    supports_thinking: true
    supports_vision: true
    when_thinking_enabled:
      extra_body:
        thinking:
          type: enabled

Method 6: OpenRouter

OpenRouter aggregates various global LLM services, allowing you to access hundreds of models through a single interface.

Configuration Example

models:
  - name: openrouter-gpt-4o
    display_name: GPT-4o (OpenRouter)
    use: langchain_openai:ChatOpenAI
    model: openai/gpt-4o
    api_key: $OPENROUTER_API_KEY
    base_url: https://openrouter.ai/api/v1
    max_tokens: 4096
    temperature: 0.7

Verify That DeerFlow Is Working Correctly

Start the Service

# Docker method (recommended)
make docker-init
make docker-start

# Or local development
make dev

Access the Interface

Open a browser and go to http://localhost:2026

Testing Methods

  1. Enter in the chat box: Please introduce yourself
  2. DeerFlow should respond with a detailed introduction about its functions and architecture.
  3. Try more complex tasks: Help me search for the latest AI news and summarize

Test Tools

Try using the built-in tools:

Please list the file structure of the current directory

Internal Mechanism: LangGraph Agent Architecture

The core of DeerFlow is based on the Agent system of LangGraph. Understanding its architecture helps in better customizing and extending it.

Workflow of the Lead Agent

User Input → Middleware Chain → LLM → Tools → Subagents → Output

The middleware chain consists of nine middleware components, executed in order:

  1. ThreadDataMiddleware: Creates isolated directories for each session.
  2. UploadsMiddleware: Handles uploaded files.
  3. SandboxMiddleware: Acquires the sandbox execution environment.
  4. SummarizationMiddleware: Compresses context when nearing token limits.
  5. TodoListMiddleware: Tracks multi-step tasks (planning mode).
  6. TitleMiddleware: Automatically generates session titles.
  7. MemoryMiddleware: Asynchronously updates long-term memory.
  8. ViewImageMiddleware: Injects image data for visual models.
  9. ClarificationMiddleware: Intercepts clarification requests.
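
The ordered list above can be pictured as a chain of wrappers around the agent call, where each middleware runs before handing off to the next. The toy sketch below illustrates the pattern; the names are taken from the list, but the mechanics are a simplification, not DeerFlow's actual API:

```python
from functools import reduce

def make_middleware(name, log):
    """Build a toy middleware that records its name, then delegates."""
    def middleware(next_handler):
        def handler(request):
            log.append(name)  # pre-processing runs outermost-first
            return next_handler(request)
        return handler
    return middleware

names = ["ThreadData", "Uploads", "Sandbox", "Summarization",
         "TodoList", "Title", "Memory", "ViewImage", "Clarification"]
log = []
agent = lambda request: f"response to {request!r}"

# Wrap the agent from the innermost middleware outward so that
# names[0] (ThreadData) executes first, matching the order above.
chain = reduce(lambda handler, mw: mw(handler),
               [make_middleware(n, log) for n in reversed(names)],
               agent)

result = chain("hello")
print(result)
print(log[0], "->", log[-1])  # ThreadData -> Clarification
```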

Sandbox System

DeerFlow employs an abstract sandbox interface, supporting various execution modes:

  • LocalSandbox: Executes directly on the host.
  • AioSandboxProvider: Executes in isolation within Docker containers.
  • Provisioner: Executes within Kubernetes Pods (production environment).

Virtual Path System:

  • What the Agent sees: /mnt/user-data/{workspace,uploads,outputs}
  • Actual path: backend/.deer-flow/threads/{thread_id}/user-data/...
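
The mapping between the two path spaces is a simple prefix rewrite. A hypothetical helper (not part of DeerFlow's public API) that mirrors the layout described above:

```python
from pathlib import PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")

def to_host_path(thread_id: str, virtual: str) -> str:
    """Map an agent-visible virtual path to the host-side thread directory.

    Hypothetical helper mirroring the layout above, not DeerFlow's code.
    """
    vpath = PurePosixPath(virtual)
    # relative_to raises ValueError for paths outside the sandbox root,
    # which doubles as a basic escape check.
    rel = vpath.relative_to(VIRTUAL_ROOT)
    return f"backend/.deer-flow/threads/{thread_id}/user-data/{rel}"

host = to_host_path("t-123", "/mnt/user-data/workspace/report.md")
print(host)  # backend/.deer-flow/threads/t-123/user-data/workspace/report.md
```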

Sub-Agent System

Sub-agents allow for the parallel processing of complex tasks:

  • general-purpose: General agents with access to all tools.
  • bash: Command execution specialists.
  • Concurrency limit: at most 3 sub-agents run at once.
  • Timeout: each sub-agent is limited to 15 minutes.
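
A concurrency cap like this is commonly enforced with a semaphore. The sketch below (illustrative, not DeerFlow's implementation) shows eight tasks sharing a limit of three; the short sleep stands in for real agent work:

```python
import asyncio

MAX_CONCURRENT = 3  # sub-agent cap from the section above

async def run_subagent(sem, task_id, state):
    """Toy sub-agent: acquire the semaphore, do 'work', report a result."""
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for real agent work
        state["active"] -= 1
        return f"result-{task_id}"

async def run_all(n_tasks):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(run_subagent(sem, i, state) for i in range(n_tasks))
    )
    return results, state["peak"]

results, peak = asyncio.run(run_all(8))
print(len(results), "tasks finished; peak concurrency:", peak)
```

The peak never exceeds the semaphore's limit, no matter how many tasks are queued.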

Common Use Cases

1. In-depth Research

DeerFlow was originally designed for in-depth research. You can use it to:

  • Search for the latest advancements in a particular technical field.
  • Read and summarize multiple papers.
  • Generate detailed research reports.

2. Code Development and Debugging

With the sandbox environment, DeerFlow can:

  • Help you write and test code.
  • Debug program errors.
  • Refactor existing codebases.

3. Content Creation

Utilizing the built-in Skills:

  • Generate professional technical reports.
  • Create presentation slides.
  • Write blog articles.

4. Data Analysis

Through file operations and code execution:

  • Read and analyze data files.
  • Generate data visualizations.
  • Create automated data pipelines.

5. Workflow Automation

Using the sub-agent system:

  • Execute multiple search tasks in parallel.
  • Orchestrate complex multi-step automation processes.
  • Run scheduled tasks.