Complete Guide to OpenFang API Integration: Claude & Gemini

March 5, 2026

Introduction

OpenFang is an open-source Agent Operating System, written in Rust and shipped as a single binary of roughly 32 MB. It has built-in support for 20 mainstream LLM providers, including Anthropic Claude, Google Gemini, OpenAI GPT, DeepSeek, Groq, and more. With simple configuration, you can quickly integrate various AI capabilities into your application.

This article will detail how to configure and use various LLM APIs in OpenFang, along with some practical tips.


Method 1: Using Defapi (Lower-Cost Access)

If you want to use high-quality AI models at a lower cost, Defapi is an excellent choice. Defapi provides APIs for all mainstream models at 50% of the official prices. For example:

  • Gemini 2.5 Pro: Official price $1.25/M tokens → Defapi only $0.625/M tokens
  • Claude Sonnet 4: Official price $3.00/M tokens → Defapi only $1.50/M tokens

Configuration Steps

  1. Obtain a Defapi API key (visit https://defapi.org)
  2. Add it to the configuration file:
# ~/.openfang/config.toml
[default_model]
provider = "openai"  # or "anthropic", "gemini"
model = "claude-sonnet-4-20250514"  # or other models
base_url = "https://api.defapi.org/v1"

[env]
DEFAPI_API_KEY = "your-defapi-key"

Using Custom Endpoints

[[providers]]
name = "defapi"
base_url = "https://api.defapi.org/v1"
api_key_env = "DEFAPI_API_KEY"

[default_model]
provider = "defapi"
model = "claude-sonnet-4-20250514"

Defapi Supported Protocols

Defapi supports various API protocols, perfectly compatible with OpenFang:

  • v1/chat/completions - OpenAI compatible interface
  • v1/messages - Anthropic Claude interface
  • v1beta/models/*:generateContent - Google Gemini interface
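
As a sketch, the three protocols can be exercised directly with curl. The endpoint paths come from the list above; the request bodies follow each vendor's standard format, and the auth header schemes are assumptions carried over from the respective official APIs:

```shell
# OpenAI-compatible chat completions
curl -s https://api.defapi.org/v1/chat/completions \
  -H "Authorization: Bearer $DEFAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hi"}]}'

# Anthropic-style messages
curl -s https://api.defapi.org/v1/messages \
  -H "x-api-key: $DEFAPI_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 64, "messages": [{"role": "user", "content": "Hi"}]}'

# Gemini-style generateContent
curl -s "https://api.defapi.org/v1beta/models/gemini-2.5-flash:generateContent?key=$DEFAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "Hi"}]}]}'
```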

Method 2: Using the Official APIs Directly

Quick Configuration with Environment Variables

OpenFang can automatically detect providers from environment variables:

# Anthropic Claude
export ANTHROPIC_API_KEY="sk-ant-..."

# OpenAI GPT
export OPENAI_API_KEY="sk-..."

# Google Gemini (also supports free quota)
export GEMINI_API_KEY="AIza..."

# DeepSeek
export DEEPSEEK_API_KEY="sk-..."

# Groq (free quota, extremely fast)
export GROQ_API_KEY="gsk_..."
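
A small bash helper (hypothetical, not part of OpenFang) to see at a glance which of these keys your shell currently exports:

```shell
# Report which provider API keys are present in the environment
check_keys() {
  for key in ANTHROPIC_API_KEY OPENAI_API_KEY GEMINI_API_KEY DEEPSEEK_API_KEY GROQ_API_KEY; do
    # ${!key} is bash indirect expansion: the value of the variable named in $key
    if [ -n "${!key}" ]; then
      echo "$key: set"
    else
      echo "$key: missing"
    fi
  done
}
check_keys
```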

Configuration via Configuration Files

You can also spell out the configuration explicitly in ~/.openfang/config.toml:

# Global default model
[default_model]
provider = "groq"
model = "llama-3.3-70b-versatile"

# Cost Control
[agents.defaults]
max_cost_per_hour_usd = 10.00
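
Conceptually, max_cost_per_hour_usd works like a running budget check; the following is a toy illustration of that idea in shell, not OpenFang's actual enforcement code:

```shell
budget=10.00   # mirrors max_cost_per_hour_usd above
spent=9.80     # hypothetical spend accumulated this hour

# Succeeds only if the pending call's cost would stay within the hourly cap
within_budget() {
  [ "$(echo "$spent + $1 <= $budget" | bc)" = "1" ]
}

if within_budget 0.30; then echo "allow call"; else echo "defer call"; fi
```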

Quick Reference of Available Models

| Provider  | Recommended Model        | Context | Features                      |
|-----------|--------------------------|---------|-------------------------------|
| Anthropic | claude-sonnet-4-20250514 | 200K    | High cost-performance         |
| OpenAI    | gpt-4o-mini              | 128K    | Fast and affordable           |
| Gemini    | gemini-2.5-flash         | 1M      | Free quota                    |
| DeepSeek  | deepseek-chat            | 64K     | Strong reasoning capabilities |
| Groq      | llama-3.3-70b-versatile  | 128K    | Extremely fast                |

Method 3: Using OpenRouter Aggregation Platform

OpenRouter supports over 200 models, suitable for scenarios requiring flexible model switching:

export OPENROUTER_API_KEY="sk-or-..."
[default_model]
provider = "openrouter"
model = "openrouter/auto"  # Automatically select the best model

Method 4: Integrating Local Models

Ollama (Quick Local Setup)

# Install and start Ollama
ollama serve
ollama pull llama3.2
[default_model]
provider = "ollama"
model = "llama3.2"
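
To confirm the model actually answers locally (Ollama serves on port 11434 by default), a quick direct request against Ollama's own API helps before routing OpenFang through it:

```shell
# One-off generation request against the local Ollama server
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Say hi", "stream": false}'
```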

vLLM (Production-grade Local Deployment)

python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-3.1-70B-Instruct
[default_model]
provider = "vllm"
model = "meta-llama/Llama-3.1-70B-Instruct"

Method 5: Connecting to Any OpenAI-Compatible API

If you have a custom API endpoint, OpenFang supports fully customizable configurations:

[[providers]]
name = "custom-llm"
base_url = "https://your-api-endpoint.com/v1"
api_key_env = "CUSTOM_API_KEY"

[default_model]
provider = "custom-llm"
model = "your-model"

Verifying Functionality

1. Check Health Status

curl http://127.0.0.1:4200/api/health

2. View Available Models

curl http://127.0.0.1:4200/api/models

3. Check Provider Status

curl http://127.0.0.1:4200/api/providers

4. Send Test Message

curl -X POST http://127.0.0.1:4200/api/agents/{agent-id}/message \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello, say hi in 3 words"}'

5. Test Using OpenAI-Compatible Interface

curl -X POST http://127.0.0.1:4200/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hello-world",
    "messages": [{"role": "user", "content": "Hi"}]
  }'

Internal Mechanism Analysis

Driver Architecture

OpenFang has a three-layer driver architecture to support various LLMs:

  1. Native Drivers: Anthropic and Gemini drivers optimized for specific API protocols
  2. OpenAI-Compatible Drivers: Support all providers adhering to OpenAI API format
  3. Backup Drivers: Support multi-provider chained calls, automatically switching when the primary provider fails
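
The backup-driver idea can be sketched in a few lines of shell; primary and backup below are simulated stand-ins for real provider calls, not OpenFang internals:

```shell
# Simulated providers: the primary fails, the backup answers
primary() { return 1; }
backup()  { echo "response from backup"; }

# Try each provider in order and return the first successful response
call_with_fallback() {
  local out provider
  for provider in primary backup; do
    if out=$("$provider" "$@"); then
      echo "$out"
      return 0
    fi
  done
  echo "all providers failed" >&2
  return 1
}

call_with_fallback "Hello"
```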

Intelligent Model Routing

OpenFang has a built-in smart routing mechanism that automatically selects the appropriate model based on task complexity:

  • Simple (Score < 100): Use Haiku or Gemini Flash
  • Medium (100-500): Use Sonnet or Gemini Pro
  • Complex (>= 500): Use Opus or GPT-4

Scoring is based on factors such as message length, tool count, code tags, dialogue depth, and system prompt length.
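
Ignoring every signal except message length, the tiering rule above reduces to something like this; the model names and the length-only score are illustrative simplifications, not the real scorer:

```shell
# Map a complexity score to a model tier using the thresholds above
pick_model() {
  local score=$1
  if   [ "$score" -lt 100 ]; then echo "claude-haiku"    # simple
  elif [ "$score" -lt 500 ]; then echo "claude-sonnet"   # medium
  else                            echo "claude-opus"     # complex
  fi
}

# Length-only stand-in for the real multi-signal scorer
score_message() { echo "${#1}"; }

pick_model "$(score_message "What is 2 + 2?")"
```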

Cost Tracking

After each API call, OpenFang automatically calculates the cost:

Cost: $0.0042 | Tokens: 1,200 in / 340 out | Model: claude-sonnet-4-20250514

This is made possible by the built-in model directory, which contains precise pricing information for all models.


Common Use Cases

1. Intelligent Customer Service Bot

Configure low-cost models (like gpt-4o-mini or llama-3.1-8b) to handle a large volume of simple inquiries, reducing operational costs.

2. Code Review Assistant

Use Claude Opus or GPT-4 for in-depth code analysis, or Groq's high-speed inference when rapid feedback matters more than depth.

3. Content Creation Assistant

Utilize the extensive context window (1M tokens) of Gemini 2.5 Pro for writing long documents and handling complex creative tasks.

4. Data Analysis Assistant

Utilize the online search capability of the Perplexity Sonar model to obtain the most up-to-date data for statistical analysis in real-time.

5. Multilingual Translation Service

Deploy translation models using local Ollama to ensure data privacy, suitable for internal use within enterprises.

Updated March 5, 2026