# AI Provider Integration
OneClaw supports three AI providers through a unified adapter interface. Each provider has its own adapter that handles the differences in request format, streaming protocol, and tool definitions.
## Supported Providers

| Provider | Type | Auth | Streaming |
|---|---|---|---|
| OpenAI (and compatible) | `OPENAI` | Bearer token header | SSE via `/chat/completions` |
| Anthropic | `ANTHROPIC` | `x-api-key` header | SSE via `/messages` |
| Google Gemini | `GEMINI` | API key query parameter | SSE via `:streamGenerateContent` |
## ModelApiAdapter Interface

All adapters implement `ModelApiAdapter`, which defines five operations:

```
sendMessageStream(apiBaseUrl, apiKey, modelId, messages, tools, systemPrompt, webSearchEnabled, temperature)
    -> Flow<StreamEvent>
listModels(apiBaseUrl, apiKey) -> AppResult<List<AiModel>>
testConnection(apiBaseUrl, apiKey) -> AppResult<ConnectionTestResult>
formatToolDefinitions(tools) -> Any
generateSimpleCompletion(apiBaseUrl, apiKey, modelId, prompt, maxTokens) -> AppResult<String>
```
`ModelApiAdapterFactory` creates the correct adapter based on `ProviderType`:

```
ProviderType.OPENAI    -> OpenAiAdapter
ProviderType.ANTHROPIC -> AnthropicAdapter
ProviderType.GEMINI    -> GeminiAdapter
```
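The factory mapping can be sketched as a `when` over `ProviderType`. This is a minimal sketch with simplified types: the real interface carries the five operations listed above, and the stub adapters here expose only an illustrative `providerName()` placeholder.

```kotlin
// Simplified stand-in for the real ModelApiAdapter interface.
enum class ProviderType { OPENAI, ANTHROPIC, GEMINI }

interface ModelApiAdapter {
    // Placeholder for the real streaming/listing operations.
    fun providerName(): String
}

class OpenAiAdapter : ModelApiAdapter { override fun providerName() = "OpenAI" }
class AnthropicAdapter : ModelApiAdapter { override fun providerName() = "Anthropic" }
class GeminiAdapter : ModelApiAdapter { override fun providerName() = "Gemini" }

// Mirrors the ProviderType -> adapter mapping described above; the `when`
// is exhaustive, so adding a new ProviderType forces a new branch.
object ModelApiAdapterFactory {
    fun create(type: ProviderType): ModelApiAdapter = when (type) {
        ProviderType.OPENAI -> OpenAiAdapter()
        ProviderType.ANTHROPIC -> AnthropicAdapter()
        ProviderType.GEMINI -> GeminiAdapter()
    }
}
```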
## StreamEvent

The streaming response is modeled as a `Flow<StreamEvent>` with these event types:

| Event | Description |
|---|---|
| `TextDelta(text)` | Incremental text chunk |
| `ThinkingDelta(text)` | Extended thinking content (Anthropic) |
| `ToolCallStart(toolCallId, toolName)` | Tool call begins |
| `ToolCallDelta(toolCallId, argumentsDelta)` | Tool call arguments chunk |
| `ToolCallEnd(toolCallId)` | Tool call complete |
| `Usage(inputTokens, outputTokens)` | Token usage report |
| `Error(message, code)` | Error during streaming |
| `Done` | Stream complete |
| `WebSearchStart(query)` | Web search initiated |
| `Citations(citations)` | Citation references from web search |
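A sketch of how such an event hierarchy looks as a Kotlin sealed class, with a consumer that folds text deltas into the final message. This is a subset of the table above (the web-search events are omitted), and the consumer works over a plain `List` here rather than the real `Flow`.

```kotlin
// Subset of the StreamEvent hierarchy; field names match the table above.
sealed class StreamEvent {
    data class TextDelta(val text: String) : StreamEvent()
    data class ThinkingDelta(val text: String) : StreamEvent()
    data class ToolCallStart(val toolCallId: String, val toolName: String) : StreamEvent()
    data class ToolCallDelta(val toolCallId: String, val argumentsDelta: String) : StreamEvent()
    data class ToolCallEnd(val toolCallId: String) : StreamEvent()
    data class Usage(val inputTokens: Int, val outputTokens: Int) : StreamEvent()
    data class Error(val message: String, val code: String?) : StreamEvent()
    object Done : StreamEvent()
}

// A consumer accumulates TextDelta chunks into the full assistant reply.
fun accumulateText(events: List<StreamEvent>): String =
    events.filterIsInstance<StreamEvent.TextDelta>().joinToString("") { it.text }
```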
## OpenAI Adapter

Endpoint: `POST {apiBaseUrl}/chat/completions`

Request format:

```json
{
  "model": "gpt-4-turbo",
  "stream": true,
  "stream_options": { "include_usage": true },
  "messages": [...],
  "tools": [{ "type": "function", "function": {...} }],
  "web_search_options": { "search_context_size": "medium" }
}
```
Key behaviors:

- System prompt sent as a message with role `system`
- Tool definitions wrapped in `{ "type": "function", "function": {...} }`
- Stream parsed via SSE; a `[DONE]` token signals completion
- Tool calls accumulated by index from streamed chunks
- Model list filtered to `gpt-*`, `o1`, `o3`, `o4`, `chatgpt-*` prefixes
- Supports multimodal input (images as `image_url` with base64)
- Web search via `web_search_options`; citations parsed from `url_citation` annotations
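The "accumulated by index" behavior can be sketched as follows: each streamed chunk may carry a fragment of a tool call's id, name, or arguments for a given index, and the adapter stitches fragments together per index until the stream ends. The `ToolCallChunk` shape is an assumption for illustration; the real adapter extracts these fields from the SSE JSON deltas.

```kotlin
// Hypothetical fragment shape; any field may be absent in a given chunk.
data class ToolCallChunk(
    val index: Int,
    val id: String? = null,
    val name: String? = null,
    val argumentsDelta: String? = null,
)

data class ToolCall(val id: String, val name: String, val arguments: String)

// Accumulate fragments per index; argument deltas are concatenated in order.
fun accumulateToolCalls(chunks: List<ToolCallChunk>): List<ToolCall> {
    data class Acc(var id: String = "", var name: String = "", val args: StringBuilder = StringBuilder())
    val byIndex = sortedMapOf<Int, Acc>()
    for (c in chunks) {
        val acc = byIndex.getOrPut(c.index) { Acc() }
        c.id?.let { acc.id = it }
        c.name?.let { acc.name = it }
        c.argumentsDelta?.let { acc.args.append(it) }
    }
    return byIndex.values.map { ToolCall(it.id, it.name, it.args.toString()) }
}
```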
## Anthropic Adapter

Endpoint: `POST {apiBaseUrl}/messages`

Request format:

```json
{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 16000,
  "stream": true,
  "system": "...",
  "thinking": { "type": "enabled", "budget_tokens": 10000 },
  "messages": [...],
  "tools": [{ "name": "...", "description": "...", "input_schema": {...} }]
}
```
Key behaviors:

- System prompt sent as a separate top-level `system` field
- Tool definitions use `input_schema` instead of `parameters`
- Requires merging consecutive `ToolResult` messages into a single user message
- Extended thinking supported; temperature must be 1.0 when enabled
- SSE event types: `content_block_start`, `content_block_delta`, `content_block_stop`, `message_start`, `message_delta`, `message_stop`
- Web search via the `web_search_20250305` server tool with `max_uses: 5`
- Supports PDF attachments as base64 `document` content blocks
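The `ToolResult` merge can be sketched as a single pass over the message list: runs of consecutive tool results collapse into one merged user turn. The types here are simplified stand-ins for the real `ApiMessage` hierarchy, and `MergedToolResults` is a hypothetical container for the combined `tool_result` blocks.

```kotlin
// Simplified stand-ins for the adapter's message types.
sealed class ApiMessage {
    data class User(val content: String) : ApiMessage()
    data class Assistant(val content: String) : ApiMessage()
    data class ToolResult(val toolCallId: String, val content: String) : ApiMessage()
}

// One merged "user turn" holding several tool_result blocks.
data class MergedToolResults(val results: List<ApiMessage.ToolResult>)

// Collapse each run of consecutive ToolResult messages into a single entry.
fun mergeToolResults(messages: List<ApiMessage>): List<Any> {
    val out = mutableListOf<Any>()
    val pending = mutableListOf<ApiMessage.ToolResult>()
    fun flush() {
        if (pending.isNotEmpty()) {
            out.add(MergedToolResults(pending.toList()))
            pending.clear()
        }
    }
    for (m in messages) {
        if (m is ApiMessage.ToolResult) pending.add(m) else { flush(); out.add(m) }
    }
    flush()  // trailing tool results also form one merged turn
    return out
}
```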
## Gemini Adapter

Endpoint: `POST {apiBaseUrl}/models/{modelId}:streamGenerateContent?key={apiKey}&alt=sse`

Request format:

```json
{
  "system_instruction": { "parts": [{ "text": "..." }] },
  "contents": [...],
  "tools": [
    { "function_declarations": [...] },
    { "google_search": {} }
  ],
  "generationConfig": { "temperature": 0.7 }
}
```
Key behaviors:

- API key passed as a query parameter, not in headers
- System prompt sent as `system_instruction`
- Tool definitions wrapped in a `function_declarations` array
- Tool calls emitted inline (not streamed) as `functionCall` objects
- Tool call IDs generated as `call_{toolName}_{timestamp}`
- Model list filtered to those supporting `generateContent`
- Supports multimodal input (images as `inlineData`)
- Web search via the `google_search` tool; citations from `groundingMetadata`
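Two of these Gemini-specific details can be sketched directly: the streaming URL carries the key as a query parameter, and tool call IDs are synthesized locally from the tool name and a timestamp (since Gemini does not assign them). The function names are illustrative, not the adapter's real ones.

```kotlin
// Builds the streaming endpoint URL with the key in the query string.
fun buildStreamUrl(apiBaseUrl: String, modelId: String, apiKey: String): String =
    "${apiBaseUrl.trimEnd('/')}/models/$modelId:streamGenerateContent?key=$apiKey&alt=sse"

// Synthesizes a local tool call ID in the call_{toolName}_{timestamp} format.
fun generateToolCallId(toolName: String, timestampMs: Long): String =
    "call_${toolName}_$timestampMs"
```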
## SSE Parser

All providers use a shared `SseParser` that converts an HTTP response body into a `Flow<SseEvent>`:

```
ResponseBody.asSseFlow() -> Flow<SseEvent>
```

The parser:

- Reads the response body line by line on `Dispatchers.IO`
- Recognizes `event:` and `data:` prefixes
- Emits an `SseEvent(type, data)` on each blank line (event boundary)
- Uses `channelFlow` for backpressure-aware streaming
- Flushes remaining data when the stream ends
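The framing logic above can be sketched without the coroutine machinery: lines accumulate into an event until a blank line marks the boundary, and any leftover data is flushed at end of input. The real parser does the same inside `channelFlow` on `Dispatchers.IO`; this sketch works over a plain list of lines.

```kotlin
data class SseEvent(val type: String?, val data: String)

fun parseSseLines(lines: List<String>): List<SseEvent> {
    val events = mutableListOf<SseEvent>()
    var type: String? = null
    val data = StringBuilder()
    fun flush() {
        if (data.isNotEmpty()) events.add(SseEvent(type, data.toString()))
        type = null
        data.clear()
    }
    for (line in lines) {
        when {
            line.isBlank() -> flush()                         // event boundary
            line.startsWith("event:") -> type = line.removePrefix("event:").trim()
            line.startsWith("data:") -> {
                if (data.isNotEmpty()) data.append('\n')       // multi-line data field
                data.append(line.removePrefix("data:").trim())
            }
        }
    }
    flush()  // flush remaining data when the stream ends
    return events
}
```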
## API Message Types

Messages sent to adapters use a sealed class hierarchy:

- `ApiMessage.User(content, attachments)` – User messages, optionally with image/file attachments
- `ApiMessage.Assistant(content, toolCalls)` – AI responses, optionally with tool calls
- `ApiMessage.ToolResult(toolCallId, content)` – Tool execution results

Attachments are modeled as `ApiAttachment(type, mimeType, base64Data, fileName)`.
## DTO Structure

Each provider has its own DTO package for model listing:

```
data/remote/dto/
  openai/    -> OpenAiModelListResponse, OpenAiModelDto
  anthropic/ -> AnthropicModelListResponse, AnthropicModelDto
  gemini/    -> GeminiModelListResponse, GeminiModelDto
```

All DTOs use `@Serializable` from kotlinx.serialization.
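A sketch of the OpenAI pair, with the list response mapped to chat-capable model IDs using the prefix filter from the OpenAI adapter section. Field names follow the shape of the OpenAI `/models` response; the `@Serializable` annotations are omitted here so the sketch compiles without the kotlinx.serialization plugin, and the mapper name is illustrative.

```kotlin
// DTOs for the /models listing (annotations omitted in this sketch).
data class OpenAiModelDto(val id: String, val ownedBy: String? = null)
data class OpenAiModelListResponse(val data: List<OpenAiModelDto>)

// Keep only IDs matching the chat-model prefixes described earlier.
fun chatModelIds(response: OpenAiModelListResponse): List<String> {
    val prefixes = listOf("gpt-", "o1", "o3", "o4", "chatgpt-")
    return response.data.map { it.id }.filter { id -> prefixes.any { id.startsWith(it) } }
}
```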
## Configuring a Provider

1. Go to Settings > Providers
2. Add a provider with name, type, and API base URL
3. Enter the API key (stored in `EncryptedSharedPreferences`, never in Room)
4. Fetch available models from the API
5. Set a default model for conversations

Alternatively, use the `create_provider`, `fetch_models`, and `set_default_model` tools via chat.