# Chat API

AI conversation with streaming responses and agentic tool calling.

## Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/agents/{id}/chat | Get message history |
| POST | /api/agents/{id}/chat | Send message |
| DELETE | /api/agents/{id}/chat | Clear all history |
| GET | /api/agents/{id}/chat/models | List available models |
| GET | /api/agents/{id}/chat/tools | List available tools |
| GET | /api/agents/{id}/chat/conversations | List conversations |
| POST | /api/agents/{id}/chat/conversations | Create conversation |
| DELETE | /api/agents/{id}/chat/conversations/{conversationId} | Delete conversation |
| DELETE | /api/agents/{id}/chat/messages/{messageId} | Delete message |
| POST | /api/agents/{id}/chat/messages/{messageId}/regenerate | Regenerate response |
## Send Message

```http
POST /api/agents/{id}/chat
Content-Type: application/json
```

### Request Body

```json
{
  "content": "What's the weather like?",
  "conversationId": "conv-123",
  "model": "llama-4-scout",
  "enableTools": true,
  "autoRag": true,
  "stream": false
}
```

### Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | The message content |
| conversationId | string | No | Continue an existing conversation |
| model | string | No | Model to use (defaults to the agent's setting) |
| enableTools | boolean | No | Enable tool calling (default: true) |
| autoRag | boolean | No | Enable automatic RAG context (default: true) |
| stream | boolean | No | Enable streaming response (default: false) |
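The defaults in the table can be mirrored client-side. The following is a minimal sketch; `buildChatRequest` is a hypothetical helper, not part of the API:

```javascript
// Build a request body for POST /api/agents/{id}/chat, applying the
// documented defaults. Hypothetical client-side helper, not part of the API.
function buildChatRequest(content, options = {}) {
  if (typeof content !== 'string' || content.length === 0) {
    throw new Error('content is required')
  }
  return {
    content,
    enableTools: options.enableTools ?? true,
    autoRag: options.autoRag ?? true,
    stream: options.stream ?? false,
    // Optional fields are omitted entirely rather than sent as null.
    ...(options.conversationId ? { conversationId: options.conversationId } : {}),
    ...(options.model ? { model: options.model } : {})
  }
}
```

Passing only `content` produces the same behavior as the server-side defaults.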
### Non-Streaming Response

```json
{
  "userMessage": {
    "id": "msg_abc123",
    "role": "user",
    "content": "What's the weather like?",
    "conversationId": "conv-123",
    "createdAt": "2024-12-15T10:00:00Z"
  },
  "assistantMessage": {
    "id": "msg_def456",
    "role": "assistant",
    "content": "I don't have access to real-time weather data...",
    "conversationId": "conv-123",
    "toolCalls": null,
    "toolResults": null,
    "metadata": {
      "model": "llama-4-scout"
    },
    "createdAt": "2024-12-15T10:00:01Z"
  }
}
```

### With Tool Calls
```json
{
  "userMessage": { ... },
  "assistantMessage": {
    "id": "msg_def456",
    "role": "assistant",
    "content": "The weather in Tokyo is currently 22°C and sunny.",
    "toolCalls": [
      {
        "id": "tc_abc123",
        "name": "get_weather",
        "arguments": {"location": "Tokyo"}
      }
    ],
    "toolResults": [
      {
        "toolCallId": "tc_abc123",
        "name": "get_weather",
        "result": {"temperature": 22, "condition": "sunny"}
      }
    ]
  }
}
```

### Streaming Response
For real-time streaming, set `stream: true`:

```http
POST /api/agents/{id}/chat
Content-Type: application/json

{
  "content": "Tell me a story",
  "stream": true
}
```

### SSE Event Types

The streaming response uses Server-Sent Events with the following event types:
#### message_start

Sent when streaming begins.

```
event: message
data: {"type": "message_start", "userMessageId": "msg_abc123", "model": "llama-4-scout"}
```

#### thinking

Sent when the model is processing (once per agentic loop iteration).

```
event: message
data: {"type": "thinking", "iteration": 1}
```

#### tool_call

Sent when a tool is being called.

```
event: message
data: {"type": "tool_call", "id": "tc_abc123", "name": "search_knowledge", "arguments": {"query": "React tutorials"}}
```

#### tool_result

Sent when a tool execution completes.

```
event: message
data: {"type": "tool_result", "toolCallId": "tc_abc123", "name": "search_knowledge", "success": true, "result": [...], "resultPreview": "Found 5 results..."}
```

For errors:

```
event: message
data: {"type": "tool_result", "toolCallId": "tc_abc123", "name": "search_knowledge", "success": false, "error": "Search failed"}
```

#### content_delta

Sent for each chunk of the response text.

```
event: message
data: {"type": "content_delta", "delta": "Once upon a time", "index": 0}
```

#### message_complete

Sent when streaming finishes.

```
event: message
data: {"type": "message_complete", "assistantMessageId": "msg_def456", "toolCalls": [...], "toolResults": [...]}
```

#### error

Sent if an error occurs.

```
event: message
data: {"type": "error", "message": "Model request failed"}
```

### Streaming Example
```javascript
const response = await fetch('/api/agents/{id}/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    content: 'Tell me a story',
    stream: true
  })
})

const reader = response.body.getReader()
const decoder = new TextDecoder()
let buffer = ''

while (true) {
  const { done, value } = await reader.read()
  if (done) break

  buffer += decoder.decode(value, { stream: true })
  const lines = buffer.split('\n')
  buffer = lines.pop() // Keep the incomplete line in the buffer

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6))
      switch (data.type) {
        case 'message_start':
          console.log('Started:', data.model)
          break
        case 'thinking':
          console.log('Thinking... iteration', data.iteration)
          break
        case 'tool_call':
          console.log(`Calling: ${data.name}`)
          break
        case 'tool_result':
          if (data.success) {
            console.log(`Result: ${data.resultPreview}`)
          } else {
            console.log(`Error: ${data.error}`)
          }
          break
        case 'content_delta':
          process.stdout.write(data.delta)
          break
        case 'message_complete':
          console.log('\nDone!')
          break
        case 'error':
          console.error('Error:', data.message)
          break
      }
    }
  }
}
```

### Aborting Streams

Use `AbortController` to cancel streaming:
```javascript
const controller = new AbortController()

const response = await fetch('/api/agents/{id}/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ content: 'Tell me a story', stream: true }),
  signal: controller.signal
})

// Later, to abort:
controller.abort()
```

## Get Messages

```http
GET /api/agents/{id}/chat?limit=50&conversationId=conv-123
```

### Query Parameters
| Parameter | Type | Description |
|---|---|---|
| limit | number | Max messages (default: 50) |
| conversationId | string | Filter by conversation |
| before | string | Messages before this ID |
| after | string | Messages after this ID |
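The query string can be assembled from these parameters with `URLSearchParams`; the `buildMessagesQuery` helper below is hypothetical, not part of the API:

```javascript
// Build the query string for GET /api/agents/{id}/chat from the
// supported pagination parameters. Hypothetical client-side helper.
function buildMessagesQuery({ limit, conversationId, before, after } = {}) {
  const params = new URLSearchParams()
  if (limit !== undefined) params.set('limit', String(limit))
  if (conversationId) params.set('conversationId', conversationId)
  if (before) params.set('before', before)
  if (after) params.set('after', after)
  const qs = params.toString()
  return qs ? `?${qs}` : ''
}
```

For example, `buildMessagesQuery({ limit: 50, conversationId: 'conv-123' })` returns `?limit=50&conversationId=conv-123`.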
### Response

```json
{
  "messages": [
    {
      "id": "msg_abc123",
      "role": "user",
      "content": "Hello!",
      "conversationId": "conv-123",
      "createdAt": "2024-12-15T10:00:00Z"
    },
    {
      "id": "msg_def456",
      "role": "assistant",
      "content": "Hi there! How can I help you?",
      "conversationId": "conv-123",
      "toolCalls": null,
      "toolResults": null,
      "metadata": {
        "model": "llama-4-scout"
      },
      "createdAt": "2024-12-15T10:00:01Z"
    }
  ],
  "meta": {
    "total": 100,
    "hasMore": true
  }
}
```

## Clear History

```http
DELETE /api/agents/{id}/chat
```

### Query Parameters
| Parameter | Type | Description |
|---|---|---|
| conversationId | string | Clear a specific conversation only |
### Response

```json
{
  "success": true,
  "deleted": 150
}
```

## List Available Models

```http
GET /api/agents/{id}/chat/models
```

### Response
```json
{
  "models": [
    {
      "id": "llama-4-scout",
      "name": "Llama 4 Scout",
      "provider": "workers-ai",
      "contextWindow": 131072,
      "supportsTools": true,
      "supportsStreaming": true
    },
    {
      "id": "gpt-4o",
      "name": "GPT-4o",
      "provider": "openai",
      "contextWindow": 128000,
      "supportsTools": true,
      "supportsStreaming": true
    },
    {
      "id": "claude-3-5-sonnet",
      "name": "Claude 3.5 Sonnet",
      "provider": "anthropic",
      "contextWindow": 200000,
      "supportsTools": true,
      "supportsStreaming": true
    }
  ]
}
```

## List Available Tools

```http
GET /api/agents/{id}/chat/tools
```

### Response
```json
{
  "tools": [
    {
      "name": "search_knowledge",
      "description": "Search the knowledge base for relevant information",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "Search query"
          }
        },
        "required": ["query"]
      }
    },
    {
      "name": "browse_url",
      "description": "Fetch and read content from a URL",
      "parameters": {
        "type": "object",
        "properties": {
          "url": {
            "type": "string",
            "description": "URL to browse"
          }
        },
        "required": ["url"]
      }
    },
    {
      "name": "execute_sql",
      "description": "Execute a SQL query against the database",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "SQL query to execute"
          }
        },
        "required": ["query"]
      }
    }
  ]
}
```

## Conversations API

### List Conversations

```http
GET /api/agents/{id}/chat/conversations
```

#### Response
```json
{
  "conversations": [
    {
      "id": "conv-123",
      "title": "Product Questions",
      "messageCount": 15,
      "lastMessageAt": "2024-12-15T10:30:00Z",
      "createdAt": "2024-12-15T10:00:00Z"
    },
    {
      "id": "conv-456",
      "title": "Technical Support",
      "messageCount": 8,
      "lastMessageAt": "2024-12-14T15:20:00Z",
      "createdAt": "2024-12-14T15:00:00Z"
    }
  ]
}
```

### Create Conversation

```http
POST /api/agents/{id}/chat/conversations
Content-Type: application/json

{
  "title": "New Chat"
}
```

#### Response
```json
{
  "id": "conv-789",
  "title": "New Chat",
  "messageCount": 0,
  "createdAt": "2024-12-15T11:00:00Z"
}
```

### Delete Conversation

```http
DELETE /api/agents/{id}/chat/conversations/{conversationId}
```

#### Response
```json
{
  "success": true,
  "deleted": 1
}
```

## Message Management

### Delete Message

```http
DELETE /api/agents/{id}/chat/messages/{messageId}
```

#### Response
```json
{
  "success": true
}
```

### Regenerate Response

Regenerate the assistant's response for a message.

```http
POST /api/agents/{id}/chat/messages/{messageId}/regenerate
Content-Type: application/json

{
  "model": "gpt-4o",
  "enableTools": true,
  "stream": true
}
```

The response format matches the standard message response (streaming or non-streaming, depending on the `stream` parameter).
## Message Format

### ChatMessage

```typescript
interface ChatMessage {
  id: string
  role: "user" | "assistant" | "system" | "tool"
  content: string
  conversationId?: string
  toolCalls?: ToolCall[]
  toolResults?: ToolResult[]
  metadata?: {
    model?: string
    toolCalls?: ToolCall[]
    toolResults?: ToolResult[]
  }
  createdAt: string
}
```

### ToolCall

```typescript
interface ToolCall {
  id: string                          // 9-character alphanumeric ID
  name: string                        // Tool name
  arguments: Record<string, unknown>  // Tool parameters
}
```

### ToolResult

```typescript
interface ToolResult {
  toolCallId: string  // References ToolCall.id
  name: string        // Tool name
  result: unknown     // Tool output
  error?: string      // Error message if failed
}
```

## Errors
| Code | Description |
|---|---|
| 400 | Invalid message or parameters |
| 404 | Conversation or message not found |
| 409 | Conversation locked (concurrent request) |
| 429 | Rate limited |
| 500 | AI service error |
### Error Response

```json
{
  "error": "Conversation is currently being processed",
  "code": "CONVERSATION_LOCKED"
}
```

## Context Window Management
The API automatically manages context:
- Token estimation: ~4 characters = 1 token
- Reserved tokens: ~20K for system prompt, tools, RAG, and response
- Automatic trimming: Older messages removed when approaching limits
- System prompt: Always included in context
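The trimming behavior above can be approximated client-side. This is a sketch of the documented heuristics (4 characters per token, ~20K reserved, oldest messages dropped first, system prompt always kept), not the server's actual implementation:

```javascript
// Approximate the documented context management heuristics.
// Sketch only; the server's real implementation may differ.
const RESERVED_TOKENS = 20000 // system prompt, tools, RAG, and response

function estimateTokens(text) {
  return Math.ceil(text.length / 4) // ~4 characters per token
}

function trimToContext(messages, contextWindow) {
  const budget = contextWindow - RESERVED_TOKENS
  const system = messages.filter(m => m.role === 'system')
  const rest = messages.filter(m => m.role !== 'system')
  let used = system.reduce((n, m) => n + estimateTokens(m.content), 0)
  const kept = []
  // Walk from newest to oldest, keeping messages while they fit.
  for (let i = rest.length - 1; i >= 0; i--) {
    const t = estimateTokens(rest[i].content)
    if (used + t > budget) break
    kept.unshift(rest[i])
    used += t
  }
  // System messages are always included in the context.
  return [...system, ...kept]
}
```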
## Agentic Loop
When tools are enabled, the API uses an agentic loop:
- Maximum 8 tool call iterations per message
- Loop detection prevents repeated calls with same arguments
- Tool results are added to conversation for context
- Large results (like screenshots) are truncated
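The iteration cap and loop detection described above can be sketched as follows. The `createLoopGuard` helper is hypothetical; the server's actual detection logic may differ:

```javascript
// Sketch of the agentic-loop guards: cap iterations at 8, and refuse a
// tool call whose name and arguments were already seen for this message.
// Hypothetical helper; the server's actual logic may differ.
const MAX_ITERATIONS = 8

function createLoopGuard() {
  const seen = new Set()
  let iterations = 0
  return {
    // Returns false when the loop should stop instead of calling the tool.
    allow(name, args) {
      if (++iterations > MAX_ITERATIONS) return false
      const key = `${name}:${JSON.stringify(args)}`
      if (seen.has(key)) return false // repeated call with same arguments
      seen.add(key)
      return true
    }
  }
}
```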
## Examples

### Simple Chat

```bash
curl -X POST https://your-domain.com/api/agents/{id}/chat \
  -H "Content-Type: application/json" \
  -d '{"content": "Hello!"}'
```

### Chat with Model Selection
```bash
curl -X POST https://your-domain.com/api/agents/{id}/chat \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Explain quantum computing",
    "model": "gpt-4o"
  }'
```

### Streaming with Tools
```bash
curl -X POST https://your-domain.com/api/agents/{id}/chat \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Search for recent news about AI",
    "enableTools": true,
    "stream": true
  }'
```

### Continue Conversation
```bash
curl -X POST https://your-domain.com/api/agents/{id}/chat \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Tell me more about that",
    "conversationId": "conv-123"
  }'
```