Endpoints
POST /v1/chat/completions
Secondary endpoint that mirrors the chat-completions convention. Every model and every credit-pool rule from /v1/messages applies — only the request and response shapes differ.
When to use this endpoint
Use /v1/chat/completions when an existing client of yours expects this request and response shape and you want a drop-in target without rewriting the call site. For new code we recommend /v1/messages — it has first-class support for content blocks, tools, thinking, and vision; this endpoint surfaces a subset of those.
Basic call
curl https://caicaini.com/v1/chat/completions \
  -H "Authorization: Bearer cai_api_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "caicaini/auto",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "Two reasons to use TypeScript over JS in 2026?"}
    ],
    "max_tokens": 256
  }'

Request fields
| Field | Type | Description |
|---|---|---|
| model (required) | string | One of the five virtual ids. caicaini/auto is the default for general traffic. |
| messages (required) | array | Conversation messages in order, each an object with a role and content, as in the basic call above. |
| max_tokens | integer | Hard cap on completion tokens. Highly recommended. |
| temperature | number 0–2 | Default 1. |
| top_p | number 0–1 | Nucleus sampling. |
| stop | string \| string[] | Up to four stop sequences. |
| stream | boolean | When true, the server streams SSE chunks. See Streaming. |
| tools | Tool[] | Function definitions in the chat-completions tool format. See Tools. |
| tool_choice | "none" \| "auto" \| object | Constrains tool selection. |
| response_format | { type: "json_object" } | Forces the model to return valid JSON. Pair with a schema instruction in the system prompt. |
| user | string | End-user identifier. Forwarded to abuse and reporting tooling. |
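
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch in Python using only the standard library. `build_chat_request` and its `json_mode` flag are hypothetical names chosen for illustration, not part of any official SDK, and the checks cover only the ranges stated in the table above.

```python
from typing import Optional, Sequence, Union


def build_chat_request(
    messages: Sequence[dict],
    model: str = "caicaini/auto",
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stop: Union[str, Sequence[str], None] = None,
    json_mode: bool = False,
) -> dict:
    """Hypothetical helper: assemble a request body and enforce the documented limits."""
    # messages is required; treating an empty list as missing is an assumption.
    if not messages:
        raise ValueError("messages is required")
    body = {"model": model, "messages": list(messages)}

    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        body["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("top_p must be between 0 and 1")
        body["top_p"] = top_p
    if stop is not None:
        # stop accepts a single string or a list of up to four strings.
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > 4:
            raise ValueError("at most four stop sequences are allowed")
        body["stop"] = stop if isinstance(stop, str) else sequences
    if json_mode:
        # Pair this with a schema instruction in the system prompt.
        body["response_format"] = {"type": "json_object"}
    return body
```

Serializing the returned dict with `json.dumps` gives a body equivalent to the `-d` payload in the basic call.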
Response
response · 200 OK
{
  "id": "chatcmpl_01H8fkx2N3p4q5r6s7t8u9v",
  "object": "chat.completion",
  "created": 1746748800,
  "model": "caicaini/auto",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1) Catch class-of-bug at compile time. 2) Editor knows your shape."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 31,
    "completion_tokens": 22,
    "total_tokens": 53,
    "credits_consumed": 27
  }
}

choices[0].message.content holds the assistant text reply.
choices[0].finish_reason is one of stop, length, tool_calls, content_filter.
usage.credits_consumed is authoritative: it is the same value you would see on the equivalent /v1/messages call.
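
Reading those three fields out of a response body might look like the sketch below. It assumes Python with the standard library only. `Completion` and `parse_completion` are hypothetical names, and allowing `content` to be absent (for example on a tool_calls finish) is an assumption drawn from the chat-completions convention rather than something this page states.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The four finish_reason values listed for this endpoint.
KNOWN_FINISH_REASONS = {"stop", "length", "tool_calls", "content_filter"}


@dataclass
class Completion:
    text: Optional[str]
    finish_reason: str
    truncated: bool
    credits_consumed: int


def parse_completion(raw: str) -> Completion:
    """Hypothetical helper: pull the reply, finish reason, and credits from a 200 response."""
    data = json.loads(raw)
    choice = data["choices"][0]
    reason = choice["finish_reason"]
    if reason not in KNOWN_FINISH_REASONS:
        raise ValueError(f"unexpected finish_reason: {reason!r}")
    return Completion(
        # ASSUMPTION: content may be missing or null when the model calls a tool.
        text=choice["message"].get("content"),
        finish_reason=reason,
        # "length" means the max_tokens cap cut the reply short.
        truncated=(reason == "length"),
        credits_consumed=data["usage"]["credits_consumed"],
    )
```

Checking `truncated` before using `text` is a cheap way to notice that max_tokens was set too low for the prompt.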
Streaming on this endpoint
Streaming uses the chat-completions chunk format: an SSE feed of chat.completion.chunk events ending with the literal frame data: [DONE]. Full event reference and parsing examples are on the Streaming page.
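
As a rough illustration of consuming that feed, the sketch below reads already-decoded SSE lines and stops at the data: [DONE] frame. It assumes Python with the standard library only. `iter_stream_text` is a hypothetical name, and the location of the incremental text at choices[i].delta.content is an assumption based on the usual chat-completions chunk convention; the Streaming page remains the authoritative reference for the event shape.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: yield text fragments from chat.completion.chunk events."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # literal end-of-stream frame
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # ASSUMPTION: incremental text lives under delta.content.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments with `"".join(...)` reconstructs the full reply once the stream ends.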