POST /v1/chat/completions

Secondary endpoint that mirrors the chat-completions convention. Every model and every credit-pool rule from /v1/messages applies — only the request and response shapes differ.

When to use this endpoint

Use /v1/chat/completions when an existing client of yours expects this request and response shape and you want a drop-in target without rewriting the call site. For new code we recommend /v1/messages — it has first-class support for content blocks, tools, thinking, and vision; this endpoint surfaces a subset of those.

Basic call

curl https://caicaini.com/v1/chat/completions \
  -H "Authorization: Bearer cai_api_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "caicaini/auto",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user",   "content": "Two reasons to use TypeScript over JS in 2026?"}
    ],
    "max_tokens": 256
  }'
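
For clients written in Python, the same call can be sketched with only the standard library. This is a minimal illustration rather than an official client: the URL, headers, and body are copied from the curl example above, while the helper names build_request and send are hypothetical and chosen only for this sketch.

```python
import json
import urllib.request

# Endpoint URL taken from the curl example above.
API_URL = "https://caicaini.com/v1/chat/completions"


def build_request(api_key: str, user_prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": "caicaini/auto",
        "messages": [
            {"role": "system", "content": "You are concise."},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send(build_request("cai_api_YOUR_KEY", "Two reasons to use TypeScript over JS in 2026?")) would issue the request. Building and sending are kept separate so the payload can be inspected or logged before any network traffic happens.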

Request fields

Field	Type	Description
model (required)	string	One of the five virtual ids. caicaini/auto is the default for general traffic.
messages (required)	array	Conversation so far, in order. Each item is an object with a role and content, as in the Basic call example.
max_tokens	integer	Hard cap on completion tokens. Highly recommended.
temperature	number 0–2	Default 1.
top_p	number 0–1	Nucleus sampling.
stop	string | string[]	Up to four stop sequences.
stream	boolean	When true, the server streams SSE chunks. See Streaming.
tools	Tool[]	Function definitions in the chat-completions tool format. See Tools.
tool_choice	"none" | "auto" | object	Constrains tool selection.
response_format	{ type: "json_object" }	Forces the model to return valid JSON. Pair with a schema instruction in the system prompt.
user	string	End-user identifier. Forwarded to abuse and reporting tooling.
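
The constraints in the table can be checked on the client before a request is sent. The sketch below is an assumption-laden illustration: check_request is a hypothetical helper, and it assumes the numeric ranges are inclusive at both ends, which the table does not state. The server remains the source of truth for validation.

```python
from typing import Any


def check_request(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems: list[str] = []

    # model and messages are the two required fields in the table.
    for field in ("model", "messages"):
        if field not in payload:
            problems.append(f"{field} is required")
    if "messages" in payload and not isinstance(payload["messages"], list):
        problems.append("messages must be an array")

    # Assumed inclusive ranges: temperature 0 to 2, top_p 0 to 1.
    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    top_p = payload.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        problems.append("top_p must be between 0 and 1")

    # stop may be a single string or a list of up to four strings.
    stop = payload.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most four sequences")

    return problems
```

Returning a list instead of raising on the first failure lets a caller report every problem in one pass.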

Response

response · 200 OK

{
  "id": "chatcmpl_01H8fkx2N3p4q5r6s7t8u9v",
  "object": "chat.completion",
  "created": 1746748800,
  "model": "caicaini/auto",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1) Catch class-of-bug at compile time. 2) Editor knows your shape."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 31,
    "completion_tokens": 22,
    "total_tokens": 53,
    "credits_consumed": 27
  }
}

choices[0].message.content holds the assistant text reply.
choices[0].finish_reason is one of stop, length, tool_calls, content_filter.
usage.credits_consumed is authoritative: it is the same value you would see on the equivalent /v1/messages call.
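
Reading those three fields from a decoded response can be sketched as follows. The Completion class and parse_completion function are hypothetical names for this illustration. Two assumptions are made that this page does not confirm: that a finish_reason of length means the max_tokens cap was reached, and that content may be absent or null when the model returns tool calls.

```python
from dataclasses import dataclass


@dataclass
class Completion:
    text: str
    finish_reason: str
    credits_consumed: int
    truncated: bool


def parse_completion(body: dict) -> Completion:
    """Extract the reply text, finish reason, and credit usage from a response body."""
    choice = body["choices"][0]
    finish_reason = choice["finish_reason"]
    return Completion(
        # Assumption: content can be missing or null on tool_calls responses.
        text=choice["message"].get("content") or "",
        finish_reason=finish_reason,
        credits_consumed=body["usage"]["credits_consumed"],
        # Assumption: "length" signals the reply was cut off by max_tokens.
        truncated=finish_reason == "length",
    )
```

Applied to the sample response above, parse_completion yields finish_reason "stop", credits_consumed 27, and truncated False.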

Streaming on this endpoint

Streaming uses the chat-completions chunk format: an SSE feed of chat.completion.chunk events ending with the literal frame data: [DONE]. Full event reference and parsing examples are on the Streaming page.
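
A rough sketch of consuming that feed, given the SSE lines already decoded to text, is shown below. The function name iter_text_deltas is hypothetical, and the chunk layout it reads (text under choices[].delta.content) is an assumption based on the usual chat-completions convention; the Streaming page is the reference for the actual event fields.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the data: [DONE] frame."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # literal terminator frame; nothing after it is read
        chunk = json.loads(data)
        # Assumed chunk shape: {"choices": [{"delta": {"content": "..."}}]}
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments with "".join(...) reconstructs the full reply. The generator accepts any iterable of lines, so it can be fed from an HTTP response object or from a recorded fixture in tests.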
