Endpoints

POST /v1/chat/completions

Secondary endpoint that mirrors the chat-completions convention. Every model and every credit-pool rule from /v1/messages applies — only the request and response shapes differ.

When to use this endpoint

Use /v1/chat/completions when an existing client of yours expects this request and response shape and you want a drop-in target without rewriting the call site. For new code we recommend /v1/messages — it has first-class support for content blocks, tools, thinking, and vision; this endpoint surfaces a subset of those.

Basic call

curl https://caicaini.com/v1/chat/completions \
  -H "Authorization: Bearer cai_api_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "caicaini/auto",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user",   "content": "Two reasons to use TypeScript over JS in 2026?"}
    ],
    "max_tokens": 256
  }'
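
For callers that prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal sketch rather than an official client: the helper names `build_chat_request` and `send` are hypothetical, and only the URL, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

# URL taken from the curl example above.
API_URL = "https://caicaini.com/v1/chat/completions"


def build_chat_request(api_key, messages, model="caicaini/auto", max_tokens=256):
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    body = json.dumps({
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network. A call would look like `send(build_chat_request("cai_api_YOUR_KEY", [{"role": "user", "content": "Hello"}]))`.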

Request fields

Field                  Type                        Description
model (required)       string                      One of the five virtual ids. caicaini/auto is the default for general traffic.
messages (required)    array                       Conversation messages in order. Each item is an object with role and content, as in the Basic call example.
max_tokens             integer                     Hard cap on completion tokens. Highly recommended.
temperature            number, 0–2                 Default 1.
top_p                  number, 0–1                 Nucleus sampling.
stop                   string | string[]           Up to four stop sequences.
stream                 boolean                     When true, the server streams SSE chunks. See Streaming.
tools                  Tool[]                      Function definitions in the chat-completions tool format. See Tools.
tool_choice            "none" | "auto" | object    Constrains tool selection.
response_format        { type: "json_object" }     Forces the model to return valid JSON. Pair with a schema instruction in the system prompt.
user                   string                      End-user identifier. Forwarded to abuse and reporting tooling.
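
The documented constraints (required fields, the temperature and top_p ranges, the four-sequence limit on stop) can be checked on the client before a request is sent. The sketch below is one possible pre-flight check; `validate_payload` and its messages are hypothetical, and the server's own validation and error format are not described on this page.

```python
def validate_payload(payload):
    """Return a list of problems found against the documented field constraints.

    Hypothetical client-side helper; it only covers rules stated in the table.
    """
    problems = []

    # model and messages are the two required fields.
    for field in ("model", "messages"):
        if field not in payload:
            problems.append(f"{field} is required")
    if "messages" in payload and not isinstance(payload["messages"], list):
        problems.append("messages must be an array")

    # temperature is documented as a number in the range 0 to 2.
    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")

    # top_p is documented as a number in the range 0 to 1.
    top_p = payload.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        problems.append("top_p must be between 0 and 1")

    # stop accepts a single string or a list of up to four strings.
    stop = payload.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most four sequences")

    return problems
```

An empty list means the payload passed every check covered here; it does not guarantee the server will accept the request.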

Response

response · 200 OK
{
  "id": "chatcmpl_01H8fkx2N3p4q5r6s7t8u9v",
  "object": "chat.completion",
  "created": 1746748800,
  "model": "caicaini/auto",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1) Catch class-of-bug at compile time. 2) Editor knows your shape."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 31,
    "completion_tokens": 22,
    "total_tokens": 53,
    "credits_consumed": 27
  }
}
  • choices[0].message.content holds the assistant text reply.
  • choices[0].finish_reason is one of stop, length, tool_calls, content_filter.
  • usage.credits_consumed is authoritative — same value you would see on the equivalent /v1/messages call.
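
The fields listed above can be pulled out of a decoded response as follows. This is a sketch with hypothetical names (`Reply`, `summarize`); it reads only the keys shown in the sample response, and it treats content as optional on the assumption that a tool_calls reply may carry no text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    text: Optional[str]      # choices[0].message.content
    finish_reason: str       # stop, length, tool_calls, or content_filter
    credits_consumed: int    # usage.credits_consumed
    truncated: bool          # True when the max_tokens cap cut the reply short


def summarize(response):
    """Extract the reply text, finish reason, and credit cost from a response dict."""
    choice = response["choices"][0]
    finish_reason = choice["finish_reason"]
    return Reply(
        # Assumption: content may be absent or null when the model calls a tool.
        text=choice["message"].get("content"),
        finish_reason=finish_reason,
        credits_consumed=response["usage"]["credits_consumed"],
        truncated=finish_reason == "length",
    )
```

Checking `truncated` is a cheap way to notice that max_tokens was too low for the prompt, since a finish_reason of length means the completion hit the cap.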

Streaming on this endpoint

Streaming uses the chat-completions chunk format: an SSE feed of chat.completion.chunk events ending with the literal frame data: [DONE]. Full event reference and parsing examples are on the Streaming page.
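
As a rough illustration of consuming that feed, the sketch below walks SSE lines, stops at the data: [DONE] frame, and yields text fragments. The function name `iter_text_deltas` is hypothetical, and the choices[].delta.content path is an assumption based on the usual chat-completions chunk convention; the Streaming page remains the reference for the actual event shape.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from an iterable of SSE lines.

    Hypothetical helper. Assumes each chunk carries its text at
    choices[].delta.content, which this page does not confirm.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # literal end-of-stream frame
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text = delta.get("content")
            if text:
                yield text
```

Joining the yielded fragments with `"".join(...)` reconstructs the full reply text under the same assumption about the chunk shape.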