Caicaini
Errors

All non-2xx responses share one envelope. The HTTP status and the error.type field together tell you what to do — retry, back off, fix the input, or stop.

Error envelope

Every error response is JSON with the same shape. The type at the top is always the literal string "error"; the meaningful part is error.type.

error response
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens must be a positive integer."
  }
}
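
As a quick illustration of reading the envelope, the sketch below pulls error.type and error.message out of a response body. It assumes jq is installed; the function name error_summary is hypothetical and not part of the API.

```shell
#!/bin/sh
# Hypothetical helper: read a response body on stdin and print
# "<error.type>: <error.message>" when the body is an error envelope.
# Prints nothing for bodies whose top-level type is not "error".
error_summary() {
  jq -r 'select(.type == "error") | "\(.error.type): \(.error.message)"'
}
```

Given a saved body, `error_summary < /tmp/body` prints one line for an error and nothing for a success, so the output can be tested with `[ -n "$(...)" ]`.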

Status codes

HTTP  type                   When you see it
400   invalid_request_error  The request body is malformed or violates a constraint (unknown model, max_tokens too high, mis-shaped messages array). Do not retry; fix the input.
401   authentication_error   Missing, malformed, or invalid API key. Do not retry as-is; re-authenticate.
402   insufficient_quota     Your apiCredits balance is below what this turn requires. Hard stop. Top up at /developers/billing.
403   permission_error       The key authenticated, but is not allowed for this surface (CLI key on /v1/*, suspended account, scope mismatch).
404   not_found_error        The path or resource does not exist. Check the URL and the model id.
413   request_too_large      The body exceeded the 16 MB request cap. Split the payload or downscale images.
429   rate_limit_error       Per-key rate limit hit. Honor Retry-After. Does not consume credits.
500   api_error              Internal failure on our side. Retry with backoff. Idempotent.
502   api_error              Upstream provider returned an error. Retry with backoff.
503   api_error              Service temporarily unavailable (deploy, maintenance). Retry with backoff.
504   api_error              Provider timed out. Retry with backoff. Consider lowering max_tokens.
529   overloaded_error       Capacity-constrained on the upstream model. Retry with backoff. Often clears in a few seconds.

How to handle each class

  • Retry-safe: 429, 500, 502, 503, 504, 529. Use exponential backoff with jitter. Honor Retry-After when present. Cap at 4–6 attempts.
  • Hard stop: 402 (insufficient_quota) and 401/403 (authentication_error, permission_error). No amount of retries will help. Surface to a human or alerting channel.
  • Fix-the-input: 400 (invalid_request_error) and 413 (request_too_large). These indicate a bug on the caller side; fix it and validate requests before sending.
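
The retry-safe rule above can be sketched as a small delay calculator. This is a minimal sketch, not the service's prescribed algorithm: backoff_delay, BASE_DELAY, and MAX_DELAY are hypothetical names and values, and it assumes Retry-After arrives as a whole number of seconds (an HTTP-date value is ignored and falls through to backoff).

```shell
#!/bin/sh
# Assumed tuning knobs, not documented limits.
BASE_DELAY=1    # ceiling in seconds for the first retry
MAX_DELAY=30    # upper bound on any single computed wait

# backoff_delay ATTEMPT [RETRY_AFTER]
# Prints the number of seconds to wait before retry number ATTEMPT (1-based).
# The body runs in a subshell so its variables do not leak to the caller.
backoff_delay() (
  attempt=$1
  retry_after=${2:-}
  case $retry_after in
    ''|*[!0-9]*) ;;                        # absent, or not plain seconds
    *) echo "$retry_after"; exit 0 ;;      # honor the server's hint as-is
  esac
  # Exponential ceiling: BASE_DELAY * 2^(ATTEMPT-1), capped at MAX_DELAY.
  ceiling=$BASE_DELAY
  i=1
  while [ "$i" -lt "$attempt" ] && [ "$ceiling" -lt "$MAX_DELAY" ]; do
    ceiling=$(( ceiling * 2 ))
    i=$(( i + 1 ))
  done
  if [ "$ceiling" -gt "$MAX_DELAY" ]; then ceiling=$MAX_DELAY; fi
  # Jitter: a whole number of seconds in [1, ceiling], seeded from the
  # clock plus the shell's PID so concurrent clients tend to differ.
  awk -v max="$ceiling" -v seed="$(( $(date +%s) + $$ ))" \
    'BEGIN { srand(seed); print int(rand() * max) + 1 }'
)

# Show the wait chosen before each of five retries.
for attempt in 1 2 3 4 5; do
  echo "retry $attempt: wait $(backoff_delay "$attempt")s"
done
```

Passing the header value as the second argument, for example `backoff_delay 2 "$retry_after"`, returns the server's number unchanged. The jitter spreads retries from many clients so they do not all return at the same instant.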

Message ids

Every successful response carries an id field (e.g. msg_01H8fkx2N3p4q5r6s7t8u9v0wx on /v1/messages, chatcmpl_... on /v1/chat/completions). Save the id on every call you make. When you open a support ticket, the id is the fastest way for us to trace the full lifecycle of the request: which provider was selected, what was reserved, and what was actually billed.
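
One way to save ids is to append them to a local log as each response arrives. The sketch below assumes jq is installed; log_message_id and the CAI_ID_LOG variable are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a response body on stdin and append
# "<UTC timestamp> <id>" to a log file. Bodies without an id field
# (for example error envelopes) are skipped silently.
# CAI_ID_LOG is an assumed variable; it defaults to ./caicaini_ids.log.
log_message_id() {
  id=$(jq -r '.id // empty') || return 1
  [ -n "$id" ] || return 0
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$id" \
    >> "${CAI_ID_LOG:-./caicaini_ids.log}"
}
```

Calling `log_message_id < /tmp/body` after each request leaves a file that can be searched by time when a ticket needs an id.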

A small reusable handler

Wrap every call site in a function that classifies the response into one of the three buckets above. The example below distinguishes insufficient_quota (don't retry), rate_limit_error (retry), and transient provider errors (retry). The caller decides the retry strategy.

# Send one request, then inspect the status and error.type to decide what to do.
body=$(mktemp)                  # private temp file instead of a fixed /tmp path
trap 'rm -f "$body"' EXIT

status=$(curl -s -o "$body" -w "%{http_code}" \
  -X POST https://caicaini.com/v1/messages \
  -H "Authorization: Bearer cai_api_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"caicaini/auto","max_tokens":50,"messages":[{"role":"user","content":"Hi"}]}')

# error.type exists only on error envelopes; it is empty on success.
errtype=$(jq -r '.error.type // empty' "$body" 2>/dev/null)
echo "status=$status type=$errtype"

case "$status$errtype" in
  2[0-9][0-9]*)                          echo "ok" ;;
  402*|*insufficient_quota)              echo "top up at /developers/billing"; exit 2 ;;
  401*|403*)                             echo "hard stop: check the key and its permissions"; exit 2 ;;
  429*|*rate_limit_error)                echo "throttled: sleep, then retry" ;;
  500*|502*|503*|504*|529*)              echo "transient: retry with backoff" ;;
  *)                                     echo "fatal: fix the request"; exit 1 ;;
esac
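
The snippet above handles one response inline. A function form that call sites can share might look like the following sketch. The names classify_status, call_with_retries, MAX_ATTEMPTS, and SLEEP_CMD are hypothetical. The status-to-bucket mapping follows the three classes described earlier; treating every unlisted status as fix-the-input is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status to a bucket.
# Prints one of: ok, retry, stop, fix.
classify_status() {
  case $1 in
    2??)                     echo ok ;;
    429|500|502|503|504|529) echo retry ;;
    401|402|403)             echo stop ;;
    *)                       echo fix ;;    # assumption: anything else is caller-side
  esac
}

# call_with_retries COMMAND [ARGS...]
# COMMAND performs one request and prints only the HTTP status on stdout
# (for example a curl call using -w "%{http_code}").
# Retries while the bucket is "retry", up to MAX_ATTEMPTS (default 5).
# Prints "<status> <bucket>" for the final attempt; succeeds only on ok.
# The body runs in a subshell so its variables do not leak to the caller.
call_with_retries() (
  attempt=1
  while :; do
    status=$("$@")
    bucket=$(classify_status "$status")
    [ "$bucket" = retry ] || break
    [ "$attempt" -lt "${MAX_ATTEMPTS:-5}" ] || break
    # Plain exponential wait of 1s, 2s, 4s, ...; real callers should add
    # jitter and honor Retry-After when the response carries it.
    ${SLEEP_CMD:-sleep} $(( 1 << (attempt - 1) ))
    attempt=$(( attempt + 1 ))
  done
  echo "$status $bucket"
  [ "$bucket" = ok ]
)
```

A call site would define a small function that runs curl and prints the status, then invoke `call_with_retries that_function`. Keeping classification separate from the loop lets the caller swap in a different retry policy without touching the status mapping.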