Using HTTP clients

The Caicaini API is plain HTTP and JSON. There is no proprietary client to install. Use whatever HTTP library you already trust, or write a small wrapper around the patterns below.

Why no first-party SDK

The HTTP surface is small (six endpoints) and the request and response shapes are the industry-standard Messages API and chat-completions conventions. Wrapping them in a vendor SDK adds a dependency and an API of its own without saving you much work. We will publish a thin first-party client when it earns its keep.

In the meantime: any third-party client library that targets one of the conventions Caicaini implements will work as long as you can override its base URL.

Environment variables

Read the key and the base URL from environment variables; never hard-code them as literals in source.

shell
# Set once, reuse everywhere.
export CAICAINI_API_KEY="cai_api_..."
export CAICAINI_BASE="https://caicaini.com/v1"

curl "$CAICAINI_BASE/messages" \
  -H "Authorization: Bearer $CAICAINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"caicaini/auto","max_tokens":256,"messages":[{"role":"user","content":"Hello!"}]}'
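
The same two variables can be read from application code. The following is a minimal Python sketch, not an official helper: `CaicainiConfig` and `load_config` are hypothetical names, and the only things taken from this page are the variable names `CAICAINI_API_KEY` and `CAICAINI_BASE`.

```python
import os
from typing import Mapping, NamedTuple


class CaicainiConfig(NamedTuple):
    """Hypothetical container for the two settings this page uses."""
    api_key: str
    base_url: str


def load_config(env: Mapping[str, str] = os.environ) -> CaicainiConfig:
    """Read the key and base URL from the environment, failing fast if unset."""
    required = ("CAICAINI_API_KEY", "CAICAINI_BASE")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so base_url + "/messages" never yields "//messages".
    return CaicainiConfig(env["CAICAINI_API_KEY"], env["CAICAINI_BASE"].rstrip("/"))
```

Failing at startup with a named variable tends to be easier to debug than a 401 on the first request. Passing `env` explicitly also lets tests supply a plain dict instead of mutating the process environment.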

A thin client per language

Below is roughly 30 lines of code per language: a constructor, a non-stream helper, and a stream iterator. Copy, paste, modify. There is no magic.

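As an illustration of the three pieces named above (a constructor, a non-stream helper, and a stream iterator), here is a Python sketch that uses only the standard library so it runs without extra dependencies. Several details are assumptions rather than documented behavior: the class name `Caicaini`, the `stream` request flag, and the server-sent-events framing (`data: {json}` lines, with an optional `[DONE]` sentinel) are borrowed from the common Messages and chat-completions conventions and should be checked against the API reference.

```python
import json
import os
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional


class Caicaini:
    """Minimal client sketch: constructor, non-stream helper, stream iterator."""

    def __init__(self, api_key: Optional[str] = None, base_url: Optional[str] = None,
                 opener: Callable[..., Any] = urllib.request.urlopen) -> None:
        self.api_key = api_key or os.environ["CAICAINI_API_KEY"]
        self.base_url = (base_url or os.environ["CAICAINI_BASE"]).rstrip("/")
        self._open = opener  # injectable so the client can be exercised offline

    def _request(self, payload: Dict[str, Any]) -> urllib.request.Request:
        # Same URL and headers as the curl example on this page.
        return urllib.request.Request(
            f"{self.base_url}/messages",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Authorization": f"Bearer {self.api_key}",
                     "Content-Type": "application/json"},
            method="POST",
        )

    def messages(self, timeout: float = 120.0, **payload: Any) -> Dict[str, Any]:
        """Send one non-streaming request and return the decoded JSON body."""
        with self._open(self._request(payload), timeout=timeout) as resp:
            return json.loads(resp.read())

    def stream(self, timeout: float = 300.0, **payload: Any) -> Iterator[Dict[str, Any]]:
        """Yield decoded events, assuming SSE framing with `data: {json}` lines."""
        payload["stream"] = True  # assumed flag name, not confirmed by this page
        with self._open(self._request(payload), timeout=timeout) as resp:
            for raw in resp:
                line = raw.decode("utf-8").strip()
                if not line.startswith("data:"):
                    continue  # skip blank separators and `event:` lines
                data = line[len("data:"):].strip()
                if data and data != "[DONE]":
                    yield json.loads(data)
```

A call would look like `Caicaini().messages(model="caicaini/auto", max_tokens=256, messages=[{"role": "user", "content": "Hello!"}])`. Note that `urllib` opens a new connection per request; swapping the opener for a shared `httpx.Client` is the natural next step if connection reuse matters.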

Recommendations

  • Use plain fetch on Node 20+, browsers, Cloudflare Workers, Deno, Bun. It is built in, supports streaming, and has the smallest footprint.
  • Use httpx in Python. It supports both sync and async, has streaming context managers, and handles connection pooling well for high-concurrency workloads.
  • Reuse connections across calls. The TLS handshake dominates latency on cold connections, so keep a single client instance per process and let it pool connections.
  • Set generous timeouts: 120 s for non-streaming calls, 300 s+ for long streams. The model takes time, especially with extended thinking.
  • Log the response id on every call. It is the fastest identifier for us to trace a request when you open a support ticket.
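
The timeout and logging recommendations above can be captured in two small helpers. This is a sketch under assumptions: the function names are hypothetical, and the response id is assumed to be a top-level `id` field in the JSON body, which this page does not spell out.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger("caicaini")

# Timeouts in seconds, taken from the recommendations above.
NON_STREAM_TIMEOUT = 120.0
STREAM_TIMEOUT = 300.0


def timeout_for(stream: bool) -> float:
    """Pick the timeout for a call based on whether it streams."""
    return STREAM_TIMEOUT if stream else NON_STREAM_TIMEOUT


def log_response_id(body: Mapping[str, Any]) -> Optional[str]:
    """Log and return the response id (assumed to be a top-level "id" field)."""
    response_id = body.get("id")
    if response_id is None:
        logger.warning("caicaini response had no id field")
    else:
        logger.info("caicaini response id=%s", response_id)
    return response_id
```

Calling `log_response_id` on every decoded response keeps the identifier in your logs, so it is already at hand when a support ticket needs it.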

Browsers

Do not call the API from a browser with a real API key. Every API key authenticates as your account, and anything shipped to the browser is readable by whoever loads the page. Proxy through a server you control instead: a simple Next.js Route Handler, a Cloudflare Worker, an Express endpoint, or anything else that keeps the key server-side.
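
To make the proxy pattern concrete, here is a minimal sketch using Python's standard-library HTTP server. It is illustrative only: the route `/api/chat`, the port, and the names `build_upstream_request`, `ProxyHandler`, and `serve` are all hypothetical, and a real deployment would also authenticate its own users, validate the request body, and apply rate limits before forwarding anything.

```python
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_upstream_request(body: bytes, api_key: str, base_url: str) -> urllib.request.Request:
    """Attach the server-side key to a request body received from the browser."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/messages",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/api/chat":  # hypothetical route exposed to the browser
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # The key is read on the server and never sent to the browser.
        upstream = build_upstream_request(
            body, os.environ["CAICAINI_API_KEY"], os.environ["CAICAINI_BASE"])
        try:
            with urllib.request.urlopen(upstream, timeout=120) as resp:
                status, payload = resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # Relay upstream error statuses instead of crashing the handler.
            status, payload = err.code, err.read()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8787) -> None:
    """Run the proxy on localhost; call this from your own entry point."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

The browser then posts to `/api/chat` on your server and never sees the `Authorization` header. This sketch buffers the whole upstream response, so it does not relay streams incrementally.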