Reference
Using HTTP clients
The Caicaini API is plain HTTP and JSON. There is no proprietary client to install. Use whatever HTTP library you already trust, or write a small wrapper around the patterns below.
Why no first-party SDK
The HTTP surface is small (six endpoints) and the request and response shapes are the industry-standard Messages API and chat-completions conventions. Wrapping them in a vendor SDK adds a dependency and an API of its own without saving you much work. We will publish a thin first-party client when it earns its keep.
In the meantime: any third-party client library that targets one of the conventions Caicaini implements will work as long as you can override its base URL.
Environment variables
Pull the key and the base URL from environment variables, never literals.
# Set once, reuse everywhere.
export CAICAINI_API_KEY="cai_api_..."
export CAICAINI_BASE="https://caicaini.com/v1"
curl "$CAICAINI_BASE/messages" \
-H "Authorization: Bearer $CAICAINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"caicaini/auto","max_tokens":256,"messages":[{"role":"user","content":"Hello!"}]}'
A thin client per language
Below is roughly 30 lines of code per language: a constructor, a non-stream helper, and a stream iterator. Copy, paste, modify. There is no magic.
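As one illustration of that shape, here is a minimal Python sketch. It uses only the standard library so it runs without installing anything; the same structure carries over to httpx or any other HTTP library. The request body and headers mirror the curl example above. The class name Caicaini, the helper parse_sse, the "stream": true request flag, and the server-sent-events response format (JSON on "data:" lines, with an optional "[DONE]" sentinel) are assumptions made for this sketch, not documented behavior.

```python
import json
import os
import urllib.request
from typing import Any, Dict, Iterable, Iterator, Optional


class Caicaini:
    """Hypothetical thin client: constructor, non-stream helper, stream iterator."""

    def __init__(self, api_key: Optional[str] = None, base: Optional[str] = None,
                 timeout: float = 120.0) -> None:
        # Key and base URL come from the environment unless passed explicitly.
        self.api_key = api_key or os.environ["CAICAINI_API_KEY"]
        self.base = (base or os.environ.get("CAICAINI_BASE", "https://caicaini.com/v1")).rstrip("/")
        self.timeout = timeout

    def build_request(self, path: str, body: Dict[str, Any]) -> urllib.request.Request:
        """Build the POST request without sending it (handy for inspection and tests)."""
        return urllib.request.Request(
            self.base + path,
            data=json.dumps(body).encode("utf-8"),
            headers={
                "Authorization": "Bearer " + self.api_key,
                "Content-Type": "application/json",
            },
            method="POST",
        )

    def messages(self, **body: Any) -> Dict[str, Any]:
        """Non-streaming call: send the body, return the parsed JSON response."""
        request = self.build_request("/messages", body)
        with urllib.request.urlopen(request, timeout=self.timeout) as response:
            return json.load(response)

    def stream(self, **body: Any) -> Iterator[Dict[str, Any]]:
        """Streaming call. ASSUMPTION: a "stream" flag and an SSE response body."""
        request = self.build_request("/messages", dict(body, stream=True))
        with urllib.request.urlopen(request, timeout=300.0) as response:
            # The response object iterates over raw lines as bytes.
            yield from parse_sse(response)


def parse_sse(lines: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON payload of every "data:" line; skip everything else."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # blank separators, "event:" lines, comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # ASSUMPTION: chat-completions style end marker
            continue
        yield json.loads(payload)


# Usage (performs a real network call, so it is left as a comment):
#   client = Caicaini()
#   reply = client.messages(model="caicaini/auto", max_tokens=256,
#                           messages=[{"role": "user", "content": "Hello!"}])
```

Keeping build_request separate from the two senders means the URL, headers, and body can be checked without touching the network.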
Recommendations
- Use plain fetch on Node 20+, browsers, Cloudflare Workers, Deno, Bun. It is built in, supports streaming, and has the smallest footprint.
- Use httpx in Python. It supports both sync and async, has streaming context managers, and handles connection pooling well for high-concurrency workloads.
- Connection-pool across calls. The TLS handshake dominates latency on cold connections, so keep a single client instance per process.
- Set generous timeouts: 120 s for non-streaming calls, 300 s+ for long streams. The model takes time, especially with extended thinking.
- Log the response id on every call. It is the fastest identifier for us to trace a request when you open a support ticket.
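The connection, timeout, and logging recommendations can be combined in one small helper. The sketch below is a standard-library stand-in for what an httpx client gives you out of the box: it keeps one HTTPS connection open per process, applies the 120 s default, and logs the response id. The names KeepAliveClient, shared_client, and log_response_id are hypothetical, and the sketch assumes the id is a top-level "id" field in the JSON response body.

```python
import http.client
import json
import logging
from functools import lru_cache
from typing import Any, Dict
from urllib.parse import urlsplit

log = logging.getLogger("caicaini")

NON_STREAM_TIMEOUT_S = 120.0  # non-streaming calls
STREAM_TIMEOUT_S = 300.0      # long streams; raise it if your streams run longer


def log_response_id(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Log the response id (ASSUMPTION: top-level "id" field) and pass the payload on."""
    log.info("caicaini response id=%s", payload.get("id"))
    return payload


class KeepAliveClient:
    """Hypothetical helper: one HTTPS connection reused across calls."""

    def __init__(self, base: str, api_key: str,
                 timeout: float = NON_STREAM_TIMEOUT_S) -> None:
        parts = urlsplit(base)
        self.prefix = parts.path.rstrip("/")
        self.headers = {
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        }
        # No socket is opened here. http.client connects on the first request
        # and keeps the connection for later ones, so the TLS handshake is
        # paid once rather than on every call.
        self.conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)

    def post(self, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
        self.conn.request("POST", self.prefix + path,
                          body=json.dumps(body), headers=self.headers)
        response = self.conn.getresponse()
        raw = response.read()  # read fully so the connection can be reused
        if response.status >= 400:
            raise RuntimeError("HTTP %d: %r" % (response.status, raw[:200]))
        return log_response_id(json.loads(raw))


@lru_cache(maxsize=None)
def shared_client(base: str, api_key: str) -> KeepAliveClient:
    """Return the same client for the same (base, key) pair for the process lifetime."""
    return KeepAliveClient(base, api_key)
```

A single http.client connection is not safe to share between threads and does not retry when the server drops an idle connection. For concurrent workloads a pooling library such as httpx is the better fit; the sketch only shows the idea.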
Browsers
Do not call the API directly from a browser with a real key. Anything shipped to the page is readable by every visitor, and every API key authenticates as your account. Proxy through a server you control: a simple Next.js Route Handler, a Cloudflare Worker, or an Express endpoint. Anything that keeps the key server-side works.
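To show how little such a proxy needs, here is a minimal sketch using only the Python standard library. The browser posts its JSON body to a local route, and the server attaches the key from its own environment before forwarding to the messages endpoint. The route /api/chat, the port, and the names build_upstream_request, ProxyHandler, and serve are hypothetical choices for this sketch.

```python
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

DEFAULT_BASE = "https://caicaini.com/v1"


def build_upstream_request(body: bytes, api_key: str,
                           base: str = DEFAULT_BASE) -> urllib.request.Request:
    """Build the server-to-API request. Only the body comes from the browser."""
    return urllib.request.Request(
        base.rstrip("/") + "/messages",
        data=body,
        headers={
            "Authorization": "Bearer " + api_key,  # added server-side only
            "Content-Type": "application/json",
        },
        method="POST",
    )


class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/api/chat":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        request = build_upstream_request(
            body,
            os.environ["CAICAINI_API_KEY"],
            os.environ.get("CAICAINI_BASE", DEFAULT_BASE),
        )
        try:
            with urllib.request.urlopen(request, timeout=120) as upstream:
                status, payload = upstream.status, upstream.read()
        except urllib.error.HTTPError as err:
            # Pass API errors through to the browser with their status code.
            status, payload = err.code, err.read()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Run the proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

Calling serve() starts the proxy. A production version would also authenticate its own users and validate or cap the forwarded body (for example model and max_tokens), since every request it forwards is billed to your account.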