API Documentation
ApiTopMix is a unified AI model aggregation gateway. Access GPT, Claude, Gemini, DeepSeek, and more through a single, OpenAI-compatible API.
Getting Started
Welcome to the ApiTopMix API. Our gateway provides a single, unified interface to interact with the world's leading AI models. The API is fully compatible with the OpenAI SDK, so you can switch with a single line change.
Base URL
https://apitopmix.com
Quick Start
Get up and running in under a minute:
- Sign up at apitopmix.com and generate an API key from your dashboard.
- Set the base URL to https://apitopmix.com and add your API key to the Authorization header.
- Make your first request using any OpenAI-compatible SDK or a simple cURL call.
curl https://apitopmix.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Set base_url to https://apitopmix.com/v1 in any OpenAI SDK client. No other code changes are needed.
Claude models are also available through the native Anthropic endpoint, /v1/messages. Set base_url to https://apitopmix.com in the Anthropic SDK. See Claude Messages API for details.
Authentication
All API requests require authentication via a Bearer token in the Authorization header.
Authorization: Bearer sk-your-api-key-here
| Header | Value | Required |
|---|---|---|
| Authorization | Bearer YOUR_API_KEY | Required |
| Content-Type | application/json | Required |
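As a minimal sketch of the two required headers, the snippet below builds (but does not send) a request using only the Python standard library. The helper name build_chat_request is hypothetical, and the key is a placeholder.

```python
import json
import urllib.request

API_KEY = "sk-your-api-key-here"  # placeholder; replace with your own key


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying both required headers."""
    return urllib.request.Request(
        "https://apitopmix.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    API_KEY,
    {"model": "gpt-4.1", "messages": [{"role": "user", "content": "Hello!"}]},
)
print(req.get_header("Authorization"))
```

Passing req to urllib.request.urlopen would send it; that step is left out so the sketch runs without network access.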
Chat Completions
Create a chat completion with any supported model. This endpoint is compatible with the OpenAI Chat Completions API.
Request Body
| Parameter | Type | Description | Required |
|---|---|---|---|
| model | string | Model ID to use, e.g. gpt-4.1, claude-sonnet-4 | Required |
| messages | array | Array of message objects with role and content | Required |
| max_tokens | integer | Maximum number of tokens to generate | Optional |
| temperature | number | Sampling temperature between 0 and 2. Default: 1.0 | Optional |
| stream | boolean | Enable Server-Sent Events streaming. Default: false | Optional |
| top_p | number | Nucleus sampling parameter. Default: 1.0 | Optional |
Message Object
| Field | Type | Description |
|---|---|---|
| role | string | One of system, user, or assistant |
| content | string | The text content of the message |
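A client-side check of the message shape described above might look like the following sketch. The function name validate_messages is hypothetical, the non-empty requirement is an assumption, and the gateway's own validation may be stricter or looser.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages) -> list:
    """Return a list of problems; an empty list means the messages look well-formed."""
    # Assumption: an empty messages array is treated as invalid.
    if not isinstance(messages, list) or not messages:
        return ["messages must be a non-empty list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] is not an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"messages[{i}].role must be one of {sorted(ALLOWED_ROLES)}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"messages[{i}].content must be a string")
    return problems


print(validate_messages([{"role": "user", "content": "Hello!"}]))
```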
Response Format
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711234567,
  "model": "gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
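To illustrate how the fields in this response fit together, here is a small sketch that pulls out the reply text, finish reason, and token total from the sample above. The helper name summarize_completion is hypothetical, and it reads only the first choice.

```python
import json


def summarize_completion(completion: dict) -> dict:
    """Extract the first choice's text, its finish reason, and total token usage."""
    choice = completion["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": completion["usage"]["total_tokens"],
    }


# The sample response shown above, as a raw JSON string.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711234567,
  "model": "gpt-4.1",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
}
"""

summary = summarize_completion(json.loads(raw))
print(summary["text"])
```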
Code Examples
# cURL
curl https://apitopmix.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "max_tokens": 500,
    "temperature": 0.7
  }'

# Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://apitopmix.com/v1"
)

response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."}
    ],
    max_tokens=500,
    temperature=0.7
)
print(response.choices[0].message.content)

// Node.js (OpenAI SDK)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-api-key',
  baseURL: 'https://apitopmix.com/v1'
});

const response = await client.chat.completions.create({
  model: 'gemini-3.1-pro-preview',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing.' }
  ],
  max_tokens: 500,
  temperature: 0.7
});
console.log(response.choices[0].message.content);
Streaming (SSE)
Enable real-time token streaming by setting stream: true. The response is delivered as Server-Sent Events.
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://apitopmix.com/v1"
)

stream = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Write a poem about AI."}],
    stream=True
)

# Print each text fragment as soon as it arrives.
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Each SSE event contains a JSON chunk:
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}
data: [DONE]
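If you read the stream without an SDK, the event format above can be parsed by hand. The sketch below assumes each event arrives as a single `data:` line and that the stream ends with `data: [DONE]`, as in the sample; the function name iter_sse_content is hypothetical.

```python
import json


def iter_sse_content(lines):
    """Yield text fragments from an iterable of SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}',
    "",
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" world"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_content(sample)))  # prints "Hello world"
```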
Claude Messages API
In addition to the OpenAI-compatible endpoint above, ApiTopMix also natively supports the Anthropic Messages API format. If you are already using the Anthropic SDK, you can connect directly without any format conversion.
Claude models can be accessed via the OpenAI-compatible /v1/chat/completions endpoint (shown above), or via the native Anthropic /v1/messages endpoint described here. Both use the same API key and the same pricing.
Authentication
The Anthropic format uses the x-api-key header (instead of Authorization: Bearer). You must also include the anthropic-version header.
| Header | Value | Required |
|---|---|---|
| x-api-key | YOUR_API_KEY | Required |
| anthropic-version | 2023-06-01 | Required |
| Content-Type | application/json | Required |
Request Body
| Parameter | Type | Description | Required |
|---|---|---|---|
| model | string | Claude model ID, e.g. claude-sonnet-4-6 | Required |
| messages | array | Array of message objects with role and content | Required |
| max_tokens | integer | Maximum number of tokens to generate | Required |
| system | string | System prompt (passed as a top-level field, not inside messages) | Optional |
| temperature | number | Sampling temperature between 0 and 1. Default: 1.0 | Optional |
| stream | boolean | Enable Server-Sent Events streaming. Default: false | Optional |
| top_p | number | Nucleus sampling parameter | Optional |
Supported Models
| Model ID | Description |
|---|---|
| claude-opus-4-6 | Most capable Claude model |
| claude-opus-4-6-thinking | Opus with extended thinking |
| claude-opus-4-5-20251101 | Claude Opus 4.5 |
| claude-opus-4-5-20251101-thinking | Opus 4.5 with extended thinking |
| claude-sonnet-4-6 | Best balance of speed and capability |
| claude-sonnet-4-6-thinking | Sonnet with extended thinking |
| claude-sonnet-4-5-20250929 | Claude Sonnet 4.5 |
| claude-sonnet-4-5-20250929-thinking | Sonnet 4.5 with extended thinking |
| claude-haiku-4-5-20251001 | Fastest and most affordable Claude model |
| claude-haiku-4-5-20251001-thinking | Haiku with extended thinking |
Response Format
{
  "id": "msg_01abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 9
  }
}
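Because content is a list of blocks rather than a single string, reading the reply takes one extra step. The following sketch joins all text blocks from a response shaped like the sample above; the helper name message_text is hypothetical, and blocks of other types are simply skipped.

```python
def message_text(message: dict) -> str:
    """Concatenate the text of every block whose type is "text"."""
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


sample_message = {
    "id": "msg_01abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "model": "claude-sonnet-4-6",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 9},
}

print(message_text(sample_message))
```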
Code Examples
# cURL
curl https://apitopmix.com/v1/messages \
  -H "x-api-key: $API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 500,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
  }'

# Python (Anthropic SDK)
import anthropic

client = anthropic.Anthropic(
    api_key="sk-your-api-key",
    base_url="https://apitopmix.com"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Explain quantum computing."}
    ]
)
print(message.content[0].text)

// Node.js (Anthropic SDK)
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'sk-your-api-key',
  baseURL: 'https://apitopmix.com'
});

const message = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 500,
  system: 'You are a helpful assistant.',
  messages: [
    { role: 'user', content: 'Explain quantum computing.' }
  ]
});
console.log(message.content[0].text);
Streaming
Enable real-time streaming by setting stream: true.
import anthropic

client = anthropic.Anthropic(
    api_key="sk-your-api-key",
    base_url="https://apitopmix.com"
)

# The stream helper yields text fragments as they arrive.
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=500,
    messages=[{"role": "user", "content": "Write a poem about AI."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="")
Set base_url to https://apitopmix.com (without /v1) in the Anthropic SDK client. The SDK appends /v1/messages automatically.
Key differences from the OpenAI-compatible endpoint:
- Auth header: x-api-key instead of Authorization: Bearer
- System prompt: top-level system field, not a message with role: "system"
- Response: content is an array of content blocks, not a single string
- max_tokens is required, not optional
- Usage fields: input_tokens/output_tokens instead of prompt_tokens/completion_tokens
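The request-side differences between the two formats can be sketched as a small converter from an OpenAI-style request body to an Anthropic-style one. The function name to_anthropic_request and the fallback default_max_tokens value of 1024 are assumptions made for illustration. The sketch copies temperature unchanged, so values above 1 would still need clamping, and it does not translate model IDs.

```python
def to_anthropic_request(openai_request: dict, default_max_tokens: int = 1024) -> dict:
    """Convert an OpenAI-style chat request body into an Anthropic Messages body."""
    system_parts = []
    messages = []
    for msg in openai_request["messages"]:
        if msg["role"] == "system":
            # System prompts move to the top-level "system" field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    converted = {
        "model": openai_request["model"],
        "messages": messages,
        # max_tokens is required here; the fallback value is an arbitrary assumption.
        "max_tokens": openai_request.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        converted["system"] = "\n\n".join(system_parts)
    for key in ("temperature", "top_p", "stream"):
        if key in openai_request:
            converted[key] = openai_request[key]
    return converted


converted = to_anthropic_request({
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
    "max_tokens": 500,
    "temperature": 0.7,
})
print(converted["system"])
```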
Models
List all available models through the API. ApiTopMix aggregates models from leading AI providers.
curl https://apitopmix.com/v1/models \
-H "Authorization: Bearer $API_KEY"
Supported Providers & Models
Claude Series
Opus 4.6 / 4.5, Sonnet 4.6 / 4.5, Haiku 4.5
Includes standard and thinking modes
Gemini Series
Gemini 3.1 Pro, Gemini 3 Flash
Latest Gemini multimodal models
DeepSeek
V3.2, V3.1, R1 Distill series
Via NVIDIA NIM
Llama Series
Llama 4 Maverick/Scout, Llama 3.3 70B, Llama 3.1 405B
Free via NVIDIA NIM
Mistral Series
Large 3 675B, Medium 3, Small 4, Devstral 2
Via NVIDIA NIM
Kimi Series
Kimi K2.5, K2 Instruct, K2 Thinking
Via NVIDIA NIM
Qwen Series
Qwen 3.5 397B, Qwen 3 Coder 480B, QwQ 32B
Via NVIDIA NIM
More Models
GLM-5, MiniMax M2.5, Phi-4, GPT-OSS, Nemotron Ultra
50+ models available
Pricing
| Model | Input $/1M | Output $/1M | Notes |
|---|---|---|---|
| Anthropic Claude (Official 60% off) | | | |
| claude-opus-4-6 | $3.00 | $15.00 | |
| claude-opus-4-6-thinking | $3.00 | $15.00 | Extended thinking |
| claude-sonnet-4-6 | $1.80 | $9.00 | |
| claude-sonnet-4-6-thinking | $1.80 | $9.00 | Extended thinking |
| claude-haiku-4-5-20251001 | $0.60 | $3.00 | |
| Google Gemini (Official 60% off) | | | |
| gemini-3.1-pro-preview | $1.20 | $7.20 | Latest Gemini Pro |
| gemini-3-flash-preview | $0.30 | $1.80 | Fast & affordable |
| NVIDIA NIM Models (Official 20% off / Free) | | | |
| meta/llama-3.3-70b-instruct | FREE | FREE | Meta Llama 3.3 |
| meta/llama-3.1-405b-instruct | FREE | FREE | 405B parameters |
| qwen/qwen3.5-397b-a17b | $0.02 | $0.06 | Qwen 3.5 |
| mistralai/mistral-large-3-675b-instruct-2512 | $0.10 | $0.30 | Mistral Large 3 |
| moonshotai/kimi-k2.5 | $0.12 | $0.60 | Kimi K2.5 |
| z-ai/glm5 | $0.20 | $0.64 | GLM-5 |
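Since prices are quoted per one million tokens, a request's cost can be estimated from the token counts in its usage object. The sketch below copies a few rows from the table above into a hypothetical PRICES_PER_MILLION mapping. It assumes input and output tokens are billed linearly at the listed rates, and actual billing may round or count tokens differently.

```python
# (input $/1M, output $/1M), copied from a few rows of the pricing table.
PRICES_PER_MILLION = {
    "claude-sonnet-4-6": (1.80, 9.00),
    "claude-haiku-4-5-20251001": (0.60, 3.00),
    "gemini-3-flash-preview": (0.30, 1.80),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 12 input tokens and 9 output tokens, as in the sample responses.
print(f"${estimate_cost('claude-sonnet-4-6', 12, 9):.7f}")
```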
Use the /v1/models endpoint to get the complete list of model IDs. Pricing for all models is available on the Pricing page (login required).
Embeddings
Generate vector embeddings for text inputs. Useful for search, clustering, and retrieval-augmented generation (RAG).
Request Body
| Parameter | Type | Description | Required |
|---|---|---|---|
| model | string | Embedding model ID | Required |
| input | string or array | Text(s) to embed | Required |
| encoding_format | string | float or base64. Default: float | Optional |
Supported Models
| Model | Dimensions | Best For |
|---|---|---|
| text-embedding-3-large | 3072 | Highest accuracy, retrieval tasks |
| text-embedding-3-small | 1536 | Balanced performance and cost |
| text-embedding-ada-002 | 1536 | Legacy support |
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://apitopmix.com/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog."
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")  # 1536
Response Format
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 9, "total_tokens": 9 }
}
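For search and retrieval, embedding vectors are usually compared with cosine similarity. The sketch below uses tiny made-up 3-dimensional vectors in place of real embeddings, and the helper names cosine_similarity and rank_by_similarity are hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents: dict) -> list:
    """Return document names sorted from most to least similar to the query."""
    return sorted(
        documents,
        key=lambda name: cosine_similarity(query, documents[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings.
docs = {"cats": [0.9, 0.1, 0.0], "finance": [0.0, 0.2, 0.9]}
print(rank_by_similarity([1.0, 0.0, 0.1], docs))
```

The same functions work unchanged on full-size vectors returned in the embedding field.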
Music Generation (Suno API)
Generate music tracks using AI. Submit a music generation task and poll for results.
Submit Music Generation
Request Body
| Parameter | Type | Description | Required |
|---|---|---|---|
| prompt | string | Text prompt describing the desired music | Required |
| style | string | Music genre/style, e.g. "jazz", "electronic", "pop" | Optional |
| title | string | Title for the generated track | Optional |
| lyrics | string | Custom lyrics for the track | Optional |
| make_instrumental | boolean | Generate without vocals. Default: false | Optional |
curl -X POST https://apitopmix.com/suno/submit/music \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A dreamy lo-fi beat with soft piano and rain sounds",
    "style": "lo-fi",
    "title": "Rainy Afternoon",
    "make_instrumental": true
  }'
Submit Response
{
  "code": 200,
  "message": "success",
  "data": {
    "task_id": "suno_abc123def456"
  }
}
Fetch Generation Result
Poll this endpoint with the task_id to check generation progress and retrieve the result.
curl https://apitopmix.com/suno/fetch/suno_abc123def456 \
-H "Authorization: Bearer $API_KEY"
{
  "code": 200,
  "data": {
    "task_id": "suno_abc123def456",
    "status": "completed",
    "tracks": [
      {
        "title": "Rainy Afternoon",
        "audio_url": "https://cdn.apitopmix.com/audio/...",
        "duration": 120,
        "style": "lo-fi"
      }
    ]
  }
}
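The submit-then-poll flow can be sketched as a loop that stops once the task reports "completed" or "failed". In this sketch, fetch is a caller-supplied function (for example, one that wraps a GET to /suno/fetch/{task_id}); the function name wait_for_task, the 5-second interval, and the 300-second timeout are assumptions, and the "processing" status in the demo is a made-up intermediate value.

```python
import time


def wait_for_task(fetch, task_id, interval=5.0, timeout=300.0, sleep=time.sleep):
    """Call fetch(task_id) until the task completes or fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(task_id)
        status = result.get("data", {}).get("status")
        if status in ("completed", "failed"):
            return result
        # Any other status is treated as "still in progress".
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} seconds")
        sleep(interval)


# Demo with a fake fetch function; "processing" is a hypothetical intermediate status.
fake_responses = iter([
    {"code": 200, "data": {"task_id": "suno_abc123def456", "status": "processing"}},
    {"code": 200, "data": {"task_id": "suno_abc123def456", "status": "completed", "tracks": []}},
])
final = wait_for_task(lambda task_id: next(fake_responses), "suno_abc123def456",
                      sleep=lambda seconds: None)
print(final["data"]["status"])
```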
Poll until status is "completed" or "failed".
Image Generation
Generate high-quality images from text prompts using AI. Compatible with the OpenAI DALL-E API format.
Request Body
| Parameter | Type | Description | Required |
|---|---|---|---|
| model | string | "nano-banana-2" | Required |
| prompt | string | Text description of the image to generate | Required |
| n | integer | Number of images (default: 1) | Optional |
| size | string | Image size, e.g. "1024x1024" | Optional |
| image_size | string | Resolution: "1K", "2K", or "4K". Default: 1K | Optional |
Supported Models
| Model | Price | Features |
|---|---|---|
| nano-banana-2 | $0.082/image | Text-to-image, image-to-image, multi-image input. Supports 1K/2K/4K output. |
curl -X POST https://apitopmix.com/v1/images/generations \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "nano-banana-2",
"prompt": "A majestic snow leopard on a mountain peak at sunrise, digital painting",
"n": 1,
"size": "1024x1024"
}'
Response
{
  "created": 1711234567,
  "data": [
    {
      "url": "https://example.com/generated-image.jpg"
    }
  ]
}
Python Example
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://apitopmix.com/v1"
)

response = client.images.generate(
    model="nano-banana-2",
    prompt="A futuristic city with flying cars at sunset, cyberpunk style",
    n=1,
    size="1024x1024"
)

image_url = response.data[0].url
print(f"Image: {image_url}")
Set image_size: "2K" or "4K" for higher-resolution output. Provide detailed prompts with art style, lighting, and composition for best results. Failed generations are not charged.
Video Generation
Generate short AI videos from text prompts. Video generation is asynchronous — submit a task, then poll for results.
Submit Video Generation
Request Body
| Parameter | Type | Description | Required |
|---|---|---|---|
| model | string | "grok-video-3" | Required |
| prompt | string | Text description of the video | Required |
| duration | integer | Length in seconds (5-15). Default: 8 | Optional |
| resolution | string | "480p" or "720p". Default: 480p | Optional |
| aspect_ratio | string | "16:9", "9:16", "1:1". Default: 16:9 | Optional |
curl -X POST https://apitopmix.com/v1/videos \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "grok-video-3",
"prompt": "A golden retriever running on a beach at sunset, cinematic",
"duration": 10,
"resolution": "720p"
}'
Submit Response
{
  "id": "task_abc123xyz",
  "object": "video",
  "status": "queued"
}
Poll Video Status
curl https://apitopmix.com/v1/videos/task_abc123xyz \
  -H "Authorization: Bearer $API_KEY"
Completed Response
{
  "code": "success",
  "data": {
    "status": "SUCCESS",
    "progress": "100%",
    "data": {
      "output": "https://example.com/generated-video.mp4"
    }
  }
}
Python Example
import requests, time

BASE = "https://apitopmix.com/v1"
KEY = "sk-your-api-key"
H = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

# Submit
r = requests.post(f"{BASE}/videos", headers=H, json={
    "model": "grok-video-3",
    "prompt": "A cat watching fish in an aquarium",
    "duration": 10, "resolution": "720p"
})
task_id = r.json()["id"]
print(f"Submitted: {task_id}")

# Poll every 10 seconds until the task succeeds or fails
while True:
    s = requests.get(f"{BASE}/videos/{task_id}", headers=H).json()
    d = s.get("data", s)  # status fields may be nested under "data"
    print(f"Status: {d.get('status')} | {d.get('progress')}")
    if d.get("status") == "SUCCESS":
        print(f"Video: {d['data']['output']}")
        break
    elif d.get("status") == "FAILURE":
        print(f"Failed: {d.get('fail_reason')}")
        break
    time.sleep(10)
Pricing
| Model | Price | Duration | Resolution |
|---|---|---|---|
| grok-video-3 | $0.030/sec (40% off official) | 1-15 seconds | 480p / 720p |
| Duration | Cost |
|---|---|
| 5 seconds | $0.150 |
| 8 seconds | $0.240 |
| 10 seconds | $0.300 |
| 15 seconds | $0.450 |
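The costs in the table follow directly from the per-second rate: cost = duration × $0.030. A minimal sketch, assuming billing is strictly linear in duration and using a hypothetical video_cost helper:

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.030")  # grok-video-3 rate from the pricing table


def video_cost(duration_seconds: int) -> Decimal:
    """Cost in USD of a video of the given length, at the per-second rate."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return PRICE_PER_SECOND * duration_seconds


for seconds in (5, 8, 10, 15):
    print(f"{seconds} seconds: ${video_cost(seconds):.3f}")
```

Decimal is used instead of float so the results match the table values exactly.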
Error Codes
All errors return a consistent JSON structure with an error object.
Error Response Format
{
  "error": {
    "message": "Invalid API key provided.",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
Common Error Codes
| HTTP Status | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | The request body is malformed or missing required parameters. |
| 401 | authentication_error | Invalid or missing API key. Check your Authorization header. |
| 403 | permission_error | Your API key does not have permission to access this resource. |
| 404 | not_found_error | The requested resource or endpoint does not exist. |
| 429 | rate_limit_error | Too many requests. Slow down and retry with exponential backoff. |
| 500 | server_error | An unexpected error occurred on our servers. Try again later. |
| 503 | service_unavailable | The upstream model provider is temporarily unavailable. |
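Retrying 429 and 5xx responses with exponential backoff (starting at 1 second, doubling each time, capped at 60 seconds) might look like the sketch below. Here send is a caller-supplied function returning a (status, body) pair; the helper names and the default of 6 retries are assumptions. Many clients also add random jitter to each delay, which this sketch omits.

```python
import time


def backoff_delays(max_retries=6, base=1.0, cap=60.0):
    """Delays in seconds: base, 2*base, 4*base, ... each capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def is_retryable(status: int) -> bool:
    """429 and any 5xx status are worth retrying; other errors are not."""
    return status == 429 or 500 <= status <= 599


def call_with_retries(send, max_retries=6, sleep=time.sleep):
    """Call send() and retry with backoff while the status is retryable."""
    status, body = send()
    for delay in backoff_delays(max_retries):
        if not is_retryable(status):
            break
        sleep(delay)
        status, body = send()
    return status, body


print(backoff_delays(8))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```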
For 429 and 5xx errors, implement exponential backoff starting with a 1-second delay, doubling on each retry up to a maximum of 60 seconds.
Rate Limits
ApiTopMix enforces rate limits to ensure fair usage and platform stability. Limits vary by plan and are applied per API key.
Rate Limit Headers
Every response includes headers indicating your current rate limit status:
| Header | Description |
|---|---|
| x-ratelimit-limit-requests | Maximum requests allowed in the current window |
| x-ratelimit-remaining-requests | Remaining requests in the current window |
| x-ratelimit-reset-requests | Time (in seconds) until the request limit resets |
| x-ratelimit-limit-tokens | Maximum tokens allowed per minute |
| x-ratelimit-remaining-tokens | Remaining tokens in the current window |
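A client can read these headers after each response to decide whether to pause before the next request. The sketch below assumes the header values are plain numbers sent as strings; the function names parse_rate_limit_headers and should_pause are hypothetical.

```python
RATE_LIMIT_HEADERS = {
    "limit_requests": "x-ratelimit-limit-requests",
    "remaining_requests": "x-ratelimit-remaining-requests",
    "reset_requests": "x-ratelimit-reset-requests",
    "limit_tokens": "x-ratelimit-limit-tokens",
    "remaining_tokens": "x-ratelimit-remaining-tokens",
}


def parse_rate_limit_headers(headers: dict) -> dict:
    """Return the rate limit headers as numbers (None if missing or not numeric)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    parsed = {}
    for key, header in RATE_LIMIT_HEADERS.items():
        try:
            parsed[key] = float(lowered[header])
        except (KeyError, TypeError, ValueError):
            parsed[key] = None
    return parsed


def should_pause(info: dict, min_remaining: int = 1) -> bool:
    """True when fewer than `min_remaining` requests are left in the window."""
    remaining = info["remaining_requests"]
    return remaining is not None and remaining < min_remaining


info = parse_rate_limit_headers({
    "X-RateLimit-Remaining-Requests": "0",
    "x-ratelimit-reset-requests": "12",
})
if should_pause(info):
    print(f"Pause for {info['reset_requests']} seconds")
```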
Default Limits
| Tier | RPM (Requests/Min) | TPM (Tokens/Min) |
|---|---|---|
| Free | 10 | 40,000 |
| Standard | 60 | 200,000 |
| Pro | 300 | 1,000,000 |
| Enterprise | Custom | Custom |
Best Practices
- Monitor the x-ratelimit-remaining-requests header to stay within your limits.
- Implement exponential backoff when receiving 429 responses.
- Cache responses when possible to reduce unnecessary API calls.
- Use streaming for long completions to avoid timeout issues.
- Batch embedding requests by sending multiple texts in a single call.