API quickstart

From zero to streaming chat in five minutes.

Last updated 17 May 2026

In this guide you will create a virtual key, send a chat completion, stream a response, and generate embeddings against your appliance.

Prerequisites

An enrolled Operayde appliance reachable at https://appliance.<your-domain>.
A tenant-admin account to create virtual keys.

1. Get a virtual key

Open the operator portal at https://portal.operayde.com, navigate to Keys > Create key, and create a key with the following settings:

Label: quickstart
Allowed models: select all available models
RPM limit: 60
TPD limit: 1,000,000

Copy the key secret. It looks like op_live_26f2_f9b8c4e2aa1345b7d3f27a901c55d88a.

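Export the key and your appliance's base URL as environment variables, replacing appliance.example.com with your appliance's hostname: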

export OPERAYDE_KEY="op_live_26f2_..."
export OPERAYDE_BASE="https://appliance.example.com/v1"
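The examples that follow read both variables directly from the environment. As a minimal sketch, a small loader can fail early with a readable message when either variable is missing. The names `GatewayConfig` and `load_config` are hypothetical helpers for illustration, not part of any Operayde library.

```python
import os
from typing import Mapping, NamedTuple


class GatewayConfig(NamedTuple):
    """Hypothetical container for the two quickstart settings."""
    base_url: str
    key: str


def load_config(env: Mapping[str, str] = os.environ) -> GatewayConfig:
    """Read OPERAYDE_BASE and OPERAYDE_KEY, raising if either is unset or empty."""
    missing = [name for name in ("OPERAYDE_BASE", "OPERAYDE_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Drop a trailing slash so f"{base_url}/chat/completions" never contains "//".
    return GatewayConfig(base_url=env["OPERAYDE_BASE"].rstrip("/"), key=env["OPERAYDE_KEY"])
```

Calling `load_config()` with no arguments reads from the process environment; passing a plain dictionary makes the function easy to exercise in tests.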

2. Send a chat request

curl

curl -s -X POST "$OPERAYDE_BASE/chat/completions" \
  -H "Authorization: Bearer $OPERAYDE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "operayde/instruct-13b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is Operayde?"}
    ]
  }' | jq .

Python

import os
import requests
 
base = os.environ["OPERAYDE_BASE"]
key = os.environ["OPERAYDE_KEY"]
 
resp = requests.post(
    f"{base}/chat/completions",
    headers={"Authorization": f"Bearer {key}"},
    json={
        "model": "operayde/instruct-13b",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is Operayde?"},
        ],
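    },
    timeout=60,  # fail instead of hanging if the appliance is unreachable
)
resp.raise_for_status()  # surface 4xx/5xx errors before reading the body

data = resp.json()
print(data["choices"][0]["message"]["content"])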

TypeScript

const base = process.env.OPERAYDE_BASE!;
const key = process.env.OPERAYDE_KEY!;
 
const resp = await fetch(`${base}/chat/completions`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "operayde/instruct-13b",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is Operayde?" },
    ],
  }),
});
 
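if (!resp.ok) {
  // Surface gateway errors instead of failing later on a missing "choices" field.
  throw new Error(`Request failed: ${resp.status} ${await resp.text()}`);
}
const data = await resp.json();
console.log(data.choices[0].message.content);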

3. Stream a response

Set "stream": true to receive Server-Sent Events as the model generates tokens.

curl

curl -s -N -X POST "$OPERAYDE_BASE/chat/completions" \
  -H "Authorization: Bearer $OPERAYDE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "operayde/instruct-13b",
    "messages": [{"role": "user", "content": "Write a haiku about data privacy."}],
    "stream": true
  }'

Each SSE event carries a JSON chunk whose delta holds the next fragment of text. The stream ends with a literal data: [DONE] event:

data: {"id":"chat-abc123","choices":[{"delta":{"content":"Your"},"index":0}]}

data: {"id":"chat-abc123","choices":[{"delta":{"content":" data"},"index":0}]}

...

data: [DONE]
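To make the event format concrete, the sketch below reassembles the message from the sample events shown above. It assumes every content-bearing event is a single `data: ` line containing one JSON object, as in the sample; `delta_content` and `join_stream` are illustrative names, not gateway APIs.

```python
import json
from typing import Iterable, Optional


def delta_content(line: str) -> Optional[str]:
    """Return the text fragment carried by one SSE line, or None if it has none."""
    if not line.startswith("data: "):
        return None  # blank separators and any other SSE fields
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None  # end-of-stream sentinel, not JSON
    choices = json.loads(payload).get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


def join_stream(lines: Iterable[str]) -> str:
    """Concatenate all content fragments in arrival order."""
    parts = (delta_content(line) for line in lines)
    return "".join(part for part in parts if part)


events = [
    'data: {"id":"chat-abc123","choices":[{"delta":{"content":"Your"},"index":0}]}',
    "",
    'data: {"id":"chat-abc123","choices":[{"delta":{"content":" data"},"index":0}]}',
    "",
    "data: [DONE]",
]
print(join_stream(events))  # prints "Your data"
```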

Python (streaming)

import os
import requests
import json
 
base = os.environ["OPERAYDE_BASE"]
key = os.environ["OPERAYDE_KEY"]
 
resp = requests.post(
    f"{base}/chat/completions",
    headers={"Authorization": f"Bearer {key}"},
    json={
        "model": "operayde/instruct-13b",
        "messages": [{"role": "user", "content": "Write a haiku about data privacy."}],
        "stream": True,
    },
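    stream=True,  # let requests yield the body incrementally
    timeout=60,  # applies to the connection and to each read
)
resp.raise_for_status()  # surface 4xx/5xx errors before parsing events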
 
for line in resp.iter_lines():
    if not line:
        continue
    text = line.decode("utf-8")
    if text.startswith("data: ") and text != "data: [DONE]":
        chunk = json.loads(text[6:])
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            print(delta["content"], end="", flush=True)
 
print()

TypeScript (streaming)

const base = process.env.OPERAYDE_BASE!;
const key = process.env.OPERAYDE_KEY!;

const resp = await fetch(`${base}/chat/completions`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "operayde/instruct-13b",
    messages: [{ role: "user", content: "Write a haiku about data privacy." }],
    stream: true,
  }),
});
 
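const reader = resp.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A network chunk can end mid-line (or mid-character), so decode in
  // streaming mode and keep the unfinished tail for the next iteration.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";

  for (const raw of lines) {
    const line = raw.trimEnd(); // tolerate CRLF line endings
    if (line.startsWith("data: ") && line !== "data: [DONE]") {
      const chunk = JSON.parse(line.slice(6));
      const content = chunk.choices[0]?.delta?.content;
      if (content) process.stdout.write(content);
    }
  }
}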

4. Generate embeddings

curl -s -X POST "$OPERAYDE_BASE/embeddings" \
  -H "Authorization: Bearer $OPERAYDE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "operayde/embed-bge",
    "input": [
      "Operayde keeps your data on-premise.",
      "Virtual keys control API access."
    ]
  }' | jq '.data[0].embedding[:5]'

The jq filter above prints the first five components of the first vector. The full response contains one dense vector per input string:

{
  "model": "operayde/embed-bge",
  "data": [
    {"index": 0, "embedding": [0.0123, -0.0456, ...]},
    {"index": 1, "embedding": [0.0789, 0.0012, ...]}
  ]
}
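A common next step is comparing two embeddings. The sketch below works on a response payload shaped like the one above and computes cosine similarity using only the standard library. Sorting by `index` is a defensive assumption in case entries arrive out of order; the helper names and the two-dimensional toy vectors are made up for illustration.

```python
import math
from typing import Any, Dict, List, Sequence


def vectors_in_input_order(payload: Dict[str, Any]) -> List[List[float]]:
    """Return the embeddings ordered by their "index" field."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy payload with invented 2-dimensional vectors, deliberately out of order.
sample = {
    "model": "operayde/embed-bge",
    "data": [
        {"index": 1, "embedding": [0.0, 2.0]},
        {"index": 0, "embedding": [3.0, 0.0]},
    ],
}
first, second = vectors_in_input_order(sample)
similarity = cosine_similarity(first, second)  # orthogonal vectors give 0.0
```

With a real response, pass the parsed JSON body to `vectors_in_input_order` and compare any two of the returned vectors.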

5. Use OpenAI client libraries

Because the gateway is OpenAI-compatible, existing OpenAI client libraries work once you point their base URL at the appliance and pass your virtual key as the API key.

Python (openai library)

import os
from openai import OpenAI
 
client = OpenAI(
    base_url=os.environ["OPERAYDE_BASE"],
    api_key=os.environ["OPERAYDE_KEY"],
)
 
response = client.chat.completions.create(
    model="operayde/instruct-13b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain data residency in one sentence."},
    ],
)
 
print(response.choices[0].message.content)

TypeScript (openai library)

import OpenAI from "openai";
 
const client = new OpenAI({
  baseURL: process.env.OPERAYDE_BASE,
  apiKey: process.env.OPERAYDE_KEY,
});
 
const response = await client.chat.completions.create({
  model: "operayde/instruct-13b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain data residency in one sentence." },
  ],
});
 
console.log(response.choices[0].message.content);

Error handling

All errors follow a standard envelope:

{
  "error": {
    "code": "policy_denied",
    "message": "Key 'quickstart' lacks scope for model 'operayde/large-70b'.",
    "type": "policy",
    "request_id": "req_01HZ..."
  }
}
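One way to work with this envelope in client code is to convert it into an exception that keeps the `code` and `request_id` available for logging. The following is a sketch only: `GatewayError` and `error_from_response` are hypothetical names, and the fallbacks for missing fields are assumptions rather than documented behavior.

```python
from typing import Any, Dict, Optional


class GatewayError(Exception):
    """Hypothetical exception wrapping the error envelope shown above."""

    def __init__(self, status: int, code: str, message: str,
                 type_: Optional[str] = None,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.type = type_
        self.request_id = request_id


def error_from_response(status: int, body: Dict[str, Any]) -> GatewayError:
    """Build a GatewayError from an HTTP status and the parsed JSON body."""
    err = body.get("error") or {}
    return GatewayError(
        status=status,
        code=err.get("code", "unknown"),  # fallback value is an assumption
        message=err.get("message", ""),
        type_=err.get("type"),
        request_id=err.get("request_id"),
    )
```

With `requests`, this could be used as `if not resp.ok: raise error_from_response(resp.status_code, resp.json())`, which keeps the `request_id` at hand when reporting a problem.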

HTTP status	Code	Meaning
400	`invalid_request`	Malformed JSON or missing required fields
401	`unauthorized`	Missing, expired, or revoked virtual key
403	`policy_denied`	OPA policy denied the request
429	`rate_limited`	RPM or TPD budget exhausted
500	`internal_error`	Server error; retry with backoff
503	`model_unavailable`	Requested model not loaded on this appliance
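The table recommends retrying 500 responses with backoff. The sketch below shows one possible retry loop with exponential delays. Treating 429 as retryable is an assumption (the table does not say so), and 503 is left out because the table does not indicate whether a missing model becomes available later. The `send` callable, `call_with_retry`, and `backoff_delays` are illustrative names.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# Assumption: 429 is worth retrying alongside 500; adjust to your policy.
RETRYABLE_STATUSES = {429, 500}


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Exponential delays in seconds: base, 2*base, 4*base, ... limited to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def call_with_retry(
    send: Callable[[], Tuple[int, Dict[str, Any]]],
    max_retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Dict[str, Any]]:
    """Call send() until it returns a non-retryable status or retries run out."""
    for delay in backoff_delays(max_retries):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        sleep(delay)
    # Final attempt: return whatever comes back, success or failure.
    return send()
```

Here `send` is any zero-argument function that performs the request and returns `(status_code, parsed_json)`. Injecting `sleep` keeps the loop testable without real waiting; in production, adding random jitter to each delay is a common refinement.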

Next steps

Explore the full Gateway API reference for all endpoints and options.
Set up per-team virtual keys with appropriate rate limits.
Configure PII redaction to protect sensitive data.
Review rate limits to understand budget enforcement.