Chat completions - Weights & Biases Documentation

Create a chat completion using the /chat/completions endpoint. This endpoint follows the OpenAI format for sending messages and receiving responses.

Requirements

To create a chat completion, provide:

The Inference service base URL: https://api.inference.wandb.ai/v1.
Your W&B API key: [YOUR-API-KEY].
Your W&B team and project: [YOUR-TEAM]/[YOUR-PROJECT] (optional).
A model ID from the available models.
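
As a rough sketch of how these items fit together in a raw HTTP request, the snippet below assembles the URL, headers, and JSON body using only the Python standard library. The helper name `build_chat_request` is hypothetical and not part of any W&B or OpenAI library. The `OpenAI-Project` header is the one used in the curl request on this page. Nothing is sent over the network.

```python
import json

# Base URL of the Inference service, as listed in the requirements above.
BASE_URL = "https://api.inference.wandb.ai/v1"


def build_chat_request(api_key, model_id, messages, project=None):
    """Hypothetical helper: return (url, headers, body) for a chat completion."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    # Team and project are optional, so the header is added only when given.
    if project:
        headers["OpenAI-Project"] = project
    body = json.dumps({"model": model_id, "messages": messages}).encode("utf-8")
    return f"{BASE_URL}/chat/completions", headers, body


url, headers, body = build_chat_request(
    api_key="[YOUR-API-KEY]",
    model_id="[MODEL-ID]",
    messages=[{"role": "user", "content": "[YOUR-PROMPT]"}],
)
print(url)
```

The returned tuple can be handed to whichever HTTP client you prefer. Keeping construction separate from sending makes the request easy to inspect before it leaves your machine.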

Request examples

The following examples show how to send a chat completion request using Python and curl. Replace the placeholder values with your own API key, optional team and project, and a model ID.

Python

import openai

client = openai.OpenAI(
    # The custom base URL points to Serverless Inference
    base_url='https://api.inference.wandb.ai/v1',

    # Create an API key at https://wandb.ai/settings
    # For safety, consider setting it in the OPENAI_API_KEY environment variable instead of hardcoding it
    api_key="[YOUR-API-KEY]",

    # Optional: Team and project for usage tracking
    project="[YOUR-TEAM]/[YOUR-PROJECT]",
)

# Replace [MODEL-ID] with any model ID from the available models list
response = client.chat.completions.create(
    model="[MODEL-ID]",
    messages=[
        {"role": "system", "content": "[YOUR-SYSTEM-PROMPT]"},
        {"role": "user", "content": "[YOUR-PROMPT]"}
    ],
)

print(response.choices[0].message.content)

Bash

curl https://api.inference.wandb.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [YOUR-API-KEY]" \
  -H "OpenAI-Project: [YOUR-TEAM]/[YOUR-PROJECT]" \
  -d '{
    "model": "[MODEL-ID]",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Tell me a joke." }
    ]
  }'

Response format

A successful request returns a response in the following OpenAI-compatible format, including the generated assistant message and token usage details.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a joke for you..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 50,
    "total_tokens": 75
  }
}
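
To illustrate how the fields in this response might be read, here is a minimal sketch that parses the sample JSON above with Python's standard `json` module. The variable names are illustrative only, and the payload is the sample from this page rather than a live response.

```python
import json

# Sample payload copied from the response format shown above.
raw = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a joke for you..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 50,
    "total_tokens": 75
  }
}
"""

response = json.loads(raw)

# The generated assistant message lives in the first entry of "choices".
message = response["choices"][0]["message"]
finish_reason = response["choices"][0]["finish_reason"]

# Token accounting for the request.
usage = response["usage"]

print(message["content"])
print(f"finish_reason={finish_reason}, total_tokens={usage['total_tokens']}")
```

When using the `openai` Python client, as in the request example, the same fields are available as attributes, for example `response.choices[0].message.content`.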
