POST /v1/messages
{
  "id": "msg_12345",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2025-07-10T09:11:16.003615Z",
      "stop_timestamp": "2025-07-10T09:11:28.048942Z",
      "text": "Based on the analysis, I found 3 potential security issues..."
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 432,
    "output_tokens": 89
  }
}

Body

application/json

Request body for chat completion supporting multi-turn conversations with AI models.

Contains message history, tool definitions, system prompts, and response configuration. Supports both streaming and non-streaming responses with optional tool usage, citations, and advanced sampling parameters.

The request body defines the complete conversation context and AI behavior parameters for generating responses.

Chat request body model for handling chat interactions.
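As a minimal sketch, a request body can set only the required `messages` field (field names taken from the reference below; everything else falls back to server defaults):

```python
import json

# Minimal request body: only the required "messages" field is set, so
# model, sampling parameters, and streaming all use server defaults.
payload = {
    "messages": [
        {"role": "user", "content": "Summarize the audit findings."}
    ]
}

print(json.dumps(payload))
```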

messages
MessageInput · object[]
required

Array of messages composing the chat conversation. Each message should have a 'role' (user or assistant) and 'content'.

model
string | null

Model to use for the chat completion. If not provided, the default model will be used.

stream
boolean
default:false

Whether to stream the response back to the client.
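If the server delivers streamed responses as server-sent events (an assumption; this page only documents the boolean flag), a client might accumulate text chunks like this:

```python
import json

def parse_sse_chunks(raw: str):
    """Extract JSON payloads from 'data:' lines of a hypothetical SSE stream."""
    events = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data and data != "[DONE]":  # "[DONE]" sentinel is an assumption
                events.append(json.loads(data))
    return events

sample = (
    'data: {"type": "text", "text": "Hel"}\n'
    'data: {"type": "text", "text": "lo"}\n'
    "data: [DONE]\n"
)
chunks = parse_sse_chunks(sample)
print("".join(c["text"] for c in chunks))  # → Hello
```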

tools
ToolSpec · object[] | null

List of tools to use for the response.

tool_choice
object

Define how the model should choose tools for the response.
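The exact `ToolSpec` and `tool_choice` schemas are not expanded on this page; the JSON-Schema-style shape below is an assumption, shown only to illustrate how the two fields combine in a request:

```python
# Request sketch combining tools and tool_choice. The ToolSpec fields
# ("name", "description", "input_schema") and the tool_choice shape
# ({"type": "auto"}) are assumptions, not documented schemas.
payload = {
    "messages": [
        {"role": "user", "content": "What's the weather in Oslo?"}
    ],
    "tools": [
        {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up current weather for a city.",
            "input_schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    "tool_choice": {"type": "auto"},
}
```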

tool_context
Tool Context · array

Context to provide to the tools, such as documents, database connection strings, or other data relevant to tool usage.

  • FileArtifact
  • UriArtifact
  • TextArtifact
  • IngestedArtifact
  • SqlDatabaseArtifact
mcp_servers
McpServerConfig · object[]

List of MCP servers to use for tool retrieval. Each server can have its own configuration.

response_format
object

Format of the response. Can be text, a JSON schema, or a tool call.

system
object

System message configuration, including default prompt and citations.

thinking
object

Thinking configuration, enabling reasoning capabilities for the model.
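A sketch combining `response_format`, `system`, and `thinking` in one request. All three inner shapes are assumptions; the reference lists these fields as objects without expanding their schemas:

```python
# Request sketch: structured output plus a system prompt plus reasoning.
# The inner shapes of response_format, system, and thinking are assumed
# for illustration only.
payload = {
    "messages": [
        {"role": "user", "content": "Classify this support ticket."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "type": "object",
            "properties": {"category": {"type": "string"}},
        },
    },
    "system": {"prompt": "You are a triage assistant."},
    "thinking": {"enabled": True},
}
```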

priority
integer | null

Priority of the request, used to rank it against other pending requests.

seed
integer | null

Random seed for reproducibility.

min_p
number | null

Minimum probability threshold for token selection. Tokens with probability below this value are filtered out.

top_p
number | null

Nucleus sampling parameter. Only tokens with cumulative probability up to this value are considered.

temperature
number | null

Controls randomness in generation. Higher values make output more random, lower values more deterministic.

top_k
integer | null

Limits token selection to the top K most likely tokens at each step.

repetition_penalty
number | null

Penalty applied to tokens that have already appeared in the sequence to reduce repetition.

presence_penalty
number | null

Penalty applied based on whether a token has appeared in the text, encouraging topic diversity.

frequency_penalty
number | null

Penalty applied based on how frequently a token appears in the text, reducing repetitive content.

max_tokens
integer | null

Maximum number of tokens to generate in the response.
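The sampling parameters above can be combined in a single request; the values below are one conservative configuration, not recommended defaults:

```python
# Conservative sampling configuration: low temperature keeps output mostly
# deterministic, top_p/top_k trim the candidate token set, a mild
# repetition_penalty discourages loops, max_tokens caps the length, and a
# fixed seed makes runs reproducible.
payload = {
    "messages": [
        {"role": "user", "content": "Write a haiku about logging."}
    ],
    "temperature": 0.2,
    "top_p": 0.9,
    "top_k": 40,
    "repetition_penalty": 1.1,
    "max_tokens": 256,
    "seed": 42,
}
```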

correlation_id
string | null

Correlation ID for tracking the request across systems.
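Putting the request together, a client can build (without sending) the POST using only the standard library. The base URL is a placeholder; authentication headers, if any, depend on your deployment and are omitted here:

```python
import json
import urllib.request

base_url = "https://example.internal"  # hypothetical host
payload = {
    "messages": [{"role": "user", "content": "ping"}],
    "correlation_id": "req-7f3a",  # hypothetical ID for cross-system tracing
}

# Construct the request object; call urllib.request.urlopen(req) to send it.
req = urllib.request.Request(
    base_url + "/v1/messages",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```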

Response

Successful chat message

AI response message with content and metadata.

id
string

Unique identifier for this message response.

type
string
default:message

Response type identifier indicating this is a message response.

Allowed value: "message"
role
string
default:assistant

The role of the response sender, always 'assistant' for AI responses.

Allowed value: "assistant"
content
Content · array

List of content blocks that make up the AI's response.

  • TextBlock
  • ImageBlock
  • AudioBlock
  • BinaryBlock
  • ResourceLinkBlock
  • ResourceBlock
  • SourceBlock
  • ToolUseBlock
  • ThinkingBlock
  • TLDRBlock
  • ToolResultBlock
model
string
default:private-gpt

Identifier of the AI model used to generate this response.

stop_reason
enum<string> | null

Reason why the AI stopped generating content.

Available options:
end_turn,
max_tokens,
stop_sequence,
tool_use,
pause_turn,
refusal
stop_sequence
string | null

The specific stop sequence that triggered response completion.

usage
object

Token usage statistics for this interaction.
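Tying the response fields together, the example at the top of this page can be parsed with the standard library:

```python
import json

# Parse the example response shown at the top of this page and pull out
# the text content and token usage.
raw = """
{
  "id": "msg_12345",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2025-07-10T09:11:16.003615Z",
      "stop_timestamp": "2025-07-10T09:11:28.048942Z",
      "text": "Based on the analysis, I found 3 potential security issues..."
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 432, "output_tokens": 89}
}
"""
msg = json.loads(raw)

# Concatenate the text from every TextBlock in the content array; other
# block types (ToolUseBlock, ThinkingBlock, ...) are skipped here.
text = "".join(b["text"] for b in msg["content"] if b["type"] == "text")
print(msg["stop_reason"], msg["usage"]["output_tokens"])
```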