Query Parameters
Optional custom identifier for the stream. If not provided, a unique ID will be generated automatically.
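As a minimal sketch, the identifier could be supplied as a query parameter on the initiation URL. The endpoint path and the parameter name 'stream_id' below are illustrative assumptions, not confirmed names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint path and parameter name ("stream_id") for illustration only.
base = "https://api.example.com/v1/chat/stream"
custom_id = "my-custom-stream-001"

# When no identifier is supplied, the server generates a unique one automatically.
url = f"{base}?{urlencode({'stream_id': custom_id})}" if custom_id else base
print(url)  # https://api.example.com/v1/chat/stream?stream_id=my-custom-stream-001
```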
Body
Request body for initiating an asynchronous chat completion with multi-turn conversation support.
Contains the message history, tool definitions, system prompts, and response configuration. The request starts a background process that streams events, which can be observed via the stream endpoint.
The request body defines the complete conversation context and the model behavior parameters used to generate asynchronous responses.
Chat request body model for handling chat interactions.
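For illustration, a minimal request body built from the fields described below might look like the following sketch. Field names such as 'messages', 'model', and 'stream' follow the descriptions in this section; exact casing and any additional required fields should be checked against the schema.

```python
import json

# Minimal request body sketch; field names follow the descriptions below and
# are assumptions, not a definitive schema.
request_body = {
    "messages": [
        {"role": "user", "content": "Summarize the quarterly report."},
        {"role": "assistant", "content": "Sure. Which sections matter most?"},
        {"role": "user", "content": "Focus on the financial summary."},
    ],
    "model": "my-default-model",  # optional; the default model is used when omitted
    "stream": True,               # stream the response back to the client
}

print(json.dumps(request_body, indent=2))
```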
Array of messages composing the chat conversation. Each message should have a 'role' ('user' or 'assistant') and 'content'.
Model to use for the chat completion. If not provided, the default model will be used.
Whether to stream the response back to the client.
List of tools to use for the response.
Controls how the model chooses tools for the response.
Context to provide to the tools, such as documents, database connection strings, or other data relevant to tool usage.
List of MCP servers to use for tool retrieval. Each server can have its own configuration.
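As a hedged sketch, the tool-related fields could be supplied along these lines. The field names ('tools', 'tool_choice', 'context', 'mcp_servers') and the shapes of their values are assumptions based on the descriptions above, not a confirmed schema.

```python
# Hedged sketch of tool-related fields; names and value shapes are assumptions
# based on the descriptions above, not a confirmed schema.
tool_fields = {
    "tools": [
        {
            "name": "search_documents",
            "description": "Search the indexed document store.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
    ],
    # How the model should choose tools, e.g. automatic selection.
    "tool_choice": "auto",
    # Context made available to the tools (documents, connection strings, etc.).
    "context": {"documents": ["report-2024-q1.pdf"]},
    # MCP servers used for tool retrieval, each with its own configuration.
    "mcp_servers": [
        {"name": "docs-server", "url": "https://mcp.example.com", "timeout_s": 30}
    ],
}
```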
Format of the response. Can be text, json_schema, or a tool call.
System message configuration, including default prompt and citations.
Thinking configuration, enabling reasoning capabilities for the model.
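The response-shaping fields might be configured roughly as follows; again, the field names ('response_format', 'system', 'thinking') and their value shapes are assumptions for illustration.

```python
# Hedged sketch of response-shaping fields; names and shapes are assumptions.
response_shaping = {
    # Format of the response: plain text, a JSON schema, or a tool call.
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "summary",
            "schema": {
                "type": "object",
                "properties": {"headline": {"type": "string"}},
                "required": ["headline"],
            },
        },
    },
    # System message configuration, including the default prompt and citations.
    "system": {"prompt": "You are a concise financial analyst.", "citations": True},
    # Thinking configuration, enabling reasoning capabilities for the model.
    "thinking": {"enabled": True},
}
```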
Priority of the request, used to determine the order in which responses are processed.
Random seed for reproducibility.
Minimum probability threshold for token selection. Tokens with probability below this value are filtered out.
Nucleus sampling parameter. Only tokens with cumulative probability up to this value are considered.
Controls randomness in generation. Higher values make output more random, lower values more deterministic.
Limits token selection to the top K most likely tokens at each step.
Penalty applied to tokens that have already appeared in the sequence to reduce repetition.
Penalty applied based on whether a token has appeared in the text, encouraging topic diversity.
Penalty applied based on how frequently a token appears in the text, reducing repetitive content.
Maximum number of tokens to generate in the response.
Correlation ID for tracking the request across systems.
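Putting the generation and bookkeeping parameters together, a complete initiation call could look like the sketch below. The parameter names mirror the descriptions above and the endpoint path is hypothetical; both are assumptions to be checked against the actual API.

```python
import requests

# Hedged sketch of generation and bookkeeping parameters; names mirror the
# descriptions above and are assumptions, not a confirmed schema.
body = {
    "messages": [{"role": "user", "content": "Summarize the quarterly report."}],
    "priority": 5,              # request priority
    "seed": 42,                 # random seed for reproducibility
    "min_p": 0.05,              # drop tokens below this probability threshold
    "top_p": 0.9,               # nucleus sampling threshold
    "temperature": 0.7,         # randomness of generation
    "top_k": 40,                # consider only the top K tokens at each step
    "repetition_penalty": 1.1,  # penalize tokens already in the sequence
    "presence_penalty": 0.2,    # penalize tokens that have appeared at all
    "frequency_penalty": 0.2,   # penalize tokens by how often they appear
    "max_tokens": 1024,         # cap on generated tokens
    "correlation_id": "req-7f3a9c",  # for tracking the request across systems
}

# Initiate the stream; the endpoint path is hypothetical.
resp = requests.post("https://api.example.com/v1/chat/stream", json=body, timeout=30)
resp.raise_for_status()
print(resp.json())  # e.g. stream id, initial status, confirmation message
```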
Response
Chat stream initiated successfully
Response model for initiated asynchronous chat completion streams
Unique identifier for the initiated stream
Initial status of the stream (typically 'pending'). Possible values: pending, processing, completed, failed, cancelled, error
Confirmation message for successful stream initiation
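A successful initiation response might then be handled as in the sketch below; the property names ('stream_id', 'status', 'message') are assumptions based on the field descriptions above.

```python
# Hedged sketch of handling the initiation response; property names
# ("stream_id", "status", "message") are assumptions based on the
# descriptions above.
payload = {
    "stream_id": "8c3f2a1e-5b7d-4e9f-a0c1-2d3e4f5a6b7c",
    "status": "pending",  # one of: pending, processing, completed, failed, cancelled, error
    "message": "Chat stream initiated successfully",
}

if payload["status"] in {"pending", "processing"}:
    # Poll or subscribe to the stream endpoint using the returned identifier.
    print(f"Observe events for stream {payload['stream_id']} via the stream endpoint.")
```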