The Messages API is the core of ZylonGPT. You send a conversation history, and Zylon returns the next assistant message as a list of content blocks (text, tool calls, tool results, and more).
Basic request and response
Send one user message and get back one assistant message.
Single message
Content blocks
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 32,
"temperature": 0,
"messages": [
{ "role": "user", "content": "Write a one-sentence status update about deploying a billing dashboard and fixing three bugs." }
]
}'
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 32,
"temperature": 0,
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Write a one-sentence status update about deploying a billing dashboard and fixing three bugs." }
]
}
]
}'
{
  "id": "msg_abf981cf3f1c449eacb7a85c064ca505",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2026-02-06T13:23:16.876744Z",
      "stop_timestamp": "2026-02-06T13:23:16.975362Z",
      "text": "Deployed the new billing dashboard and resolved three bugs successfully."
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 95,
    "output_tokens": 16
  }
}
To render plain text, concatenate the text values from any blocks where type is "text".
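As a sketch, a small Python helper (hypothetical, not part of any official SDK) that pulls the plain text out of a response body:

```python
def extract_text(message: dict) -> str:
    """Concatenate the text of every content block whose type is "text"."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    )

# Example with a trimmed-down response body:
response = {
    "content": [
        {"type": "text", "text": "Deployed the new billing dashboard "},
        {"type": "text", "text": "and resolved three bugs successfully."},
    ]
}
print(extract_text(response))
```

Skipping non-text blocks (rather than failing on them) keeps the helper working when a response also contains tool calls or tool results.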
Messages shape
At minimum, send:
Field        Required  Description
model        Yes       Model name to use (commonly "default").
messages     Yes       Conversation history.
max_tokens   Yes       Maximum tokens to generate.
Each message has:
Field    Description
role     "user" or "assistant".
content  A string, or an array of content blocks.
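The minimal request body can be assembled like this; `build_payload` is a hypothetical helper written for illustration, using only the field names documented above:

```python
def build_payload(messages, model="default", max_tokens=32, **extra):
    """Assemble a minimal Messages request body.

    Optional fields (e.g. temperature, system, response_format)
    are passed through unchanged via **extra.
    """
    payload = {"model": model, "max_tokens": max_tokens, "messages": messages}
    payload.update(extra)
    return payload

payload = build_payload(
    [{"role": "user", "content": "Write a one-sentence status update."}],
    temperature=0,
)
```

The same dict can then be JSON-encoded and sent as the request body of any of the curl examples in this page.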
Validate a request
Use /messages/validate to check payload shape before sending a real request. It returns valid: true or a list of validation errors.
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages/validate" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 32,
"messages": [
{ "role": "user", "content": "Validate this payload." }
]
}'
Example response:
{ "valid": true, "errors": null }
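Client code can branch on that response with a small helper; this is an illustrative sketch, assuming `errors` is either `null` or a list as shown above:

```python
def validation_errors(resp: dict) -> list:
    """Return the list of validation errors, or an empty list when valid."""
    if resp.get("valid"):
        return []
    return resp.get("errors") or []

# Valid payload -> nothing to report:
print(validation_errors({"valid": True, "errors": None}))   # []
```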
Multi-turn conversations
To continue a conversation, include the prior turns in messages:
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 16,
"temperature": 0,
"messages": [
{ "role": "user", "content": "My name is Ada." },
{ "role": "assistant", "content": "Nice to meet you." },
{ "role": "user", "content": "What is my name? Reply with only the name, no punctuation." }
]
}'
{
  "id": "msg_9b14c8fd514741d29a0dff8996ca9213",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2026-02-06T10:34:12.924887Z",
      "stop_timestamp": "2026-02-06T10:34:12.974785Z",
      "text": "Ada"
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": { "input_tokens": 135, "output_tokens": 2 }
}
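To keep a conversation going, append the assistant reply and the next user turn to the history before each request; a sketch (`append_turn` is a hypothetical helper, working on the response shape shown above):

```python
def append_turn(history: list, assistant_message: dict, next_user_text: str) -> list:
    """Extend the history with the assistant's content blocks, then the next user message."""
    history.append({"role": "assistant", "content": assistant_message["content"]})
    history.append({"role": "user", "content": next_user_text})
    return history

history = [{"role": "user", "content": "My name is Ada."}]
reply = {"content": [{"type": "text", "text": "Nice to meet you."}]}
append_turn(history, reply, "What is my name? Reply with only the name.")
# history now holds three turns, ready to send as "messages".
```

Reusing the assistant's content blocks verbatim (rather than re-flattening them to a string) preserves any non-text blocks the reply contained.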
System prompts
Use the top-level system field to control assistant behavior, rather than a message with a "system" role; putting instructions in the top-level field ensures they are applied consistently across the whole request:
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 8,
"temperature": 0,
"system": { "text": "You are a support agent. Reply with only a short ticket title.", "use_default_prompt": false },
"messages": [
{ "role": "user", "content": "The login button on mobile does not work." }
]
}'
{
  "id": "msg_4e328dc4baeb42b8a18dc0c4abd89468",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2026-02-06T13:37:36.368426Z",
      "stop_timestamp": "2026-02-06T13:37:36.446733Z",
      "text": "Mobile login button not working"
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": { "input_tokens": 100, "output_tokens": 6 }
}
Structured JSON output (json_schema)
If you need machine-readable output, set response_format.type to json_schema and provide a schema. Zylon returns JSON as text in a text content block.
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 128,
"response_format": {
"type": "json_schema",
"json_schema": {
"type": "object",
"additionalProperties": false,
"properties": {
"status": { "type": "string" },
"next_step": { "type": "string" }
},
"required": ["status", "next_step"]
}
},
"system": { "text": "Return ONLY valid JSON.", "use_default_prompt": false },
"messages": [
{ "role": "user", "content": "We deployed the billing dashboard. Return a status and the next step." }
]
}'
{
  "id": "msg_eaf35004b4654234be5e767ea09e72e6",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2026-02-06T13:39:44.864978Z",
      "stop_timestamp": "2026-02-06T13:39:45.412991Z",
      "text": "{\"status\": \"success\", \"next_step\": \"Monitor adoption and collect feedback.\"}"
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": { "input_tokens": 98, "output_tokens": 16 }
}
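Because the JSON arrives as the text of a text content block, decode it with a standard JSON parser after joining the text blocks; a sketch:

```python
import json

def parse_structured_output(message: dict) -> dict:
    """Join the text blocks of a response and decode them as JSON.

    Raises json.JSONDecodeError if the model emitted malformed JSON,
    which callers should be prepared to handle.
    """
    raw = "".join(
        block["text"]
        for block in message["content"]
        if block.get("type") == "text"
    )
    return json.loads(raw)

msg = {"content": [{"type": "text",
                    "text": "{\"status\": \"success\", \"next_step\": \"Monitor adoption.\"}"}]}
result = parse_structured_output(msg)
print(result["status"])  # success
```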
Streaming (SSE)
Set stream: true to receive a text/event-stream response containing events like content_block_delta (token deltas).
curl -N -X POST "https://{BASE_URL}/api/gpt/v1/messages" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"model": "default",
"stream": true,
"max_tokens": 32,
"messages": [
{ "role": "user", "content": "Give a three-word tagline for a productivity app." }
]
}'
event: message_start
data: {"type":"message_start","message":{"id":"msg_b55d29718361410bad4f80f8f67c6b72","type":"message","role":"assistant","content":[],"model":"private-gpt","usage":{}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"block_id":"block_019c331ac11375cea03f7e0fb7a2a044","content_block":{"type":"text","start_timestamp":"2026-02-06T13:18:37.331257Z","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"block_id":"block_019c331ac11375cea03f7e0fb7a2a044","delta":{"type":"text_delta","text":"Focus"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"block_id":"block_019c331ac11375cea03f7e0fb7a2a044","delta":{"type":"text_delta","text":" fast daily"}}
event: content_block_stop
data: {"type":"content_block_stop","stop_timestamp":"2026-02-06T13:18:37.417617Z","index":0,"block_id":"block_019c331ac11375cea03f7e0fb7a2a044"}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"input_tokens":90,"output_tokens":3}}
event: message_stop
data: {"type":"message_stop"}
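A minimal Python sketch of consuming the stream: parse each `data:` line as JSON and concatenate the `text_delta` fragments. The event framing follows the transcript above; how you obtain the line iterator (e.g. from an HTTP client's streaming response) is up to you:

```python
import json

def accumulate_text(sse_lines) -> str:
    """Collect text_delta fragments from an iterable of raw SSE lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event: ..." lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)

# Example with lines taken from the transcript above:
lines = [
    'event: content_block_delta',
    'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Focus"}}',
    'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":" fast daily"}}',
    'data: {"type":"message_stop"}',
]
print(accumulate_text(lines))  # Focus fast daily
```

Filtering on the `type` fields (rather than on the `event:` lines) keeps the parser robust if events arrive without an explicit event name.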
Async messages
Async messages use the same request body as sync messages; you just call a different endpoint. The main difference is the response shape: sync returns a full message, while async returns a message_id you can stream or poll later.
Sync vs async response shape
Sync response
Async response
{
  "id": "msg_abf981cf3f1c449eacb7a85c064ca505",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "start_timestamp": "2026-02-06T13:23:16.876744Z",
      "stop_timestamp": "2026-02-06T13:23:16.975362Z",
      "text": "Deployed the new billing dashboard and resolved three bugs successfully."
    }
  ],
  "model": "private-gpt",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 95,
    "output_tokens": 16
  }
}
{
  "message_id": "1cc8ed5e-fc7a-45e1-baf4-0b916ed32290",
  "status": "pending",
  "message": "Request initiated successfully"
}
Start an async message
curl -X POST "https://{BASE_URL}/api/gpt/v1/messages/async" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"max_tokens": 32,
"messages": [
{ "role": "user", "content": "Say only: async ok" }
]
}'
Stream or poll status
Stream events:
curl -N "https://{BASE_URL}/api/gpt/v1/messages/async/{message_id}/stream" \
-H "Authorization: Bearer {API_TOKEN}" \
-H "Accept: text/event-stream"
Poll status:
curl "https://{BASE_URL}/api/gpt/v1/messages/async/{message_id}/status" \
-H "Authorization: Bearer {API_TOKEN}"
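A polling loop should stop once the status is terminal. The sketch below is illustrative: the terminal status names (`completed`, `failed`, `cancelled`) are assumptions to verify against the status endpoint's actual values, and the `fetch_status` callable stands in for a real HTTP call to the status URL:

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}  # assumed names; verify

def is_terminal(status: str) -> bool:
    return status in TERMINAL_STATUSES

def poll(fetch_status, interval_s: float = 1.0, max_attempts: int = 30) -> dict:
    """Call fetch_status() until a terminal status or the attempt budget runs out."""
    for _ in range(max_attempts):
        resp = fetch_status()  # e.g. GET .../messages/async/{message_id}/status
        if is_terminal(resp.get("status", "")):
            return resp
        time.sleep(interval_s)
    raise TimeoutError("async message did not finish in time")

# Example with a stub standing in for the network call:
states = iter([{"status": "pending"}, {"status": "completed"}])
result = poll(lambda: next(states), interval_s=0)
print(result["status"])  # completed
```

For long generations, streaming the async message's events is usually preferable to polling, since it avoids the fixed polling interval.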
Cancel or delete
Cancel: POST /messages/async/{message_id}/cancel
Delete: DELETE /messages/async/{message_id}/delete
Common mistakes
Wrong method: /messages is POST, not GET.
Missing auth header: include Authorization: Bearer ....
Wrong base URL: use https://{BASE_URL}/api/gpt/v1/.
Invalid request body: use POST /messages/validate to catch schema mistakes early.
Unexpected stops or errors: see Handling stop reasons for causes and fixes.
Next steps
Tools Let Zylon use built-in and custom tools during a message.
Artifacts Ingest text, files, or URIs into collections for retrieval.