Platform metrics are the operational metrics generated by the inference stack running in your cluster. They include:
- Triton metrics such as request rate, failures, queue depth, and latency
- vLLM metrics such as scheduler state, KV cache pressure, and token throughput
- GPU metrics from the DCGM exporter
- Node metrics from node_exporter
Before you enable them
Platform metrics require the monitoring stack.
Enable platform metrics
Configuration options
| Flag | Default | What it controls |
|---|---|---|
| platformMetrics.enabled | false | Turns platform metric collection on |
| platformMetrics.generationIntervalMs | 2000 | Triton metric generation interval, in milliseconds |
| platformMetrics.gpu.enabled | true | Includes GPU metrics |
| platformMetrics.inference.counterLatencies | true | Enables cumulative latency counters |
| platformMetrics.inference.histogramLatencies | true | Enables latency histograms |
| platformMetrics.inference.summaryLatencies | true | Enables sliding-window latency summaries |
| platformMetrics.inference.summaryQuantiles | "" | Overrides Triton’s default summary quantiles |
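The dotted flag names in the table suggest a nested configuration structure. The sketch below expands them into nested keys and applies overrides on top of the documented defaults. It assumes the dotted names map one-to-one onto nested keys, which this page does not confirm, so check the result against your deployment's configuration reference. The helpers `nest` and `build_config` are hypothetical names used for illustration only.

```python
import json

# Flag names and defaults copied from the table above.
DEFAULTS = {
    "platformMetrics.enabled": False,
    "platformMetrics.generationIntervalMs": 2000,
    "platformMetrics.gpu.enabled": True,
    "platformMetrics.inference.counterLatencies": True,
    "platformMetrics.inference.histogramLatencies": True,
    "platformMetrics.inference.summaryLatencies": True,
    "platformMetrics.inference.summaryQuantiles": "",
}


def nest(flat):
    """Expand dotted keys into nested dictionaries (assumed layout)."""
    root = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def build_config(overrides):
    """Merge overrides onto the defaults, rejecting unknown flag names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown flags: {sorted(unknown)}")
    return nest({**DEFAULTS, **overrides})


if __name__ == "__main__":
    # Turn collection on and leave everything else at its default.
    config = build_config({"platformMetrics.enabled": True})
    print(json.dumps(config, indent=2))
```

Rejecting unknown names catches typos such as a misspelled flag before the configuration is applied.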
What you get
Triton
Main metric family: nv_*
Examples:
- nv_inference_request_success
- nv_inference_request_failure
- nv_inference_pending_request_count
- nv_inference_request_duration_us
- nv_inference_compute_infer_duration_us
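As one way to use the Triton request counters, the sketch below parses a scrape in the Prometheus text exposition format and derives a failure ratio from nv_inference_request_success and nv_inference_request_failure. The sample payload, its label names, and its values are made up for illustration, and the parser is a deliberately minimal one rather than a full implementation of the format.

```python
import re
from collections import defaultdict

# Illustrative scrape output. Labels and values are invented, not real data.
SAMPLE = """\
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="llm",version="1"} 980
# TYPE nv_inference_request_failure counter
nv_inference_request_failure{model="llm",version="1"} 20
nv_inference_pending_request_count{model="llm",version="1"} 3
"""

# metric name, optional {labels}, then the sample value
LINE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)"
)


def totals(text):
    """Sum sample values per metric name across all label sets."""
    sums = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match:
            sums[match.group("name")] += float(match.group("value"))
    return dict(sums)


def failure_ratio(sums):
    """Failed requests divided by all requests, or 0.0 if there were none."""
    ok = sums.get("nv_inference_request_success", 0.0)
    failed = sums.get("nv_inference_request_failure", 0.0)
    total = ok + failed
    return failed / total if total else 0.0


if __name__ == "__main__":
    print(f"failure ratio: {failure_ratio(totals(SAMPLE)):.3f}")
```

Because these are cumulative counters, the ratio covers the whole lifetime of the process. For a ratio over a recent window, take two scrapes and apply the same calculation to the differences.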
vLLM
Main metric families:
- vllm_llms_v1:*
- vllm_embeddings_v1:*
Examples:
- vllm_llms_v1:num_requests_running
- vllm_llms_v1:kv_cache_usage_perc
- vllm_llms_v1:time_to_first_token_seconds_bucket
- vllm_llms_v1:generation_tokens_total
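A `_bucket` series such as vllm_llms_v1:time_to_first_token_seconds_bucket holds cumulative counts per upper bound, so percentiles have to be estimated rather than read directly. The sketch below estimates a quantile by linear interpolation inside the matching bucket, the same general approach PromQL's `histogram_quantile` takes. The bucket bounds and counts are invented for illustration, and `estimate_quantile` is a hypothetical helper name.

```python
import math

# Illustrative cumulative buckets: (upper bound in seconds, count). Not real data.
BUCKETS = [
    (0.05, 10),
    (0.1, 40),
    (0.25, 80),
    (0.5, 95),
    (1.0, 99),
    (math.inf, 100),
]


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Cannot interpolate into the +Inf bucket; report the
                # highest finite bound instead.
                return prev_bound
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    print(f"estimated p95 TTFT: {estimate_quantile(0.95, BUCKETS):.3f}s")
```

The estimate is only as precise as the bucket boundaries, so treat it as an approximation, especially when most observations fall into a single wide bucket.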
GPU
Examples:
- nv_gpu_utilization
- nv_gpu_memory_used_bytes
- nv_gpu_power_usage
Node
Main metric family: node_*
Examples:
- node_cpu_seconds_total
- node_memory_MemAvailable_bytes
- node_filesystem_avail_bytes
Next step
After platform metrics are enabled, you can:
- inspect them in the in-cluster Grafana stack
- forward them to your own backend through Metrics Destinations
- use the reference External Grafana Dashboard