Observability helps you answer three questions:
- Is Zylon healthy?
- What is it doing?
- Where should its metrics go?
- Crash reporting tells Zylon when the platform fails, so support can diagnose the problem.
- Usage metrics send anonymous product telemetry to Zylon.
- Monitoring installs the local monitoring stack inside your cluster.
- Platform metrics are the actual technical metrics from Triton, vLLM, GPUs, and nodes.
- Destinations send those metrics to your own monitoring backend.
Getting started
For most setups, think about observability in this order:

- Enable monitoring if you want metrics at all.
- Enable platformMetrics if you want Triton, vLLM, GPU, and node metrics.
- Add destinations if you want to send those metrics to your own backend.
- Keep or disable crashReporting and usageMetrics depending on whether you want Zylon telemetry.
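Putting that order together, a minimal values.yaml sketch might look like the following. The exact nesting, in particular the `enabled` sub-keys, is an assumption; this page only names `observability.platformMetrics.enabled` explicitly, so check your chart's values file for the precise shape:

```yaml
observability:
  crashReporting:
    enabled: true      # crash diagnostics to Zylon via Sentry
  usageMetrics:
    enabled: true      # anonymous product telemetry to Zylon
  monitoring:
    enabled: true      # in-cluster Prometheus, Grafana, and k8s-monitoring
  platformMetrics:
    enabled: true      # Triton, vLLM, GPU, and node metrics

k8s-monitoring:
  extraDestinations: []  # optional: forward metrics to your own backend
```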
Crash reporting
observability.crashReporting controls whether Zylon sends crash diagnostics to Sentry.
Enable it if you want Zylon support to have failure information when the platform breaks. Disable it if you do not want any crash diagnostics sent to Zylon.
Usage metrics
observability.usageMetrics controls whether Zylon sends anonymous product telemetry to Zylon-managed observability services.
This is product-level telemetry, not the detailed Triton or vLLM metrics you use for operating the cluster. Disable it if you do not want to send usage telemetry to Zylon.
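If you want to keep all telemetry in-house, both toggles can be switched off together. A sketch, again assuming `enabled`-style sub-keys rather than a shape confirmed by this page:

```yaml
observability:
  crashReporting:
    enabled: false   # no crash diagnostics sent to Zylon
  usageMetrics:
    enabled: false   # no anonymous product telemetry sent to Zylon
```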
Monitoring
observability.monitoring installs the in-cluster monitoring stack, including Prometheus, Grafana, and k8s-monitoring. Monitoring must be enabled if you want local metrics or external metric forwarding.
This is the base for everything else related to metrics. If monitoring is disabled, you cannot inspect platform metrics locally and you cannot forward them to your own destinations.
Platform metrics
observability.platformMetrics.enabled turns on the operational metrics generated by the inference stack. Platform metrics require monitoring to be enabled.
These are the metrics you use to understand request rate, failures, latency, queue depth, scheduler pressure, GPU usage, and host health. They come from Triton, vLLM, the GPU exporter, and node_exporter.
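For orientation, these are the kinds of Prometheus queries such metrics enable. The metric names below are the ones commonly exposed by vLLM, Triton, and the DCGM GPU exporter; they are not confirmed by this page and may vary by exporter version:

```promql
# Requests currently being processed vs. queued by vLLM
vllm:num_requests_running
vllm:num_requests_waiting

# Triton successful inference request rate over the last 5 minutes
rate(nv_inference_request_success[5m])

# GPU utilization as reported by the DCGM exporter
DCGM_FI_DEV_GPU_UTIL
```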
For the full metric configuration, see Platform Metrics.
External destinations
k8s-monitoring.extraDestinations forwards the metrics collected in your cluster to your own monitoring backend. Like platform metrics, external destinations require monitoring to be enabled.
Use it only when you want to send metrics somewhere outside the built-in monitoring stack, for example to Prometheus, Grafana Cloud, or an OTLP collector.
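As a sketch of what a destination entry might look like, assuming the list-of-destinations shape used by the Grafana k8s-monitoring chart (the field names and endpoint URLs here are illustrative, not confirmed by this page):

```yaml
k8s-monitoring:
  extraDestinations:
    - name: company-prometheus   # illustrative name
      type: prometheus           # remote-write target
      url: https://prometheus.example.com/api/v1/write
    - name: otel-gateway
      type: otlp                 # OTLP collector endpoint
      url: https://otel.example.com:4317
```

See Metrics Destinations for the supported destination types and their exact fields.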
For destination setup, see Metrics Destinations.
Next pages
For the core Zylon configuration:

- Platform Metrics: enable Triton, vLLM, GPU, and node metrics
- Metrics Destinations: send metrics to Prometheus, Grafana-compatible backends, or OTLP
- External Grafana Dashboard: import the reference dashboard into your own Grafana instance