Early Access — Now Open

Telemetry for every
GPU · CPU · TPU

TeleGCT collects, normalises, and scores hardware telemetry across your entire heterogeneous compute fleet — regardless of vendor, cloud, or infrastructure layer.

Request Early Access See How It Works

// live telemetry — simulated cluster snapshot

GPU Wastage Score 0.68 node-01 · A100-80GB · idle
Power Headroom 173W of 400W TDP available
Inference Efficiency 2.3 tokens/watt · MI300X
Stranded Capacity 9.2kW rack-01 · contracted but unused

// hardware coverage

Every accelerator.
One platform.

TeleGCT normalises telemetry from every major compute vendor into a single unified schema — no matter where your workloads run.

GPU

Full telemetry from training and inference GPUs. Compute utilisation, power draw, VRAM usage, temperature, and wastage scoring per device.

NVIDIA H100 NVIDIA A100 AMD MI300X AMD MI250X Intel Gaudi 2 Groq LPU

CPU

Inference-grade CPU telemetry for ARM and x86. Core utilisation, memory bandwidth, power envelope, and tokens-per-watt for LLM serving workloads.

AWS Graviton 3/4 Ampere Altra AMD EPYC Intel Xeon Apple M-series

TPU

Purpose-built AI accelerator telemetry. Chip utilisation, power draw, and throughput metrics from cloud and on-premises deployments.

Google TPU v4 Google TPU v5 AWS Inferentia 2 AWS Trainium Qualcomm Cloud AI 100

// what telegct measures

Intelligence at every
layer of the stack.

From raw PDU outlet watts to per-token inference cost — TeleGCT connects the dots your existing tools miss.

01

Wastage Score

Per-device wastage scoring identifies idle hardware drawing power without doing work. Correlates GPU compute utilisation with actual wall power draw for accurate waste attribution.

02

Inference Portability

Benchmark your models across GPU, CPU, and TPU hardware. Get tokens-per-second and tokens-per-watt data to make migration decisions with confidence before you move production workloads.

03

Power Correlation

SNMP telemetry from PDU outlets cross-correlated with GPU metrics. See stranded rack capacity and contracted power waste — the cost your cloud bill doesn't show you.

04

Works With Your Stack

Connects to any Prometheus instance, DCGM exporter, ROCm exporter, or vLLM metrics endpoint. No agent required on your nodes. No K8s dependency. Plain EC2 and VMs fully supported.

// telegct scoring formulas

# Wastage Score — 0.0 (fully used) → 1.0 (fully wasted)
wastage_score = (1 - util_pct / 100) × (draw_w / tdp_w)

# Power Headroom — watts available before TDP ceiling
headroom_w = tdp_w - draw_w

# Inference Efficiency — tokens delivered per watt consumed
efficiency = tokens_per_sec / draw_w
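The three formulas above translate directly into Python. A minimal sketch (function names are ours; a production pipeline would presumably also clamp inputs and handle missing samples):

```python
def wastage_score(util_pct: float, draw_w: float, tdp_w: float) -> float:
    """0.0 = fully used, 1.0 = fully wasted: idle fraction weighted by power fraction."""
    return (1 - util_pct / 100) * (draw_w / tdp_w)

def power_headroom_w(tdp_w: float, draw_w: float) -> float:
    """Watts available before the TDP ceiling."""
    return tdp_w - draw_w

def inference_efficiency(tokens_per_sec: float, draw_w: float) -> float:
    """Tokens delivered per watt consumed."""
    return tokens_per_sec / draw_w

# A fully idle 400W-TDP GPU still drawing 272W wastes most of its power budget:
print(round(wastage_score(0.0, 272.0, 400.0), 2))   # 0.68
# 227W drawn against a 400W TDP leaves 173W of headroom:
print(power_headroom_w(400.0, 227.0))               # 173.0
```

Note how the wastage score punishes the worst case specifically: a device that is both idle *and* drawing near-TDP power scores close to 1.0, while a busy device at high draw scores near 0.0.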

// data sources

Connects to everything
you already run.

TeleGCT pulls from your existing observability stack. No new agents, no new dashboards to maintain. All we need is one URL and a read-only token.

REST / HTTPS

Prometheus

Any Prometheus-compatible endpoint. DCGM exporter, ROCm exporter, vLLM metrics — all supported. Basic, Bearer, OAuth2, API key auth.
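Pulling a reading from any Prometheus-compatible endpoint is a single instant query against the standard `/api/v1/query` HTTP API. A sketch using only the standard library (the base URL is a placeholder; `DCGM_FI_DEV_GPU_UTIL` is the DCGM exporter's GPU-utilisation metric):

```python
import json
from functools import reduce
from urllib.parse import urlencode

def build_query_url(base_url: str, promql: str) -> str:
    """Instant-query URL for any Prometheus-compatible endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?" + urlencode({"query": promql})

def parse_instant_vector(payload: str) -> dict:
    """Map a Prometheus instant-vector response to {gpu_label: value}."""
    data = json.loads(payload)
    return {
        r["metric"].get("gpu", "?"): float(r["value"][1])
        for r in data["data"]["result"]
    }

url = build_query_url("https://prom.internal:9090", "DCGM_FI_DEV_GPU_UTIL")
# A fetch via urllib.request (plus your auth header) would return JSON shaped like:
sample = ('{"status":"success","data":{"resultType":"vector",'
          '"result":[{"metric":{"gpu":"0"},"value":[1700000000,"87"]}]}}')
print(parse_instant_vector(sample))  # {'0': 87.0}
```

The same two functions work unchanged against ROCm-exporter or vLLM metric names; only the PromQL string changes.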

gRPC

DCGM Direct

Connect directly to NVIDIA DCGM on port 5555 for richer GPU telemetry without requiring a Prometheus layer.

SNMP v2c / v3

PDU Power

APC, Raritan, Vertiv, ServerTech, Eaton. Outlet-level watts and amps correlated with GPU metrics for wall power attribution.
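At its core, the stranded-capacity figure is contracted rack power minus the draw the PDU outlets actually report. A minimal sketch (the function and example numbers are illustrative):

```python
def stranded_kw(contracted_kw: float, outlet_watts: list[float]) -> float:
    """Contracted rack power minus summed measured outlet draw, in kW.
    A positive result is capacity you pay for but never draw."""
    drawn_w = sum(outlet_watts)
    return max(contracted_kw * 1000.0 - drawn_w, 0.0) / 1000.0

# rack-01: 20 kW contracted, four outlets each reporting 2700 W via SNMP
print(stranded_kw(20.0, [2700.0, 2700.0, 2700.0, 2700.0]))  # 9.2
```

Cross-referencing that per-outlet draw with per-GPU utilisation is what turns a raw wattage gap into an attribution: which hosts on the rack are consuming the power, and whether they are doing work for it.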

REST API

Cloud Monitoring

AWS CloudWatch for Inferentia and Trainium. Google Cloud Monitoring for TPU v4/v5. Azure Monitor for GPU VMs.

REST / Generic

Custom Endpoints

Define your own response schema mapping. Any JSON REST API can be a TeleGCT source with zero code changes.
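A response-schema mapping amounts to walking dotted paths through whatever JSON your endpoint returns and renaming the leaves. A sketch of the idea (the mapping format and field names here are illustrative, not TeleGCT's actual config syntax):

```python
import json
from functools import reduce

# Illustrative mapping: dotted paths in a custom JSON response -> unified fields
FIELD_MAP = {
    "util_pct": "gpu.stats.utilization",
    "draw_w": "gpu.power.current_watts",
    "tdp_w": "gpu.power.limit_watts",
}

def extract(doc: dict, dotted: str):
    """Walk a dotted path through nested JSON."""
    return reduce(lambda d, k: d[k], dotted.split("."), doc)

def normalise(raw: str, field_map: dict) -> dict:
    """Apply the mapping to a raw JSON payload."""
    doc = json.loads(raw)
    return {field: extract(doc, path) for field, path in field_map.items()}

raw = ('{"gpu": {"stats": {"utilization": 42},'
       ' "power": {"current_watts": 180, "limit_watts": 400}}}')
print(normalise(raw, FIELD_MAP))
# {'util_pct': 42, 'draw_w': 180, 'tdp_w': 400}
```

Because the mapping is data rather than code, adding a new vendor endpoint is a config change, not a deploy.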

Agent (optional)

TeleGCT Exporter

Lightweight binary for bare EC2 instances and VMs. Auto-detects NVIDIA and AMD hardware. Exposes /metrics over HTTPS.

Get early access.

We're onboarding our first design partners now. Point us at your Prometheus instance and we'll show you your wastage score in under 10 minutes.