Metrics

Measure system behavior and keep the evidence close.

Metrics tell leaders whether a system is healthy, costly, or changing over time. CloudGrid ingests OTLP counters, gauges, histograms, summaries, and exemplars as project data, then keeps metric context linkable to traces, dashboards, alerts, and evaluation reviews.

Metrics become more valuable when they explain what to inspect.

A chart alone rarely answers the whole question. CloudGrid keeps metric descriptors, attributes, and exemplars available so a trend can lead to a trace, a dashboard, an alert, or an AI evaluation review in the same project.

From trend to operational decision.

Metrics help enterprise teams move from "something changed" to "this is where we should investigate, alert, or review." CloudGrid keeps that path close to the rest of the evidence.

01
See whether the system is healthy.

Metrics give leaders and operators the shape of service health, latency, throughput, error pressure, and runtime cost over time.

02
Move from trend to evidence.

Descriptors and attributes explain what is being measured. Exemplars can point from a metric spike back to the trace behind it.

03
Make the signal operational.

The same metric can support a dashboard, an absence or threshold alert, and AI-runtime review without leaving the project context.
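To make the move from trend to evidence concrete, here is a minimal sketch of an exemplar pivot: given a series of data points, find the peak that carries exemplars and return the trace ID recorded with it. The `Exemplar` and `DataPoint` classes and the `trace_for_spike` helper are hypothetical names used for illustration, not CloudGrid's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exemplar:
    value: float      # the individual measurement that was sampled
    trace_id: str     # trace that produced the measurement
    span_id: str


@dataclass
class DataPoint:
    time_unix_nano: int
    value: float
    exemplars: List[Exemplar] = field(default_factory=list)


def trace_for_spike(points: List[DataPoint]) -> Optional[str]:
    """Return the trace ID behind the highest point that carries an exemplar."""
    candidates = [p for p in points if p.exemplars]
    if not candidates:
        return None  # no exemplars recorded, so there is nothing to pivot to
    peak = max(candidates, key=lambda p: p.value)
    # Within the peak, prefer the exemplar with the largest sampled value.
    worst = max(peak.exemplars, key=lambda e: e.value)
    return worst.trace_id
```

Points without exemplars are skipped on purpose: a spike with no sampled trace can still be charted, but it offers nothing to open.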

What CloudGrid provides

  • OTLP HTTP ingest on /v1/metrics with JSON and protobuf encodings.
  • Metric descriptor browsing with units, types, and attribute key inventories.
  • Time-series exploration with aggregation and grouping by descriptor attributes.
  • Exemplar pivots back into the originating trace.
  • AI runtime metrics such as token totals, model-call counts, latency, and cost signals ingested as standard OTLP.
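As an illustration of the ingest path, the sketch below builds an OTLP/JSON body for a monotonic token counter and prepares a POST to /v1/metrics. It follows the standard OTLP JSON encoding (lowerCamelCase field names, 64-bit integers as strings). The metric name `ai.tokens.total`, the scope name, the host `cloudgrid.example`, and both helper functions are assumptions for illustration; any authentication headers a real project requires are not shown.

```python
import json
import urllib.request


def build_sum_payload(service, metric_name, unit, value, start_ns, end_ns, attributes=None):
    """Build an OTLP/JSON body holding one cumulative, monotonic sum data point."""
    attrs = [
        {"key": k, "value": {"stringValue": str(v)}}
        for k, v in (attributes or {}).items()
    ]
    return {
        "resourceMetrics": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service}}
                ]
            },
            "scopeMetrics": [{
                "scope": {"name": "example.instrumentation"},  # hypothetical scope
                "metrics": [{
                    "name": metric_name,
                    "unit": unit,
                    "sum": {
                        "aggregationTemporality": 2,  # 2 = cumulative
                        "isMonotonic": True,
                        "dataPoints": [{
                            "asInt": str(value),  # int64 values travel as strings
                            "startTimeUnixNano": str(start_ns),
                            "timeUnixNano": str(end_ns),
                            "attributes": attrs,
                        }],
                    },
                }],
            }],
        }]
    }


def build_request(base_url, payload):
    """Prepare (but do not send) the HTTP request for the /v1/metrics endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/metrics",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it. In practice an OpenTelemetry SDK or collector exporter usually produces this body, so hand-built payloads are mostly useful for smoke tests.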

Operating model

  • Long-term metric retention uses the v1 SurrealDB adapter; adapters for alternative backends are the natural next step.
  • CloudGrid uses GraphQL view models with explicit aggregations for metric exploration.
  • Recording-rule and downsampling pipelines are owned by your collector layer for now.
  • Metric threshold and absence rules are managed in the project alerting workspace.
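The two rule types named above differ in what they watch: a threshold rule looks at recent values, while an absence rule looks at how long the series has been silent. The sketch below shows that distinction in plain Python. The `Sample` class, both function names, and the windowed-mean semantics are assumptions for illustration and do not describe how the alerting workspace evaluates rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    timestamp: float  # seconds since an arbitrary epoch
    value: float


def threshold_breached(samples: List[Sample], limit: float, window_s: float, now: float) -> bool:
    """True when the mean of samples inside the trailing window exceeds the limit."""
    recent = [s.value for s in samples if now - window_s <= s.timestamp <= now]
    if not recent:
        return False  # silence is the absence rule's job, not the threshold rule's
    return sum(recent) / len(recent) > limit


def is_absent(samples: List[Sample], max_gap_s: float, now: float) -> bool:
    """True when no sample has arrived within max_gap_s seconds of now."""
    if not samples:
        return True
    newest = max(s.timestamp for s in samples)
    return now - newest > max_gap_s
```

Keeping the two checks separate means a stalled exporter raises an absence alert instead of quietly passing every threshold rule.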

Where metric evidence is used.

Metrics support different enterprise conversations when they stay attached to the project record.

Operations

Track service latency, volume, error pressure, and saturation signals before they become customer-facing incidents.

FinOps

Bring runtime, token, model-call, latency, and cost signals into the same project where AI behavior is evaluated.

Leadership

Use dashboards and alerts to turn metric changes into visible operating signals instead of ad-hoc charts.