Observability without cloud bills

January 20, 2026 · 1 min read · RPi Kubernetes

Building a useful observability stack with Prometheus, Grafana, Loki, Vector, Jaeger, and OpenTelemetry.

RPi Kubernetes · Systems Design · Local First · Development Timeline

Why this mattered

I wanted traces and metrics to be available before a real incident, not added after one.

This belongs in the development timeline because RPi Kubernetes is not a single feature. It is a hybrid k3s homelab with an Ubuntu control plane, four Raspberry Pi 5 workers, Cloudflare Tunnel, and a data platform made from Kafka, Flink, Redis Stack, MinIO, DataHub, Airbyte, Polaris, and observability services. The project only became useful once its infrastructure decisions were written down well enough to be repeated.

Design decision

The homelab stack favors open components and clear ownership: apps emit telemetry, collectors route it, and dashboards answer operational questions.

The practical stack around this decision includes k3s, Kustomize, Helm, Strimzi Kafka, Flink Operator, Redis Stack, RAGFlow, DataHub, Airbyte, Polaris, MinIO, Prometheus, Grafana, Loki, OpenTelemetry, Cloudflare Tunnel, FastAPI, and Next.js. I try to keep the interfaces small: configuration describes intent, runtime code owns behavior, and operational notes explain what a future maintainer should check first.
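To make the "apps emit, collectors route, dashboards answer" split concrete, here is a minimal sketch of a routing table in plain Python. It is an illustration, not the cluster's actual collector configuration: the `Signal` type, the `route` function, the `owner` field, and the mapping of signal kinds to Prometheus, Loki, and Jaeger are all assumptions based on the usual roles of those tools.

```python
from dataclasses import dataclass

# Assumed mapping from signal kind to backend, based on the typical role
# of each tool named in this post. The real routing may differ.
ROUTES = {
    "metric": "prometheus",
    "log": "loki",
    "trace": "jaeger",
}


@dataclass(frozen=True)
class Signal:
    kind: str    # "metric", "log", or "trace"
    source: str  # name of the emitting app (hypothetical)
    owner: str   # who is responsible for this signal (hypothetical field)


def route(signal: Signal) -> str:
    """Return the backend name for a signal, rejecting unknown kinds and unowned signals."""
    if signal.kind not in ROUTES:
        raise ValueError(f"unknown signal kind: {signal.kind!r}")
    if not signal.owner.strip():
        raise ValueError(f"signal from {signal.source!r} has no owner")
    return ROUTES[signal.kind]


if __name__ == "__main__":
    example = Signal(kind="trace", source="example-api", owner="platform")
    print(route(example))
```

Keeping the table this small is the point: if a signal has no kind the collector understands, or no owner, it fails loudly instead of being stored by default.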

What I would repeat

The constraint is retention and resource use, so every signal needs a reason to exist.
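A rough way to put a number on that constraint is the sizing rule from the Prometheus storage documentation: disk needed is roughly retention time in seconds, times ingested samples per second, times bytes per sample (the documentation cites an average of 1 to 2 bytes per sample). The sketch below applies that rule. The series count, scrape interval, and retention period are made-up inputs for illustration, not measurements from this cluster.

```python
def samples_per_second(active_series: int, scrape_interval_s: float) -> float:
    """Approximate ingest rate: each active series yields one sample per scrape."""
    return active_series / scrape_interval_s


def prometheus_disk_bytes(
    retention_days: float,
    ingest_rate: float,
    bytes_per_sample: float = 2.0,  # upper end of the documented 1-2 byte average
) -> float:
    """Estimate disk use as retention_seconds * samples_per_second * bytes_per_sample."""
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * ingest_rate * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical inputs: 75,000 active series scraped every 15 seconds,
    # kept for 15 days.
    rate = samples_per_second(active_series=75_000, scrape_interval_s=15)
    estimate = prometheus_disk_bytes(retention_days=15, ingest_rate=rate)
    print(f"{rate:.0f} samples/s, about {estimate / 1e9:.2f} GB on disk")
```

Because the estimate is linear in every input, halving the number of active series or the retention window halves the disk footprint, which is what makes "does this signal need to exist?" a useful question on small hardware.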

The repeatable pattern is to make the boring path explicit. For this project that means clear repository boundaries, documented setup, predictable deployment commands, and enough observability to know whether the system is healthy or merely quiet.
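One way to tell "healthy" from "merely quiet" is to track two timestamps per workload: the last heartbeat (proof that telemetry is still flowing) and the last unit of real work. The sketch below is a hypothetical classifier built on that idea. The state names, the function, and the timeout values are assumptions for illustration, not something taken from this project's dashboards or alert rules.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    ACTIVE = "active"  # reporting and doing work
    IDLE = "idle"      # reporting, but no recent work: quiet and still healthy
    SILENT = "silent"  # not reporting: quiet, and health is unknown


def classify(
    now: float,
    last_heartbeat: Optional[float],
    last_event: Optional[float],
    heartbeat_timeout_s: float = 60.0,  # assumed threshold
    idle_after_s: float = 900.0,        # assumed threshold
) -> Status:
    """Classify a workload from Unix timestamps of its last heartbeat and last work event."""
    # Without a fresh heartbeat, silence cannot be read as good news.
    if last_heartbeat is None or now - last_heartbeat > heartbeat_timeout_s:
        return Status.SILENT
    # The workload is reporting; check whether it has done work recently.
    if last_event is None or now - last_event > idle_after_s:
        return Status.IDLE
    return Status.ACTIVE


if __name__ == "__main__":
    now = 1_000.0
    print(classify(now, last_heartbeat=990.0, last_event=995.0).value)
    print(classify(now, last_heartbeat=990.0, last_event=None).value)
    print(classify(now, last_heartbeat=900.0, last_event=995.0).value)
```

The design choice worth noting is that the heartbeat check comes first: a workload that has stopped reporting is never classified as idle, so an empty dashboard is not mistaken for a calm one.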

Reader takeaway

If you are building something similar, start with the workflow you need to repeat every week. Then add only the platform pieces that make that workflow easier to recover, explain, and extend.
