Vector Databases on the Edge: ChromaDB vs Milvus

February 5, 2026 · 2 min read · RPi Kubernetes

Comparing ChromaDB and Milvus for vector search on resource-constrained Raspberry Pi 5 nodes -- when to use each and how to deploy them.

Vector Database · ChromaDB · Milvus · RAG · Embeddings

Why Vector Databases on a Pi Cluster?

RAG (Retrieval-Augmented Generation) pipelines need vector storage for embedding similarity search. Running a vector database locally means your RAG system doesn't depend on external services, latency stays low, and your data never leaves your network.
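To make "embedding similarity search" concrete, here is a minimal brute-force sketch in plain Python. The document ids and 3-dimensional vectors are made up for illustration, and the code assumes no vector is all zeros. A real vector database replaces the linear scan with an approximate index, but the ranking idea is the same.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy "embeddings" (hypothetical data); real models emit hundreds of dimensions.
store = {
    "k8s-notes": [0.9, 0.1, 0.0],
    "pi-setup": [0.7, 0.6, 0.1],
    "recipes": [0.0, 0.2, 0.9],
}

print(top_k([1.0, 0.2, 0.0], store))
```

The scan is O(n) per query, which is exactly the cost a vector index exists to avoid once the collection grows.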

The RPi Kubernetes project deploys both ChromaDB and Milvus -- ChromaDB for development and lightweight workloads, Milvus for production-grade vector search.

ChromaDB: The Lightweight Option

ChromaDB runs as a single process with minimal dependencies. It suits prototyping and datasets of up to a few hundred thousand vectors.

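apiVersion: apps/v1
kind: Deployment
metadata:
  name: chromadb
  namespace: data-services
spec:
  replicas: 1
  selector:                # required by apps/v1; must match the pod labels below
    matchLabels:
      app: chromadb        # label name is an assumption
  template:
    metadata:
      labels:
        app: chromadb
    spec:
      containers:
        - name: chromadb
          image: chromadb/chroma:latest  # consider pinning a version tag
          ports:
            - containerPort: 8000
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1"
          volumeMounts:
            - name: chroma-data
              mountPath: /chroma/chroma
      volumes:
        - name: chroma-data
          persistentVolumeClaim:
            claimName: chroma-data  # assumed PVC name; create it separately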

On a single Pi 5 with 8GB RAM, ChromaDB comfortably handles 100K-500K vectors with typical (median) query latency under 10ms. Beyond that, memory becomes the bottleneck.
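A rough back-of-the-envelope sketch shows why memory is the limit. It assumes float32 embeddings with 768 dimensions (the output size I would expect from nomic-embed-text; check your model) and a hypothetical 1.5x overhead factor for index structures and metadata. The factor is a placeholder, not a measured value.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed to hold the raw vectors alone (float32 by default)."""
    return num_vectors * dim * bytes_per_value


def estimate_index_mib(num_vectors, dim, overhead_factor=1.5):
    """Raw vector size scaled by an assumed overhead factor, in MiB."""
    return raw_vector_bytes(num_vectors, dim) * overhead_factor / (1024 ** 2)


for count in (100_000, 500_000):
    print(f"{count:>7} vectors: ~{estimate_index_mib(count, 768):.0f} MiB")
```

This covers only the vectors and an assumed index overhead. The database process, stored documents, and the rest of the node's workloads all come on top, so treat the numbers as a lower bound.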

Milvus: Production Scale

Milvus is a purpose-built vector database with sharding, replication, support for billion-scale datasets, and GPU acceleration (on NVIDIA hardware, so not on a Pi). On the Pi cluster, it runs in standalone mode via Helm:

helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
helm install milvus milvus/milvus \
  --namespace data-services \
  --values kubernetes/base-services/milvus/values.yaml

The values file configures Milvus for the constrained environment:

standalone:
  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "4Gi"
      cpu: "2"

minio:
  enabled: true
  resources:
    requests:
      memory: "256Mi"

etcd:
  enabled: true
  resources:
    requests:
      memory: "256Mi"

Milvus requires MinIO (for object storage) and etcd (for metadata), so its total resource footprint is larger than ChromaDB's. In return, it provides a choice of index types (IVF_FLAT, HNSW, DISKANN), consistency guarantees, and OTLP tracing integration.
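Choosing an index type in Milvus comes down to passing an index-parameter dictionary when the index is created. The helper below is a sketch of that shape. The function name is hypothetical, and the parameter values (nlist, M, efConstruction) are common starting points rather than settings tuned for a Pi.

```python
def build_index_params(index_type, metric_type="L2"):
    """Build an index-parameter dict in the shape pymilvus expects."""
    # Starting values are assumptions for illustration, not tuned for ARM64 nodes.
    defaults = {
        "IVF_FLAT": {"nlist": 128},                # number of coarse clusters
        "HNSW": {"M": 16, "efConstruction": 200},  # graph degree and build-time beam width
        "DISKANN": {},                             # no build-time params set in this sketch
    }
    if index_type not in defaults:
        raise ValueError(f"unsupported index type: {index_type}")
    return {
        "index_type": index_type,
        "metric_type": metric_type,
        "params": dict(defaults[index_type]),
    }


print(build_index_params("HNSW"))

# With a pymilvus Collection in hand (the field name "embedding" is assumed):
#   collection.create_index(field_name="embedding",
#                           index_params=build_index_params("HNSW"))
```

HNSW keeps the whole graph in memory, so on an 8GB node IVF_FLAT may be the more forgiving first choice. Measure before committing to either.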

Performance Comparison

Benchmarks on a single RPi5 node (8GB, ARM64):

Metric	ChromaDB	Milvus (Standalone)
Index 100K vectors	45s	30s
Query latency (p50)	5ms	3ms
Query latency (p99)	15ms	8ms
Memory usage (100K)	600MB	1.2GB
Max vectors (8GB node)	~500K	~300K

ChromaDB uses less memory (600MB vs 1.2GB at 100K vectors) but indexes more slowly and has higher tail latency (15ms vs 8ms at p99). Milvus is faster but needs more baseline memory for its dependencies, which is why it fits fewer vectors on an 8GB node.
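If you want to reproduce p50/p99 figures like these on your own node, a small timing harness is enough. This is a sketch, not the harness behind the table above: query_fn is a placeholder for whatever call hits your vector store, and the percentile uses the simple nearest-rank method with integer percentiles.

```python
import time


def measure_latencies_ms(query_fn, queries):
    """Time each call to query_fn and return the latencies in milliseconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        query_fn(query)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Placeholder workload; swap the lambda for a real similarity query.
samples = measure_latencies_ms(lambda q: sum(range(1000)), range(200))
print(f"p50={percentile(samples, 50):.3f}ms p99={percentile(samples, 99):.3f}ms")
```

Run a warm-up pass before measuring, and use a few thousand queries so the p99 is backed by more than a handful of samples.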

LangChain Integration

Both databases integrate with LangChain for RAG pipelines:

import chromadb  # needed for chromadb.HttpClient below
from langchain_community.vectorstores import Chroma, Milvus
from langchain_community.embeddings import OllamaEmbeddings

# Embeddings are generated locally through Ollama
embeddings = OllamaEmbeddings(model="nomic-embed-text")

# ChromaDB: connect to the in-cluster service over HTTP
chroma_store = Chroma(
    collection_name="documents",
    embedding_function=embeddings,
    client=chromadb.HttpClient(host="chromadb.data-services", port=8000)
)

# Milvus: connect to the standalone service on its default gRPC port
milvus_store = Milvus(
    collection_name="documents",
    embedding_function=embeddings,
    connection_args={"host": "milvus.data-services", "port": 19530}
)

When to Use Which

ChromaDB -- Development, prototyping, small datasets (<500K vectors), when you need simplicity and low resource usage.

Milvus -- Production, larger datasets, when you need advanced indexing, consistency guarantees, or OTLP tracing. Worth the extra resource cost when reliability matters.

On the Pi cluster, ChromaDB runs in the development namespace for experimentation, and Milvus runs in data-services for production workloads.
