SRE | DevOps | MLOps

Thomas Nyambati

Building production-grade ML systems on Kubernetes. I care about reliability, observability, and security — not just model accuracy.


About

Senior Platform Engineer and SRE with 9+ years of experience designing large-scale cloud platforms, Kubernetes infrastructure, and high-volume observability stacks.

I’m drawn to the intersection of reliability and machine learning: where production discipline meets model chaos. I write about the trade-offs, the failures, and the tooling that actually holds up under load.

Current stack
  • Kubernetes / EKS / GKE
  • ArgoCD / GitOps
  • Prometheus + Mimir
  • Grafana / Loki / Tempo
  • Terraform / Terragrunt
  • Go / Python
  • Karpenter / HPA / VPA
  • Helm / Helmfile
  • GitHub Actions
  • AWS / GCP

Blog
How to Migrate Mimir KV Store from Consul to Memberlist with Zero Downtime
Apr 2, 2026
Kubernetes | Grafana | Mimir | Observability
View all posts →

Contact

Let's talk.

Open to conversations about MLOps, platform engineering, SRE, or just building stuff on Kubernetes. Find me on GitHub or send me an email.