Hey folks, Yoshik here.
This week’s digest is going out a bit late. I’ve been curating a few extra pieces because I didn’t want to send you a rushed issue with filler. I’d rather be a little late than waste your time. Good content coming up.
Thanks for sticking with my newsletter. I really appreciate it.
Welcome to this week's Uptime Sync. This issue covers the GitHub source code leak, Cloudflare's ClickHouse bottleneck that silently slowed a billing pipeline, GitHub's critical RCE warning for the git push pipeline, and Netflix's look at the human operations layer behind live-at-scale systems. We also examine Slack's move from SSH to REST for EMR data pipelines, AWS's deep dive into the invisible engineering behind Lambda's network, Databricks' challenge of monitoring 10 trillion samples per day, and Wix's zero-downtime database migration service.
On the tutorial front: zero-downtime ECS deployments with automatic PostgreSQL migrations, going back to writing code by hand after AI fatigue, building a Kubernetes debugging AI agent with Claude Code, migrating from GitHub to Forgejo for digital sovereignty, replacing GPT-4 with a local SLM for deterministic CI/CD extraction, and tracing multi-agent AI swarms with Jaeger v2.
This week's projects bring you Grafana's all-in-one OpenTelemetry backend in a Docker image, Healthchecks for cron job monitoring, Infracost for cloud cost estimates and FinOps guidance, and K8sGPT for scanning Kubernetes clusters and triaging issues in plain English.
Newsworthy Reads
Tutorials of the Week
Projects of the Week
An OpenTelemetry backend in a Docker image. It bundles the OpenTelemetry Collector, Prometheus (metrics), Tempo (traces), Loki (logs), Pyroscope (profiles), and Grafana into a single container.
A cron job monitoring service. It listens for HTTP requests and email messages ("pings") from your cron jobs and scheduled tasks ("checks").
A tool that shows cloud cost estimates and FinOps best practices for Terraform, Terragrunt, CloudFormation, and AWS CDK.
A tool for scanning your Kubernetes clusters and for diagnosing and triaging issues in plain English.

👋 Find me on Twitter | LinkedIn | Connect 1:1
Thank you for supporting this newsletter. Consider sharing this post with your friends.
Y’all are the best.
