Welcome to this week's Uptime Sync. This issue opens with Helix's account of cutting Docker build times from 45 minutes to 14 seconds (roughly a 193x improvement, with concrete techniques you can steal), followed by Netflix's detailed look at how it scales LLM post-training infrastructure to handle model refinement at its operational scale. We then examine Supabase's February 12 incident report (the kind of transparent postmortem that teaches more than most architecture talks) and Cloudflare's Code Mode approach, which compresses entire APIs into 1,000 tokens for agent consumption. On the access control front, there's Uber's complete reinvention of how permissions work across thousands of microservices, and Datadog's hard-won lessons from hardening eBPF for runtime security in production workloads, plus a deep dive into runtime tracing for AI agents so you understand what's actually happening inside those containers.

On the tutorial front, you'll learn how to detect brute force attacks without relying on Fail2Ban (cleaner, more debuggable), get a thorough breakdown of Terraform memory errors so you stop treating OOMs like random events, and understand five critical Ingress-NGINX behaviors before migrating to Gateway API. You'll also walk through optimizing event replay from 30 to 14,000 events per second (with the actual bottlenecks exposed), explore replacing Redis with PostgreSQL for caching and background jobs (and why it's faster in specific workloads), and study database partitioning types, strategies, and the decision framework for when to use each approach.

To wrap it up, this week's projects bring you a macOS helper that plays Warcraft-style sound notifications when Claude Code finishes or needs input (so you never miss terminal events); a continuously updated IPv4 blocklist of malicious IPs for firewall and WAF protection; a production-ready Go-based LLM vulnerability scanner that tests models against 210+ adversarial attacks, including prompt injection; Just, a handy way to save and run project-specific commands; and Shaper, a SQL-driven dashboard platform powered by DuckDB that lets you build and share analytics with pure SQL queries.

Newsworthy Reads

Tutorials of the Week

Projects of the Week

  • A macOS helper that plays Warcraft-style sound notifications when Claude Code finishes or needs input, so you never miss terminal events.

  • A continuously updated, community-maintained IPv4 blocklist of malicious IPs for firewall and WAF protection.

  • A production-ready Go-based LLM vulnerability scanner that tests models against 210+ adversarial attacks, including prompt injection and jailbreaks.

  • Just is a handy way to save and run project-specific commands.

  • A SQL-driven dashboard platform powered by DuckDB that lets you build and share analytics with pure SQL queries.

Thank you for supporting this newsletter. Consider sharing this post with your friends.

Y’all are the best.