Ankit Rana | Mechanical Sympathy

System architecture and engineering logs by Ankit Rana.
https://ankit-rana.com/
Language: en-us

Instrumenting AI Agents: Why the Apology Metric Is a First Class Reliability Signal
https://ankit-rana.com/logs/23-vector-db-vs-graphrag-global-sense-making/
Wed, 06 May 2026 00:00:00 GMT
Track apology phrases as a first class SLO for AI agents: spikes reveal context starvation, timeout dropouts, and payload truncation across data boundaries.

The Expensive Cosplay of Local Models: True 3 AM Operational Cost of Hosting Llama-3
https://ankit-rana.com/logs/22-expensive-cosplay-local-llama-inference-tco/
Fri, 01 May 2026 00:00:00 GMT
Self-hosted Llama-70B looks cheap until VRAM, KV cache, and HBM bandwidth cap throughput. TCO is idle GPUs, batching latency, and ML infra on call.

Databases Were Not Built for AI Agents: How LLMs Break Connection Pooling
https://ankit-rana.com/logs/21-llms-connection-pooling-ai-agents/
Sun, 26 Apr 2026 00:00:00 GMT
Agents keep pooled DB connections open for LLM inference, exhausting pools and evicting buffer cache. Decouple reasoning from data, replicas, semantic layers.

Thread-per-Core Architecture: Why Extra Threads Eventually Destroy Throughput
https://ankit-rana.com/logs/20-thread-per-core-architecture/
Sat, 18 Apr 2026 00:00:00 GMT
Oversized thread pools stall: timeslicing, context switches, cache thrashing. Thread-per-core, CPU pinning, and async I/O match physical cores.

Branch Prediction: Why an if Inside a Hot Loop Costs Milliseconds
https://ankit-rana.com/logs/19-branch-prediction-loop-unrolling/
Tue, 14 Apr 2026 00:00:00 GMT
How CPU pipelining and branch predictors work, why mispredictions flush the pipeline, and how sorting, branchless code, and loop unrolling help.

CPU Caches and Spatial Locality: Why an Array is 3x Faster Than a Linked List for the Exact Same Big-O Complexity
https://ankit-rana.com/logs/18-cpu-caches-spatial-locality/
Fri, 10 Apr 2026 00:00:00 GMT
Why arrays are faster than linked lists on real CPUs: cache lines, spatial locality, hardware prefetchers, and pointer chasing.

Virtual Memory and Lazy Allocation: Why RSS Matters More Than malloc()
https://ankit-rana.com/logs/17-virtual-memory-rss-vsz-oom-killer/
Mon, 06 Apr 2026 00:00:00 GMT
How virtual memory promises work, why malloc() doesn't equal RAM, what page faults do, how lazy allocation overbooks memory, and why OOM kills by RSS.

Cuckoo Filters: Cache-Friendly Membership Checks With Deletions
https://ankit-rana.com/logs/16-cuckoo-filters-architecture/
Thu, 02 Apr 2026 00:00:00 GMT
How Cuckoo filters work: fingerprints, two-bucket lookups, kick-out insertions, why they stay cache-friendly, and the real tradeoff of insertion failure.

Bloom Filters vs Counting Bloom Filters: When Deletions Kill Performance
https://ankit-rana.com/logs/15-bloom-filters-deletable-bloom-filters/
Sat, 28 Mar 2026 00:00:00 GMT
Why counting (deletable) Bloom filters often lose in production: cache misses, random memory access, and better alternatives like hash tables or Cuckoo filters.

Pagination at Scale: Why OFFSET and SKIP Will Eventually Break Your API
https://ankit-rana.com/logs/14-pagination-offset-vs-cursor/
Wed, 25 Mar 2026 00:00:00 GMT
Why OFFSET/SKIP pagination degrades linearly with depth, how cursor-based pagination keeps latency flat, and when to switch before production bites back.

The RUM Conjecture: You Cannot Optimize Reads, Updates, and Memory at Once
https://ankit-rana.com/logs/13-rum-conjecture-database-tradeoffs/
Sat, 21 Mar 2026 00:00:00 GMT
How the RUM Conjecture explains real-world database trade-offs between read latency, write throughput, and memory overhead across B-Trees, LSM-Trees, and hash indexes.

Why UUID Primary Keys Quietly Destroy Database Performance
https://ankit-rana.com/logs/12-uuids-primary-keys-performance/
Tue, 17 Mar 2026 00:00:00 GMT
How random UUID primary keys break clustered indexes, cause page splits and buffer pool churn, and what to use instead for mechanically sympathetic database design.

Designing Resilient APIs: Failure-Handling Patterns for Distributed Systems
https://ankit-rana.com/logs/11-resilient-api-design-patterns/
Mon, 16 Feb 2026 00:00:00 GMT
Practical resilience patterns for distributed APIs: fail-fast, retries with backoff, circuit breakers, bulkheads, fallbacks, rate limiting, failover, and observability.

Microservices Deep Dive: Architecting for Scalability and Resilience
https://ankit-rana.com/logs/10-microservices-architecture/
Sun, 18 Jan 2026 00:00:00 GMT
How to design, operate, and scale microservices: core principles, when to use them, key patterns, and how to manage complexity in distributed systems.

Consistency Models in Azure Cosmos DB: From Strong to Eventual
https://ankit-rana.com/logs/09-cosmosdb-consistency-models/
Fri, 02 Jan 2026 00:00:00 GMT
How Azure Cosmos DB's five consistency levels map onto PACELC tradeoffs, what each level guarantees, and how to choose the right consistency for your workload.

Zero Trust Architecture: From Perimeter Walls to "Never Trust, Always Verify"
https://ankit-rana.com/logs/08-zero-trust-architecture/
Mon, 03 Nov 2025 00:00:00 GMT
How Zero Trust Architecture replaces perimeter-based security: core principles, differences from traditional models and ZTNA, enabling technologies, and real-world implementations.

Transitioning from REST to gRPC: System Design and Tradeoffs
https://ankit-rana.com/logs/06-grpc-vs-rest/
Thu, 25 Sep 2025 00:00:00 GMT
How gRPC changes API design versus REST: protocol model, protobuf schemas, service interfaces, streaming patterns, and when gRPC or REST is the right architectural choice.

HTTP/2 System Design: How It Fixes HTTP/1.1
https://ankit-rana.com/logs/05-http2-system-design/
Sun, 17 Aug 2025 00:00:00 GMT
Deep dive into HTTP/2: why HTTP/1.1 hit scaling limits, how multiplexing, server push, binary framing, and prioritization work, and why it matters for web performance.

How Shazam's Music Recognition System Design Actually Works
https://ankit-rana.com/logs/04-shazam-music-recognition/
Wed, 16 Jul 2025 00:00:00 GMT
Inside Shazam's music recognition system design: audio fingerprinting, spectrogram peaks, NoSQL and SQL databases, and scaling search across millions of tracks.

Mastering Event-Driven Architecture with Apache Kafka
https://ankit-rana.com/logs/07-kafka-event-driven-architecture/
Tue, 10 Jun 2025 00:00:00 GMT
How to design scalable, resilient systems using event-driven architecture and Apache Kafka for high-throughput, real-time data processing.

System Migration: Minimize Downtime, Maximize Efficiency
https://ankit-rana.com/logs/03-system-migration/
Mon, 05 May 2025 00:00:00 GMT
A practical blueprint for system migration: isolated env, sync/async flows, bridge layer, traffic leakage, backup sync, and monitoring.

System Design: Principles for Maintainability, Scalability, and Reliability
https://ankit-rana.com/logs/02-system-design-principles/
Wed, 16 Apr 2025 00:00:00 GMT
Data building blocks, fault tolerance, latency vs response time, scaling strategies, and the operability-simplicity-evolvability triad for durable systems.

Time Management for Software Engineers: A Toolkit That Scales
https://ankit-rana.com/logs/01-time-management-software-engineering/
Sat, 15 Mar 2025 00:00:00 GMT
How to get better at delivery and focus: planning, prioritization, delegation, and guarding deep work.