Why Patterns Matter More Than Memorizing Architectures
System design interviews are not memory tests. No interviewer expects you to memorize the exact architecture of DoorDash's delivery tracking system. What they're evaluating is whether you have a mental toolkit of reusable patterns that you can compose to solve any design problem — including ones you've never seen before.
In 2026, the bar has risen. SDE-2 and SDE-3 candidates are expected to know not just what these patterns are, but when to apply them, what trade-offs they introduce, and when to explicitly reject them. This guide covers the 10 patterns that appear most frequently across FAANG, fintech, and high-growth startup interviews.
Pattern 1: Event Sourcing
The Core Idea
Instead of storing the current state of an entity (e.g., "Account balance: $1,500"), store every event that led to that state ("Deposit $2,000 → Withdraw $500 → Deposit $300 → Withdraw $300"). The current state is derived by replaying the event log.
When to Use It
- Financial transaction systems where audit trails are regulatory requirements
- Collaborative editing tools (Google Docs) where you need to replay changes
- Systems with complex temporal queries ("What was the account balance on March 15th?")
- Any domain where understanding how you arrived at a state is as important as the state itself
Trade-offs and Caveats
- Pro: Perfect audit trail. Complete historical record. Easy to replay and rebuild projections.
- Pro: Natural fit for event-driven architectures — your events are the source of truth, not a derived artifact.
- Con: Current state reads require replaying the event log — mitigated by periodic snapshots (store a checkpoint every N events).
- Con: Schema evolution is hard — old events must remain parseable as you change your domain model.
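To make the snapshot mitigation concrete, here is a minimal Python sketch (the `Event` shape and the fold function are illustrative, not a prescribed design) that replays the article's example log, optionally starting from a snapshot checkpoint instead of the beginning:

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str      # "deposit" or "withdraw"
    amount: int    # dollars, for illustration

def apply(balance: int, event: Event) -> int:
    """Fold one event into the running balance."""
    return balance + event.amount if event.kind == "deposit" else balance - event.amount

def current_balance(events: list[Event],
                    snapshot_balance: int = 0,
                    snapshot_index: int = 0) -> int:
    """Replay only the events after the last snapshot, not the full log."""
    balance = snapshot_balance
    for event in events[snapshot_index:]:
        balance = apply(balance, event)
    return balance

log = [Event("deposit", 2000), Event("withdraw", 500),
       Event("deposit", 300), Event("withdraw", 300)]
assert current_balance(log) == 1500  # full replay from an empty snapshot
```

A snapshot taken after the second event stores balance 1,500 at index 2; a later read replays only the two remaining events instead of the whole log.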
Interview signal: Mention snapshots without being prompted. Interviewers are looking for awareness of the read performance problem.
Pattern 2: CQRS (Command Query Responsibility Segregation)
The Core Idea
Separate the model/data store used for writes (commands) from the model/data store used for reads (queries). The write model is optimized for consistency and transactional integrity. The read model is optimized for query performance, often pre-joined and denormalized.
When to Use It
- Systems with dramatically different read/write patterns (e.g., an e-commerce platform where reads outnumber writes 100:1)
- When your read queries require complex joins that slow writes
- Combined with Event Sourcing, where events update the write store and projections populate read stores
Trade-offs and Caveats
- Pro: Read and write sides can be scaled independently. Read replicas can be added without affecting write throughput.
- Con: Introduces eventual consistency between write and read models — the read model may lag behind. This is unacceptable for some use cases (e.g., showing a user their own recent bank transaction immediately after submission).
- Con: Significant operational complexity — two data models to maintain, keep in sync, and evolve.
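A toy Python sketch of the split (class and field names are illustrative; in production the projection would be updated asynchronously through a broker, which is exactly where the read-model lag comes from):

```python
from collections import defaultdict

class OrderWriteModel:
    """Write side: the transactional source of truth."""
    def __init__(self):
        self.orders = []        # normalized, append-only for simplicity
        self.subscribers = []   # projections notified of changes

    def place_order(self, user_id: str, amount: int) -> None:
        order = {"user_id": user_id, "amount": amount}
        self.orders.append(order)
        for handler in self.subscribers:   # in production: async via a broker
            handler(order)

class SpendByUserProjection:
    """Read side: a denormalized view optimized for one query."""
    def __init__(self):
        self.total_by_user = defaultdict(int)

    def __call__(self, order: dict) -> None:
        self.total_by_user[order["user_id"]] += order["amount"]

writes = OrderWriteModel()
reads = SpendByUserProjection()
writes.subscribers.append(reads)
writes.place_order("alice", 40)
writes.place_order("alice", 60)
assert reads.total_by_user["alice"] == 100
```

The synchronous subscriber call here hides the real-world gap: once the projection is fed by a broker, a read immediately after a write may not yet reflect it.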
Interview signal: Always call out the eventual consistency problem and discuss when it's acceptable (social media feeds) vs. unacceptable (financial balances).
Pattern 3: Saga Pattern (Distributed Transactions)
The Core Idea
In a microservices architecture, a business operation that spans multiple services cannot practically use a traditional ACID transaction (two-phase commit holds locks across services and blocks when the coordinator fails). The Saga pattern decomposes the transaction into a sequence of local transactions, each publishing an event that triggers the next step. If any step fails, compensating transactions undo the previous steps.
When to Use It
- E-commerce order processing: Reserve inventory → Charge payment → Schedule shipping — each in a different service
- Travel booking: Book flight → Book hotel → Book car — where partial failure requires rollbacks
- Any multi-service workflow that needs all-or-nothing semantics (sagas deliver this eventually through compensation, not true atomicity)
Trade-offs and Caveats
- Pro: No distributed locking. Services remain independently scalable and deployable.
- Con: Complicated failure handling — compensating transactions must be idempotent and their own failures must be handled.
- Con: Temporary inconsistency is visible — a user might see "Payment charged" before "Order confirmed" if a saga is mid-flight.
Two styles: Choreography (services react to events — decoupled but hard to track) vs. Orchestration (a saga orchestrator tells each service what to do — centralized visibility but a potential single point of failure).
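An orchestration-style saga can be sketched in a few lines of Python (step names and the simulated shipping failure are illustrative; real compensations must also be idempotent and retried on their own failures):

```python
def run_saga(steps) -> bool:
    """Run each (action, compensation) pair in order; on failure,
    run compensations for the completed steps in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for undo in reversed(completed):
                undo()   # must be idempotent in real systems
            return False
    return True

state = {"inventory": 1, "charged": False}

def reserve():      state["inventory"] -= 1
def unreserve():    state["inventory"] += 1
def charge():       state["charged"] = True
def refund():       state["charged"] = False
def ship():         raise RuntimeError("carrier API down")   # simulated failure
def cancel_ship():  pass

ok = run_saga([(reserve, unreserve), (charge, refund), (ship, cancel_ship)])
assert ok is False
assert state == {"inventory": 1, "charged": False}  # compensations rolled back
```

In production the orchestrator persists saga progress so it can resume or compensate after a crash, rather than holding state in memory as this sketch does.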
Pattern 4: Sidecar Pattern
The Core Idea
Deploy auxiliary functionality (logging, monitoring, service mesh proxy, configuration management) in a separate container that runs alongside the main application container within the same pod (in Kubernetes). The sidecar shares the same lifecycle, network, and storage as the main container.
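As a sketch, a Kubernetes pod manifest with a hypothetical log-shipping sidecar could look like this (image names, container names, and paths are placeholders, not a recommended configuration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar    # illustrative name
spec:
  containers:
    - name: app                 # main application container
      image: example/app:1.0    # placeholder image
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-shipper         # sidecar: reads the shared log volume
      image: fluent/fluentd:latest
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: logs
      emptyDir: {}              # shared storage scoped to the pod's lifetime
```

Both containers share the pod's network namespace and the `logs` volume, so the sidecar ships logs without any change to the application image.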
When to Use It
- Service meshes (Envoy/Istio): Deploy a proxy sidecar that handles mTLS encryption, circuit breaking, and observability without touching application code
- Log shipping: A sidecar reads log files and forwards them to a centralized aggregator (Fluentd, Datadog Agent)
- Configuration polling: A sidecar watches for config changes and reloads the main application without a restart
Trade-offs and Caveats
- Pro: Separation of concerns — network policy, observability, and security are managed outside application code
- Pro: Language-agnostic — the same Envoy sidecar works for Java, Python, and Go services
- Con: Increases resource usage (CPU/memory for each sidecar per pod) and adds network hop latency
Pattern 5: Circuit Breaker
The Core Idea
When a downstream service starts failing repeatedly, stop making calls to it temporarily instead of hammering it with retries. The circuit "opens" (stops requests) after a failure threshold, enters a "half-open" state periodically to test if the service recovered, and "closes" (resumes normal operation) when health is confirmed.
States and Transitions
- Closed (normal): Requests flow through. Track failure count.
- Open (failures exceeded threshold): All requests fail fast without calling downstream. A timer starts.
- Half-Open (timer expired): Allow one probe request. If successful → Closed. If fails → Open again.
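The three states map to a small state machine. Here is a minimal Python sketch (the threshold and timeout values, the single-probe half-open policy, and the injectable clock are illustrative simplifications; production implementations also handle concurrency):

```python
import time

class CircuitBreaker:
    """Counts consecutive failures, fails fast while open,
    and lets one probe through after a cooldown."""
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"   # allow one probe request
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"        # trip (or re-trip) the breaker
                self.opened_at = self.clock()
            raise
        self.failures = 0                  # success: reset and close
        self.state = "closed"
        return result
```

Passing a fake clock makes the state transitions easy to unit-test without real sleeps.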
When to Use It
Any synchronous RPC call to an external dependency (payment gateway, third-party API, another microservice). Essential for preventing cascading failures in distributed systems.
Interview signal: Discuss the difference between Circuit Breaker and Retry. Retries are for transient errors (network blip). Circuit Breakers are for sustained degradation. Using retries without Circuit Breakers during sustained outages can amplify the problem with retry storms.
Pattern 6: Outbox Pattern
The Core Idea
When a service needs to both update its database AND publish a message to a message broker (Kafka, RabbitMQ), these are two separate I/O operations — and they can fail independently, leading to inconsistency. The Outbox pattern writes the message to an "outbox" table in the same database transaction as the business data update. A separate relay process reads the outbox and publishes to the broker.
Why It Matters
Without the Outbox pattern, you get "dual write" problems: the DB update succeeds but the message publish fails (no one is notified of the order), or the message publishes but the DB transaction rolls back (consumers believe an order exists that was never committed). The Outbox pattern guarantees at-least-once delivery using only local database transactions; consumers deduplicate by event ID to achieve effectively-once processing.
When to Use It
- Any microservice that needs to reliably publish events upon state changes
- Financial systems where "event published exactly once" is a compliance requirement
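A minimal sketch of the pattern using SQLite as a stand-in database (table and event names are illustrative; in production the relay is a separate process or a CDC connector such as Debezium tailing the outbox table):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT,
                         published INTEGER DEFAULT 0);
""")

def create_order(item: str) -> None:
    """Business write and outbox write commit or roll back together."""
    with conn:  # one local transaction covers both inserts
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps({"event": "order_created", "item": item}),))

def relay(publish) -> None:
    """Relay loop: read unpublished rows, publish, then mark them done."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))   # broker publish; retried on failure
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    conn.commit()

sent = []
create_order("keyboard")
relay(sent.append)
assert sent == [{"event": "order_created", "item": "keyboard"}]
```

If the relay crashes between publishing and marking the row, the event is published again on the next pass, which is why consumers must deduplicate.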
Pattern 7: Consistent Hashing
The Core Idea
A technique for distributing data/requests across a ring of nodes such that when a node is added or removed, only K/N keys need to be remapped (where K = number of keys, N = number of nodes) — not a complete redistribution.
When to Use It
- Distributed caches (Redis Cluster, Memcached) where you need to shard data across multiple cache nodes without rehashing everything when a node is added
- Load balancing with session affinity — route the same user to the same backend server
- Content delivery networks — route requests for the same content to the same cache node, maximizing hit rates
Virtual Nodes
A critical detail to mention in interviews: without virtual nodes, consistent hashing distributes load unevenly because physical nodes can be clustered on the ring. Virtual nodes assign each physical node multiple positions on the ring, achieving far more even distribution. This is how DynamoDB and Cassandra implement their partitioning.
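A compact Python sketch of a hash ring with virtual nodes (the vnode count and the MD5 hash are illustrative choices; production systems often use faster non-cryptographic hashes such as Murmur):

```python
import bisect
import hashlib

class HashRing:
    """Consistent hashing with virtual nodes."""
    def __init__(self, nodes, vnodes: int = 100):
        self.ring = []   # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):                     # place each physical
                self.ring.append((self._hash(f"{node}#{i}"), node))  # node at many points
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        """Walk clockwise to the first vnode at or after the key's hash."""
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Because removing a node only deletes that node's vnodes from the ring, keys that were mapped to the surviving nodes keep their assignments, which is the K/N remapping property in action.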
Pattern 8: Rate Limiting (Token Bucket vs Leaky Bucket)
The Core Idea
Control the rate at which requests are processed to prevent abuse, protect backend services, and ensure fair resource distribution among clients.
Token Bucket
A bucket holds tokens (up to a max capacity). Tokens are added at a fixed rate. Each request consumes a token. If the bucket is empty, requests are rejected or queued. Allows bursting up to bucket capacity before throttling kicks in. Ideal for APIs where occasional bursts are acceptable.
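A token bucket fits in a few lines. This Python sketch uses continuous refill and an injectable clock for testability (the rate and capacity values are illustrative):

```python
import time

class TokenBucket:
    """Refills continuously at `rate` tokens/second up to `capacity`."""
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity       # start full: allows an initial burst
        self.clock = clock
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The `cost` parameter lets expensive endpoints consume more than one token per request, a common refinement in API gateways.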
Leaky Bucket
Requests enter a queue (the bucket) and are processed at a fixed rate (they "leak" out). If the queue is full, new requests are rejected. No bursting — output is always at a constant rate. Ideal for smooth egress to downstream services.
Sliding Window Counter (Most Interview-Recommended)
Combines the precision of a fixed window counter with burst smoothing, avoiding the boundary-condition problem of fixed windows (where 2×limit requests can pass by clustering at window boundaries). Typically implemented with Redis counters so the limit is enforced consistently across distributed instances.
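The weighted two-window approximation can be sketched in memory like this (a production version would keep the two counters in Redis, keyed by client and window; the window-rolling logic here is a simplification):

```python
class SlidingWindowCounter:
    """Estimate the rolling-window count as:
    previous_window_count * overlap_fraction + current_window_count."""
    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.current_start = 0.0
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now: float) -> bool:
        elapsed = now - self.current_start
        if elapsed >= 2 * self.window:        # skipped whole windows: reset both
            self.previous_count = 0
            self.current_count = 0
            self.current_start = now - (now % self.window)
        elif elapsed >= self.window:          # moved into the next window
            self.previous_count = self.current_count
            self.current_count = 0
            self.current_start += self.window
        # Weight the previous window by how much of it still overlaps.
        overlap = 1.0 - (now - self.current_start) / self.window
        estimate = self.previous_count * overlap + self.current_count
        if estimate < self.limit:
            self.current_count += 1
            return True
        return False
```

With limit 10 and a 60-second window, a client that used its full quota in the first window gets roughly half of it back when it is halfway through the next window, which is the burst smoothing the text describes.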
Pattern 9: Bulkhead Pattern
The Core Idea
Isolate elements of an application into pools so that if one fails, the others continue to function. Named after ship bulkheads that divide a hull into watertight compartments — if one compartment floods, the ship doesn't sink.
Application in Software
- Thread pool isolation: Assign dedicated thread pools to different downstream dependencies. A slow payment gateway exhausts its allocated 20 threads without affecting the 50 threads serving product catalog requests.
- Connection pool isolation: Separate database connection pools for critical vs. non-critical operations.
- Kubernetes resource limits: Set CPU and memory limits per pod so a runaway service can't starve other services on the same node.
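Thread pool isolation, the first bullet above, can be sketched with one bounded executor per dependency (pool names and sizes mirror the example; a real implementation would also bound the queue and reject on overflow):

```python
from concurrent.futures import ThreadPoolExecutor

class Bulkhead:
    """A dedicated, bounded thread pool per downstream dependency, so one
    slow dependency cannot exhaust threads shared with the others."""
    def __init__(self, name: str, max_workers: int):
        self.name = name
        self.pool = ThreadPoolExecutor(max_workers=max_workers,
                                       thread_name_prefix=name)

    def submit(self, fn, *args):
        return self.pool.submit(fn, *args)

# One compartment per dependency, sized per the example in the text.
payments = Bulkhead("payments", max_workers=20)
catalog = Bulkhead("catalog", max_workers=50)
```

Even if every payment thread is blocked on a slow gateway, catalog requests run on their own pool and keep responding.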
Interview signal: Mention that Bulkhead is complementary to Circuit Breaker — they solve related but different problems. Circuit Breaker stops calling a broken downstream. Bulkhead prevents a broken call from consuming all resources and starving other healthy calls.
Pattern 10: Write-Through vs Write-Behind vs Cache-Aside
Why Caching Strategy Is a Design Pattern
Every system design question with a database will involve a cache. The strategy for keeping the cache and database consistent is a first-class design decision — and most candidates undersell it.
Cache-Aside (Lazy Loading)
Application reads from cache. On miss: reads from DB, writes to cache, returns result. Application writes go directly to DB, optionally invalidating/updating the cache. Most common pattern. Cache only contains frequently accessed data. Risk: cache stampede on cold start (N simultaneous requests all miss, all hit DB).
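The cache-aside read and write paths can be sketched as follows (the in-memory dicts stand in for Redis and the database; invalidate-on-write is one common choice, and a stampede lock would be layered on top of this):

```python
db = {"user:1": {"name": "Ada"}}   # stand-in for the database
cache = {}                          # stand-in for Redis

def get_user(key: str):
    """Cache-aside read: check cache, fall back to DB, populate cache."""
    if key in cache:
        return cache[key]
    value = db.get(key)        # cache miss: read from the database
    if value is not None:
        cache[key] = value     # populate for subsequent readers
    return value

def update_user(key: str, value) -> None:
    """Write goes to the DB; invalidate (rather than update) the cache entry."""
    db[key] = value
    cache.pop(key, None)       # invalidation avoids stale-overwrite races
```

Invalidating instead of updating on write trades an extra miss for safety: two concurrent writers updating the cache directly can leave the older value cached.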
Write-Through
Every write goes to both the cache and DB synchronously. Cache is always fresh. Risk: every write is slower (two I/O operations). Cache accumulates data that may never be read.
Write-Behind (Write-Back)
Writes go to cache only and return immediately. A background process asynchronously flushes cache to DB. Highest write throughput. Risk: data loss if the cache fails before flushing. Not appropriate for financial or audit-critical data.
Interview signal: When asked about caching, immediately clarify which strategy you're proposing and justify why given the consistency and durability requirements of the system.
How to Apply These Patterns in an Interview
Knowing the patterns is the entry point. Using them well requires:
- Identify the core constraint: Is the problem latency? Consistency? Throughput? Fault tolerance? This determines which patterns are relevant.
- Propose a pattern with justification: "I'd use Event Sourcing here because the financial audit requirement means we need a complete immutable history of all changes."
- Proactively call out trade-offs: Interviewers are waiting for you to acknowledge the downsides. They know the patterns — they want to know if you do.
- Compose patterns: Real systems use multiple patterns. Event Sourcing + CQRS + Outbox is a common production combination for event-driven microservices.
Practice System Design with AI
System design is a skill that requires practice in the format of the actual interview — open-ended, interactive, with follow-up probing. Reading this article builds your vocabulary. Practicing out loud builds your performance.
MockExperts' AI interviewer simulates full system design rounds with adaptive follow-up questions, drawing from real interview patterns at FAANG companies and high-growth startups. Start a free system design mock interview and test how well you can apply these patterns under realistic interview pressure.
📋 Legal Disclaimer
Educational Purpose: This article is published solely for educational and informational purposes to help candidates prepare for technical interviews. It does not constitute professional career advice, legal advice, or recruitment guidance.
Nominative Fair Use of Trademarks: Company names, product names, and brand identifiers (including but not limited to Google, Meta, Amazon, Goldman Sachs, Bloomberg, Pramp, OpenAI, Anthropic, and others) are referenced solely to describe the subject matter of interview preparation. Such use is permitted under the nominative fair use doctrine and does not imply sponsorship, endorsement, affiliation, or certification by any of these organisations. All trademarks and registered trademarks are the property of their respective owners.
No Proprietary Question Reproduction: All interview questions, processes, and experiences described herein are based on community-reported patterns, publicly available candidate feedback, and general industry knowledge. MockExperts does not reproduce, distribute, or claim ownership of any proprietary assessment content, internal hiring rubrics, or confidential evaluation criteria belonging to any company.
No Official Affiliation: MockExperts is an independent AI-powered interview preparation platform. We are not officially affiliated with, partnered with, or approved by Google, Meta, Amazon, Goldman Sachs, Bloomberg, Pramp, or any other company mentioned in our content.