System Design
May 28, 2026

How to Design Figma in a System Design Interview: CRDTs, WebSockets & Real-Time Sync (2026)

Getting asked to 'design Figma' in a system design interview? Most candidates freeze at CRDTs and conflict resolution. This guide gives you the exact architecture, trade-offs, and 45-minute answer framework.


Why "Design Figma" is the Most Feared System Design Question in 2026

Interviewers at Google, Meta, Notion, Linear, and Canva have made the collaborative real-time editor one of their signature system design questions. The reason it's so powerful as an interview filter: it simultaneously tests your understanding of consistency models, distributed state, networking protocols, conflict resolution algorithms, and CDN-based asset delivery — all in a single 45-minute session. A candidate who can articulate the trade-offs between CRDTs and Operational Transformation while also sketching a WebSocket server cluster with Redis Pub/Sub is immediately identifiable as a senior system thinker.

This guide gives you the exact framework to answer "Design Figma" or "Design Google Docs" in a way that makes interviewers lean forward in their chairs. Every major component is covered — from cursor synchronization at 60fps to durable document snapshotting and cross-region presence broadcasts.

Step 1: Define Functional and Non-Functional Requirements

The first 5 minutes of any system design interview should be spent scoping the problem. Don't start drawing boxes until you've aligned on requirements. For Figma specifically:

What are the functional requirements for a collaborative design tool like Figma?

  • Multiple users can simultaneously edit a shared canvas (shapes, text, images).
  • Each user's cursor position is visible to all other participants in real time.
  • Changes from any user propagate to all other users with sub-200ms latency.
  • Documents are persisted durably and can be recovered after disconnection.
  • Users can comment on specific layers or coordinates.
  • Full version history with the ability to restore previous states.

What non-functional requirements does a real-time collaborative editor demand?

  • Scale: Support 1 million concurrent users across documents.
  • Latency: Cursor position updates must arrive within 50ms for users in the same region.
  • Consistency: All users must converge to the same document state eventually (strong eventual consistency).
  • Availability: 99.99% uptime — brief regional outages should not corrupt documents.

Step 2: The Core Conflict — CRDTs vs. Operational Transformation (OT)

This is the most intellectually demanding part of the interview. If User A moves a shape to position (100, 200) and User B simultaneously moves the same shape to position (150, 250), how do you ensure that both users converge to the same final state?

How does Operational Transformation (OT) resolve conflicts in Google Docs?

OT is a centralized, server-coordinated conflict resolution strategy. Every operation (insert character, move element, resize shape) is sent to a central server as a description of the change rather than as an absolute state. The server applies a transformation function that adjusts each incoming operation to account for the concurrent operations it has already applied:

  • User A sends: "Insert 'Hello' at index 0"
  • User B sends: "Insert 'World' at index 0" (concurrent)
  • Server applies A first, then transforms B's operation: "Insert 'World' at index 5" (immediately after the five characters of 'Hello')
  • Server broadcasts the transformed operations to all clients.
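
The insert-versus-insert case above can be sketched in a few lines of Python. This is a minimal illustration, not how any production editor implements OT: the `Insert` type, the `site` field, and the rule that the lower site ID wins a tie at the same index are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    index: int  # position in the text where the string is inserted
    text: str   # string being inserted
    site: str   # hypothetical client ID, used only to break ties


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    shifts = against.index < op.index or (
        against.index == op.index and against.site < op.site
    )
    if shifts:
        # `against` put text at or before our position, so move right.
        return Insert(op.index + len(against.text), op.text, op.site)
    return op


def apply(doc: str, op: Insert) -> str:
    return doc[:op.index] + op.text + doc[op.index:]


a = Insert(0, "Hello", site="A")
b = Insert(0, "World", site="B")

# Server order: A first, then B transformed against A.
server = apply(apply("", a), transform(b, a))
# Client B applied its own edit first, then receives A transformed against B.
client_b = apply(apply("", b), transform(a, b))
```

Both `server` and `client_b` end up holding the same string even though the two replicas applied the edits in opposite orders, which is the convergence property the transformation function exists to provide.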

Limitation: OT requires all operations on a document to pass through one central server that decides their order. That server is a single point of failure for the documents it owns, and at global scale with millions of users the transformation tier becomes a throughput bottleneck unless documents are sharded across servers.

How do CRDTs (Conflict-Free Replicated Data Types) power Figma's real-time sync?

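CRDTs are data structures whose merge rules resolve conflicts by construction. Concurrent operations commute, so every replica that has received the same set of operations ends up in the same state regardless of delivery order, and no coordination server is needed for correctness. A Figma-like document can be modeled as a tree of objects whose properties are last-writer-wins registers ordered by a Hybrid Logical Clock (HLC) timestamp. Note that Figma's own engineering blog describes its production system as CRDT-inspired but server-authoritative rather than a pure CRDT, which is worth saying out loud in the interview. In the CRDT-tree design: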

  • Every node (layer, shape, frame) has a globally unique CRDT identifier composed of a logical timestamp and a random node ID.
  • When two users delete and re-insert the same node simultaneously, the CRDT's merge function deterministically picks a winner based on timestamp ordering, without any round-trip to a central arbitration server.
  • In principle, this also allows peer-assisted synchronization: clients could sync directly with each other during an edge server outage, then reconcile with the central store when connectivity is restored. Treat this as a property of the CRDT approach rather than a documented Figma feature.

// Simplified last-write-wins update for element positioning
// Each update carries { elementId, newPosition, hlcTimestamp, authorId }
{
  "type": "MOVE_ELEMENT",
  "elementId": "rect-a4f2",
  "newPosition": { "x": 200, "y": 350 },
  "hlcTimestamp": "2026-05-28T12:00:00.001Z:0000:node-7e3f", // physical time : logical counter : nodeId
  "authorId": "user-01"
}
// If a concurrent operation arrives for the same elementId with
// a different hlcTimestamp, the higher HLC value wins (last-write-wins per element)
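
The "higher HLC value wins" rule can be sketched as a last-write-wins register per element. The names `MoveElement` and `Canvas` are hypothetical, and the HLC is modeled here as a plain `(physical_ms, logical_counter, node_id)` tuple, which is an assumption about the encoding rather than anything the update format above prescribes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# An HLC timestamp modeled as (physical_ms, logical_counter, node_id).
# Tuples compare field by field, which gives a total order with the
# node ID acting as the final tie-breaker.
Hlc = Tuple[int, int, str]


@dataclass(frozen=True)
class MoveElement:
    element_id: str
    position: Tuple[int, int]
    hlc: Hlc


class Canvas:
    """Replica holding one last-write-wins register per element."""

    def __init__(self) -> None:
        self._moves: Dict[str, MoveElement] = {}

    def apply(self, op: MoveElement) -> None:
        current = self._moves.get(op.element_id)
        # Keep the operation with the larger timestamp; ignore older ones.
        if current is None or op.hlc > current.hlc:
            self._moves[op.element_id] = op

    def position(self, element_id: str) -> Tuple[int, int]:
        return self._moves[element_id].position


op_a = MoveElement("rect-a4f2", (100, 200), (1_000, 0, "node-a"))
op_b = MoveElement("rect-a4f2", (150, 250), (1_000, 1, "node-b"))

replica_1, replica_2 = Canvas(), Canvas()
for op in (op_a, op_b):
    replica_1.apply(op)
for op in (op_b, op_a):  # same operations, opposite delivery order
    replica_2.apply(op)
```

Because `apply` only ever keeps the larger timestamp, delivering the operations in a different order, or delivering one of them twice, leaves both replicas with the same position for `rect-a4f2`.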

Step 3: Networking Architecture — WebSockets, WebRTC, and WebTransport

How do you synchronize cursor positions across 20+ users at 60fps?

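Cursor position updates happen up to 60 times per second per user. A document with 20 simultaneous users generates 1,200 inbound cursor events per second, and because each event is relayed to the other 19 participants, the server fans out up to 22,800 messages per second for that one room. These are not business-critical operations: each update supersedes the previous one, so dropping 1 in 5 cursor packets is invisible to users. For this tier, we use WebTransport over HTTP/3 (QUIC) and send cursor updates as datagrams. Datagrams are unreliable and unordered, so a lost packet is never retransmitted and cannot stall the updates behind it. The client simply interpolates from the last position it received to the next one that arrives.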
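
On the receiving side, tolerating loss and reordering comes down to keeping only the newest position per user. The sketch below assumes each sender stamps its updates with an increasing sequence number; `CursorUpdate`, `CursorBoard`, and the `seq` field are hypothetical names, not part of any WebTransport API.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CursorUpdate:
    user_id: str
    seq: int                   # per-sender counter, increases with every update
    position: Tuple[int, int]


class CursorBoard:
    """Keeps only the newest known cursor position for each user."""

    def __init__(self) -> None:
        self._latest: Dict[str, CursorUpdate] = {}

    def receive(self, update: CursorUpdate) -> bool:
        """Return True if the update was applied, False if it was stale."""
        known = self._latest.get(update.user_id)
        if known is not None and update.seq <= known.seq:
            return False  # late or duplicated packet: ignore it
        self._latest[update.user_id] = update
        return True

    def position(self, user_id: str) -> Tuple[int, int]:
        return self._latest[user_id].position


board = CursorBoard()
applied = [
    board.receive(CursorUpdate("alice", 1, (10, 10))),
    board.receive(CursorUpdate("alice", 3, (30, 30))),  # update 2 was overtaken
    board.receive(CursorUpdate("alice", 2, (20, 20))),  # arrives late, dropped
]
```

No retransmission or buffering is needed: a missing update is simply never seen, and a late one is discarded, so the cursor always reflects the most recent position that actually arrived.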

Why do document mutations require WebSockets instead of WebTransport?

Shape insertions, deletions, property changes, and layer moves are infrequent but must be delivered reliably. For these, we use persistent WebSocket connections over TCP. WebSockets provide bidirectional, ordered, reliable delivery for as long as the connection stays up, at the cost of head-of-line blocking if a packet is lost. For document mutations, this trade-off is acceptable because the frequency is low and ordering matters. Operations in flight when a connection drops can still be lost, so clients should resend unacknowledged operations after reconnecting.

What does the high-level architecture of a Figma-like system look like?


Users (Global)
    │
    ▼
[Cloudflare Workers / Global Edge Layer]
    │                    │
    ▼                    ▼
WebSocket Server     WebTransport Server
(Doc mutations)      (Cursor positions)
    │                    │
    └────────┬───────────┘
             ▼
      [Redis Pub/Sub Cluster]
      (Room-level broadcast bus)
             │
             ▼
    [CRDT Merge Engine]
    (Resolve conflicts server-side)
             │
             ▼
    [PostgreSQL — Document Snapshots]
    [S3 — Assets (images, fonts)]
    [Redis — Ephemeral cursor state]
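
The room-level broadcast bus in the diagram can be illustrated with an in-process stand-in. This sketch does not talk to Redis; `RoomBus` and its methods are hypothetical names that only show the fan-out contract (publish to a document's channel, deliver to every subscriber except the sender).

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[str], None]


class RoomBus:
    """In-process stand-in for one pub/sub channel per document."""

    def __init__(self) -> None:
        self._rooms: DefaultDict[str, Dict[str, Handler]] = defaultdict(dict)

    def subscribe(self, doc_id: str, connection_id: str, handler: Handler) -> None:
        self._rooms[doc_id][connection_id] = handler

    def unsubscribe(self, doc_id: str, connection_id: str) -> None:
        self._rooms.get(doc_id, {}).pop(connection_id, None)

    def publish(self, doc_id: str, sender_id: str, message: str) -> int:
        """Deliver to everyone in the room except the sender; return the count."""
        delivered = 0
        for connection_id, handler in list(self._rooms.get(doc_id, {}).items()):
            if connection_id != sender_id:
                handler(message)
                delivered += 1
        return delivered


bus = RoomBus()
inbox: Dict[str, List[str]] = {"alice": [], "bob": [], "carol": []}
for name in ("alice", "bob"):
    bus.subscribe("doc-1", name, inbox[name].append)
bus.subscribe("doc-2", "carol", inbox["carol"].append)

delivered = bus.publish("doc-1", sender_id="alice", message="MOVE_ELEMENT rect-a4f2")
```

Only `bob` receives the message: `alice` is the sender and `carol` is subscribed to a different document. In the real system the same contract would span processes, with each WebSocket server subscribing to the channels of the rooms it hosts.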

Step 4: Presence System — Who is Currently Viewing This Document?

Tracking which users are currently inside a specific document is a classic distributed presence problem. Our architecture uses the following pattern:

  1. On connection: Each user sends a "join room" event. The WebSocket server writes the user's session to a Redis Hash: presence:{docId} → { userId: lastHeartbeatTimestamp }.
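  2. Heartbeat: Every 5 seconds, the client sends a lightweight heartbeat ping. The server overwrites that user's lastHeartbeatTimestamp in the hash. A plain Redis TTL applies to the whole key rather than to one field, so it cannot expire a single user; per-field expiry (HEXPIRE) is only available in Redis 7.4 and later.
  3. On disconnect or missed heartbeat: A sweep that runs every few seconds removes any user whose last heartbeat is more than 10 seconds old. The server then broadcasts a "user left" event to all connected collaborators, because Redis will not notify the room on its own.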
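
The heartbeat bookkeeping in the three steps above can be sketched with an in-memory map standing in for the presence:{docId} hash. `PresenceMap` is a hypothetical name, and this version removes stale users with an explicit sweep over the stored timestamps rather than relying on Redis expiry, which is an assumption made to keep the example self-contained.

```python
from typing import Dict, List

HEARTBEAT_TIMEOUT_S = 10.0  # from the text: a silent user is purged within 10 seconds


class PresenceMap:
    """In-memory stand-in for presence:{docId} -> {userId: lastHeartbeatTimestamp}."""

    def __init__(self) -> None:
        self._rooms: Dict[str, Dict[str, float]] = {}

    def heartbeat(self, doc_id: str, user_id: str, now: float) -> None:
        # Joining and pinging are the same write: record the latest timestamp.
        self._rooms.setdefault(doc_id, {})[user_id] = now

    def sweep(self, doc_id: str, now: float) -> List[str]:
        """Remove stale users and return them so the caller can broadcast 'user left'."""
        room = self._rooms.get(doc_id, {})
        stale = [u for u, seen in room.items() if now - seen > HEARTBEAT_TIMEOUT_S]
        for user_id in stale:
            del room[user_id]
        return sorted(stale)

    def members(self, doc_id: str) -> List[str]:
        return sorted(self._rooms.get(doc_id, {}))


presence = PresenceMap()
presence.heartbeat("doc-1", "alice", now=0.0)
presence.heartbeat("doc-1", "bob", now=0.0)
presence.heartbeat("doc-1", "alice", now=5.0)  # alice keeps pinging
left = presence.sweep("doc-1", now=12.0)       # bob has been silent for 12 seconds
```

Passing `now` in explicitly keeps the logic deterministic and easy to test; a server would pass its own clock reading and publish one "user left" event per name returned by `sweep`.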

Step 5: Persistence & Version History

Raw CRDT event logs grow unboundedly. A large Figma document with 6 months of history might have millions of events. Our storage strategy uses a two-tier approach:

How do you handle real-time event replay for reconnecting users?

All CRDT operations are appended to an append-only log in Apache Kafka, keyed by document ID so that each document's operations stay in order within one partition. This enables real-time event replay for users who reconnect after a brief disconnection: they catch up by replaying events from the last Kafka offset they had processed.

Why are periodic document snapshots critical for onboarding new collaborators?

Every 15 minutes, a background job runs the CRDT merge algorithm across the event log and writes a full document snapshot as a JSON blob to PostgreSQL, together with the log offset the snapshot covers. This snapshot becomes the "base state" for new users who join the document for the first time: they load the snapshot and replay only the events recorded after that offset, instead of replaying the full history.
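
The interplay between snapshots and the event log can be sketched as follows, assuming each snapshot is stored with the log offset it covers. `Event`, `Snapshot`, and the helper functions are hypothetical, and the merge step is reduced to "the later event in the log overwrites the position", which is a simplification of a real CRDT merge.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

Position = Tuple[int, int]


class Event(NamedTuple):
    offset: int       # position in the append-only log
    element_id: str
    position: Position


class Snapshot(NamedTuple):
    last_offset: int  # highest log offset folded into `state`
    state: Dict[str, Position]


def fold(state: Dict[str, Position], events: Iterable[Event]) -> Dict[str, Position]:
    """Apply events in log order on top of a copy of `state`."""
    result = dict(state)
    for event in events:
        result[event.element_id] = event.position
    return result


def take_snapshot(log: List[Event]) -> Snapshot:
    last = log[-1].offset if log else -1
    return Snapshot(last, fold({}, log))


def load_document(snapshot: Snapshot, log: List[Event]) -> Dict[str, Position]:
    """What a newly joined client does: base state plus the tail of the log."""
    tail = [e for e in log if e.offset > snapshot.last_offset]
    return fold(snapshot.state, tail)


log = [
    Event(0, "rect-a4f2", (0, 0)),
    Event(1, "rect-a4f2", (200, 350)),
    Event(2, "circle-9b1c", (40, 40)),
]
snapshot = take_snapshot(log)                  # the periodic background job
log.append(Event(3, "rect-a4f2", (210, 360)))  # an edit made after the snapshot
document = load_document(snapshot, log)
```

The new client replays one event instead of four, and the result matches a full replay of the log. The same offset is what lets older log segments be archived once a snapshot covers them.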

How should image and font assets be stored and delivered globally?

All image uploads, custom fonts, and vector assets are stored in S3. A CloudFront CDN sits in front of S3, ensuring low-latency delivery globally. Asset URLs are embedded in the document CRDT as stable references — the CDN handles the delivery optimization independently.

Step 6: Scaling to 1 Million Concurrent Users

A single WebSocket server can handle approximately 50,000–100,000 concurrent connections, depending on message frequency. At 1 million concurrent users across all documents:

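  • Horizontal scaling: Deploy 20+ WebSocket server instances (1,000,000 connections ÷ 50,000 per server = 20 at the conservative end) behind a load balancer. A Layer 4 balancer cannot see which document a connection is for, so per-room stickiness needs a Layer 7 router or a lookup service that maps each document ID to its server.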
  • Room sharding: Each document room is "pinned" to a specific WebSocket server instance. All users in a given document connect to the same server, minimizing inter-server Pub/Sub overhead.
  • Redis Cluster: Shard the presence and event data across a Redis cluster with 6 nodes (3 primary + 3 replicas), ensuring no single Redis node becomes a bottleneck.
  • Global edge replication: Deploy WebSocket server clusters in at least 5 geographic regions (US-East, US-West, EU-West, Asia-Pacific, India). Users connect to the nearest region, which can cut tens of milliseconds of baseline latency for international users; the exact saving depends on where the document's home server runs.
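
The server count and the room-pinning rule above can be worked through in code. The capacity arithmetic uses the figures from the text; the pinning function uses rendezvous (highest-random-weight) hashing, which is one possible technique and is not something the design above specifies. Server names such as `ws-00` are made up for the example.

```python
import hashlib
import math
from typing import List


def servers_needed(concurrent_users: int, connections_per_server: int) -> int:
    return math.ceil(concurrent_users / connections_per_server)


def _score(doc_id: str, server: str) -> int:
    # A stable hash (unlike Python's built-in hash(), which varies per process).
    digest = hashlib.sha256(f"{doc_id}|{server}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(doc_id: str, servers: List[str]) -> str:
    """Rendezvous hashing: every router independently picks the highest score."""
    return max(servers, key=lambda server: _score(doc_id, server))


low_end = servers_needed(1_000_000, 50_000)    # conservative per-server capacity
high_end = servers_needed(1_000_000, 100_000)  # optimistic per-server capacity

servers = [f"ws-{i:02d}" for i in range(low_end)]
home = pick_server("doc-123", servers)

# Removing a server that is not the document's home leaves the pin unchanged.
other = next(s for s in servers if s != home)
still_home = pick_server("doc-123", [s for s in servers if s != other])
```

With this scheme, any router that knows the server list computes the same home for a document without a shared lookup table, and losing one server only moves the rooms that were pinned to it.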

📋 Legal Disclaimer & Copyright Information

Educational Purpose: This article is published solely for educational and informational purposes to help candidates prepare for technical interviews. It does not constitute professional career advice, legal advice, or recruitment guidance.

Nominative Fair Use of Trademarks: Company names, product names, and brand identifiers (including but not limited to Google, Meta, Amazon, Goldman Sachs, Bloomberg, Pramp, OpenAI, Anthropic, and others) are referenced solely to describe the subject matter of interview preparation. Such use is permitted under the nominative fair use doctrine and does not imply sponsorship, endorsement, affiliation, or certification by any of these organisations. All trademarks and registered trademarks are the property of their respective owners.

No Proprietary Question Reproduction: All interview questions, processes, and experiences described herein are based on community-reported patterns, publicly available candidate feedback, and general industry knowledge. MockExperts does not reproduce, distribute, or claim ownership of any proprietary assessment content, internal hiring rubrics, or confidential evaluation criteria belonging to any company.

No Official Affiliation: MockExperts is an independent AI-powered interview preparation platform. We are not officially affiliated with, partnered with, or approved by Google, Meta, Amazon, Goldman Sachs, Bloomberg, Pramp, or any other company mentioned in our content.
