🧠 Simple Definition (Word-for-word)

Core: generate a short code (6-8 chars) mapped to a long URL.

⚡ Super Simple Line

Schema: {id, shortCode, longUrl, userId, createdAt, clickCount}.

⚡ Key Details & Explanation

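Core: generate a short code (6-8 chars) that maps to a long URL.
Schema: {id, shortCode, longUrl, userId, createdAt, clickCount}, with a unique index on shortCode.
Shortcode generation: base62-encode an auto-incremented ID (no collisions, but codes are guessable), or generate a random string and check for collisions before saving.
API: POST /shorten returns the shortUrl; GET /:code redirects to longUrl. A 301 lets browsers cache the redirect, which reduces load but hides repeat clicks from analytics; use a 302 if every click must be counted.
Scalability: cache hot URLs in Redis (a small share of URLs receives most of the traffic, the 80/20 rule), put a CDN in front of the redirect service, and shard the DB by shortCode hash at massive scale.
Analytics: push click events onto an async queue so tracking never slows down the redirect.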
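
The base62 option can be sketched as below. This is a minimal illustration, not a production encoder: the function names and the alphabet order are assumptions, and any fixed ordering of the 62 characters works as long as encoder and decoder agree. Six characters cover 62^6 (about 56.8 billion) IDs.

```javascript
// Hypothetical helpers; the alphabet order is an arbitrary choice for this sketch.
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Turn an auto-incremented numeric ID into a short code.
function encodeBase62(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  if (id === 0) return ALPHABET[0];
  let code = "";
  while (id > 0) {
    code = ALPHABET[id % 62] + code; // prepend the least significant digit
    id = Math.floor(id / 62);
  }
  return code;
}

// Reverse the mapping, e.g. to look a row up by primary key.
function decodeBase62(code) {
  let id = 0;
  for (const ch of code) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new RangeError(`invalid character: ${ch}`);
    id = id * 62 + digit;
  }
  return id;
}
```

Because the code is derived from the ID, no collision check is needed, at the cost of sequential codes being easy to enumerate.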

⚡ One-line Interview Answer

Core: generate a short code (6-8 chars) mapped to a long URL.

🧠 Simple Definition (Word-for-word)

WebSockets for persistent connections — but can't hold 1M open connections on one server.

⚡ Super Simple Line

Solution: use a pub/sub layer (Redis Pub/Sub or Kafka).

⚡ Key Details & Explanation

Use WebSockets for persistent connections, but one server can't hold 1M open connections, so the connections are spread across many notification servers. To route between them, use a pub/sub layer (Redis Pub/Sub or Kafka): each notification server subscribes to channels for its connected users. When a notification is generated, publish it to the user's channel, and the server holding that user's connection delivers it. Redis Pub/Sub is fire-and-forget, so use a message queue where you need delivery guarantees and retries. Store undelivered notifications in the DB for users who are offline and deliver them on reconnect. For mobile, send push notifications through FCM/APNs.
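
The routing idea can be sketched in a single process. In this illustration a Node.js EventEmitter stands in for Redis Pub/Sub or Kafka, a Map stands in for the database of undelivered notifications, and a plain callback stands in for a WebSocket; `NotificationServer` and `notify` are hypothetical names.

```javascript
const { EventEmitter } = require("node:events");

// In-process stand-in for the pub/sub layer (assumption for this sketch).
const bus = new EventEmitter();
// Stand-in for the DB table of notifications awaiting offline users.
const undelivered = new Map();

class NotificationServer {
  constructor(name) {
    this.name = name;
    this.sockets = new Map(); // userId -> send function (stand-in for a WebSocket)
  }
  connect(userId, send) {
    this.sockets.set(userId, send);
    // Subscribe only to the channels of users connected to this server.
    bus.on(`user:${userId}`, send);
  }
  disconnect(userId) {
    const send = this.sockets.get(userId);
    if (send) bus.off(`user:${userId}`, send);
    this.sockets.delete(userId);
  }
}

function notify(userId, payload) {
  // emit() returns true only if some server is subscribed to this channel.
  const delivered = bus.emit(`user:${userId}`, payload);
  if (!delivered) {
    const pending = undelivered.get(userId) ?? [];
    pending.push(payload);
    undelivered.set(userId, pending);
  }
  return delivered;
}
```

The publisher never needs to know which server holds a user's connection; it only knows the channel name.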

⚡ One-line Interview Answer

WebSockets for persistent connections — but can't hold 1M open connections on one server.

🧠 Simple Definition (Word-for-word)

Avoid sending large files through your server.

⚡ Super Simple Line

Pattern: client requests a presigned URL from your API, your API calls S3 to generate a presigned URL, client uploads directly to S3 bypassing your server.

⚡ Key Details & Explanation

Avoid sending large files through your server. Pattern: the client requests a presigned URL from your API, your API asks S3 to generate it, and the client uploads directly to S3, bypassing your server. For very large files (>100MB), use multipart upload: split the file into chunks (5MB+ each, except the last), upload each chunk with its own presigned URL, and complete the multipart upload once all chunks have arrived. Your API just coordinates metadata. Resumability: track which chunks have been uploaded and resume from the first missing one after a failure.
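
The chunking and resume bookkeeping can be sketched without touching S3 at all. The helper names below are hypothetical, and the sketch only computes byte ranges; requesting presigned URLs and calling the S3 multipart API are left out.

```javascript
// 5 MiB: the minimum S3 accepts for every part except the last one.
const MIN_PART_SIZE = 5 * 1024 * 1024;

// Split a file of `fileSize` bytes into numbered parts (part numbers start at 1).
function planParts(fileSize, partSize = MIN_PART_SIZE) {
  if (partSize < MIN_PART_SIZE) throw new RangeError("part size below 5 MiB");
  const parts = [];
  for (let start = 0, n = 1; start < fileSize; start += partSize, n++) {
    // `end` is exclusive, so it can be passed straight to Blob.slice(start, end).
    parts.push({ partNumber: n, start, end: Math.min(start + partSize, fileSize) });
  }
  return parts;
}

// Resumability: given the part numbers already stored, return what is still missing.
function remainingParts(parts, uploadedPartNumbers) {
  const done = new Set(uploadedPartNumbers);
  return parts.filter((p) => !done.has(p.partNumber));
}
```

After a failure the client would re-request presigned URLs only for the parts that `remainingParts` returns.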

⚡ One-line Interview Answer

Avoid sending large files through your server.

🧠 Simple Definition (Word-for-word)

A Rate Limiter is a system component that limits the number of requests a client can make within a given timeframe to protect API resources from abuse or overloading. The Token Bucket algorithm refills a bucket with tokens at a constant rate up to a fixed capacity; each request consumes a token, so saved-up tokens allow short bursts of traffic. The Sliding Window Log stores a timestamp for every request in the rolling window, which is exact but memory-hungry. The Sliding Window Counter approximates it with low memory overhead by combining the current fixed-window count with a weighted share of the previous window's count.

⚡ Super Simple Line

Token Bucket = allows traffic bursts (if tokens are in bucket, requests run instantly).
Sliding Window = counts requests over a rolling time window, so a client cannot double its limit by bursting at a window boundary.

📊 Comparison of Rate Limiting Algorithms

Algorithm	Traffic Bursts	Memory Usage	Implementation Complexity
Token Bucket	✅ Allowed	Low (stores integer)	Easy
Leaky Bucket	❌ Smoothed out	Low (FIFO queue)	Medium
Sliding Window Log	❌ Strict limit	⚠️ High (stores all timestamps)	Hard
Sliding Window Counter	❌ Limited (approximate)	Low (stores counters)	Medium
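
A token bucket fits in a few lines. This is a minimal single-process sketch, not a distributed limiter: the `TokenBucket` class and its parameter names are assumptions, and the clock is passed in so the behavior is easy to follow.

```javascript
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity; // maximum burst size
    this.refillPerSecond = refillPerSecond; // sustained request rate
    this.tokens = capacity; // start with a full bucket
    this.lastRefill = now;
  }

  // Returns true and spends a token if the request may proceed.
  tryConsume(now = Date.now()) {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    // Refill lazily based on elapsed time, never above capacity.
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With `new TokenBucket(3, 1)`, three requests can pass back to back, after which the client is held to one request per second until the bucket refills.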

🧪 Redis Implementation Concept

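Redis is commonly used to build distributed rate limiters because its state is shared by every app server and each command runs atomically; MULTI/EXEC runs the group below as one transaction:

// Sliding Window Log using a Redis sorted set (ZSET): one entry per request.
// Assumes `redis` is a connected ioredis client.
async function isRateLimited(userId) {
  const now = Date.now();
  const windowMs = 60000; // 1 minute
  const maxRequests = 100;

  const key = `rate_limit:${userId}`;
  // Unique member so two requests in the same millisecond are both recorded
  const member = `${now}:${Math.random().toString(36).slice(2)}`;

  const results = await redis.multi()
    .zremrangebyscore(key, 0, now - windowMs) // Remove timestamps outside the window
    .zcard(key) // Count requests still inside the window
    .zadd(key, now, member) // Log the current request
    .expire(key, Math.ceil(windowMs / 1000)) // Let idle keys expire
    .exec();

  // ioredis returns [error, result] pairs; index 1 is the ZCARD reply
  const [, requestCount] = results[1];
  return requestCount >= maxRequests;
}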
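
The sorted-set code above stores one entry per request. A counter-based sliding window keeps only two counters per client instead. The following is a minimal in-memory sketch of that idea; the `SlidingWindowCounter` class is a hypothetical name, and a shared deployment would keep the counters in Redis rather than in process memory.

```javascript
class SlidingWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windowStart = 0; // start of the current fixed window
    this.current = 0; // requests counted in the current window
    this.previous = 0; // requests counted in the window before it
  }

  allow(now = Date.now()) {
    const start = Math.floor(now / this.windowMs) * this.windowMs;
    if (start !== this.windowStart) {
      // Carry the old count over only if the old window is directly adjacent.
      this.previous = start - this.windowStart === this.windowMs ? this.current : 0;
      this.current = 0;
      this.windowStart = start;
    }
    // Fraction of the previous window still covered by the rolling window.
    const overlap = 1 - (now - start) / this.windowMs;
    const estimated = this.previous * overlap + this.current;
    if (estimated >= this.limit) return false;
    this.current += 1;
    return true;
  }
}
```

The estimate assumes requests in the previous window were evenly spread, which is why this variant is approximate rather than exact.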

⚡ One-line Interview Answer

Rate limiters protect APIs using algorithms like Token Bucket, which permits short bursts up to the bucket size, or Sliding Window Counter, which smooths traffic across window boundaries while keeping only a couple of counters per client.

🧠 Simple Definition (Word-for-word)

In a distributed system you can guarantee at most 2 of 3: Consistency (every read gets the latest write), Availability (every request gets a response), Partition Tolerance (system works despite network failures).

⚡ Super Simple Line

Since network partitions WILL happen, the real choice is C vs A during a partition.

⚡ Key Details & Explanation

In a distributed system you can guarantee at most 2 of 3: Consistency (every read gets the latest write), Availability (every request gets a response), Partition Tolerance (system works despite network failures). Since network partitions WILL happen, the real choice is C vs A during a partition. CP systems (for example MongoDB with majority reads and writes, or PostgreSQL with synchronous replication) return an error rather than stale data. AP systems (for example Cassandra, or DynamoDB with its default eventually consistent reads) return possibly stale data but remain available. Many of these databases are tunable per request, so the label describes a configuration rather than the product. Choose based on your domain: banking = CP, social feed = AP.
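
The CP versus AP choice during a partition can be sketched with a toy replica. The `Replica` class, its `mode` setting, and the `partitioned` flag are hypothetical names for illustration and do not correspond to any database's API.

```javascript
class Replica {
  constructor(mode) {
    this.mode = mode; // "CP" or "AP"
    this.value = null;
    this.partitioned = false; // true when this replica cannot reach the primary
  }

  // Called by the primary for each write; lost while partitioned.
  replicate(value) {
    if (!this.partitioned) this.value = value;
  }

  read() {
    if (this.partitioned && this.mode === "CP") {
      // Consistency first: refuse to answer rather than risk stale data.
      throw new Error("unavailable: cannot confirm latest write");
    }
    // Availability first: answer with whatever this replica has.
    return this.value;
  }
}
```

After a partition, the AP replica keeps answering with the last value it saw, while the CP replica fails the read until the partition heals.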

⚡ One-line Interview Answer

In a distributed system you can guarantee at most 2 of 3: Consistency (every read gets the latest write), Availability (every request gets a response), Partition Tolerance (system works despite network failures).

🧠 Simple Definition (Word-for-word)

Eventual consistency: in a distributed system, all replicas will eventually converge to the same value — but at any point in time, different nodes may serve different values.

⚡ Super Simple Line

Acceptable for: social media feeds (stale posts for a few seconds is fine), shopping cart (eventual sync is ok), DNS propagation, search indexes.

⚡ Key Details & Explanation

Eventual consistency: in a distributed system, all replicas will eventually converge to the same value — but at any point in time, different nodes may serve different values. Acceptable for: social media feeds (stale posts for a few seconds is fine), shopping cart (eventual sync is ok), DNS propagation, search indexes. Not acceptable for: financial transactions (money debits), inventory counts where overselling is a problem, authentication (revoked tokens must be respected immediately).
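
Convergence can be illustrated with a last-write-wins register, one of the simplest eventually consistent structures. This is a sketch under assumptions: `LwwReplica` is a hypothetical name, and the timestamps are passed in by the caller, whereas a real system needs a trustworthy clock or version scheme.

```javascript
class LwwReplica {
  constructor() {
    this.value = undefined;
    this.timestamp = 0;
  }

  // Accept a write only if it is newer than what this replica already holds.
  write(value, timestamp) {
    if (timestamp > this.timestamp) {
      this.value = value;
      this.timestamp = timestamp;
    }
  }

  // Anti-entropy: exchange state in both directions; the newer write wins.
  syncWith(other) {
    other.write(this.value, this.timestamp);
    this.write(other.value, other.timestamp);
  }
}
```

Between a write and the next sync, two replicas can return different values; after the sync they agree, which is exactly the guarantee eventual consistency gives.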

⚡ One-line Interview Answer

Eventual consistency: in a distributed system, all replicas will eventually converge to the same value — but at any point in time, different nodes may serve different values.

🧠 Simple Definition (Word-for-word)

Vertical: add more CPU/RAM to one machine — simple, no code changes, but limited and expensive, single point of failure.

⚡ Super Simple Line

Horizontal: add more machines — theoretically unlimited scale, resilient.

⚡ Key Details & Explanation

Vertical: add more CPU/RAM to one machine. Simple, no code changes, but limited and expensive, and the machine is a single point of failure.
Horizontal: add more machines. Theoretically unlimited scale, and resilient to losing a node.
Horizontal is hard when:
State needs to be shared (sessions, file uploads), which requires shared storage such as Redis or S3.
Database writes don't scale out easily (sharding adds complexity).
Stateful WebSocket connections are tied to one server.
Stateless apps scale horizontally easily, which is why stateless JWT auth and external state stores (Redis) matter.

⚡ One-line Interview Answer

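Vertical scaling adds CPU/RAM to one machine: simple, but capped and a single point of failure. Horizontal scaling adds machines: near-unlimited and resilient, but hard once state, database writes, or WebSocket connections are tied to a specific server.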

🧠 Simple Definition (Word-for-word)

Message queues decouple producers and consumers, enable async processing, and add resilience.

⚡ Super Simple Line

RabbitMQ: traditional message broker, complex routing (exchanges, bindings), good for task queues, RPC patterns, small-medium volume.

⚡ Key Details & Explanation

Message queues decouple producers and consumers, enable async processing, and add resilience.
RabbitMQ: traditional message broker with flexible routing (exchanges, bindings). Good for task queues, RPC patterns, and small-to-medium volume.
Kafka: high-throughput event streaming platform. Messages are retained and replayable, and consumer groups share the load. Best for data pipelines, analytics, event sourcing, and millions of events/sec.
Bull/BullMQ: Redis-backed job queue for Node.js. Simple, great for background jobs (email, thumbnail generation), has built-in retry logic, and needs no separate infra if you already use Redis.
A practical answer should also mention how I would verify the choice, for example by load-testing expected throughput, measuring end-to-end latency, or testing retry behavior when a consumer fails.
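
The decoupling and retry behavior can be sketched with an in-memory queue. This is not how RabbitMQ, Kafka, or BullMQ are implemented; `JobQueue`, `enqueue`, and `drain` are hypothetical names, and a real broker would persist jobs and add backoff between attempts.

```javascript
class JobQueue {
  constructor(maxAttempts = 3) {
    this.maxAttempts = maxAttempts;
    this.pending = [];
    this.failed = []; // dead-letter list for jobs that exhausted their retries
  }

  // Producer side: returns immediately, the work happens later.
  enqueue(payload) {
    this.pending.push({ payload, attempts: 0 });
  }

  // Consumer side: process everything queued, retrying failed jobs.
  async drain(handler) {
    while (this.pending.length > 0) {
      const job = this.pending.shift();
      job.attempts += 1;
      try {
        await handler(job.payload);
      } catch (err) {
        if (job.attempts < this.maxAttempts) this.pending.push(job);
        else this.failed.push(job);
      }
    }
  }
}
```

The producer only calls `enqueue`, so a slow or failing consumer never blocks the request that created the job.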

⚡ One-line Interview Answer

Message queues decouple producers and consumers, enable async processing, and add resilience.

🧠 Simple Definition (Word-for-word)

Microservices: split an app into independently deployable services, each owning its data and running in its own process.

⚡ Super Simple Line

Benefits: independent scaling, independent deployment, tech heterogeneity, team autonomy.

⚡ Key Details & Explanation

Microservices: split an app into independently deployable services, each owning its data and running in its own process. Benefits: independent scaling, independent deployment, tech heterogeneity, team autonomy. Problems they introduce: distributed system complexity (network failures, latency), service discovery, distributed tracing, data consistency across services (no ACID transactions spanning services), API contracts between teams, operational overhead (many deployments, many logs). Start with a monolith, and extract services only when there's a clear scaling or team boundary reason.

⚡ One-line Interview Answer

Microservices: split an app into independently deployable services, each owning its data and running in its own process.

🧠 Simple Definition (Word-for-word)

CDN (Content Delivery Network): geographically distributed servers that cache and serve static assets (images, JS, CSS, videos) from the edge server closest to the user.

⚡ Super Simple Line

How it works: request hits CDN edge, on cache miss the edge fetches from your origin server and caches it (TTL-based).

⚡ Key Details & Explanation

CDN (Content Delivery Network): geographically distributed servers that cache and serve static assets (images, JS, CSS, videos) from the edge server closest to the user. How it works: a request hits the CDN edge, and on a cache miss the edge fetches from your origin server and caches the response (TTL-based). Subsequent requests are served from the edge without hitting your origin. In Next.js on Vercel: images optimized by next/image and static assets in /public are served from Vercel's CDN by default; on other hosts you put your own CDN in front. Cache-Control headers control CDN caching behavior.
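
The hit, miss, and TTL behavior of an edge node can be sketched as a small cache in front of an origin function. `EdgeCache` is a hypothetical name, and the fixed TTL stands in for what a real CDN would derive from Cache-Control headers.

```javascript
class EdgeCache {
  constructor(fetchFromOrigin, ttlMs) {
    this.fetchFromOrigin = fetchFromOrigin; // function(path) -> response body
    this.ttlMs = ttlMs;
    this.entries = new Map(); // path -> { body, expiresAt }
  }

  get(path, now = Date.now()) {
    const entry = this.entries.get(path);
    if (entry && entry.expiresAt > now) {
      return { body: entry.body, cache: "HIT" }; // served without touching the origin
    }
    // Miss or expired entry: go to the origin and cache the fresh copy.
    const body = this.fetchFromOrigin(path);
    this.entries.set(path, { body, expiresAt: now + this.ttlMs });
    return { body, cache: "MISS" };
  }
}
```

Only the first request per TTL period reaches the origin, which is where the load reduction comes from.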

⚡ One-line Interview Answer

CDN (Content Delivery Network): geographically distributed servers that cache and serve static assets (images, JS, CSS, videos) from the edge server closest to the user.

🧠 Simple Definition (Word-for-word)

Long polling: client makes a request, server holds it open until data is available, then responds, and the client immediately makes a new request, so every message pays full HTTP request overhead.

⚡ Super Simple Line

SSE (Server-Sent Events): persistent one-directional connection (server to client only) over HTTP — simple, auto-reconnect, EventSource API.

⚡ Key Details & Explanation

Long polling: client makes a request, server holds it open until data is available, then responds, and the client immediately makes a new request, so every message pays full HTTP request overhead.
SSE (Server-Sent Events): persistent one-directional connection (server to client only) over HTTP. Simple, auto-reconnects, and uses the browser's EventSource API.
WebSockets: full-duplex persistent connection over a single TCP socket. Works in both directions with lower overhead per message, but is more complex to operate.
Use SSE for: notifications, live feeds (server pushes only).
Use WebSockets for: chat, collaborative editing, games (bidirectional real-time).
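
Part of what makes SSE simple is its wire format: plain text fields over a response with Content-Type text/event-stream. The helper below is a hypothetical sketch of how a server might format one event before writing it to the response.

```javascript
// Build one SSE frame from a payload and optional event name and id.
function formatSseEvent({ data, event, id }) {
  let frame = "";
  if (id !== undefined) frame += `id: ${id}\n`; // lets the browser resume after a reconnect
  if (event !== undefined) frame += `event: ${event}\n`; // custom event name for addEventListener
  // Each line of the payload needs its own "data:" field.
  for (const line of String(data).split("\n")) {
    frame += `data: ${line}\n`;
  }
  return frame + "\n"; // a blank line terminates the event
}
```

On the client, `new EventSource(url)` parses these frames and reconnects on its own if the connection drops.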

⚡ One-line Interview Answer

Long polling re-issues an HTTP request for each message, SSE is a one-way server-to-client stream over HTTP with auto-reconnect, and WebSockets give a full-duplex persistent connection for bidirectional real-time traffic.

🧠 Simple Definition (Word-for-word)

Two main approaches: Operational Transformation (OT), which transforms concurrent operations so they stay compatible (used by Google Docs); and CRDTs (Conflict-free Replicated Data Types), data structures designed to merge automatically without conflict (used or adapted by tools such as Figma and Notion).

⚡ Super Simple Line

For a simpler system: use Yjs library (CRDT-based), connect via WebSocket (Socket.io or native), broadcast operations to all connected clients.

⚡ Key Details & Explanation

Two main approaches: Operational Transformation (OT), which transforms concurrent operations so they stay compatible (used by Google Docs); and CRDTs (Conflict-free Replicated Data Types), data structures designed to merge automatically without conflict (used or adapted by tools such as Figma and Notion). For a simpler system: use the Yjs library (CRDT-based), connect via WebSocket (Socket.io or native), and broadcast operations to all connected clients. Persist: save document state to the DB on debounced changes or on a dedicated save action. The hard part is handling concurrent edits and cursor positions.
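
A text CRDT such as the one in Yjs is far more involved, but the merge property itself can be shown with one of the simplest CRDTs, a grow-only counter. This sketch is an illustration of the idea only; `GCounter` is a hypothetical class and is not part of Yjs.

```javascript
class GCounter {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.counts = {}; // replicaId -> increments made by that replica
  }

  // Each replica only ever changes its own slot.
  increment(by = 1) {
    this.counts[this.replicaId] = (this.counts[this.replicaId] ?? 0) + by;
  }

  value() {
    return Object.values(this.counts).reduce((sum, n) => sum + n, 0);
  }

  // Merge takes the per-replica maximum, so it is safe to apply
  // in any order and any number of times.
  merge(other) {
    for (const [id, n] of Object.entries(other.counts)) {
      this.counts[id] = Math.max(this.counts[id] ?? 0, n);
    }
  }
}
```

Two clients can increment offline and later merge in either direction; both end up with the same total without any coordination.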

⚡ One-line Interview Answer

Two main approaches: Operational Transformation (OT), which transforms concurrent operations so they stay compatible (used by Google Docs); and CRDTs (Conflict-free Replicated Data Types), data structures designed to merge automatically without conflict (used or adapted by tools such as Figma and Notion).

🧠 Simple Definition (Word-for-word)

For a chat system, I would start with conversations, participants, and messages as the core data model.

⚡ Super Simple Line

When a user sends a message, the backend should validate the user, save the message to the database first, and then deliver it in real time to online users through WebSockets.

⚡ Key Details & Explanation

For a chat system, I would start with conversations, participants, and messages as the core data model. When a user sends a message, the backend should validate the user, save the message to the database first, and then deliver it in real time to online users through WebSockets. If the recipient is offline, the message stays stored and can be loaded when they reconnect, and the system can also send a push notification. Message history should use cursor pagination because chat history can become large and users usually load older messages gradually. The system also needs delivery status, read receipts if required, and ordering by a reliable timestamp or sequence number. At scale, WebSocket servers need shared state through Redis Pub/Sub, Kafka, or another messaging layer so a message can reach users connected to different servers. The most important tradeoff is reliability versus complexity: for simple chat, saving messages and pushing live updates is enough, but for large-scale chat, ordering, retries, and multi-server delivery become the hard parts.
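
Cursor pagination over message history can be sketched against an in-memory array. The function name and the `seq` field are assumptions for illustration; against a database the same idea becomes a query that filters on `seq < cursor`, orders by `seq` descending, and applies a limit.

```javascript
// messages: array sorted by ascending sequence number `seq`.
// beforeSeq: the cursor; only messages older than it are returned.
function getMessagesBefore(messages, beforeSeq = Infinity, limit = 20) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const older = messages.filter((m) => m.seq < beforeSeq);
  const page = older.slice(-limit); // the newest `limit` messages before the cursor
  // The oldest message on this page becomes the cursor for the next request.
  const nextCursor = older.length > limit ? page[0].seq : null;
  return { page, nextCursor };
}
```

Unlike offset pagination, the cursor stays valid when new messages arrive while the user scrolls back through history.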

⚡ One-line Interview Answer

For a chat system, I would start with conversations, participants, and messages as the core data model.

🧠 Simple Definition (Word-for-word)

Frontend should debounce input and cancel stale requests.

⚡ Super Simple Line

Backend options depend on scale: SQL prefix search for simple cases, Elasticsearch/OpenSearch for large-scale ranking, typo tolerance, and advanced relevance.

⚡ Key Details & Explanation

Frontend should debounce input and cancel stale requests. Backend options depend on scale: SQL prefix search for simple cases, Elasticsearch/OpenSearch for large-scale ranking, typo tolerance, and advanced relevance. Return small payloads quickly, usually top 5-10 suggestions. Cache hot queries, track analytics for no-result searches, and rank by a combination of prefix match, popularity, and recent behavior where relevant.
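
The ranking step for the simple case can be sketched as a prefix filter sorted by popularity. The `suggest` function and the `{ text, popularity }` shape are assumptions for illustration; at scale this logic would live in the search engine rather than in application code.

```javascript
// terms: array of { text, popularity } objects.
function suggest(terms, prefix, limit = 5) {
  const needle = prefix.trim().toLowerCase();
  if (needle === "") return []; // nothing typed yet, nothing to suggest
  return terms
    .filter((t) => t.text.toLowerCase().startsWith(needle))
    // Most popular first; ties broken alphabetically for stable output.
    .sort((a, b) => b.popularity - a.popularity || a.text.localeCompare(b.text))
    .slice(0, limit) // keep the payload small
    .map((t) => t.text);
}
```

Because the result for a given prefix changes slowly, the output for hot prefixes is a natural thing to cache.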

⚡ One-line Interview Answer

Frontend should debounce input and cancel stale requests.

🧠 Simple Definition (Word-for-word)

An API gateway is the single entry point for clients in front of multiple services.

⚡ Super Simple Line

It can handle authentication, rate limiting, request routing, header normalization, logging, and sometimes response aggregation.

⚡ Key Details & Explanation

An API gateway is the single entry point for clients in front of multiple services. It can handle authentication, rate limiting, request routing, header normalization, logging, and sometimes response aggregation. This keeps cross-cutting concerns out of every service. Tradeoff: it can become a bottleneck or too smart if you push business logic into it. Keep it focused on routing and platform concerns, not domain behavior.
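
The routing and authentication responsibilities can be sketched as a pure function. The route table, service names, and `isValidToken` callback are all hypothetical; a real gateway would forward the request to the chosen service instead of returning its name.

```javascript
// Hypothetical route table: path prefix -> backing service.
const routes = [
  { prefix: "/users", service: "user-service" },
  { prefix: "/orders", service: "order-service" },
];

function routeRequest(request, isValidToken) {
  // Cross-cutting concern handled once, before any service is reached.
  if (!isValidToken(request.headers.authorization)) {
    return { status: 401, service: null };
  }
  // Match whole path segments so "/usersx" does not hit "/users".
  const route = routes.find(
    (r) => request.path === r.prefix || request.path.startsWith(r.prefix + "/")
  );
  if (!route) return { status: 404, service: null };
  return { status: 200, service: route.service };
}
```

Note that the function contains no domain logic: it decides who may enter and where the request goes, nothing more.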

⚡ One-line Interview Answer

An API gateway is the single entry point for clients in front of multiple services.

🧠 Simple Definition (Word-for-word)

Event-driven architecture is a design pattern where the flow of the application is driven by events.

⚡ Super Simple Line

Instead of directly calling functions, components communicate by emitting and listening to events.

⚡ Key Details & Explanation

Event-driven architecture is a design pattern where the flow of the application is driven by events.

Instead of directly calling functions, components communicate by emitting and listening to events.

Benefits include:

Loose coupling between components
Better scalability
Easier to extend features

In Node.js, this pattern is very common because of the built-in EventEmitter class and the runtime's asynchronous, event-loop-based design.

For example, in a notification system, one service emits an event and multiple services like email or logging can react to it independently.
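
The notification example can be sketched with Node's built-in EventEmitter. The event name `user.registered` and the two listeners are made up for illustration; the array stands in for real email and logging side effects.

```javascript
const { EventEmitter } = require("node:events");

const events = new EventEmitter();
const log = []; // stand-in for real side effects

// Two independent listeners react to the same event.
events.on("user.registered", (user) => log.push(`email: welcome ${user.name}`));
events.on("user.registered", (user) => log.push(`audit: user ${user.id} registered`));

// The emitter does not know who is listening, or how many listeners exist.
events.emit("user.registered", { id: 42, name: "Ada" });
```

Adding a third reaction, such as analytics, means registering one more listener without touching the code that emits the event.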

⚡ One-line Interview Answer

Event-driven architecture is a design pattern where the flow of the application is driven by events.

🧠 Simple Definition (Word-for-word)

To handle a sudden spike in traffic: I would first ensure that my infrastructure can scale (using cloud services like AWS or GCP).

⚡ Super Simple Line

Implement auto-scaling to add more servers as needed.

⚡ Key Details & Explanation

To handle a sudden spike in traffic:

I would first ensure that my infrastructure can scale (using cloud services like AWS or GCP).
Implement auto-scaling to add more servers as needed.
Use a load balancer to distribute traffic evenly across servers.
Implement caching to reduce database load.
And if necessary, I would also consider using a CDN to serve static assets faster.

This way, I can maintain performance and prevent downtime during traffic spikes.

⚡ One-line Interview Answer

To handle a sudden spike in traffic: I would first ensure that my infrastructure can scale (using cloud services like AWS or GCP).

System Design

Design a URL shortener (e.g. Bitly). Walk through schema, API, and scalability.

Design a real-time notification system for a web app with 1M users.

Design a file upload system with large file support.

How would you design a rate limiter? Explain token bucket vs sliding window algorithms.

What is the CAP theorem?

What is eventual consistency? When is it acceptable?

Horizontal vs vertical scaling. When does horizontal scaling become hard?

What is a message queue? When Kafka vs RabbitMQ vs Bull/Redis?

What are microservices? What problems do they introduce vs a monolith?

What is a CDN and how does it work?

Difference between WebSockets, SSE, and long polling?

How would you implement a live collaborative document editor?

Design a chat system for a product with online/offline users.

How would you design search autocomplete?

What is the role of an API gateway in a distributed system?

Explain Event Driven Architecture and its benefits in web applications.

🧠 Simple Definition (Word-for-word)

⚡ Super Simple Line