🧠 Simple Definition (Word-for-word)
Core: generate a short code (6-8 chars) mapped to a long URL.
⚡ Super Simple Line
Schema: {id, shortCode, longUrl, userId, createdAt, clickCount}.
⚡ Key Details & Explanation
Core: generate a short code (6-8 chars) mapped to a long URL. Schema: {id, shortCode, longUrl, userId, createdAt, clickCount}. Shortcode generation: base62 encoding of auto-incremented ID, or random string with collision check. API: POST /shorten → returns shortUrl; GET /:code → 301 redirect to longUrl (use 302 instead if every click must reach the server for analytics, because browsers cache 301 responses). Scalability: cache hot URLs in Redis (a small share of URLs receives most of the traffic, the 80/20 rule), CDN for the redirect service, DB sharding by shortCode hash for massive scale. Analytics: async queue for click tracking.
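The base62 option can be sketched as follows. This is a minimal illustration, not a production generator: the alphabet ordering and the function names `encodeBase62` / `decodeBase62` are assumptions chosen for the example.

```javascript
// Alphabet order is an assumption; any fixed ordering of 62 characters works.
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Encode a non-negative integer ID (e.g. an auto-incremented row ID) as base62.
function encodeBase62(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  if (id === 0) return ALPHABET[0];
  let code = "";
  while (id > 0) {
    code = ALPHABET[id % 62] + code; // prepend the least significant digit
    id = Math.floor(id / 62);
  }
  return code;
}

// Decode a base62 short code back to the integer ID.
function decodeBase62(code) {
  let id = 0;
  for (const ch of code) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new RangeError(`invalid character: ${ch}`);
    id = id * 62 + digit;
  }
  return id;
}
```

Six base62 characters cover 62^6 (about 56.8 billion) IDs, which is why 6-8 characters is usually enough.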
⚡ One-line Interview Answer
Core: generate a short code (6-8 chars) mapped to a long URL.
🧠 Simple Definition (Word-for-word)
WebSockets for persistent connections — but can't hold 1M open connections on one server.
⚡ Super Simple Line
Solution: use a pub/sub layer (Redis Pub/Sub or Kafka).
⚡ Key Details & Explanation
WebSockets for persistent connections, but a single server can't hold 1M open connections. Solution: use a pub/sub layer (Redis Pub/Sub or Kafka). Each notification server subscribes to channels for its connected users. When a notification is generated, publish to the user's channel, and the server holding that user's connection delivers it. Redis Pub/Sub is fire-and-forget, so use a message queue for delivery guarantees and retry. Store undelivered notifications in DB for users who are offline. Send push notifications (FCM/APNs) for mobile.
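The routing idea can be sketched in memory. `Broker`, `NotificationServer`, `notify`, and `offlineStore` are hypothetical names standing in for the real pub/sub layer, WebSocket servers, and database; no real Redis or Kafka API is used here.

```javascript
// In-memory stand-in for a pub/sub layer (hypothetical, not a Redis client).
class Broker {
  constructor() {
    this.channels = new Map(); // channel name -> Set of handlers
  }
  subscribe(channel, handler) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(handler);
  }
  // Returns how many subscribers received the message.
  publish(channel, message) {
    const handlers = this.channels.get(channel);
    if (!handlers) return 0;
    for (const handler of handlers) handler(message);
    return handlers.size;
  }
}

// Each server subscribes only to the channels of users connected to it.
class NotificationServer {
  constructor(name, broker) {
    this.name = name;
    this.broker = broker;
    this.delivered = []; // stands in for writes to open WebSocket connections
  }
  connect(userId) {
    this.broker.subscribe(`user:${userId}`, (message) => {
      this.delivered.push({ userId, message });
    });
  }
}

// Stand-in for the DB table of undelivered notifications.
const offlineStore = [];

function notify(broker, userId, message) {
  const receivers = broker.publish(`user:${userId}`, message);
  if (receivers === 0) offlineStore.push({ userId, message }); // user is offline
}
```

The publisher never needs to know which server holds the connection; it only knows the user's channel name.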
⚡ One-line Interview Answer
WebSockets for persistent connections — but can't hold 1M open connections on one server.
🧠 Simple Definition (Word-for-word)
Avoid sending large files through your server.
⚡ Super Simple Line
Pattern: client requests a presigned URL from your API, your API generates one with the S3 SDK (signing happens locally with your credentials), client uploads directly to S3 bypassing your server.
⚡ Key Details & Explanation
Avoid sending large files through your server. Pattern: client requests a presigned URL from your API, your API generates one with the S3 SDK (signing happens locally with your credentials), client uploads directly to S3 bypassing your server. For very large files (>100MB): multipart upload. Split the file into chunks (5MB+ each, except the last), upload each chunk with its own presigned URL, and complete the multipart upload when all chunks arrive. Your API just coordinates metadata. Resumability: track which chunks are uploaded, resume from where it failed.
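The chunk bookkeeping for multipart upload and resume can be sketched without touching S3 at all. `planParts` and `remainingParts` are hypothetical helpers, and the 8 MiB default part size is an assumption; only the 5 MiB minimum comes from the text above.

```javascript
// S3 requires every part except the last to be at least 5 MiB.
const MIN_PART_SIZE = 5 * 1024 * 1024;

// Split a file of `fileSize` bytes into numbered byte ranges.
// `end` is exclusive, so each range can be passed to Blob.slice(start, end).
function planParts(fileSize, partSize = 8 * 1024 * 1024) {
  if (partSize < MIN_PART_SIZE) {
    throw new RangeError("partSize is below the 5 MiB minimum");
  }
  const parts = [];
  let partNumber = 1; // S3 part numbers start at 1
  for (let start = 0; start < fileSize; start += partSize) {
    parts.push({ partNumber, start, end: Math.min(start + partSize, fileSize) });
    partNumber += 1;
  }
  return parts;
}

// Resumability: given the part numbers already uploaded, return what is left.
function remainingParts(parts, uploadedPartNumbers) {
  const done = new Set(uploadedPartNumbers);
  return parts.filter((part) => !done.has(part.partNumber));
}
```

The API would request one presigned URL per entry returned by `remainingParts`, so a failed upload restarts only the missing chunks.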
⚡ One-line Interview Answer
Avoid sending large files through your server.
🧠 Simple Definition (Word-for-word)
A Rate Limiter is a system component that limits the number of requests a client can make within a given timeframe to protect API resources from abuse or overloading. The Token Bucket algorithm refills a bucket with tokens at a constant rate up to a fixed capacity; each request consumes a token, so bursts up to the bucket size are allowed. The Sliding Window Counter keeps request counts per fixed window and weights the previous window by how much of it still overlaps the rolling window, giving a close approximation of a strict limit with low memory overhead.
⚡ Super Simple Line
Token Bucket = allows traffic bursts (if tokens are in bucket, requests run instantly).
Sliding Window = counts requests over a rolling time window (no bursts beyond the limit).
📊 Comparison of Rate Limiting Algorithms
| Algorithm | Traffic Bursts | Memory Usage | Implementation Complexity |
|---|---|---|---|
| Token Bucket | ✅ Allowed | Low (token count + last refill time) | Easy |
| Leaky Bucket | ❌ Smoothed out | Low (FIFO queue) | Medium |
| Sliding Window Log | ❌ Strict limit | ⚠️ High (stores all timestamps) | Hard |
| Sliding Window Counter | ❌ Near-strict (approximate) | Low (stores counters) | Medium |
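The Token Bucket row can be sketched as a small in-memory class. This is a single-process illustration under assumed names (`TokenBucket`, `tryConsume`); a distributed limiter would keep the same two fields in a shared store.

```javascript
class TokenBucket {
  // capacity: maximum burst size; refillPerSecond: sustained request rate.
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = now;
  }

  // Returns true and consumes one token if the request may proceed.
  tryConsume(now = Date.now()) {
    // Lazily add the tokens earned since the last call, capped at capacity.
    const elapsedSeconds = Math.max(0, now - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefillMs = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Passing `now` explicitly keeps the class deterministic in tests; in real use the default `Date.now()` is enough.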
🧪 Redis Implementation Concept
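Redis is commonly used to build distributed rate limiters because each command is atomic and MULTI/EXEC runs a group of commands as one uninterrupted transaction:
// Sliding Window Log using a Redis sorted set (ZSET).
// Assumes an ioredis-style client named `redis` is already connected.
async function isRateLimited(userId) {
  const now = Date.now();
  const windowMs = 60000; // 1 minute
  const maxRequests = 100;
  const key = `rate_limit:${userId}`;
  // Unique member so two requests in the same millisecond are both recorded
  const member = `${now}-${Math.random()}`;
  const results = await redis.multi()
    .zremrangebyscore(key, 0, now - windowMs) // Remove timestamps older than the window
    .zcard(key) // Count requests still inside the window
    .zadd(key, now, member) // Log current request (rejected requests are logged too)
    .expire(key, Math.ceil(windowMs / 1000)) // Let idle keys expire
    .exec();
  const requestsInWindow = results[1][1]; // ioredis returns [error, value] per command
  return requestsInWindow >= maxRequests;
}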
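A sorted set of timestamps keeps one entry per request. The counter variant instead keeps two integers per client and estimates the rolling count from them. The sketch below is in-memory and single-process; the class name `SlidingWindowCounter` and its fields are assumptions for illustration.

```javascript
class SlidingWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.currentWindowStart = 0;
    this.currentCount = 0;
    this.previousCount = 0;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  allow(now = Date.now()) {
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    if (windowStart !== this.currentWindowStart) {
      // Roll over. The old window only counts as "previous" if it is adjacent.
      const adjacent = windowStart - this.currentWindowStart === this.windowMs;
      this.previousCount = adjacent ? this.currentCount : 0;
      this.currentCount = 0;
      this.currentWindowStart = windowStart;
    }
    // Weight the previous window by the share of it still inside the rolling window.
    const elapsedFraction = (now - windowStart) / this.windowMs;
    const estimated =
      this.previousCount * (1 - elapsedFraction) + this.currentCount;
    if (estimated >= this.limit) return false;
    this.currentCount += 1;
    return true;
  }
}
```

The estimate assumes requests in the previous window were evenly spread, which is why this approach is approximate rather than exact.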
⚡ One-line Interview Answer
Rate limiters protect APIs using algorithms like Token Bucket, which permits momentary bursts of traffic, or Sliding Window Counter, which enforces a rolling limit with a low memory footprint.
🧠 Simple Definition (Word-for-word)
In a distributed system you can guarantee at most 2 of 3: Consistency (every read gets the latest write), Availability (every request gets a response), Partition Tolerance (system works despite network failures).
⚡ Super Simple Line
Since network partitions WILL happen, the real choice is C vs A during a partition.
⚡ Key Details & Explanation
In a distributed system you can guarantee at most 2 of 3: Consistency (every read gets the latest write), Availability (every request gets a response), Partition Tolerance (system works despite network failures). Since network partitions WILL happen, the real choice is C vs A during a partition. CP systems (MongoDB with majority reads and writes, PostgreSQL with synchronous replication): return an error rather than stale data. AP systems (Cassandra, DynamoDB with default eventually consistent reads): return possibly stale data but remain available. Choose based on your domain: banking = CP, social feed = AP.
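The C vs A choice during a partition can be shown with a toy model. `readDuringPartition` and the shape of the `replica` object are hypothetical; real databases make this decision inside their replication protocol.

```javascript
// Toy model: a replica knows its local value and how many replicas
// (including itself) it can currently reach.
function readDuringPartition(replica, mode) {
  const majority = Math.floor(replica.clusterSize / 2) + 1;
  const hasQuorum = replica.reachableReplicas >= majority;

  if (mode === "CP" && !hasQuorum) {
    // Consistency first: refuse to answer rather than risk a stale read.
    throw new Error("unavailable: cannot confirm the latest write without a majority");
  }
  // Availability first (or quorum is intact): answer from local state.
  return { value: replica.localValue, possiblyStale: !hasQuorum };
}
```

With a 5-node cluster split 2/3, the minority side errors in CP mode and serves possibly stale data in AP mode.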
⚡ One-line Interview Answer
In a distributed system you can guarantee at most 2 of 3: Consistency (every read gets the latest write), Availability (every request gets a response), Partition Tolerance (system works despite network failures).
🧠 Simple Definition (Word-for-word)
Eventual consistency: in a distributed system, all replicas will eventually converge to the same value — but at any point in time, different nodes may serve different values.
⚡ Super Simple Line
Acceptable for: social media feeds (stale posts for a few seconds is fine), shopping cart (eventual sync is ok), DNS propagation, search indexes.
⚡ Key Details & Explanation
Eventual consistency: in a distributed system, all replicas will eventually converge to the same value — but at any point in time, different nodes may serve different values. Acceptable for: social media feeds (stale posts for a few seconds is fine), shopping cart (eventual sync is ok), DNS propagation, search indexes. Not acceptable for: financial transactions (money debits), inventory counts where overselling is a problem, authentication (revoked tokens must be respected immediately).
⚡ One-line Interview Answer
Eventual consistency: in a distributed system, all replicas will eventually converge to the same value — but at any point in time, different nodes may serve different values.
🧠 Simple Definition (Word-for-word)
Vertical: add more CPU/RAM to one machine — simple, no code changes, but limited and expensive, single point of failure.
⚡ Super Simple Line
Horizontal: add more machines — theoretically unlimited scale, resilient.
⚡ Key Details & Explanation
Vertical: add more CPU/RAM to one machine — simple, no code changes, but limited and expensive, single point of failure. Horizontal: add more machines — theoretically unlimited scale, resilient. Horizontal is hard when: state needs to be shared (sessions, file uploads — need shared Redis/S3), database writes don't scale easily horizontally (sharding adds complexity), stateful WebSocket connections tied to one server. Stateless apps scale horizontally easily — this is why stateless JWT and external state stores (Redis) matter.
⚡ One-line Interview Answer
Vertical: add more CPU/RAM to one machine — simple, no code changes, but limited and expensive, single point of failure.
🧠 Simple Definition (Word-for-word)
Message queues decouple producers and consumers, enable async processing, and add resilience.
⚡ Super Simple Line
RabbitMQ: traditional message broker, complex routing (exchanges, bindings), good for task queues, RPC patterns, small-medium volume.
⚡ Key Details & Explanation
Message queues decouple producers and consumers, enable async processing, and add resilience. RabbitMQ: traditional message broker, complex routing (exchanges, bindings), good for task queues, RPC patterns, small-medium volume. Kafka: high-throughput event streaming platform, messages retained and replayable, consumer groups, event sourcing; best for data pipelines, analytics, millions of events/sec. Bull/BullMQ: Redis-backed job queue for Node.js; simple, great for background jobs (email, thumbnail generation), retry logic, no separate infra if you already use Redis. A practical answer should also mention how I would verify the choice, for example by load-testing throughput, watching queue depth and consumer lag, and testing retry behavior when a consumer fails.
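The retry behavior that job queues provide can be sketched with an in-memory queue. `JobQueue` is a hypothetical class, not the Bull/BullMQ API; it only illustrates re-queueing failed jobs and parking them in a dead-letter list after a maximum number of attempts.

```javascript
class JobQueue {
  constructor(maxAttempts = 3) {
    this.maxAttempts = maxAttempts;
    this.pending = [];
    this.deadLetter = []; // jobs that failed on every attempt
  }

  // Producer side: enqueue and return immediately.
  enqueue(payload) {
    this.pending.push({ payload, attempts: 0 });
  }

  // Consumer side: process everything queued; returns the number of successes.
  drain(handler) {
    let processed = 0;
    while (this.pending.length > 0) {
      const job = this.pending.shift();
      job.attempts += 1;
      try {
        handler(job.payload, job.attempts);
        processed += 1;
      } catch (err) {
        if (job.attempts < this.maxAttempts) {
          this.pending.push(job); // retry later, behind other jobs
        } else {
          this.deadLetter.push({ ...job, error: err.message });
        }
      }
    }
    return processed;
  }
}
```

A real queue would add backoff delays and persistence, but the producer/consumer split and the retry bookkeeping follow the same idea.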
⚡ One-line Interview Answer
Message queues decouple producers and consumers, enable async processing, and add resilience.
🧠 Simple Definition (Word-for-word)
Microservices: split an app into independently deployable services, each owning its data and running in its own process.
⚡ Super Simple Line
Benefits: independent scaling, independent deployment, tech heterogeneity, team autonomy.
⚡ Key Details & Explanation
Microservices: split an app into independently deployable services, each owning its data and running in its own process. Benefits: independent scaling, independent deployment, tech heterogeneity, team autonomy. Problems they introduce: distributed system complexity (network failures, latency), service discovery, distributed tracing, data consistency across services (no ACID transactions), API contracts between teams, operational overhead (many deployments, many logs). Start with a monolith, extract services only when there's a clear scaling or team boundary reason.
⚡ One-line Interview Answer
Microservices: split an app into independently deployable services, each owning its data and running in its own process.
🧠 Simple Definition (Word-for-word)
CDN (Content Delivery Network): geographically distributed servers that cache and serve static assets (images, JS, CSS, videos) from the edge server closest to the user.
⚡ Super Simple Line
How it works: request hits CDN edge, on cache miss the edge fetches from your origin server and caches it (TTL-based).
⚡ Key Details & Explanation
CDN (Content Delivery Network): geographically distributed servers that cache and serve static assets (images, JS, CSS, videos) from the edge server closest to the user. How it works: request hits CDN edge, on cache miss the edge fetches from your origin server and caches it (TTL-based). Subsequent requests are served from the edge without hitting your origin. In Next.js deployed on Vercel: next/image serves optimized images through Vercel's CDN, and static assets in /public are CDN-served by default. Cache-Control headers control CDN caching behavior.
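The hit/miss flow at one edge node can be sketched as a TTL cache in front of an origin fetch. `EdgeCache` and `fetchFromOrigin` are hypothetical names; a real CDN derives the TTL from Cache-Control headers rather than a constructor argument.

```javascript
class EdgeCache {
  // fetchFromOrigin: function(path) -> body; ttlMs: how long an entry stays fresh.
  constructor(fetchFromOrigin, ttlMs) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.ttlMs = ttlMs;
    this.entries = new Map(); // path -> { body, expiresAt }
  }

  get(path, now = Date.now()) {
    const entry = this.entries.get(path);
    if (entry && now < entry.expiresAt) {
      return { body: entry.body, cache: "HIT" }; // served from the edge
    }
    // Miss or expired: go to the origin once, then cache the result.
    const body = this.fetchFromOrigin(path);
    this.entries.set(path, { body, expiresAt: now + this.ttlMs });
    return { body, cache: "MISS" };
  }
}
```

Only misses reach the origin, so the origin load depends on TTL and on how many distinct paths are requested, not on total traffic.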
⚡ One-line Interview Answer
CDN (Content Delivery Network): geographically distributed servers that cache and serve static assets (images, JS, CSS, videos) from the edge server closest to the user.
🧠 Simple Definition (Word-for-word)
Long polling: client makes request, server holds it open until data is available, then responds, client immediately makes a new request; every message pays full HTTP request overhead.
⚡ Super Simple Line
SSE (Server-Sent Events): persistent one-directional connection (server to client only) over HTTP — simple, auto-reconnect, EventSource API.
⚡ Key Details & Explanation
Long polling: client makes request, server holds it open until data is available, then responds, client immediately makes a new request; every message pays full HTTP request overhead. SSE (Server-Sent Events): persistent one-directional connection (server to client only) over HTTP; simple, auto-reconnect, EventSource API. WebSockets: full-duplex persistent TCP connection; both directions, lower overhead per message, more complex. Use SSE for: notifications, live feeds (server pushes only). Use WebSockets for: chat, collaborative editing, games (bidirectional real-time).
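Part of what makes SSE simple is its wire format: plain text fields on a `text/event-stream` response, with a blank line ending each event. The helper name `formatSseEvent` is an assumption; the `id:`, `event:`, and `data:` field names come from the SSE format itself.

```javascript
// Build one SSE event as the text a server would write to the response stream.
function formatSseEvent({ event, data, id }) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`); // lets the client resume after reconnect
  if (event !== undefined) lines.push(`event: ${event}`); // custom event type
  // Multi-line payloads become one "data:" line per line of text.
  for (const line of String(data).split("\n")) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n"; // blank line terminates the event
}
```

On the client, `new EventSource(url)` parses this format and reconnects automatically, which is why no extra client library is needed.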
⚡ One-line Interview Answer
Long polling: client makes request, server holds it open until data is available, then responds, client immediately makes a new request; every message pays full HTTP request overhead.
🧠 Simple Definition (Word-for-word)
Two main approaches: Operational Transformation (OT) — transforms concurrent operations to be compatible (used by Google Docs); CRDT (Conflict-free Replicated Data Types) — data structures designed to merge automatically without conflict (used by Figma, Notion).
⚡ Super Simple Line
For a simpler system: use Yjs library (CRDT-based), connect via WebSocket (Socket.io or native), broadcast operations to all connected clients.
⚡ Key Details & Explanation
Two main approaches: Operational Transformation (OT) — transforms concurrent operations to be compatible (used by Google Docs); CRDT (Conflict-free Replicated Data Types) — data structures designed to merge automatically without conflict (used by Figma, Notion). For a simpler system: use Yjs library (CRDT-based), connect via WebSocket (Socket.io or native), broadcast operations to all connected clients. Persist: save document state to DB on debounced changes or a dedicated save action. The hard part is handling concurrent edits and cursor positions.
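The "merge automatically without conflict" property of CRDTs is easiest to see on a much simpler structure than a text document. The sketch below is a grow-only counter (G-Counter), not the Yjs API; the function names `increment`, `merge`, and `value` are assumptions.

```javascript
// A G-Counter is a map of replicaId -> count. Each replica only
// increments its own slot, so concurrent updates never collide.
function increment(counter, replicaId, by = 1) {
  return { ...counter, [replicaId]: (counter[replicaId] || 0) + by };
}

// Merge takes the per-replica maximum. It is commutative, associative,
// and idempotent, so replicas converge regardless of delivery order.
function merge(a, b) {
  const merged = { ...a };
  for (const [replicaId, count] of Object.entries(b)) {
    merged[replicaId] = Math.max(merged[replicaId] || 0, count);
  }
  return merged;
}

// The counter's value is the sum over all replicas.
function value(counter) {
  return Object.values(counter).reduce((sum, count) => sum + count, 0);
}
```

Text CRDTs apply the same principle to sequences of characters with unique IDs, which is considerably more involved and the reason to reach for a library.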
⚡ One-line Interview Answer
Two main approaches: Operational Transformation (OT) — transforms concurrent operations to be compatible (used by Google Docs); CRDT (Conflict-free Replicated Data Types) — data structures designed to merge automatically without conflict (used by Figma, Notion).
🧠 Simple Definition (Word-for-word)
For a chat system, I would start with conversations, participants, and messages as the core data model.
⚡ Super Simple Line
When a user sends a message, the backend should validate the user, save the message to the database first, and then deliver it in real time to online users through WebSockets.
⚡ Key Details & Explanation
For a chat system, I would start with conversations, participants, and messages as the core data model. When a user sends a message, the backend should validate the user, save the message to the database first, and then deliver it in real time to online users through WebSockets. If the recipient is offline, the message stays stored and can be loaded when they reconnect, and the system can also send a push notification. Message history should use cursor pagination because chat history can become large and users usually load older messages gradually. The system also needs delivery status, read receipts if required, and ordering by a reliable timestamp or sequence number. At scale, WebSocket servers need shared state through Redis Pub/Sub, Kafka, or another messaging layer so a message can reach users connected to different servers. The most important tradeoff is reliability versus complexity: for simple chat, saving messages and pushing live updates is enough, but for large-scale chat, ordering, retries, and multi-server delivery become the hard parts.
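Cursor pagination over message history can be sketched on an in-memory array. `getMessagesBefore` and the `seq` field are hypothetical; in a database this would be a query of the form "sequence number less than cursor, newest first, limited".

```javascript
// `messages` is assumed to be one conversation's messages sorted by
// ascending sequence number. `beforeSeq` is the cursor (null = newest page).
function getMessagesBefore(messages, beforeSeq, limit) {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const older =
    beforeSeq === null ? messages : messages.filter((m) => m.seq < beforeSeq);
  // Take the newest `limit` messages among those older than the cursor.
  const page = older.slice(Math.max(0, older.length - limit));
  // The oldest message on this page becomes the cursor for the next page.
  const nextCursor = page.length < older.length ? page[0].seq : null;
  return { page, nextCursor };
}
```

Unlike offset pagination, the cursor stays valid when new messages arrive while the user scrolls back.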
⚡ One-line Interview Answer
For a chat system, I would start with conversations, participants, and messages as the core data model.
🧠 Simple Definition (Word-for-word)
Frontend should debounce input and cancel stale requests.
⚡ Super Simple Line
Backend options depend on scale: SQL prefix search for simple cases, Elasticsearch/OpenSearch for large-scale ranking, typo tolerance, and advanced relevance.
⚡ Key Details & Explanation
Frontend should debounce input and cancel stale requests. Backend options depend on scale: SQL prefix search for simple cases, Elasticsearch/OpenSearch for large-scale ranking, typo tolerance, and advanced relevance. Return small payloads quickly, usually top 5-10 suggestions. Cache hot queries, track analytics for no-result searches, and rank by a combination of prefix match, popularity, and recent behavior where relevant.
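The simple-case backend (prefix match ranked by popularity) can be sketched in memory. `suggest` and the `{ term, popularity }` entry shape are assumptions for illustration; a SQL version would be a prefix `LIKE` query ordered by a popularity column.

```javascript
// Return up to `limit` terms that start with the query, most popular first.
function suggest(entries, query, limit = 5) {
  const prefix = query.trim().toLowerCase();
  if (prefix === "") return []; // nothing typed yet: skip the search
  return entries
    .filter((entry) => entry.term.toLowerCase().startsWith(prefix))
    .sort(
      (a, b) =>
        b.popularity - a.popularity || a.term.localeCompare(b.term) // tie-break by name
    )
    .slice(0, limit)
    .map((entry) => entry.term);
}
```

Keeping the response to a handful of short strings is what makes it cheap to cache per prefix.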
⚡ One-line Interview Answer
Frontend should debounce input and cancel stale requests.
🧠 Simple Definition (Word-for-word)
An API gateway is the single entry point for clients in front of multiple services.
⚡ Super Simple Line
It can handle authentication, rate limiting, request routing, header normalization, logging, and sometimes response aggregation.
⚡ Key Details & Explanation
An API gateway is the single entry point for clients in front of multiple services. It can handle authentication, rate limiting, request routing, header normalization, logging, and sometimes response aggregation. This keeps cross-cutting concerns out of every service. Tradeoff: it can become a bottleneck or too smart if you push business logic into it. Keep it focused on routing and platform concerns, not domain behavior.
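The routing and authentication checks can be sketched as a lookup over a route table. The `routes` table, the service names, and `routeRequest` are hypothetical; the point is that the gateway decides where a request goes and whether it may pass, without containing domain logic.

```javascript
// Hypothetical route table: path prefix -> backing service.
const routes = [
  { prefix: "/users", service: "user-service", requiresAuth: true },
  { prefix: "/orders", service: "order-service", requiresAuth: true },
  { prefix: "/health", service: "status-service", requiresAuth: false },
];

function routeRequest(request, routeTable) {
  // Match on whole path segments and prefer the longest matching prefix.
  const route = routeTable
    .filter(
      (r) => request.path === r.prefix || request.path.startsWith(r.prefix + "/")
    )
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];

  if (!route) return { status: 404 };
  if (route.requiresAuth && !request.headers.authorization) {
    return { status: 401 }; // cross-cutting concern handled once, at the edge
  }
  return { status: 200, forwardTo: route.service };
}
```

Real token validation would replace the simple header presence check used here.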
⚡ One-line Interview Answer
An API gateway is the single entry point for clients in front of multiple services.
🧠 Simple Definition (Word-for-word)
Event-driven architecture is a design pattern where the flow of the application is driven by events.
⚡ Super Simple Line
Instead of directly calling functions, components communicate by emitting and listening to events.
⚡ Key Details & Explanation
Event-driven architecture is a design pattern where the flow of the application is driven by events.
Instead of directly calling functions, components communicate by emitting and listening to events.
Benefits include:
- Loose coupling between components
- Better scalability
- Easier to extend features
In Node.js, this pattern is very common because of the built-in EventEmitter class and the runtime's asynchronous nature.
For example, in a notification system, one service emits an event and multiple services like email or logging can react to it independently.
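The notification example can be sketched with Node's built-in EventEmitter. The event name `notification:created`, the `createNotification` function, and the two in-memory arrays are assumptions standing in for real email and logging services.

```javascript
const { EventEmitter } = require("node:events");

const bus = new EventEmitter();
const sentEmails = []; // stands in for an email service
const auditLog = []; // stands in for a logging service

// Each listener reacts independently; the emitter does not know about them.
bus.on("notification:created", (notification) => {
  sentEmails.push(`email to ${notification.userId}: ${notification.text}`);
});
bus.on("notification:created", (notification) => {
  auditLog.push(`logged ${notification.id}`);
});

function createNotification(id, userId, text) {
  const notification = { id, userId, text };
  // emit() calls the listeners synchronously, in registration order.
  bus.emit("notification:created", notification);
  return notification;
}
```

Adding a third reaction, such as a push notification, means registering one more listener without touching `createNotification`.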
⚡ One-line Interview Answer
Event-driven architecture is a design pattern where the flow of the application is driven by events.
🧠 Simple Definition (Word-for-word)
To handle a sudden spike in traffic: I would first ensure that my infrastructure can scale (using cloud services like AWS or GCP).
⚡ Super Simple Line
Implement auto-scaling to add more servers as needed.
⚡ Key Details & Explanation
To handle a sudden spike in traffic:
- I would first ensure that my infrastructure can scale (using cloud services like AWS or GCP).
- Implement auto-scaling to add more servers as needed.
- Use a load balancer to distribute traffic evenly across servers.
- Implement caching to reduce database load.
- And if necessary, I would also consider using a CDN to serve static assets faster.
This way, I can maintain performance and prevent downtime during traffic spikes.
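The auto-scaling step comes down to a capacity calculation. The sketch below uses assumed numbers: the function name `desiredInstances`, the 25% headroom, and the min/max bounds are illustrative, not values from any cloud provider.

```javascript
// How many instances are needed for the current load, with some headroom,
// clamped to configured bounds so scaling never drops to zero or runs away.
function desiredInstances(
  currentRps,
  rpsPerInstance,
  { min = 2, max = 20, headroom = 0.25 } = {}
) {
  const needed = Math.ceil((currentRps * (1 + headroom)) / rpsPerInstance);
  return Math.min(max, Math.max(min, needed));
}
```

The load balancer then spreads traffic across however many instances this yields, while caching and the CDN reduce the `currentRps` that reaches the servers at all.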
⚡ One-line Interview Answer
To handle a sudden spike in traffic: I would first ensure that my infrastructure can scale (using cloud services like AWS or GCP).