REST vs GraphQL vs gRPC?

REST is simplest and most compatible. GraphQL fits flexible client queries. gRPC is efficient for service-to-service. Choose based on client and performance needs.

How do you version APIs?

URL versioning (/v1) is simplest; header versioning is cleaner but harder to debug. Either way, deprecate gracefully with long notice.

What about idempotency?

Critical for safe retries. POST can be idempotent via an Idempotency-Key header. PUT and DELETE are idempotent by spec.

Which auth scheme to use?

OAuth 2.0 + JWT for user-facing APIs, mutual TLS or signed requests for service-to-service.

How do you document APIs?

OpenAPI (Swagger) is the industry standard. Many frameworks auto-generate it from code annotations.

REST API Interview Questions (2026) — 100 Q&A

REST (Representational State Transfer) is an architectural style for distributed hypermedia systems defined by Roy Fielding in his 2000 dissertation. The six constraints are: (1) **Client–Server** — concerns are separated so the UI and data storage can evolve independently; (2) **Statelessness** — every request from the client must contain all information the server needs; session state is never stored server-side; (3) **Cacheability** — responses must declare whether they can be cached to reduce load; (4) **Uniform Interface** — resources are identified in requests, manipulated through representations, messages are self-descriptive, and HATEOAS drives application state; (5) **Layered System** — clients cannot tell whether they are connected directly to the origin server or an intermediary; (6) **Code on Demand** (optional) — servers may extend client functionality by transferring executable code such as JavaScript. An API that satisfies all mandatory constraints is said to be RESTful.

An API (Application Programming Interface) is a defined contract that allows two software systems to communicate. It specifies what operations are available, what inputs they accept, and what outputs they return — without exposing internal implementation details. Web APIs typically communicate over HTTP using JSON or XML payloads. The client sends a request to a well-known URL endpoint; the server processes it and returns a structured response. APIs enable loose coupling: a mobile app, a web frontend, and a third-party partner can all consume the same backend service without knowing how it is built internally.

The five most common HTTP methods map onto CRUD operations: **GET** retrieves a resource without side effects — never use it to modify data. **POST** creates a new resource or triggers a non-idempotent action; the server determines the new resource URI. **PUT** replaces a resource entirely at a known URI; the client sends the full representation. **PATCH** applies a partial update — only the fields provided are changed. **DELETE** removes a resource.

http
GET    /orders/42          # read order 42
POST   /orders             # create a new order
PUT    /orders/42          # replace order 42 wholesale
PATCH  /orders/42          # update specific fields
DELETE /orders/42          # remove order 42

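Use GET and HEAD for safe, read-only operations. Use PUT/DELETE when the action is idempotent. Use POST for actions that are not idempotent, such as creating a resource whose URI the server assigns; if a POST must not take effect twice when retried, pair it with an idempotency key.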

An operation is idempotent if calling it once produces the same server state as calling it N times. GET, HEAD, PUT, DELETE, OPTIONS, and TRACE are all idempotent. POST is NOT idempotent — submitting the same order form twice creates two orders. PATCH is technically not guaranteed to be idempotent (a patch that appends to a list is not), though many implementations treat simple field updates as idempotent. Idempotency matters for reliability: if a network request times out and you are unsure whether the server received it, you can safely retry an idempotent method without fear of duplicate side effects.

A method is safe if it does not modify server state — it is purely a read operation. GET, HEAD, OPTIONS, and TRACE are safe. Safe methods are a subset of idempotent methods: all safe methods are also idempotent, but not all idempotent methods are safe (DELETE is idempotent but not safe). Browsers and caches rely on the safety guarantee to prefetch GET requests and allow them to be replayed without asking the user. Violating this — for example, using GET to delete a record — breaks cache correctness and can cause data loss when crawlers or prefetch mechanisms visit the URL.

**200 OK** is the generic success code for GET, PUT, PATCH, and DELETE when the response body contains content. **201 Created** signals that a POST (or PUT) created a new resource; the response should include a `Location` header pointing to the new resource URI. **204 No Content** means success but there is nothing to return — common for DELETE or PATCH operations where the caller does not need the updated resource.

http
HTTP/1.1 201 Created
Location: /users/99
Content-Type: application/json

{ "id": 99, "email": "alice@example.com" }

Using 201 vs 200 correctly helps clients know whether a new resource was created so they can cache or navigate to it.

HTTP headers are key-value metadata sent with requests and responses that convey information beyond the body — content type, caching instructions, authentication tokens, compression, and more. Common **request headers**: `Authorization` (credentials), `Content-Type` (body format), `Accept` (desired response format), `User-Agent`, `Cookie`, `If-None-Match`, `X-Request-ID`. Common **response headers**: `Content-Type`, `Content-Length`, `Cache-Control`, `ETag`, `Location` (redirect target), `Set-Cookie`, `Retry-After`, `Access-Control-Allow-Origin` (CORS).

http
GET /products/5 HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJ...
Accept: application/json

The `Accept` header is sent by the client to declare which media types it can process in the response. The server uses content negotiation to pick the best match and sets the response `Content-Type` accordingly. The value can include multiple types with quality factors (`q` values).

http
Accept: application/json, text/html;q=0.9, */*;q=0.8

If the server cannot produce any of the accepted types, it returns **406 Not Acceptable**. APIs that only speak JSON often ignore `Accept` entirely and always return `application/json`, but properly honouring it enables serving the same endpoint in multiple formats (JSON, XML, CSV).
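
The negotiation step can be sketched in a few lines of Node.js. This is a simplified illustration, not a full implementation of HTTP content negotiation: the function names (`parseAccept`, `matches`, `negotiate`) are hypothetical, media-type parameters other than `q` are ignored, and ranges are ranked by `q` only, not by specificity.

```javascript
// Parse an Accept header into media ranges sorted by descending q value.
function parseAccept(header) {
  return header
    .split(',')
    .map((part) => {
      const [type, ...params] = part.trim().split(';').map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith('q='));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { type: type.toLowerCase(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((range) => range.type !== '' && range.q > 0)
    .sort((a, b) => b.q - a.q);
}

// True when a media range such as "text/*" or "*/*" covers a concrete type.
function matches(range, type) {
  if (range === '*/*' || range === type) return true;
  const [rangeMain, rangeSub] = range.split('/');
  return rangeSub === '*' && type.startsWith(rangeMain + '/');
}

// Return the first supported type the client accepts, or null (respond 406).
function negotiate(acceptHeader, supported) {
  for (const range of parseAccept(acceptHeader || '*/*')) {
    const hit = supported.find((type) => matches(range.type, type));
    if (hit) return hit;
  }
  return null;
}
```

A handler would call `negotiate(req.headers.accept, ['application/json', 'text/csv'])` and return 406 when the result is `null`.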

The base URL (also called base URI or base endpoint) is the root address of an API against which all endpoint paths are resolved. It typically includes the scheme, host, and a path prefix:

https://api.example.com/v1

All endpoints are relative to this: `GET https://api.example.com/v1/users`. API documentation and SDKs are configured once with the base URL so consumers do not repeat it. Separate base URLs are usually provided for production and sandbox environments (e.g., `https://sandbox.api.example.com/v1`). A well-structured base URL includes the version so all endpoints under it are versioned together.

Basic Auth sends credentials as a base64-encoded `username:password` string in the `Authorization` header on every request.

http
Authorization: Basic dXNlcjpwYXNz

Base64 is encoding, not encryption — the credentials are essentially in plaintext, so Basic Auth **must** only be used over HTTPS. The server decodes the header, looks up the user, and verifies the password hash. Basic Auth is stateless (no session), simple to implement, and supported natively by browsers and HTTP clients. It is appropriate for machine-to-machine calls with low security requirements or for protecting internal tools behind a gateway. For user-facing login flows, OAuth or token-based auth is preferred.

Bearer token auth sends an opaque or structured token (often a JWT) in the `Authorization` header:

http
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...

The word "Bearer" means "whoever holds this token is granted access." The server validates the token, either by looking it up in a database (opaque tokens) or by verifying the signature and claims (JWTs). Bearer tokens are the standard mechanism for OAuth 2.0. They usually expire (unlike most API keys, which stay valid until revoked), can carry identity claims, and can be scoped to specific permissions. Because anyone who obtains the token can use it, they must be transmitted only over HTTPS and stored securely (never in localStorage for high-value tokens).

An **API** is the interface contract — the set of HTTP endpoints, methods, parameters, and data formats that a service exposes. An **SDK** (Software Development Kit) is a language-specific library that wraps the API and provides a more ergonomic experience for developers. The SDK handles low-level concerns like constructing HTTP requests, serialising/deserialising JSON, managing authentication headers, handling retries and pagination, and exposing typed objects.

js
// Raw API
fetch('/users/42', { headers: { Authorization: 'Bearer ' + token } });
// SDK equivalent
const user = await client.users.get(42);

You can always use an API without its SDK; the SDK just makes development faster and less error-prone.

The Richardson Maturity Model (RMM) describes four levels of REST maturity. **Level 0 (Swamp of POX)**: one endpoint, one HTTP method, everything is an RPC-style POST: `POST /api { "action": "getUser", "id": 42 }`. **Level 1 (Resources)**: separate URIs per resource, but still only GET/POST regardless of intent: `POST /users/42`. **Level 2 (HTTP Verbs)**: correct HTTP method semantics and proper status codes: `GET /users/42` returns 200, `DELETE /users/42` returns 204. **Level 3 (HATEOAS)**: responses include hypermedia links describing available actions.

json
{ "id": 42, "_links": { "self": "/users/42", "orders": "/users/42/orders" } }

Most production REST APIs operate at Level 2. Level 3 is rare due to implementation complexity.

HATEOAS (Hypermedia As The Engine Of Application State) is the Level 3 REST constraint: every response includes links to related resources and available actions, so clients discover the API dynamically rather than hardcoding URIs. The idea is that a client should only need to know the entry-point URL and then follow links, similar to a web browser.

json
{
  "orderId": 7,
  "_links": {
    "cancel": { "href": "/orders/7/cancel", "method": "POST" },
    "invoice": { "href": "/orders/7/invoice" }
  }
}

In practice, most teams skip HATEOAS because: client developers hardcode URLs anyway, the added payload size and server complexity are not justified, tooling support is immature, and the discoverability claim rarely holds for typed API consumers. API documentation (OpenAPI) serves the discoverability goal more practically.

PUT replaces the entire resource representation: you must send the complete object, and fields you omit are cleared. PATCH applies a partial update: only the fields in the request body are changed. Two standardised PATCH formats exist: **JSON Merge Patch (RFC 7396)** — send a plain JSON object with only the fields to change; set a field to `null` to delete it. **JSON Patch (RFC 6902)** — send an array of operation objects (`add`, `remove`, `replace`, `move`, `copy`, `test`) for fine-grained, atomic mutation.

json
// JSON Merge Patch
PATCH /users/42
{ "email": "new@example.com" }

// JSON Patch
PATCH /users/42
[{ "op": "replace", "path": "/email", "value": "new@example.com" }]

JSON Patch is more powerful (supports `test` for optimistic concurrency) but verbose. JSON Merge Patch is simpler and widely adopted.

An idempotency key is a client-generated unique token (UUID) sent with a non-idempotent POST request, allowing the server to detect and deduplicate retries. If the client times out and retries, the server recognises the same key and returns the cached response instead of executing the operation again — preventing duplicate charges, double orders, etc.

http
POST /payments
Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
Content-Type: application/json

{ "amount": 5000, "currency": "usd" }

Server-side implementation: store the idempotency key in a table with the response payload and status. On incoming request, check if the key exists — if so, return the stored response immediately without re-executing. Set a TTL (Stripe uses 24 hours). Use a unique index on the key to prevent race conditions. Scope keys per user to prevent cross-user replay attacks.
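
The check-then-store flow described above might look like the following minimal sketch. It assumes an in-memory `Map` in place of the database table, and the class name `IdempotencyStore` and its `execute` method are hypothetical. Storing the promise itself means concurrent retries share a single execution; a production version would rely on a unique index instead.

```javascript
const { randomUUID } = require('node:crypto');

// In-memory stand-in for the idempotency table; a real service would use a
// database with a unique index on (userId, key) and a TTL.
class IdempotencyStore {
  constructor(ttlMs = 24 * 60 * 60 * 1000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Run `operation` at most once per (userId, key); replay the stored result on retries.
  async execute(userId, key, operation) {
    const scopedKey = `${userId}:${key}`; // scoping per user blocks cross-user replay
    const existing = this.entries.get(scopedKey);
    if (existing && existing.expiresAt > this.now()) {
      return existing.result; // a promise, so concurrent retries share one run
    }
    const result = Promise.resolve().then(operation);
    this.entries.set(scopedKey, { result, expiresAt: this.now() + this.ttlMs });
    return result;
  }
}
```

Note that this sketch also replays failures: if `operation` rejects, retries with the same key see the same rejection until the entry expires. Whether that is desirable depends on the API.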

Conditional requests let clients avoid downloading unchanged resources and let servers prevent lost updates. The server generates an **ETag** (entity tag — typically a hash or version of the resource) and includes it in the response. On subsequent GET requests, the client sends `If-None-Match: "etag-value"` — the server returns 304 Not Modified (no body) if unchanged, saving bandwidth. For writes, the client sends `If-Match: "etag-value"` with PUT/PATCH — the server processes the request only if the ETag still matches, returning 412 Precondition Failed if it has changed since the client last fetched it (optimistic concurrency). `Last-Modified` / `If-Modified-Since` works analogously using timestamps instead of hashes, but ETags are more precise.

http
GET /articles/1 → ETag: "abc123"
GET /articles/1
If-None-Match: "abc123"
→ 304 Not Modified
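
On the server side, the `If-None-Match` check could be sketched as below. This is an illustration under assumptions: the ETag is derived from a truncated SHA-256 of the body, and `computeEtag` and `conditionalGet` are hypothetical helper names, not part of any framework.

```javascript
const { createHash } = require('node:crypto');

// Derive an ETag from the response body (quoted, as HTTP requires).
function computeEtag(body) {
  return '"' + createHash('sha256').update(body).digest('hex').slice(0, 16) + '"';
}

// Decide between 200 with a body and 304 without one.
function conditionalGet(body, ifNoneMatch) {
  const etag = computeEtag(body);
  const candidates = (ifNoneMatch || '').split(',').map((tag) => tag.trim());
  // If-None-Match uses weak comparison, so ignore a leading W/ prefix.
  const matched = candidates.some(
    (tag) => tag === '*' || tag.replace(/^W\//, '') === etag
  );
  return matched
    ? { status: 304, headers: { ETag: etag }, body: null }
    : { status: 200, headers: { ETag: etag }, body };
}
```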

A cursor encodes the position of the last returned item as an opaque, URL-safe token (base64 of JSON `{"id": 42, "created_at": "2024-01-15T10:00:00Z"}`). The client passes this in the next request; the server decodes it and uses it as a WHERE clause filter.

sql
SELECT * FROM jobs
WHERE (created_at, id) < (:cursor_ts, :cursor_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Requirements: (1) **Sort stability**: the sort key(s) must be stable and part of a unique index, otherwise rows can appear in different positions between pages. Adding the primary key as a tiebreaker ensures uniqueness. (2) The cursor must encode all sort columns. (3) Treat the cursor as opaque. Base64 only obscures its contents, so sign or encrypt the cursor if clients must not read or tamper with raw database IDs, and validate it on every request. Respond with `next_cursor: null` when on the last page.
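
Encoding and decoding such a cursor might look like this sketch. The helper names (`encodeCursor`, `decodeCursor`, `buildPage`) are hypothetical, the cursor is plain base64url JSON with no signature, and the "fetch `limit + 1` rows to detect another page" trick is one common approach, not something the query above requires.

```javascript
// Encode the sort-key values of the last returned row as an opaque token.
function encodeCursor(row) {
  const payload = JSON.stringify({ created_at: row.created_at, id: row.id });
  return Buffer.from(payload, 'utf8').toString('base64url');
}

// Decode a client-supplied cursor; return null for anything malformed.
function decodeCursor(cursor) {
  try {
    const { created_at, id } = JSON.parse(
      Buffer.from(cursor, 'base64url').toString('utf8')
    );
    if (typeof created_at !== 'string' || !Number.isInteger(id)) return null;
    return { created_at, id };
  } catch {
    return null;
  }
}

// Build the page response: the caller fetches limit + 1 rows to learn whether more exist.
function buildPage(rows, limit) {
  const items = rows.slice(0, limit);
  const hasMore = rows.length > limit;
  return {
    items,
    next_cursor: hasMore ? encodeCursor(items[items.length - 1]) : null,
  };
}
```

The decoded `created_at` and `id` would then be bound to `:cursor_ts` and `:cursor_id` in the query.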

Common conventions: `?status=active&role=admin` for filtering, `?sort=created_at&order=desc` for sorting. Never pass sort column names directly to SQL — maintain an allowlist of sortable fields and map them to column names server-side to prevent SQL injection and column enumeration.

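js
// A Map ignores inherited keys such as "constructor" that a plain object lookup would return
const SORTABLE = new Map([['created_at', 'jobs.created_at'], ['title', 'jobs.title']]);
const col = SORTABLE.get(req.query.sort) ?? 'jobs.created_at';
const dir = req.query.order === 'asc' ? 'ASC' : 'DESC'; // never interpolate the raw value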

For complex filters, some APIs accept an object notation (`?filter[status]=active`) or a simple query DSL. Validate and sanitise every parameter. Avoid filtering on non-indexed columns — add database indexes for all commonly filtered fields. Document which filters are supported to prevent clients from relying on undocumented behaviour that could change.

Field masks let clients request only a subset of fields in the response: `GET /users?fields=id,name,email`. This reduces payload size and over-fetching — useful for mobile clients on slow connections or list views that need minimal data.

json
GET /orders?fields=id,total,status
→ [{ "id": 1, "total": 99.99, "status": "shipped" }]

Implement field masks when: responses are large (many fields), the API is used heavily from mobile, or different consumers need drastically different projections. The server parses the `fields` parameter, validates each field name against an allowlist, and projects the query accordingly. Google's APIs use a `fields` query parameter; GraphQL solves this problem more elegantly. Avoid implementing field masks for small responses — the overhead of parsing and projecting outweighs the savings.
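
The parse, validate, and project steps could be sketched as follows. The allowlist contents and the function names (`parseFields`, `project`) are illustrative assumptions; a real implementation would usually push the projection down into the database query as well.

```javascript
const ALLOWED_FIELDS = new Set(['id', 'total', 'status', 'created_at']);

// Parse "?fields=id,total" into a validated list; reject unknown names.
function parseFields(param) {
  if (!param) return [...ALLOWED_FIELDS]; // no mask: return everything allowed
  const requested = param.split(',').map((f) => f.trim()).filter(Boolean);
  const unknown = requested.filter((f) => !ALLOWED_FIELDS.has(f));
  if (unknown.length > 0) {
    throw new RangeError(`Unsupported fields: ${unknown.join(', ')}`);
  }
  return requested;
}

// Copy only the requested fields from a record.
function project(record, fields) {
  return Object.fromEntries(
    fields.filter((f) => f in record).map((f) => [f, record[f]])
  );
}
```

A handler would catch the `RangeError` and translate it into a 400 response listing the unsupported field names.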

Rate limiting protects APIs from abuse and ensures fair usage. Common algorithms: **Token bucket**: a bucket holds up to `capacity` tokens; requests consume one token each; tokens refill at a fixed rate. Allows short bursts up to capacity. **Sliding window counter**: counts requests in a rolling time window (e.g., last 60 s) using a Redis sorted set with timestamps; smoother than fixed windows, which allow double the rate at window boundaries. **Fixed window counter**: simplest; reset the counter every minute. Vulnerable to boundary bursts. Return rate limit headers so clients can back off correctly. The `X-RateLimit-*` names below are a widely used convention rather than a formal standard, while `Retry-After` is defined by HTTP itself:

http
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1716912000
Retry-After: 58

Return 429 Too Many Requests when the limit is exceeded.
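
A token bucket is small enough to sketch in full. The version below is a single-process illustration with an injectable clock for testing; the class name `TokenBucket` and the shape of the returned object are assumptions, and a multi-instance deployment would need shared state.

```javascript
class TokenBucket {
  constructor(capacity, refillPerSecond, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = now();
  }

  // Add the tokens earned since the last call, capped at capacity.
  refill() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
  }

  // Try to spend one token; report what the rate limit headers need.
  tryConsume() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return { allowed: true, remaining: Math.floor(this.tokens), retryAfterSeconds: 0 };
    }
    const retryAfterSeconds = Math.ceil((1 - this.tokens) / this.refillPerSecond);
    return { allowed: false, remaining: 0, retryAfterSeconds };
  }
}
```

When `allowed` is false, the handler would respond with 429 and set `Retry-After` to `retryAfterSeconds`.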

A JWT (JSON Web Token) is three base64url-encoded segments separated by dots: `header.payload.signature`. The **header** contains the token type and signing algorithm (`{ "alg": "RS256", "typ": "JWT" }`). The **payload** contains claims, which are statements about the subject: registered claims (`iss` issuer, `sub` subject/user ID, `aud` audience, `exp` expiry timestamp, `iat` issued-at, `jti` JWT ID for revocation), and custom claims (`role`, `email`). The **signature** is created by signing `base64url(header) + "." + base64url(payload)` with a secret (HMAC) or private key (RSA/EC).

eyJhbGciOiJSUzI1NiJ9
.eyJzdWIiOiJ1c2VyXzQyIiwiZXhwIjoxNzE2OTEyMDAwfQ
.SflKxwRJSMeKKF2QT4fw...

The payload is not encrypted — only signed. Never store secrets in JWT claims.
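
To make the three-segment structure concrete, here is a minimal sketch of signing and verifying an HS256 token with Node's built-in `crypto` module. It is for illustration only: `signHs256` and `verifyHs256` are hypothetical names, only `exp` is validated, and production code should use a maintained JWT library instead.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

const b64url = (input) => Buffer.from(input).toString('base64url');

// Sign "header.payload" with HMAC-SHA256 and return the compact JWT.
function signHs256(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const signingInput = `${header}.${b64url(JSON.stringify(payload))}`;
  const signature = createHmac('sha256', secret).update(signingInput).digest();
  return `${signingInput}.${signature.toString('base64url')}`;
}

// Verify signature, algorithm, and expiry; return the claims or null.
function verifyHs256(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = createHmac('sha256', secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  try {
    const { alg } = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
    const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
    if (alg !== 'HS256') return null; // never trust a different algorithm
    if (typeof claims.exp === 'number' && claims.exp <= nowSeconds) return null;
    return claims;
  } catch {
    return null;
  }
}
```

Because the payload is only encoded, anyone can read the claims with a base64url decode; verification proves integrity, not confidentiality.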

A CORS preflight is an automatic `OPTIONS` request the browser sends before a cross-origin request that uses non-simple methods (PUT, PATCH, DELETE), custom headers, or a non-simple `Content-Type` such as `application/json`. The browser asks the server "is this request allowed?"

http
OPTIONS /api/users HTTP/1.1
Origin: https://app.example.com
Access-Control-Request-Method: DELETE
Access-Control-Request-Headers: Authorization

The server must respond with the appropriate CORS headers:

http
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST, DELETE
Access-Control-Allow-Headers: Authorization
Access-Control-Max-Age: 86400

`Access-Control-Max-Age` tells the browser how long to cache the preflight result, avoiding repeated OPTIONS requests. Browsers cap this value (Chromium at 2 hours, Firefox at 24 hours), so a larger number is silently reduced. If the server does not respond correctly, the browser blocks the actual request. A wildcard `*` for `Allow-Origin` cannot be used with `credentials: 'include'`.

SSE uses a plain HTTP response with `Content-Type: text/event-stream`. The server keeps the connection open and writes events in a specific text format:

id: 42
event: price-update
data: {"symbol":"AAPL","price":182.50}

Each event block is separated by a blank line. The `id` field sets the last event ID. If the connection drops, the browser automatically reconnects (after the delay set by `retry:` in milliseconds, or a browser default of a few seconds) and sends `Last-Event-ID` in the request header, allowing the server to replay missed events. SSE is limited to UTF-8 text. For binary data, base64-encode it. In HTTP/1.1, browsers allow only 6 connections per origin and SSE occupies one permanently; HTTP/2 multiplexing largely removes this limit. Node.js implementation: keep the response writable, write events, and handle the `close` event to clean up.
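
The wire format is simple enough to serialise by hand. The helper below is a sketch with a hypothetical name (`formatSseEvent`); it assumes `data` is already a string (for JSON, pass `JSON.stringify(...)`) and that `id` and `event` contain no newlines.

```javascript
// Serialise one event in the text/event-stream wire format.
function formatSseEvent({ id, event, data, retry }) {
  const lines = [];
  if (retry !== undefined) lines.push(`retry: ${retry}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  // A multi-line payload needs one "data:" line per line of text.
  for (const line of String(data).split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n'; // the blank line terminates the event
}
```

In a Node.js handler, after sending `Content-Type: text/event-stream`, each update becomes `res.write(formatSseEvent({ id, event, data }))`.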

WebSocket upgrades an HTTP/1.1 connection to a persistent, bidirectional TCP socket. The client initiates with a special HTTP Upgrade request:

http
GET /ws HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

The `Sec-WebSocket-Key` is a random base64 nonce. The server concatenates it with the magic GUID `258EAFA5-E914-47DA-95CA-C5AB0DC85B11`, SHA-1 hashes it, base64-encodes it, and returns it in `Sec-WebSocket-Accept`. The client verifies the accept value:

http
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

After the 101 response, the connection is no longer HTTP — both sides communicate using WebSocket frames. The handshake prevents non-WebSocket servers from accidentally accepting connections.
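
The accept-key derivation can be reproduced with Node's built-in `crypto` module. The function name `computeAcceptKey` is hypothetical; the GUID and the sample key/accept pair are the ones shown in the handshake above.

```javascript
const { createHash } = require('node:crypto');

const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Concatenate the client's key with the GUID, SHA-1 hash it, base64-encode the digest.
function computeAcceptKey(secWebSocketKey) {
  return createHash('sha1').update(secWebSocketKey + WEBSOCKET_GUID).digest('base64');
}

// The sample nonce from the request above yields the accept value in the response.
const accept = computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ==');
```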

RFC 7807 (since obsoleted by RFC 9457, which keeps the same core fields) defines a standard JSON format for HTTP API error responses, avoiding inconsistent ad-hoc error schemas across APIs. Fields: `type` (URI identifying the error type, links to docs), `title` (human-readable summary, stable), `status` (HTTP status code), `detail` (human-readable, request-specific explanation), `instance` (URI of the specific occurrence, useful for support).

json
{
  "type": "https://api.example.com/errors/validation",
  "title": "Validation Failed",
  "status": 422,
  "detail": "The email field must be a valid email address.",
  "instance": "/requests/abc123",
  "errors": [{ "field": "email", "message": "Invalid format" }]
}

Serve the response with `Content-Type: application/problem+json`. You can extend the schema with custom fields, as the `errors` array above does. This standard makes errors machine-readable and consistent across your entire API surface.

Batch requests allow clients to send multiple operations in a single HTTP request, reducing round trips. Two patterns: (1) **Array body** — the endpoint accepts an array and processes items together:

http
POST /users/batch-delete
[{ "id": 1 }, { "id": 2 }, { "id": 3 }]

(2) **Batch endpoint** — a single `POST /batch` accepts an array of sub-requests, each specifying their own method, path, and body (used by Facebook Graph API, Microsoft 365). The response is an array of individual results with their own status codes. Trade-offs: batching improves throughput but makes error handling complex (partial success — some items may fail). Consider returning a 207 Multi-Status response for partial success. Limit batch size (e.g., max 100 items) to prevent abuse. Batch operations that are logically atomic should run in a database transaction.

Webhooks are server-initiated HTTP POST requests to a URL the client registers, triggered by events. Delivery guarantees: use **at-least-once delivery** — persist the event to a queue before attempting delivery, retry with exponential backoff on non-2xx responses or timeouts (e.g., 1 min, 5 min, 30 min, 2 hr, 24 hr). After N failures, mark as failed and alert. Consumers must be **idempotent** — the same event may arrive more than once. Security via HMAC signature verification:

http
X-Webhook-Signature: sha256=a8b9c...

The sender computes `HMAC-SHA256(secret, raw_request_body)` and includes it in the signature header. The receiver recomputes the HMAC and compares using a constant-time comparison to prevent timing attacks. Include a timestamp in the signed payload and reject events older than 5 minutes to prevent replay attacks. Never compute the HMAC on the parsed JSON; always use the raw bytes, because re-serialising can change whitespace or key order.
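
Both sides of the signature check can be sketched with Node's `crypto` module. The function names (`signWebhook`, `verifyWebhook`, `isFresh`) are hypothetical, and the `sha256=<hex>` header format follows the example above; real providers differ in the exact header layout.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Sender side: sign the exact bytes that will be sent.
function signWebhook(secret, rawBody) {
  return 'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Receiver side: recompute over the raw body and compare in constant time.
function verifyWebhook(secret, rawBody, signatureHeader) {
  const expected = Buffer.from(signWebhook(secret, rawBody), 'utf8');
  const received = Buffer.from(String(signatureHeader || ''), 'utf8');
  // timingSafeEqual throws on unequal lengths, so check length first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Reject events whose timestamp is outside the tolerance window.
function isFresh(eventTimestampMs, nowMs = Date.now(), toleranceMs = 5 * 60 * 1000) {
  return Math.abs(nowMs - eventTimestampMs) <= toleranceMs;
}
```

The receiver must capture the body as a `Buffer` before any JSON middleware parses it, then call `verifyWebhook` followed by `isFresh` on the timestamp inside the verified payload.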

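A clear deprecation strategy prevents breaking clients without warning. Steps: (1) Announce deprecation in docs, changelog, and developer blog with a concrete sunset date (minimum 6–12 months for public APIs). (2) Add a `Deprecation` header to all responses from deprecated endpoints, plus a `Sunset` header carrying the removal date. RFC 9745 defines `Deprecation` as a structured-field date (`@` followed by Unix seconds, here 1 Jan 2025); `Sunset` (RFC 8594) uses an HTTP-date and must not be earlier than the deprecation date:

http
Deprecation: @1735689600
Sunset: Wed, 31 Dec 2025 23:59:59 GMT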
Link: <https://api.example.com/v2/users>; rel="successor-version"

(3) Log usage of deprecated endpoints so you know which clients are still using them — contact those teams directly. (4) Keep the old version running until usage drops to zero or the sunset date passes. (5) Return 410 Gone (not 404) after sunset to differentiate "never existed" from "intentionally removed". (6) Provide a migration guide with example before/after code.

The BFF pattern creates a separate API layer tailored to each client type (web, iOS, Android) rather than exposing a single general-purpose API. Each BFF aggregates calls to multiple downstream microservices, shapes the response to exactly what its client needs, and handles client-specific concerns (auth flows, response formats, caching strategies).

Web App → /bff/web → [UserService, OrderService, RecommendationService]
iOS App → /bff/ios → [UserService, OrderService] (leaner payload)

Benefits: eliminates over-fetching/under-fetching for each client; teams can optimise independently; adds an isolation layer so internal service changes do not break client contracts. Drawbacks: code duplication across BFFs (mitigated by shared libraries), additional deployment unit to maintain. Popular at organisations with multiple client types with very different data needs (Netflix, SoundCloud).

Idempotent writes prevent duplicate data when retries occur. Three patterns: (1) **Upsert (INSERT ON CONFLICT)** — insert the record; if a unique constraint is violated, update instead. Safe to retry because the end state is the same.

sql
INSERT INTO users (email, name)
VALUES ($1, $2)
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;

(2) **Conditional write** — only update if the current version matches: `UPDATE orders SET status=$2 WHERE id=$1 AND version=$3`. Returns 0 rows affected if already updated; the API can return the current state. (3) **Idempotency key table** — store the operation key with its result; check before executing. Use database transactions to ensure the check and the insert are atomic. PostgreSQL advisory locks or `SELECT ... FOR UPDATE` prevent race conditions between concurrent retries hitting the same idempotency key.

Multipart uploads send a body with multiple parts separated by a boundary string, allowing files and JSON fields in the same request.

http
POST /uploads HTTP/1.1
Content-Type: multipart/form-data; boundary=----boundary

------boundary
Content-Disposition: form-data; name="file"; filename="photo.jpg"
Content-Type: image/jpeg

<binary data>
------boundary--

For large files, avoid buffering the entire file in memory and use streaming instead. Node.js: pipe `req` through a multipart parser (busboy, formidable) directly to object storage, or bypass the API server entirely by issuing an S3 presigned URL that the client uploads to. For very large files (>100 MB), use **chunked/resumable upload**: split the file into chunks client-side; the client uploads each chunk to `/uploads/{uploadId}/parts/{partNumber}`; the server reassembles them or uses the S3 Multipart Upload API. AWS S3 multipart upload requires a minimum of 5 MB per part (except the last). Return a resumable upload URL so clients can recover from interrupted uploads.

These are standardised JSON formats that implement HATEOAS-style hypermedia. **HAL (Hypertext Application Language)** — `application/hal+json`: resources have `_links` (relationships and URIs) and `_embedded` (included sub-resources).

json
{ "id": 1, "_links": { "self": { "href": "/orders/1" } },
  "_embedded": { "items": [{ "sku": "ABC" }] } }

**JSON:API** — `application/vnd.api+json`: more opinionated; resources have a `type` and `id`, `attributes` for data, `relationships` for associations, and `included` for sideloaded related resources. It specifies compound document format, sparse fieldsets, filtering, sorting, and pagination conventions. JSON:API eliminates the need to design these conventions yourself. Neither format is dominant in most new APIs — most teams use plain JSON with OpenAPI for documentation. They are most valuable for libraries and frameworks that auto-generate client code from the hypermedia.

Every API request should emit a structured log entry (JSON) and optionally a distributed trace span. Minimum fields per request:

json
{
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span_id": "00f067aa0ba902b7",
  "method": "POST",
  "path": "/orders",
  "status": 201,
  "duration_ms": 43,
  "user_id": "user_42",
  "ip": "203.0.113.5",
  "user_agent": "MyApp/2.0",
  "error": null
}

Emit metrics derived from logs: request rate, error rate (4xx/5xx), p50/p95/p99 latency, per-endpoint. Set up dashboards for error rate spike alerts and latency degradation. Use distributed tracing (OpenTelemetry) to propagate `traceparent` across service boundaries so you can reconstruct the full call graph. Never log request bodies containing PII or credentials — log only non-sensitive identifiers.

Optimistic concurrency control prevents the "lost update" problem where two clients read the same resource, both modify it, and the second write silently overwrites the first. Full flow: (1) `GET /articles/42` → server returns the article with `ETag: "v5"`. (2) Client A and Client B both read this response. (3) Client A sends `PATCH /articles/42` with `If-Match: "v5"` and a new body. Server checks — ETag is still "v5", update proceeds, new ETag is `"v6"`, response includes `ETag: "v6"`. (4) Client B sends `PATCH /articles/42` with `If-Match: "v5"`. Server checks — current ETag is "v6", mismatch → returns **412 Precondition Failed**. Client B must re-fetch, merge changes, and retry.

http
PATCH /articles/42 HTTP/1.1
If-Match: "v5"
Content-Type: application/json

{ "title": "Updated Title" }

The ETag must change on every write — use a version counter (`version INT`) or a hash of the serialised resource. Store the version in the same row as the data and update atomically: `UPDATE articles SET title=$1, version=version+1 WHERE id=$2 AND version=$3`.
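
The `If-Match` check in steps (3) and (4) can be sketched with an in-memory store standing in for the `articles` table. The function name `patchArticle` is hypothetical, and returning 428 Precondition Required when `If-Match` is missing is an added assumption, not part of the flow above.

```javascript
// In-memory stand-in for the articles table, keyed by id.
const articles = new Map([[42, { title: 'Original Title', version: 5 }]]);

const etagFor = (article) => `"v${article.version}"`;

// Apply a PATCH only if the caller's If-Match equals the current ETag.
function patchArticle(id, ifMatch, changes) {
  const current = articles.get(id);
  if (!current) return { status: 404 };
  if (!ifMatch) return { status: 428 }; // assumption: require a precondition
  if (ifMatch !== etagFor(current)) {
    return { status: 412, headers: { ETag: etagFor(current) } };
  }
  // The version is set last so request fields can never overwrite it.
  const updated = { ...current, ...changes, version: current.version + 1 };
  articles.set(id, updated);
  return { status: 200, headers: { ETag: etagFor(updated) }, body: updated };
}
```

With a real database, the compare and the write must happen in one statement, as in the `UPDATE ... WHERE version=$3` query, because a separate read followed by a write can race.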

In a GraphQL resolver, fetching `posts` and then their `author` for each post naively executes one SQL query for posts and then N queries for authors — the N+1 problem. DataLoader solves this with **batching** and **caching** within a single request's lifecycle. When a resolver calls `userLoader.load(userId)`, DataLoader defers the actual fetch. At the end of the current event loop tick, it collects all pending keys and calls the batch function once with all of them.

js
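const DataLoader = require('dataloader');

const userLoader = new DataLoader(async (userIds) => {
  // `db.query` is assumed to resolve to an array of row objects
  const users = await db.query(
    `SELECT * FROM users WHERE id = ANY($1)`, [userIds]
  );
  // Index rows by id so results come back in the same order as the keys
  const byId = new Map(users.map(u => [u.id, u]));
  return userIds.map(id => byId.get(id) ?? new Error(`User ${id} not found`));
});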

This transforms N+1 queries into 2 queries (1 for posts + 1 for all unique authors). The in-request cache means the same userId is only fetched once per request even if referenced multiple times. For deeply nested fields, each level gets batched separately, resulting in O(depth) queries instead of O(N^depth). DataLoader cannot replace JOIN-based fetching for very hot paths — for truly performance-critical list views, custom resolvers with explicit JOINs are more efficient.

Mass assignment occurs when an API binds all incoming request body properties directly to a data model without filtering, allowing an attacker to set fields they should not control.

js
// VULNERABLE: directly merges request body into update query
await db.query("UPDATE users SET $1 WHERE id=$2", [req.body, userId]);
// Attacker sends: { "name": "Alice", "role": "admin", "balance": 999999 }

Prevention: use explicit **allow-lists** — define exactly which fields the caller is permitted to set for each endpoint, and only include those in the database update.

js
const ALLOWED = ['name', 'email', 'bio'];
const safe = pick(req.body, ALLOWED); // lodash pick or manual
await db.query("UPDATE users SET ...", [safe]);

Additionally: validate the schema with a library (Zod, Joi, JSON Schema) that strips unknown fields (`stripUnknown: true`). Never use ORM "update all properties" methods with raw request input. For admin-only fields, require explicit privilege check before accepting the field, not just at the route level.

Application-level rate limiting fails in horizontally scaled deployments because each instance has its own in-memory counter. The solution: centralise the counter in Redis and use a **Lua script** to make the check-and-increment atomic (no race condition between two Redis operations).

lua
local key = KEYS[1]  -- e.g. "ratelimit:user_42:minute:1716912060"
local limit = tonumber(ARGV[1])  -- e.g. 100
local window = tonumber(ARGV[2])  -- 60 seconds
local current = redis.call("INCR", key)
if current == 1 then redis.call("EXPIRE", key, window) end
if current > limit then return 0 else return 1 end

This runs atomically in Redis, so no two instances can race. The key encodes the user ID and the current time window (floor to minute). For sliding window rate limiting, use a Redis sorted set: add an entry with the current timestamp as the score and a unique request ID as the member (identical members would overwrite each other when two requests share a timestamp), trim entries older than the window, and count the set size. Lua keeps the trim and count atomic. Populate the rate limit headers from the values the script returns. Rate limit by: user ID (authenticated), IP (unauthenticated), API key, endpoint path, or combinations. At extreme scale (millions of keys), use Redis Cluster, which shards keys across nodes by hash slot.

A reverse proxy cache (nginx proxy_cache, Varnish) stores upstream responses and serves them directly for subsequent matching requests. The cache key is typically `method + URL`. The **`Vary` response header** tells the cache which request headers affect the response, so it maintains separate cache entries per header value:

http
Vary: Accept-Encoding, Accept-Language

This means `gzip` and `br` compressed versions are stored separately; French and English responses are stored separately. Overusing `Vary: *` (varies on everything) effectively disables caching. Cache **invalidation**: (1) TTL expiry via `Cache-Control: max-age`. (2) `PURGE` method (Varnish / nginx `cache_purge` module) — the application sends `PURGE /articles/42` to the cache after an update. (3) Surrogate keys (cache tags) — tag cache entries with logical keys (`article:42`, `author:5`); on update, purge by tag (Fastly/Varnish). Nginx does not support tag-based invalidation natively — use a Lua module or external Redis mapping. Cache invalidation at the proxy level dramatically reduces origin load for read-heavy content but requires careful design to avoid serving stale data after writes.
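
How `Vary` extends the cache key can be sketched as a small function. This is an illustration only: `cacheKey` is a hypothetical helper, and real caches such as Varnish or nginx normalise header values in their own ways.

```javascript
// Build a cache key from method + URL plus the request headers named in Vary.
// Header names are case-insensitive, so both sides are lower-cased.
function cacheKey(method, url, requestHeaders, varyHeader) {
  const lower = {};
  for (const [name, value] of Object.entries(requestHeaders)) {
    lower[name.toLowerCase()] = value;
  }
  const varyNames = (varyHeader || "")
    .split(",")
    .map((n) => n.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  if (varyNames.includes("*")) return null; // Vary: * means the entry cannot be reused
  const varyPart = varyNames.map((n) => `${n}=${lower[n] || ""}`).join("&");
  return `${method} ${url}|${varyPart}`;
}

const vary = "Accept-Encoding, Accept-Language";
console.log(
  cacheKey("GET", "/articles/42", { "Accept-Encoding": "gzip", "Accept-Language": "fr" }, vary)
);
// GET /articles/42|accept-encoding=gzip&accept-language=fr
```

Two requests that differ only in `Accept-Encoding` produce different keys and therefore separate cache entries, which is exactly the behaviour described above.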

For datasets too large to buffer in memory (millions of rows, large files), there are three main techniques. **Chunked transfer encoding**: the server sets `Transfer-Encoding: chunked` and streams response body chunks as they are generated, so `Content-Length` is not known up front. In Node.js, pipe a readable stream into `res`. Neither side has to hold the full payload in memory. **NDJSON (Newline-Delimited JSON)**: each JSON object is on its own line, separated by `\n`, enabling the client to parse the stream incrementally without waiting for the full response:

{"id":1,"name":"Alice"}
{"id":2,"name":"Bob"}

`Content-Type: application/x-ndjson`. Clients parse line by line. **Range requests** for file downloads: the server advertises `Accept-Ranges: bytes`. The client can request a specific byte range:

http
GET /exports/large-file.csv
Range: bytes=0-1048575
→ 206 Partial Content
Content-Range: bytes 0-1048575/52428800

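This enables resumable downloads: if the transfer is interrupted, the client requests the remaining range. Pair range requests with ETag validation (send the stored ETag in an `If-Range` header) so that, if the file changed mid-download, the server returns the full current file instead of a fragment that no longer matches.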
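
A client-side sketch of the resume step follows. The helper names `parseContentRange` and `resumeHeaders`, the 1 MiB chunk size, and the ETag value `"v1"` are assumptions for illustration; `If-Range` is the standard header for making a range request conditional on an ETag.

```javascript
// Parse a Content-Range value such as "bytes 0-1048575/52428800".
function parseContentRange(value) {
  const m = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(value);
  if (!m) return null;
  return { start: Number(m[1]), end: Number(m[2]), total: Number(m[3]) };
}

// Headers for the next request after `received` bytes have been stored.
// If-Range asks the server for the whole file (200) if the ETag no longer matches.
function resumeHeaders(received, total, etag, chunkSize) {
  if (received >= total) return null; // download complete
  const end = Math.min(received + chunkSize, total) - 1;
  return { Range: `bytes=${received}-${end}`, "If-Range": etag };
}

const first = parseContentRange("bytes 0-1048575/52428800");
console.log(resumeHeaders(first.end + 1, first.total, '"v1"', 1048576));
// { Range: 'bytes=1048576-2097151', 'If-Range': '"v1"' }
```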

CloudEvents is a CNCF specification for describing event data in a common, interoperable format, enabling routing, filtering, and tracing across different event brokers (Kafka, HTTP, AMQP). Every CloudEvent has required attributes: `id`, `source`, `specversion` ("1.0"), `type`, and optional attributes: `datacontenttype`, `subject`, `time`, `dataschema`. Two HTTP binding content modes: **Structured mode** — all attributes and data are encoded in a single JSON body with `Content-Type: application/cloudevents+json`:

json
{ "specversion": "1.0", "type": "com.example.order.created",
  "source": "/orders", "id": "abc123", "time": "2024-01-15T10:00:00Z",
  "data": { "orderId": 42, "total": 99.99 } }

**Binary mode** — event attributes are HTTP headers (`ce-id`, `ce-type`, `ce-source`), and the raw event data is the body with its native `Content-Type`. Binary mode has lower overhead (no JSON wrapper) and works better with binary payloads. Use CloudEvents when building event-driven APIs that need to integrate with multiple consumers across different platforms.
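
The relationship between the two modes can be sketched as a conversion. `toBinaryMode` is a hypothetical helper that assumes JSON data and plain ASCII attribute values; the specification's header value encoding rules are left out, and defaulting the content type to `application/json` is an assumption of this sketch.

```javascript
// Convert a structured-mode CloudEvent (plain object) into binary-mode HTTP parts.
function toBinaryMode(event) {
  const { data, datacontenttype, ...attributes } = event;
  // datacontenttype maps to the Content-Type header; the default is an assumption.
  const headers = { "content-type": datacontenttype || "application/json" };
  for (const [name, value] of Object.entries(attributes)) {
    headers[`ce-${name}`] = String(value); // every other attribute becomes a ce-* header
  }
  return { headers, body: JSON.stringify(data) };
}

const event = {
  specversion: "1.0",
  type: "com.example.order.created",
  source: "/orders",
  id: "abc123",
  time: "2024-01-15T10:00:00Z",
  data: { orderId: 42, total: 99.99 },
};
const { headers, body } = toBinaryMode(event);
console.log(headers["ce-type"]); // com.example.order.created
console.log(body);               // {"orderId":42,"total":99.99}
```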

Pact contract testing has two sides: consumer and provider, run independently in CI. **Consumer side**: write a test that describes what the consumer expects from the provider. Pact records this as an "interaction" in a pact JSON file.

js
await provider.addInteraction({
  state: 'user 42 exists',
  uponReceiving: 'a request for user 42',
  withRequest: { method: 'GET', path: '/users/42' },
  willRespondWith: { status: 200, body: like({ id: 42, name: string() }) }
});

**Provider states** (`state: 'user 42 exists'`) are setup hooks that the provider test runner calls before each interaction to put the provider into the required state (insert fixtures, mock dependencies). **Provider side CI**: pull pact files from the Pact Broker and run provider verification (for example a `pact:verify` task), which replays each recorded interaction against the real provider and compares the actual responses with the expected ones. The verification results are published back to the broker. **can-i-deploy** is a Pact Broker CLI command that queries those results to check whether the version you are about to deploy has been verified against the versions of its counterparts in the target environment; deploy only if it reports success. This creates a safety net: any provider change that breaks a consumer pact fails CI before deployment, enabling truly independent service deployment.

A service mesh (Istio, Linkerd) manages service-to-service traffic at the infrastructure level, independent of application code. For canary API releases: (1) Deploy the new API version (`v2`) as a separate deployment with a distinct label. (2) Create an Istio **VirtualService** that routes a small percentage of traffic to v2:

yaml
http:
- route:
  - destination: { host: api, subset: v1 }
    weight: 90
  - destination: { host: api, subset: v2 }
    weight: 10

(3) Define **DestinationRules** with subsets matching pod labels. (4) Gradually increase the v2 weight based on error rate and latency metrics from the mesh's telemetry (Prometheus + Grafana). (5) Use header-based routing for internal testing: route requests with `X-Canary: true` to v2 without percentage rollout. (6) If v2 error rate exceeds the SLO, immediately shift weight back to 0 (automated via Flagger or Argo Rollouts). The API gateway layer can add additional controls: A/B testing by user segment (user ID hash), geographic rollout, or beta-user whitelisting by JWT claim.

Zero-downtime migration of a live API to a new version or data store requires avoiding a hard cutover. Key patterns: **Shadow traffic (dark launch)**: route a copy of live production traffic to the new implementation in parallel, without serving the shadow's response to the client. Compare outputs to detect divergence before committing.

Request → Old API → Response to client
          ↓ async copy
         New API → log divergence

**Dual-write**: during the migration window, write mutations to both old and new stores simultaneously. Reads can start being shifted to the new store once the backfill is complete and consistency is verified. **Traffic shifting strategy**: start at 0% to new, run shadow comparison, verify correctness, then shift 1% → 5% → 25% → 50% → 100% with automated rollback triggers on error rate increase. **Read-your-writes consistency**: if a user writes to the new store but another read is served from the old store before replication, they see stale data — use sticky sessions or a migration flag per user. **Cutover**: once 100% of traffic is on the new version, keep the old version running for 24–48 hours as fallback, then decommission. Monitor database connections and cache warming during ramp-up.
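
The traffic-shifting loop with an automated rollback trigger can be sketched as a pure function. The ramp steps come from the text; `nextWeight` and the 1% error threshold used in the example are hypothetical.

```javascript
// Ramp steps from the text, as percentages of traffic sent to the new version.
const RAMP_STEPS = [0, 1, 5, 25, 50, 100];

// Decide the next percentage of traffic for the new version.
function nextWeight(currentPercent, errorRate, maxErrorRate) {
  if (errorRate > maxErrorRate) return 0; // automated rollback
  const i = RAMP_STEPS.indexOf(currentPercent);
  if (i === -1) throw new Error(`unknown ramp step: ${currentPercent}`);
  return RAMP_STEPS[Math.min(i + 1, RAMP_STEPS.length - 1)];
}

console.log(nextWeight(5, 0.002, 0.01));   // 25
console.log(nextWeight(25, 0.05, 0.01));   // 0
console.log(nextWeight(100, 0.001, 0.01)); // 100
```

In practice the error rate would come from the monitoring system over a fixed observation period at each step, and the decision would be applied by the router or gateway configuration.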

Some API operations take seconds to minutes (video transcoding, report generation, ML inference). Three patterns: **(1) 202 Accepted + polling**: the server immediately returns 202 with a `Location: /operations/{opId}` header. The client polls the operation URL until it transitions from `pending` to `complete` or `failed`. Simple for clients but wastes bandwidth on polling and adds latency.

http
POST /reports → 202 Accepted
Location: /operations/op_abc123

GET /operations/op_abc123 → { "status": "pending", "progress": 45 }
GET /operations/op_abc123 → { "status": "complete", "result_url": "..." }

**(2) Webhooks**: on completion, the server calls a client-registered callback URL. The client learns the result as soon as it is ready, with no polling traffic. But the client must expose a public HTTPS endpoint, handle retries from the server, and verify HMAC signatures, which makes webhooks impractical for browser clients. **(3) SSE**: the client opens an event stream; the server pushes progress updates and the final result. No polling, no webhook infrastructure needed. Works well in browsers. The stream is dropped if the connection stays silent for longer than a proxy's idle timeout (for example nginx's `proxy_read_timeout`), so send periodic progress or heartbeat events. Best choice by scenario: SSE for browser-initiated operations; webhooks for server-to-server or mobile; polling for the simplest cross-platform SDK compatibility.
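
For the polling pattern, a client loop with capped exponential backoff reduces the wasted requests mentioned above. This is a sketch: `pollDelayMs`, `waitForOperation`, and the injected `getStatus` and `sleep` functions are hypothetical names, and the delay values are arbitrary.

```javascript
// Delay before the next poll for a 0-based attempt number: exponential, capped.
function pollDelayMs(attempt, baseMs = 500, maxMs = 10000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Poll until the operation leaves "pending". `getStatus` and `sleep` are
// injected so the loop works with any HTTP client and is easy to test.
async function waitForOperation(getStatus, sleep, maxAttempts = 20) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const op = await getStatus();
    if (op.status !== "pending") return op; // "complete" or "failed"
    await sleep(pollDelayMs(attempt));
  }
  throw new Error("operation did not finish in time");
}

console.log([0, 1, 2, 3, 10].map((n) => pollDelayMs(n)));
// [ 500, 1000, 2000, 4000, 10000 ]
```

With `fetch`, `getStatus` could be something like `() => fetch(operationUrl).then((r) => r.json())`, where `operationUrl` is the value of the `Location` header from the 202 response.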

A well-designed search API separates filtering (exact match, reduces the candidate set) from relevance scoring (how well documents match the query, reorders the candidate set). Conflating them leads to poor results: applying a relevance boost inside a filter clause returns wrong documents. Query DSL design:

json
POST /search
{
  "query": "software engineer",
  "filters": { "location": "remote", "salary_min": 100000 },
  "sort": "relevance",
  "cursor": "eyJzY29yZSI6MC45Miwi...",
  "limit": 20
}

Use `query` for full-text search (Elasticsearch `multi_match`, PostgreSQL `tsvector`/`tsquery`). Use `filters` for exact-match criteria applied as pre-filters before scoring. **Relevance sorting and cursor pagination**: relevance scores are floating-point values that can tie across pages. Without a tiebreaker, pages are unstable: the same document can appear on multiple pages or be skipped. Always add a deterministic secondary sort key (document ID) as a tiebreaker. The cursor encodes `(score, id)` from the last document. Elasticsearch's `search_after` parameter implements this exactly. For keyword-sorted results (alphabetical, date), sort stability is easier: add the primary key as the last column of the ORDER BY and of the cursor comparison in the WHERE clause.
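
The `(score, id)` cursor can be sketched as a small codec plus a comparator. The function names are hypothetical, and base64-encoded JSON is one common encoding choice; the truncated cursor in the request example above is consistent with this shape, since it begins with the base64 of `{"score":0.92,"`.

```javascript
// Encode the sort position of the last document on a page as an opaque token.
function encodeCursor(score, id) {
  return Buffer.from(JSON.stringify({ score, id }), "utf8").toString("base64");
}

// Decode and validate a cursor received from the client.
function decodeCursor(token) {
  const parsed = JSON.parse(Buffer.from(token, "base64").toString("utf8"));
  if (typeof parsed.score !== "number" || parsed.id === undefined) {
    throw new Error("invalid cursor");
  }
  return parsed;
}

// Order by score descending, then id ascending as the deterministic tiebreaker.
function compareHits(a, b) {
  return b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0);
}

const hits = [{ id: 9, score: 0.92 }, { id: 3, score: 0.92 }, { id: 5, score: 0.97 }];
console.log(hits.sort(compareHits).map((h) => h.id)); // [ 5, 3, 9 ]
console.log(decodeCursor(encodeCursor(0.92, 17)));    // { score: 0.92, id: 17 }
```

Since the token is only encoded, not signed, a server should treat decoded values as untrusted input and validate them as shown.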

A well-designed SDK dramatically reduces integration time and support burden. Key elements: **(1) Error hierarchy**: base `ApiError` with sub-classes for each error family. Never expose raw HTTP errors; map them to typed exceptions.

ts
class ApiError extends Error {
  constructor(message: string, public status: number, public requestId: string) { super(message); }
}
class AuthenticationError extends ApiError {}
class RateLimitError extends ApiError { retryAfter = 0; } // seconds, taken from Retry-After
class NotFoundError extends ApiError {}

**(2) Retry logic**: automatic retry for idempotent methods (GET) on 5xx and network errors; retry for 429 with exponential backoff honouring `Retry-After`. Never retry POST automatically (not idempotent); allow opt-in with idempotency key. **(3) Pagination helpers**: return async iterators or generator functions that transparently fetch the next cursor.

ts
for await (const job of client.jobs.list({ status: "active" })) {
  console.log(job.title); // SDK handles pagination automatically
}

**(4) Versioning alignment**: pin the SDK version to a specific API version (e.g., SDK v2.x always sends `API-Version: 2024-06-01`). Provide a way to override. Publish a changelog mapping SDK versions to API version changes. (5) Provide configurable base URL for sandbox vs production. (6) Include `X-SDK-Version` request header for observability.
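
The retry rules in (2) can be sketched as two pure functions. The names `shouldRetry` and `retryDelayMs`, the use of status `0` for a network error, and the 30-second cap are assumptions of this sketch; a production SDK would usually add random jitter to the delay as well.

```javascript
// Methods that are idempotent by HTTP semantics.
const IDEMPOTENT = new Set(["GET", "HEAD", "OPTIONS", "PUT", "DELETE"]);

// Decide whether a failed request may be retried automatically.
// Sketch convention: status 0 means a network error with no HTTP response.
function shouldRetry(method, status, hasIdempotencyKey = false) {
  if (status === 429) return true; // assumes the rate limiter rejected the request before processing
  const retryableMethod = IDEMPOTENT.has(method.toUpperCase()) || hasIdempotencyKey;
  return retryableMethod && (status === 0 || status >= 500);
}

// Honour Retry-After (seconds) when present, otherwise back off exponentially.
function retryDelayMs(attempt, retryAfterSeconds) {
  if (Number.isFinite(retryAfterSeconds)) return retryAfterSeconds * 1000;
  return Math.min(1000 * 2 ** attempt, 30000);
}

console.log(shouldRetry("GET", 503));        // true
console.log(shouldRetry("POST", 503));       // false
console.log(shouldRetry("POST", 503, true)); // true (caller opted in with an idempotency key)
console.log(retryDelayMs(2));                // 4000
console.log(retryDelayMs(0, 7));             // 7000
```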

OpenTelemetry (OTel) provides vendor-neutral distributed tracing. When service A calls service B, it propagates trace context via the **W3C Trace Context** standard headers:

http
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

Format: `version-traceId-parentSpanId-flags`, where `traceId` is 16 bytes (32 hex characters) and `parentSpanId` is 8 bytes (16 hex characters). `01` in flags means "sampled". Service B reads `traceparent`, creates a child span with the same `traceId` and its own `spanId`, setting the incoming `parentSpanId` as its parent. This connects the spans in the trace backend (Jaeger, Tempo, Honeycomb) into a single trace tree spanning all services. Implementation in Node.js: the OpenTelemetry SDK with auto-instrumentation packages (such as `@opentelemetry/auto-instrumentations-node`) propagates the headers automatically for common HTTP frameworks; `@opentelemetry/api` provides the tracer for manual spans. Manual instrumentation:

js
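const { trace, SpanKind } = require("@opentelemetry/api");
const tracer = trace.getTracer("payment-service"); // tracer name is illustrative
const span = tracer.startSpan("process-payment", {
  kind: SpanKind.SERVER,
  attributes: { "user.id": userId, "payment.amount": amount }
});
// ... do the work, then always end the span so it gets exported
span.end();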

Always propagate `tracestate` alongside `traceparent`; it carries vendor-specific trace metadata. Application-level key-value context travels in the separate W3C `baggage` header. In API gateways, log the `traceId` in access logs to correlate gateway logs with application traces. Sampling strategy: head-based sampling at the gateway (for example, sample 1% of all traffic) or tail-based sampling (decide after the trace completes, for example keeping 100% of traces that contain errors).
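
To show what happens to the header at each hop, here is a dependency-free sketch of parsing `traceparent` and building the value sent downstream. `parseTraceparent` and `childTraceparent` are hypothetical helpers and the child span ID is an example value; with the OpenTelemetry SDK this is handled by the propagator, so hand-written code like this is only needed for illustration or for unsupported environments. The sketch forwards only the sampled flag.

```javascript
// Parse a W3C traceparent header; returns null if it is malformed.
function parseTraceparent(header) {
  const m = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!m) return null;
  const [, version, traceId, parentSpanId, flags] = m;
  // All-zero IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return null;
  return { version, traceId, parentSpanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Header to send downstream: same trace ID, this service's span ID as the new parent.
function childTraceparent(parent, newSpanId) {
  const flags = parent.sampled ? "01" : "00";
  return `00-${parent.traceId}-${newSpanId}-${flags}`;
}

const incoming = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
console.log(incoming.sampled); // true
console.log(childTraceparent(incoming, "b7ad6b7169203331"));
// 00-4bf92f3577b34da6a3ce929d0e0e4736-b7ad6b7169203331-01
```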

Usage-based API billing requires accurate metering at every request, quota enforcement in real time, and aggregation for billing calculations. Architecture layers: **(1) Metering at the gateway**: increment a Redis counter per API key per billing period on every request, before the request hits the application — this is the source of truth for quota enforcement.

lua
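-- Atomic increment and check
local key = KEYS[1]              -- e.g. "meter:key_abc:2024-01"
local limit = tonumber(ARGV[1])  -- quota for the billing period
local count = redis.call("INCR", key)
if count == 1 then redis.call("EXPIRE", key, 2678400) end -- 31 days, set once
if count > limit then return 0 end
return 1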

**(2) Quota enforcement**: soft limits (warn at 80% via email) and hard limits (return 429 at 100%). Expose usage in API headers: `X-RateLimit-Used: 8000`, `X-RateLimit-Limit: 10000`. **(3) Aggregation for billing**: stream meter events (via Kafka, Kinesis, or Redis Streams) to a time-series store (ClickHouse, TimescaleDB). Aggregate daily/monthly usage by customer, by endpoint, by tier. **(4) Stripe metering**: use Stripe Billing with usage records, reporting meter events to Stripe's metered billing API at the end of each billing period (or in near-real-time via the Meter Events API). **(5) Idempotent reporting**: meter events must be deduplicated before billing, so use the request's `X-Request-ID` as the idempotency key when reporting to Stripe.
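
The enforcement and deduplication steps can be sketched as follows. `quotaDecision` and `dedupeByRequestId` are hypothetical helpers; the 80% warning threshold and the header names come from the text, and the event shape `{ requestId, units }` is an assumption.

```javascript
// Map a usage counter to an HTTP status, a soft-limit warning flag, and usage headers.
function quotaDecision(used, limit) {
  const headers = { "X-RateLimit-Used": String(used), "X-RateLimit-Limit": String(limit) };
  if (used > limit) return { status: 429, warn: true, headers }; // hard limit exceeded
  return { status: 200, warn: used >= limit * 0.8, headers };    // soft limit at 80%
}

// Drop duplicate meter events, keyed by request ID, before reporting usage.
function dedupeByRequestId(events) {
  const seen = new Set();
  return events.filter((e) => {
    if (seen.has(e.requestId)) return false;
    seen.add(e.requestId);
    return true;
  });
}

console.log(quotaDecision(8000, 10000).warn);    // true
console.log(quotaDecision(10001, 10000).status); // 429
console.log(dedupeByRequestId([
  { requestId: "r1", units: 1 },
  { requestId: "r1", units: 1 }, // retried delivery of the same event
  { requestId: "r2", units: 1 },
]).length); // 2
```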

**Safe** methods (GET, HEAD, OPTIONS, TRACE) do not change server state — clients can call them freely without side effects. **Idempotent** methods (GET, HEAD, OPTIONS, TRACE, PUT, DELETE) produce the same server state regardless of how many times they are called — retrying them after a network timeout is safe. POST and PATCH are neither safe nor idempotent. **Cacheable** methods are those whose responses may be stored and reused: GET and HEAD are cacheable by default if the response includes appropriate cache headers; POST responses are cacheable only if `Cache-Control` or `Expires` explicitly allows it. DELETE and PUT are not cacheable. The matrix:

Method   Safe  Idempotent  Cacheable
GET      ✓     ✓           ✓
HEAD     ✓     ✓           ✓
POST     ✗     ✗           conditional
PUT      ✗     ✓           ✗
PATCH    ✗     ✗           ✗
DELETE   ✗     ✓           ✗

RFC 9110 (HTTP Semantics) formalises these. Mainstream frameworks such as Spring MVC or ASP.NET Core do not mark PUT/DELETE responses as cacheable by default, and an API gateway like Kong can be configured to cache only responses to safe methods. The practical implication: design your API so GET/HEAD are truly free of client-visible side effects (no state-changing writes in the hot path), and ensure DELETE really is idempotent. Idempotency is about the resulting server state, not the status code: a second DELETE on an already-deleted resource may return 404 (or 204), but never 500.
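
The matrix can be encoded as a lookup table that a gateway or SDK consults. This is a sketch: the table mirrors the matrix above, with OPTIONS and TRACE rows added from the prose (their responses are not cacheable), and the helper names are hypothetical.

```javascript
// Method properties, mirroring the matrix above.
const METHODS = {
  GET:     { safe: true,  idempotent: true,  cacheable: true },
  HEAD:    { safe: true,  idempotent: true,  cacheable: true },
  OPTIONS: { safe: true,  idempotent: true,  cacheable: false },
  TRACE:   { safe: true,  idempotent: true,  cacheable: false },
  POST:    { safe: false, idempotent: false, cacheable: "conditional" },
  PUT:     { safe: false, idempotent: true,  cacheable: false },
  PATCH:   { safe: false, idempotent: false, cacheable: false },
  DELETE:  { safe: false, idempotent: true,  cacheable: false },
};

// Gateway-style check: only store responses to safe methods that are cacheable by default.
function gatewayMayCache(method) {
  const p = METHODS[method.toUpperCase()];
  return Boolean(p && p.safe && p.cacheable === true);
}

// Idempotent DELETE: repeated calls leave the same state; only the status differs.
function deleteResource(store, id) {
  return store.delete(id) ? 204 : 404;
}

const store = new Map([[42, { name: "Alice" }]]);
console.log(deleteResource(store, 42)); // 204
console.log(deleteResource(store, 42)); // 404, and the store is unchanged
console.log(gatewayMayCache("post"));   // false
```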

A search API must balance expressiveness, performance, and predictable pagination. **Query DSL design**: separate *filters* (exact/range matches that do not affect relevance) from *query* (free-text that scores results). Use a flat structure for simple use cases and a nested boolean DSL for complex ones:

json
{
  "query": "senior engineer",
  "filters": { "location": "remote", "salary_min": 120000 },
  "sort": [{ "field": "relevance", "order": "desc" }, { "field": "created_at", "order": "desc" }]
}

**Relevance vs filtering separation**: filters run as fast no-score queries (bit sets in Lucene/Elasticsearch) and should not affect scoring. The relevance score comes from BM25 over the `query` field. Mixing them (e.g., boosting by recency) is a deliberate tuning decision, not a default. **Cursor pagination with sort stability**: offset pagination (`?page=3`) breaks when new results are inserted — items shift between pages. Cursor-based pagination encodes the last-seen sort key as an opaque token. For multi-field sorts (`relevance DESC, created_at DESC`), the cursor must encode both values and the WHERE clause becomes `WHERE (relevance, created_at) < (cursor.relevance, cursor.created_at)` using tuple comparison. Encode as base64 JSON to keep the API surface clean. **Sort stability requirement**: if two documents have identical sort keys, a tiebreaker (document `id`) must be the final sort field to guarantee the cursor never produces duplicate or skipped pages. **Response shape**: include `next_cursor`, `has_more`, `total_count` (if cheap), and `took_ms` for client-side debugging.
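
The tuple comparison and the response shape can be sketched over an in-memory array. `compareDocs` and `searchPage` are hypothetical; the sketch assumes numeric `score`, `created_at` (epoch value), and `id` fields, and stands in for the database or search engine doing the same comparison in its query.

```javascript
// Sort: score DESC, created_at DESC, id ASC (tiebreaker). Negative if a comes first.
function compareDocs(a, b) {
  return (b.score - a.score) || (b.created_at - a.created_at) || (a.id - b.id);
}

// Return one page plus an opaque cursor pointing at the last returned document.
function searchPage(docs, limit, cursorToken) {
  const sorted = [...docs].sort(compareDocs);
  let start = 0;
  if (cursorToken) {
    const last = JSON.parse(Buffer.from(cursorToken, "base64").toString("utf8"));
    // Keep only documents that sort strictly after the last-seen one.
    start = sorted.findIndex((d) => compareDocs(last, d) < 0);
    if (start === -1) start = sorted.length;
  }
  const items = sorted.slice(start, start + limit);
  const has_more = start + limit < sorted.length;
  const tail = items[items.length - 1];
  const next_cursor = has_more
    ? Buffer.from(JSON.stringify({ score: tail.score, created_at: tail.created_at, id: tail.id })).toString("base64")
    : null;
  return { items, next_cursor, has_more };
}

const docs = [
  { id: 1, score: 0.9, created_at: 10 },
  { id: 2, score: 0.9, created_at: 10 }, // ties with id 1 on both sort keys
  { id: 3, score: 0.5, created_at: 5 },
];
const page1 = searchPage(docs, 1);
const page2 = searchPage(docs, 1, page1.next_cursor);
console.log(page1.items[0].id, page2.items[0].id); // 1 2
```

Without `id` in the comparison, documents 1 and 2 would be indistinguishable to the cursor, and the second page could repeat or skip one of them.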
