
Feature Flags and A/B Testing

Ship behind flags, run experiments, and roll out features gradually. How to implement feature flags and A/B testing in frontend applications.

Frontend Digest · February 27, 2026 · 5 min read

Feature flags let you deploy code without enabling it for everyone. A/B testing lets you compare variants to improve conversion or engagement. Together they support safer releases and data-driven product decisions. This guide covers how to implement them on the frontend.

What Feature Flags Are For

Flags let you turn features on or off without a new deploy. Use them for gradual rollouts, kill switches for problematic code, per-user or per-segment enablement, and separating deploy from release. The frontend reads a flag (from config, an API, or a provider) and branches UI or behavior accordingly.
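The branching itself can be as small as a single conditional. A minimal sketch (the flag and component names here are made up; real components replace the placeholder strings):

```typescript
// A flag set resolved from config, an API, or a provider SDK.
type Flags = Record<string, boolean>;

// Read one flag and pick a code path; a missing flag is falsy, so the
// legacy path is the safe default.
function headerComponent(flags: Flags): string {
  return flags["new-header"] ? "<NewHeader />" : "<LegacyHeader />";
}
```

Everything outside this branch stays identical between releases, which is what makes toggling cheap.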

Separating deploy from release: You can ship code to production behind a flag that's off for everyone. Then turn it on for internal users, then a percentage of traffic, then everyone. If something breaks, turn the flag off without rolling back the deploy. That reduces risk and lets you release when the feature is ready rather than when the deploy happens.
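The "percentage of traffic" step is usually implemented by hashing a stable user id into a bucket. A hypothetical sketch (flag providers do this for you; the hash here is illustrative, not any particular SDK's algorithm):

```typescript
// Map a stable user id to a bucket from 0-99.
function bucket(userId: string): number {
  // Tiny FNV-1a-style hash: deterministic and stable, not cryptographic.
  let h = 2166136261;
  for (let i = 0; i < userId.length; i++) {
    h ^= userId.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return Math.abs(h) % 100;
}

// The same user id always lands in the same bucket, so raising the
// percentage only ever adds users; nobody flips back and forth.
function inRollout(userId: string, percent: number): boolean {
  return bucket(userId) < percent;
}
```

Raising `percent` from 0 to 10 to 100 walks through exactly the internal-then-partial-then-everyone sequence described above.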

Implementing Flags on the Frontend

Flags can come from build-time config (env vars), a runtime config endpoint, or a third-party provider (LaunchDarkly, Split, etc.). For simple cases, a JSON config fetched at app load is enough. For experiments and targeting, use a provider that supports user attributes and rules. Always define a default (e.g. off) so the app keeps working when the flag service is unreachable.
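The fetch-at-load case with a safe fallback might look like this sketch (the `/api/flags` endpoint and flag names are assumptions; the fetcher is injected so the sketch is easy to test, where a browser app would pass the global `fetch`):

```typescript
type Flags = Record<string, boolean>;

const DEFAULT_FLAGS: Flags = {
  // Off by default: an unreachable flag service never enables unfinished UI.
  "new-checkout": false,
};

// Merge whatever the endpoint returned over the defaults, so missing keys
// still resolve to a known value.
function withDefaults(fetched: Flags, defaults: Flags = DEFAULT_FLAGS): Flags {
  return { ...defaults, ...fetched };
}

type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<Flags> }>;

async function loadFlags(fetchImpl: FetchLike): Promise<Flags> {
  try {
    const res = await fetchImpl("/api/flags"); // hypothetical endpoint
    if (!res.ok) return DEFAULT_FLAGS;
    return withDefaults(await res.json());
  } catch {
    return DEFAULT_FLAGS; // network or parse error: keep the app on defaults
  }
}
```

Every failure mode resolves to the same known-good defaults, which is the property the paragraph above asks for.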

Build-time vs runtime: Build-time flags (e.g. NEXT_PUBLIC_FEATURE_X) are baked into the bundle and require a new build to change. Use them for features that are truly environment-specific. Runtime flags are fetched when the app loads (or from an API) and can be changed without a deploy. For most product features, runtime flags are preferable so you can toggle quickly in production.
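Build-time flags reach the code as strings (or are absent), so it's worth normalizing them once. A small sketch, assuming the `NEXT_PUBLIC_FEATURE_X` name from above:

```typescript
// Bundlers inline values like process.env.NEXT_PUBLIC_FEATURE_X as string
// literals at compile time, so changing one requires a rebuild.
// Env values arrive as strings or undefined; treat only "true" as enabled.
function parseEnvFlag(value: string | undefined): boolean {
  return value === "true";
}

// e.g. const FEATURE_X = parseEnvFlag(process.env.NEXT_PUBLIC_FEATURE_X);
```

The strict `=== "true"` check avoids surprises like `"false"` or `"0"` being truthy strings.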

A/B Testing Basics

Define a hypothesis, create variants (e.g. A and B), assign users to a variant (randomly or by segment), and measure an outcome (click-through, sign-up, revenue). Assignment must be stable—the same user sees the same variant on repeat visits. Persist the assignment in a cookie or local storage, or let the backend return the variant.
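Persisting the assignment in local storage can be sketched like this (the interface mirrors just the parts of `localStorage` the sketch needs, so a cookie-backed store works too; the experiment key format is an assumption):

```typescript
// Minimal storage interface; the browser's localStorage satisfies it.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assign once at random, persist, and reuse on every later visit, so the
// same browser keeps seeing the same variant.
function getVariant(experiment: string, variants: string[], store: KVStore): string {
  const key = `exp:${experiment}`;
  const existing = store.getItem(key);
  if (existing !== null && variants.includes(existing)) return existing;
  const assigned = variants[Math.floor(Math.random() * variants.length)];
  store.setItem(key, assigned);
  return assigned;
}
```

In the browser you would pass `localStorage` directly as the store.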

Stable assignment: If the user is logged in, use their user ID for assignment so they see the same variant across devices. If anonymous, use a persistent cookie or a fingerprint. Never re-randomize on every page load, or the user will flicker between variants and your experiment metrics will be skewed. Document the assignment logic so analytics and backend can segment correctly.
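For logged-in users, deterministic assignment from the user ID needs no persistence at all. An illustrative sketch (hashing the experiment key together with the user id so different experiments split users independently; the hash is a stand-in for whatever your provider uses):

```typescript
// Same user + same experiment always yields the same variant, on any device.
function variantFor(userId: string, experiment: string, variants: string[]): string {
  const input = `${experiment}:${userId}`;
  let h = 2166136261; // FNV-1a-style hash; stable, not cryptographic
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return variants[Math.abs(h) % variants.length];
}
```

Because the backend can run the identical function on the same inputs, analytics and server-side code segment users the same way the frontend does.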

Frontend Responsibilities

The frontend requests the user's variant (or flags) from your backend or provider, then renders the right UI. Avoid flicker by resolving flags before first paint where possible (e.g. inline script or server-rendered segment). For client-rendered apps, show a neutral loading state until flags are ready, or use defaults and hydrate with the correct variant.

Avoiding flash: If flags load after first paint, the user may see the default variant and then a layout shift when the real variant loads. Options: (1) Inline critical flags in the initial HTML (e.g. from server-side provider SDK). (2) Block first paint until flags are fetched (can hurt LCP). (3) Use a default that matches the most common variant and accept a small percentage of users seeing a brief flash. Many teams use (1) for above-the-fold experiments and (3) for the rest.
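Option (1) often uses a global the server writes into the initial HTML. A sketch, with `window.__FLAGS__` as an assumed convention rather than any particular SDK's API:

```typescript
// The server inlines resolved flags into the initial HTML, e.g.
//   <script>window.__FLAGS__ = {"new-hero": true}</script>
// and the client reads them synchronously, before any render.
type Flags = Record<string, boolean>;

function readInlineFlags(defaults: Flags): Flags {
  // globalThis keeps this readable in SSR and tests as well as the browser.
  const inlined = (globalThis as { __FLAGS__?: Flags }).__FLAGS__;
  return { ...defaults, ...inlined };
}
```

Because the read is synchronous, the first paint already uses the correct variant and there is nothing to flash.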

Ethics and Stats

Don't run experiments that harm users or mislead them. Use proper statistical methods and sample sizes; don't declare a winner too early. Document your experiments and clean up flags that are no longer needed to avoid technical debt.

Stats: Run experiments long enough to reach statistical significance for your primary metric. Stopping early can lead to false positives. Use a calculator or work with data/analytics to determine sample size and duration. When you ship a winning variant, remove the flag and the old code path so the codebase doesn't accumulate one-off experiment branches. Keep a registry of active flags and owners so someone is responsible for cleaning up.
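As a concrete instance of "proper statistical methods," here is a standard two-proportion z-test (a textbook formula, not something the article prescribes) for comparing conversion rates:

```typescript
// z-score for the difference between two conversion rates.
// convA/convB are conversion counts; nA/nB are sample sizes.
function twoProportionZ(convA: number, nA: number, convB: number, nB: number): number {
  const pA = convA / nA;
  const pB = convB / nB;
  const pooled = (convA + convB) / (nA + nB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB));
  return (pB - pA) / se;
}

// |z| > 1.96 corresponds to p < 0.05 (two-sided) -- but only if the sample
// size was fixed in advance; repeatedly peeking inflates false positives.
```

For example, 100/1000 conversions vs 130/1000 gives z ≈ 2.10, just past the 1.96 threshold, which is exactly the kind of marginal result that early stopping turns into a false positive.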

Feature flags and A/B testing, used with clear ownership and rollout rules, make releases safer and product decisions more evidence-based. Start with a small set of flags and a simple evaluation path; add complexity only when you need it.

Cleanup and Ownership

Every flag should have an owner and a target state (e.g. "remove after rollout"). Schedule a quarterly review to retire flags that are fully rolled out or no longer needed. Unused flags add cognitive load and risk; a short registry (name, owner, purpose, removal date) keeps the codebase manageable. When rolling out a winning variant, remove the flag and the losing code path in the same release so you don't leave dead branches in the codebase. Document your flag naming and evaluation order (e.g. user id hash, then flag key) so new team members can reason about behavior without reading the code.
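The registry described above can be as simple as a typed list checked in code review. An illustrative shape (field names are assumptions, not a standard):

```typescript
// One row per active flag: name, owner, purpose, removal date.
interface FlagEntry {
  name: string;
  owner: string;
  purpose: string;
  removeBy: string; // ISO date the flag should be gone by
}

// Flags past their removal date form the quarterly review's cleanup list.
function overdue(entries: FlagEntry[], today: Date): FlagEntry[] {
  return entries.filter((e) => new Date(e.removeBy) < today);
}
```

Wiring `overdue` into CI or a scheduled report turns the quarterly review from a memory exercise into an automatic reminder.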