shadcn Dark Mode Broken by AI-Generated Hardcoded Colors
Dark mode unreadable from AI-generated shadcn components? Prevent design token drift with automated compliance checks and avoid white-on-white bugs.
- Combine v0.dev (generation + Design Mode) with Cursor (Agent mode + .cursorrules) to scaffold shadcn/ui components in minutes instead of hours
- Biggest risk: AI defaults to hardcoded Tailwind classes that break dark mode and custom themes — enforce semantic tokens at every step
- Treat every AI output as a first draft, not a finished component
Think of it like a factory assembly line. v0.dev is the machine that stamps out the raw part from a blueprint — and lets you tweak it visually in Design Mode before it leaves the press. Cursor is the engineer who reshapes and wires that part to fit the specific machine it's going into, using Agent mode and project rules to handle structural adaptation, not cosmetic touch-ups. Neither alone produces a finished component. Together they scale production from one component per hour to eight or more.
Manual component creation is a scaling bottleneck. Each component requires boilerplate, variant logic, accessibility markup, and design token integration. At five components this is manageable. At fifty it is unsustainable.
AI tools automate the scaffolding phase. Combining v0.dev's generative output (now with Design Mode for visual polishing before export) with Cursor's contextual editing (now with full Agent mode and .cursorrules for project-wide rule enforcement) creates a pipeline that produces dozens of consistent components per session. The developer shifts from writing boilerplate to curating and refining AI output.
This article documents a workflow our team used to generate 52 production components for a B2B SaaS dashboard in approximately six hours of active work across two engineers. The component library covered data display, form inputs, navigation, feedback states, and layout primitives. Without this pipeline, the same output would have taken three to four days.
The risk is real: shipping AI output that drifts from your design system's tokens, breaks accessibility standards, or imports unnecessary dependencies. This article covers the workflow, the failure modes we hit, and the quality gates that prevent them from reaching production.
A note on tooling versions: this workflow reflects the state of these tools in early 2026. Tailwind CSS v4 introduced a CSS-first configuration model: token definitions now live in your CSS file under the @theme directive instead of in tailwind.config.ts. React 19 introduced the use() API and first-class server component support, which affects how you structure components that fetch data. Both are addressed where relevant.
The Two-Tool Workflow: v0.dev and Cursor
This workflow uses each tool for the phase where it excels. Trying to do everything in one tool produces worse results and slower output.
v0.dev handles initial generation. It translates structured text prompts into functional React components using shadcn/ui primitives and Tailwind CSS. In 2026 it also offers Design Mode — a visual editor that lets you tweak layout, spacing, and color directly in the interface before exporting code. This removes a category of small fixes that previously required a Cursor round-trip.
Cursor handles contextualization. Its AI features adapt generic v0.dev output to your project's design tokens, existing hooks, type definitions, and coding conventions. The ones this workflow relies on are Chat with @codebase context, Agent and Composer mode for multi-file autonomous edits, inline Cmd+K transformations, and .cursorrules for project-wide rule enforcement (newer Cursor versions favor project rules under .cursor/rules; the same approach applies).
The developer's role is quality control. You write the spec, review the generated scaffold, direct the refactoring, and sign off before merge. The AI handles the mechanical labor; you handle the judgment calls.
- v0.dev output is a first draft — assume 30 to 50 percent of it needs modification even after Design Mode adjustments.
- Cursor is the structural adaptation tool — it aligns generic output to project-specific context via Agent mode and .cursorrules.
- The developer is the quality gate — no AI output ships without human review of every line.
- Speed comes from repeating the loop efficiently, not from skipping review steps.
Phase 1: Generation with v0.dev
v0.dev translates structured UI descriptions into functional React components. Prompt quality directly determines output quality — a vague prompt produces a vague component that requires extensive rework.
A strong v0.dev prompt includes: the component name, one sentence describing its core function, the key props it accepts, the variants it supports, the states it must handle, the specific shadcn/ui primitives to use, and explicit token requirements.
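The prompt elements listed above can be assembled from a small data structure so that every prompt follows the same shape. The following is a minimal sketch, not a v0.dev API: the `ComponentPromptSpec` interface, the `buildV0Prompt` function, and the StatCard example are all hypothetical names chosen for illustration.

```typescript
// Hypothetical shape for one prompt; the field names are illustrative only.
interface ComponentPromptSpec {
  componentName: string;
  purpose: string; // one sentence describing the core function
  props: string[];
  variants: string[];
  states: string[];
  primitives: string[]; // shadcn/ui primitives the component should build on
  tokenRules: string[]; // explicit semantic token requirements
}

// Produces a plain-text prompt with one labeled line per required element.
function buildV0Prompt(spec: ComponentPromptSpec): string {
  const lines: string[] = [
    "Component: " + spec.componentName,
    "Purpose: " + spec.purpose,
    "Props: " + spec.props.join(", "),
    "Variants: " + spec.variants.join(", "),
    "States: " + spec.states.join(", "),
    "Use shadcn/ui primitives: " + spec.primitives.join(", "),
    "Token requirements: " + spec.tokenRules.join("; "),
  ];
  return lines.join("\n");
}

// Example input (assumed component, not taken from the real library).
const statCardPrompt = buildV0Prompt({
  componentName: "StatCard",
  purpose: "Displays a single KPI with a label, value, and trend indicator.",
  props: ["label", "value", "trend"],
  variants: ["default", "compact"],
  states: ["loading", "error", "empty", "populated"],
  primitives: ["Card", "Skeleton"],
  tokenRules: [
    "Use semantic classes such as bg-card and text-muted-foreground",
    "Do not use hardcoded colors such as bg-white or text-gray-900",
  ],
});
```

Keeping the token requirements as a required field makes it harder to send a prompt that omits them, although the generated output still needs auditing afterwards.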
After initial generation, use Design Mode to fix obvious visual issues — padding, spacing, color, layout — before exporting. This takes two to three minutes and removes a round of Cursor work.
v0.dev output is complete enough to run but not complete enough to ship. It will have hardcoded colors, generic types, and no connection to your project's hooks or utilities. That is expected. That is what Phase 2 addresses.
Phase 2: Customization with Cursor
Cursor transforms the v0.dev scaffold into a project-native component through three steps: contextualize, refactor, and validate.
Step 1: Contextualize. Paste the v0.dev output into your project at the correct file path. Open Cursor Chat and provide context using @codebase, or explicitly reference key files: @src/styles/globals.css (for Tailwind v4 @theme tokens), @src/types/user.ts, @src/hooks/useDataTable.ts. The more precise the context, the better the adaptation.
Step 2: Refactor. Use Cmd+K for inline targeted changes or Agent mode for multi-step transformations. Typical refactoring commands replace hardcoded colors with semantic tokens, swap generic types for your project's type definitions, and wire the component to existing hooks and utilities. If you have a .cursorrules file, it enforces project conventions automatically (semantic token usage, import patterns, naming conventions), reducing the number of manual corrections needed.
Step 3: Validate. Run tsc --noEmit. Render the component in both light and dark mode. Run a hardcoded color audit (a script that flags raw palette classes such as bg-white) and check its output. Do not proceed to the quality gates until these three checks pass.
- Cursor's @codebase context has limits — large projects will not fit in a single context window.
- Reference files explicitly rather than relying on @codebase scans: @src/styles/globals.css, @src/types/user.ts, @src/hooks/useDataTable.ts.
- A well-configured .cursorrules file reduces context dependency because project rules are applied automatically.
- Agent mode handles multi-file refactoring better than Chat — use it for changes that touch more than two files.
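The hardcoded color audit used in the validation step can be as small as one regular expression. This is a sketch under assumptions, not the team's actual script: `findHardcodedColors` and the sample markup are hypothetical, and the pattern only covers Tailwind's default palette families, white/black, and arbitrary hex values.

```typescript
// Tailwind's default palette families. Semantic shadcn/ui tokens such as
// "background" or "primary" are deliberately absent, so they never match.
const PALETTE_FAMILIES = [
  "slate", "gray", "zinc", "neutral", "stone", "red", "orange", "amber",
  "yellow", "lime", "green", "emerald", "teal", "cyan", "sky", "blue",
  "indigo", "violet", "purple", "fuchsia", "pink", "rose",
].join("|");

// Utility prefixes that accept a color value.
const COLOR_PREFIXES = [
  "bg", "text", "border", "ring", "fill", "stroke",
  "from", "via", "to", "divide", "outline",
].join("|");

// Matches e.g. bg-white, text-gray-900, dark:bg-slate-950, bg-[#ff0000].
const HARDCODED_COLOR_SOURCE =
  "(?:^|[\\s\"'`:])" +
  "((?:" + COLOR_PREFIXES + ")-" +
  "(?:(?:" + PALETTE_FAMILIES + ")-\\d{2,3}|white|black|\\[#[0-9a-fA-F]{3,8}\\]))" +
  "(?![\\w-])";

interface ColorViolation {
  line: number; // 1-based line number in the scanned source
  className: string;
}

// Scans source text line by line and reports every hardcoded color class.
function findHardcodedColors(source: string): ColorViolation[] {
  const violations: ColorViolation[] = [];
  source.split("\n").forEach((text, index) => {
    const pattern = new RegExp(HARDCODED_COLOR_SOURCE, "g");
    let match: RegExpExecArray | null = pattern.exec(text);
    while (match !== null) {
      violations.push({ line: index + 1, className: match[1] });
      match = pattern.exec(text);
    }
  });
  return violations;
}

// Hypothetical generated markup used to exercise the audit.
const sampleComponent = [
  '<div className="bg-white text-gray-900 dark:bg-slate-950">',
  '  <p className="text-muted-foreground border-border">ok</p>',
  '  <span className="bg-[#ff0000] text-primary-foreground" />',
  "</div>",
].join("\n");

const sampleViolations = findHardcodedColors(sampleComponent);
```

In a real project the same function would be fed file contents read from disk and would fail the build when the result is non-empty; that wiring is left out here to keep the sketch dependency-free.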
Scaling to 50+ Components: The Specification System
Generating one component is a technique. Generating fifty consistently is a system. The difference is the Component Specification Document.
Before generating any component, define every component in a structured spec. For each component: name, one-sentence description, key props (three to five), variant options, required states, and whether it is a server or client component. This document becomes your prompt source and your living documentation.
The batch process is sequential and repeatable: spec → prompt → v0.dev generation → Design Mode review → Cursor refactor → quality gate → Storybook story → merge. Each component follows the same pipeline. Variation in output quality comes from variation in spec quality — not from the tools.
In our six-hour session generating 52 components, two engineers worked in parallel on separate component groups. One handled data display components (tables, charts, stat cards); the other handled form inputs and navigation. Parallel execution is possible because each component is self-contained and the pipeline is the same for both.
- A structured spec produces consistent components across the entire library because each prompt follows the same pattern.
- Without specs, each generated component drifts toward a different pattern depending on how the prompt was written.
- Specs serve as living documentation — they answer 'why does this component have these props?' without reading the implementation.
- The spec review is the cheapest review in the pipeline. Catch structural problems here, not after generation.
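Because the spec review is the cheapest checkpoint, part of it can be mechanical. The sketch below encodes the spec fields described in this section as a type and checks the constraints the text states (one-sentence description, three to five key props). The `ComponentSpec` interface, `validateSpec`, and both example specs are hypothetical.

```typescript
type RenderTarget = "server" | "client";

// One entry in the Component Specification Document (assumed field names).
interface ComponentSpec {
  componentName: string;
  description: string; // exactly one sentence
  props: string[]; // three to five key props
  variants: string[];
  states: string[];
  target: RenderTarget;
}

// Returns a list of problems; an empty list means the spec is ready to prompt.
function validateSpec(spec: ComponentSpec): string[] {
  const problems: string[] = [];
  if (!/^[A-Z][A-Za-z0-9]*$/.test(spec.componentName)) {
    problems.push("componentName must be PascalCase");
  }
  const description = spec.description.trim();
  // A sentence terminator followed by more text means a second sentence.
  if (description.length === 0 || /[.!?]\s+\S/.test(description)) {
    problems.push("description must be exactly one sentence");
  }
  if (spec.props.length < 3 || spec.props.length > 5) {
    problems.push("list three to five key props");
  }
  if (spec.states.length === 0) {
    problems.push("list at least one required state");
  }
  return problems;
}

const dataTableSpec: ComponentSpec = {
  componentName: "DataTable",
  description: "Renders paginated rows with sortable columns.",
  props: ["rows", "columns", "pageSize"],
  variants: ["default", "dense"],
  states: ["loading", "error", "empty", "populated"],
  target: "client",
};

const incompleteSpec: ComponentSpec = {
  componentName: "dataTable",
  description: "Shows rows. Also sorts them.",
  props: ["rows"],
  variants: [],
  states: [],
  target: "client",
};
```

Running `validateSpec` over the whole document before any generation starts surfaces structural gaps while they are still a one-line fix.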
Quality Gates: The Non-Negotiable Checkpoint
Automation without quality gates multiplies technical debt at the same rate it accelerates production. Each of the four gates targets a distinct failure mode that AI generation introduces.
Gate 1: Visual regression. Render the component in Storybook across all variants and all states (loading, error, empty, populated). Check both light and dark mode. Screenshot comparison catches layouts that look fine in isolation but break in composition.
Gate 2: Accessibility audit. Run axe-core against the component in the browser or Storybook. AI-generated components miss ARIA labels, keyboard navigation, and focus management at a high rate. This gate is not optional: accessibility compliance is a legal requirement in many jurisdictions.
Gate 3: Integration test with real data. Mock data hides edge cases that production data exposes: long strings, null values, empty arrays, deeply nested objects. Connect the component to your actual API or a fixture that mirrors production data shape.
Gate 4: Bundle size check. AI sometimes suggests heavy dependencies for problems that have lightweight solutions. A generated table component should not pull in a full charting library. Measure the bundle impact of each component before merge.
For simple presentational components (cards, badges, alerts), all four gates take eight to ten minutes. For complex interactive components (data tables, multi-step forms), they take twenty to thirty minutes. That time is not optional — it is the price of sustainable speed.
- In our experience, one unreviewed AI component introduces three to five downstream bugs — type mismatches, token drift, missing keyboard handlers, or edge case render failures.
- In our experience, fixing a component post-merge takes about four times longer than reviewing it pre-merge, because downstream code has already been written against the broken implementation.
- Quality gates are not a slowdown — they are the mechanism that makes the speed sustainable.
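For the integration gate, a fixture that deliberately contains the edge cases named above (long strings, null values, empty arrays, deeply nested objects) is a reasonable stand-in when the real API is not available. The following is a sketch only: the `TableRow` shape, `nestObject`, and `buildEdgeCaseRows` are hypothetical and should be replaced with your own data model.

```typescript
// Assumed row shape for a data table; not taken from a real API.
interface TableRow {
  id: string;
  displayName: string;
  email: string | null;
  tags: string[];
  meta: Record<string, unknown>;
}

// Wraps a leaf object in `depth` levels of { child: ... }.
function nestObject(depth: number): Record<string, unknown> {
  let node: Record<string, unknown> = { leaf: true };
  for (let level = 0; level < depth; level++) {
    node = { child: node };
  }
  return node;
}

// One row per edge case, so a render failure points at a single cause.
function buildEdgeCaseRows(): TableRow[] {
  return [
    { id: "long-string", displayName: "A".repeat(500), email: "a@example.com", tags: ["x"], meta: {} },
    { id: "null-value", displayName: "Null email", email: null, tags: ["x"], meta: {} },
    { id: "empty-array", displayName: "No tags", email: "b@example.com", tags: [], meta: {} },
    { id: "deep-nesting", displayName: "Nested meta", email: "c@example.com", tags: ["x"], meta: nestObject(10) },
  ];
}

const edgeCaseRows = buildEdgeCaseRows();
```

Rendering the component once with `edgeCaseRows` and once with an empty array covers the empty state as well.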
Version Control and Team Workflow at Scale
Generating 50+ components creates a version control and review workflow problem. Without a clear branching and commit strategy, the PR queue becomes unmanageable and review quality drops.
We used a component-group branching strategy: one feature branch per logical group of components (data-display, form-inputs, navigation, feedback). Each branch contained six to ten related components. This kept PR diffs reviewable and allowed parallel work without merge conflicts.
Commit strategy within each branch: one commit per component, with a consistent message format. This makes bisecting straightforward if a component introduces a regression.
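A consistent message format is easiest to keep when it is generated and checked rather than typed by hand. The format below is an assumption for illustration (a Conventional Commits style prefix scoped by component group); the article does not specify the team's actual format, and `formatComponentCommit` and `isComponentCommit` are hypothetical helpers.

```typescript
// The four component groups used for branching in this workflow.
type ComponentGroup = "data-display" | "form-inputs" | "navigation" | "feedback";

// Assumed format: feat(<group>): add <ComponentName> component
const COMPONENT_COMMIT_PATTERN =
  /^feat\((data-display|form-inputs|navigation|feedback)\): add [A-Z][A-Za-z0-9]* component$/;

// Builds the commit subject line for a single generated component.
function formatComponentCommit(group: ComponentGroup, componentName: string): string {
  return "feat(" + group + "): add " + componentName + " component";
}

// Checks an existing subject line, e.g. from a commit-msg hook.
function isComponentCommit(message: string): boolean {
  return COMPONENT_COMMIT_PATTERN.test(message);
}

const statCardCommit = formatComponentCommit("data-display", "StatCard");
```

With one component per commit and a predictable subject line, a bisect lands on a single component's commit and its message already names the group and the component.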
Review strategy: the author runs all quality gates locally before opening the PR. The reviewer checks only that the gates passed (via CI output) and does a spot-check on one component's light and dark mode rendering. With quality gates in CI, the reviewer is not re-checking mechanical compliance — they are checking judgment calls.
Common Failure Modes at Scale
After generating components in volume, specific failure patterns become predictable. These are the most common issues we hit and have seen other teams hit.
- A single Button with five variants is easier to maintain than five separate button components.
- A single Card with size and density variants covers most display use cases without proliferation.
- Track your component count against your spec count. If components grow faster than specs, you have sprawl.
- Use Cursor Agent mode to refactor sprawl: 'Merge ButtonSmall, ButtonLarge, and ButtonIcon into a single Button component with size and icon variant props.'
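The variant approach can be sketched without any framework code: one function maps variant and size props to class strings, replacing separate ButtonSmall, ButtonLarge, and ButtonIcon components. This is a dependency-free illustration with hypothetical names (`buttonClasses`, `VARIANT_CLASSES`, `SIZE_CLASSES`); shadcn/ui's own Button uses class-variance-authority for the same purpose. The class strings use semantic tokens only and are illustrative rather than copied from a specific shadcn/ui release.

```typescript
type ButtonVariant = "default" | "secondary" | "outline" | "ghost" | "link";
type ButtonSize = "sm" | "md" | "lg" | "icon";

const BASE_CLASSES = "inline-flex items-center justify-center rounded-md text-sm font-medium";

// Every entry uses semantic tokens, so themes and dark mode keep working.
const VARIANT_CLASSES: Record<ButtonVariant, string> = {
  default: "bg-primary text-primary-foreground hover:bg-primary/90",
  secondary: "bg-secondary text-secondary-foreground hover:bg-secondary/80",
  outline: "border border-input bg-background hover:bg-accent hover:text-accent-foreground",
  ghost: "hover:bg-accent hover:text-accent-foreground",
  link: "text-primary underline-offset-4 hover:underline",
};

const SIZE_CLASSES: Record<ButtonSize, string> = {
  sm: "h-9 px-3",
  md: "h-10 px-4 py-2",
  lg: "h-11 px-8",
  icon: "h-10 w-10",
};

interface ButtonStyleOptions {
  variant?: ButtonVariant;
  size?: ButtonSize;
}

// Resolves the final class string; omitted options fall back to defaults.
function buttonClasses(options: ButtonStyleOptions = {}): string {
  const variant = options.variant ?? "default";
  const size = options.size ?? "md";
  return [BASE_CLASSES, VARIANT_CLASSES[variant], SIZE_CLASSES[size]].join(" ");
}

const iconGhostClasses = buttonClasses({ variant: "ghost", size: "icon" });
```

Adding a sixth variant is a one-line change to `VARIANT_CLASSES`, and the `Record` types make the compiler flag any variant that lacks a class mapping.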
The Design Token Drift Incident
- Never trust AI output to use your design tokens correctly — v0.dev defaults to generic Tailwind classes regardless of what you specify in your prompt.
- Automate design system compliance checks before merging any generated component. A shell script or ESLint rule catches what code review misses.
- Test every generated component in both light and dark mode before marking it done. Add this to your PR checklist, not your memory.
- In Tailwind v4, your semantic tokens are defined in your CSS file under @theme — make sure your audit scripts and AI prompts reference the correct location.
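Since Tailwind v4 keeps tokens in CSS, an audit can read the token list straight from the @theme block instead of maintaining a separate allowlist. The sketch below assumes color tokens are declared as `--color-*` custom properties inside @theme (the namespace Tailwind v4 uses to generate color utilities); the example CSS, `extractColorTokens`, and `resolvesToThemeColor` are hypothetical.

```typescript
// Example stylesheet content; in practice this would be read from globals.css.
const exampleThemeCss = [
  "@theme inline {",
  "  --color-background: var(--background);",
  "  --color-foreground: var(--foreground);",
  "  --color-primary: var(--primary);",
  "  --color-primary-foreground: var(--primary-foreground);",
  "  --radius-lg: var(--radius);",
  "}",
].join("\n");

// Collects the token names declared as --color-* inside any @theme block.
function extractColorTokens(css: string): string[] {
  const tokens: string[] = [];
  const blockPattern = /@theme[^{]*\{([^}]*)\}/g;
  let block: RegExpExecArray | null = blockPattern.exec(css);
  while (block !== null) {
    const declarationPattern = /--color-([a-z0-9-]+)\s*:/g;
    let declaration: RegExpExecArray | null = declarationPattern.exec(block[1]);
    while (declaration !== null) {
      tokens.push(declaration[1]);
      declaration = declarationPattern.exec(block[1]);
    }
    block = blockPattern.exec(css);
  }
  return tokens;
}

// True when a class such as "dark:bg-primary/90" maps to a declared color token.
// Non-color utilities (for example text-sm) also return false.
function resolvesToThemeColor(className: string, tokens: string[]): boolean {
  // Drop variant prefixes ("dark:", "hover:") and opacity modifiers ("/90").
  const base = className.slice(className.lastIndexOf(":") + 1).split("/")[0];
  const match = /^(?:bg|text|border|ring)-(.+)$/.exec(base);
  return match !== null && tokens.indexOf(match[1]) !== -1;
}

const themeTokens = extractColorTokens(exampleThemeCss);
```

The same token list can be pasted into AI prompts, so the prompt and the audit both reference what the stylesheet actually defines.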
In React 19, use() is more appropriate than useEffect for the data fetching pattern.
Common mistakes to avoid
- Prompting v0.dev for logic instead of presentation
- Ignoring design token requirements in the prompt
- Not validating AI-generated TypeScript types against project models
- Creating new components when a variant would suffice
- Skipping accessibility review because the component looks correct
- Not generating Storybook stories alongside components
Interview Questions on This Topic
How would you design a system to automatically generate UI components that adhere to a company's design system?