Building an AI SaaS from Scratch with Next.js 15 (Complete Guide)
- This guide builds a complete, deployable AI SaaS: Next.js 15 App Router, Supabase Auth + RLS, Stripe metered billing, Vercel AI SDK streaming via Route Handlers
- Build in this exact order: scaffold, env validation, database schema, auth middleware, rate limiting, AI Route Handler, Stripe billing, deployment
- Biggest risk: shipping AI features before rate limiting and billing; a single viral post can generate a five-figure API bill in hours
Production Debug Guide

Common symptoms when building AI-powered SaaS applications with Next.js 15 and Supabase:

**Suspected API key exposure in client bundle**

```bash
grep -rn 'sk-' .next/static/chunks/ --include='*.js'
grep -rn 'OPENAI\|ANTHROPIC\|sk-' src/app --include='*.tsx' | grep -v 'server-only\|// server'
```

**Stripe webhook signature verification fails**

```bash
npx stripe listen --forward-to localhost:3000/api/webhooks/stripe
curl -X POST http://localhost:3000/api/webhooks/stripe -H 'Content-Type: application/json' -d '{}'
```

Fix: read the raw body with request.text() before calling stripe.webhooks.constructEvent(). Do not call request.json() before verification.

**Slow queries or empty results on multi-tenant tables**

```sql
SELECT tablename, indexname FROM pg_indexes WHERE tablename IN ('conversations','usage_records','messages');
EXPLAIN ANALYZE SELECT * FROM conversations WHERE user_id = '00000000-0000-0000-0000-000000000000' ORDER BY created_at DESC LIMIT 20;
```

Fix: confirm that auth.uid() matches the user_id column type in your table. Both must be uuid. Also verify your middleware is running and refreshing the session; stale tokens cause auth.uid() to return null, which RLS treats as an unauthenticated request.

**Environment variable missing at runtime after successful build**

```bash
vercel env ls
grep -rn 'process.env' src/lib/env.ts
```

Most AI SaaS tutorials stop at a chatbot demo. They skip the infrastructure that turns a prototype into a product: authentication, billing, multi-tenancy, rate limiting, and usage tracking. Without these, you have a tech demo, not a business.
This guide builds a complete, deployable AI SaaS application from the first commit to production. By the end, you have a running application with: authenticated users scoped to their own data, AI chat with streaming responses, per-user token quotas enforced before every AI call, Stripe metered billing that converts token consumption into revenue, and a Vercel deployment with isolated environments.
The stack is deliberate: Next.js 15 App Router for the application layer, Supabase for auth and data, Stripe for metered billing, Vercel AI SDK for model orchestration, and Upstash Redis for rate limiting. Each choice optimizes for developer velocity without sacrificing production readiness.
One warning before we start: the most expensive mistake in AI SaaS is building AI features before solving rate limiting and billing. Real teams have woken up to five-figure API bills after a single viral post, and that failure mode shapes every architectural decision in this guide.
What You Will Build
Before writing any code, understand exactly what this guide produces. The finished application is a deployable AI chat SaaS with the following capabilities:
Authenticated users can sign up, log in via email or OAuth, and access only their own data. The database enforces this at the row level: no application code can accidentally leak one user's conversations to another.
Each user has a monthly token quota. Before every AI call, the application checks the remaining quota. If the user has exceeded their limit, they receive a clear upgrade prompt. If they are within limits, the AI call proceeds and the token consumption is recorded.
A streaming chat interface sends messages to an AI model and displays the response word-by-word as it is generated. The model is called server-side; the API key never reaches the browser.
Stripe metered billing tracks every token consumed and generates invoices automatically at the end of each billing period. Users who exceed the free tier are billed based on actual usage.
The application deploys to Vercel with three isolated environments: local development, preview (one per pull request), and production. Each environment has its own Supabase project and Stripe account.
The build order is non-negotiable: scaffold first, then schema, then auth, then rate limiting, then AI, then billing, then deployment. Each layer depends on the previous one. Skipping layers creates rework.
- Step 1: Project scaffold and environment validation. Nothing works without this foundation.
- Step 2: Database schema with RLS. Multi-tenancy must be enforced before any data is written.
- Step 3: Auth and middleware. Users must exist before AI calls can be attributed to them.
- Step 4: Rate limiting. Must be in place before any AI call can be made.
- Step 5: AI orchestration. The feature, not the foundation.
- Step 6: Stripe billing. Converts usage into revenue.
- Step 7: Deployment. Ships the product.
Prerequisites and Stack Overview
Before starting, verify the following accounts and tools are available.
Accounts required: Supabase (free tier is sufficient to start), Stripe (test mode for development), Vercel (Hobby plan supports this stack), Upstash (free tier supports the rate limiting patterns in this guide), and an OpenAI account with API access.
Local tools required: Node.js 20+ or Bun 1.1+, the Supabase CLI for migrations, the Stripe CLI for local webhook testing, and Git.
The stack choices are deliberate. Next.js 15 App Router provides Server Components, Route Handlers, and Server Actions in a single deployable unit, with no separate backend service. Supabase bundles authentication, PostgreSQL with Row Level Security, and file storage behind one SDK. Stripe metered billing maps token consumption to revenue without custom invoice logic. Vercel AI SDK abstracts the AI model provider, so you can swap between OpenAI and Anthropic without changing application code. Upstash provides serverless Redis with a per-request pricing model that suits Vercel's serverless execution model.
```bash
# Verify prerequisites
node --version       # Must be 20+
bun --version        # Alternative to Node; 1.1+
supabase --version   # Supabase CLI
stripe --version     # Stripe CLI

# Create Next.js 15 project
npx create-next-app@latest ai-saas \
  --typescript \
  --tailwind \
  --eslint \
  --app \
  --src-dir \
  --import-alias '@/*'

cd ai-saas

# Install all dependencies
npm install \
  @supabase/supabase-js \
  @supabase/ssr \
  stripe \
  @stripe/stripe-js \
  ai \
  @ai-sdk/openai \
  @ai-sdk/anthropic \
  zod \
  server-only \
  @upstash/redis \
  @upstash/ratelimit

# Initialize Supabase locally
supabase init
supabase start
```
Project Scaffold and Environment Validation
The project structure sets conventions for the entire codebase. Every module has a predictable location. Every import path is explicit.
The directory structure separates concerns: lib contains all server-side integrations (Supabase, Stripe, AI), components contains the UI, and app contains routing. Within lib, each integration is an isolated module. This prevents circular imports and makes each module independently testable.
Environment variables are validated at startup using a Zod schema. Missing or malformed variables fail the build, not a customer request at 2am. The validation runs once at module load time on the server. Client-safe variables use the NEXT_PUBLIC_ prefix; server-only variables do not.
```typescript
import { z } from 'zod'

const envSchema = z.object({
  // Supabase: NEXT_PUBLIC_ vars are safe to expose to the client
  NEXT_PUBLIC_SUPABASE_URL: z.string().url(),
  NEXT_PUBLIC_SUPABASE_ANON_KEY: z.string().min(1),
  // Service role key bypasses RLS: server-only, never NEXT_PUBLIC_
  SUPABASE_SERVICE_ROLE_KEY: z.string().min(1),

  // Stripe: publishable key is safe for the client; secret key is server-only
  STRIPE_SECRET_KEY: z.string().startsWith('sk_'),
  STRIPE_WEBHOOK_SECRET: z.string().startsWith('whsec_'),
  NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY: z.string().startsWith('pk_'),

  // AI providers: server-only, never NEXT_PUBLIC_
  // OpenAI keys now come in both sk- (legacy) and sk-proj- (project-scoped) formats
  OPENAI_API_KEY: z.string().min(1),
  ANTHROPIC_API_KEY: z.string().startsWith('sk-ant-').optional(),

  // Upstash Redis: for rate limiting
  UPSTASH_REDIS_REST_URL: z.string().url(),
  UPSTASH_REDIS_REST_TOKEN: z.string().min(1),

  // App
  NEXT_PUBLIC_APP_URL: z.string().url(),
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
})

// Parse at module load time: the build fails if any required variable is missing
export const env = envSchema.parse(process.env)
export type Env = z.infer<typeof envSchema>
```
Database Schema and Row Level Security
The schema is designed for multi-tenancy from the first migration. Every table has a user_id column. Row Level Security policies enforce that users access only their own rows. No application-level authorization code is needed for basic data access; the database enforces it.
The schema has three domains. User management: profiles extends the Supabase auth.users table with application-specific fields including the Stripe customer ID and subscription status. Content: conversations and messages store the chat history scoped to each user. Usage tracking: usage_records stores the token count and cost of every AI call, which feeds directly into Stripe metered billing.
Indexes are placed on (user_id, created_at DESC) for all tenant-scoped tables. This composite index serves the dominant query pattern: fetch a user's recent items in reverse chronological order.
The service role key bypasses RLS. It is used only in webhook handlers and admin operations, never in user-facing request handlers.
```sql
-- Enable UUID extension
create extension if not exists "uuid-ossp";

-- ============================================================
-- USER MANAGEMENT
-- ============================================================

-- Profiles extends auth.users: one row per authenticated user
create table public.profiles (
  id uuid references auth.users on delete cascade primary key,
  email text not null,
  full_name text,
  avatar_url text,
  -- Stripe integration
  stripe_customer_id text unique,
  subscription_status text not null default 'free'
    check (subscription_status in ('free', 'active', 'past_due', 'cancelled', 'trialing')),
  -- Token quota enforcement
  token_usage_current_month integer not null default 0,
  token_limit_monthly integer not null default 10000,
  -- Timestamps
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

-- ============================================================
-- CONTENT
-- ============================================================

create table public.conversations (
  id uuid primary key default gen_random_uuid(),
  user_id uuid not null references public.profiles(id) on delete cascade,
  title text not null default 'New Conversation',
  model text not null default 'gpt-4o-mini',
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

create table public.messages (
  id uuid primary key default gen_random_uuid(),
  conversation_id uuid not null references public.conversations(id) on delete cascade,
  user_id uuid not null references public.profiles(id) on delete cascade,
  role text not null check (role in ('user', 'assistant', 'system')),
  content text not null,
  created_at timestamptz not null default now()
);

-- ============================================================
-- USAGE TRACKING: feeds Stripe metered billing
-- ============================================================

create table public.usage_records (
  id uuid primary key default gen_random_uuid(),
  user_id uuid not null references public.profiles(id) on delete cascade,
  conversation_id uuid references public.conversations(id) on delete set null,
  model text not null,
  prompt_tokens integer not null,
  completion_tokens integer not null,
  total_tokens integer generated always as (prompt_tokens + completion_tokens) stored,
  cost_usd numeric(10, 6) not null,
  -- Stripe meter event ID for idempotency
  stripe_meter_event_id text unique,
  created_at timestamptz not null default now()
);

-- ============================================================
-- STRIPE WEBHOOK IDEMPOTENCY
-- ============================================================

create table public.processed_webhook_events (
  stripe_event_id text primary key,
  processed_at timestamptz not null default now()
);

-- ============================================================
-- INDEXES
-- ============================================================

create index idx_conversations_user_created
  on public.conversations(user_id, created_at desc);
create index idx_messages_conversation_created
  on public.messages(conversation_id, created_at asc);
create index idx_messages_user on public.messages(user_id);
create index idx_usage_records_user_created
  on public.usage_records(user_id, created_at desc);

-- ============================================================
-- ROW LEVEL SECURITY
-- ============================================================

alter table public.profiles enable row level security;
alter table public.conversations enable row level security;
alter table public.messages enable row level security;
alter table public.usage_records enable row level security;
alter table public.processed_webhook_events enable row level security;

-- Profiles: users read and update their own profile only
create policy "users_select_own_profile" on public.profiles
  for select using (auth.uid() = id);
create policy "users_update_own_profile" on public.profiles
  for update using (auth.uid() = id);

-- Conversations: users manage their own conversations only
create policy "users_select_own_conversations" on public.conversations
  for select using (auth.uid() = user_id);
create policy "users_insert_own_conversations" on public.conversations
  for insert with check (auth.uid() = user_id);
create policy "users_update_own_conversations" on public.conversations
  for update using (auth.uid() = user_id);
create policy "users_delete_own_conversations" on public.conversations
  for delete using (auth.uid() = user_id);

-- Messages: users manage their own messages only
create policy "users_select_own_messages" on public.messages
  for select using (auth.uid() = user_id);
create policy "users_insert_own_messages" on public.messages
  for insert with check (auth.uid() = user_id);

-- Usage records: users read their own usage only.
-- Inserts happen via service role in Route Handlers after AI calls.
create policy "users_select_own_usage" on public.usage_records
  for select using (auth.uid() = user_id);

-- Webhook events: no direct user access; service role only.
-- No policies needed: all access via service role bypasses RLS.

-- ============================================================
-- TRIGGERS
-- ============================================================

-- Auto-create profile when a user signs up via Supabase Auth
create or replace function public.handle_new_user()
returns trigger
language plpgsql
security definer set search_path = public
as $$
begin
  insert into public.profiles (id, email, full_name, avatar_url)
  values (
    new.id,
    new.email,
    new.raw_user_meta_data->>'full_name',
    new.raw_user_meta_data->>'avatar_url'
  );
  return new;
end;
$$;

create trigger on_auth_user_created
  after insert on auth.users
  for each row execute procedure public.handle_new_user();

-- Auto-update updated_at timestamp
create or replace function public.handle_updated_at()
returns trigger
language plpgsql
as $$
begin
  new.updated_at = now();
  return new;
end;
$$;

create trigger profiles_updated_at
  before update on public.profiles
  for each row execute procedure public.handle_updated_at();

create trigger conversations_updated_at
  before update on public.conversations
  for each row execute procedure public.handle_updated_at();
```
- Every table has a user_id column and RLS policies. Application code never adds WHERE user_id = ?; the database enforces it automatically.
- The service role key bypasses RLS entirely. Use it only in webhook handlers and admin operations, never in user-facing request handlers.
- RLS policies must be tested as rigorously as application code. A wrong policy exposes all users' data, regardless of how correct your application code is.
- The processed_webhook_events table uses the service role only; no RLS policy is needed because user access is never intended.
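The bullet about testing RLS deserves a concrete check. One minimal smoke test you can run in psql against a local Supabase instance: Supabase's auth.uid() reads the JWT claims from a session setting, so you can simulate an authenticated user directly in SQL (the UUID below is a placeholder):

```sql
-- Simulate an authenticated user, then confirm RLS scopes the query
begin;
set local role authenticated;
-- auth.uid() resolves the 'sub' claim from this setting
set local request.jwt.claims to '{"sub": "00000000-0000-0000-0000-000000000001"}';
-- Should return only rows owned by that user (zero in a fresh database)
select count(*) from public.conversations;
rollback;
```

If this count includes another user's rows, a policy is wrong; if it returns zero for a user who owns rows, check the user_id column type and the claim value.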
A frequent failure mode worth checking for here: auth.uid() returning null because the middleware did not refresh the token, which makes RLS silently filter out every row.

Authentication and Middleware
Supabase Auth handles the entire authentication lifecycle. The application does not implement auth logic β it consumes auth state from Supabase.
The middleware pattern is critical and must be implemented exactly as shown. The middleware runs on every request, calls getUser() to validate and refresh the session token, and redirects unauthenticated users away from protected routes. Without it, session tokens expire and users are logged out silently, or worse: RLS policies see a null user ID and block all data access.
The key distinction: always use getUser() in middleware and server-side code, never getSession(). getSession() returns cached session data that may be stale. getUser() makes a network request to Supabase to validate the token and return fresh user data. This is the single most common Supabase Auth bug in Next.js applications.
```typescript
import { createServerClient } from '@supabase/ssr'
import { NextResponse, type NextRequest } from 'next/server'

// Routes that require authentication
const PROTECTED_ROUTES = ['/dashboard', '/chat', '/settings', '/billing']
// Routes that redirect authenticated users away
const AUTH_ROUTES = ['/login', '/signup']

export async function middleware(request: NextRequest) {
  let supabaseResponse = NextResponse.next({ request })

  const supabase = createServerClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
    {
      cookies: {
        getAll() {
          return request.cookies.getAll()
        },
        setAll(cookiesToSet) {
          cookiesToSet.forEach(({ name, value }) =>
            request.cookies.set(name, value)
          )
          supabaseResponse = NextResponse.next({ request })
          cookiesToSet.forEach(({ name, value, options }) =>
            supabaseResponse.cookies.set(name, value, options)
          )
        },
      },
    }
  )

  // CRITICAL: Always call getUser(), not getSession().
  // getUser() validates the token server-side and refreshes it if expired;
  // getSession() returns cached data that may be stale.
  const { data: { user } } = await supabase.auth.getUser()

  const pathname = request.nextUrl.pathname

  // Redirect unauthenticated users away from protected routes
  const isProtectedRoute = PROTECTED_ROUTES.some(route =>
    pathname.startsWith(route)
  )
  if (isProtectedRoute && !user) {
    const redirectUrl = new URL('/login', request.url)
    redirectUrl.searchParams.set('redirectTo', pathname)
    return NextResponse.redirect(redirectUrl)
  }

  // Redirect authenticated users away from auth pages
  const isAuthRoute = AUTH_ROUTES.some(route => pathname.startsWith(route))
  if (isAuthRoute && user) {
    return NextResponse.redirect(new URL('/dashboard', request.url))
  }

  // CRITICAL: Return supabaseResponse, not a fresh NextResponse.next().
  // supabaseResponse carries the refreshed session cookies.
  return supabaseResponse
}

export const config = {
  matcher: [
    // Run middleware on all routes except static files and Next.js internals
    '/((?!_next/static|_next/image|favicon.ico|.*\\.(?:svg|png|jpg|jpeg|gif|webp)$).*)',
  ],
}
```
Two details are easy to get wrong here: the middleware must return supabaseResponse rather than a fresh NextResponse.next(), or the refreshed session cookies are dropped; and the server-side auth client is import 'server-only', so it never reaches a Client Component.

Rate Limiting with Upstash Redis
Rate limiting must exist before any AI call can be made. This is not an optimization; it is a business requirement. Without it, a single user or a viral launch can generate an unbounded API bill.
Upstash provides serverless Redis with per-request billing, which fits Vercel's serverless execution model. The @upstash/ratelimit package provides multiple algorithm implementations. For AI SaaS, use sliding window: it provides smooth rate limiting that prevents burst abuse while allowing legitimate usage.
The rate limiter uses the authenticated user ID as the identifier, not the IP address. IP-based rate limiting is trivially bypassed with a VPN. User ID-based limiting requires authentication, which means every limited request is traceable to a specific account.
Two rate limits are enforced: a per-minute limit that prevents burst abuse (10 requests per minute per user) and a per-day limit that enforces the daily token budget (100 requests per day on the free tier). Both checks happen before the AI call is initiated.
```typescript
import 'server-only'
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'
import { env } from '@/lib/env'

const redis = new Redis({
  url: env.UPSTASH_REDIS_REST_URL,
  token: env.UPSTASH_REDIS_REST_TOKEN,
})

// Per-minute limit: prevents burst abuse.
// 10 requests per minute per user, sliding window.
export const minuteRatelimit = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(10, '1 m'),
  analytics: true,
  prefix: 'ratelimit:minute',
})

// Per-day limit: enforces the daily request budget on the free tier.
// 100 requests per day per user, sliding window.
export const dailyRatelimit = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(100, '24 h'),
  analytics: true,
  prefix: 'ratelimit:daily',
})

export interface RateLimitResult {
  success: boolean
  limit: number
  remaining: number
  reset: number // Unix timestamp when the window resets
  reason?: 'minute_limit' | 'daily_limit'
}

// Check both rate limits for a user.
// Returns the most restrictive result.
export async function checkRateLimit(userId: string): Promise<RateLimitResult> {
  const [minuteResult, dailyResult] = await Promise.all([
    minuteRatelimit.limit(userId),
    dailyRatelimit.limit(userId),
  ])

  // Per-minute limit hit
  if (!minuteResult.success) {
    return {
      success: false,
      limit: minuteResult.limit,
      remaining: minuteResult.remaining,
      reset: minuteResult.reset,
      reason: 'minute_limit',
    }
  }

  // Per-day limit hit
  if (!dailyResult.success) {
    return {
      success: false,
      limit: dailyResult.limit,
      remaining: dailyResult.remaining,
      reset: dailyResult.reset,
      reason: 'daily_limit',
    }
  }

  return {
    success: true,
    limit: dailyResult.limit,
    remaining: dailyResult.remaining,
    reset: dailyResult.reset,
  }
}
```
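When a limit is hit, the Route Handler should tell the client when it may retry. A small helper along these lines (hypothetical, not one of the guide's modules; it assumes reset is a millisecond Unix timestamp, which is what @upstash/ratelimit returns) converts a rate limit result into standard response headers:

```typescript
// Hypothetical helper: turn a rate limit result into HTTP 429 response headers.
// Assumes `reset` is a Unix timestamp in milliseconds.
interface LimitInfo {
  limit: number
  remaining: number
  reset: number
}

function rateLimitHeaders(
  result: LimitInfo,
  nowMs: number = Date.now()
): Record<string, string> {
  // Round up so clients never retry a moment too early
  const retryAfterSeconds = Math.max(0, Math.ceil((result.reset - nowMs) / 1000))
  return {
    'X-RateLimit-Limit': String(result.limit),
    'X-RateLimit-Remaining': String(result.remaining),
    'Retry-After': String(retryAfterSeconds),
  }
}
```

Attach these headers to the 429 response so the useChat client can surface a "try again in N seconds" message instead of a generic error.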
- Rate limit check: Is this user making too many requests per minute? Reject if yes.
- Token quota check: Does this user have remaining tokens for this month? Reject if no.
- AI call: execute only after both checks pass.
- Usage record: write the actual token count after the response completes.
- Never reverse this order. A quota check after the call is too late: the cost has already been incurred.
AI Orchestration with Vercel AI SDK
AI streaming responses require a Route Handler, not a Server Action. This is a critical architectural distinction. Server Actions return serializable values; they cannot return streaming Response objects with ReadableStream bodies. The Vercel AI SDK's useChat hook expects to call an HTTP endpoint that responds with a streaming body. That pattern requires a Route Handler at app/api/chat/route.ts.
Server Actions are appropriate for non-streaming AI operations: generating titles, summarizing content, classifying text, or any operation where you wait for the complete response before returning. For streaming chat responses (the primary use case in this guide), use a Route Handler.
The Route Handler enforces the complete request lifecycle: authenticate the user, check rate limits, verify token quota, call the AI model, record usage on completion, and return the streaming response. Every step happens server-side. The API key is a server-only environment variable that never appears in client bundles.
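To make that control flow concrete, here is a sketch of the lifecycle with the external pieces (auth, rate limiting, quota lookup, the model call) injected as parameters. This illustrates the ordering only; it is not the guide's actual route file. In the real app/api/chat/route.ts these dependencies would come from the Supabase server client, checkRateLimit, a profile query, and streamText(...).toDataStreamResponse():

```typescript
// Sketch of the chat Route Handler's control flow, with dependencies injected.
// The helper names below are illustrative stand-ins for the guide's modules.
type ChatDeps = {
  getUserId: (req: Request) => Promise<string | null>        // e.g. Supabase getUser()
  checkRateLimit: (userId: string) => Promise<{ success: boolean }>
  getRemainingTokens: (userId: string) => Promise<number>    // monthly quota lookup
  streamCompletion: (messages: unknown[]) => Response        // e.g. streamText(...).toDataStreamResponse()
}

function jsonError(message: string, status: number): Response {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: { 'content-type': 'application/json' },
  })
}

export function makeChatHandler(deps: ChatDeps) {
  // The order is the whole point: cheap checks first, AI call last.
  return async function POST(req: Request): Promise<Response> {
    const userId = await deps.getUserId(req)
    if (!userId) return jsonError('unauthenticated', 401)

    const rate = await deps.checkRateLimit(userId)
    if (!rate.success) return jsonError('rate_limited', 429)

    const remaining = await deps.getRemainingTokens(userId)
    if (remaining <= 0) return jsonError('quota_exceeded', 402)

    const { messages } = await req.json()
    return deps.streamCompletion(messages)
  }
}
```

Structuring the handler as a factory over its dependencies also makes the rejection paths trivially unit-testable without network access.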
```typescript
import 'server-only'
import { createOpenAI } from '@ai-sdk/openai'
import { createAnthropic } from '@ai-sdk/anthropic'
import { env } from '@/lib/env'

// AI clients are server-only, never imported in Client Components.
// The server-only import above causes a build error if this file
// is transitively imported by any Client Component.
export const openai = createOpenAI({
  apiKey: env.OPENAI_API_KEY,
})

export const anthropic = env.ANTHROPIC_API_KEY
  ? createAnthropic({ apiKey: env.ANTHROPIC_API_KEY })
  : null

// Model selection with fallback.
// Primary: OpenAI gpt-4o-mini (cost-effective for most queries)
// Fallback: Anthropic claude-3-haiku (if OpenAI is unavailable)
export type SupportedModel =
  | 'gpt-4o'
  | 'gpt-4o-mini'
  | 'claude-3-5-sonnet-latest'
  | 'claude-3-haiku-20240307'

export function getModel(modelId: SupportedModel) {
  if (modelId.startsWith('gpt-')) {
    return openai(modelId)
  }
  if (modelId.startsWith('claude-') && anthropic) {
    return anthropic(modelId)
  }
  // Default fallback
  return openai('gpt-4o-mini')
}

// Approximate cost calculation for usage recording.
// Prices in USD per 1M tokens; update when provider pricing changes.
const MODEL_COSTS: Record<SupportedModel, { input: number; output: number }> = {
  'gpt-4o': { input: 2.50, output: 10.00 },
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'claude-3-5-sonnet-latest': { input: 3.00, output: 15.00 },
  'claude-3-haiku-20240307': { input: 0.25, output: 1.25 },
}

export function calculateCost(
  model: SupportedModel,
  promptTokens: number,
  completionTokens: number
): number {
  const costs = MODEL_COSTS[model] ?? MODEL_COSTS['gpt-4o-mini']
  return (
    (promptTokens * costs.input + completionTokens * costs.output) / 1_000_000
  )
}
```
Stripe Metered Billing
Stripe metered billing ties AI token consumption to revenue automatically. Each AI response records a meter event in Stripe. At the end of the billing period, Stripe aggregates all events and generates an invoice.
The integration has two paths. Subscription creation: a Checkout Session creates the customer, subscription, and payment method in one step. Usage recording: a meter event is created after each AI response, containing the token count for that response.
Webhooks handle the subscription lifecycle: payment success, payment failure, subscription cancellation, and trial expiry. The webhook handler must be idempotent: Stripe retries failed webhooks, and processing the same event twice can double-credit or double-charge a customer. The processed_webhook_events table (created in the schema migration) stores event IDs to prevent duplicate processing.
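The idempotency requirement reduces to a check-then-process guard. Here is a sketch with the event store injected; the store abstraction is illustrative, and in the real handler it would be an insert into processed_webhook_events that detects a primary-key conflict:

```typescript
// Sketch of webhook idempotency: process each Stripe event ID at most once.
type EventStore = {
  // Returns true if the event ID was newly recorded, false if already seen.
  recordIfNew: (eventId: string) => Promise<boolean>
}

async function handleWebhookOnce(
  eventId: string,
  store: EventStore,
  processEvent: () => Promise<void>
): Promise<'processed' | 'duplicate'> {
  const isNew = await store.recordIfNew(eventId)
  // Stripe retries delivery: acknowledge duplicates without reprocessing,
  // otherwise a retried invoice event could double-credit a customer.
  if (!isNew) return 'duplicate'
  await processEvent()
  return 'processed'
}
```

Return a 200 for duplicates as well; a non-2xx response would only make Stripe retry the same event again.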
The webhook handler verifies the signature against the raw, unparsed request body. In a Route Handler, read it with request.text() and pass that exact string to stripe.webhooks.constructEvent(); calling request.json() first parses (and consumes) the body, and verification will fail.
```typescript
import 'server-only'
import Stripe from 'stripe'
import { env } from '@/lib/env'

export const stripe = new Stripe(env.STRIPE_SECRET_KEY, {
  // Pin to a specific API version at project start.
  // Update deliberately when Stripe releases breaking changes.
  apiVersion: '2025-01-27.acacia',
  typescript: true,
})

export async function createCustomer(
  userId: string,
  email: string
): Promise<string> {
  const customer = await stripe.customers.create({
    email,
    metadata: { supabase_user_id: userId },
  })
  return customer.id
}

export async function createCheckoutSession(
  userId: string,
  email: string,
  stripeCustomerId: string | null,
  priceId: string
): Promise<string> {
  const session = await stripe.checkout.sessions.create({
    customer: stripeCustomerId ?? undefined,
    customer_email: stripeCustomerId ? undefined : email,
    client_reference_id: userId,
    line_items: [{ price: priceId, quantity: 1 }],
    mode: 'subscription',
    success_url: `${env.NEXT_PUBLIC_APP_URL}/dashboard?upgraded=true`,
    cancel_url: `${env.NEXT_PUBLIC_APP_URL}/billing`,
    // Collect tax automatically: required for most jurisdictions
    automatic_tax: { enabled: true },
    // Allow promo codes for growth campaigns
    allow_promotion_codes: true,
  })
  if (!session.url) throw new Error('Failed to create checkout session URL')
  return session.url
}

export async function createBillingPortalSession(
  stripeCustomerId: string
): Promise<string> {
  const session = await stripe.billingPortal.sessions.create({
    customer: stripeCustomerId,
    return_url: `${env.NEXT_PUBLIC_APP_URL}/billing`,
  })
  return session.url
}

export async function recordMeterEvent(
  stripeCustomerId: string,
  totalTokens: number,
  idempotencyKey: string
): Promise<void> {
  await stripe.billing.meterEvents.create(
    {
      event_name: 'ai_tokens_used',
      payload: {
        stripe_customer_id: stripeCustomerId,
        value: totalTokens.toString(),
      },
    },
    // Stripe idempotency key prevents duplicate meter events
    { idempotencyKey }
  )
}
```
The classic mistake here is calling request.json() before stripe.webhooks.constructEvent(); signature verification needs the raw body from request.text().

Deployment and Environment Isolation
The application deploys to Vercel with three isolated environments: local development, preview (one per pull request), and production. Isolation means each environment has its own Supabase project, its own Stripe account in test mode for preview and production mode for production, and its own set of environment variables.
Sharing a Supabase project or Stripe account between environments is a common mistake with expensive consequences. A migration that runs correctly in preview can corrupt production if the environments share the same database. A test webhook can flip a production user's subscription status.
The deployment checklist ensures nothing is missed across all three environments.
After initial deployment, reset the monthly token usage counter for all users at the start of each billing period. This runs as a scheduled Supabase Edge Function or a cron job via Vercel Cron.
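One plausible shape for that reset job (a sketch, not the guide's actual function): run a single SQL statement on a monthly schedule, executed with the service role, since RLS would otherwise block a cross-user update.

```sql
-- Monthly reset, executed by a scheduled function at the start of each billing period.
-- Must run as the service role: RLS blocks cross-user updates otherwise.
update public.profiles
set token_usage_current_month = 0;
```

If you trigger it from Vercel Cron, protect the invoking endpoint with a shared secret so the reset cannot be triggered by arbitrary callers.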
```yaml
name: CI and Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      # Build step catches missing env vars at build time via the Zod schema
      - run: npm run build
        env:
          NEXT_PUBLIC_SUPABASE_URL: ${{ secrets.PREVIEW_SUPABASE_URL }}
          NEXT_PUBLIC_SUPABASE_ANON_KEY: ${{ secrets.PREVIEW_SUPABASE_ANON_KEY }}
          SUPABASE_SERVICE_ROLE_KEY: ${{ secrets.PREVIEW_SUPABASE_SERVICE_ROLE }}
          STRIPE_SECRET_KEY: ${{ secrets.PREVIEW_STRIPE_SECRET_KEY }}
          STRIPE_WEBHOOK_SECRET: ${{ secrets.PREVIEW_STRIPE_WEBHOOK_SECRET }}
          NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY: ${{ secrets.PREVIEW_STRIPE_PUBLISHABLE_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          UPSTASH_REDIS_REST_URL: ${{ secrets.UPSTASH_REDIS_REST_URL }}
          UPSTASH_REDIS_REST_TOKEN: ${{ secrets.UPSTASH_REDIS_REST_TOKEN }}
          NEXT_PUBLIC_APP_URL: https://preview.your-app.com
      - run: npm run lint
```
| Concern | Route Handler (app/api/chat/route.ts) | Server Action | When to Use Server Action |
|---|---|---|---|
| Streaming responses | Supported β toDataStreamResponse() returns a streaming Response | Not supported β Server Actions return serializable values, not ReadableStream | Never for streaming β always use Route Handler |
| useChat hook compatibility | Full β useChat calls a Route Handler via HTTP POST | Not compatible β useChat expects an HTTP endpoint | Never for useChat |
| Secret management | Automatic β server-only environment variables | Automatic β server-only environment variables | Both are equally secure |
| Non-streaming AI (title generation, summarization) | Works but adds routing overhead | Preferred β no HTTP endpoint needed, direct TypeScript call | All non-streaming AI operations |
| Type safety | Manual β parse request.json() and validate | End-to-end β TypeScript types flow from client to server | Server Actions when type safety matters more than streaming |
| Rate limiting | Applied before model call in the handler | Applied at the top of the action function | Both support rate limiting equally |
| Error handling | Return NextResponse.json with status codes | Throw errors: caught by error boundaries or try/catch in the client | Server Actions when error boundary handling is preferred |
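The first row of the table is the crux: a Route Handler can return a raw `Response` wrapping a `ReadableStream`, which a Server Action cannot. A minimal sketch, using only web-standard APIs (the route path and the hard-coded chunks are illustrative; a real handler would pipe the AI SDK's stream instead):

```typescript
// Hypothetical app/api/stream/route.ts — demonstrates why streaming needs a
// Route Handler: the return value is a Response whose body is a ReadableStream,
// which is not a serializable value a Server Action could return.
export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();
  const chunks = ['Hello', ' ', 'world'];

  const stream = new ReadableStream({
    start(controller) {
      // A real handler would enqueue model tokens as they arrive
      for (const chunk of chunks) controller.enqueue(encoder.encode(chunk));
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
```

With the Vercel AI SDK, `streamText(...).toDataStreamResponse()` produces exactly this kind of streaming `Response`, which is why `useChat` points at a Route Handler URL.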
🎯 Key Takeaways
- Build in this exact order: scaffold, schema, auth, rate limiting, AI, billing, deployment. Each layer depends on the previous one; skipping any layer creates expensive rework.
- Use import 'server-only' in every file that handles API keys, the Stripe client, or the Supabase admin client. This is compile-time enforcement that prevents the most common AI SaaS security mistake.
- Streaming AI responses require a Route Handler, not a Server Action. Server Actions cannot return ReadableStream responses. Non-streaming AI operations belong in Server Actions.
- Rate limiting uses the authenticated user ID as the identifier, not the IP address. Both limits, the per-minute burst and the per-day budget, must be checked before every AI call.
- Stripe webhook handlers must be idempotent. Store the event ID before processing and check for duplicates on every request. A non-idempotent handler will eventually double-charge a customer.
- Set a hard spending cap in the OpenAI dashboard as the last line of defense. Rate limiting protects against abuse. The spending cap protects against bugs in your rate limiting code.
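The rate-limiting takeaway above can be sketched with a minimal in-memory sliding-window limiter keyed by user ID. This is a stand-in for the Upstash-backed limiter used in production (the class and its API are illustrative, not the Upstash SDK; an in-memory map does not survive serverless cold starts, which is exactly why Redis is used in practice):

```typescript
// Minimal sliding-window rate limiter keyed by the authenticated user ID.
// Illustrative only: production uses Redis so the window is shared across instances.
export class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // userId -> timestamps (ms)

  constructor(private limit: number, private windowMs: number) {}

  /** Returns true if the request is allowed, false if over the limit. */
  check(userId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only requests inside the current window
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent);
      return false; // over the per-minute burst limit — reject before the AI call
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}
```

The same shape applies to the per-day budget: a second limiter with a 24-hour window checked immediately after the burst limiter, both before any model call.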
❌ Common Mistakes to Avoid
Interview Questions on This Topic
- Q (Senior): How would you design the architecture for a multi-tenant AI SaaS that must not expose API keys to the client?
- Q (Mid-level): Explain how you would implement rate limiting for an AI SaaS without blocking legitimate users.
- Q (Junior): What is Row Level Security and how does it differ from application-level authorization?
- Q (Mid-level): Why use a Route Handler instead of a Server Action for streaming AI responses, and when would you use a Server Action?
Frequently Asked Questions
Why Next.js 15 App Router instead of a separate Express or Fastify backend?
Next.js 15 App Router with Route Handlers and Server Actions eliminates the need for a separate API service. Route Handlers handle webhooks and streaming responses. Server Actions handle mutations. The framework manages routing, rendering, and deployment in a single Vercel project. A separate backend adds: a separate deployment pipeline, CORS configuration between frontend and backend, shared type definitions that must be kept in sync, and an additional service to monitor and scale. For most AI SaaS applications, none of these trade-offs are worth the added complexity. Migrate to a separate backend when you need to scale API and frontend independently, support non-HTTP protocols like WebSockets at scale, or have multiple client applications (web, mobile, CLI) that share an API.
Can I use a different database instead of Supabase?
Yes, with trade-offs. The architecture works with any PostgreSQL database that supports Row Level Security; Neon is a strong alternative with instant branch databases for development environments. For auth, you would add Clerk or Auth.js as a separate service. For storage, you would add an S3-compatible service. Supabase is chosen because it bundles PostgreSQL with RLS, auth, and storage behind a single SDK and dashboard, giving you one billing relationship instead of three or four. The trade-off in switching is integration complexity: you gain flexibility in each component but lose the integrated auth-to-database session flow that makes Supabase's RLS with auth.uid() work automatically.
How do I handle AI model provider outages?
The getModel() function in src/lib/ai.ts supports multiple providers. For resilience, wrap the streamText call in a try/catch and retry with the fallback model if the primary returns a 503 or timeout. Store the model used in each conversation; when a user resumes a conversation, use the same model that started it for consistency. Monitor provider status pages and configure Sentry alerts for elevated AI error rates. For production applications with strict availability requirements, implement exponential backoff with jitter for retries and expose the current provider's status on your application's status page.
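The try/catch-with-fallback pattern described above can be factored into a small generic helper. A minimal sketch: `primary` and `fallback` stand in for streamText calls against two different providers, and `isRetryable` encodes which errors justify failing over (the helper and its names are illustrative, not part of the AI SDK):

```typescript
// Run the primary provider; fail over to the fallback only on retryable
// errors (503s, timeouts). Non-retryable errors propagate unchanged so
// bugs are not silently masked by the fallback.
export async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
): Promise<T> {
  try {
    return await primary();
  } catch (err) {
    if (!isRetryable(err)) throw err;
    return await fallback();
  }
}
```

In the chat route this would wrap the two streamText invocations, with `isRetryable` checking the provider error's status code; exponential backoff with jitter can be layered inside `primary` before the failover fires.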
How do I reset monthly token usage at the start of each billing period?
Add a Vercel Cron Job that runs on the first day of each month and resets token_usage_current_month to 0 for all users. The cron job calls a protected API route that uses the Supabase admin client to run an UPDATE on the profiles table. Protect the cron route with a shared secret in the Authorization header to prevent unauthorized resets. Alternatively, use a Supabase Edge Function with the pg_cron extension to run the reset as a scheduled database job, which keeps the reset logic closer to the data and does not require an external HTTP call.
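The shared-secret check for the cron route can be sketched as follows (the function name and the `Bearer <secret>` header shape are assumptions; the secret itself would come from an environment variable). Using a constant-time comparison means an attacker cannot guess the secret byte-by-byte from response timing:

```typescript
import { timingSafeEqual } from 'node:crypto';

// Guard for the cron route: only requests carrying the shared secret in the
// Authorization header may trigger the monthly usage reset.
export function isAuthorizedCron(authHeader: string | null, secret: string): boolean {
  const expected = Buffer.from(`Bearer ${secret}`);
  const actual = Buffer.from(authHeader ?? '');
  // timingSafeEqual throws on unequal lengths, so check length first
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

In the Route Handler, a failed check should return a 401 before the Supabase admin client is ever touched.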
When should I migrate from this stack to something more complex?
Migrate specific components when you hit concrete limits, not anticipated ones. Migrate from Supabase to self-hosted Postgres when Supabase's pricing or connection limits become a real cost, not a theoretical concern. Migrate from Next.js Route Handlers to a separate backend when you have multiple client applications or need to scale API traffic independently of frontend traffic. Migrate from Stripe metered billing to a custom billing engine when Stripe's pricing model does not match your revenue structure. The stack in this guide handles hundreds of thousands of users and millions of AI calls per month. Most AI SaaS applications will not outgrow it. Premature architectural migration is the most common way to turn a two-week task into a two-month project.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.