Mid-level 7 min · April 14, 2026

Next.js 15 AI SaaS — API Key Leak to Five-Figure Invoice

A client-bundled OpenAI key caused overnight five-figure charges during a free beta.

Naren Founder & Principal Engineer

20+ years shipping production Java in banking & fintech. Every example here is drawn from a real system.

Last updated: May 23, 2026
Quick Answer
  • This guide builds a complete, deployable AI SaaS: Next.js 15 App Router, Supabase Auth + RLS, Stripe metered billing, Vercel AI SDK streaming via Route Handlers
  • Build in this exact order: scaffold, env validation, database schema, auth middleware, rate limiting, AI Route Handler, Stripe billing, deployment
  • Biggest risk: shipping AI features before rate limiting and billing — a single viral post can generate a five-figure API bill in hours
Definition
What is Next.js 15 AI SaaS — API Key Leak to Five-Figure Invoice?

Building an AI SaaS on Next.js 15 means you're shipping a full-stack application where the server handles API key management, usage tracking, and billing, not just rendering UI. The core challenge is that provider API keys (OpenAI, Anthropic) are high-value targets; a single leak in client-side code or a misconfigured environment variable can result in a five-figure invoice within hours.

This article walks through the exact architecture that prevents that: server-only API routes, Row Level Security in PostgreSQL to isolate tenant data, and Upstash Redis for distributed rate limiting that survives serverless cold starts. You're not just building a chat interface—you're building a billing firewall.

The stack is deliberately opinionated: Next.js 15 with the App Router for Server Components and Route Handlers, Supabase for auth and a PostgreSQL database with RLS policies that scope every row to its owner, and Upstash Redis for rate limiting whose state lives outside the stateless functions Vercel runs. Schema changes ship as SQL migrations through the Supabase CLI.

The alternative, exposing keys through client-visible env vars or accepting them in client requests, is exactly how leaks happen. With this setup, even if someone inspects the frontend bundle, the API keys are not in it: they exist only in the server's environment.

Where this fits: if you're building a consumer AI app (e.g., a chatbot wrapper), this architecture is overkill—you'd just proxy through your own key. But for a SaaS where customers bring their own API keys or you resell tokens, this is the minimum viable security posture.

The rate limiting with Upstash is non-negotiable: without it, a single user can exhaust your OpenAI quota in minutes. Redis's atomic increment operations keep each check to a single fast round trip and scale to thousands of concurrent requests without database locks.

Plain-English First

Building an AI SaaS is like opening a restaurant. The AI model is a catering service you pay per plate — impressive, but it charges for every dish whether or not you have paying customers. You need a cash register (Stripe) before you open the doors, a guest list with spending limits (Supabase Auth + rate limiting), and a kitchen layout that keeps the catering contract private (server-side API calls). This guide builds the entire restaurant before the first customer walks in.

Most AI SaaS tutorials stop at a chatbot demo. They skip the infrastructure that turns a prototype into a product: authentication, billing, multi-tenancy, rate limiting, and usage tracking. Without these, you have a tech demo, not a business.

This guide builds a complete, deployable AI SaaS application from the first commit to production. By the end, you have a running application with: authenticated users scoped to their own data, AI chat with streaming responses, per-user token quotas enforced before every AI call, Stripe metered billing that converts token consumption into revenue, and a Vercel deployment with isolated environments.

The stack is deliberate: Next.js 15 App Router for the application layer, Supabase for auth and data, Stripe for metered billing, Vercel AI SDK for model orchestration, and Upstash Redis for rate limiting. Each choice optimizes for developer velocity without sacrificing production readiness.

One warning before we start: the most expensive mistake in AI SaaS is building AI features before solving rate limiting and billing. The production incident below happened to a real team. It shapes every architectural decision in this guide.

What Building an AI SaaS on Next.js 15 Actually Entails

Building an AI SaaS on Next.js 15 means constructing a full-stack application where the frontend, Route Handlers, and server-side logic are unified under a single framework, while the AI inference layer runs externally (e.g., OpenAI, Anthropic, or a self-hosted model). The core mechanic is that server-side code (React Server Components, Server Actions, and Route Handlers) handles prompt construction, API key management, and response streaming without exposing secrets to the client. This architecture removes the need for a separate orchestration backend, which reduces latency and deployment complexity.

In practice, server-side code calls the AI provider's API using environment variables (e.g., process.env.OPENAI_API_KEY). Non-streaming operations fit Server Actions, which the client invokes via form submissions or useActionState and which return a complete value. Streaming chat responses go through a Route Handler that returns a streamed HTTP response. Key properties: all API keys stay server-side, rate limiting is enforced on the server, and billing logic (token counting, usage tracking) runs in the same request lifecycle. This tight coupling means a single misconfigured environment variable or a missing 'use server' directive can leak credentials or break the entire pipeline.

You should use this pattern when your AI SaaS requires low-latency interactions (e.g., chat, code generation) and you want to avoid managing a separate Node.js or Python backend. It's especially relevant for startups or teams shipping quickly, as it reduces the surface area for security audits. However, for high-throughput or multi-model orchestration, a dedicated backend service still wins: serverless functions have cold starts and execution time limits (set by your hosting plan and configuration) that can cut off long-running AI workflows.

API Key Exposure via Server Actions
A Server Action that logs the request body to the console in development can print your AI provider key if you accidentally pass it as a parameter — always use environment variables, not function arguments.
Production Insight
A real incident: A startup deployed a Next.js 15 AI chat app where the Server Action accepted an apiKey parameter from the client for flexibility. During a demo, a user inspected the network tab, saw the key in the request payload, and used it to generate $12,000 in OpenAI bills overnight.
Symptom: The server action logs showed apiKey in the request body; the client-side code had a hidden input field for the key.
Rule of thumb: Never accept API keys from client input — always read them from server-only environment variables and validate that no client-side code references them.
Key Takeaway
1. All AI provider API keys must live in server-only environment variables — never pass them through client requests or Server Action arguments.
2. Use Next.js 15's Server Actions with 'use server' directives to keep prompt construction and token counting server-side, preventing client-side manipulation of billing.
3. Monitor serverless function timeouts (the limit depends on your hosting plan and configuration) and cold starts. For streaming AI responses, consider the Edge Runtime or a dedicated backend to avoid dropped connections.
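
The first takeaway can be backed by a small server-side guard. The following is a minimal sketch, not part of Next.js or the AI SDK: `ChatRequestBody`, `FORBIDDEN_FIELDS`, and `parseChatRequest` are hypothetical names chosen for illustration. It whitelists the fields a client may send and rejects any payload that carries a key-like field, so the provider key can only come from the server's environment.

```typescript
// Hypothetical request shape: the only fields a client may send.
interface ChatRequestBody {
  conversationId: string
  message: string
}

// Field names that must never arrive from the client (illustrative list).
const FORBIDDEN_FIELDS = ['apikey', 'api_key', 'openaiapikey', 'authorization']

type ParseResult =
  | { ok: true; body: ChatRequestBody }
  | { ok: false; error: string }

// Validates an untrusted JSON payload. Key-like fields are rejected outright
// instead of being silently ignored, so a misbehaving client is noticed.
function parseChatRequest(payload: unknown): ParseResult {
  if (typeof payload !== 'object' || payload === null || Array.isArray(payload)) {
    return { ok: false, error: 'Body must be a JSON object' }
  }
  const record = payload as Record<string, unknown>

  for (const key of Object.keys(record)) {
    if (FORBIDDEN_FIELDS.includes(key.toLowerCase())) {
      return { ok: false, error: `Field "${key}" is not accepted from clients` }
    }
  }

  const { conversationId, message } = record
  if (typeof conversationId !== 'string' || conversationId.length === 0) {
    return { ok: false, error: 'conversationId is required' }
  }
  if (typeof message !== 'string' || message.trim().length === 0) {
    return { ok: false, error: 'message is required' }
  }
  // Return only the whitelisted fields; anything else in the payload is dropped.
  return { ok: true, body: { conversationId, message } }
}
```

A handler would call this on the parsed request body before doing any other work and respond with a 400 status when `ok` is false.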

Prerequisites and Stack Overview

Before starting, verify the following accounts and tools are available.

Accounts required: Supabase (free tier is sufficient to start), Stripe (test mode for development), Vercel (Hobby plan supports this stack), Upstash (free tier supports the rate limiting patterns in this guide), and an OpenAI account with API access.

Local tools required: Node.js 20+ or Bun 1.1+, the Supabase CLI for migrations, the Stripe CLI for local webhook testing, and Git.

The stack choices are deliberate. Next.js 15 App Router provides Server Components, Route Handlers, and Server Actions in a single deployable unit — no separate backend service. Supabase bundles authentication, PostgreSQL with Row Level Security, and file storage behind one SDK. Stripe metered billing maps token consumption to revenue without custom invoice logic. Vercel AI SDK abstracts the AI model provider — swap between OpenAI and Anthropic without changing application code. Upstash provides serverless Redis with a per-request pricing model that suits Vercel's serverless execution model.

terminal (Bash)
# Verify prerequisites
node --version          # Must be 20+
bun --version           # Alternative to Node; must be 1.1+
supabase --version      # Supabase CLI
stripe --version        # Stripe CLI

# Create Next.js 15 project
npx create-next-app@latest ai-saas \
  --typescript \
  --tailwind \
  --eslint \
  --app \
  --src-dir \
  --import-alias '@/*'

cd ai-saas

# Install all dependencies
npm install \
  @supabase/supabase-js \
  @supabase/ssr \
  stripe \
  @stripe/stripe-js \
  ai \
  @ai-sdk/openai \
  @ai-sdk/anthropic \
  zod \
  server-only \
  @upstash/redis \
  @upstash/ratelimit

# Initialize Supabase locally
supabase init
supabase start
Install server-only — It Prevents Key Exposure
  • The server-only package causes a build error if a server-only module is imported in a Client Component.
  • Add import 'server-only' to any file that imports API keys, the Stripe client, or the Supabase service role client.
  • This is the automated enforcement of the rule that prevented the production incident — do not skip it.
  • The build fails loudly at compile time rather than silently at runtime when a key leaks.
Production Insight
server-only is a zero-cost compile-time guard against the most common AI SaaS security mistake. The npm package contains one line of code that throws a build error when imported in a client bundle. Install it before writing any server-side logic.
Key Takeaway
Five accounts, three CLI tools, one npm install. Verify prerequisites before writing code — a missing CLI tool or account will block a specific step and waste hours of debugging.

Project Scaffold and Environment Validation

The project structure sets conventions for the entire codebase. Every module has a predictable location. Every import path is explicit.

The directory structure separates concerns: lib contains all server-side integrations (Supabase, Stripe, AI), components contains the UI, and app contains routing. Within lib, each integration is an isolated module. This prevents circular imports and makes each module independently testable.

Environment variables are validated at startup using a Zod schema. Missing or malformed variables fail the build — not a customer request at 2am. The validation runs once at module load time on the server. Client-safe variables use the NEXT_PUBLIC_ prefix; server-only variables do not.

src/lib/env.ts (TypeScript)
import { z } from 'zod'

const envSchema = z.object({
  // Supabase — NEXT_PUBLIC_ vars are safe to expose to the client
  NEXT_PUBLIC_SUPABASE_URL: z.string().url(),
  NEXT_PUBLIC_SUPABASE_ANON_KEY: z.string().min(1),
  // Service role key bypasses RLS — server-only, never NEXT_PUBLIC_
  SUPABASE_SERVICE_ROLE_KEY: z.string().min(1),

  // Stripe — publishable key is safe for client; secret key is server-only
  STRIPE_SECRET_KEY: z.string().startsWith('sk_'),
  STRIPE_WEBHOOK_SECRET: z.string().startsWith('whsec_'),
  NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY: z.string().startsWith('pk_'),

  // AI providers — server-only, never NEXT_PUBLIC_
  // OpenAI keys now support both sk- (legacy) and sk-proj- (project-scoped) formats
  OPENAI_API_KEY: z.string().min(1),
  ANTHROPIC_API_KEY: z.string().startsWith('sk-ant-').optional(),

  // Upstash Redis — for rate limiting
  UPSTASH_REDIS_REST_URL: z.string().url(),
  UPSTASH_REDIS_REST_TOKEN: z.string().min(1),

  // App
  NEXT_PUBLIC_APP_URL: z.string().url(),
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
})

// Parse at module load time — build fails if any required variable is missing
export const env = envSchema.parse(process.env)

export type Env = z.infer<typeof envSchema>
The .env.example Discipline
Commit a .env.example file with every variable name and a placeholder value. This file is the authoritative list of required variables. When a new engineer clones the repository, they copy .env.example to .env.local and fill in real values. The Zod schema in env.ts enforces that every variable in .env.example has a valid value before the app starts.
Production Insight
Zod env validation at module load time catches missing variables during next build, not during a customer request. This pattern prevents the most common category of production deployment failures — missing environment variables that only surface when a specific feature is triggered.
Key Takeaway
Directory structure enforces module boundaries. Zod env validation fails fast at build time. The server-only package enforces at compile time what code review misses at review time.
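
The .env.example discipline described above can also be checked mechanically. This is a minimal sketch under stated assumptions: `parseEnvExampleKeys` and `findMissingEnvVars` are hypothetical helpers, and the caller is expected to read the file contents and pass in an environment map (for example, process.env on the server). It only compares names and emptiness; value formats remain the job of the Zod schema.

```typescript
// Extracts variable names from dotenv-style text.
// Skips blank lines and comments; accepts an optional "export " prefix.
function parseEnvExampleKeys(exampleText: string): string[] {
  const keys: string[] = []
  for (const rawLine of exampleText.split(/\r?\n/)) {
    const line = rawLine.trim()
    if (line === '' || line.startsWith('#')) continue
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line)
    if (match && match[1] !== undefined) keys.push(match[1])
  }
  return keys
}

// Returns every variable listed in .env.example that is missing or empty
// in the supplied environment map.
function findMissingEnvVars(
  exampleText: string,
  envMap: Record<string, string | undefined>
): string[] {
  return parseEnvExampleKeys(exampleText).filter((key) => {
    const value = envMap[key]
    return value === undefined || value.trim() === ''
  })
}
```

Because every listed name is treated as required, an optional variable such as ANTHROPIC_API_KEY would need to be commented out in .env.example or filtered from the result.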

Database Schema and Row Level Security

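The schema is designed for multi-tenancy from the first migration. Every tenant-scoped table carries the owning user's ID: profiles uses id (the auth.users ID), and conversations, messages, and usage_records each have a user_id column. Row Level Security policies enforce that users access only their own rows. No application-level authorization code is needed for basic data access because the database enforces it.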

The schema has three domains. User management: profiles extends the Supabase auth.users table with application-specific fields including the Stripe customer ID and subscription status. Content: conversations and messages store the chat history scoped to each user. Usage tracking: usage_records stores the token count and cost of every AI call, which feeds directly into Stripe metered billing.

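Composite indexes on (user_id, created_at DESC) cover conversations and usage_records. They serve the dominant query pattern: fetch a user's recent items in reverse chronological order. Messages are indexed on (conversation_id, created_at ASC) instead, because they are read one conversation at a time in chronological order, with a separate index on user_id.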

The service role key bypasses RLS. It is used only in webhook handlers and admin operations — never in user-facing request handlers.

supabase/migrations/00001_initial_schema.sql (SQL)
-- Enable UUID extension
create extension if not exists "uuid-ossp";

-- ============================================================
-- USER MANAGEMENT
-- ============================================================

-- Profiles extends auth.users — one row per authenticated user
create table public.profiles (
  id uuid references auth.users on delete cascade primary key,
  email text not null,
  full_name text,
  avatar_url text,
  -- Stripe integration
  stripe_customer_id text unique,
  subscription_status text not null default 'free'
    check (subscription_status in ('free', 'active', 'past_due', 'cancelled', 'trialing')),
  -- Token quota enforcement
  token_usage_current_month integer not null default 0,
  token_limit_monthly integer not null default 10000,
  -- Timestamps
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

-- ============================================================
-- CONTENT
-- ============================================================

create table public.conversations (
  id uuid primary key default gen_random_uuid(),
  user_id uuid not null references public.profiles(id) on delete cascade,
  title text not null default 'New Conversation',
  model text not null default 'gpt-4o-mini',
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

create table public.messages (
  id uuid primary key default gen_random_uuid(),
  conversation_id uuid not null references public.conversations(id) on delete cascade,
  user_id uuid not null references public.profiles(id) on delete cascade,
  role text not null check (role in ('user', 'assistant', 'system')),
  content text not null,
  created_at timestamptz not null default now()
);

-- ============================================================
-- USAGE TRACKING — feeds Stripe metered billing
-- ============================================================

create table public.usage_records (
  id uuid primary key default gen_random_uuid(),
  user_id uuid not null references public.profiles(id) on delete cascade,
  conversation_id uuid references public.conversations(id) on delete set null,
  model text not null,
  prompt_tokens integer not null,
  completion_tokens integer not null,
  total_tokens integer generated always as (prompt_tokens + completion_tokens) stored,
  cost_usd numeric(10, 6) not null,
  -- Stripe meter event ID for idempotency
  stripe_meter_event_id text unique,
  created_at timestamptz not null default now()
);

-- ============================================================
-- STRIPE WEBHOOK IDEMPOTENCY
-- ============================================================

create table public.processed_webhook_events (
  stripe_event_id text primary key,
  processed_at timestamptz not null default now()
);

-- ============================================================
-- INDEXES
-- ============================================================

create index idx_conversations_user_created
  on public.conversations(user_id, created_at desc);

create index idx_messages_conversation_created
  on public.messages(conversation_id, created_at asc);

create index idx_messages_user
  on public.messages(user_id);

create index idx_usage_records_user_created
  on public.usage_records(user_id, created_at desc);

-- ============================================================
-- ROW LEVEL SECURITY
-- ============================================================

alter table public.profiles enable row level security;
alter table public.conversations enable row level security;
alter table public.messages enable row level security;
alter table public.usage_records enable row level security;
alter table public.processed_webhook_events enable row level security;

-- Profiles: users read and update their own profile only
create policy "users_select_own_profile"
  on public.profiles for select
  using (auth.uid() = id);

create policy "users_update_own_profile"
  on public.profiles for update
  using (auth.uid() = id);

-- Conversations: users manage their own conversations only
create policy "users_select_own_conversations"
  on public.conversations for select
  using (auth.uid() = user_id);

create policy "users_insert_own_conversations"
  on public.conversations for insert
  with check (auth.uid() = user_id);

create policy "users_update_own_conversations"
  on public.conversations for update
  using (auth.uid() = user_id);

create policy "users_delete_own_conversations"
  on public.conversations for delete
  using (auth.uid() = user_id);

-- Messages: users manage their own messages only
create policy "users_select_own_messages"
  on public.messages for select
  using (auth.uid() = user_id);

create policy "users_insert_own_messages"
  on public.messages for insert
  with check (auth.uid() = user_id);

-- Usage records: users read their own usage only
-- Inserts happen via service role in Route Handlers after AI calls
create policy "users_select_own_usage"
  on public.usage_records for select
  using (auth.uid() = user_id);

-- Webhook events: no direct user access — service role only
-- No policies needed: all access via service role bypasses RLS

-- ============================================================
-- TRIGGERS
-- ============================================================

-- Auto-create profile when a user signs up via Supabase Auth
create or replace function public.handle_new_user()
returns trigger
language plpgsql
security definer set search_path = public
as $$
begin
  insert into public.profiles (id, email, full_name, avatar_url)
  values (
    new.id,
    new.email,
    new.raw_user_meta_data->>'full_name',
    new.raw_user_meta_data->>'avatar_url'
  );
  return new;
end;
$$;

create trigger on_auth_user_created
  after insert on auth.users
  for each row execute procedure public.handle_new_user();

-- Auto-update updated_at timestamp
create or replace function public.handle_updated_at()
returns trigger
language plpgsql
as $$
begin
  new.updated_at = now();
  return new;
end;
$$;

create trigger profiles_updated_at
  before update on public.profiles
  for each row execute procedure public.handle_updated_at();

create trigger conversations_updated_at
  before update on public.conversations
  for each row execute procedure public.handle_updated_at();
RLS as Your Authorization Layer
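  • Every tenant-scoped table stores the owning user's ID (user_id, or id on profiles) and has RLS policies. Application code does not need WHERE user_id = ? for isolation; the database enforces it automatically.
  • The service role key bypasses RLS entirely. Use it only in webhook handlers and admin operations, never in user-facing request handlers.
  • RLS policies must be tested as rigorously as application code. A wrong policy exposes all users' data, regardless of how correct your application code is.
  • RLS is row-level, not column-level. As written, users_update_own_profile lets a user update any column of their own profile row, including subscription_status and token_limit_monthly. Restrict which columns clients can write (for example, with column-level grants) before relying on those fields for billing.
  • The processed_webhook_events table is accessed by the service role only. RLS is enabled with no policies, so every request made with a user's credentials is denied.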
Production Insight
Test your RLS policies before deploying, for example by running queries as a specific user (role impersonation) in the Supabase dashboard. A policy that looks correct in isolation can fail silently when combined with session state. The most common bug: auth.uid() returning null because middleware did not refresh the token.
Key Takeaway
Multi-tenancy is enforced at the database level, not the application level. The processed_webhook_events table is your Stripe idempotency guard. Composite indexes on (user_id, created_at DESC) serve the dominant query pattern for conversations and usage_records.
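
To make the null auth.uid() failure mode concrete, here is a toy model in TypeScript of how a `using (auth.uid() = user_id)` select policy filters rows. It is an illustration only, not how PostgreSQL evaluates policies internally; `MessageRow`, `selectWithPolicy`, and the sample data are hypothetical.

```typescript
interface MessageRow {
  id: string
  user_id: string
  content: string
}

// Toy model of a SELECT policy `using (auth.uid() = user_id)`.
// In SQL, comparing NULL with anything is never true, so a null uid
// matches no rows: the query succeeds but returns an empty result.
function selectWithPolicy(rows: MessageRow[], authUid: string | null): MessageRow[] {
  if (authUid === null) return []
  return rows.filter((row) => row.user_id === authUid)
}

const sampleRows: MessageRow[] = [
  { id: 'm1', user_id: 'alice', content: 'Hello' },
  { id: 'm2', user_id: 'bob', content: 'Hi there' },
  { id: 'm3', user_id: 'alice', content: 'Second message' },
]
```

The point of the model is the last case: an expired or unrefreshed session does not raise an error, it makes the user's data look empty, which is why this bug is easy to misdiagnose.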

Authentication and Middleware

Supabase Auth handles the entire authentication lifecycle. The application does not implement auth logic — it consumes auth state from Supabase.

The middleware pattern is critical and must be implemented exactly as shown. The middleware runs on every request, calls getUser() to validate and refresh the session token, and redirects unauthenticated users away from protected routes. Without it, session tokens expire and users are logged out silently — or worse, RLS policies see a null user ID and block all data access.

The key distinction: always use getUser() in middleware and server-side code, never getSession(). getSession() returns cached session data that may be stale. getUser() makes a network request to Supabase to validate the token and return fresh user data. This is the single most common Supabase Auth bug in Next.js applications.

src/middleware.ts (TypeScript)
import { createServerClient } from '@supabase/ssr'
import { NextResponse, type NextRequest } from 'next/server'

// Routes that require authentication
const PROTECTED_ROUTES = ['/dashboard', '/chat', '/settings', '/billing']
// Routes that redirect authenticated users away
const AUTH_ROUTES = ['/login', '/signup']

export async function middleware(request: NextRequest) {
  let supabaseResponse = NextResponse.next({ request })

  const supabase = createServerClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
    {
      cookies: {
        getAll() {
          return request.cookies.getAll()
        },
        setAll(cookiesToSet) {
          cookiesToSet.forEach(({ name, value }) =>
            request.cookies.set(name, value)
          )
          supabaseResponse = NextResponse.next({ request })
          cookiesToSet.forEach(({ name, value, options }) =>
            supabaseResponse.cookies.set(name, value, options)
          )
        },
      },
    }
  )

  // CRITICAL: Always call getUser() — not getSession()
  // getUser() validates the token server-side and refreshes it if expired
  // getSession() returns cached data that may be stale
  const { data: { user } } = await supabase.auth.getUser()

  const pathname = request.nextUrl.pathname

  // Redirect unauthenticated users away from protected routes
  const isProtectedRoute = PROTECTED_ROUTES.some(route =>
    pathname.startsWith(route)
  )
  if (isProtectedRoute && !user) {
    const redirectUrl = new URL('/login', request.url)
    redirectUrl.searchParams.set('redirectTo', pathname)
    return NextResponse.redirect(redirectUrl)
  }

  // Redirect authenticated users away from auth pages
  const isAuthRoute = AUTH_ROUTES.some(route => pathname.startsWith(route))
  if (isAuthRoute && user) {
    return NextResponse.redirect(new URL('/dashboard', request.url))
  }

  // CRITICAL: Return supabaseResponse — not NextResponse.next()
  // supabaseResponse carries the refreshed session cookies
  return supabaseResponse
}

export const config = {
  matcher: [
    // Run middleware on all routes except static files and Next.js internals
    // The backslash is doubled so the pattern receives a literal "\." (a real dot)
    '/((?!_next/static|_next/image|favicon.ico|.*\\.(?:svg|png|jpg|jpeg|gif|webp)$).*)',
  ],
}
The getUser vs getSession Distinction
  • getUser() makes a server-side request to Supabase to validate and refresh the token. Always use this in middleware and server-side code.
  • getSession() reads from the local cookie without validation. The session data may be expired or tampered with. Never use this for authorization decisions.
  • The middleware must return supabaseResponse — not NextResponse.next(). The supabaseResponse carries the refreshed session cookies back to the browser.
  • If middleware returns NextResponse.next() instead of supabaseResponse, session cookies are not updated and users are logged out on the next request.
Production Insight
The Supabase documentation provides a middleware template that is correct. Copy it exactly. Teams that deviate from it — even slightly — introduce silent session management bugs that are difficult to reproduce because they depend on token expiry timing.
Key Takeaway
Middleware is not optional: it is the session refresh mechanism. Use getUser() over getSession() everywhere. Return supabaseResponse from middleware, not NextResponse.next(). Modules that create a server-side Supabase client should start with import 'server-only' so they never reach a Client Component.
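
The redirect rules inside the middleware are easier to test when they are separated from the Supabase and Next.js objects. The sketch below is one possible refactoring, not the Supabase template: `decideRoute`, `RouteDecision`, and the two prefix constants are hypothetical names, and the route lists mirror the ones in src/middleware.ts.

```typescript
const PROTECTED_PREFIXES = ['/dashboard', '/chat', '/settings', '/billing']
const AUTH_PREFIXES = ['/login', '/signup']

type RouteDecision =
  | { kind: 'continue' }
  | { kind: 'redirect'; location: string }

// Pure version of the middleware's routing rules: no cookies, no network.
function decideRoute(pathname: string, isAuthenticated: boolean): RouteDecision {
  const isProtected = PROTECTED_PREFIXES.some((prefix) => pathname.startsWith(prefix))
  if (isProtected && !isAuthenticated) {
    // Remember where the user was going so login can send them back.
    return {
      kind: 'redirect',
      location: `/login?redirectTo=${encodeURIComponent(pathname)}`,
    }
  }

  const isAuthPage = AUTH_PREFIXES.some((prefix) => pathname.startsWith(prefix))
  if (isAuthPage && isAuthenticated) {
    return { kind: 'redirect', location: '/dashboard' }
  }

  return { kind: 'continue' }
}
```

The middleware would still call getUser() and return supabaseResponse in the 'continue' case; only the decision itself moves into a function that unit tests can call directly. Note that prefix matching also treats a path such as /chatroom as protected, which may or may not be intended.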

Rate Limiting with Upstash Redis

Rate limiting must exist before any AI call can be made. This is not an optimization — it is a business requirement. Without it, a single user or a viral launch can generate an unbounded API bill.

Upstash provides serverless Redis with per-request billing, which fits Vercel's serverless execution model. The @upstash/ratelimit package provides multiple algorithm implementations. For AI SaaS, use sliding window — it provides smooth rate limiting that prevents burst abuse while allowing legitimate usage.

The rate limiter uses the authenticated user ID as the identifier, not the IP address. IP-based rate limiting is trivially bypassed with a VPN. User ID-based limiting requires authentication, which means every limited request is traceable to a specific account.

Two rate limits are enforced: a per-minute limit that prevents burst abuse (10 requests per minute per user) and a per-day limit that enforces the daily request budget (100 requests per day on the free tier). Both checks happen before the AI call is initiated.

src/lib/rate-limit.ts (TypeScript)
import 'server-only'
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'
import { env } from '@/lib/env'

const redis = new Redis({
  url: env.UPSTASH_REDIS_REST_URL,
  token: env.UPSTASH_REDIS_REST_TOKEN,
})

// Per-minute limit — prevents burst abuse
// 10 requests per minute per user, sliding window
export const minuteRatelimit = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(10, '1 m'),
  analytics: true,
  prefix: 'ratelimit:minute',
})

// Per-day limit — enforces daily request budget on free tier
// 100 requests per day per user, sliding window
export const dailyRatelimit = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(100, '24 h'),
  analytics: true,
  prefix: 'ratelimit:daily',
})

export interface RateLimitResult {
  success: boolean
  limit: number
  remaining: number
  reset: number // Unix timestamp (in milliseconds) when the window resets
  reason?: 'minute_limit' | 'daily_limit'
}

// Check both rate limits for a user (each call consumes from both counters)
// Returns the first limit that failed, or the daily limit's counters on success
export async function checkRateLimit(userId: string): Promise<RateLimitResult> {
  const [minuteResult, dailyResult] = await Promise.all([
    minuteRatelimit.limit(userId),
    dailyRatelimit.limit(userId),
  ])

  // Per-minute limit hit
  if (!minuteResult.success) {
    return {
      success: false,
      limit: minuteResult.limit,
      remaining: minuteResult.remaining,
      reset: minuteResult.reset,
      reason: 'minute_limit',
    }
  }

  // Per-day limit hit
  if (!dailyResult.success) {
    return {
      success: false,
      limit: dailyResult.limit,
      remaining: dailyResult.remaining,
      reset: dailyResult.reset,
      reason: 'daily_limit',
    }
  }

  return {
    success: true,
    limit: dailyResult.limit,
    remaining: dailyResult.remaining,
    reset: dailyResult.reset,
  }
}
The Token Budget Mental Model
  • Rate limit check: Is this user making too many requests per minute? Reject if yes.
  • Token quota check: Does this user have remaining tokens for this month? Reject if no.
  • AI call: execute only after both checks pass.
  • Usage record: write the actual token count after the response completes.
  • Never reverse this order. A quota check after the call is too late — the cost has already been incurred.
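
The ordering above can be sketched as a single function with its dependencies injected, which makes the order itself testable. This is an illustrative outline under stated assumptions, not the guide's Route Handler: `guardedAiCall`, `GuardDeps`, `QuotaSnapshot`, and `GuardOutcome` are hypothetical names, and the quota fields mirror token_usage_current_month and token_limit_monthly from the profiles table.

```typescript
interface QuotaSnapshot {
  tokenUsageCurrentMonth: number // mirrors profiles.token_usage_current_month
  tokenLimitMonthly: number // mirrors profiles.token_limit_monthly
}

// Everything that touches the outside world is injected (hypothetical shape).
interface GuardDeps {
  checkRateLimit: (userId: string) => Promise<{ success: boolean }>
  loadQuota: (userId: string) => Promise<QuotaSnapshot>
  callModel: (prompt: string) => Promise<{ text: string; totalTokens: number }>
  recordUsage: (userId: string, totalTokens: number) => Promise<void>
}

type GuardOutcome =
  | { status: 'ok'; text: string }
  | { status: 'rate_limited' }
  | { status: 'quota_exceeded' }

async function guardedAiCall(
  userId: string,
  prompt: string,
  deps: GuardDeps
): Promise<GuardOutcome> {
  // 1. Rate limit first: reject bursts before doing any other work.
  const rate = await deps.checkRateLimit(userId)
  if (!rate.success) return { status: 'rate_limited' }

  // 2. Monthly token quota: reject if the budget is already spent.
  const quota = await deps.loadQuota(userId)
  if (quota.tokenUsageCurrentMonth >= quota.tokenLimitMonthly) {
    return { status: 'quota_exceeded' }
  }

  // 3. Only now is the paid call made.
  const result = await deps.callModel(prompt)

  // 4. Record the actual token count after the response completes.
  await deps.recordUsage(userId, result.totalTokens)
  return { status: 'ok', text: result.text }
}
```

With this shape, a test can pass in fake dependencies and assert that callModel is never invoked when either check fails.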
Production Insight
Rate limiting by user ID requires the user to be authenticated before the check runs. This is intentional — it means unauthenticated requests cannot even reach the AI call. Authentication and rate limiting work together as a two-layer access control system.
Key Takeaway
Two rate limits: per-minute burst protection and per-day budget enforcement. User ID as identifier, not IP. Both checks before the AI call — never after. Upstash sliding window prevents the burst abuse that a fixed window allows at the window boundary.
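
The window-boundary problem mentioned above is easiest to see with two small in-memory limiters. The sketch below is for illustration only: it is not Upstash's implementation (which keeps its state in Redis), and the sliding variant here is a simple timestamp log. `createFixedWindowLimiter`, `createSlidingWindowLimiter`, and `countAllowed` are hypothetical names; time is passed in explicitly so the behavior is deterministic.

```typescript
interface Limiter {
  allow(nowMs: number): boolean
}

// Fixed window: one counter per window index (floor(now / windowMs)).
// The counter resets abruptly when the index changes.
function createFixedWindowLimiter(limit: number, windowMs: number): Limiter {
  const counts = new Map<number, number>()
  return {
    allow(nowMs: number): boolean {
      const windowIndex = Math.floor(nowMs / windowMs)
      const used = counts.get(windowIndex) ?? 0
      if (used >= limit) return false
      counts.set(windowIndex, used + 1)
      return true
    },
  }
}

// Sliding window (log variant): remember accepted timestamps and count
// only those that fall within the last windowMs.
function createSlidingWindowLimiter(limit: number, windowMs: number): Limiter {
  let accepted: number[] = []
  return {
    allow(nowMs: number): boolean {
      const cutoff = nowMs - windowMs
      accepted = accepted.filter((timestamp) => timestamp > cutoff)
      if (accepted.length >= limit) return false
      accepted.push(nowMs)
      return true
    },
  }
}

// Sends `count` requests at the same instant and returns how many pass.
function countAllowed(limiter: Limiter, count: number, nowMs: number): number {
  let passed = 0
  for (let i = 0; i < count; i++) {
    if (limiter.allow(nowMs)) passed++
  }
  return passed
}
```

With a limit of 10 per 60,000 ms, sending 10 requests at t = 59,000 ms and 10 more at t = 61,000 ms gets all 20 through the fixed-window limiter, because the counter resets at the 60,000 ms boundary. The sliding variant admits only the first 10, since the earlier requests are still inside the trailing 60-second window.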

AI Orchestration with Vercel AI SDK

AI streaming responses require a Route Handler — not a Server Action. This is a critical architectural distinction. Server Actions return serializable values; they cannot return streaming Response objects with ReadableStream bodies. The Vercel AI SDK's useChat hook expects to call an HTTP endpoint that responds with a streaming body. That pattern requires a Route Handler at app/api/chat/route.ts.

Server Actions are appropriate for non-streaming AI operations: generating titles, summarizing content, classifying text, or any operation where you wait for the complete response before returning. For streaming chat responses — the primary use case in this guide — use a Route Handler.

The Route Handler enforces the complete request lifecycle: authenticate the user, check rate limits, verify token quota, call the AI model, record usage on completion, and return the streaming response. Every step happens server-side. The API key is a server-only environment variable that never appears in client bundles.
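As a rough illustration of what "a streaming Response" means at the HTTP level, the sketch below builds one from web-standard APIs only. It is not the Vercel AI SDK and does not produce the wire format useChat expects: `fakeTokens` is a hypothetical stand-in for a model's token stream, and auth, rate limiting, and quota checks are omitted to keep the shape visible.

```typescript
// Hypothetical stand-in for tokens arriving from a model provider.
async function* fakeTokens(): AsyncGenerator<string> {
  for (const token of ['Hello', ', ', 'world']) {
    yield token
  }
}

// Shape of a Route Handler: takes a Request, returns a Response whose
// body is a ReadableStream that emits chunks as they become available.
async function POST(_request: Request): Promise<Response> {
  const encoder = new TextEncoder()
  const tokens = fakeTokens()

  const body = new ReadableStream<Uint8Array>({
    // pull() runs each time the consumer is ready for another chunk
    async pull(controller) {
      const { value, done } = await tokens.next()
      if (done) {
        controller.close()
      } else {
        controller.enqueue(encoder.encode(value))
      }
    },
  })

  return new Response(body, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  })
}
```

The client can start rendering as soon as the first chunk arrives, which is the behavior a single serialized return value cannot provide.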

src/lib/ai.ts · TYPESCRIPT
import 'server-only'
import { createOpenAI } from '@ai-sdk/openai'
import { createAnthropic } from '@ai-sdk/anthropic'
import { env } from '@/lib/env'

// AI clients — server-only, never imported in Client Components
// The server-only import above causes a build error if this file
// is transitively imported by any Client Component

export const openai = createOpenAI({
  apiKey: env.OPENAI_API_KEY,
})

export const anthropic = env.ANTHROPIC_API_KEY
  ? createAnthropic({ apiKey: env.ANTHROPIC_API_KEY })
  : null

// Model selection
// Default: OpenAI gpt-4o-mini (cost-effective for most queries)
// Claude models are served only when ANTHROPIC_API_KEY is configured;
// otherwise getModel falls back to gpt-4o-mini. It does not detect outages.
export type SupportedModel =
  | 'gpt-4o'
  | 'gpt-4o-mini'
  | 'claude-3-5-sonnet-latest'
  | 'claude-3-haiku-20240307'

export function getModel(modelId: SupportedModel) {
  if (modelId.startsWith('gpt-')) {
    return openai(modelId)
  }
  if (modelId.startsWith('claude-') && anthropic) {
    return anthropic(modelId)
  }
  // Default fallback
  return openai('gpt-4o-mini')
}

// Approximate cost calculation for usage recording
// Prices in USD per 1M tokens — update when provider pricing changes
const MODEL_COSTS: Record<SupportedModel, { input: number; output: number }> = {
  'gpt-4o': { input: 2.50, output: 10.00 },
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'claude-3-5-sonnet-latest': { input: 3.00, output: 15.00 },
  'claude-3-haiku-20240307': { input: 0.25, output: 1.25 },
}

export function calculateCost(
  model: SupportedModel,
  promptTokens: number,
  completionTokens: number
): number {
  const costs = MODEL_COSTS[model] ?? MODEL_COSTS['gpt-4o-mini']
  return (
    (promptTokens * costs.input + completionTokens * costs.output) / 1_000_000
  )
}
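A worked example of the cost arithmetic may help when sizing free-tier quotas. The sketch below repeats the gpt-4o-mini prices from MODEL_COSTS above. The token counts and the five-dollar budget are made-up illustration values, and `costUsd` is a hypothetical local helper rather than the exported calculateCost.

```typescript
// gpt-4o-mini prices in USD per 1M tokens, copied from MODEL_COSTS above
const PRICE_PER_MILLION = { input: 0.15, output: 0.6 }

function costUsd(promptTokens: number, completionTokens: number): number {
  return (
    (promptTokens * PRICE_PER_MILLION.input +
      completionTokens * PRICE_PER_MILLION.output) /
    1_000_000
  )
}

// Illustrative request: 1,200 prompt tokens and 350 completion tokens.
// 1,200 * 0.15 + 350 * 0.60 = 180 + 210 = 390
// 390 / 1,000,000 is roughly $0.00039 per request
const perRequest = costUsd(1_200, 350)

// How many such requests fit in a hypothetical $5 budget
const requestsPerFiveDollars = Math.floor(5 / perRequest)

console.log(perRequest.toFixed(5), requestsPerFiveDollars)
```

Per-request cost looks negligible, which is why an uncapped endpoint is dangerous: the bill only becomes visible after requests multiply by several orders of magnitude.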
Route Handler for Streaming, Server Action for Everything Else
  • Streaming AI responses require a Route Handler at app/api/chat/route.ts — Server Actions cannot return ReadableStream responses.
  • Non-streaming AI operations (title generation, summarization, classification) use Server Actions — they wait for the complete response before returning.
  • The useChat hook from Vercel AI SDK calls a Route Handler via HTTP POST — it is not compatible with Server Actions.
  • The Route Handler enforces auth, rate limiting, and quota checks server-side — the client cannot bypass them.
Production Insight
The onFinish callback in streamText runs after the stream is complete but before the streaming response is fully consumed by the client. Usage recording in onFinish is reliable but adds latency to the cleanup path. For high-volume applications, move usage recording to a background job to avoid blocking the connection.
Key Takeaway
Route Handler for streaming, Server Action for non-streaming AI operations. The seven-step sequence in the Route Handler is non-negotiable: authenticate, parse, rate limit, quota check, call model, record usage, return stream. Remove any step and you have either a security hole or an unbounded cost.

Stripe Metered Billing

Stripe metered billing ties AI token consumption to revenue automatically. Each AI response records a meter event in Stripe. At the end of the billing period, Stripe aggregates all events and generates an invoice.

The integration has two paths. Subscription creation: a Checkout Session creates the customer, subscription, and payment method in one step. Usage recording: a meter event is created after each AI response, containing the token count for that response.

Webhooks handle the subscription lifecycle: payment success, payment failure, subscription cancellation, and trial expiry. The webhook handler must be idempotent — Stripe retries failed webhooks, and processing the same event twice can double-credit or double-charge a customer. The processed_webhook_events table (created in the schema migration) stores event IDs to prevent duplicate processing.

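The webhook handler must verify the signature against the raw request body. App Router Route Handlers receive a standard Request and do not parse the body for you, so read it once with request.text() and pass that exact string to stripe.webhooks.constructEvent(). Calling request.json() first consumes the body, and re-serializing the parsed object does not reproduce the bytes Stripe signed.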
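The reason the raw body matters can be shown with a few lines of HMAC code. This sketch mirrors the documented Stripe scheme (HMAC-SHA256 over the timestamp, a dot, and the payload) purely for illustration. The secret, timestamp, and payload are made-up values, and production code should keep calling stripe.webhooks.constructEvent(), which also enforces a timestamp tolerance.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Signature over "<timestamp>.<body>", hex encoded
function sign(secret: string, timestamp: number, body: string): string {
  return createHmac('sha256', secret).update(`${timestamp}.${body}`).digest('hex')
}

function verify(
  secret: string,
  timestamp: number,
  body: string,
  expectedHex: string
): boolean {
  const actual = Buffer.from(sign(secret, timestamp, body), 'hex')
  const expected = Buffer.from(expectedHex, 'hex')
  // Constant-time comparison; lengths must match before comparing
  return actual.length === expected.length && timingSafeEqual(actual, expected)
}

// Illustration values only
const secret = 'whsec_example'
const timestamp = 1_700_000_000
const rawBody = '{\n  "id": "evt_123",\n  "type": "invoice.paid"\n}'
const signature = sign(secret, timestamp, rawBody)

// Verifying the exact bytes that were signed succeeds
const rawOk = verify(secret, timestamp, rawBody, signature)

// Parsing and re-serializing drops the whitespace, so the bytes differ
const reserialized = JSON.stringify(JSON.parse(rawBody))
const reserializedOk = verify(secret, timestamp, reserialized, signature)

console.log({ rawOk, reserializedOk })
```

The parsed object is semantically identical, but the signature covers bytes, not meaning.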

src/lib/stripe.ts · TYPESCRIPT
import 'server-only'
import Stripe from 'stripe'
import { env } from '@/lib/env'

export const stripe = new Stripe(env.STRIPE_SECRET_KEY, {
  // Pin to a specific API version at project start
  // Update deliberately when Stripe releases breaking changes
  apiVersion: '2025-01-27.acacia',
  typescript: true,
})

export async function createCustomer(
  userId: string,
  email: string
): Promise<string> {
  const customer = await stripe.customers.create({
    email,
    metadata: { supabase_user_id: userId },
  })
  return customer.id
}

export async function createCheckoutSession(
  userId: string,
  email: string,
  stripeCustomerId: string | null,
  priceId: string
): Promise<string> {
  const session = await stripe.checkout.sessions.create({
    customer: stripeCustomerId ?? undefined,
    customer_email: stripeCustomerId ? undefined : email,
    client_reference_id: userId,
    line_items: [{ price: priceId, quantity: 1 }],
    mode: 'subscription',
    success_url: `${env.NEXT_PUBLIC_APP_URL}/dashboard?upgraded=true`,
    cancel_url: `${env.NEXT_PUBLIC_APP_URL}/billing`,
    // Collect tax automatically — required for most jurisdictions
    automatic_tax: { enabled: true },
    // Allow promo codes for growth campaigns
    allow_promotion_codes: true,
  })

  if (!session.url) throw new Error('Failed to create checkout session URL')
  return session.url
}

export async function createBillingPortalSession(
  stripeCustomerId: string
): Promise<string> {
  const session = await stripe.billingPortal.sessions.create({
    customer: stripeCustomerId,
    return_url: `${env.NEXT_PUBLIC_APP_URL}/billing`,
  })
  return session.url
}

export async function recordMeterEvent(
  stripeCustomerId: string,
  totalTokens: number,
  idempotencyKey: string
): Promise<void> {
  await stripe.billing.meterEvents.create(
    {
      event_name: 'ai_tokens_used',
      payload: {
        stripe_customer_id: stripeCustomerId,
        value: totalTokens.toString(),
      },
    },
    // Stripe idempotency key prevents duplicate meter events
    { idempotencyKey }
  )
}
Webhook Idempotency Is Not Optional
  • Stripe retries any webhook delivery that does not return a 2xx response, so every processing failure leads to a retry.
  • Store the event ID before processing and check for it on every incoming webhook. If it exists, return 200 immediately.
  • Insert the event ID before processing, not after. A retry that arrives while the first attempt is still running then hits the idempotency check instead of processing the event a second time. If processing fails, remove the ID (or roll back the transaction that wrote it) so the next retry is not skipped.
  • Never return a non-2xx status for unhandled event types. Stripe keeps retrying with exponential backoff, for up to three days in live mode. Return 200 for event types you do not handle.
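The idempotency rules above can be sketched without a database. In this illustration an in-memory Set stands in for the processed_webhook_events table, and `StripeLikeEvent`, `Handler`, and `handleWebhookEvent` are hypothetical names. Releasing the ID when a handler throws is one possible policy, assumed here so that a failed event can be retried. A real implementation would rely on a unique constraint on the event ID column instead of a Set.

```typescript
type StripeLikeEvent = { id: string; type: string }
type Handler = (event: StripeLikeEvent) => Promise<void>

// Stand-in for the processed_webhook_events table
const processedEventIds = new Set<string>()

// Returns the HTTP status the webhook route should respond with
async function handleWebhookEvent(
  event: StripeLikeEvent,
  handlers: Map<string, Handler>
): Promise<number> {
  // Duplicate delivery: acknowledge without processing again
  if (processedEventIds.has(event.id)) return 200

  // Claim the event ID before doing any work
  processedEventIds.add(event.id)

  const handler = handlers.get(event.type)
  // Unhandled event type: acknowledge so it is not retried
  if (!handler) return 200

  try {
    await handler(event)
    return 200
  } catch {
    // Assumed policy: release the claim so the next retry can reprocess
    processedEventIds.delete(event.id)
    return 500
  }
}
```

Delivering the same event twice runs the handler once. A handler that throws yields a 500 and leaves the event eligible for the next retry.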
Production Insight
Test your webhook handler locally before deploying with the Stripe CLI: stripe listen --forward-to localhost:3000/api/webhooks/stripe. This forwards events from your Stripe test environment to your local machine. Exercise every event type handled in the switch statement by triggering it from the CLI, for example stripe trigger checkout.session.completed.
Key Takeaway
Metered billing records token consumption as Stripe meter events. The webhook handler is the subscription lifecycle manager. Idempotency via the processed_webhook_events table prevents double-processing. Raw body for signature verification — never call request.json() before stripe.webhooks.constructEvent().

Deployment and Environment Isolation

The application deploys to Vercel with three isolated environments: local development, preview (one per pull request), and production. Isolation means each environment has its own Supabase project, its own Stripe account (test mode for preview, live mode for production), and its own set of environment variables.

Sharing a Supabase project or Stripe account between environments is a common mistake with expensive consequences. A migration that runs correctly in preview can corrupt production if the environments share the same database. A test webhook can flip a production user's subscription status.
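One inexpensive guard against crossed billing configuration is a startup check that the Stripe key mode matches the environment. The sketch below is an assumption-laden example: `assertStripeKeyMatchesEnv` is a hypothetical helper, it relies on Stripe's `sk_test_` and `sk_live_` secret key prefixes, and it deliberately rejects anything else (including restricted keys), which you may want to relax. On Vercel the environment name could come from the VERCEL_ENV variable.

```typescript
type DeployEnv = 'development' | 'preview' | 'production'

// Throws if the Stripe secret key mode does not match the environment
function assertStripeKeyMatchesEnv(secretKey: string, env: DeployEnv): void {
  const isLive = secretKey.startsWith('sk_live_')
  const isTest = secretKey.startsWith('sk_test_')

  if (!isLive && !isTest) {
    throw new Error('Unrecognized Stripe secret key format')
  }
  if (env === 'production' && !isLive) {
    throw new Error('Production must use a live-mode Stripe key')
  }
  if (env !== 'production' && isLive) {
    throw new Error(`${env} must not use a live-mode Stripe key`)
  }
}

// Example: a preview deployment configured with a test key passes
assertStripeKeyMatchesEnv('sk_test_example', 'preview')
```

Calling this once during env validation turns a silent misconfiguration into a failed deployment.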

The deployment checklist ensures nothing is missed across all three environments.

After initial deployment, reset the monthly token usage counter for all users at the start of each billing period. This runs as a scheduled Supabase Edge Function or a cron job via Vercel Cron.
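If the reset runs through Vercel Cron, the endpoint it calls should not be callable by anyone else. The following sketch assumes the CRON_SECRET convention, in which Vercel sends the secret as a Bearer token on cron invocations. `createCronHandler` and the `resetMonthlyUsage` callback are hypothetical, and the actual database update is left to the caller.

```typescript
// Hypothetical callback: resets usage counters, returns rows affected
type ResetFn = () => Promise<number>

function createCronHandler(cronSecret: string, resetMonthlyUsage: ResetFn) {
  // In a real route file this function would be exported as GET
  return async function GET(request: Request): Promise<Response> {
    // Reject callers that do not present the shared secret
    const auth = request.headers.get('authorization')
    if (auth !== `Bearer ${cronSecret}`) {
      return new Response('Unauthorized', { status: 401 })
    }

    const rows = await resetMonthlyUsage()

    return new Response(JSON.stringify({ reset: rows }), {
      status: 200,
      headers: { 'Content-Type': 'application/json' },
    })
  }
}
```

Passing the secret and the reset function in as arguments keeps the handler testable without a database or a deployed cron schedule.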

.github/workflows/deploy.yml · YAML
name: CI and Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - run: npm ci

      # Build: type-checks the project and fails on missing env vars via Zod
      - run: npm run build
        env:
          NEXT_PUBLIC_SUPABASE_URL: ${{ secrets.PREVIEW_SUPABASE_URL }}
          NEXT_PUBLIC_SUPABASE_ANON_KEY: ${{ secrets.PREVIEW_SUPABASE_ANON_KEY }}
          SUPABASE_SERVICE_ROLE_KEY: ${{ secrets.PREVIEW_SUPABASE_SERVICE_ROLE }}
          STRIPE_SECRET_KEY: ${{ secrets.PREVIEW_STRIPE_SECRET_KEY }}
          STRIPE_WEBHOOK_SECRET: ${{ secrets.PREVIEW_STRIPE_WEBHOOK_SECRET }}
          NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY: ${{ secrets.PREVIEW_STRIPE_PUBLISHABLE_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          UPSTASH_REDIS_REST_URL: ${{ secrets.UPSTASH_REDIS_REST_URL }}
          UPSTASH_REDIS_REST_TOKEN: ${{ secrets.UPSTASH_REDIS_REST_TOKEN }}
          NEXT_PUBLIC_APP_URL: https://preview.your-app.com

      - run: npm run lint
Set a Hard Spending Cap in the OpenAI Dashboard
Rate limiting and token quotas protect against normal usage abuse. They do not protect against bugs in your rate limiting code. Set a hard monthly spending cap in the OpenAI dashboard as a final safety net. Even if every server-side protection fails, the cap prevents an unbounded bill. Set it to 2x your expected monthly cost during beta. Increase it only after you have confidence in your rate limiting implementation.
Production Insight
Run supabase db push --linked to apply migrations to production — not supabase migration up which applies to the local instance. Verify by checking the Supabase dashboard for the production project after migration. A migration applied only to the local instance is the most common cause of production database errors after deployment.
Key Takeaway
Three environments, three isolated Supabase projects, three Stripe configurations. Never share database or billing infrastructure between environments. The pre-launch checklist is not optional — missed items from it are the source of most post-launch incidents.
● Production incident · POST-MORTEM · severity: high

The Free Tier Abuse Incident

Symptom
The OpenAI invoice for the month showed five-figure API usage charges accumulated overnight. No paying customers existed — the product was in free beta with no usage caps.
Assumption
The team assumed their invite-only beta would limit usage organically. They planned to add rate limiting after launch once they understood usage patterns.
Root cause
The Hacker News post went viral. Thousands of users signed up within hours. Each user could make unlimited API calls — no per-user quota, no rate limiting, no usage caps. Worse: the developer had imported the OpenAI client in a file that was rendered as a Client Component. Next.js bundled the API key into the client JavaScript, making it visible in browser DevTools and extractable for direct API bypass without even using the application.
Fix
Implemented three protection layers immediately. First: per-user daily token limits stored in the database, checked before every AI call. Second: server-side rate limiting via Upstash Redis using a sliding window algorithm — 20 requests per user per minute, hard-blocked server-side. Third: moved all AI calls to Route Handlers with the API key stored as a server-only environment variable, never prefixed with NEXT_PUBLIC_. Added Stripe metered billing to convert future free usage into tracked revenue events from day one.
Key lesson
  • Never import AI SDK clients or API keys in files that could be rendered as Client Components — use server-only environment variables and Route Handlers or Server Actions for all AI calls.
  • Rate limiting is a launch requirement, not a post-launch feature. Implement it before the first user signs up.
  • A free tier without usage caps is an unlimited financial liability. Token quotas and rate limits must exist before public access.
  • Monitor your AI provider billing dashboard daily during any launch period — by the time the invoice arrives, the damage is done.
Production debug guide · Common symptoms when building AI-powered SaaS applications with Next.js 15 and Supabase · 7 entries
Symptom · 01
AI streaming response cuts off mid-sentence or times out after 10 seconds
Fix
Add export const maxDuration = 60 to your Route Handler file. Vercel Hobby plan caps at 10 seconds by default — Pro plan supports up to 300 seconds. Verify the stream reader is handling backpressure correctly and not buffering the entire response before sending.
Symptom · 02
Stripe webhook returns 400 on every request
Fix
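Stripe signature verification requires the raw, unparsed body. App Router Route Handlers do not parse the body for you, so the usual cause is calling request.json() and verifying a re-serialized copy. Use const rawBody = await request.text() and pass it to stripe.webhooks.constructEvent(). Do not call request.json() before verification. Also confirm that STRIPE_WEBHOOK_SECRET belongs to the endpoint that is sending the events.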
Symptom · 03
Supabase RLS policies block authenticated users from accessing their own data
Fix
Check that auth.uid() matches the user_id column type in your table. Both must be uuid. Also verify your middleware is running and refreshing the session — stale tokens cause auth.uid() to return null, which RLS treats as an unauthenticated request.
Symptom · 04
Users get logged out randomly after 10-60 minutes
Fix
Your middleware is missing or calling getSession() instead of getUser(). Only getUser() performs token validation and refresh. Replace all getSession() calls in middleware with getUser(). The middleware must run on every request that requires authentication.
Symptom · 05
Upstash rate limiter allows more requests than the configured limit
Fix
Check that your rate limiter is using the authenticated user ID as the identifier, not the IP address. IP-based rate limiting is trivially bypassed. Retrieve the user server-side with await supabase.auth.getUser() and pass the user ID as the limiter identifier.
Symptom · 06
OpenAI API key visible in browser DevTools or network tab
Fix
The key is being imported in a Client Component or a file that is transitively imported by one. Run: grep -rn 'OPENAI' src/app --include='*.tsx' --include='*.ts' and check every match. Remove NEXT_PUBLIC_ prefix if present. Move all AI client instantiation to server-only files.
Symptom · 07
Stripe webhook handler processes the same event twice
Fix
Your handler is not idempotent. Store processed Stripe event IDs in a database table on first receipt. Before processing, check if the event ID already exists. If it does, return 200 immediately without processing. Stripe retries webhooks that return non-200 responses.
★ AI SaaS Quick Debug Cheat Sheet · Fast diagnostics for the most common AI SaaS infrastructure failures. Copy-paste ready. Run from project root.
Suspected API key exposure in client bundle
Immediate action
Check compiled client chunks for server-only secrets
Commands
grep -rn 'sk-' .next/static/chunks/ --include='*.js'
grep -rn 'OPENAI\|ANTHROPIC\|sk-' src/app --include='*.tsx' | grep -v 'server-only\|// server'
Fix now
Move all AI client instantiation to src/lib/ai.ts with import 'server-only' at the top. Route all AI calls through Route Handlers or Server Actions. Never prefix AI keys with NEXT_PUBLIC_.
Stripe webhook signature verification fails
Immediate action
Confirm raw body is being used for verification
Commands
stripe listen --forward-to localhost:3000/api/webhooks/stripe
curl -X POST http://localhost:3000/api/webhooks/stripe -H 'Content-Type: application/json' -d '{}'
Fix now
Replace request.json() with const rawBody = await request.text() before stripe.webhooks.constructEvent(). Add export const runtime = 'nodejs' to the webhook route file: the synchronous constructEvent() depends on Node's crypto module, which the edge runtime does not provide.
Slow queries on multi-tenant tables
Immediate action
Check for missing indexes on user_id columns
Commands
SELECT tablename, indexname FROM pg_indexes WHERE tablename IN ('conversations','usage_records','messages');
EXPLAIN ANALYZE SELECT * FROM conversations WHERE user_id = '00000000-0000-0000-0000-000000000000' ORDER BY created_at DESC LIMIT 20;
Fix now
Add composite indexes on (user_id, created_at DESC) for all tenant-scoped tables. Run: CREATE INDEX CONCURRENTLY idx_conversations_user_created ON conversations(user_id, created_at DESC);
Environment variable missing at runtime after successful build
Immediate action
Confirm variable is available in the correct runtime context
Commands
vercel env ls
grep -rn 'process.env' src/lib/env.ts
Fix now
Server-only variables must not have NEXT_PUBLIC_ prefix. Client variables must have NEXT_PUBLIC_ prefix. Redeploy after adding variables to Vercel — existing deployments do not inherit newly added env vars.
Architecture Decision: Where to Execute AI Logic
Concern | Route Handler (app/api/chat/route.ts) | Server Action | When to Use Server Action
Streaming responses | Supported: toDataStreamResponse() returns a streaming Response | Not supported: Server Actions return serializable values, not ReadableStream | Never for streaming; always use a Route Handler
useChat hook compatibility | Full: useChat calls a Route Handler via HTTP POST | Not compatible: useChat expects an HTTP endpoint | Never for useChat
Secret management | Automatic: server-only environment variables | Automatic: server-only environment variables | Both are equally secure
Non-streaming AI (title generation, summarization) | Works but adds routing overhead | Preferred: no HTTP endpoint needed, direct TypeScript call | All non-streaming AI operations
Type safety | Manual: parse request.json() and validate | End-to-end: TypeScript types flow from client to server | Server Actions when type safety matters more than streaming
Rate limiting | Applied before model call in the handler | Applied at the top of the action function | Both support rate limiting equally
Error handling | Return NextResponse.json with status codes | Throw errors, caught by error boundaries or try/catch in the client | Server Actions when error boundary handling is preferred

Key takeaways

1. Build in this exact order: scaffold, schema, auth, rate limiting, AI, billing, deployment. Each layer depends on the previous one, so skipping any layer creates expensive rework.
2. Use import 'server-only' in every file that handles API keys, the Stripe client, or the Supabase admin client. This is compile-time enforcement that prevents the most common AI SaaS security mistake.
3. Streaming AI responses require a Route Handler, not a Server Action. Server Actions cannot return ReadableStream responses. Non-streaming AI operations belong in Server Actions.
4. Rate limiting uses the authenticated user ID as the identifier, not the IP address. Both limits, per-minute burst and per-day budget, must be checked before every AI call.
5. Stripe webhook handlers must be idempotent. Store the event ID before processing and check for duplicates on every request. A non-idempotent handler will eventually double-charge a customer.
6. Set a hard spending cap in the OpenAI dashboard as the last line of defense. Rate limiting protects against abuse. The spending cap protects against bugs in your rate limiting code.

Common mistakes to avoid

6 patterns
×

Building AI features before rate limiting and billing infrastructure

Symptom
AI features work perfectly in development. At launch, unexpected traffic drives API costs to five figures before anyone can respond. Adding rate limiting and billing retroactively requires refactoring every AI call site.
Fix
Follow the seven-step build order: scaffold, schema, auth, rate limiting, AI, billing, deployment. Rate limiting and billing are prerequisites for AI features, not additions to them.
×

Importing AI clients or API keys in files that become Client Components

Symptom
API keys are visible in browser DevTools under the Sources tab or in compiled JavaScript chunks in .next/static/. Attackers extract the key and make direct API calls, bypassing all rate limits and quotas.
Fix
Add import 'server-only' to every file that imports the AI client, Stripe client, or Supabase admin client. This causes a build error if the file is transitively imported by a Client Component. Run grep -rn 'sk-' .next/static/chunks/ after every build to verify no keys leaked.
×

Using getSession() instead of getUser() in middleware or server-side code

Symptom
Users appear authenticated but RLS policies fail with permission denied. Session tokens expire silently and users are not redirected to login. Data access fails intermittently depending on when the token was last refreshed.
Fix
Replace all getSession() calls in middleware and server-side code with getUser(). getUser() validates the token with Supabase servers and returns fresh user data. getSession() reads a local cookie that may be expired. The middleware must return supabaseResponse — not NextResponse.next().
×

Using a Server Action for streaming AI responses

Symptom
The AI response appears all at once after a long delay instead of streaming word-by-word. The useChat hook from Vercel AI SDK does not work with Server Actions. toDataStreamResponse() causes a TypeScript error in a Server Action context.
Fix
Move the streaming AI call to a Route Handler at app/api/chat/route.ts. Server Actions are for non-streaming AI operations — title generation, summarization, classification. useChat requires an HTTP endpoint that returns a streaming Response.
×

Non-idempotent Stripe webhook handlers

Symptom
Users receive doubled credits after subscription payment. Subscription status toggles between active and cancelled. The issue is intermittent and difficult to reproduce because it depends on Stripe retry timing.
Fix
Store the Stripe event ID in the processed_webhook_events table before processing. Check for the event ID at the start of every webhook request. If found, return 200 immediately. Insert the event ID before processing begins so that retries during processing are handled correctly.
×

Sharing a Supabase project or Stripe account across environments

Symptom
A migration that passes in the preview environment corrupts the production database. A test payment webhook changes a real customer's subscription status. Test conversations appear in production user accounts.
Fix
Create separate Supabase projects and Stripe accounts for development, preview, and production. The cost of separate projects is trivial. The cost of corrupted production data or incorrect billing is not.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR
How would you design the architecture for a multi-tenant AI SaaS that mu...
Q02 · SENIOR
Explain how you would implement rate limiting for an AI SaaS without blo...
Q03 · JUNIOR
What is Row Level Security and how does it differ from application-level...
Q04 · SENIOR
Why use a Route Handler instead of a Server Action for streaming AI resp...
Q01 of 04 · SENIOR

How would you design the architecture for a multi-tenant AI SaaS that must not expose API keys to the client?

ANSWER
Three mechanisms working together. First: all AI clients are instantiated in server-only modules, meaning files with import 'server-only' at the top, which causes a build error if transitively imported by a Client Component. This is compile-time enforcement that catches mistakes before deployment. Second: streaming AI responses use Route Handlers, not Server Actions. Route Handlers return Response objects with streaming bodies, and the API key never leaves the server. Third: multi-tenancy is enforced at the database level via Row Level Security. Every table has a user_id column and RLS policies that scope all queries to the authenticated user. Even if application code has a bug that omits a user filter, the policy still restricts the query to that user's rows. The combination of compile-time key protection, server-side execution, and database-level tenancy means no single bug can compromise either API keys or user data isolation.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
Why Next.js 15 App Router instead of a separate Express or Fastify backend?
02
Can I use a different database instead of Supabase?
03
How do I handle AI model provider outages?
04
How do I reset monthly token usage at the start of each billing period?
05
When should I migrate from this stack to something more complex?