Intermediate 10 min · April 14, 2026

Developer Productivity Stack 2026 — Trade-offs & Failures

Q: Is Neovim worth the learning curve for a team?

Neovim is worth the investment for individual engineers who prioritize terminal integration, startup speed, and long-term composability. For teams, standardize the formatter and linter configuration — not the editor. Biome's output is identical regardless of whether the engineer uses Neovim, VS Code, or Zed. Editor choice is personal; consistent code output is a team requirement. If you standardize on Neovim as a team, budget for a two-week ramp period per engineer and maintain a VS Code settings repository as a fallback for engineers who are not productive in Neovim within that window.

Q: How do you handle Bun incompatibility with a dependency?

Three-step resolution. First, check whether the issue is a known Bun bug or an intentional compatibility gap — most incompatibilities with modern packages are fixed in Bun releases within weeks. Second, add a Node.js matrix job in GitHub Actions that runs the affected test suite with Node.js alongside the Bun job. This catches regressions without blocking Bun usage elsewhere. Third, if the dependency is critical and the incompatibility is structural (C++ addon that assumes Node.js-specific V8 APIs), keep Node.js as the runtime for that specific package in the monorepo and use Bun for everything else. In our 400-dependency monorepo, fewer than five packages required this treatment after migrating in late 2025.

Q: How do you prevent Turborepo cache from serving stale artifacts?

Define precise inputs for every task in turbo.json. The inputs field lists the files that feed the cache key, and the env field lists the environment variables that feed it; if any listed file or variable changes, the cache is invalidated for that task. Include all source files and config files under inputs, and declare every environment variable that affects the task output under env. Exclude non-deterministic content from cacheable tasks: no timestamps, no random IDs, no process IDs in build artifacts. Validate cache correctness by running the same task twice with identical inputs and comparing outputs; they must be byte-for-byte identical. If a task cannot produce deterministic output, set cache: false and accept the rebuild cost. For the dev task, cache: false and persistent: true are always correct, because development servers must not cache.
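The double-run check can be sketched in TypeScript as below. This is a minimal illustration, not part of Turborepo: hashTree and diffTrees are hypothetical helper names, and the sketch assumes the task writes its artifacts to a directory you can snapshot between runs.

```typescript
import { createHash } from "node:crypto";
import { readdirSync, readFileSync } from "node:fs";
import { join, relative } from "node:path";

// Hash every regular file under `root`, keyed by its path relative to `root`.
export function hashTree(root: string): Map<string, string> {
  const hashes = new Map<string, string>();
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        const digest = createHash("sha256").update(readFileSync(full)).digest("hex");
        hashes.set(relative(root, full), digest);
      }
    }
  };
  walk(root);
  return hashes;
}

// Paths whose hash differs, or that exist in only one of the two snapshots.
export function diffTrees(a: Map<string, string>, b: Map<string, string>): string[] {
  const paths = Array.from(new Set(Array.from(a.keys()).concat(Array.from(b.keys()))));
  return paths.filter((path) => a.get(path) !== b.get(path)).sort();
}
```

One way to use it, assuming the build writes to dist: run turbo run build --force, copy dist to a scratch directory, run the same command again, then pass hashTree of both directories to diffTrees. An empty result means the two runs were byte-for-byte identical; any path it returns is a candidate for a determinism fix or for cache: false.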

Q: Can Biome fully replace ESLint for enterprise projects?

Biome covers 90%+ of the ESLint rules that most teams actually enforce. The remaining gaps are in specialized plugins: eslint-plugin-security (no Biome equivalent for several vulnerability detection rules), eslint-plugin-jsx-a11y (Biome's accessibility rules cover most but not all), and custom project-specific rules (Biome does not support custom rule plugins). Our approach: run Biome for all standard formatting and linting, keep ESLint in a narrow security-only config for the rules with no Biome equivalent. This gives the speed benefit of Biome for 95% of checks and keeps the specific ESLint rules that have no alternative. Biome's plugin system is on the roadmap — re-evaluate the split configuration annually.

Q: Why use Neon instead of a local Docker Postgres for development?

Three reasons. First, startup time — Neon branches are instant, Docker Postgres takes 15-30 seconds to start on first run and requires Docker Desktop running in the background (1.5-2GB memory on macOS). Second, branch isolation — each engineer gets a Neon branch derived from a shared snapshot. Multiple engineers can work simultaneously without database state interference. Docker requires manual database resets to isolate state between engineers. Third, CI parity — integration tests in CI run against a Neon branch using the same connection pattern as local development. With Docker, CI and local use different database configurations, which is a source of 'passes locally, fails in CI' issues. The trade-off is network latency — Neon adds 5-20ms of network overhead per query compared to a local socket connection. For integration tests that run hundreds of queries, this adds seconds. We accept the trade-off for isolation benefits; time-critical code paths have unit tests with mocked database calls.
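To put rough numbers on that trade-off, the helper below multiplies query count by per-query overhead. The 300-query suite is an assumed figure for illustration, and the estimate assumes queries run sequentially; queries issued in parallel would overlap and add less.

```typescript
// Extra wall-clock time from network overhead, assuming sequential queries.
export function addedLatencySeconds(queryCount: number, perQueryMs: number): number {
  return (queryCount * perQueryMs) / 1000;
}

// Hypothetical integration suite issuing 300 queries.
const bestCase = addedLatencySeconds(300, 5); // 1.5 seconds at 5ms per query
const worstCase = addedLatencySeconds(300, 20); // 6 seconds at 20ms per query
console.log(`Added latency: ${bestCase}s to ${worstCase}s`);
```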

Q: How do you manage feature flags without a complex feature flag service?

PostHog provides feature flags as part of the analytics SDK — no separate service to deploy or maintain. Flags are defined in the PostHog dashboard, evaluated server-side in Server Components using the PostHog Node SDK, and passed down to client components as props. The server-side evaluation ensures flags are resolved before the page renders — no layout shift, no loading state for flag values. For flags that need to be evaluated at the edge (middleware, CDN rules), we use a simple JSON config committed to the repository and deployed with the application. Edge flags update on deployment; PostHog flags update in real time. This two-tier approach covers 95% of feature flag use cases without the complexity of a dedicated service.
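The two tiers can be sketched as follows. This is an illustration under stated assumptions rather than our production code: RemoteFlagClient models only the single call the sketch needs (the PostHog Node SDK exposes a method named isFeatureEnabled, but the exact signature here is an assumption), and the flag names and helper functions are hypothetical.

```typescript
// Minimal shape of the remote flag call this sketch relies on (assumed signature).
export interface RemoteFlagClient {
  isFeatureEnabled(key: string, distinctId: string): Promise<boolean | undefined>;
}

export type EdgeFlags = Readonly<Record<string, boolean>>;

// Tier 1: edge flags come from JSON committed to the repository (hypothetical values).
export const edgeFlags: EdgeFlags = { "new-checkout": true, "beta-banner": false };

// Unknown flags are treated as off so a typo never enables a feature.
export function isEdgeFlagEnabled(flags: EdgeFlags, key: string): boolean {
  return flags[key] === true;
}

// Tier 2: resolve the flag on the server before rendering, then pass it down as a prop.
// If the flag service is unreachable or has no value, fall back instead of failing the page.
export async function resolveServerFlag(
  client: RemoteFlagClient,
  key: string,
  distinctId: string,
  fallback = false,
): Promise<boolean> {
  try {
    const value = await client.isFeatureEnabled(key, distinctId);
    return value ?? fallback;
  } catch {
    return fallback;
  }
}
```

Defaulting to off on failure is a design choice: a flag outage then degrades to the existing behavior rather than blocking the render.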

AI-generated tests passed while hiding a six-figure reconciliation bug.

Naren Founder & Principal Engineer

20+ years shipping production ML systems and the infrastructure behind them. Drawn from code that ran under real load.

Last updated: July 04, 2026

Before you start⏱ 25 min

✓Solid grasp of fundamentals
✓Comfortable reading code examples
✓Basic production concepts

⚡Quick Answer

Productivity in 2026 is measured by time-to-merge, not tool count — every tool either compresses that timeline or adds friction
This stack (Neovim, Cursor + Claude, Bun, Turborepo, Biome, Drizzle, Vercel) runs a B2B SaaS platform with 80K+ daily active users across two engineers
Biggest risk: AI assistants generate tests that reproduce the same logic errors as the code they test — boundary conditions require manual test cases

✦ Definition~90s read

What is My 2026 Developer Productivity Stack (Tools & Workflow)?

This article is a brutally honest, real-world postmortem of a specific developer productivity stack assembled in 2026. It's not a hype piece or a 'best tools' listicle. Instead, it walks through a deliberate, opinionated selection—Neovim with LazyVim, Cursor with Claude, Bun, Turborepo with remote caching, and Biome—and then dissects where each component actually fails in production.

The core argument is that productivity stacks are fragile, context-dependent systems where the integration friction between tools often negates their individual gains. You'll learn why Bun's speed breaks your Node.js ecosystem assumptions, why Turborepo's remote caching is a hidden cost center, and why Biome's all-in-one approach forces you to abandon mature ESLint/Prettier configurations.

This is for senior engineers who have already tried the shiny tools and need to understand the real trade-offs before committing a team to them. The article assumes you know the basics and want the unfiltered failure modes—the silent regressions, the CI pipeline surprises, and the cognitive overhead that doesn't show up in benchmarks.

Plain-English First

A developer productivity stack is your personal workshop — the combination of tools, shortcuts, and workflows that let you go from idea to shipped code with the least friction. In 2026, the workshop has shifted toward AI-assisted coding, local-first development, and opinionated toolchains that make decisions for you so you can focus on the hard problems. The tools listed here are not the newest or the most popular — they are the ones that survived daily production use and are still in the stack after two years.

Developer productivity in 2026 is defined by one metric: time from intent to deployed change. Every tool in your stack either compresses that timeline or adds friction. This article documents the specific combination I use daily across a B2B SaaS platform serving 80K+ daily active users — maintained by two engineers.

This is not a survey of every tool on the market. Tools that are not listed were evaluated and did not survive production use. Where relevant, I name them and explain why they were cut.

The stack covers nine layers: editor, AI coding assistant, runtime and package manager, monorepo and build system, formatting and linting, terminal workflow, CI/CD pipeline, database and ORM, and deployment and observability. Each layer has trade-offs documented, failure modes named, and configuration shown.

Common misconception: productivity means typing faster. It does not. Productivity means fewer decisions, fewer context switches, and fewer round-trips to CI for things that should have been caught locally.

One warning before the stack: the most expensive incident in the past two years came not from a tool failure but from an AI assistant failure. That incident shapes how every tool in this stack is used — it is documented first.

Why Your Developer Productivity Stack Is Already Failing You

A developer productivity stack is the integrated set of tools, frameworks, and practices that reduce friction from code authoring to production deployment. The core mechanic is feedback loop compression: every tool in the stack must shorten the time between writing a change and knowing it works correctly. In Java, this means a stack that includes a fast build tool (Gradle with build caching), a reliable test runner (JUnit 5 with parallel execution), a static analysis engine (Error Prone), and a deployment pipeline that can ship in under 10 minutes. Anything slower than that is not productivity — it's overhead.

The key property is latency symmetry: the time to run a single unit test should be within 2 seconds, a full module compile under 30 seconds, and a CI pipeline under 10 minutes. If any layer exceeds these thresholds, developers context-switch, batch work, or skip verification entirely. The stack must also enforce consistency automatically — formatting, linting, and dependency management should be pre-commit hooks, not code review comments. In practice, the most productive stacks are opinionated: they trade flexibility for speed and safety.

Use a productivity stack when your team exceeds 5 developers or your codebase exceeds 50,000 lines. Below that, the overhead of configuring the stack outweighs the benefits. Above that, the cost of manual processes — slow builds, flaky tests, inconsistent style — compounds exponentially. The real value is not in any single tool but in the integration: a change that compiles, passes tests, and is deployable in under 15 minutes. That's the threshold where developer flow state becomes sustainable.

⚠ Stack ≠ Tool Collection

A productivity stack is not a list of tools you install. It's a system where each component's latency directly affects the others. A fast IDE with a 10-minute CI build is still a 10-minute feedback loop.

📊 Production Insight

Teams adopt a microservice architecture with 20+ services but keep a monorepo build that takes 45 minutes. Developers start skipping tests and merging without CI green. Rule: if your build takes longer than a bathroom break, developers will work around it — and that's when production bugs slip in.

🎯 Key Takeaway

Feedback loop latency is the single metric that determines whether a stack improves or degrades productivity.

Consistency automation (formatting, linting, dependency checks) must be pre-commit, not post-review.

A stack that takes more than 2 hours to set up per developer will never be adopted fully — optimize for zero-config onboarding.

thecodeforge.io

Developer Productivity Stack

Editor: Neovim + LazyVim

Neovim with the LazyVim distribution is my primary editor. The decision is not about vim keybindings — it is about composability, startup speed, and terminal integration.

LazyVim provides a curated plugin ecosystem with sane defaults. LSP configuration via nvim-lspconfig, syntax highlighting via treesitter, and debug adapters work out of the box. Configuration is a Lua overlay on top of LazyVim's defaults — updates flow without merge conflicts between my customizations and upstream changes.

The key advantage over VS Code is context preservation. Neovim runs inside tmux sessions. Detaching from a session and reattaching from a different machine restores the exact state: open buffers, terminal output, unsaved changes. VS Code Remote SSH approximates this but adds round-trip latency for every keypress and requires a persistent server process on the remote machine.

Zed is worth watching — its performance on large codebases is comparable to Neovim, and its built-in AI features reduce the need for a separate AI assistant. I evaluated Zed for two weeks in Q1 2026 and returned to Neovim primarily because Zed's plugin ecosystem does not yet match nvim-lspconfig's language server coverage for the languages in this stack.

Editor choice is individual — it is not a team decision. Standardize the formatter and linter, not the editor.

~/.config/nvim/lua/plugins/editor.lua (Lua)

return {
  {
    'LazyVim/LazyVim',
    opts = {
      colorscheme = 'tokyonight-storm',
    },
  },
  -- TypeScript language server
  {
    'neovim/nvim-lspconfig',
    opts = {
      servers = {
        -- ts_ls is the correct server name in nvim-lspconfig 2025+
        -- (renamed from tsserver in earlier versions)
        ts_ls = {
          settings = {
            typescript = {
              inlayHints = {
                includeInlayParameterNameHints = 'all',
                includeInlayFunctionParameterTypeHints = true,
                includeInlayVariableTypeHints = true,
              },
            },
          },
        },
        -- Biome as LSP for lint diagnostics inline
        biome = {},
      },
    },
  },
  -- Format on save via Biome
  {
    'stevearc/conform.nvim',
    opts = {
      formatters_by_ft = {
        typescript = { 'biome' },
        typescriptreact = { 'biome' },
        javascript = { 'biome' },
        javascriptreact = { 'biome' },
        json = { 'biome' },
        jsonc = { 'biome' },
      },
      -- LazyVim already formats on save through conform and warns if
      -- format_on_save is set here, so only the format options are tuned.
      default_format_opts = { timeout_ms = 500, lsp_format = 'fallback' },
    },
  },
  -- Git integration
  {
    'lewis6991/gitsigns.nvim',
    opts = {
      signs = {
        add = { text = '+' },
        change = { text = '~' },
        delete = { text = '_' },
      },
    },
  },
}

Mental Model

Editor Selection Framework

The best editor is the one that disappears — you stop thinking about the tool and think only about the problem.

Startup speed matters when you open the editor 50+ times per day — Neovim opens in under 50ms; VS Code takes 1-3 seconds
Terminal integration matters when your workflow includes SSH sessions, remote log tailing, and database CLI access
Plugin composability matters when your stack changes quarterly — swap LSP servers and formatters without rewriting config
Onboarding cost is real — if your team cannot set up the editor in under 10 minutes, editor standardization is a team-wide tax
Standardize the formatter and linter configuration, not the editor — Biome's output is identical regardless of which editor runs it

📊 Production Insight

Neovim config drift across team members creates inconsistent tooling behavior. Shared formatter configs (biome.json) are more important than shared editor configs. One engineer using VS Code and one using Neovim produce identical formatted output when both run Biome — the editor is irrelevant to code quality.

🎯 Key Takeaway

Neovim's advantage is composability and terminal integration, not vim keybindings. Editor choice is personal — formatter and linter configuration is a team decision. If your editor config exceeds 200 lines, you are configuring more than you are coding.

AI Coding Assistant: Cursor + Claude

Cursor with Claude integration is the primary AI coding assistant. The critical differentiator over GitHub Copilot is context management: Cursor indexes the entire codebase and allows explicit file references in chat. Agent mode handles multi-file refactoring in a single session — rename a hook, update all call sites, generate updated tests, and update the Storybook story without switching windows.

The production incident above changed how AI assistance is used. The key insight: AI generates tests that reproduce the same logic errors as the code they test. When AI writes both the implementation and the tests for the same feature, you have zero independent verification — the engineer's review is the only check. That is not enough for business-critical logic.

I use AI assistants for four categories: boilerplate generation (CRUD operations, type definitions, component scaffolding), refactoring (rename across files, extract functions, update import paths), code explanation (what does this function do, what are the edge cases), and test structure generation (scaffold the test file, write the describe blocks — humans write the assertions for business logic).

I do not use AI for: architecture decisions, security-sensitive logic (authentication, authorization, encryption), financial calculations or aggregations, boundary-condition tests, or any code I cannot explain to a teammate without reading it.
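As an illustration of why boundary tests stay manual, consider a period total over a date range. The example is hypothetical (it is not the reconciliation code from the incident), and Txn and sumInRange are names invented for this sketch. The range is half-open, so a transaction stamped exactly at the period end belongs to the next period.

```typescript
export interface Txn {
  id: string;
  postedAt: string; // ISO 8601 timestamp in UTC
  amountCents: number;
}

// Sum amounts in the half-open range [start, end).
export function sumInRange(txns: Txn[], start: string, end: string): number {
  const from = Date.parse(start);
  const to = Date.parse(end);
  return txns
    .filter((txn) => {
      const ts = Date.parse(txn.postedAt);
      return ts >= from && ts < to;
    })
    .reduce((sum, txn) => sum + txn.amountCents, 0);
}

// Hand-written boundary cases: first instant, last millisecond, first instant of next period.
const sample: Txn[] = [
  { id: "a", postedAt: "2026-03-01T00:00:00.000Z", amountCents: 100 },
  { id: "b", postedAt: "2026-03-31T23:59:59.999Z", amountCents: 200 },
  { id: "c", postedAt: "2026-04-01T00:00:00.000Z", amountCents: 400 },
];
const march = sumInRange(sample, "2026-03-01T00:00:00.000Z", "2026-04-01T00:00:00.000Z");
const april = sumInRange(sample, "2026-04-01T00:00:00.000Z", "2026-05-01T00:00:00.000Z");
if (march !== 300) throw new Error("March must include both edge transactions");
if (march + april !== 700) throw new Error("adjacent periods must not double count");
```

An implementation that used <= for the end bound would count the April 1 transaction in both months, and a test generated from that implementation would assert the doubled total as correct. The checks that catch it are the ones written by hand at the exact edges.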

.cursorrules (Markdown)

# Project Conventions for AI Assistance
# Last updated: 2026-04-14
# These rules apply to all AI-generated code in this repository

## What AI should generate
- Boilerplate: CRUD operations, type definitions, component shells
- Refactoring: renames, extractions, import path updates
- Test structure: describe blocks, test names, mock setup
- Documentation: JSDoc comments, README sections

## What AI must NOT generate without explicit human review
- Financial calculations, aggregations, or reconciliation logic
- Authentication or authorization logic
- Boundary conditions in date ranges, pagination, or numeric comparisons
- Database migration files — generate the schema change, human writes the migration
- Security-sensitive operations: encryption, token validation, input sanitization

## Code Style
- TypeScript strict mode — no `any` types, no `as` type assertions without a comment
- Use Zod for runtime validation at all API boundaries (Server Actions, API routes)
- Prefer `async/await` over `.then()` chains
- Return Result types for errors in library code — never throw except in React error boundaries
- Named exports only — no default exports

## Architecture
- React Server Components by default — add 'use client' only for interactivity or browser APIs
- Server Actions for all mutations — no REST endpoints for internal operations
- Database access through repository functions in src/db/repositories/ — no Drizzle queries in components
- Environment variables accessed only through src/env.ts (validated with Zod at startup)

## Testing
- Vitest for unit and integration tests — colocated at src/**/*.test.ts
- Playwright for E2E — one test file per critical user journey in e2e/
- AI generates test structure (describe, it, beforeEach) — humans write business logic assertions
- Boundary conditions MUST be written manually: date ranges, pagination edges, null/undefined handling

## Import conventions
- Import from barrel exports in src/components/ui — not direct file paths
- Import types with `import type` — enforced by Biome
- No barrel exports (index.ts) for internal modules — import directly from source

## Anti-patterns — flag and reject
- useEffect for data fetching — use Server Components or use() hook
- Raw Drizzle queries in React components — use repository functions
- Hardcoded color values in Tailwind — use semantic tokens from globals.css @theme
- forwardRef wrappers — ref is a standard prop in React 19
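For readers unfamiliar with the "Return Result types" rule above, here is one minimal way to express it in TypeScript. The Result, ok, err, and parsePositiveInt names are illustrative choices for this sketch, not a prescribed library.

```typescript
// A discriminated union: callers must check `ok` before touching `value` or `error`.
export type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

export function ok<T>(value: T): Result<T, never> {
  return { ok: true, value };
}

export function err<E>(error: E): Result<never, E> {
  return { ok: false, error };
}

// Library code returns the failure instead of throwing it.
export function parsePositiveInt(input: string): Result<number, string> {
  const parsed = Number(input);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    return err(`expected a positive integer, got "${input}"`);
  }
  return ok(parsed);
}

const pageSize = parsePositiveInt("25");
if (pageSize.ok) {
  console.log(pageSize.value + 1); // value is typed as number here
} else {
  console.error(pageSize.error); // error is typed as string here
}
```

The benefit is that the failure path appears in the function signature, so the compiler forces every caller to handle it.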

⚠ AI Assistant Anti-Patterns

📊 Production Insight

In our measurement across a three-month period, Cursor reduced boilerplate scaffolding time by roughly 50% and increased review time for complex logic by roughly 20%. Net productivity gain is real but smaller than marketing claims suggest. Track time-to-merge, not lines generated — time-to-merge is the only metric that reflects actual throughput.
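A sketch of how time-to-merge could be tracked, assuming it is defined as the interval between a pull request being opened and merged. That definition, the PullRequest shape, and the choice of the median (which resists skew from one long-lived PR) are assumptions of this example.

```typescript
export interface PullRequest {
  openedAt: string; // ISO 8601 timestamp
  mergedAt: string | null; // null while the PR is still open
}

type MergedPullRequest = PullRequest & { mergedAt: string };

// Median hours from open to merge; unmerged PRs are ignored.
export function medianHoursToMerge(prs: PullRequest[]): number | null {
  const hours = prs
    .filter((pr): pr is MergedPullRequest => pr.mergedAt !== null)
    .map((pr) => (Date.parse(pr.mergedAt) - Date.parse(pr.openedAt)) / 3600000)
    .sort((a, b) => a - b);
  if (hours.length === 0) return null;
  const mid = Math.floor(hours.length / 2);
  return hours.length % 2 === 1 ? hours[mid] : (hours[mid - 1] + hours[mid]) / 2;
}
```

Comparing this number before and after adopting an AI assistant says more about throughput than counting generated lines.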

🎯 Key Takeaway

AI assistants are best for boilerplate and refactoring — worst for architecture, security, and financial logic. The .cursorrules file is the highest-leverage configuration in the AI workflow. If you cannot explain what the AI wrote without reading it, do not ship it.

Runtime and Package Manager: Bun

Bun has replaced Node.js as the primary runtime and npm as the package manager. The switch was driven by three concrete improvements measured on our monorepo: install speed, test execution speed, and startup time.

Package installation with Bun is 5-15x faster than npm on cold installs with no warm package cache, and 3-5x faster than pnpm. On a monorepo with 400+ dependencies, bun install runs in under 10 seconds versus 90+ seconds with npm. Combined with Turborepo remote caching, packages whose inputs have not changed are never rebuilt.

Bun's test runner executes Vitest-compatible tests 2-3x faster than Node.js on our test suite. The native TypeScript transpiler eliminates the compilation step for test execution. In watch mode, this is the difference between feedback in under one second versus two to four seconds — which affects how frequently you run tests.

The trade-off is ecosystem compatibility. Bun does not support all native Node.js C++ addons. In our stack, fewer than 5% of dependencies had Bun compatibility issues, all of which were resolved by the time we migrated in late 2025. We maintain a Node.js matrix job in CI to catch regressions against libraries with known compatibility history.

package.json (JSON)

{
  "name": "@acme/app",
  "scripts": {
    "dev": "next dev --turbopack",
    "build": "next build",
    "start": "next start",
    "test": "bun test",
    "test:watch": "bun test --watch",
    "test:coverage": "bun test --coverage",
    "test:node": "node --experimental-vm-modules node_modules/.bin/vitest run",
    "lint": "biome check --write .",
    "lint:ci": "biome ci .",
    "typecheck": "tsc --noEmit",
    "db:generate": "drizzle-kit generate",
    "db:migrate": "drizzle-kit migrate",
    "db:studio": "drizzle-kit studio",
    "db:seed": "bun run src/db/seed.ts",
    "setup": "bun install && cp .env.example .env.local && bun run db:migrate && bun run db:seed",
    "setup:verify": "bun run typecheck && bun run lint:ci && bun test"
  },
  "dependencies": {
    "next": "15.x",
    "react": "19.x",
    "react-dom": "19.x",
    "drizzle-orm": "latest",
    "@neondatabase/serverless": "latest",
    "zod": "^3",
    "@t3-oss/env-nextjs": "latest"
  },
  "devDependencies": {
    "@biomejs/biome": "latest",
    "typescript": "5.x",
    "drizzle-kit": "latest",
    "@playwright/test": "latest",
    "vitest": "latest",
    "husky": "latest",
    "lint-staged": "latest",
    "commitlint": "latest",
    "@commitlint/config-conventional": "latest"
  }
}

💡Bun Migration Strategy

Step 1 — Replace the package manager only: run bun install instead of npm install. Lowest risk, immediate gain on install speed. Do this first and run your full test suite before changing anything else.
Step 2 — Replace the test runner: Bun's test runner is compatible with Vitest's API for most use cases. Update the test script to bun test and verify all tests pass.
Step 3 — Replace the runtime: change node to bun in dev scripts. Verify all native dependencies work before this step.
Step 4 — Add a Node.js matrix job in CI: run tests with node as well as bun to catch compatibility regressions early.

📊 Production Insight

Bun's speed gains are most significant in large monorepos where install and test times scale with dependency count. On a project with fewer than 50 dependencies, the gains are marginal and may not justify the migration effort. Measure your install and test times before switching — if install takes under 15 seconds and tests run under 30 seconds, Bun will not materially change your workflow.
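A rough way to take that measurement is sketched below. The timeCommand and bunLikelyWorthEvaluating helpers are hypothetical names for this example; the 15-second and 30-second thresholds come from the paragraph above, and the commands you time are whatever your project currently uses.

```typescript
import { spawnSync } from "node:child_process";

// Run a command to completion and return its wall-clock duration in seconds.
export function timeCommand(command: string, args: string[]): number {
  const startedAt = Date.now();
  const result = spawnSync(command, args, { stdio: "ignore" });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`${command} exited with status ${String(result.status)}`);
  }
  return (Date.now() - startedAt) / 1000;
}

export interface Timings {
  installSeconds: number;
  testSeconds: number;
}

// Below both thresholds, the migration is unlikely to change the daily workflow.
export function bunLikelyWorthEvaluating(timings: Timings): boolean {
  return timings.installSeconds >= 15 || timings.testSeconds >= 30;
}
```

For a fair install number, delete node_modules before timing so the run reflects a cold install; for example, timeCommand("npm", ["ci"]) followed by timeCommand("npm", ["test"]) gives the two inputs.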

🎯 Key Takeaway

Bun's primary advantage is speed — 5-15x faster on cold installs, 2-3x faster on test execution on our monorepo. Ecosystem compatibility is the trade-off. Maintain a Node.js CI fallback and migrate in three stages: package manager, then test runner, then runtime.

Monorepo and Build System: Turborepo with Remote Caching

Turborepo manages the monorepo build graph. The single most important feature is remote caching — when a package's inputs have not changed, Turborepo restores its build artifacts from the remote cache instead of rebuilding. This transforms CI from a 12-minute operation to a 2-3 minute operation for typical PRs on our 12-package monorepo.

The monorepo follows a packages-and-apps structure. Shared libraries live in packages/: ui (shadcn/ui components), config (TypeScript, Biome, Tailwind configs), db (Drizzle schema and repositories), validation (shared Zod schemas). Deployable applications live in apps/: web (Next.js), api (Hono background API), and admin (Next.js internal tools).

Task pipelines define dependency relationships. build depends on ^build (build all dependencies first). test depends on build. typecheck depends on ^build. lint and format run independently with no dependencies — Turborepo parallelizes them.
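To make the ^build rule concrete, the sketch below groups packages into waves where every package in a wave has all of its internal dependencies built already. It illustrates the ordering idea only and is not Turborepo's scheduler; the dependency edges between our packages are assumed for the example.

```typescript
// package name -> internal packages it depends on
export type PackageGraph = Record<string, string[]>;

// Returns build waves; packages within one wave can build in parallel.
export function buildOrder(graph: PackageGraph): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = Object.keys(graph).sort();
  while (remaining.length > 0) {
    const ready = remaining.filter((pkg) => graph[pkg].every((dep) => done.has(dep)));
    if (ready.length === 0) {
      throw new Error(`Unresolvable dependencies among: ${remaining.join(", ")}`);
    }
    for (const pkg of ready) done.add(pkg);
    waves.push(ready);
    remaining = remaining.filter((pkg) => !done.has(pkg));
  }
  return waves;
}

// Assumed edges, for illustration only.
const workspace: PackageGraph = {
  config: [],
  validation: [],
  db: ["validation"],
  ui: ["config"],
  web: ["ui", "db", "validation"],
  api: ["db", "validation"],
  admin: ["ui", "db"],
};
console.log(buildOrder(workspace));
// [["config", "validation"], ["db", "ui"], ["admin", "api", "web"]]
```

Tasks with no dependsOn entries, such as lint, skip this ordering entirely, which is why they can start immediately alongside the first wave.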

The failure mode is cache poisoning via non-deterministic output. If a task produces different output for identical input — timestamps embedded in build artifacts, random IDs in generated code, environment variables not listed in env — the cache serves the first output forever until manually invalidated. Every cacheable task must produce identical output for identical input.

turbo.json (JSON)

{
  "$schema": "https://turbo.build/schema.json",
  "ui": "tui",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": [".next/**", "!.next/cache/**", "dist/**"],
      "inputs": [
        "src/**",
        "tsconfig.json",
        "package.json",
        "next.config.ts",
        "tailwind.config.ts"
      ]
    },
    "test": {
      "dependsOn": ["build"],
      "outputs": ["coverage/**"],
      "inputs": [
        "src/**",
        "test/**",
        "vitest.config.ts",
        "bun.test.ts"
      ]
    },
    "typecheck": {
      "dependsOn": ["^build"],
      "outputs": [],
      "inputs": ["src/**", "tsconfig.json"]
    },
    "lint": {
      "dependsOn": [],
      "outputs": [],
      "inputs": ["src/**", "biome.json"]
    },
    "db:generate": {
      "cache": false,
      "outputs": ["drizzle/**"]
    },
    "dev": {
      "cache": false,
      "persistent": true
    }
  }
}

Mental Model

Why Monorepo Over Polyrepo

A monorepo is not about putting everything in one repository — it is about making cross-package changes atomic.

Atomic commits: a breaking API change in packages/db and its consumer update in apps/web ship as one commit — no coordinated PR dance across repositories
Shared tooling: one biome.json, one tsconfig base, one GitHub Actions workflow file for the entire codebase
Dependency deduplication: one version of React and Zod, not five slightly different versions across five repositories
Discoverability: engineers find shared code without consulting external documentation or knowing which repository owns it

📊 Production Insight

A monorepo without Turborepo remote caching is slower than separate repositories because CI rebuilds everything every time. Remote caching is not an optimization — it is the mechanism that makes the monorepo feasible. Without it, do not use a monorepo. For a team of two on a 12-package monorepo, remote caching reduced average CI time from 12 minutes to 2.5 minutes per PR.

🎯 Key Takeaway

Turborepo's value is entirely in remote caching — without it, a monorepo is a CI performance regression. Cache poisoning from non-deterministic output is the primary failure mode. Define precise inputs for every task and verify by running the same task twice and comparing outputs.
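The relationship between declared inputs and the cache key can be modeled with a small sketch. This is a conceptual illustration, not Turborepo's actual hashing algorithm; cacheKey and TaskInputs are hypothetical names.

```typescript
import { createHash } from "node:crypto";

export interface TaskInputs {
  files: Record<string, string>; // declared input path -> file content
  env: Record<string, string | undefined>; // declared env var name -> current value
}

// Derive a key from the task name plus every declared file and env var.
// Keys are sorted so the result does not depend on object insertion order.
export function cacheKey(task: string, inputs: TaskInputs): string {
  const hash = createHash("sha256");
  hash.update(task);
  for (const path of Object.keys(inputs.files).sort()) {
    hash.update(`\0file:${path}\0${inputs.files[path]}`);
  }
  for (const name of Object.keys(inputs.env).sort()) {
    hash.update(`\0env:${name}=${inputs.env[name] ?? ""}`);
  }
  return hash.digest("hex");
}
```

Anything the task reads that is missing from files or env never reaches the hash. That is the poisoning scenario: the output changes, the key does not, and the stale artifact keeps being served.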

Formatting and Linting: Biome

Biome has replaced ESLint + Prettier as the unified formatting and linting tool. Written in Rust, Biome formats and lints a 200-file project in 150-300ms where ESLint + Prettier took 8-12 seconds. In watch mode and pre-commit hooks, this is the difference between feeling instant and feeling sluggish.

Configuration is a single biome.json — no plugin conflicts, no version mismatches between eslint-config-* packages, no separate .prettierrc file. When a new engineer joins, they run bun install and Biome works. There is no ESLint plugin resolution step.

Biome's formatter produces output nearly identical to Prettier. The linter covers 90%+ of ESLint rules we actually enforce. The remaining rules come from eslint-plugin-security, which has no Biome equivalent yet — we run ESLint in a narrow security-only config alongside Biome for that specific case.

The migration from ESLint + Prettier to Biome is covered by biome migrate eslint and biome migrate prettier, which convert most configurations automatically. The primary friction is custom ESLint plugins: evaluate Biome's built-in rule equivalents before deciding which plugins to keep.

biome.json (JSON)

{
  "$schema": "https://biomejs.dev/schemas/latest/schema.json",
  "vcs": {
    "enabled": true,
    "clientKind": "git",
    "useIgnoreFile": true
  },
  "files": {
    "ignoreUnknown": false,
    "ignore": ["node_modules", ".next", "dist", "coverage", "drizzle"]
  },
  "organizeImports": {
    "enabled": true
  },
  "linter": {
    "enabled": true,
    "rules": {
      "recommended": true,
      "correctness": {
        "noUnusedVariables": "error",
        "noUnusedImports": "error",
        "useExhaustiveDependencies": "error"
      },
      "style": {
        "noNonNullAssertion": "warn",
        "useImportType": "error",
        "noDefaultExport": "warn"
      },
      "suspicious": {
        "noExplicitAny": "error",
        "noConsoleLog": "warn"
      },
      "security": {
        "noDangerouslySetInnerHtml": "error"
      }
    }
  },
  "formatter": {
    "enabled": true,
    "indentStyle": "space",
    "indentWidth": 2,
    "lineWidth": 100,
    "lineEnding": "lf"
  },
  "javascript": {
    "formatter": {
      "quoteStyle": "single",
      "semicolons": "asNeeded",
      "trailingCommas": "all",
      "arrowParentheses": "always"
    }
  }
}

🔥Biome vs ESLint + Prettier: Honest Trade-off

Biome is faster and simpler but does not support the ESLint plugin ecosystem. The specific gaps in our stack: eslint-plugin-security (no Biome equivalent for several rules) and custom project-specific rules. We run ESLint in a narrow security-only config alongside Biome. If your project depends on eslint-plugin-jsx-a11y, note that Biome's a11y rules cover most of the same ground — evaluate the specific rules you actually enforce before deciding to keep ESLint.

📊 Production Insight

Linting speed affects code quality more than teams realize. When lint takes more than 3 seconds, engineers run it less frequently or disable it in their editor. Biome running in under 300ms means it runs on every save, every commit, and in CI without friction.

🎯 Key Takeaway

Biome replaces ESLint + Prettier with a single Rust-based tool that runs 30-50x faster. The trade-off is plugin ecosystem compatibility. If your project depends on specialized ESLint plugins, audit Biome's built-in rule equivalents before migrating — most teams find 90%+ coverage without keeping ESLint.

Terminal Workflow: tmux + Session Scripts

tmux manages all terminal sessions. Every project has a dedicated session with five pre-configured windows: editor, dev server, git operations, log tailing, and tests in watch mode. Attaching to a session restores the entire context, with no manual window arrangement and no re-running of dev server commands.

Session scripts automate setup. Running tms project-name creates or attaches to a session with the correct layout, starts the dev server, opens lazygit, tails the relevant log stream, and starts the test watcher. The script is idempotent: running it twice attaches to the existing session rather than creating a duplicate.

The key insight is that terminal sessions are persistent work contexts, not disposable windows. Detaching from a project, switching context for three hours, and reattaching restores the session exactly as left — dev server running, last test output visible, git diff intact.

For development on remote machines, tmux sessions run on a Fly.io development machine. SSH in from any laptop and attach to the same session. Development environment is machine-independent — a broken laptop means attaching from a different machine with no setup time.

~/.local/bin/tms · BASH

#!/usr/bin/env bash
# tmux session manager
# Usage: tms [project-name] [project-dir]
# Defaults: project name = current directory name, project dir = current directory
# Idempotent: attaches to the existing session if one exists

set -euo pipefail

PROJECT_NAME=${1:-$(basename "$(pwd)")}
SESSION="dev-${PROJECT_NAME}"
PROJECT_DIR=${2:-$(pwd)}

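# Check if session already exists. The "=" prefix forces an exact name match;
# without it, tmux can match a different session that shares the prefix.
if tmux has-session -t "=$SESSION" 2>/dev/null; then
  echo "Attaching to existing session: $SESSION"
  if [ -n "${TMUX:-}" ]; then
    # Already inside tmux: switch the client instead of nesting sessions
    tmux switch-client -t "=$SESSION"
  else
    tmux attach-session -t "=$SESSION"
  fi
  exit 0
fi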

echo "Creating new session: $SESSION"

# Window 1: Editor
tmux new-session -d -s "$SESSION" -n editor -c "$PROJECT_DIR"
tmux send-keys -t "$SESSION:editor" 'nvim .' Enter

# Window 2: Dev server
tmux new-window -t "$SESSION" -n server -c "$PROJECT_DIR"
tmux send-keys -t "$SESSION:server" 'bun run dev' Enter

# Window 3: Git (lazygit)
tmux new-window -t "$SESSION" -n git -c "$PROJECT_DIR"
tmux send-keys -t "$SESSION:git" 'lazygit' Enter

# Window 4: Logs — tails Fly.io logs for the app matching project name
# Falls back to local docker compose logs if fly cli not available
tmux new-window -t "$SESSION" -n logs -c "$PROJECT_DIR"
if command -v flyctl &>/dev/null; then
  # flyctl logs streams continuously by default (--no-tail turns streaming off)
  tmux send-keys -t "$SESSION:logs" "flyctl logs --app ${PROJECT_NAME}" Enter
else
  tmux send-keys -t "$SESSION:logs" 'docker compose logs -f --tail=100' Enter
fi

# Window 5: Tests in watch mode
tmux new-window -t "$SESSION" -n tests -c "$PROJECT_DIR"
tmux send-keys -t "$SESSION:tests" 'bun test --watch' Enter

# Return to editor
tmux select-window -t "$SESSION:editor"

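# Attach, or switch the current client when the script is run from inside tmux
if [ -n "${TMUX:-}" ]; then
  tmux switch-client -t "$SESSION"
else
  tmux attach-session -t "$SESSION"
fi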

💡tmux Session Design Principles

One session per project — never mix unrelated work in the same session; context is the value
Five standard windows: editor, server, git, logs, tests — the tests window running bun test --watch provides continuous feedback without a manual trigger
Session scripts must be idempotent — running them twice attaches to the existing session, never creates a duplicate
Name sessions with a prefix (dev-) to distinguish from ad-hoc terminal sessions created outside the script

📊 Production Insight

Context switching between projects without tmux costs 5-10 minutes of setup ritual per switch. With session scripts, switching takes 10 seconds — run tms other-project in a new terminal and both sessions persist independently. Over a day with three to four context switches, this saves 20-30 minutes.

🎯 Key Takeaway

tmux sessions are persistent work contexts, not disposable terminals. Session scripts eliminate the setup ritual entirely. Your development environment should be one command away — tms project-name — not four terminals and four commands.

CI/CD Pipeline: GitHub Actions with Turborepo Caching

GitHub Actions runs the CI pipeline. The pipeline is intentionally thin: it avoids repeating checks that already run locally. Linting and formatting run in pre-commit hooks and are not duplicated in CI. Type-checking also runs in the pre-commit hook, but the verify stage repeats it as an inexpensive gate before any tests run. CI runs: type check, unit tests, integration tests, E2E tests (on main branch only), security scan, and deployment.

Turborepo remote caching is the CI performance mechanism. When a PR changes only the web app, CI restores cached build artifacts for every unchanged package — the UI library, config packages, and validation schemas rebuild from cache in seconds rather than minutes. A typical PR that touches one app rebuilds one app.

The pipeline has three sequential stages: verify (typecheck, unit tests), integrate (build, integration tests), deploy (preview for PRs, production for main). Each stage gates the next — a type error blocks integration tests from running, saving the resources that would be spent on tests that will fail regardless.

Deployment targets: Vercel for the Next.js application, Fly.io for background workers and the Hono API. Both support instant rollback — Vercel via deployment history, Fly.io via flyctl releases rollback.

.github/workflows/ci.yml · YAML

name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
  TURBO_TEAM: ${{ vars.TURBO_TEAM }}
  # Skip env validation in CI — secrets are injected directly
  SKIP_ENV_VALIDATION: 'true'

jobs:
  verify:
    name: Typecheck and Unit Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Type check
        run: bun run typecheck

      - name: Unit tests
        run: bun test --coverage
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

      - name: Security lint
        run: bunx eslint --config eslint.security.config.js 'src/**/*.{ts,tsx}'

  integrate:
    name: Build and Integration Tests
    needs: verify
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Build (with Turborepo cache)
        run: bun run build
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

      - name: Run database migrations on test branch
        run: bun run db:migrate
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

      - name: Integration tests
        # bun test selects files by positional path filter; --testPathPattern is a Jest flag
        run: bun test integration.test
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

  e2e:
    name: End-to-End Tests
    needs: integrate
    # E2E only runs on main — too expensive to run on every PR
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Install Playwright browsers
        run: bunx playwright install --with-deps chromium

      - name: Run E2E tests
        run: bunx playwright test
        env:
          PLAYWRIGHT_BASE_URL: ${{ secrets.STAGING_URL }}

  deploy-preview:
    name: Deploy Preview
    needs: integrate
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Deploy to Vercel preview
        uses: amondnet/vercel-action@v25
        with:
          vercel-token: ${{ secrets.VERCEL_TOKEN }}
          vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
          vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}
          scope: ${{ secrets.VERCEL_ORG_ID }}

  deploy-production:
    name: Deploy Production
    needs: [integrate, e2e]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Deploy to Vercel production
        uses: amondnet/vercel-action@v25
        with:
          vercel-token: ${{ secrets.VERCEL_TOKEN }}
          vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
          vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}
          vercel-args: '--prod'
          scope: ${{ secrets.VERCEL_ORG_ID }}

⚠ CI Pipeline Anti-Patterns

📊 Production Insight

CI that takes more than 5 minutes trains engineers to batch changes and avoid small PRs. Small PRs are easier to review, safer to deploy, and faster to revert. Slow CI works against all of this. For a team of two with 8-10 PRs per week, reducing CI from 12 minutes to 2.5 minutes saved approximately 80 minutes of waiting per week and encouraged smaller, more frequent PRs.

🎯 Key Takeaway

CI should only run what cannot run locally. Pre-commit hooks handle lint, format, and type-check. CI handles tests, security scans, and deployment. Turborepo remote caching makes the monorepo CI fast. Every minute of CI time is paid by every engineer on every PR — optimize it ruthlessly.

Database and ORM: Drizzle + Neon

Drizzle ORM manages the database layer. The schema is defined in TypeScript using Drizzle's schema builder — there is no separate schema file, no code generation step, and no client to regenerate after schema changes. The TypeScript types derive directly from the schema definition.

Neon provides serverless Postgres. The two properties that matter for this stack: branch databases and instant cold starts. Each pull request can run against a dedicated Neon branch database — isolated, disposable, and seeded from a snapshot of production-anonymized data. This eliminated the shared development database that caused flaky integration tests when multiple engineers worked simultaneously.

The repository pattern keeps database logic out of components. All Drizzle queries live in src/db/repositories/. Components call repository functions — they never import drizzle or construct queries directly. This is enforced by .cursorrules and by TypeScript's module boundaries.

Migrations are SQL files generated by drizzle-kit and committed to the repository. The migration history is version controlled alongside the schema — the database state is always derivable from the repository history.

src/db/schema.ts · TYPESCRIPT

import { pgTable, text, timestamp, uuid } from 'drizzle-orm/pg-core'

export const users = pgTable('users', {
  id: uuid('id').primaryKey().defaultRandom(),
  email: text('email').notNull().unique(),
  name: text('name').notNull(),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
  updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
})

export const subscriptions = pgTable('subscriptions', {
  id: uuid('id').primaryKey().defaultRandom(),
  userId: uuid('user_id').notNull().references(() => users.id, { onDelete: 'cascade' }),
  planId: text('plan_id').notNull(),
  status: text('status', {
    enum: ['active', 'cancelled', 'past_due', 'trialing'],
  }).notNull(),
  currentPeriodStart: timestamp('current_period_start', { withTimezone: true }).notNull(),
  currentPeriodEnd: timestamp('current_period_end', { withTimezone: true }).notNull(),
  // Financial columns — boundary conditions tested manually
  // See: production incident in this article
  cancelledAt: timestamp('cancelled_at', { withTimezone: true }),
  trialEndsAt: timestamp('trial_ends_at', { withTimezone: true }),
})

// Type exports — derived from schema, no codegen required
export type User = typeof users.$inferSelect
export type NewUser = typeof users.$inferInsert
export type Subscription = typeof subscriptions.$inferSelect
export type NewSubscription = typeof subscriptions.$inferInsert

⚠ Database Boundary Condition Protocol

📊 Production Insight

Shared development databases cause flaky integration tests when multiple engineers work simultaneously — mutations from one engineer's test run affect another's. Neon branch databases eliminated this entirely. Each engineer runs against their own isolated database branch. CI runs against a fresh branch per job. Flaky integration tests dropped from 15% failure rate to under 1% after the switch.

🎯 Key Takeaway

Drizzle provides type-safe queries without codegen. Neon provides instant-branch Postgres for isolated development and test environments. The repository pattern keeps query logic out of components. Boundary conditions in date ranges require manual test coverage — not AI-generated tests.

Deployment and Observability: Vercel, Fly.io, Sentry, PostHog

The deployment layer has four components: Vercel for the Next.js application, Fly.io for background workers and the Hono API, Sentry for error tracking, and PostHog for product analytics and session replay.

Vercel handles Next.js deployment with zero configuration for the standard stack. Edge network deployment, preview URLs for every PR, and instant rollback via deployment history. The main trade-off is cost at scale — Vercel's pricing is reasonable for teams but becomes significant at high traffic volumes. Fly.io is the escape valve for workloads that do not fit Vercel's model: long-running jobs, WebSocket servers, background workers.

Sentry captures errors in production and links them to the deployment that introduced them. The integration with Next.js is configured in next.config.ts using @sentry/nextjs. Source maps are uploaded at build time — production errors show the original TypeScript source, not the compiled output.

PostHog provides feature flags, event tracking, and session replay. Feature flags allow shipping code to production gated behind a flag — a new feature ships to 5% of users, monitored for errors, then rolled out progressively. This reduces deployment risk without requiring separate staging environments for every change.

next.config.ts · TYPESCRIPT

import { withSentryConfig } from '@sentry/nextjs'
import type { NextConfig } from 'next'

const nextConfig: NextConfig = {
  experimental: {
    // Turbopack: stable for `next dev --turbopack` in Next.js 15 (opt-in); default bundler from Next.js 16
    // ppr: true, // Partial Prerendering — evaluate for your use case
  },
  // Environment variables are validated in src/env.ts so that a missing
  // variable fails the build, not the runtime (the validation wiring is not shown here)
  // Keep the Neon serverless driver external to the server bundle
  serverExternalPackages: ['@neondatabase/serverless'],
}

export default withSentryConfig(nextConfig, {
  // Sentry organization and project from environment variables
  org: process.env.SENTRY_ORG,
  project: process.env.SENTRY_PROJECT,
  // Suppress Sentry build-plugin log output
  silent: true,
  // Upload a wider set of client source maps at build time so production
  // errors resolve to the original TypeScript source
  widenClientFileUpload: true,
  // Hide source maps from the client bundle; they are uploaded to Sentry only
  hideSourceMaps: true,
  // Strip Sentry SDK logger statements from the bundle
  disableLogger: true,
})
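
The progressive rollout described above for feature flags (5% of users first, then wider) only works if each user lands in the same bucket on every request. The following is a minimal sketch of that idea, not PostHog's actual bucketing algorithm or SDK API. The names `fnv1a`, `bucketFor`, and `isInRollout` are hypothetical, and the hash choice (FNV-1a) is an assumption made for illustration.

```typescript
// Hypothetical helpers: deterministic percentage rollout.
// This is NOT PostHog's implementation; it only illustrates stable bucketing.

/** 32-bit FNV-1a hash of a string, returned as an unsigned integer. */
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

/** Maps (flag, user) to a stable bucket in the range 0..9999. */
function bucketFor(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 10_000;
}

/** True when the user falls inside the first `percent` percent of buckets. */
function isInRollout(flagKey: string, userId: string, percent: number): boolean {
  if (percent <= 0) return false;
  if (percent >= 100) return true;
  return bucketFor(flagKey, userId) < percent * 100;
}
```

Because the bucket depends only on the flag key and the user ID, raising the percentage from 5 to 25 keeps every user who already had the feature and adds new ones, so nobody flips back and forth during a rollout.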

Mental Model

The Observability Minimum

You do not need a full observability platform on day one — you need to know when something breaks before your users tell you.

Error tracking (Sentry) is mandatory from the first production deployment — zero tolerance for silent failures
Uptime monitoring (Better Uptime or Vercel's built-in) catches availability issues that Sentry misses
Database query monitoring catches performance regressions before they become user-facing slowness
Session replay (PostHog) is the fastest way to reproduce UI bugs reported by users who cannot describe what they did

📊 Production Insight

The most expensive production bugs we have seen are the silent ones — no error logs, no exceptions, no alerts. The reconciliation incident in this article was silent for 11 days. Observability at the application level (Sentry errors, PostHog funnel drops) caught a category of issue that infrastructure monitoring missed entirely. Add application-level observability before you need it.

🎯 Key Takeaway

Vercel plus Fly.io covers Next.js applications and background services with instant rollback on both. Sentry is non-negotiable from day one of production. PostHog's feature flags reduce deployment risk without requiring per-feature staging environments. Silent failures are the most expensive — instrument the application, not just the infrastructure.

Testing Strategy: The Three-Layer Approach

The testing strategy has three layers that run at different speeds and catch different categories of bugs. The production incident clarified the most important rule: AI generates test structure, humans write assertions for business logic.

Layer 1 — Unit tests with Bun test (Vitest-compatible). Colocated with source files as *.test.ts. Run on every save in watch mode and in pre-commit hooks. Fast: the full unit test suite runs in under 10 seconds. Cover: pure functions, utility logic, Zod schema validation, repository functions against a Neon branch. Boundary conditions are written manually — date ranges, pagination edges, null/undefined inputs, arithmetic boundaries.

Layer 2 — Integration tests with Bun test. Colocated in src/ as *.integration.test.ts. Run in CI after unit tests pass. Slower: 60-90 seconds for the full suite. Cover: Server Actions with real database operations, API route handlers, multi-step user flows that cross module boundaries. Each integration test runs against a fresh Neon branch — isolated state, no interference between tests.

Layer 3 — E2E tests with Playwright. Located in e2e/. Run in CI on main branch merges only. Slow: 4-8 minutes for the full suite. Cover: one test per critical user journey — signup, onboarding completion, subscription upgrade, core product action. E2E tests verify the system works end-to-end; unit and integration tests verify that the individual components work correctly.
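
The three layers are told apart purely by file location and suffix, and there is one trap in that scheme: a file named *.integration.test.ts also ends in .test.ts, so a runner that selects unit tests by the plain suffix will pick up integration tests too unless they are filtered out. The sketch below encodes the naming rules as a small classifier. The function names are hypothetical and the rules are assumed from the layer descriptions above, not taken from the project's real configuration.

```typescript
// Hypothetical classifier for the three test layers described above.
type TestLayer = 'unit' | 'integration' | 'e2e' | 'none';

/** Classifies a repository-relative path using the naming conventions. */
function classifyTestFile(path: string): TestLayer {
  const normalized = path.replace(/\\/g, '/');
  // Layer 3: everything under e2e/ belongs to Playwright
  if (normalized.startsWith('e2e/')) return 'e2e';
  // Check the more specific suffix first: *.integration.test.ts also ends in .test.ts
  if (/\.integration\.test\.tsx?$/.test(normalized)) return 'integration';
  if (/\.test\.tsx?$/.test(normalized)) return 'unit';
  return 'none';
}

/** Returns only the files that belong to the requested layer. */
function filesForLayer(paths: string[], layer: TestLayer): string[] {
  return paths.filter((p) => classifyTestFile(p) === layer);
}
```

A list produced by `filesForLayer(paths, 'unit')` can be handed to the unit test step so that the fast layer stays fast and never touches the integration database.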

src/db/repositories/subscriptions.test.ts · TYPESCRIPT

import { describe, it, expect, beforeEach } from 'bun:test'
import { getSubscriptionsInPeriod } from './subscriptions'
import { testDb } from '@/test/helpers/db'

// These tests were written manually after the production incident
// They test the SPECIFICATION (inclusive boundaries) not the implementation
// Do not replace with AI-generated tests

describe('getSubscriptionsInPeriod', () => {
  beforeEach(async () => {
    await testDb.reset()
    await testDb.seed.subscriptions()
  })

  it('includes subscriptions starting exactly on the start date (inclusive boundary)', async () => {
    const startDate = new Date('2026-01-01T00:00:00Z')
    const endDate = new Date('2026-01-31T23:59:59Z')

    // Subscription starting exactly at startDate must be included
    await testDb.insert.subscription({
      currentPeriodStart: startDate, // exactly on boundary
      currentPeriodEnd: new Date('2026-01-31T00:00:00Z'),
      status: 'active',
    })

    const results = await getSubscriptionsInPeriod(startDate, endDate)

    expect(results).toHaveLength(1)
    // Business rule: period start is inclusive — subscriptions starting on the
    // exact start date are part of the period
  })

  it('includes subscriptions ending exactly on the end date (inclusive boundary)', async () => {
    const startDate = new Date('2026-01-01T00:00:00Z')
    const endDate = new Date('2026-01-31T23:59:59Z')

    // Subscription ending exactly at endDate must be included
    await testDb.insert.subscription({
      currentPeriodStart: new Date('2026-01-15T00:00:00Z'),
      currentPeriodEnd: endDate, // exactly on boundary
      status: 'active',
    })

    const results = await getSubscriptionsInPeriod(startDate, endDate)

    expect(results).toHaveLength(1)
    // Business rule: period end is inclusive — subscriptions ending on the
    // exact end date are part of the period
  })

  it('excludes subscriptions starting after the end date', async () => {
    const startDate = new Date('2026-01-01T00:00:00Z')
    const endDate = new Date('2026-01-31T23:59:59Z')

    await testDb.insert.subscription({
      currentPeriodStart: new Date('2026-02-01T00:00:00Z'), // one second after endDate
      currentPeriodEnd: new Date('2026-02-28T00:00:00Z'),
      status: 'active',
    })

    const results = await getSubscriptionsInPeriod(startDate, endDate)

    expect(results).toHaveLength(0)
  })

  it('returns empty array when no subscriptions exist in period', async () => {
    const startDate = new Date('2025-01-01T00:00:00Z')
    const endDate = new Date('2025-01-31T23:59:59Z')

    const results = await getSubscriptionsInPeriod(startDate, endDate)

    expect(results).toHaveLength(0)
  })
})

⚠ The AI Testing Rule

📊 Production Insight

AI-generated tests gave us false confidence for three months before the production incident. The tests passed. Reviews passed. The business behavior was wrong. What was missing was independent verification against the specification rather than the implementation. When the same source produces both the code and the tests, manually written boundary tests are the only independent check.

🎯 Key Takeaway

Three test layers: unit tests (fast, colocated, every save), integration tests (CI only, real database), E2E tests (main branch only, critical user journeys). AI generates structure; humans write business logic assertions. Boundary conditions are never AI-generated.

Stop Writing Commit Messages. Use Conventional Commits with Semantic Release.

Free-form commit messages are a waste of mental cycles. They break changelogs, confuse CI/CD, and make PR history harder to review. Adopt Conventional Commits, enforced by commitlint and commitizen, and pair them with semantic-release for fully automated versioning and changelog generation. The reason is simple: your commit history becomes a searchable, automated asset instead of a messy narrative. Configure husky to run commitlint in the commit-msg hook. Use commitizen's cz-conventional-changelog adapter for a CLI prompt that guides you through types, scopes, and descriptions. Then let semantic-release parse those commits into major, minor, or patch bumps. No more 'update stuff' commits. No more manual version bumps. No more stale CHANGELOG.md. Your future self and your teammates will thank you every time they grep for that regression introduced in v2.3.0.

.commitlintrc.json · JSON

{
  "extends": ["@commitlint/config-conventional"],
  "rules": {
    "scope-case": [2, "always", "lower-case"],
    "header-max-length": [2, "always", 100]
  }
}

🔥Production Trap:

Don't skip the 'BREAKING CHANGE:' footer. Many teams adopt the convention but fail to mark breaking changes. Without this marker, semantic-release treats the commit as an ordinary feat or fix and bumps a minor or patch version instead of a major one. Your consumers get silently broken APIs in production.
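
To make the effect of the missing marker concrete, here is a simplified sketch of how a release bump can be derived from commit messages under the Conventional Commits rules. It is not semantic-release's actual analyzer, which is configurable and handles more commit types. The names `bumpForCommit` and `bumpForRelease` are hypothetical.

```typescript
// Simplified, hypothetical bump calculation following Conventional Commits.
type Bump = 'major' | 'minor' | 'patch' | 'none';

/** Determines the bump implied by one commit message (header plus optional body and footers). */
function bumpForCommit(message: string): Bump {
  const [header = '', ...rest] = message.split('\n');
  // type, optional (scope), optional "!", then ": description"
  const match = /^(\w+)(\([^)]*\))?(!)?: .+/.exec(header);
  if (!match) return 'none';
  const type = match[1];
  const hasBang = match[3] === '!';
  const hasBreakingFooter = rest.some((line) => /^BREAKING[ -]CHANGE: /.test(line));
  if (hasBang || hasBreakingFooter) return 'major';
  if (type === 'feat') return 'minor';
  if (type === 'fix') return 'patch';
  return 'none';
}

const BUMP_ORDER: Bump[] = ['none', 'patch', 'minor', 'major'];

/** The release bump is the largest bump implied by any commit in the release. */
function bumpForRelease(messages: string[]): Bump {
  return messages
    .map(bumpForCommit)
    .reduce<Bump>(
      (max, bump) => (BUMP_ORDER.indexOf(bump) > BUMP_ORDER.indexOf(max) ? bump : max),
      'none',
    );
}
```

In this sketch, a commit such as `feat(api): remove v1 endpoint` yields only a minor bump; the same commit with a `BREAKING CHANGE: v1 endpoint removed` footer yields a major one.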

🎯 Key Takeaway

If it's not in the commit message, it's not in the changelog. Automate that or waste time writing release notes by hand.

Eliminate Random Test Flakiness with Deterministic Seed Configuration

Flaky tests are productivity vampires. They drain trust in CI, raise false alarms, and lead developers to skip test suite runs entirely. The root cause is almost always non-deterministic behavior: unseeded randomness, async timing, or shared state. A fixed seed addresses the first of these and makes randomized test ordering reproducible, which helps expose the third. For Python, set PYTHONHASHSEED=0 in your test environment to disable hash randomization, and seed the random module explicitly. For Jest (JavaScript/TypeScript), pass a fixed --seed value on the command line and read it in tests with jest.getSeed(). In Go, use rand.New(rand.NewSource(42)). This forces RNG-based tests to produce identical results across runs. You'll catch ordering-dependent tests that only fail on the third CI retry. Your CI pipeline becomes predictable instead of a lottery. When a seeded test fails, you can reproduce it locally by rerunning with the same seed. No more 'works on my machine', because you control the randomness.

jest.config.js · JAVASCRIPT

module.exports = {
  // ... other config
  testSequencer: './custom-sequencer.js',
  // The seed itself is passed on the command line (jest --seed=42);
  // showSeed prints it in the run summary so a failing run can be replayed
  showSeed: true,
  // Ensure CI always runs in same timezone
  globalSetup: './jest.globalSetup.js'
};

Output

✓ passes (1 test) - jest with seed=42

✓ Deterministic_Feature_Test (8 ms)

Test Suites: 1 passed, 1 total

Tests: 12 passed, 12 total

⚠ Production Trap:

Setting a static seed breaks tests that rely on randomness for coverage (e.g., fuzzers or property-based testing). Use a fixed seed for CI regression runs but allow an env variable override (SEED=$RANDOM) for local ad-hoc runs.
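
The fixed-seed-with-override pattern can be sketched in a few lines. This is an illustration under stated assumptions, not part of Jest or Bun: `mulberry32` is a small public-domain PRNG chosen for brevity, and `resolveSeed` and `shuffle` are hypothetical helper names. The environment is passed in as a plain object so the functions stay easy to test.

```typescript
/** Small deterministic PRNG (mulberry32); returns floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Uses SEED from the given environment when it is a valid integer, else the fixed default. */
function resolveSeed(env: Record<string, string | undefined>, fallback = 42): number {
  const raw = env['SEED'];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) ? parsed : fallback;
}

/** Fisher-Yates shuffle driven by the supplied generator; does not mutate the input. */
function shuffle<T>(items: readonly T[], random: () => number): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}
```

In CI the seed stays at the default, so a failing shuffle order can be replayed exactly. Locally, running with a different SEED value explores other orders; logging the resolved seed at the start of the run keeps those ad-hoc failures reproducible as well.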

🎯 Key Takeaway

Flaky tests are a debt you pay with every merge. Fix the seed, fix the trust, fix the pipeline.

● Production incident · POST-MORTEM · severity: high

AI-Generated Tests Reproduced the Same Logic Error as AI-Generated Code

Symptom

Monthly reconciliation reports showed a six-figure discrepancy between expected and actual fund allocations. No error logs. No exceptions thrown. The system appeared healthy across all monitoring dashboards.

Assumption

The team assumed the discrepancy was caused by an upstream API returning stale data during a known maintenance window two weeks earlier.

Root cause

An AI coding assistant generated a date-range filter for the reconciliation query. The generated code used exclusive end-date comparison (date < endDate) instead of inclusive (date <= endDate). The engineer reviewed the code and approved it. The AI-generated test suite also used the same off-by-one boundary — both production code and tests agreed on the wrong behavior. The tests passed. The reviews passed. The bug shipped.
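
The difference between the two comparisons is a single character, which is why it survived review. A minimal sketch of the two predicates side by side follows. The `Allocation` shape, the function names, and the sample dates are hypothetical; only the `<` versus `<=` distinction comes from the incident.

```typescript
// Hypothetical record shape; the real reconciliation schema is not shown in this article.
interface Allocation {
  id: string;
  date: Date;
}

/** The shipped bug: the end date is excluded (date < endDate). */
function inRangeExclusiveEnd(date: Date, start: Date, end: Date): boolean {
  return date.getTime() >= start.getTime() && date.getTime() < end.getTime();
}

/** The business rule: both boundaries are inclusive (date <= endDate). */
function inRangeInclusive(date: Date, start: Date, end: Date): boolean {
  return date.getTime() >= start.getTime() && date.getTime() <= end.getTime();
}

const periodStart = new Date('2026-01-01T00:00:00Z');
const periodEnd = new Date('2026-01-31T23:59:59Z');

const allocations: Allocation[] = [
  { id: 'on-start', date: periodStart },
  { id: 'mid-month', date: new Date('2026-01-15T12:00:00Z') },
  { id: 'on-end', date: periodEnd },
];

// The exclusive version silently drops the allocation that lands exactly on the end boundary.
const exclusiveIds = allocations
  .filter((a) => inRangeExclusiveEnd(a.date, periodStart, periodEnd))
  .map((a) => a.id);

// The inclusive version keeps all three.
const inclusiveIds = allocations
  .filter((a) => inRangeInclusive(a.date, periodStart, periodEnd))
  .map((a) => a.id);
```

A test derived from the exclusive implementation would assert two results and pass. Only a test written from the business rule expects three.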

Fix

Added boundary-condition test cases written manually by engineers based on the business specification — not generated by AI and not derived from the implementation. Added a reconciliation checksum that compares total allocated versus total received at the batch level before committing any batch. Added a daily automated alert for any allocation discrepancy exceeding a defined threshold.
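
The batch-level checksum can be sketched as a guard that runs before anything is persisted. This is an illustration of the idea, not the team's actual implementation: `BatchLine`, `verifyBatch`, and `commitBatch` are hypothetical names, amounts are assumed to be integer minor units (cents), and `persist` stands in for whatever writes the batch.

```typescript
// Hypothetical types; amounts are integer cents to avoid floating-point drift.
interface BatchLine {
  accountId: string;
  amountCents: number;
}

interface ChecksumResult {
  ok: boolean;
  allocatedCents: number;
  receivedCents: number;
  differenceCents: number;
}

/** Compares total allocated against total received for one batch. */
function verifyBatch(lines: BatchLine[], receivedCents: number, toleranceCents = 0): ChecksumResult {
  for (const line of lines) {
    if (!Number.isInteger(line.amountCents)) {
      throw new Error(`Non-integer amount for account ${line.accountId}`);
    }
  }
  const allocatedCents = lines.reduce((sum, line) => sum + line.amountCents, 0);
  const differenceCents = allocatedCents - receivedCents;
  return {
    ok: Math.abs(differenceCents) <= toleranceCents,
    allocatedCents,
    receivedCents,
    differenceCents,
  };
}

/** Persists the batch only when the checksum holds; otherwise throws before any write. */
function commitBatch(
  lines: BatchLine[],
  receivedCents: number,
  persist: (lines: BatchLine[]) => void,
): ChecksumResult {
  const result = verifyBatch(lines, receivedCents);
  if (!result.ok) {
    throw new Error(`Reconciliation checksum failed: off by ${result.differenceCents} cents`);
  }
  persist(lines);
  return result;
}
```

The check is independent of how the lines were selected, so a wrong date filter upstream shows up as a non-zero difference instead of passing silently.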

Key lesson

AI-generated tests validate the implementation, not the specification — they will reproduce the same logic errors as the code they test because they derive from the same mental model
Boundary conditions — date ranges, pagination limits, off-by-one arithmetic, inclusive versus exclusive comparisons — require manual test cases written against the business rule, not the code
Financial and reconciliation systems need independent checksums at every aggregation boundary — the correctness of the code is not sufficient evidence that the output is correct
An AI assistant that generates both the code and the tests for the same feature provides zero independent verification — the reviewer is the only independent check

Production debug guide · When your tools are slowing you down instead of speeding you up · 7 entries

Symptom · 01

AI assistant suggestions require heavy editing on every completion

→

Fix

Add or update your .cursorrules file with project-specific conventions, anti-patterns, and architecture decisions. Generic context produces generic code. Reference your actual type files and hook patterns explicitly in the rules.

Symptom · 02

Local development environment takes more than 30 seconds to start

→

Fix

Profile dev server startup. If webpack is the bottleneck, migrate to Next.js with Turbopack (stable for next dev since Next.js 15, and the default bundler from Next.js 16). If dependency install is the bottleneck, switch from npm or pnpm to Bun. If database startup is the bottleneck, replace local Docker Postgres with a Neon development branch.

Symptom · 03

CI pipeline takes more than 5 minutes for a typical PR

→

Fix

Move lint, format, and type-check to pre-commit hooks. Add Turborepo remote caching — unchanged packages should restore from cache, not rebuild. Profile which CI job is the bottleneck: if it is unit tests, check for missing test isolation causing sequential runs; if it is E2E, parallelize across browser workers.

Symptom · 04

New engineers take more than one day to set up the local environment

→

Fix

Your setup documentation is missing or outdated. Create a single bun run setup command that handles everything: install dependencies, run database migrations, seed development data, copy environment variable templates. Test it on a fresh machine or clean Docker container monthly.

Symptom · 05

Context switching between projects kills flow state

→

Fix

Automate session setup with tmux session scripts. One command should create or reattach to a project session with editor, dev server, git, and logs preconfigured. Context switching should take five seconds, not five minutes.

Symptom · 06

Turborepo cache is serving stale build artifacts

→

Fix

Check your turbo.json inputs definition for the failing task. Any file that affects the output must be listed as an input. Run turbo build --dry to see what the cache key includes. If the task produces non-deterministic output (timestamps, random IDs), it cannot be cached — set cache: false for that task.

Symptom · 07

Drizzle migration fails in CI but passes locally

→

Fix

Ensure CI is running migrations against a clean database branch, not a shared development database. Neon branch databases eliminate shared-state migration conflicts. Check that the migration file was committed — Drizzle generates migration SQL files that must be version controlled alongside schema changes.

2026 Developer Tool Stack Comparison

Category	Current Choice	Primary Alternative	When to Choose Alternative
Editor	Neovim + LazyVim	VS Code / Zed	Team uniformity matters more than individual speed. Zed is the closest competitor on performance — re-evaluate in 6 months as its plugin ecosystem matures.
AI Assistant	Cursor + Claude	GitHub Copilot	GitHub Enterprise requirement, single-editor constraint, or lower per-seat budget. Copilot's inline completions are strong — the gap is multi-file Agent mode.
Runtime	Bun	Node.js 22+	Native C++ addon dependencies that Bun does not support, or team has existing Node.js tooling deeply integrated in CI.
Monorepo	Turborepo	Nx	50+ packages requiring affected-based testing and deep project graph analysis. Nx's code generation is also stronger for larger teams.
Formatter + Linter	Biome + ESLint (security only)	ESLint + Prettier	Heavy dependence on ESLint plugin ecosystem (jsx-a11y, custom rules). Biome does not run ESLint plugins, so evaluate built-in equivalents first.
Terminal	tmux + session scripts	Warp	Prefer GUI terminal with built-in AI autocomplete and do not have cloud sync restrictions. Warp requires account creation for team features.
CI/CD	GitHub Actions	Dagger / CircleCI	Need local CI execution (Dagger) or complex multi-platform pipeline orchestration (CircleCI). GitHub Actions covers 95% of use cases with less setup.
ORM	Drizzle	Prisma	Prefer generated client and schema introspection over schema-as-code. Prisma's migration workflow is smoother for teams new to database management.
Database	Neon (serverless Postgres)	Supabase / PlanetScale	Need Row Level Security and auth integration (Supabase) or horizontal sharding for high-write workloads (PlanetScale). Neon's branch databases are unmatched for development workflows.
Error tracking	Sentry	Highlight.io / Datadog	Need unified infrastructure and APM alongside error tracking (Datadog) or prefer open-source self-hosted option (Highlight.io).

⚙ Quick Reference

12 commands from this guide

File	Command / Code	Purpose
~/.config/nvim/lua/plugins/editor.lua	return {	Editor
.cursorrules	- Boilerplate: CRUD operations, type definitions, component shells	AI Coding Assistant
package.json	{	Runtime and Package Manager
turbo.json	{	Monorepo and Build System
biome.json	{	Formatting and Linting
~/.local/bin/tms	set -euo pipefail	Terminal Workflow
.github/workflows/ci.yml	name: CI	CI/CD Pipeline
src/db/schema.ts	export const users = pgTable('users', {	Database and ORM
next.config.ts	const nextConfig: NextConfig = {	Deployment and Observability
src/db/repositories/subscriptions.test.ts	describe('getSubscriptionsInPeriod', () => {	Testing Strategy
.commitlintrc.json	{	Stop Writing Commit Messages. Use Conventional Commits with Semantic Release.
jest.config.js	module.exports = {	Eliminate Random Test Flakiness with Deterministic Seed Configuration

Key takeaways

Productivity is measured by time-to-merge

not lines of code, not tool count, not hours spent. If a tool does not reduce time-to-merge, remove it.

AI assistants generate tests that reproduce the same logic errors as the code they test: boundary conditions require manual test cases written against the specification, not the implementation.

Local-first tools (Biome, Bun, Turborepo remote caching) eliminate CI round-trips for checks that should run in under three seconds before every commit.

Neon branch databases eliminate the largest source of flaky integration tests: shared mutable state. Each engineer and each CI job gets an isolated database.

The .cursorrules file is the highest-leverage configuration in the AI workflow: it encodes your architecture, naming conventions, and anti-patterns in a form that Cursor applies to every generation without being prompted.

Silent production failures are the most expensive

Sentry error tracking and application-level observability are mandatory from the first production deployment, not after the first incident.

Document your stack or it rots

STACK.md, bun run setup, and a CONTRIBUTING.md with Git conventions are not optional for any team beyond solo development.

Common mistakes to avoid

6 patterns

Adopting tools without measuring time-to-merge before and after

Symptom

Team spends more time configuring and maintaining tools than writing code. New engineer onboarding takes three days because the toolchain setup is complex, undocumented, and machine-dependent.

Fix

Track time-to-merge as the primary productivity metric. Measure it for two weeks before adopting a new tool and two weeks after. If the tool does not reduce time-to-merge by at least 10%, remove it. Document your stack in a single STACK.md with setup instructions and the reasoning behind each choice.
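
The before/after comparison described above can be sketched as a small calculation. This is a minimal illustration, not a prescribed implementation: the MergedChange shape, the choice of median over mean, and the function names are assumptions introduced here. Only the 10% threshold comes from the text.

```typescript
// Hypothetical record shape: one entry per merged change.
interface MergedChange {
  startedAt: Date; // when work on the change began (assumed to be tracked)
  mergedAt: Date;  // when the change was merged
}

const MS_PER_HOUR = 60 * 60 * 1000;

// Median time-to-merge in hours. The median is an assumption made here:
// it is less sensitive than the mean to a single long-running change.
function medianHours(changes: MergedChange[]): number {
  if (changes.length === 0) throw new Error("no merged changes in window");
  const hours = changes
    .map((c) => (c.mergedAt.getTime() - c.startedAt.getTime()) / MS_PER_HOUR)
    .sort((a, b) => a - b);
  const mid = Math.floor(hours.length / 2);
  return hours.length % 2 === 1 ? hours[mid] : (hours[mid - 1] + hours[mid]) / 2;
}

// True when the "after" window improves on the "before" window
// by at least minReduction (0.1 = the 10% threshold from the text).
function keepTool(
  before: MergedChange[],
  after: MergedChange[],
  minReduction = 0.1,
): boolean {
  const baseline = medianHours(before);
  const current = medianHours(after);
  return (baseline - current) / baseline >= minReduction;
}
```

Feeding it two weeks of merged changes from each window gives a yes/no answer that can be recorded in STACK.md next to the reasoning for the tool.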

Letting AI generate tests for the same feature it just implemented

Symptom

Test suite passes. Production behavior is wrong. Root cause: AI derives test assertions from the implementation rather than the specification, reproducing identical logic errors in both.

Fix

Establish the AI testing rule: AI generates test structure (describe blocks, mocks, setup), humans write business logic assertions. Boundary conditions — date ranges, numeric limits, inclusive versus exclusive comparisons — are always written manually. Document the business rule in a comment above the assertion.
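
As an illustration of manually written boundary assertions, the sketch below encodes one hypothetical business rule (a period includes its start instant and excludes its end instant) and lists the boundary cases a human would derive from that rule. The function isInPeriod, the rule itself, and the dates are assumptions for the example and are not taken from the guide's codebase. The cases are plain data so they can be dropped into whatever test runner the project uses.

```typescript
// Hypothetical business rule (assumption for illustration):
// a billing period includes its start instant and excludes its end instant.
function isInPeriod(at: Date, start: Date, end: Date): boolean {
  return at.getTime() >= start.getTime() && at.getTime() < end.getTime();
}

const periodStart = new Date("2026-01-01T00:00:00Z");
const periodEnd = new Date("2026-02-01T00:00:00Z");

// Human-written boundary cases. Each expectation is derived from the rule
// stated above, not from reading the implementation.
const boundaryCases: Array<{ label: string; at: Date; expected: boolean }> = [
  // Rule: start is inclusive.
  { label: "exactly at start", at: new Date("2026-01-01T00:00:00Z"), expected: true },
  // Rule: anything before start belongs to the previous period.
  { label: "1 ms before start", at: new Date("2025-12-31T23:59:59.999Z"), expected: false },
  // Rule: end is exclusive, so it belongs to the next period.
  { label: "exactly at end", at: new Date("2026-02-01T00:00:00Z"), expected: false },
  // Rule: the last millisecond before end is still inside.
  { label: "1 ms before end", at: new Date("2026-01-31T23:59:59.999Z"), expected: true },
];

// Returns the labels of cases where the implementation disagrees with the rule.
function failingBoundaryCases(): string[] {
  return boundaryCases
    .filter((c) => isInPeriod(c.at, periodStart, periodEnd) !== c.expected)
    .map((c) => c.label);
}
```

If an implementation used `<=` for the end comparison, the "exactly at end" case would fail, which is exactly the kind of error an AI-generated assertion copied from the implementation would not catch.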

Running lint and format checks in CI that already run in pre-commit hooks

Symptom

CI takes 8+ minutes including a Biome lint pass that takes 30 seconds in CI but 200ms locally. Engineers wait for CI to tell them about issues they could catch before pushing.

Fix

Move all lint, format, and type-check to pre-commit hooks via Husky and lint-staged. CI runs what is too slow or too environment-dependent for a hook: the full unit test suite, integration tests, security scans, deployments. The pre-commit hook is the fast feedback loop; CI is the correctness guarantee.

Using a shared development database for integration tests

Symptom

Integration tests pass locally but fail in CI intermittently. Root cause: concurrent test runs mutate shared database state. Flaky tests train engineers to ignore failures and merge anyway.

Fix

Use Neon branch databases: one branch per engineer for development, one fresh branch per CI job for integration tests. Isolated database state removes shared mutable state as a source of flakiness. Treat any flaky test that remains as a bug, not noise.

No environment variable validation at startup

Symptom

Application starts without error, then crashes at runtime when a feature that requires a missing environment variable is first accessed. The error occurs in production, not at boot.

Fix

Validate all environment variables with Zod and @t3-oss/env-nextjs, and import the env module from next.config.ts so validation runs during the build. Missing or malformed variables then fail the build, so the deployment stops before traffic is routed to a broken instance. The error is caught in CI, not by users.
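
The fail-fast idea can be shown without any dependencies. The sketch below is a plain TypeScript stand-in for what Zod and @t3-oss/env-nextjs do in the stack described here; it is not their API. The variable names (DATABASE_URL, SENTRY_DSN, PORT) and the validation rules are assumptions chosen for illustration.

```typescript
type EnvSource = Record<string, string | undefined>;

// Hypothetical set of variables the application needs.
interface AppEnv {
  DATABASE_URL: string;
  SENTRY_DSN: string;
  PORT: number;
}

// Validates everything up front and reports every problem in one error,
// so a single failed build lists all missing or malformed variables.
function validateEnv(source: EnvSource): AppEnv {
  const problems: string[] = [];

  const required = (key: string): string => {
    const value = source[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key} is missing`);
      return "";
    }
    return value;
  };

  const databaseUrl = required("DATABASE_URL");
  if (databaseUrl !== "" && !/^postgres(ql)?:\/\//.test(databaseUrl)) {
    problems.push("DATABASE_URL must be a postgres:// URL");
  }

  const sentryDsn = required("SENTRY_DSN");

  // PORT is optional and defaults to 3000 (an assumption for this sketch).
  const port = Number(source["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    problems.push("PORT must be a positive integer");
  }

  if (problems.length > 0) {
    throw new Error(`Invalid environment:\n- ${problems.join("\n- ")}`);
  }
  return { DATABASE_URL: databaseUrl, SENTRY_DSN: sentryDsn, PORT: port };
}
```

Calling validateEnv once at module load, with the process environment as its argument, turns a missing variable into an immediate failure instead of a crash on first use of the affected feature.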

No single setup command for the development environment

Symptom

New engineers take two to three days to get the local environment working. Each engineer's setup is slightly different, causing works-on-my-machine issues in shared tools and scripts.

Fix

Create bun run setup that handles everything: bun install, database branch creation, migration run, seed data load, and .env.local generation from .env.example. Test it monthly on a fresh machine or clean container. If it fails, fix it before onboarding the next engineer.
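
One step of such a setup command, generating .env.local from .env.example, can be sketched as a pure function. This is an illustration under assumptions: the function name renderEnvLocal and the KEY=value line format are introduced here, and reading and writing the files is left to the surrounding script.

```typescript
// Builds .env.local content from .env.example content.
// Keys present in `values` are filled in; every other line
// (comments, blanks, keys with usable defaults) is copied unchanged.
function renderEnvLocal(
  exampleText: string,
  values: Record<string, string>,
): string {
  return exampleText
    .split("\n")
    .map((line) => {
      const match = /^([A-Z_][A-Z0-9_]*)=/.exec(line);
      if (match === null) return line; // comment or blank line
      const key = match[1];
      return key in values ? `${key}=${values[key]}` : line;
    })
    .join("\n");
}
```

In a setup script, the values argument would typically carry machine-specific entries such as the connection string of the freshly created database branch, while shared defaults stay in .env.example.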

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR

How do you evaluate whether a new developer tool is worth adopting?

Q02 · SENIOR

What is your strategy for using AI coding assistants without creating te...

Q03 · SENIOR

Why choose Drizzle over Prisma for a production application?

Q04 · SENIOR

How do you design a CI pipeline that engineers do not want to bypass?

Q05 · SENIOR

How do you handle database migrations safely in a continuous deployment ...

Q01 of 05 · SENIOR

How do you evaluate whether a new developer tool is worth adopting?

ANSWER

I measure time-to-merge for two weeks before adoption and two weeks after. That metric captures end-to-end cycle time: how long it takes from starting work on a change to having it merged. If the tool does not reduce time-to-merge by at least 10%, it is not worth the maintenance and onboarding cost. Beyond the metric, I evaluate four properties: setup complexity (can a new engineer configure it in under 10 minutes?), maintenance burden (does it require frequent config updates or version pinning?), failure mode (does it degrade gracefully or block all development when it breaks?), and team impact (does it save individual time but add team-wide complexity?). The last property is the most commonly missed: a tool that makes one engineer 20% faster but adds 30 minutes of onboarding and ongoing config debt for every new hire can become a net negative for a growing team.

FAQ · 6 QUESTIONS

Frequently Asked Questions

Is Neovim worth the learning curve for a team?

How do you handle Bun incompatibility with a dependency?

How do you prevent Turborepo cache from serving stale artifacts?

Can Biome fully replace ESLint for enterprise projects?

Why use Neon instead of a local Docker Postgres for development?

How do you manage feature flags without a complex feature flag service?

Naren Founder & Principal Engineer

20+ years shipping production ML systems and the infrastructure behind them. Drawn from code that ran under real load.

Last updated: July 04, 2026
