Senior 3 min · June 25, 2026

Design a Proximity Service: Geo-Hashing, Quadtrees, and Production Pitfalls

Q: What is the best way to find nearby points in a database?

Use a spatial index. Geohash is the easiest to implement on any SQL database—just index a geohash column and query with prefix matching. For more dynamic density, use a quadtree or R-tree.

Q: What's the difference between geohash and quadtree?

Geohash uses a fixed grid; quadtree adapts to data density. Geohash is simpler and works with standard B-tree indexes. Quadtree is more memory-efficient for sparse data and more precise for dense clusters.

Q: How do I handle geohash boundary issues?

Always query the 9-cell grid: the user's cell plus its 8 neighbors. This ensures you don't miss POIs just outside the cell but within the search radius.

Q: What happens to geohash precision at high latitudes?

Geohash cells distort near the poles—they become narrower in longitude. Use a library that handles this (e.g., geohash-int) or switch to a quadtree for polar regions.

Design a proximity service for production.

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Notes here come from systems that actually shipped.

✓ Production

production tested

June 25, 2026

last updated

1,663

articles · all by Naren

● Production Incident 🔎 Debug Guide ⚙ Triage Commands

⚡Quick Answer

To design a proximity service, you index locations using a spatial data structure like a geohash or quadtree, then query for points within a bounding box or radius. The key is balancing write throughput and query latency at scale.

✦ Definition~90s read

What is Design a Proximity Service?

A proximity service finds nearby points of interest (POIs) within a given radius from a user's location. It's the engine behind 'find restaurants near me' or 'nearest gas station' features.

★

Imagine you're at a music festival and want to find the nearest food truck.

Plain-English First

Imagine you're at a music festival and want to find the nearest food truck. You don't ask every truck in the entire festival—you look at the map, see which trucks are in your section, then check distances. A proximity service does the same: it organizes locations into 'sections' (like geohash cells or quadtree nodes) so it only checks nearby ones, not the whole world.

Most proximity services fail because they treat it as a simple distance calculation problem. They slap a Haversine formula on every POI and wonder why queries take seconds at 10 million locations. The real challenge isn't math—it's indexing. Without a spatial index, you're scanning the entire dataset for every request. That doesn't scale. By the end of this article, you'll know how to build a proximity service that handles millions of POIs with sub-100ms queries, using geohashes and quadtrees, and you'll learn the production gotchas that burn teams at 3AM.

Why Naive Distance Calculations Fail at Scale

The first thing every junior does is compute Haversine distance between the user and every POI. That's O(n) per query. At 10 million POIs and 1000 queries/second, you're doing 10 billion distance calculations per second. Even with vectorized math, that's a CPU meltdown. The fix: pre-filter with a spatial index. You only compute exact distances on a small candidate set (usually <100 POIs). Without this, your service dies under load.

NaiveProximity.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// DON'T DO THIS — naive O(n) query
function findNearbyNaive(userLat, userLng, radiusKm, allPois) {
    let results = [];
    for (let poi of allPois) {
        let dist = haversine(userLat, userLng, poi.lat, poi.lng);
        if (dist <= radiusKm) {
            results.push(poi);
        }
    }
    return results;
}
// For 10M POIs, this takes ~500ms per query on a single core.
// At 1000 QPS, you need 500 cores. Not happening.

Output

500ms per query on 10M POIs (single core).

Production Trap:

Haversine is CPU-heavy (trig functions). If you must compute it, pre-calculate a bounding box first to filter out obviously far points. But really, use a spatial index.

thecodeforge.io

Proximity Service Design: Geo-Hashing & Quadtrees

Design Proximity Service

Geohashing: The Simple Spatial Index That Works

Geohashing divides the world into a grid of cells, each with a string ID. Nearby points share a common prefix. To find POIs near a user, you compute the geohash of the user's location at a precision that gives cells of ~1km, then query all POIs with that prefix. This is a simple string prefix match—fast on any database with a B-tree index. The trade-off: edge cases near cell boundaries. A point just outside the user's cell but still within radius will be missed unless you also query neighboring cells.

GeohashQuery.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// Geohash-based proximity query (pseudo-SQL)
// Assume table 'pois' has columns: id, lat, lng, geohash (varchar, indexed)

-- Step 1: Compute user's geohash at precision 6 (~1.2km cells)
-- Step 2: Query POIs with matching prefix
SELECT id, lat, lng, name
FROM pois
WHERE geohash LIKE 'u4pruy%'  -- prefix for user's cell
   OR geohash LIKE 'u4pruz%'  -- neighbor cells (8 neighbors)
   OR geohash LIKE 'u4prv0%'
   -- ... include all 9 cells

-- Step 3: Filter by exact distance (Haversine on small set)
-- This step is optional if cell size is small enough.

-- Index: CREATE INDEX idx_geohash ON pois(geohash);
-- This is a B-tree index, so prefix match is O(log n).

Output

Query returns ~100 candidates from 10M POIs in <10ms.

Senior Shortcut:

Use geohash precision that matches your radius. For 1km radius, precision 6 (cell ~1.2km). For 10km, precision 5 (~4.8km). Always query the 9-cell grid (center + 8 neighbors) to avoid boundary misses.

thecodeforge.io

Geohash Query Pipeline

Design Proximity Service

Quadtrees: When You Need Dynamic Resolution

Geohashes have fixed grid size. If your POI density varies wildly (downtown vs rural), you waste memory in sparse areas and lose precision in dense areas. Quadtrees adapt: they recursively subdivide a region into four quadrants until each leaf contains at most a threshold number of POIs. Queries traverse the tree, pruning branches that fall outside the search radius. This gives O(log n) query time and memory proportional to POI density. The downside: writes can be expensive if the tree needs rebalancing.

QuadtreeProximity.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

class Quadtree {
    constructor(boundary, capacity) {
        this.boundary = boundary; // {x, y, w, h}
        this.capacity = capacity; // max POIs per node before split
        this.points = [];
        this.children = null;
    }

    insert(point) {
        if (!this.boundary.contains(point)) return false;
        if (this.children === null && this.points.length < this.capacity) {
            this.points.push(point);
            return true;
        }
        if (this.children === null) this.subdivide();
        for (let child of this.children) {
            if (child.insert(point)) return true;
        }
        return false; // shouldn't happen
    }

    query(range, found) {
        if (!this.boundary.intersects(range)) return;
        for (let p of this.points) {
            if (range.contains(p)) found.push(p);
        }
        if (this.children !== null) {
            for (let child of this.children) child.query(range, found);
        }
    }

    subdivide() {
        let {x, y, w, h} = this.boundary;
        let halfW = w / 2, halfH = h / 2;
        this.children = [
            new Quadtree({x, y, w: halfW, h: halfH}, this.capacity),
            new Quadtree({x: x+halfW, y, w: halfW, h: halfH}, this.capacity),
            new Quadtree({x, y: y+halfH, w: halfW, h: halfH}, this.capacity),
            new Quadtree({x: x+halfW, y: y+halfH, w: halfW, h: halfH}, this.capacity)
        ];
        // Redistribute points to children
        for (let p of this.points) {
            for (let child of this.children) child.insert(p);
        }
        this.points = []; // clear parent
    }
}

// Usage:
let tree = new Quadtree({x: -180, y: -90, w: 360, h: 180}, 10);
// Insert 10M POIs...
// Query: find points within 1km of (lat, lng)
let range = {x: userLng-0.01, y: userLat-0.01, w: 0.02, h: 0.02}; // approx 1km
let nearby = [];
tree.query(range, nearby);

Output

Query returns ~50 candidates from 10M POIs in <5ms.

Never Do This:

Don't rebuild the entire quadtree on every write. Use a balanced tree or batch updates. Otherwise, a single POI insertion triggers O(n) rebalancing, killing write throughput.

thecodeforge.io

Geohash vs Quadtree Tradeoffs

Design Proximity Service

Production Gotchas: Thread Pools, Memory, and Edge Cases

I've seen a proximity service bring down a payments system because the geohash query thread pool was exhausted at 3AM. The service used a single thread pool for both reads and writes. A bulk write operation (importing 100K POIs) blocked all read queries, causing cascading timeouts. Fix: separate read and write thread pools. Also, watch out for geohash precision at high latitudes—cells get distorted near poles. Use a library that handles this (e.g., geohash-int). Memory: quadtrees with full POI objects blow up. Store only IDs and coordinates in the tree; fetch details from cache or DB.

ThreadPoolSeparation.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// Java ExecutorService example
ExecutorService readPool = Executors.newFixedThreadPool(20); // read queries
ExecutorService writePool = Executors.newFixedThreadPool(5);  // POI inserts

// Submit read tasks to readPool, write tasks to writePool
// This prevents bulk writes from starving reads.

// Also set a timeout on read tasks:
Future<List<POI>> future = readPool.submit(queryTask);
try {
    List<POI> results = future.get(100, TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
    // Return empty or fallback
    future.cancel(true);
}

Output

Read queries never blocked by writes. Timeouts handled gracefully.

Interview Gold:

Q: 'How does your proximity service handle concurrent writes and reads?' A: 'Separate thread pools for reads and writes. Read pool has more threads. Write pool has a bounded queue to backpressure bulk imports.'

When Not to Use a Proximity Service — Simpler Alternatives

If your dataset is small (<10K POIs) and query rate is low (<10 QPS), just compute Haversine on the fly. No index needed. If you only need 'closest one' (not radius), use a nearest-neighbor search with a k-d tree. If your POIs are static (e.g., store locations), precompute distances and cache them. Don't over-engineer. A proximity service with geohashes and quadtrees is overkill for a local pizza shop's website.

Senior Shortcut:

Rule of thumb: if your dataset fits in memory (<1M POIs) and you can tolerate 100ms queries, use a simple in-memory list with a bounding box pre-filter. Only reach for geohash/quadtree when you exceed 1M POIs or need <10ms queries.

● Production incidentPOST-MORTEMseverity: high

The 4GB Container That Kept Dying

Symptom

A proximity service for a food delivery app started returning 500s every 15 minutes. The container had 4GB RAM and was OOM-killed.

Assumption

The team assumed a memory leak in the geohash library. They restarted the container and bumped RAM to 8GB.

Root cause

The quadtree implementation stored all POI objects in memory, including full address strings and metadata. With 2 million POIs, each ~2KB, the tree consumed 4GB. The OOM killer struck during peak traffic when new POIs were inserted, causing tree rebalancing that temporarily doubled memory usage.

Fix

Switched to a disk-backed R-tree using SQLite's R*Tree module. Reduced memory footprint to 200MB. Also added a memory limit on the quadtree's node cache.

Key lesson

Never store full objects in spatial indexes.
Store only IDs and coordinates; fetch details on demand from a cache or DB.

Production debug guideSystematic recovery paths for the failure modes engineers actually hit.3 entries

Symptom · 01

Queries return incomplete results (missing POIs near boundaries)

→

Fix

1. Check if you query neighbor geohash cells. 2. Verify geohash precision matches radius. 3. For quadtree, check if the query range correctly covers the radius (add margin).

Symptom · 02

High memory usage in quadtree

→

Fix

1. Check if full POI objects are stored in tree nodes. 2. Reduce to ID+coords only. 3. Set a max node capacity to prevent deep trees. 4. Consider disk-backed R-tree.

Symptom · 03

Write latency spikes during POI insertion

→

Fix

1. Check if quadtree rebalances on every insert. 2. Batch inserts and rebuild tree periodically. 3. Use a write-ahead log and async index update.

★ Proximity Service Triage Cheat SheetFirst-response commands for when things go wrong — copy-paste ready.

Query returns no results even though POIs exist−

Immediate action

Check geohash precision and neighbor cells

Commands

SELECT geohash FROM pois LIMIT 5;

SELECT COUNT(*) FROM pois WHERE geohash LIKE 'u4pruy%';

Fix now

Increase geohash precision or add neighbor cell query.

OOM kill on container+

High CPU usage on queries+

Thread pool exhausted+

Feature / Aspect	Geohash	Quadtree
Query time	O(log n) with B-tree index	O(log n) average, O(n) worst-case if unbalanced
Write time	O(log n) (insert geohash string)	O(log n) average, O(n) worst-case on split
Memory overhead	Low (just string index)	Higher (tree nodes, but adaptive)
Boundary precision	Fixed grid size	Adaptive to density
Implementation complexity	Simple (string prefix match)	Moderate (recursive tree)
Best for	Uniform density, simple setup	Varying density, high precision needs

Key takeaways

Never compute Haversine on all POIs—use a spatial index to pre-filter.

Geohash is simple and works with any DB that has a B-tree index; always query neighbor cells.

Quadtrees adapt to density but can blow memory if you store full objects; store only IDs.

Separate read and write thread pools to avoid cascading failures during bulk imports.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR

How does a quadtree handle concurrent inserts and queries in a multi-thr...

Q02SENIOR

When would you choose geohash over quadtree in a production system?

Q03SENIOR

What happens when you query a geohash at the International Date Line? Ho...

Q04JUNIOR

What is a spatial index and why is it needed for proximity services?

Q05SENIOR

Your proximity service returns 500s under load. You suspect the quadtree...

Q06SENIOR

Design a proximity service for a global food delivery app with 10M resta...

Q01 of 06SENIOR

How does a quadtree handle concurrent inserts and queries in a multi-threaded environment?

ANSWER

Use a read-write lock. Queries acquire read lock (shared), inserts acquire write lock (exclusive). For higher throughput, use a lock-free quadtree or partition the tree by region (e.g., separate trees for each continent).

FAQ · 4 QUESTIONS

Frequently Asked Questions

What is the best way to find nearby points in a database?

What's the difference between geohash and quadtree?

How do I handle geohash boundary issues?

What happens to geohash precision at high latitudes?

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Notes here come from systems that actually shipped.

✓ Verified

production tested

June 25, 2026

last updated

1,663

articles · all by Naren

🔥

That's Real World. Mark it forged?

3 min read · try the examples if you haven't