Senior 3 min · June 25, 2026

Design a Proximity Service: Geo-Hashing, Quadtrees, and Production Pitfalls

Design a proximity service for production.

N
Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Notes here come from systems that actually shipped.

Follow
Production
production tested
June 25, 2026
last updated
1,663
articles · all by Naren
 ● Production Incident 🔎 Debug Guide ⚙ Triage Commands
Quick Answer

To design a proximity service, you index locations using a spatial data structure like a geohash or quadtree, then query for points within a bounding box or radius. The key is balancing write throughput and query latency at scale.

✦ Definition~90s read
What is Design a Proximity Service?

A proximity service finds nearby points of interest (POIs) within a given radius from a user's location. It's the engine behind 'find restaurants near me' or 'nearest gas station' features.

Imagine you're at a music festival and want to find the nearest food truck.
Plain-English First

Imagine you're at a music festival and want to find the nearest food truck. You don't ask every truck in the entire festival—you look at the map, see which trucks are in your section, then check distances. A proximity service does the same: it organizes locations into 'sections' (like geohash cells or quadtree nodes) so it only checks nearby ones, not the whole world.

Most proximity services fail because they treat it as a simple distance calculation problem. They slap a Haversine formula on every POI and wonder why queries take seconds at 10 million locations. The real challenge isn't math—it's indexing. Without a spatial index, you're scanning the entire dataset for every request. That doesn't scale. By the end of this article, you'll know how to build a proximity service that handles millions of POIs with sub-100ms queries, using geohashes and quadtrees, and you'll learn the production gotchas that burn teams at 3AM.

Why Naive Distance Calculations Fail at Scale

The first thing every junior does is compute Haversine distance between the user and every POI. That's O(n) per query. At 10 million POIs and 1000 queries/second, you're doing 10 billion distance calculations per second. Even with vectorized math, that's a CPU meltdown. The fix: pre-filter with a spatial index. You only compute exact distances on a small candidate set (usually <100 POIs). Without this, your service dies under load.

NaiveProximity.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// io.thecodeforge — System Design tutorial

// DON'T DO THIS — naive O(n) query
function findNearbyNaive(userLat, userLng, radiusKm, allPois) {
    let results = [];
    for (let poi of allPois) {
        let dist = haversine(userLat, userLng, poi.lat, poi.lng);
        if (dist <= radiusKm) {
            results.push(poi);
        }
    }
    return results;
}
// For 10M POIs, this takes ~500ms per query on a single core.
// At 1000 QPS, you need 500 cores. Not happening.
Output
500ms per query on 10M POIs (single core).
Production Trap:
Haversine is CPU-heavy (trig functions). If you must compute it, pre-calculate a bounding box first to filter out obviously far points. But really, use a spatial index.
Proximity Service Design: Geo-Hashing & Quadtrees THECODEFORGE.IO Proximity Service Design: Geo-Hashing & Quadtrees From naive distance calc to production-ready spatial indexing Naive Distance Calculation O(n) scan fails at scale Geohashing Grid-based index with Z-order curve Quadtrees Dynamic resolution for uneven density Production Gotchas Thread pools, memory, edge cases Simpler Alternatives When not to use proximity service ⚠ Geohash edge cells cause false negatives Query neighbor cells or use quadtree overlap THECODEFORGE.IO
thecodeforge.io
Proximity Service Design: Geo-Hashing & Quadtrees
Design Proximity Service

Geohashing: The Simple Spatial Index That Works

Geohashing divides the world into a grid of cells, each with a string ID. Nearby points share a common prefix. To find POIs near a user, you compute the geohash of the user's location at a precision that gives cells of ~1km, then query all POIs with that prefix. This is a simple string prefix match—fast on any database with a B-tree index. The trade-off: edge cases near cell boundaries. A point just outside the user's cell but still within radius will be missed unless you also query neighboring cells.

GeohashQuery.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// io.thecodeforge — System Design tutorial

// Geohash-based proximity query (pseudo-SQL)
// Assume table 'pois' has columns: id, lat, lng, geohash (varchar, indexed)

-- Step 1: Compute user's geohash at precision 6 (~1.2km cells)
-- Step 2: Query POIs with matching prefix
SELECT id, lat, lng, name
FROM pois
WHERE geohash LIKE 'u4pruy%'  -- prefix for user's cell
   OR geohash LIKE 'u4pruz%'  -- neighbor cells (8 neighbors)
   OR geohash LIKE 'u4prv0%'
   -- ... include all 9 cells

-- Step 3: Filter by exact distance (Haversine on small set)
-- This step is optional if cell size is small enough.

-- Index: CREATE INDEX idx_geohash ON pois(geohash);
-- This is a B-tree index, so prefix match is O(log n).
Output
Query returns ~100 candidates from 10M POIs in <10ms.
Senior Shortcut:
Use geohash precision that matches your radius. For 1km radius, precision 6 (cell ~1.2km). For 10km, precision 5 (~4.8km). Always query the 9-cell grid (center + 8 neighbors) to avoid boundary misses.
Geohash Query PipelineTHECODEFORGE.IOGeohash Query PipelineFrom user location to nearby POIs in 4 stepsUser LocationLat/lng from requestCompute GeohashPrecision ~1km cellPrefix QueryDB scan by geohash prefixFilter & ReturnHaversine refine on results⚠ Prefix query returns all POIs in cell; refine with Haversine for exact radiusTHECODEFORGE.IO
thecodeforge.io
Geohash Query Pipeline
Design Proximity Service

Quadtrees: When You Need Dynamic Resolution

Geohashes have fixed grid size. If your POI density varies wildly (downtown vs rural), you waste memory in sparse areas and lose precision in dense areas. Quadtrees adapt: they recursively subdivide a region into four quadrants until each leaf contains at most a threshold number of POIs. Queries traverse the tree, pruning branches that fall outside the search radius. This gives O(log n) query time and memory proportional to POI density. The downside: writes can be expensive if the tree needs rebalancing.

QuadtreeProximity.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
// io.thecodeforge — System Design tutorial

class Quadtree {
    constructor(boundary, capacity) {
        this.boundary = boundary; // {x, y, w, h}
        this.capacity = capacity; // max POIs per node before split
        this.points = [];
        this.children = null;
    }

    insert(point) {
        if (!this.boundary.contains(point)) return false;
        if (this.children === null && this.points.length < this.capacity) {
            this.points.push(point);
            return true;
        }
        if (this.children === null) this.subdivide();
        for (let child of this.children) {
            if (child.insert(point)) return true;
        }
        return false; // shouldn't happen
    }

    query(range, found) {
        if (!this.boundary.intersects(range)) return;
        for (let p of this.points) {
            if (range.contains(p)) found.push(p);
        }
        if (this.children !== null) {
            for (let child of this.children) child.query(range, found);
        }
    }

    subdivide() {
        let {x, y, w, h} = this.boundary;
        let halfW = w / 2, halfH = h / 2;
        this.children = [
            new Quadtree({x, y, w: halfW, h: halfH}, this.capacity),
            new Quadtree({x: x+halfW, y, w: halfW, h: halfH}, this.capacity),
            new Quadtree({x, y: y+halfH, w: halfW, h: halfH}, this.capacity),
            new Quadtree({x: x+halfW, y: y+halfH, w: halfW, h: halfH}, this.capacity)
        ];
        // Redistribute points to children
        for (let p of this.points) {
            for (let child of this.children) child.insert(p);
        }
        this.points = []; // clear parent
    }
}

// Usage:
let tree = new Quadtree({x: -180, y: -90, w: 360, h: 180}, 10);
// Insert 10M POIs...
// Query: find points within 1km of (lat, lng)
let range = {x: userLng-0.01, y: userLat-0.01, w: 0.02, h: 0.02}; // approx 1km
let nearby = [];
tree.query(range, nearby);
Output
Query returns ~50 candidates from 10M POIs in <5ms.
Never Do This:
Don't rebuild the entire quadtree on every write. Use a balanced tree or batch updates. Otherwise, a single POI insertion triggers O(n) rebalancing, killing write throughput.
Geohash vs Quadtree TradeoffsTHECODEFORGE.IOGeohash vs Quadtree TradeoffsFixed grid vs adaptive resolutionGeohashFixed grid size per precision levelSimple string prefix queriesWastes memory in sparse areasLoses precision in dense areasQuadtreeRecursive subdivision by densityComplex tree traversal queriesMemory efficient in sparse areasHigh precision in dense areasUse geohash for uniform density; quadtree for variable density at scaleTHECODEFORGE.IO
thecodeforge.io
Geohash vs Quadtree Tradeoffs
Design Proximity Service

Production Gotchas: Thread Pools, Memory, and Edge Cases

I've seen a proximity service bring down a payments system because the geohash query thread pool was exhausted at 3AM. The service used a single thread pool for both reads and writes. A bulk write operation (importing 100K POIs) blocked all read queries, causing cascading timeouts. Fix: separate read and write thread pools. Also, watch out for geohash precision at high latitudes—cells get distorted near poles. Use a library that handles this (e.g., geohash-int). Memory: quadtrees with full POI objects blow up. Store only IDs and coordinates in the tree; fetch details from cache or DB.

ThreadPoolSeparation.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// io.thecodeforge — System Design tutorial

// Java ExecutorService example
ExecutorService readPool = Executors.newFixedThreadPool(20); // read queries
ExecutorService writePool = Executors.newFixedThreadPool(5);  // POI inserts

// Submit read tasks to readPool, write tasks to writePool
// This prevents bulk writes from starving reads.

// Also set a timeout on read tasks:
Future<List<POI>> future = readPool.submit(queryTask);
try {
    List<POI> results = future.get(100, TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
    // Return empty or fallback
    future.cancel(true);
}
Output
Read queries never blocked by writes. Timeouts handled gracefully.
Interview Gold:
Q: 'How does your proximity service handle concurrent writes and reads?' A: 'Separate thread pools for reads and writes. Read pool has more threads. Write pool has a bounded queue to backpressure bulk imports.'

When Not to Use a Proximity Service — Simpler Alternatives

If your dataset is small (<10K POIs) and query rate is low (<10 QPS), just compute Haversine on the fly. No index needed. If you only need 'closest one' (not radius), use a nearest-neighbor search with a k-d tree. If your POIs are static (e.g., store locations), precompute distances and cache them. Don't over-engineer. A proximity service with geohashes and quadtrees is overkill for a local pizza shop's website.

Senior Shortcut:
Rule of thumb: if your dataset fits in memory (<1M POIs) and you can tolerate 100ms queries, use a simple in-memory list with a bounding box pre-filter. Only reach for geohash/quadtree when you exceed 1M POIs or need <10ms queries.
● Production incidentPOST-MORTEMseverity: high

The 4GB Container That Kept Dying

Symptom
A proximity service for a food delivery app started returning 500s every 15 minutes. The container had 4GB RAM and was OOM-killed.
Assumption
The team assumed a memory leak in the geohash library. They restarted the container and bumped RAM to 8GB.
Root cause
The quadtree implementation stored all POI objects in memory, including full address strings and metadata. With 2 million POIs, each ~2KB, the tree consumed 4GB. The OOM killer struck during peak traffic when new POIs were inserted, causing tree rebalancing that temporarily doubled memory usage.
Fix
Switched to a disk-backed R-tree using SQLite's R*Tree module. Reduced memory footprint to 200MB. Also added a memory limit on the quadtree's node cache.
Key lesson
  • Never store full objects in spatial indexes.
  • Store only IDs and coordinates; fetch details on demand from a cache or DB.
Production debug guideSystematic recovery paths for the failure modes engineers actually hit.3 entries
Symptom · 01
Queries return incomplete results (missing POIs near boundaries)
Fix
1. Check if you query neighbor geohash cells. 2. Verify geohash precision matches radius. 3. For quadtree, check if the query range correctly covers the radius (add margin).
Symptom · 02
High memory usage in quadtree
Fix
1. Check if full POI objects are stored in tree nodes. 2. Reduce to ID+coords only. 3. Set a max node capacity to prevent deep trees. 4. Consider disk-backed R-tree.
Symptom · 03
Write latency spikes during POI insertion
Fix
1. Check if quadtree rebalances on every insert. 2. Batch inserts and rebuild tree periodically. 3. Use a write-ahead log and async index update.
★ Proximity Service Triage Cheat SheetFirst-response commands for when things go wrong — copy-paste ready.
Query returns no results even though POIs exist
Immediate action
Check geohash precision and neighbor cells
Commands
SELECT geohash FROM pois LIMIT 5;
SELECT COUNT(*) FROM pois WHERE geohash LIKE 'u4pruy%';
Fix now
Increase geohash precision or add neighbor cell query.
OOM kill on container+
Immediate action
Check memory usage of quadtree
Commands
top -p $(pgrep -f proximity)
jmap -histo <pid> | head -20
Fix now
Reduce POI object size in tree; use disk-backed index.
High CPU usage on queries+
Immediate action
Check if Haversine is computed on all POIs
Commands
perf top -p <pid>
strace -c -p <pid>
Fix now
Add spatial index (geohash/quadtree) to pre-filter.
Thread pool exhausted+
Immediate action
Check thread pool queue size
Commands
jstack <pid> | grep 'pool'
curl /actuator/metrics/jvm.threads.live
Fix now
Separate read/write pools; add backpressure on writes.
Feature / AspectGeohashQuadtree
Query timeO(log n) with B-tree indexO(log n) average, O(n) worst-case if unbalanced
Write timeO(log n) (insert geohash string)O(log n) average, O(n) worst-case on split
Memory overheadLow (just string index)Higher (tree nodes, but adaptive)
Boundary precisionFixed grid sizeAdaptive to density
Implementation complexitySimple (string prefix match)Moderate (recursive tree)
Best forUniform density, simple setupVarying density, high precision needs

Key takeaways

1
Never compute Haversine on all POIs—use a spatial index to pre-filter.
2
Geohash is simple and works with any DB that has a B-tree index; always query neighbor cells.
3
Quadtrees adapt to density but can blow memory if you store full objects; store only IDs.
4
Separate read and write thread pools to avoid cascading failures during bulk imports.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
How does a quadtree handle concurrent inserts and queries in a multi-thr...
Q02SENIOR
When would you choose geohash over quadtree in a production system?
Q03SENIOR
What happens when you query a geohash at the International Date Line? Ho...
Q04JUNIOR
What is a spatial index and why is it needed for proximity services?
Q05SENIOR
Your proximity service returns 500s under load. You suspect the quadtree...
Q06SENIOR
Design a proximity service for a global food delivery app with 10M resta...
Q01 of 06SENIOR

How does a quadtree handle concurrent inserts and queries in a multi-threaded environment?

ANSWER
Use a read-write lock. Queries acquire read lock (shared), inserts acquire write lock (exclusive). For higher throughput, use a lock-free quadtree or partition the tree by region (e.g., separate trees for each continent).
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
What is the best way to find nearby points in a database?
02
What's the difference between geohash and quadtree?
03
How do I handle geohash boundary issues?
04
What happens to geohash precision at high latitudes?
N
Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Notes here come from systems that actually shipped.

Follow
Verified
production tested
June 25, 2026
last updated
1,663
articles · all by Naren
🔥

That's Real World. Mark it forged?

3 min read · try the examples if you haven't

Previous
Design Object Storage (S3)
24 / 40 · Real World
Next
Design Ticketmaster