Senior 11 min · March 17, 2026

Elasticsearch Basics

Q: When should I use Elasticsearch vs PostgreSQL full-text search?

PostgreSQL full-text search is good for moderate-scale search on data already in PostgreSQL — saves you from running a separate service. Elasticsearch wins for: very large datasets (100M+ documents), complex relevance tuning, high query throughput, real-time analytics, and when you need features like auto-complete, faceted search, or fuzzy matching at scale.

Q: What is the difference between match and term queries?

match runs the query text through the same analyser as the indexed field: it tokenises, lowercases, and applies stemming or other filters if the analyser is configured for them. Use match for full-text search on text fields. term does an exact, unanalysed match, which is useful for keyword fields, IDs, and status values. Searching a text field with term often returns no results because the field was analysed (lowercased, tokenised) during indexing while the term value is compared as-is.

Q: How do I choose the number of shards for an index?

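A common rule: aim for 20-50GB per shard. For a 200GB index, that works out to 4-10 shards. For time-series data, start with 1 primary shard per day and use ILM for rollover. Also consider your node count and heap: older Elastic guidance suggests staying under roughly 20 shards per GB of JVM heap on each node. More shards = more cluster state overhead. The primary shard count can't be changed in place (the _split and _shrink APIs both write a new index), so it's usually easier to start with fewer shards and reindex or split later than to carry too many small shards.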
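The size-per-shard rule is simple arithmetic, and a small helper makes it repeatable. This is a minimal sketch: the function name and parameters are hypothetical, and the 20-50GB band is the one quoted in the answer, not a hard limit.

```python
import math


def shard_count_range(index_size_gb, min_shard_gb=20, max_shard_gb=50):
    """Return (fewest, most) primary shards that keep each shard in the size band."""
    if index_size_gb <= 0:
        raise ValueError("index_size_gb must be positive")
    # Fewest shards: every shard is as large as the band allows.
    fewest = max(1, math.ceil(index_size_gb / max_shard_gb))
    # Most shards: every shard is as small as the band allows.
    most = max(1, math.ceil(index_size_gb / min_shard_gb))
    return fewest, most


print(shard_count_range(200))  # (4, 10)
print(shard_count_range(8))    # (1, 1): small indices need a single shard
```

Treat the result as a starting range; node count and query patterns still decide where in the range to land.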

Q: What is the difference between filter context and query context in Elasticsearch?

In query context, scoring is calculated — it tells you how well a document matches. Use `must`, `should` for queries. In filter context, scoring is irrelevant — only yes/no matching. Use `filter` for range, term, exists, etc. Filters are cacheable, so they're much faster for repeated searches. Always prefer filter context for conditions that don't need relevance scoring.
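To make the split concrete, here is a small sketch that builds a bool query body with scoring clauses under must and yes/no clauses under filter. The helper name and the status field are illustrative assumptions; only the bool, must, filter, match, term, and range keys come from the Elasticsearch query DSL.

```python
import json


def build_bool_query(scored, filters):
    """Put relevance clauses under `must` and non-scoring clauses under `filter`."""
    bool_body = {}
    if scored:
        bool_body["must"] = list(scored)      # query context: contributes to _score
    if filters:
        bool_body["filter"] = list(filters)   # filter context: match / no match only
    return {"query": {"bool": bool_body}}


body = build_bool_query(
    scored=[{"match": {"content": "sorting"}}],
    filters=[
        {"term": {"status": "published"}},               # hypothetical keyword field
        {"range": {"published": {"gte": "2024-01-01"}}},
    ],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a GET /articles/_search request.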

Elasticsearch basics: what it is, when to use it, indices and documents, full-text search with match queries, aggregations, and how it differs from relational databases.

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

June 10, 2026

last updated

⚡Quick Answer

Elasticsearch is a distributed search and analytics engine built on Apache Lucene.
Data is stored as JSON documents in indices — no fixed schema required.
Excels at full-text search with tokenisation, stemming, and relevance scoring.
match queries do full-text search; term queries do exact, unanalysed matching.
Aggregations let you run analytics on millions of documents in near-real-time.
Not a primary database: no transactions, no ACID guarantees, and newly indexed documents only become searchable after the next refresh (near-real-time, not immediate).

✦ Definition~90s read

What is Elasticsearch?

Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It solves the problem of making large volumes of unstructured or semi-structured data instantly searchable and aggregatable — think of it as a JSON document store with a hyper-optimized full-text search engine under the hood.

Unlike traditional relational databases that rely on exact-match queries and rigid schemas, Elasticsearch indexes data into an inverted index that maps every unique term to the documents containing it, enabling sub-second search across billions of documents. It's the backbone of log analytics (ELK Stack), e-commerce product search, and observability pipelines at companies like Netflix, Uber, and Wikipedia, but it's a poor choice for ACID-compliant transactional workloads or complex joins. Those belong in a transactional database such as PostgreSQL.

At its core, Elasticsearch stores data as JSON documents grouped into indices, which are logical namespaces analogous to a database in SQL. Each document is a flat or nested JSON object with fields that can be of different types — text, keyword, integer, date, geo_point, and more.

When you index a document, Elasticsearch automatically analyzes text fields: it tokenizes the text into terms, normalizes them (lowercasing by default; stemming and stop-word removal if the analyzer is configured for them), and builds an inverted index mapping each term to the document IDs where it appears. This inverted index is what makes full-text search fast: instead of scanning every document, Elasticsearch looks up the term in a sorted term dictionary and reads only that term's postings list, so the work scales with the number of matching documents rather than the size of the index.

Full-text search queries (like match or match_phrase) leverage this inverted index to find documents by relevance, using BM25 scoring to rank results (BM25 has been the default since Elasticsearch 5.0; earlier versions used TF-IDF). In contrast, keyword fields store the exact, unanalyzed value and are used for exact matches, filtering, sorting, and aggregations, for example filtering by status: "active" or aggregating by category.

The distinction is critical: use text for searchable content like product descriptions or log messages, and keyword for structured data like email addresses, tags, or enums. Aggregations, Elasticsearch's analytics superpower, let you compute metrics (sums, averages, percentiles) and build complex data bucketing (histograms, date ranges, nested groupings) across millions of documents in real time — powering dashboards like Kibana's visualizations or e-commerce facet counts.

Plain-English First

Imagine a library where instead of checking every book one by one, the librarian has a master index card for every unique word that immediately points to all books containing it. Elasticsearch works the same way: it builds an index of every word in your data so it can find things instantly, even across billions of documents.

Elasticsearch is a distributed search and analytics engine built on Apache Lucene that makes large volumes of unstructured data instantly searchable. It powers critical production systems like log analytics, e-commerce search, and observability pipelines at scale. Understanding how its inverted index, mappings, and aggregations work is essential for building fast, reliable search — and for avoiding the data loss and performance failures that come from treating it like a general-purpose database.

What Elasticsearch Actually Does

Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It indexes JSON documents into inverted indices, enabling near-real-time full-text search, structured queries, and aggregations across terabytes of data. The core mechanic is sharding: each index is split into primary and replica shards, distributed across a cluster for parallelism and fault tolerance.

Documents are stored in indices, and each shard is a complete Lucene index. Writes go to the primary shard first, then replicate; reads can be served by the primary or any replica. This gives you sub-second search on billions of documents, but only if you design your mapping upfront. Dynamic mapping is convenient in dev; in production it causes field explosion, index bloat, and silent query failures.

Use Elasticsearch when you need free-text search, log analytics, or real-time aggregations on semi-structured data. It is not a primary data store — no transactions, no strong consistency guarantees. Teams that treat it as a general-purpose database hit split-brain scenarios, data loss on cluster splits, and painful reindexing. It excels as a secondary index or analytics layer behind a durable source of truth.

Mapping Is Not Optional

Dynamic mapping in production leads to field type conflicts, index bloat, and silent query misses. Always define explicit mappings before ingesting data.

Production Insight

A team ingested logs with dynamic mapping enabled; a single malformed field created 10,000+ new fields, causing the cluster to run out of heap and crash.

Symptom: nodes throw OutOfMemoryError during indexing, cluster status turns red, and shards fail to allocate.

Rule: cap the total number of fields per index with index.mapping.total_fields.limit (default 1000) and disable dynamic mapping for production indices.

Key Takeaway

Elasticsearch is a search engine, not a database — never use it as your source of truth.

Design your mapping before ingesting a single document; dynamic mapping is a dev convenience, not a production strategy.

Shard count is fixed at index creation — choose based on your data volume and node count, not default values.

[Figure: Elasticsearch Core Concepts Flow]

Documents and Indices

An index is like a database table but stores JSON documents. Documents are automatically given a unique _id (or you can assign one). The mapping defines the schema (field types, analysers). Unlike SQL, the schema can be dynamic — new fields are automatically added if dynamic mapping is enabled. But dynamic mapping is dangerous in production (see incident above). Each index is composed of one or more shards that are distributed across the cluster. That's how Elasticsearch scales horizontally.

ExampleBASH

# Index a document (HTTP PUT/POST to the REST API)
PUT /articles/_doc/1
{
  "title": "Understanding Big O Notation",
  "content": "Big O describes the rate of growth of an algorithm...",
  "author": "Alice Chen",
  "published": "2025-03-17",
  "tags": ["algorithms", "computer science", "performance"]
}

# Response:
# { "_index": "articles", "_id": "1", "result": "created" }

# Get document by ID
GET /articles/_doc/1

# Delete document
DELETE /articles/_doc/1

# Bulk indexing (much faster than individual requests)
POST /_bulk
{ "index": { "_index": "articles", "_id": "2" } }
{ "title": "AVL Trees", "content": "Self-balancing BST..." }
{ "index": { "_index": "articles", "_id": "3" } }
{ "title": "Huffman Coding", "content": "Variable-length encoding..." }

Output

{ "_index": "articles", "_id": "1", "result": "created" }

Index vs Database Table

Don't map too literally. An index in Elasticsearch is more like a schema namespace than a table. You can store unrelated documents in the same index, but it's usually a bad idea because they share one mapping and their fields can conflict. Stick to one kind of document per index: mapping types were deprecated in 7.x and removed in 8.0.

Production Insight

Dynamic mapping is the silent killer. It looks convenient until your mapping consumes all heap.

Always set explicit mappings for indices that receive high-cardinality fields.

Rule of thumb: every thousand fields in mapping adds ~1% heap overhead on each shard.

Fix: PUT index/_mapping { "dynamic": false } before ingesting untrusted data.

Key Takeaway

An index is a logical namespace for documents with a shared mapping.

Dynamic mapping is a trap for high-cardinality data.

Always explicitly define mappings for production indices.

Full-Text Search Queries

Elasticsearch's killer feature is full-text search. When you send a match query, Elasticsearch analyses the query string (tokenises, lowercases, and applies any stemmers the field's analyser defines) and compares it against the analysed text in the inverted index. Relevance scoring (BM25 by default since 5.0, TF-IDF before that) ranks results. You can boost fields, require clauses, and exclude terms using the bool query. multi_match lets you search across multiple fields with individual boosts.

ExampleBASH

# match query: full-text search with relevance scoring
GET /articles/_search
{
  "query": {
    "match": {
      "content": "algorithm performance"
    }
  }
}

# multi_match: search across multiple fields (title^2 weights title 2x)
GET /articles/_search
{
  "query": {
    "multi_match": {
      "query": "binary tree",
      "fields": ["title^2", "content", "tags"]
    }
  }
}

# bool query: combine must, should, must_not
GET /articles/_search
{
  "query": {
    "bool": {
      "must":     [ { "match": { "content": "sorting" } } ],
      "should":   [ { "match": { "tags": "algorithms" } } ],
      "must_not": [ { "match": { "title": "deprecated" } } ],
      "filter":   [ { "range": { "published": { "gte": "2024-01-01" } } } ]
    }
  }
}

Output

{ "hits": { "total": { "value": 5 }, "hits": [...] } }
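Relevance scoring is easier to reason about with the formula in hand. The sketch below implements the textbook BM25 score for a single term, using k1 = 1.2 and b = 0.75, which are Elasticsearch's default parameters. It is an illustration, not Lucene's exact implementation (Lucene folds constants differently), and the function name and sample numbers are made up.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, docs_with_term, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term to one document."""
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Longer-than-average documents are penalised via length normalisation.
    norm = 1 - b + b * (doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less than the last.
    return idf * (tf * (k1 + 1)) / (tf + k1 * norm)


rare = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000, docs_with_term=5)
common = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000, docs_with_term=500)
long_doc = bm25_term_score(tf=1, doc_len=400, avg_doc_len=100, n_docs=1000, docs_with_term=5)

print(rare > common)    # True: rarer terms score higher
print(rare > long_doc)  # True: same match in a longer document scores lower
```

To see the real numbers for a query, add "explain": true to the search request and compare.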

How Full-Text Search Actually Works

Each text field is analysed at index time: tokenised, lowercased, and stemmed if the analyser includes a stemmer.
The inverted index stores (token → list of document IDs + position).
A match query converts the search phrase into tokens, then looks up each token in the inverted index.
Documents matching more tokens (and rarer tokens) get higher scores.
The term query skips analysis — it looks for the exact token as stored in the inverted index.
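The steps above can be sketched in a few lines. This toy model uses a regex as a rough stand-in for the standard analyser (the real one follows Unicode word-boundary rules and would split an email address differently), and the function names are hypothetical. What it shows is the mechanism: match analyses the query text before the lookup, term does not.

```python
import re
from collections import defaultdict


def analyze(text):
    """Rough stand-in for an analyser: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs):
    """Inverted index: token -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index, value):
    """No analysis: the value must equal a stored token exactly."""
    return set(index.get(value, set()))


def match_query(index, text):
    """Analyse the query first, then OR together the hits for each token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


docs = {1: "Alice.Chen@Example.com", 2: "Quick Sort Explained"}
index = build_index(docs)

print(term_query(index, "Alice.Chen@Example.com"))   # set(): no such token was stored
print(match_query(index, "Alice.Chen@Example.com"))  # {1}
print(term_query(index, "alice"))                    # {1}: matches a stored token
```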

Production Insight

Using term on a text field usually returns nothing because the indexed tokens were analysed and the term value is not.

match analyses the query, term does not.

If you need exact match on a string, map it as keyword and use term.

Failure story: team searched user emails with term on a text field — zero results for hours.

Key Takeaway

match = analysed full-text search.

term = exact, unanalysed lookup on keyword fields.

Understand analysis before writing queries.

How the Inverted Index Powers Search

The inverted index is what makes full-text search possible without a full scan. When you index a text field, the analyser produces a list of tokens. For every unique token, the inverted index stores a sorted list of document IDs that contain that token (the postings list), plus the position(s) within the document. When a match query arrives, Elasticsearch looks up each query token in Lucene's term dictionary, a compact sorted structure rather than a linear scan, and reads only the postings lists for those tokens. It then scores the matching documents and retrieves the top hits. The process is the same whether you have 1,000 or 100,000,000 documents: search cost grows with the number of query terms and the length of their postings lists, not with the total document count. That's why Elasticsearch can respond in milliseconds on huge datasets.

ExampleBASH

# Test how a text field gets analysed (produces tokens)
POST /_analyze
{
  "analyzer": "standard",
  "text": "Elasticsearch is a search engine"
}

# Response tokens:
# ["elasticsearch", "is", "a", "search", "engine"]

# For each token, the inverted index stores doc IDs
# Token "search" -> [doc1, doc5, doc9]
# Token "engine" -> [doc1, doc3]

Output

{ "tokens": [ { "token": "elasticsearch" }, { "token": "is" }, { "token": "a" }, { "token": "search" }, { "token": "engine" } ] }
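Storing positions alongside document IDs is what makes phrase queries possible. The sketch below builds a tiny positional index and checks for a phrase by looking for consecutive positions. It is a simplified model with whitespace tokenisation and hypothetical function names, not how Lucene lays out postings on disk.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map token -> {doc_id: [positions of the token in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(pos)
    return index


def phrase_match(index, phrase):
    """Return IDs of documents containing the tokens at consecutive positions."""
    tokens = phrase.lower().split()
    if not tokens or any(t not in index for t in tokens):
        return []
    # Only documents containing every token can match the phrase.
    candidates = set(index[tokens[0]])
    for t in tokens[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in sorted(candidates):
        starts = index[tokens[0]][doc_id]
        if any(all(p + i in index[t][doc_id] for i, t in enumerate(tokens)) for p in starts):
            hits.append(doc_id)
    return hits


docs = {
    1: "elasticsearch is a search engine",
    3: "engine search tips",
    5: "search the logs",
}
index = build_positional_index(docs)

print(sorted(index["search"]))               # [1, 3, 5]
print(phrase_match(index, "search engine"))  # [1]: doc 3 has both words, wrong order
```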

Inverted Index ≠ Full-Text Search on SQL

SQL databases with GIN indexes also use inverted indexes, but the tokenisation and query optimiser are far less capable. Elasticsearch's inverted index stores term frequencies, positions, and optionally payloads — all needed for BM25 scoring and phrase queries.

Production Insight

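The inverted index, combined with analysis, is why Elasticsearch handles stemming; fuzzy and prefix queries on top of it cover typos and partial matches.

If a slow query turns out to be walking most of the term dictionary (set "profile": true on the search to check), your mapping probably doesn't fit the query, e.g., a leading wildcard on a keyword field.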

Always profile slow queries: GET index/_search { "profile": true }.

Key Takeaway

The inverted index is a token→docID lookup structure built during indexing.

It makes search cost proportional to the postings lists of the query terms, not to the total number of documents.

Without it, you're scanning — that's not Elasticsearch.

[Figure: Inverted Index Structure. Step 1, tokens from analysis: elasticsearch, is, a, search, engine. Step 2, inverted index (token → sorted doc IDs): search -> [doc1, doc5, doc9]; engine -> [doc1, doc3]; elasticsearch -> [doc1, doc4].]

Full-Text vs Keyword: When to Use Each

Elasticsearch offers two fundamental ways to handle string fields: text (analysed, full-text) and keyword (exact, unanalysed). Choosing the wrong type leads to missing results or broken filters. A text field uses an analyser to break the string into tokens; a keyword field stores the entire string as a single token. Use text for human-readable content you want to search by relevance (blog posts, product descriptions). Use keyword for identifiers, enums, tags, emails, URLs: anything you filter or aggregate on. In many production schemas, the same string is mapped both ways using multi-fields: a text field for match queries with a keyword sub-field for term queries and aggregations.

ExampleBASH

# Multi-field mapping: product_name is both text and keyword
PUT /products/_mapping
{
  "properties": {
    "product_name": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}

# Full-text search on product_name
GET /products/_search
{ "query": { "match": { "product_name": "wireless mouse" } } }

# Exact filter on product_name.keyword
GET /products/_search
{ "query": { "term": { "product_name.keyword": "Wireless Mouse MX Master 3" } } }

# Aggregation on the keyword field
GET /products/_search
{
  "size": 0,
  "aggs": {
    "product_names": {
      "terms": { "field": "product_name.keyword" }
    }
  }
}

When Ignore Above Saves You

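Set ignore_above to avoid indexing unreasonably long strings. Dynamic mapping sets it to 256 on the .keyword sub-field it generates; an explicitly mapped keyword field has no such limit unless you set one. Values longer than the limit are skipped at index time (they stay in _source but can't be searched or aggregated). Very long keyword values bloat doc values and global ordinals, which is a common cause of slow aggregations, and Lucene rejects any single term over 32,766 bytes.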

Production Insight

A common production pattern: map field as text with a .keyword sub-field.

Search with match, filter/aggregate with term on .keyword.

Never use term on a pure text field — you'll get zero results because the actual stored token is analysed.

Failure: A team aggregated product names on text field (which tokenised) and got per-token counts, not per-product.
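The per-token failure is easy to reproduce on paper. This sketch counts buckets the way a terms aggregation would see them on a keyword field (one term per value) versus an analysed text field (one term per token). It is a simplified model with made-up product names; on a real cluster, aggregating on a text field also requires enabling fielddata.

```python
from collections import Counter

product_names = ["Wireless Mouse", "Wireless Keyboard", "Wireless Mouse"]

# keyword field: the whole string is a single term, so buckets are per product.
keyword_buckets = Counter(product_names)

# text field: each document contributes its distinct lowercased tokens.
text_buckets = Counter(
    token for name in product_names for token in set(name.lower().split())
)

print(keyword_buckets)  # Counter({'Wireless Mouse': 2, 'Wireless Keyboard': 1})
print(text_buckets["wireless"], text_buckets["mouse"], text_buckets["keyboard"])  # 3 2 1
```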

Aggregations — Analytics

Aggregations let you compute analytics over search results without writing custom code. They're like SQL GROUP BY on steroids: nested aggregations, date histograms, percentile ranks, geo distance, and more. Aggregations run on the shards in parallel and results are merged on the coordinating node. The size: 0 in the request avoids returning documents when you only want the aggregated data.

ExampleBASH

# Count articles per author with average content length, plus a monthly histogram
# (assumes documents carry a numeric content_length field)
# size 0: return only aggregation results, no documents
GET /articles/_search
{
  "size": 0,
  "aggs": {
    "by_author": {
      "terms": { "field": "author.keyword", "size": 10 },
      "aggs": {
        "avg_content_length": {
          "avg": { "field": "content_length" }
        }
      }
    },
    "articles_over_time": {
      "date_histogram": {
        "field": "published",
        "calendar_interval": "month"
      }
    }
  }
}

Output

{ "aggregations": { "by_author": { "buckets": [...] } } }

Aggregation Performance Trap

Aggregations on high-cardinality fields (e.g., user IDs, IPs) can consume huge memory and slow down the cluster. The terms aggregation with a large size requests many buckets from each shard, then the coordinating node sorts and merges. If size is 10000, each shard sends 10000 buckets — multiply by number of shards. Use search.max_buckets to cap memory. Pre-filter with a query to reduce the dataset before aggregating.

Production Insight

Aggregations are expensive. They are not free analytics.

Each bucket requires memory on the coordinating node.

Hard limit: search.max_buckets defaults to 65,536 in recent releases (10,000 in older 7.x versions); exceeding it fails the request with a too_many_buckets_exception.

Fix for large cardinality: use a composite aggregation with pagination, not a single terms with huge size.

Key Takeaway

Aggregations are powerful but memory-hungry.

Limit bucket size with search.max_buckets and pre-filter with queries.

For high-cardinality fields, use composite aggregation.
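Paging a composite aggregation means feeding each response's after_key back into the next request as after. The sketch below shows that loop. The composite, sources, size, after, and after_key names are part of the Elasticsearch API; the helper names, the aggregation name paged, and the fake_search stand-in (used so the example runs without a cluster) are assumptions for illustration.

```python
def composite_body(field, page_size=1000, after_key=None):
    """Build one page of a composite aggregation request body."""
    composite = {"size": page_size, "sources": [{"key": {"terms": {"field": field}}}]}
    if after_key is not None:
        composite["after"] = after_key  # resume after the last bucket of the previous page
    return {"size": 0, "aggs": {"paged": {"composite": composite}}}


def iter_buckets(search, field, page_size=1000):
    """Yield every bucket; `search` is any callable mapping a body to a parsed response."""
    after_key = None
    while True:
        agg = search(composite_body(field, page_size, after_key))["aggregations"]["paged"]
        if not agg["buckets"]:
            return
        yield from agg["buckets"]
        after_key = agg.get("after_key")
        if after_key is None:
            return


def fake_search(body):
    """Stand-in for a real client call: serves five hypothetical user IDs in pages."""
    values = ["u1", "u2", "u3", "u4", "u5"]
    comp = body["aggs"]["paged"]["composite"]
    start = values.index(comp["after"]["key"]) + 1 if "after" in comp else 0
    page = values[start:start + comp["size"]]
    agg = {"buckets": [{"key": {"key": v}, "doc_count": 1} for v in page]}
    if page:
        agg["after_key"] = {"key": page[-1]}
    return {"aggregations": {"paged": agg}}


keys = [b["key"]["key"] for b in iter_buckets(fake_search, "user_id", page_size=2)]
print(keys)  # ['u1', 'u2', 'u3', 'u4', 'u5']
```

With a real client, replace fake_search with a function that posts the body to the index's _search endpoint and returns the parsed JSON.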

Mapping, Analysis and Tokenisation

Mapping defines how documents and their fields are stored and indexed. Analysis is the process of converting a text field into tokens (terms) that go into the inverted index. An analyser consists of zero or more character filters, exactly one tokeniser, and zero or more token filters. By default, Elasticsearch uses the standard analyser, which splits text on word boundaries and lowercases it; stop-word removal is available but disabled by default. You can create custom analysers with different tokenisers (whitespace, keyword, n-gram) and filters (synonym, stemmer, stop). Understanding analysis is critical to getting search results that match user intent.

ExampleBASH

# Create an index with a custom analyser
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "unique"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_custom_analyzer"
      }
    }
  }
}

# Test the analyser
POST /my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "Elasticsearch Search Engine"
}
# Result: tokens => ["elasticsearch", "search", "engine"]

Analysis Is Field-Level

Each text field can have its own analyser. The same string can be analysed differently in different fields. For example, a title field might use a standard analyser with synonyms, while a content field uses a simpler analyser. You can also set search_analyzer separately from the index-time analyzer parameter.

Production Insight

If users complain about missing results, the first thing to check is analyser mismatch.

If the index analyser is different from the search analyser, query terms may not match indexed terms.

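Failure: A team indexed a field with an n-gram analyser for autocomplete but paired it with a search_analyzer that produced tokens the index never contained, so queries returned nothing. (When search_analyzer is not set, search falls back to the index-time analyser.)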

Key Takeaway

Analysis determines how text is tokenised and searched.

Mismatched analysers between index and search cause missing results.

Always test analysis with _analyze endpoint before deploying mappings.
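The three-stage pipeline (character filters, one tokeniser, token filters) can be modelled as plain function composition. This sketch mirrors the lowercase, stop, and unique chain from the custom analyser above. The regexes and the stop-word set are simplified stand-ins, not Elasticsearch's implementations.

```python
import re

STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}  # small illustrative subset


def strip_html(text):
    """Character filter: runs on the raw string before tokenisation."""
    return re.sub(r"<[^>]+>", " ", text)


def word_tokenizer(text):
    """Tokeniser: exactly one per analyser; here, runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    return [t.lower() for t in tokens]


def stop(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def unique(tokens):
    return list(dict.fromkeys(tokens))  # drop duplicates, keep first-seen order


def analyze(text, char_filters=(), tokenizer=word_tokenizer, token_filters=()):
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:  # order matters: lowercase must precede stop
        tokens = token_filter(tokens)
    return tokens


chain = [lowercase, stop, unique]
print(analyze("Elasticsearch Search Engine", token_filters=chain))
# ['elasticsearch', 'search', 'engine']
print(analyze("<b>The</b> Search is the search", [strip_html], token_filters=chain))
# ['search']
```

Swapping the order of lowercase and stop in the chain lets "The" slip through, which is the kind of mistake the _analyze endpoint catches before deployment.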

Mapping & Analyzers: Technical Reference Guide

This reference covers the essential mapping types, parameter configuration, and custom analyser components you'll use in production. The mapping is the schema definition for an index — it tells Elasticsearch how to store and index each field. Key field types: text (full-text), keyword (exact match), integer, long, float, double, boolean, date (ISO 8601), nested (array of objects), flattened (arbitrary key-value), geo_point, ip. Each type has parameters: index (true/false), store (separate stored field), doc_values (for aggregations/sorting), norms (scoring), copy_to (combine fields).

Analysers consist of: character filters (strip HTML, replace patterns), tokeniser (standard, whitespace, edge_ngram), token filters (lowercase, stemmer, stop, synonym, asciifolding). Custom analysers are defined in index settings and referenced in mappings.

Production-critical parameters: ignore_above (keyword length limit), coerce (auto-convert numbers), eager_global_ordinals (pre-load for high cardinality keyword fields used in aggregations), fields (multi-field for same string).

ExampleBASH

# Full-featured index definition: analysis settings plus explicit mapping.
# Custom analysers must exist before a mapping can reference them, so both go
# into the create-index request (PUT /logs), not into PUT /logs/_mapping.
PUT /logs
{
  "settings": {
    "index.mapping.total_fields.limit": 200,
    "analysis": {
      "tokenizer": {
        "my_edge_ngram": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10
        }
      },
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms": ["fast, quick", "small, little"]
        }
      },
      "analyzer": {
        "my_standard_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonyms", "stop"]
        },
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "my_edge_ngram",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "dynamic": false,
    "properties": {
      "message": {
        "type": "text",
        "analyzer": "my_standard_analyzer",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 512 },
          "ngram": { "type": "text", "analyzer": "ngram_analyzer" }
        }
      },
      "severity": {
        "type": "keyword",
        "doc_values": false,
        "norms": false
      },
      "timestamp": { "type": "date" }
    }
  }
}
# severity: doc_values off because it is never sorted or aggregated on;
# norms off because it needs no scoring. html_strip is a built-in char filter.

Beware of Synonymous Token Bloat

Synonym filters can expand the inverted index dramatically when applied at index time. A field with 20 synonyms per token can multiply its postings by up to 20x. Keep synonym lists small, and for multi-word synonyms use the synonym_graph filter in a search-time analyser to avoid false positives.

Production Insight

Mapping is schema-on-write, not schema-on-read.

Once a mapping is published, changing a field type requires reindexing (create new index, reindex documents, swap alias).

Use index templates (_index_template; the older _template API is legacy) to enforce mapping consistency across time-based indices.

Real incident: A team tried to change a text field to keyword via PUT mapping. Elasticsearch does not apply that kind of type change, so the field stayed text and queries kept failing until they reindexed.

Key Takeaway

A field's type is fixed once it is mapped: you can add new fields, but you can't change an existing one in place.

Use index templates and explicit mappings for production.

Master analyser parameters: tokeniser, character filter, token filters.
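The reindex-and-swap procedure comes down to two API calls once the new index exists with the corrected mapping. The sketch below only builds the request bodies. The _reindex and _aliases endpoints and their source, dest, actions, add, and remove keys are real; the helper name and the index and alias names (logs_v1, logs_v2, logs) are hypothetical.

```python
import json


def reindex_plan(old_index, new_index, alias):
    """Return (method, path, body) tuples: copy documents, then repoint the alias."""
    return [
        # 1. Copy every document into the new index (created beforehand
        #    with the corrected mapping).
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 2. Move the alias in a single request so readers never see a gap.
        ("POST", "/_aliases", {
            "actions": [
                {"remove": {"index": old_index, "alias": alias}},
                {"add": {"index": new_index, "alias": alias}},
            ]
        }),
    ]


plan = reindex_plan("logs_v1", "logs_v2", "logs")
for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```

Applications that always query the alias rather than the concrete index name need no change when the swap happens.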

Cluster Architecture, Sharding and Scaling

An Elasticsearch cluster consists of nodes: master-eligible nodes (handle cluster state), data nodes (store data and execute queries), ingest nodes (pre-process documents), and coordinating nodes (route requests). Data is split into shards, and each shard is a separate Lucene index. When you index a document, it is routed to a primary shard based on _routing (default: document ID hash). The primary shard synchronously replicates to replica shards on other nodes. Scaling means adding data nodes, after which shards are automatically rebalanced. But you cannot change the number of primary shards of an existing index in place (the _split and _shrink APIs create a new index), so right-sizing shards at index creation time is critical.

ExampleBASH

# Check cluster health
GET /_cluster/health?pretty
# Sample output: { "status": "green", "number_of_nodes": 5, "active_shards": 20 }

# View shard allocation
GET /_cat/shards?v

# Create index with custom shard count and replication
PUT /my_logs
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  }
}

Output

{ "status": "green", "number_of_nodes": 5, "active_shards": 20, ... }

Production Insight

Too many shards is worse than too few.

Each shard uses memory for metadata, file handles, and cluster state.

A cluster with 1000+ shards per node will degrade performance regardless of data size.

The formula: roughly 20 shards per GB of JVM heap on each data node (older Elastic guidance) is a safe upper limit.

Fix: Use ILM (Index Lifecycle Management) to roll over indices and shrink shards over time.

Key Takeaway

Shards are the unit of horizontal scalability.

Primary shard count is fixed at index creation — get it right.

Too many shards hurts performance; ILM helps manage shard lifecycle.

Choosing Number of Shards

If: Time-series data, docs/day < 500K → Use: Start with 1 shard per day. Use rollover index + ILM.

If: Full-text search index, < 10GB → Use: Start with 1 shard. You can reindex later.

If: Log aggregation, > 50GB/day → Use: Data streams with ILM. 5-20 shards per day depending on node count.

Shard & Replica Distribution Layout

Visualising how shards and replicas are distributed across nodes helps you understand resilience and read scalability. Each index has a set of primary shards (P0, P1, ...). Each primary shard has zero or more replica shards (R0, R1, ...) on different nodes. Elasticsearch ensures primary and replica shards for the same shard number are never allocated on the same node — this guarantees high availability. When you have replicas, read requests can be served by any copy, distributing load. The diagram below shows a 2-node cluster with an index of 3 primary shards and 1 replica each.

ExampleBASH

# Create index with 3 primary shards and 1 replica
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

# View allocation breakdown
GET _cat/shards/my_index?v&h=index,shard,prirep,state,node
# index   shard prirep state   node
# my_index 0     p      STARTED node-1
# my_index 0     r      STARTED node-2
# my_index 1     p      STARTED node-2
# my_index 1     r      STARTED node-1
# my_index 2     p      STARTED node-1
# my_index 2     r      STARTED node-2

Shard Routing Logic

When you index a document, Elasticsearch computes shard = hash(_routing) % number_of_primary_shards. The default routing value is the document _id. If you need to group related documents on the same shard (e.g., all documents for user 42), specify a custom routing value. But beware: every document with the same routing value lands on the same shard, so a low-cardinality or skewed routing key (one huge tenant, for example) produces uneven shard sizes. Prefer routing keys with many evenly sized values.
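A quick simulation shows how a routing value maps to a shard and what happens to shard sizes when many documents share one routing value. This is a sketch under stated assumptions: zlib.crc32 stands in for the Murmur3 hash Elasticsearch uses, and the document and tenant names are invented.

```python
import zlib
from collections import Counter


def shard_for(routing, num_primary_shards):
    """shard = hash(routing) % number_of_primary_shards (crc32 as a stand-in hash)."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# Default behaviour: routing value = document _id, so documents spread out.
by_id = Counter(shard_for(f"doc-{i}", 3) for i in range(3000))

# Custom routing with a single shared value: every document hits one shard.
by_tenant = Counter(shard_for("tenant-42", 3) for _ in range(3000))

print(len(by_id))      # number of shards that received documents when routing by _id
print(len(by_tenant))  # 1
```

The same calculation explains why the primary shard count is fixed: changing the divisor would send existing documents to different shards.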

Production Insight

Replica shards balance read load. With 3 primaries and 1 replica each, every shard has two copies, and a search can use either copy of each shard.

But write throughput doesn't increase with replicas: every replica repeats the indexing work.

Failure: A team set 5 replicas on a write-heavy index, hoping to scale writes, but writes became slower because each doc was indexed 6 times (1 primary + 5 replicas).

Fix: Only increase replicas for read-heavy workloads. Use ILM to manage replica count for time-series data (hot: high replicas, warm: low replicas).

Key Takeaway

Primaries accept writes, replicas serve reads and provide failover.

Primary and replica for the same shard never coexist on one node.

Use custom routing sparingly — it clusters documents but risks uneven shard sizes.

[Figure: Shard & Replica Distribution across 2 Nodes]

Why Your Elasticsearch Cluster Died at 3 AM — Heap Pressure and GC Pauses

Every production meltdown I’ve seen with Elasticsearch traces back to one thing: the JVM heap. It’s not disk space. It’s not network. It’s the garbage collector stalling because you treated Elasticsearch like a document store instead of a search engine.

Elasticsearch runs on Lucene. Lucene writes to segments on disk, not to heap. But your aggregations, parent-child relationships, and heavy terms queries build in-memory hash tables that eat young gen alive. When those promotions fail, you get long GC pauses. Nodes drop out of the cluster. Shards go missing. You wake up to a 503.

The fix starts before deployment. Set -Xms and -Xmx to the same value, no more than 50% of available RAM, and stay below roughly 31GB so the JVM keeps using compressed object pointers. The rest goes to the OS page cache, which is where Lucene actually lives. Use _cat/nodes?v&h=heap.percent,ram.percent to monitor the split. If heap percent stays above 85 under load, you're running without headroom, and a single heavy aggregation can tip a node into long GC pauses. Split the index, cache fewer fields, or push aggregations to dedicated coordinating nodes.

CheckHeapPressure.sqlSQL

// io.thecodeforge — database tutorial

// Check per-node heap and RAM usage across cluster
GET _cat/nodes?v&h=name,node.role,heap.current,heap.percent,ram.percent

// Sample output:
// name      node.role heap.current heap.percent ram.percent
// es-data-1 di         12.8gb       82           91
// es-data-2 di         14.1gb       94           88
// es-master m          2.1gb        22           45

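// If heap.percent > 85 for data nodes:
// 1. Reduce shard count per node (older Elastic guidance: ~20 per GB heap)
// 2. Avoid fielddata on text fields; aggregate on keyword sub-fields instead
// 3. Cap the fielddata cache. indices.fielddata.cache.size is a static node
//    setting, so it belongs in elasticsearch.yml (restart required), not in
//    the cluster settings API:
//      indices.fielddata.cache.size: 20%
// Inspect current fielddata usage per node:
GET _cat/fielddata?v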

Output

name node.role heap.current heap.percent ram.percent

es-data-1 di 12.8gb 82 91

es-data-2 di 14.1gb 94 88

es-master m 2.1gb 22 45

Production Trap:

Adding more RAM to a node won't fix heap pressure if you don't resize the JVM heap, and raising the heap too far starves the OS cache that Lucene depends on. Setting the heap to 32GB on a 64GB box looks like the 50% rule, but it crosses the compressed-pointer threshold, so you get less usable heap and longer GC pauses than at 31GB. Read the Elastic 'Heap: Sizing and Swapping' guide before you touch production.

Key Takeaway

Keep JVM heap under 50% of physical RAM. Monitor heap.percent vs ram.percent. A high ram.percent is normal because the OS page cache fills free memory; heap.percent staying above 85 means you have a design problem.
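Both rules are easy to automate. The sketch below picks a heap size from physical RAM and flags nodes in _cat/nodes output whose heap.percent exceeds a threshold. The function names are hypothetical, and the 31GB ceiling is the commonly cited compressed-pointer guidance rather than a figure from the text above; the sample rows are the ones shown earlier in this section.

```python
def recommended_heap_gb(ram_gb, max_heap_gb=31):
    """Half of physical RAM, capped so the JVM keeps compressed object pointers."""
    return min(ram_gb / 2, max_heap_gb)


def nodes_over_heap(cat_output, threshold=85):
    """Return names of nodes whose heap.percent exceeds the threshold."""
    lines = cat_output.strip().splitlines()
    header = lines[0].split()
    name_col = header.index("name")
    heap_col = header.index("heap.percent")
    flagged = []
    for row in lines[1:]:
        cells = row.split()
        if int(cells[heap_col]) > threshold:
            flagged.append(cells[name_col])
    return flagged


SAMPLE = """name      node.role heap.current heap.percent ram.percent
es-data-1 di         12.8gb       82           91
es-data-2 di         14.1gb       94           88
es-master m          2.1gb        22           45"""

print(recommended_heap_gb(16))  # 8.0
print(recommended_heap_gb(64))  # 31
print(nodes_over_heap(SAMPLE))  # ['es-data-2']
```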

The Write Path — Why Your Indexing Throughput Is Lying to You

You watch the indexing rate on the monitoring page and think everything’s fine. Then you do a forced refresh and all those buffered writes flood the disk. Your search latency spikes. Users complain. You had 10k docs/second incoming but the disk queue depth hit 100. That’s not throughput. That’s deferred pain.

Elasticsearch doesn’t write to disk on every _bulk request. It keeps segments in memory until a refresh (every second by default) or a flush (to translog). That translog is your safety net if the node crashes — but replaying it on startup can take minutes if you let it grow. A shard with 1 million uncommitted docs after an unclean shutdown? Good luck.

The move: let index.translog.flush_threshold_size decide how often flushes happen, and use index.translog.durability plus index.translog.sync_interval to decide how often the translog itself is fsynced. For bulk-heavy pipelines, async durability with a 30s sync interval is faster, but a crash can lose up to the last 30 seconds of acknowledged writes. Never use _forcemerge until indexing is fully done: it rewrites segments and churns the file system cache. Batch your writes. Keep segment count under 150 per shard. And for the love of god, disable replicas during reindexing and turn them back on after.

TuneWritePerformance.sqlSQL

// io.thecodeforge — database tutorial

// Optimize index for high-volume writes (bulk load only)
PUT /logs_write_optimized
{
  "settings": {
    "index": {
      "refresh_interval": "-1",           // turn off auto-refresh
      "number_of_replicas": 0,            // no replicas during load
      "translog": {
        "sync_interval": "30s",
        "durability": "async"            // safer: 'request' for durability
      },
      "mapping": {
        "total_fields": { "limit": 2000 }  // cap on mapped fields (default 1000)
      }
    }
  }
}

// After all data indexed:
PUT /logs_write_optimized/_settings
{
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": 1
  }
}
POST /logs_write_optimized/_forcemerge?max_num_segments=5

Output

{

"acknowledged": true,

"shards_acknowledged": true,

"index": "logs_write_optimized"

}

Senior Shortcut:

Use a separate ingest pipeline with grok or dissect to parse fields before they hit the indexing node. CPU-intensive parsing on the ingest node keeps your data nodes focused on writing segments. Split the work.

Key Takeaway

Indexing throughput is a lie until you look at translog size, segment count, and disk queue depth. Batch, flush sporadically, and never force-merge during live indexing.
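
"Batch your writes" means sending newline-delimited JSON to _bulk in fixed-size chunks instead of one document per request. Here is a minimal sketch of building those request bodies in Python. The helper names and the batch size of 2 are assumptions for illustration, the index name is the one from the example above, and a real loader would size batches by bytes and POST each body with Content-Type: application/x-ndjson.

```python
import json


def build_bulk_body(index, docs):
    """Return the newline-delimited JSON body expected by POST /_bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # source line
    return "\n".join(lines) + "\n"  # the body must end with a newline


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


docs = [{"message": f"event {i}"} for i in range(5)]
bodies = [build_bulk_body("logs_write_optimized", batch) for batch in chunked(docs, 2)]

print(len(bodies))  # 3
print(bodies[0])
```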

Setup and Installation — Get Elasticsearch Running Before You Burn an Hour

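Most engineers waste time fighting installation when they should be fighting bad queries. Elasticsearch is a JVM application, but since 7.0 every distribution ships with a bundled JDK, so you normally don't install Java yourself; only check the JVM version if you override the bundled one with ES_JAVA_HOME. Download the tarball from Elastic's official site or use Elastic's own apt/yum repositories, and treat brew and third-party packages as dev-only. Untar, run bin/elasticsearch, and it boots on port 9200. If you want Docker, use the official image and pin the heap explicitly, for example ES_JAVA_OPTS="-Xms1g -Xmx1g", alongside a container memory limit. Defaults will eat your laptop alive. Production installs need the elasticsearch.yml file tuned for your cluster. Set discovery.type: single-node for local testing, then discovery.seed_hosts (plus cluster.initial_master_nodes on the very first boot) for real setups. Keep the data and logs directories on separate volumes where you can, so a runaway log can't fill the data disk. Don't run as root: Elasticsearch refuses to start as root anyway, so create a dedicated elasticsearch user. Verify with curl -XGET 'localhost:9200'. On 8.x, where security is on by default, that becomes https://localhost:9200 with the elastic user's credentials. If you see a cluster name, you're live. If you see an error, check your heap. Always.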

SetupAndVerify.sqlSQL

# io.thecodeforge — database tutorial

# Download and extract Elasticsearch into /opt
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.12.0-linux-x86_64.tar.gz
tar -xzf elasticsearch-8.12.0-linux-x86_64.tar.gz -C /opt
cd /opt/elasticsearch-8.12.0

# Configure minimum production settings
echo "cluster.name: prod-cluster" >> config/elasticsearch.yml
echo "node.name: node-1" >> config/elasticsearch.yml
echo "path.data: /var/lib/elasticsearch" >> config/elasticsearch.yml
echo "path.logs: /var/log/elasticsearch" >> config/elasticsearch.yml
echo 'discovery.seed_hosts: ["10.0.0.1", "10.0.0.2"]' >> config/elasticsearch.yml

# Create a dedicated user that owns the install, data, and log directories
useradd -m elasticsearch
mkdir -p /var/lib/elasticsearch /var/log/elasticsearch
chown -R elasticsearch:elasticsearch /opt/elasticsearch-8.12.0 /var/lib/elasticsearch /var/log/elasticsearch

# Start as that user (Elasticsearch refuses to run as root)
su elasticsearch -c './bin/elasticsearch -d -p pid'

# Health check (on 8.x with default security, use https:// and -u elastic)
curl -XGET 'localhost:9200/_cluster/health?pretty'

Output

{

"cluster_name" : "prod-cluster",

"status" : "green",

"number_of_nodes" : 3,

"active_primary_shards" : 45,

"active_shards" : 90

}

Heap Trap:

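Never set heap size above 31GB. Past roughly 32GB the JVM stops using compressed ordinary object pointers (OOPs), every reference grows from 4 to 8 bytes, and you end up with less usable heap for more RAM. Start around -Xms4g -Xmx4g on small nodes and grow toward 50% of RAM only when heap.percent tells you to. Bigger isn't automatically better.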

Key Takeaway

Elasticsearch is a JVM process first. Validate your JVM, memory, and disk paths before you write a single query.
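
The two heap rules (at most half of RAM, never above 31GB) collapse into one line of arithmetic. A small sketch, assuming whole-gigabyte sizing; the function names are made up for illustration and this is not an official sizing tool.

```python
def recommended_heap_gb(ram_gb, max_heap_gb=31):
    """Half of physical RAM, capped below the compressed-OOPs cutoff."""
    return max(1, min(ram_gb // 2, max_heap_gb))


def jvm_heap_flags(ram_gb):
    """Render matching -Xms/-Xmx flags so the heap never resizes at runtime."""
    heap = recommended_heap_gb(ram_gb)
    return f"-Xms{heap}g -Xmx{heap}g"


for ram in (8, 64, 128):
    print(ram, jvm_heap_flags(ram))
# 8 -Xms4g -Xmx4g
# 64 -Xms31g -Xmx31g
# 128 -Xms31g -Xmx31g
```

On the 128GB box the extra memory is not wasted: everything above the heap goes to the page cache.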

Security and Access Control — Stop Running Elasticsearch With No Pants

Before version 8, a default Elasticsearch install shipped with no authentication, and the moment you bound it to 0.0.0.0:9200 anyone who could reach the port owned your data. That's a lawsuit waiting to happen. The first thing you do is enable the Elasticsearch security features. In version 8+, it's automatic, but only if you didn't override it. For older clusters, set xpack.security.enabled: true in elasticsearch.yml. Then run bin/elasticsearch-setup-passwords auto to generate credentials for the elastic superuser (on 8.x, the replacement is bin/elasticsearch-reset-password -u elastic). Store those in a vault, not a sticky note. Next, create roles per use case: read_only for dashboards, ingest for Logstash, admin for your CI. Use the security API, not the config file. TLS is non-negotiable for transport between nodes. Generate certificates with bin/elasticsearch-certutil cert. If you're on Kubernetes, use the Elastic Cloud on Kubernetes (ECK) operator, which handles secrets and certs for you. API keys are preferred over passwords for automation. Revoke them with one DELETE /_security/api_key call. Audit logs catch the intern who drops an index at 2 AM. Turn them on with xpack.security.audit.enabled: true.

SecurityHardening.sqlSQL

# io.thecodeforge — database tutorial

# Enable security in config
echo "xpack.security.enabled: true" >> config/elasticsearch.yml
echo "xpack.security.transport.ssl.enabled: true" >> config/elasticsearch.yml
echo "xpack.security.transport.ssl.verification_mode: certificate" >> config/elasticsearch.yml

# Generate TLS certificates for inter-node communication
bin/elasticsearch-certutil cert --pem --out tls-certs.zip

# Create a read-only role via API
curl -XPOST 'localhost:9200/_security/role/read_only' \
  -H 'Content-Type: application/json' \
  -u elastic:YOUR_PASSWORD \
  -d '{
    "cluster": ["monitor"],
    "indices": [{
      "names": ["logs-*"],
      "privileges": ["read", "view_index_metadata"]
    }]
  }'

Output

{

"role": {

"created": true

}

}

Senior Shortcut:

Use API keys for any service talking to Elasticsearch. You can expire them, revoke them instantly, and they don't clog your logs with failed password attempts.

Key Takeaway

Never let Elasticsearch face the network without authentication. Enable security before you ingest a single document.
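
For the API-key route, a service needs two things: a request body for POST /_security/api_key and the Authorization header built from the id and api_key values the response returns. Here is a hedged sketch of both. The helper names, the 30d expiration, and the role name are assumptions for illustration; the privileges mirror the read_only role above.

```python
import base64


def api_key_request(name, index_pattern, expiration="30d"):
    """Body for POST /_security/api_key granting read-only access to one pattern."""
    return {
        "name": name,
        "expiration": expiration,
        "role_descriptors": {
            "read_only": {
                "cluster": ["monitor"],
                "indices": [
                    {
                        "names": [index_pattern],
                        "privileges": ["read", "view_index_metadata"],
                    }
                ],
            }
        },
    }


def api_key_header(key_id, api_key):
    """Authorization header: 'ApiKey ' + base64('<id>:<api_key>')."""
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


body = api_key_request("dashboard-reader", "logs-*")
header = api_key_header("example-id", "example-secret")  # placeholder values
print(body["expiration"], header["Authorization"].split()[0])
```

Scoping the key with role_descriptors means a leaked key can read logs-* and nothing else, which is the whole point of not handing out the elastic password.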

Why Data Ingestion Kills Your Cluster — The Ingestion Pipeline

Elasticsearch is only as fast as the data you feed it. Most engineers skip the ingestion pipeline and wonder why indexing blocks, field mappings drift, or documents disappear. The why: raw data from logs, APIs, or databases arrives in messy shapes. Elasticsearch can't search what it can't parse. The how: use ingest nodes with built-in processors: grok for unstructured logs, date for timestamps, and set/remove for field cleanup. A pipeline runs before indexing, transforming each document. This helps prevent mapping explosions, reduces storage by dropping fields you don't need, and improves query speed by pre-computing derived values. The production trap: without a pipeline, every change in a source's format leaks straight into your mappings, and fixing it after the fact means reindexing. Pipelines can be updated in place (and carry an optional version field), so you can evolve parsing rules without downtime. Key rule: always attach a pipeline to your index template with index.default_pipeline. It's your data's first line of defense.

DefineIngestPipeline.sqlSQL

// io.thecodeforge — database tutorial

PUT _ingest/pipeline/log_cleanup
{
  "description": "Clean raw logs before indexing",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{COMBINEDAPACHELOG}"]
      }
    },
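    {
      "date": {
        "field": "timestamp",
        // COMBINEDAPACHELOG timestamps look like 10/Oct/2000:13:55:36 -0700
        "formats": ["dd/MMM/yyyy:HH:mm:ss Z"]
      }
    },
    {
      "remove": {
        "field": ["extra_headers", "raw_bytes"],
        // don't fail documents that lack these fields
        "ignore_missing": true
      }
    }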
  ]
}

Output

{"acknowledged":true}

Production Trap:

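Long processor chains run on every single document, so they burn CPU and add heap pressure on the ingest node. As a rule of thumb, cap pipelines at around 10 processors and test them with the _simulate API and realistic sample data before deploying.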

Key Takeaway

Ingest pipelines transform data once at write time, not repeatedly at query time.
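
Testing a pipeline before it touches real traffic means calling the simulate endpoint with a few sample documents. Here is a minimal sketch of building that request in Python, assuming the log_cleanup pipeline defined above. The helper name and the sample log line are illustrative, and sending the request is left to whatever HTTP client you already use.

```python
import json


def simulate_request(pipeline_id, sample_messages):
    """Return (path, body) for POST /_ingest/pipeline/<id>/_simulate."""
    path = f"/_ingest/pipeline/{pipeline_id}/_simulate"
    body = {"docs": [{"_source": {"message": msg}} for msg in sample_messages]}
    return path, body


samples = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "curl/8.0"',
]
path, body = simulate_request("log_cleanup", samples)

print("POST", path)
print(json.dumps(body, indent=2))
```

The response lists each document as the pipeline would index it, or the processor error that stopped it, so a broken grok pattern shows up here instead of in production.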

Advanced Querying — Why bool Queries Crush Your Lucene Score

A bool query combines clauses that run in two different contexts. must and should run in query context: they decide how well a document matches and feed the relevance score. filter and must_not run in filter context: a yes/no test that is never scored and can be cached. Put every condition that doesn't need ranking in filter.

AdvancedBoolQuery.sqlSQL

// io.thecodeforge — database tutorial

GET /logs/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "message": "error" } }
      ],
      "filter": [
        { "term": { "status": 500 } },
        { "range": { "@timestamp": { "gte": "now-7d" } } }
      ],
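      "should": [
        { "match": { "severity": "critical" } }
      ],
      // With minimum_should_match: 1 the should clause becomes required;
      // drop this line if "critical" should only boost the score.
      "minimum_should_match": 1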
    }
  }
}

Output

{"hits":{"total":{"value":42}}}

Production Trap:

Wrongly placing date ranges in 'must' instead of 'filter' kills cache — your hot query path becomes a cold start every time.

Key Takeaway

Use bool query's filter clause for any non-ranking condition — it unlocks query caching and faster execution.
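
One way to make the must-versus-filter split hard to get wrong is to build queries through a helper that names the two contexts. A minimal sketch; bool_query and its parameter names are hypothetical, not part of any Elasticsearch client.

```python
def bool_query(scored, filtered, boosts=()):
    """Build a bool query: scored clauses go to must, yes/no clauses to filter."""
    bool_body = {"must": list(scored), "filter": list(filtered)}
    if boosts:
        bool_body["should"] = list(boosts)  # optional clauses that only raise the score
    return {"query": {"bool": bool_body}}


query = bool_query(
    scored=[{"match": {"message": "error"}}],
    filtered=[
        {"term": {"status": 500}},
        {"range": {"@timestamp": {"gte": "now-7d"}}},
    ],
    boosts=[{"match": {"severity": "critical"}}],
)

print(sorted(query["query"]["bool"]))  # ['filter', 'must', 'should']
```

Here the should clause is left as a pure boost, so documents without severity: critical still match; they just rank lower.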

Prerequisites to Learn Elasticsearch

Before building anything useful with Elasticsearch, you need a solid grasp of RESTful APIs (curl or Postman) and JSON structure, because every operation from indexing to search is a JSON-over-HTTP call. You must understand your data’s shape (fields that will be searched, aggregated, or sorted) so you can define a proper mapping. Without that, Elasticsearch will infer types, often disastrously (e.g., treating a ZIP code as a numeric field). You should also know the difference between structured (numeric, date, keyword) and unstructured (full-text) data: only full-text fields go through an analyzer, so the two need different field types. Cluster concepts matter: a single-node setup hides shard/replica behavior that will bite you in production. Finally, you don't need to install Java separately, because current releases bundle their own JDK; you only need a matching JDK if you override it. A common pain point: older releases default to a 1 GB heap (newer ones size it from available memory), and a heap that small leads to OutOfMemoryErrors or circuit-breaker rejections under real indexing load. Tune heap to at most 50% of available RAM and keep it below 32 GB so the JVM keeps using compressed object pointers. Start small: index a few documents, run a match query, then scale to bulk ingestion.

check_health.sqlSQL

// io.thecodeforge — database tutorial
// Verify prerequisites: heap size and node roles
GET _cat/nodes?v&h=name,heap.current,heap.percent,node.role

// Expected output shows heap usage <75%
// and node.role includes 'm' (master-eligible)
// If heap.percent >85, resize heap immediately.
// From a shell, quote the URL so & is not treated as a background operator:
// curl 'localhost:9200/_cat/nodes?v&h=name,heap*'

Output

name heap.current heap.percent node.role

node-a 512mb 68 mdi

node-b 768mb 72 mdi

node-c 256mb 91 di

Production Trap:

Never let heap exceed 32 GB. Above that the JVM stops using compressed OOPs, every object reference doubles from 4 to 8 bytes, and GC pauses spike. On a 64 GB machine, allocate 31 GB to ES and leave the rest for the OS page cache.

Key Takeaway

Master REST, JSON, and data types before touching a cluster — wrong mappings are the #1 cause of silent failures.
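
The ZIP code problem is easy to reproduce without a cluster, and the fix is an explicit mapping. A small sketch: the field names below are hypothetical, and the mapping is the kind of body you would send when creating the index.

```python
import json

# A ZIP code that travels as a number loses its leading zero for good.
raw_zip = "02134"
print(int(raw_zip))  # 2134


def product_index_body():
    """Explicit mapping: identifiers are keyword, prose is text."""
    return {
        "mappings": {
            "properties": {
                "zip_code": {"type": "keyword"},   # exact match, keeps "02134" intact
                "description": {"type": "text"},   # analysed for full-text search
                "price": {"type": "float"},
                "created_at": {"type": "date"},
            }
        }
    }


print(json.dumps(product_index_body(), indent=2))
```

The rule of thumb behind the mapping: if you would never do arithmetic or a range query on a value, it is an identifier and belongs in a keyword field.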

Comparisons and Differences

Elasticsearch is often confused with relational databases (MySQL, PostgreSQL) and other search engines (Solr). The fundamental difference: ES is a distributed, near-real-time document store built on inverted indices, not a set of structured tables. Joins are expensive, so ES prefers denormalized, nested, or parent-child mappings. Unlike traditional DBs, ES has no ACID transactions across documents or shards; it trades that for speed, and newly indexed documents only become searchable after the next refresh. Compare with Solr: Solr has better built-in faceting for e-commerce, but ES wins on ecosystem (Kibana, Beats, Logstash) and on horizontal scalability through sharding and automatic shard rebalancing. Against OpenSearch? It's a fork of Elasticsearch 7.10; the APIs are still close but have diverged since the split, and ES has a larger community and more aggressive feature development. The biggest difference from any SQL database: ES does not support multi-row transactions or foreign keys. If you need strong consistency, pair ES with a source-of-truth DB and treat it as a secondary index. For time-series data (logs, metrics), ES excels over MongoDB because of inverted indices, columnar doc values, and its aggregations framework. Choose ES when you need full-text search, analytics dashboards, or high-ingestion-rate logging, not for banking ledgers.

compare_query.sqlSQL

// io.thecodeforge — database tutorial
// ES match query vs SQL LIKE — fundamentally different
GET products/_search
{
  "query": {
    "match": {
      "description": "running shoes"
    }
  }
}

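// The closest SQL would be:
// SELECT * FROM products WHERE description LIKE '%running shoes%';
// Without a trigram index, LIKE '%...%' is a sequential scan; ES looks terms up in an inverted index.
// match defaults to OR, so a document containing either token matches and is ranked by relevance;
// LIKE only returns rows containing the exact substring.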

Output

{

"took": 3,

"hits": {

"total": {"value": 12, "relation": "eq"},

"max_score": 5.8,

"hits": [

{"_score": 5.8, "_source": {"description": "lightweight running shoes"}},

{"_score": 3.2, "_source": {"description": "shoes for trail running"}}

]

}

}

Production Trap:

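Never treat ES as a transactional database. Updates to a single document are atomic, and you can guard them with optimistic concurrency control (if_seq_no and if_primary_term). Anything spanning several documents is not: _update_by_query can stop partway on a version conflict and leaves the earlier updates in place. ES is a search engine, not a ledger.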

Key Takeaway

ES is built for search and aggregation at scale, not ACID — use it as a secondary index alongside a relational DB for writes.
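
The match-versus-LIKE difference is easier to see with a toy model. The sketch below builds a tiny inverted index in plain Python and compares token lookup against substring search. It is a deliberate simplification (whitespace tokenising, no stemming, no BM25 scoring) meant only to illustrate the idea, not how Lucene is implemented.

```python
from collections import defaultdict


def tokenize(text):
    """Crude stand-in for an analyser: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def match_any(index, query):
    """Doc ids containing at least one query token (match's default OR)."""
    hits = set()
    for token in tokenize(query):
        hits |= index.get(token, set())
    return sorted(hits)


def substring_search(docs, needle):
    """Case-insensitive substring scan, roughly what LIKE '%needle%' does."""
    return [i for i, text in enumerate(docs) if needle.lower() in text.lower()]


docs = [
    "lightweight running shoes",
    "shoes for trail running",
    "leather office shoes",
    "trail map",
]
index = build_inverted_index(docs)

print(match_any(index, "running shoes"))        # [0, 1, 2]
print(substring_search(docs, "running shoes"))  # [0]
```

The index lookup touches only the posting lists for "running" and "shoes", while the substring scan has to read every document. Real Elasticsearch then ranks the three hits, which is why "lightweight running shoes" comes out on top.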

Conclusion

Elasticsearch is a powerful tool, but it demands respect for its distributed nature and memory-sensitive internals. The key takeaway from this series: mapping and analysis define how your data is stored and searched — get them wrong, and you'll chase phantom relevance issues. Cluster architecture, shard distribution, and heap tuning are not optional; they determine whether your cluster survives a traffic spike or implodes at 3 AM. The write path and ingestion pipelines are the primary bottlenecks — always measure throughput with bulk requests and monitor GC logs. Security, often an afterthought, should be configured before the first document is indexed (enable TLS, set password for elastic user). Comparisons with SQL databases or Solr clarify use cases: ES excels at full-text search and log analytics, not relational consistency. Finally, prerequisites like REST fluency and data-type awareness form the foundation; skip them and you'll wrestle with inferred mappings that corrupt search accuracy. Start small, monitor relentlessly, and treat every mapping change as a migration. Elasticsearch rewards understanding — it punishes blind optimism. Use the provided code snippets as daily checklists, and you'll push to production with confidence.

final_checklist.sqlSQL

// io.thecodeforge — database tutorial
// Pre-deployment checklist — must run all
GET _cluster/health?wait_for_status=green&timeout=30s
GET _cat/allocation?v&h=shards,disk.percent,ip
GET _nodes/stats/jvm?filter_path=nodes.*.jvm.gc.collectors.young.collection_time_in_millis

// Expected:
// - status: green (all shards active)
// - disk.percent < 80 on every node
// - young GC time < 10s per hour (the counter is cumulative since node
//   start, so sample it twice and compare the difference)
// If any check fails, stop and fix.

Output

status green

shards 42

disk 65

GC 4200ms

Production Trap:

A green cluster status doesn't mean query performance is fine — always validate shard placement and disk usage after scaling. Uneven shard distribution causes hot spots even when all nodes are healthy.

Key Takeaway

Elasticsearch is a precision instrument — treat mapping, memory, and monitoring as non-negotiable, and it will serve you at scale.
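
The GC figure in that checklist needs one extra step, because collection_time_in_millis is a running total since the node started. Here is a hedged sketch of turning two samples into a per-hour rate and evaluating the three checks. The function names, the sample numbers, and the node names are assumptions for illustration; the thresholds are the ones from the checklist.

```python
def young_gc_seconds_per_hour(first_ms, second_ms, elapsed_seconds):
    """Convert two cumulative collection_time_in_millis samples into a rate."""
    return (second_ms - first_ms) * 3600 / (elapsed_seconds * 1000)


def failed_checks(status, disk_percents, gc_seconds_per_hour):
    """Return a list of human-readable failures; an empty list means go."""
    failures = []
    if status != "green":
        failures.append(f"cluster status is {status}")
    for node, percent in disk_percents.items():
        if percent >= 80:
            failures.append(f"{node} disk at {percent}%")
    if gc_seconds_per_hour >= 10:
        failures.append(f"young GC at {gc_seconds_per_hour:.1f}s per hour")
    return failures


# Two samples taken 10 minutes apart: 4200 ms, then 5400 ms.
rate = young_gc_seconds_per_hour(4200, 5400, elapsed_seconds=600)
print(failed_checks("green", {"es-data-1": 65, "es-data-2": 83}, rate))
# ['es-data-2 disk at 83%']
```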

Production incident · Post-mortem · Severity: high

Mapping Explosion Caused Cluster OOM

Symptom

Elasticsearch cluster health went red. Nodes threw CircuitBreakingException errors. The cluster stopped accepting write requests. The logs showed thousands of new field mappings per minute.

Assumption

The team assumed that dynamic mapping would handle any incoming JSON, and that Elasticsearch would manage memory automatically.

Root cause

Each log event contained unique field names (like user_1234_activity). Dynamic mapping created a new field in the mapping for every unique key. Within hours, the mapping object grew to millions of fields, consuming all heap and triggering the field data circuit breaker.

Fix

Disabled dynamic mapping for high-cardinality fields by setting dynamic: false or dynamic: strict on the index template. Mapped only known fields explicitly. Used a flattened data type or nested key-value pair for unknown fields. Applied a limit on field count via index.mapping.total_fields.limit.

Key lesson

Dynamic mapping is safe for low-cardinality, known schemas. Never use it for user-generated keys or log fields with unbounded cardinality.
Always set field count limits and monitor mapping size in production.
A mapping explosion is silent — you'll see OOM before you see the mapping warning.
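
The fix from this incident can be sketched in two parts: an index template that refuses unknown top-level fields, and a client-side step that folds unbounded keys into a single flattened field. The template shape follows the composable index template API. The field names, the attributes container, and the limit of 200 are assumptions for illustration.

```python
def safe_log_template(pattern="logs-*", field_limit=200):
    """Body for PUT /_index_template/<name> with strict, capped mappings."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {"index.mapping.total_fields.limit": field_limit},
            "mappings": {
                "dynamic": "strict",  # reject documents with unmapped top-level fields
                "properties": {
                    "@timestamp": {"type": "date"},
                    "message": {"type": "text"},
                    "attributes": {"type": "flattened"},  # one mapped field for arbitrary keys
                },
            },
        },
    }


def fold_unknown_fields(event, known=("@timestamp", "message")):
    """Move every key that is not explicitly mapped under 'attributes'."""
    doc = {k: v for k, v in event.items() if k in known}
    extras = {k: v for k, v in event.items() if k not in known}
    if extras:
        doc["attributes"] = extras
    return doc


event = {"@timestamp": "2026-03-17T10:00:00Z", "message": "login", "user_1234_activity": "click"}
print(fold_unknown_fields(event))
```

With this shape, a million distinct user_*_activity keys still cost exactly one entry in the mapping.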

Production debug guide

Symptom to root cause: the signals that matter

Symptom · 01

Query latency jumps from 50ms to 5s for the same search

→

Fix

Check _cat/thread_pool/search?v for queue depth. If queue > 100, nodes are overloaded. Also check _nodes/hot_threads for CPU contention.

Symptom · 02

Cluster health is yellow or red

→

Fix

Run GET _cluster/health?pretty. If unassigned_shards > 0, check GET _cat/shards?h=index,shard,prirep,state,node,unassigned.reason to see why shards are unassigned (e.g., node left, disk full).

Symptom · 03

Partial search results (missing documents that should exist)

→

Fix

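First look at the _shards section of the search response: a non-zero failed count means some shards never answered, so the results are partial. Then run GET _cat/shards?v and confirm every primary and replica is STARTED. Also verify that the refresh_interval isn't set too high (default 1s, increase only if indexing throughput is critical), because documents are not searchable until the next refresh.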

Symptom · 04

CircuitBreakingException in logs

→

Fix

Check GET _nodes/stats/breaker?pretty. Identify which breaker tripped (field data, request, in-flight). Increase the breaker limit temporarily, but the real fix is reducing memory pressure: reduce field data cache size or limit query complexity.

Quick Debug Commands for Elasticsearch

Run these commands immediately when something breaks. No theory, just action.

Cluster health is red or yellow

Immediate action

Identify unassigned shards

Commands

GET _cat/shards?h=index,shard,prirep,state,node,unassigned.reason&v

GET _cluster/allocation/explain?pretty

Fix now

If disk space is low, free space or add a node; raising cluster.routing.allocation.disk.watermark.low only buys time. If allocation gave up after repeated failures, retry it: POST _cluster/reroute?retry_failed=true.


Elasticsearch vs Relational Databases

Feature	Elasticsearch	Relational Database (e.g., PostgreSQL)
Data model	JSON documents	Rows in tables (normalised)
Schema enforcement	Dynamic (can be disabled)	Strict (DDL required)
Query language	RESTful JSON queries (Query DSL)	SQL
Full-text search	Native — tokenisation, scoring, fuzzy	Full-text search via GIN indexes (limited)
ACID transactions	No (document-level only)	Yes (multi-row, multi-table)
Joins	Limited (nested and join field types); prefer denormalisation	Native JOINs
Scaling	Horizontal (sharding)	Vertical (or read replicas, sharding adds complexity)
Aggregations	Built-in, real-time, complex	SQL GROUP BY (batch-friendly)
Use case	Search, analytics, logs	OLTP, reporting, transactions

Key takeaways

Elasticsearch stores JSON documents in indices with configurable mappings; unchecked dynamic mapping is dangerous in production.

Full-text search is Elasticsearch's core strength: analysis, inverted index, BM25 scoring.

match queries do full-text search; term queries do exact matching. Use them on the right field types.

Aggregations run on shards in parallel but can be memory-intensive, so pre-filter data and cap bucket sizes.

Primary shard count is fixed at index creation, so right-size (20-50GB per shard) and use ILM for lifecycle management.
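
The 20-50GB guideline turns into a quick calculation. A minimal sketch, assuming you already have a rough estimate of the index's primary data size; the function name is made up for illustration.

```python
import math


def shard_count_range(index_size_gb, min_shard_gb=20, max_shard_gb=50):
    """Return (fewest, most) primary shards that keep each shard in the target range."""
    fewest = max(1, math.ceil(index_size_gb / max_shard_gb))
    most = max(fewest, math.floor(index_size_gb / min_shard_gb))
    return fewest, most


for size in (30, 200, 1000):
    print(size, shard_count_range(size))
# 30 (1, 1)
# 200 (4, 10)
# 1000 (20, 50)
```

When in doubt, pick a number near the low end: fewer, larger shards mean less cluster state to manage.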

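Symptom

Once a node's disk usage passes the low watermark (default 85%), Elasticsearch stops allocating new shards to it. At the high watermark (90%) it starts moving shards away, and at the flood stage (95%) indices with a shard on that node are marked read-only and writes start failing. Unassigned shards turn the cluster yellow or red.

Fix

Tune cluster.routing.allocation.disk.watermark.low and high through the cluster settings API (they are dynamic) or in elasticsearch.yml. Add monitoring alerts at 70% disk usage.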
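
The alerting rule can be expressed as a small classifier. This sketch assumes the default watermark percentages (85, 90, 95) and the 70% alert line suggested above. The stage names are labels invented here, not values Elasticsearch returns.

```python
def disk_stage(used_percent, alert=70, low=85, high=90, flood=95):
    """Classify a node's disk usage against the allocation watermarks."""
    if used_percent >= flood:
        return "flood_stage: indices go read-only"
    if used_percent >= high:
        return "high: shards are moved off this node"
    if used_percent >= low:
        return "low: no new shards allocated here"
    if used_percent >= alert:
        return "alert: plan capacity now"
    return "ok"


for percent in (65, 72, 86, 91, 96):
    print(percent, disk_stage(percent))
```

Alerting at 70% leaves a 15-point buffer before allocation behaviour changes, which is usually enough time to add a node or delete old indices.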

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · JUNIOR

Explain the difference between a `match` query and a `term` query. When ...

Q02 · SENIOR

What is an inverted index and how does it enable fast full-text search?

Q03 · SENIOR

How do shards work in Elasticsearch? Describe the role of primary and re...

Q04 · SENIOR

What causes a mapping explosion and how do you prevent it?

Q01 of 04 · JUNIOR

Explain the difference between a `match` query and a `term` query. When would you use each?

ANSWER

A match query analyses the search term (tokenises, lowercases, stems) and searches the analysed field. A term query does not analyse — it looks for the exact token as stored in the inverted index. Use match for full-text search on text fields. Use term for exact matches on keyword fields, IDs, enums, or status values. If you use term on a text field, you'll likely get zero results because the text was tokenised during indexing.


Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.


Last updated: June 10, 2026
