Senior 6 min · March 06, 2026

ASP.NET Core Caching — Why ResponseCache Leaks User Data

ResponseCache with Location=Any served one user's prices to others.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • ASP.NET Core offers three caching layers: IMemoryCache (in-process), IDistributedCache (Redis/SQL), and Response Caching (HTTP middleware)
  • IMemoryCache is fastest but isolated per server — use only in single-server deployments
  • IDistributedCache with Redis provides shared state across multiple instances at the cost of a network hop
  • Response Caching caches full HTTP responses before your controller runs — ideal for public, anonymous endpoints
  • Pair sliding expiration with an absolute cap so that frequently read entries cannot stay stale forever
  • Biggest mistake: using IMemoryCache in a load-balanced cluster, leading to inconsistent data across servers
Plain-English First

Imagine a librarian who, instead of walking to the back storeroom every time you ask for the same popular book, keeps a copy right at their desk. The first request is slow — they have to fetch it — but every request after that is instant. Caching in ASP.NET Core is exactly that librarian. Your app 'remembers' expensive results — database queries, API calls, computed values — and hands them back instantly for repeat requests. The trick is knowing when the book at the desk is too old and needs replacing.

Every millisecond your API spends fetching the same database row it fetched three seconds ago is a millisecond wasted — and under load, those milliseconds stack up into seconds that cost you users. Caching is not a micro-optimisation; it is the difference between an app that collapses under real traffic and one that scales gracefully. High-traffic systems like e-commerce product pages, news feeds, and dashboards owe most of their performance not to faster hardware, but to well-designed caches.

The core problem caching solves is the cost of repetition. Database queries, HTTP calls to third-party APIs, and complex in-memory computations all take time proportional to their complexity — not proportional to how often you call them. Without caching, a product page hit 10,000 times a minute fires 10,000 identical SQL queries. With caching, it fires one, and returns the stored result for the other 9,999. The challenge — and the reason most developers get caching wrong — is deciding what to cache, for how long, and when to throw it away.

By the end of this article you will understand the three main caching layers available in ASP.NET Core (In-Memory, Distributed, and Response), know exactly which one to reach for in a given situation, and be able to implement each with production-grade patterns including cache-aside, sliding expiration, and cache invalidation. You will also walk away knowing the mistakes that silently destroy cache effectiveness in real apps.

In-Memory Caching with IMemoryCache — Fast, Simple, Single-Server

In-Memory caching stores data directly in the RAM of your web server process. It is the fastest cache available because there is zero network round-trip — the data lives in the same memory space as your application. ASP.NET Core exposes this through the IMemoryCache interface, which you register once and inject anywhere.

The pattern you will use 99% of the time is called cache-aside (also known as lazy loading): you try to get the value from cache first; if it is not there (a 'cache miss'), you fetch it from the real source, store it in cache, then return it. On the next call, you get a 'cache hit' and skip the expensive work entirely.
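The cache-aside flow described above can also be written with the GetOrCreateAsync extension method on IMemoryCache, which folds the lookup, the fetch, and the store into one call. The following is a minimal sketch, assuming the Microsoft.Extensions.Caching.Memory package; ExchangeRateCache, the key format, and the loader delegate are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical helper: caches one value per currency code using cache-aside.
public sealed class ExchangeRateCache
{
    private readonly IMemoryCache _cache;
    private readonly Func<string, Task<decimal>> _loadRate; // stands in for the slow data source

    public ExchangeRateCache(IMemoryCache cache, Func<string, Task<decimal>> loadRate)
    {
        _cache = cache;
        _loadRate = loadRate;
    }

    public Task<decimal> GetRateAsync(string currency)
    {
        // On a hit the factory is skipped; on a miss it runs and its result is stored.
        return _cache.GetOrCreateAsync($"rate_{currency}", entry =>
        {
            entry.SlidingExpiration = TimeSpan.FromMinutes(5);
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return _loadRate(currency);
        });
    }
}
```

GetOrCreateAsync does not lock around the factory, so two concurrent misses for the same key may both call the loader. That is usually acceptable for idempotent reads, but it is worth knowing before relying on it to protect an expensive query.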

In-memory cache is the right choice when you have a single server deployment or when the cached data is local to one server (like a per-user preferences object). It is the wrong choice when you run multiple server instances behind a load balancer — because each server has its own isolated cache, and a user hitting Server A might get stale data that Server B already updated. That is the scenario where distributed caching becomes essential.

MemoryCache entries support both absolute expiration (evict after exactly N minutes) and sliding expiration (evict if nobody reads it for N minutes). Use sliding expiration for 'warm' data that is accessed frequently; use absolute for data that must stay fresh regardless of traffic.

ProductService.cs (C#)
using Microsoft.Extensions.Caching.Memory;
using System;
using System.Threading.Tasks;

// Register IMemoryCache in Program.cs:
// builder.Services.AddMemoryCache();

public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly IProductRepository _repository;

    // Cache key constants prevent typos across the codebase
    private const string ProductCacheKeyPrefix = "product_";
    private static readonly TimeSpan ProductCacheDuration = TimeSpan.FromMinutes(10);

    public ProductService(IMemoryCache cache, IProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task<Product?> GetProductByIdAsync(int productId)
    {
        // Build a unique key per product so we can invalidate one without flushing all
        string cacheKey = $"{ProductCacheKeyPrefix}{productId}";

        // TryGetValue returns true on a cache HIT — we skip the database entirely
        if (_cache.TryGetValue(cacheKey, out Product? cachedProduct))
        {
            Console.WriteLine($"[CACHE HIT] Returning product {productId} from memory cache.");
            return cachedProduct;
        }

        // Cache MISS — go to the real data source
        Console.WriteLine($"[CACHE MISS] Fetching product {productId} from database.");
        Product? product = await _repository.GetByIdAsync(productId);

        if (product is not null)
        {
            // Configure cache entry options before storing
            var cacheOptions = new MemoryCacheEntryOptions()
                // Evict this entry if it hasn't been accessed in 5 minutes (sliding)
                .SetSlidingExpiration(TimeSpan.FromMinutes(5))
                // But always evict after 10 minutes regardless of access (absolute)
                .SetAbsoluteExpiration(ProductCacheDuration)
                // Normal priority: when the cache is compacted, lower-priority entries are removed first
                .SetPriority(CacheItemPriority.Normal);

            _cache.Set(cacheKey, product, cacheOptions);
            Console.WriteLine($"[CACHE SET] Product {productId} stored in memory cache.");
        }

        return product;
    }

    // Call this when a product is updated so the next read fetches fresh data
    public void InvalidateProductCache(int productId)
    {
        string cacheKey = $"{ProductCacheKeyPrefix}{productId}";
        _cache.Remove(cacheKey);
        Console.WriteLine($"[CACHE INVALIDATED] Removed product {productId} from cache.");
    }
}

// --- Simulated output for two sequential calls to GetProductByIdAsync(42) ---
// First call:
// [CACHE MISS] Fetching product 42 from database.
// [CACHE SET]  Product 42 stored in memory cache.
//
// Second call (within 5 minutes):
// [CACHE HIT]  Returning product 42 from memory cache.
Output
[CACHE MISS] Fetching product 42 from database.
[CACHE SET] Product 42 stored in memory cache.
[CACHE HIT] Returning product 42 from memory cache.
Watch Out: Sliding + Absolute Expiration Together
Always pair sliding expiration with an absolute expiration cap. Without the absolute cap, a cache entry that gets read every 4 minutes with a 5-minute sliding window will NEVER expire — even if the underlying data changed hours ago. The absolute ceiling guarantees freshness no matter how popular the entry is.
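To make the arithmetic behind this warning concrete, here is a small simulation in whole minutes. It is a simplified model under stated assumptions, not how MemoryCache is implemented: ExpirationSimulator and its parameters are hypothetical, reads happen at fixed intervals, and an entry that has been idle for the full sliding window counts as expired.

```csharp
using System;

public static class ExpirationSimulator
{
    // Returns the minute at which the entry expires, or null if it survives the whole horizon.
    public static int? MinuteOfExpiry(
        int readEveryMinutes, int slidingMinutes, int? absoluteMinutes, int horizonMinutes)
    {
        int lastAccess = 0; // the entry is created (and first accessed) at minute 0

        for (int minute = 1; minute <= horizonMinutes; minute++)
        {
            // The absolute cap is measured from creation and ignores reads.
            if (absoluteMinutes is int cap && minute >= cap) return minute;

            // The sliding window is measured from the most recent read.
            if (minute - lastAccess >= slidingMinutes) return minute;

            // A read resets the sliding window.
            if (minute % readEveryMinutes == 0) lastAccess = minute;
        }

        return null;
    }
}
```

With a read every 4 minutes and a 5-minute sliding window, MinuteOfExpiry(4, 5, null, 600) returns null: the entry outlives a ten-hour horizon. Adding the 10-minute cap, MinuteOfExpiry(4, 5, 10, 600) returns 10.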
Production Insight
In production, IMemoryCache eviction is silent: entries expire, or are compacted once a configured SizeLimit is exceeded, without any warning. The cache does not trim itself when system memory runs low, so bound its growth with expirations and a SizeLimit.
When many keys expire simultaneously, the burst of misses causes sudden latency spikes, known as the 'thundering herd'.
Rule: always use cache-aside so the first miss repopulates, and add jitter to absolute expirations to avoid batch expiration.
Key Takeaway
IMemoryCache is fastest but server-isolated.
Pair sliding + absolute expiration always.
Thundering herd risk: use jitter and cache-aside.
When to Choose IMemoryCache
If: Single server deployment (no load balancer)
Then: Use IMemoryCache. It's the fastest option and you avoid serialisation overhead.
If: Data is per-user or local to a server instance
Then: Use IMemoryCache. User session data doesn't need to be shared across servers.
If: Multi-server (load-balanced) deployment
Then: Do NOT use IMemoryCache alone. Switch to IDistributedCache or use two-level caching.
If: Cache pressure causes high memory usage
Then: Set reasonable priority and expiration. Use GetCurrentStatistics() to monitor cache size and evictions (requires TrackStatistics = true on MemoryCacheOptions, .NET 7 or later).
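The monitoring advice in the last row can be sketched as follows. This assumes .NET 7 or later and the Microsoft.Extensions.Caching.Memory package; CacheStatsDemo and the sample keys are hypothetical. Statistics are opt-in, so the sketch enables TrackStatistics explicitly.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public static class CacheStatsDemo
{
    public static MemoryCacheStatistics? Run()
    {
        // TrackStatistics is off by default; without it GetCurrentStatistics() returns null.
        using var cache = new MemoryCache(new MemoryCacheOptions { TrackStatistics = true });

        cache.Set("product_42", "Trail Shoes", TimeSpan.FromMinutes(10));
        cache.TryGetValue("product_42", out _); // counted as a hit
        cache.TryGetValue("product_99", out _); // counted as a miss

        return cache.GetCurrentStatistics();
    }

    // Hit ratio in the range 0..1; returns 0 when nothing has been read yet.
    public static double HitRatio(MemoryCacheStatistics stats)
    {
        long total = stats.TotalHits + stats.TotalMisses;
        return total == 0 ? 0 : (double)stats.TotalHits / total;
    }
}
```

In an ASP.NET Core app the same option can be set through builder.Services.AddMemoryCache(options => options.TrackStatistics = true), and the ratio can then be exported to whatever metrics system you already use.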

Distributed Caching with IDistributedCache — Sharing State Across Multiple Servers

When you scale your app horizontally — multiple instances behind a load balancer — in-memory cache breaks down because each instance has its own isolated memory. User A might update a record on Server 1, but User B hits Server 2 which still has the old cached version. This is a consistency bug, not just a performance issue.

Distributed caching solves this by putting the cache outside the application in a shared store — typically Redis or SQL Server. All instances read from and write to the same cache, so everyone sees the same data. ASP.NET Core abstracts this behind IDistributedCache, meaning you can swap Redis for SQL Server (or vice versa) by changing one line in Program.cs without touching your service code.
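Because service code depends only on the IDistributedCache abstraction, the backing store can be replaced without touching it. The sketch below illustrates this with MemoryDistributedCache, the in-process implementation that ships with Microsoft.Extensions.Caching.Memory, standing in for Redis. GreetingCache and its key format are hypothetical names for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Options;

// Hypothetical service: it only knows about the IDistributedCache interface.
public sealed class GreetingCache
{
    private readonly IDistributedCache _cache;

    public GreetingCache(IDistributedCache cache) => _cache = cache;

    public async Task<string> GetGreetingAsync(string name)
    {
        string key = $"greeting_{name}";

        string? cached = await _cache.GetStringAsync(key);
        if (cached is not null) return cached; // hit

        string greeting = $"Hello, {name}";    // stands in for expensive work
        await _cache.SetStringAsync(key, greeting, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15)
        });
        return greeting;
    }
}

public static class GreetingCacheDemo
{
    public static async Task<string> RunAsync()
    {
        // In-process stand-in for Redis or SQL Server; handy for tests and local development.
        IDistributedCache cache = new MemoryDistributedCache(
            Options.Create(new MemoryDistributedCacheOptions()));

        var service = new GreetingCache(cache);
        await service.GetGreetingAsync("Ada");        // miss: computes and stores
        return await service.GetGreetingAsync("Ada"); // hit: read back from the cache
    }
}
```

In Program.cs the equivalent swap is the registration call: builder.Services.AddDistributedMemoryCache() for local development, and the Redis or SQL Server registration in shared environments.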

Redis is the industry standard choice. It is an in-memory data store purpose-built for speed, supporting complex data types, pub/sub for cache invalidation, and cluster mode for high availability. SQL Server distributed cache exists for environments where you already have SQL infrastructure and cannot add Redis — but it is meaningfully slower.

The IDistributedCache API works with byte arrays, so you need to serialise your objects. The standard approach is JSON serialisation with System.Text.Json. A cleaner pattern is to wrap IDistributedCache in your own generic helper that handles serialisation transparently — which is exactly what the example below does.

DistributedCacheService.cs (C#)
// Program.cs registration (Redis example):
// builder.Services.AddStackExchangeRedisCache(options =>
// {
//     options.Configuration = builder.Configuration.GetConnectionString("Redis");
//     options.InstanceName = "MyApp:"; // Prefix all keys to avoid collisions
// });
//
// For SQL Server instead, use:
// builder.Services.AddDistributedSqlServerCache(options => { ... });

using System;
using Microsoft.Extensions.Caching.Distributed;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// A generic wrapper that hides the byte-array pain of IDistributedCache
public class DistributedCacheService
{
    private readonly IDistributedCache _distributedCache;

    public DistributedCacheService(IDistributedCache distributedCache)
    {
        _distributedCache = distributedCache;
    }

    // Returns cached value or null on a miss — caller decides what to do
    public async Task<T?> GetAsync<T>(string cacheKey, CancellationToken cancellationToken = default)
        where T : class
    {
        byte[]? cachedBytes = await _distributedCache.GetAsync(cacheKey, cancellationToken);

        if (cachedBytes is null || cachedBytes.Length == 0)
        {
            return null; // Cache miss
        }

        // Deserialise from JSON bytes back to the strongly-typed object
        return JsonSerializer.Deserialize<T>(cachedBytes);
    }

    public async Task SetAsync<T>(
        string cacheKey,
        T value,
        TimeSpan absoluteExpiration,
        CancellationToken cancellationToken = default)
        where T : class
    {
        byte[] serialisedBytes = JsonSerializer.SerializeToUtf8Bytes(value);

        var cacheEntryOptions = new DistributedCacheEntryOptions
        {
            // Absolute expiration from now — always evict after this window
            AbsoluteExpirationRelativeToNow = absoluteExpiration
        };

        await _distributedCache.SetAsync(cacheKey, serialisedBytes, cacheEntryOptions, cancellationToken);
    }

    public async Task RemoveAsync(string cacheKey, CancellationToken cancellationToken = default)
    {
        await _distributedCache.RemoveAsync(cacheKey, cancellationToken);
    }
}

// --- Usage in a controller or service ---
public class OrderSummaryService
{
    private readonly DistributedCacheService _cache;
    private readonly IOrderRepository _orderRepository;
    private static readonly TimeSpan OrderSummaryCacheDuration = TimeSpan.FromMinutes(15);

    public OrderSummaryService(DistributedCacheService cache, IOrderRepository orderRepository)
    {
        _cache = cache;
        _orderRepository = orderRepository;
    }

    public async Task<OrderSummary?> GetOrderSummaryAsync(int customerId, CancellationToken cancellationToken)
    {
        string cacheKey = $"order_summary_{customerId}";

        // Step 1: Try the distributed cache first
        OrderSummary? cached = await _cache.GetAsync<OrderSummary>(cacheKey, cancellationToken);
        if (cached is not null)
        {
            Console.WriteLine($"[REDIS HIT] Order summary for customer {customerId} served from Redis.");
            return cached;
        }

        // Step 2: Cache miss — hit the database
        Console.WriteLine($"[REDIS MISS] Querying database for customer {customerId} order summary.");
        OrderSummary? summary = await _orderRepository.GetSummaryByCustomerIdAsync(customerId, cancellationToken);

        // Step 3: Store in Redis for the next caller — all server instances benefit
        if (summary is not null)
        {
            await _cache.SetAsync(cacheKey, summary, OrderSummaryCacheDuration, cancellationToken);
            Console.WriteLine($"[REDIS SET] Customer {customerId} summary cached for 15 minutes.");
        }

        return summary;
    }
}
Output
[REDIS MISS] Querying database for customer 7 order summary.
[REDIS SET] Customer 7 summary cached for 15 minutes.
[REDIS HIT] Order summary for customer 7 served from Redis.
Pro Tip: Always Set an InstanceName Prefix in Redis
When you configure AddStackExchangeRedisCache, always set InstanceName (e.g., 'MyApp:'). Without it, all your keys are global — if you ever run two different apps against the same Redis server, or run staging and production on the same instance, their cache keys will collide silently and corrupt each other's data.
Production Insight
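Serialisation mismatch is the #1 bug with IDistributedCache: write with one serialiser or set of options and read with another, and you get nulls, default values, or a deserialisation exception.
StackExchange.Redis multiplexes commands over a shared connection rather than drawing from a fixed-size connection pool, so timeouts under high concurrency usually point to thread-pool starvation, oversized payloads, or a saturated Redis server.
Rule: wrap IDistributedCache with a generic service that enforces consistent JSON serialisation, and register a single shared IConnectionMultiplexer.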
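One way to enforce the serialisation half of that rule is to route every cache read and write through a single helper that owns the JsonSerializerOptions. This is a minimal sketch using only System.Text.Json; CacheSerializer, CachedPrice, and the chosen options are illustrative assumptions, not settings the article prescribes.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public static class CacheSerializer
{
    // One options instance shared by every reader and writer of cached payloads.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
    };

    public static byte[] ToBytes<T>(T value) =>
        JsonSerializer.SerializeToUtf8Bytes(value, Options);

    public static T? FromBytes<T>(byte[] bytes) =>
        JsonSerializer.Deserialize<T>(bytes, Options);
}

// Hypothetical payload type used to demonstrate a round trip.
public sealed record CachedPrice(int ProductId, decimal Amount, string Currency);
```

A payload written with CacheSerializer.ToBytes and read back with CacheSerializer.FromBytes round-trips unchanged. Reading the same bytes with differently configured options is where silent data loss tends to creep in, which is why the options live in one place.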
Key Takeaway
IDistributedCache unifies cache across servers but adds serialisation cost.
Use Redis for production — SQL Server cache is a fallback only.
Serialiser consistency: use the same package everywhere.
Distributed Cache Provider Selection
If: Multiple server instances require consistent cached data
Then: Use IDistributedCache. It forces shared state outside each server's memory.
If: Redis infrastructure is available and cost is acceptable
Then: Use Redis (StackExchange.Redis). It has the best performance among distributed stores.
If: No Redis, but you already have SQL Server running
Then: Use SQL Server distributed cache (AddDistributedSqlServerCache). It is slower but avoids new infrastructure.
If: Need cache invalidation across servers in real-time
Then: Use Redis with pub/sub to broadcast eviction messages to all server instances.

Response Caching — Cache at the HTTP Layer Before Your Code Even Runs

In-Memory and Distributed caching are application-level caches — your C# code still runs and decides whether to call the database. Response Caching works at a completely different layer: it caches the entire HTTP response and serves it directly from middleware, before your controller action ever executes. This is the fastest possible cache because zero application code runs on a cache hit.

Response caching follows HTTP caching semantics via Cache-Control headers. When your action returns a response with Cache-Control: public, max-age=60, both the ASP.NET Core response cache middleware (server-side) and downstream proxies or CDNs (like Cloudflare or Azure CDN) know they can cache and replay that response for 60 seconds.
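To illustrate how a shared cache reads those directives, the sketch below parses a Cache-Control value with the BCL's CacheControlHeaderValue and decides whether a shared cache may store the response. It is a deliberately strict illustration of the semantics, not the middleware's actual decision logic, which follows more rules than shown here. CacheControlInspector is a hypothetical helper.

```csharp
using System;
using System.Net.Http.Headers;

public static class CacheControlInspector
{
    // May a shared cache (CDN, proxy, server-side cache) store this response, and for how long?
    public static bool IsSharedCacheable(string cacheControlHeader, out TimeSpan lifetime)
    {
        lifetime = TimeSpan.Zero;

        if (!CacheControlHeaderValue.TryParse(cacheControlHeader, out CacheControlHeaderValue? parsed)
            || parsed is null)
        {
            return false;
        }

        // no-store and private both forbid storage in a shared cache.
        // This sketch additionally insists on an explicit 'public'.
        if (parsed.NoStore || parsed.Private || !parsed.Public)
        {
            return false;
        }

        // s-maxage, when present, takes precedence over max-age for shared caches.
        TimeSpan? age = parsed.SharedMaxAge ?? parsed.MaxAge;
        if (age is null || age.Value <= TimeSpan.Zero)
        {
            return false;
        }

        lifetime = age.Value;
        return true;
    }
}
```

Under this model "public,max-age=60" is storable for 60 seconds, while "no-store,no-cache" and "private,max-age=60" are not, which is the distinction that matters when deciding whether an endpoint is safe to cache for everyone.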

This makes it ideal for public, anonymous content: marketing pages, product catalogues, news articles — content that is the same for every user. It is completely wrong for authenticated or personalised content because cached responses ignore who the user is. A response cached for User A would be served to User B.

The [ResponseCache] attribute controls the Cache-Control header. The Response Caching middleware does the actual server-side caching: AddResponseCaching() registers its services and app.UseResponseCaching() adds it to the pipeline. Both the attribute and the middleware are needed for server-side caching to work; the attribute alone just sets the header for downstream caches like CDNs.

CatalogueController.cs (C#)
// Program.cs — register and use the middleware (ORDER MATTERS):
// builder.Services.AddResponseCaching();
// ...
// app.UseResponseCaching(); // Must come before app.MapControllers()

using System;
using Microsoft.AspNetCore.Mvc;
using System.Collections.Generic;
using System.Threading.Tasks;

[ApiController]
[Route("api/[controller]")]
public class CatalogueController : ControllerBase
{
    private readonly ICatalogueService _catalogueService;

    public CatalogueController(ICatalogueService catalogueService)
    {
        _catalogueService = catalogueService;
    }

    // This response is cached for 60 seconds server-side AND tells CDNs they can cache it too.
    // 'public' means any cache (server, CDN, proxy) can store this response.
    // 'VaryByQueryKeys' means separate cache entries are kept per 'category' query param
    // so /api/catalogue?category=shoes and ?category=hats each get their own cached version.
    [HttpGet]
    [ResponseCache(Duration = 60, Location = ResponseCacheLocation.Any, VaryByQueryKeys = new[] { "category" })]
    public async Task<IActionResult> GetProductsAsync([FromQuery] string category = "all")
    {
        Console.WriteLine($"[CONTROLLER HIT] Fetching products for category: {category}");
        // This line only runs on a cache MISS — during a 60-second window it runs once per category
        IEnumerable<ProductSummary> products = await _catalogueService.GetByCategoryAsync(category);
        return Ok(products);
    }

    // [ResponseCache(NoStore = true)] explicitly opts this endpoint OUT of caching.
    // Use this for any endpoint that returns user-specific or sensitive data.
    [HttpGet("my-orders")]
    [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)]
    public async Task<IActionResult> GetMyOrdersAsync()
    {
        // Each call always hits the controller — never cached
        var orders = await _catalogueService.GetOrdersForCurrentUserAsync();
        return Ok(orders);
    }
}

// HTTP Response headers produced by the first endpoint:
// Cache-Control: public,max-age=60
// Vary: Accept-Encoding
//
// Subsequent requests within 60 seconds return:
// [Served from response cache — controller action does NOT execute]
//
// HTTP Response headers produced by the second endpoint:
// Cache-Control: no-store,no-cache
Output
GET /api/catalogue?category=shoes
[CONTROLLER HIT] Fetching products for category: shoes
-> Response cached for 60 seconds
GET /api/catalogue?category=shoes (within 60 seconds)
-> Served from response cache. Controller NOT called.
GET /api/catalogue?category=hats
[CONTROLLER HIT] Fetching products for category: hats
-> Separate cache entry created for 'hats'
Watch Out: VaryByQueryKeys Requires the Middleware
VaryByQueryKeys on [ResponseCache] depends on the Response Caching middleware. If app.UseResponseCaching() is not in the pipeline, the attribute cannot find the middleware's caching feature and the request fails with a runtime exception (InvalidOperationException) instead of falling back to header-only caching. Register the middleware whenever you use VaryByQueryKeys; for CDN-only scenarios, leave VaryByQueryKeys off the attribute.
Production Insight
The data leak scenario: [ResponseCache] with Location = Any on a personalised endpoint marks the response as public, so a shared cache can serve user A's data to user B. This is a security incident.
Middleware order is critical — if UseResponseCaching() is after MapControllers(), caching never happens.
Rule: for any endpoint that checks HttpContext.User, add [ResponseCache(NoStore = true)] as a safety net.
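One way to make that rule harder to forget is to define named cache profiles once and reference them from the attribute, so the no-store settings are not retyped on every action. The following Program.cs sketch assumes an MVC controller app; the profile names "PublicShort" and "NeverCache" and the AccountController are hypothetical.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddResponseCaching();
builder.Services.AddControllers(options =>
{
    // Hypothetical profile for public, anonymous content.
    options.CacheProfiles.Add("PublicShort", new CacheProfile
    {
        Duration = 60,
        Location = ResponseCacheLocation.Any
    });

    // Hypothetical profile for anything user-specific.
    options.CacheProfiles.Add("NeverCache", new CacheProfile
    {
        NoStore = true,
        Location = ResponseCacheLocation.None
    });
});

var app = builder.Build();
app.UseResponseCaching();
app.MapControllers();
app.Run();

[ApiController]
[Route("api/account")]
public class AccountController : ControllerBase
{
    // User-specific response: opt out of every cache via the shared profile.
    [HttpGet("prices")]
    [ResponseCache(CacheProfileName = "NeverCache")]
    public IActionResult GetMyPrices() => Ok(new { user = User.Identity?.Name });
}
```

Applying [ResponseCache(CacheProfileName = "NeverCache")] at controller level covers every action in that controller, which suits controllers that only ever return personalised data.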
Key Takeaway
Response caching is the fastest layer — zero code runs on a hit.
Never use it for authenticated endpoints: use [ResponseCache(NoStore = true)].
Always verify middleware order and VaryByQueryKeys registration.
When Response Caching is Appropriate
If: Public, anonymous content (product catalogue, blog posts, marketing pages)
Then: Use [ResponseCache(Duration=..., Location=Any)] for the fastest possible response.
If: Authenticated or user-specific content
Then: Do NOT use response caching. Use application-level caching with user-aware keys instead.
If: Content varies by query string or header
Then: Use VaryByQueryKeys or VaryByHeader in [ResponseCache]. VaryByQueryKeys throws at runtime unless the Response Caching middleware is registered.
If: Need CDN cache control only (no server-side caching)
Then: You can omit app.UseResponseCaching() and rely on the [ResponseCache] attribute (without VaryByQueryKeys) to set proper Cache-Control headers for downstream caches.

Cache Invalidation Strategies — Knowing When to Throw Data Away

Writing to cache is easy. Knowing when to evict is where production systems break. The three main strategies are: time-based expiry, explicit key removal, and pattern-based invalidation.

Time-based expiry is the simplest. You set a TTL (absolute or sliding) and let the cache purge entries automatically. The danger? If all entries for a popular endpoint expire at the same moment, every request triggers a database call — the 'thundering herd' problem. Add random jitter to expiration times (e.g., base ± 10%) to spread the load.
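The jitter idea can be sketched with a few lines of BCL code. ExpirationJitter is a hypothetical helper, the 10% default mirrors the example figure above, and Random.Shared assumes .NET 6 or later.

```csharp
using System;

public static class ExpirationJitter
{
    // Scales baseDuration by a random factor in [1 - fraction, 1 + fraction).
    public static TimeSpan Apply(TimeSpan baseDuration, double fraction = 0.10, Random? random = null)
    {
        if (fraction < 0 || fraction >= 1)
        {
            throw new ArgumentOutOfRangeException(nameof(fraction), "Expected a value in [0, 1).");
        }

        random ??= Random.Shared;

        // NextDouble() is in [0, 1), so offset is in [-fraction, +fraction).
        double offset = (random.NextDouble() * 2 - 1) * fraction;
        return TimeSpan.FromTicks((long)(baseDuration.Ticks * (1 + offset)));
    }
}
```

A typical call site would be SetAbsoluteExpiration(ExpirationJitter.Apply(TimeSpan.FromMinutes(10))), which spreads expirations across roughly the 9 to 11 minute range instead of stacking them on the same instant.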

Explicit key removal means calling Remove or RemoveAsync when the underlying data changes. It works well for individual records, but breaks down when one data change invalidates many cache keys (e.g., updating a category name that appears in 1000 product entries). For these cases, keep a separate index of the affected keys (such as a Redis SET per tag) and evict every key in that index when the shared data changes.

Pattern-based invalidation using Redis sets: maintain a SET of cache keys for each 'tag' (e.g., 'category:shoes'). When the shoes category is updated, retrieve all keys from the set and delete them. This gives you bulk invalidation without a full cache flush.

For distributed systems across multiple servers, use Redis Pub/Sub: when one server invalidates a key, publish a message. All other servers subscribe and evict their local (L1) copy of that key, ensuring consistency.

CacheInvalidationService.cs (C#)
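using System;
using System.Linq; // needed for memberKeys.Select(...)
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;
using StackExchange.Redis;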

public class CacheInvalidationService
{
    private readonly IDistributedCache _cache;
    private readonly IConnectionMultiplexer _redis;

    public CacheInvalidationService(IDistributedCache cache, IConnectionMultiplexer redis)
    {
        _cache = cache;
        _redis = redis;
    }

    // Invalidate a single key
    public async Task InvalidateKeyAsync(string cacheKey)
    {
        await _cache.RemoveAsync(cacheKey);
        // Notify other servers to evict their local L1 cache
        var subscriber = _redis.GetSubscriber();
        await subscriber.PublishAsync("cache:invalidation", cacheKey);
    }

    // Invalidate all keys belonging to a tag (requires a Redis Set of keys per tag)
    public async Task InvalidateTagAsync(string tag)
    {
        var db = _redis.GetDatabase();
        // Get all keys that belong to this tag (e.g., "tag:category:shoes")
        string tagKey = $"tag:{tag}";
        RedisValue[] memberKeys = await db.SetMembersAsync(tagKey);

        // Remove each key from the distributed cache
        var tasks = memberKeys.Select(k => _cache.RemoveAsync(k.ToString()));
        await Task.WhenAll(tasks);

        // Publish for L1 eviction on other servers
        var subscriber = _redis.GetSubscriber();
        await subscriber.PublishAsync("cache:invalidation:tag", tag);

        // Remove the tag set itself
        await db.KeyDeleteAsync(tagKey);
    }

    // Associate a cache key with a tag during Set
    public async Task SetWithTagAsync<T>(string cacheKey, T value, TimeSpan expiration, string tag)
    {
        // Store in distributed cache as usual
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(value);
        await _cache.SetAsync(cacheKey, bytes, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = expiration
        });

        // Add the key to the tag's Redis Set
        var db = _redis.GetDatabase();
        await db.SetAddAsync($"tag:{tag}", cacheKey);
    }
}

// Usage:
// await invalidationService.SetWithTagAsync("product:42", product, TimeSpan.FromMinutes(10), "category:shoes");
// ... later, when the shoes category changes:
// await invalidationService.InvalidateTagAsync("category:shoes"); // invalidates all related keys
Output
// After category update:
// TAG: cache:invalidation:tag -> "category:shoes"
// All product keys with that tag are evicted from the distributed cache and all server L1 caches.
Think in Tags, Not Keys
  • A product price update should invalidate only that product's keys — not the entire product catalogue.
  • A category name change might affect hundreds of product keys — use tags to link them.
  • A promotion activation might invalidate many unrelated categories — use pub/sub to broadcast a tag-based eviction.
  • Avoid global flush: it causes a cold cache and instant database meltdown.
Production Insight
The most common invalidation bug: the read path generates cache keys differently from the write path.
Example: read uses $"product_{id}" but write uses $"product:{id}" — the key never matches, so invalidation silently fails.
Rule: generate all cache keys from a single shared static method to guarantee they match across the codebase.
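A minimal sketch of that rule: one static class that builds every key, so the read path and the write path cannot drift apart. CacheKeys is a hypothetical name; the key formats follow the ones used in this article's examples, and the normalisation of category names is an added assumption.

```csharp
using System;
using System.Globalization;

public static class CacheKeys
{
    // Same format as the ProductService example: "product_42".
    public static string Product(int productId) =>
        "product_" + productId.ToString(CultureInfo.InvariantCulture);

    // Same format as the OrderSummaryService example: "order_summary_7".
    public static string OrderSummary(int customerId) =>
        "order_summary_" + customerId.ToString(CultureInfo.InvariantCulture);

    // Tag used for bulk invalidation, e.g. "category:shoes".
    // Trimming and lower-casing is an assumption so that "Shoes" and "shoes" share one tag.
    public static string CategoryTag(string category)
    {
        if (string.IsNullOrWhiteSpace(category))
        {
            throw new ArgumentException("Category must not be empty.", nameof(category));
        }

        return "category:" + category.Trim().ToLowerInvariant();
    }
}
```

Both the code that populates the cache and the code that invalidates it then call CacheKeys.Product(id), so a separator change happens in exactly one place.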
Key Takeaway
Invalidate precisely, not wholesale — cache stampede kills databases.
Use tag sets (an index of keys per tag) for bulk invalidation.
For distributed systems, prefer pub/sub invalidation over polling.
Choosing an Invalidation Strategy
If: Data changes infrequently and predictably (e.g., product prices updated hourly)
Then: Time-based expiry with an absolute timeout is sufficient. No explicit invalidation needed.
If: Individual records updated independently (e.g., user profile)
Then: Explicit key removal. Call Remove() right after the database write.
If: One change invalidates many cached entries (e.g., category name rename)
Then: Use tag-based invalidation: store keys per tag, then bulk evict.
If: Multiple server instances need coordinated eviction
Then: Add Redis pub/sub to broadcast invalidation messages to all servers' local caches.

Two-Level Caching (L1/L2) — Combining IMemoryCache and IDistributedCache for Speed and Consistency

A two-level cache sits IMemoryCache (L1) in front of IDistributedCache (L2). On a read request, you check the local in-memory cache first (fastest, zero network). On a miss, you check the distributed cache (Redis). On a distributed cache hit, you populate L1 so subsequent requests on that server are instant. On a distributed cache miss, you fetch from the database and store in both L2 and L1.

This pattern dramatically reduces Redis round-trips for hot data. In read-heavy workloads, two-level caching can cut p95 latency by 80-90% compared to using the distributed cache alone, though the gain depends on how much of the traffic hits hot keys. The cost is added complexity in invalidation: when a write occurs, you must evict the key from L2 and also from L1 on all servers. Use Redis pub/sub to broadcast L1 eviction commands.

Implementation steps: wrap both caches in a single service that follows the order: L1 → L2 → DB. L1 lives inside each server process and stores objects without serialising them, so it needs no per-server key namespace; use the same cache key for L1 and L2 so that a single invalidation message identifies both copies. Monitor hit ratios at both levels: a low L1 hit rate means your memory cache is too small or your data isn't local enough.

One trap: if you store objects in L1 that are also in L2, ensure you're not holding references that prevent garbage collection. Use weak references or smaller L1 sizes for objects that change frequently.

TwoLevelCacheService.cs (C#)
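using System;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;
using StackExchange.Redis; // IConnectionMultiplexer, used by InvalidateAsync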

public class TwoLevelCacheService
{
    private readonly IMemoryCache _localCache;    // L1
    private readonly IDistributedCache _distCache; // L2

    public TwoLevelCacheService(IMemoryCache localCache, IDistributedCache distCache)
    {
        _localCache = localCache;
        _distCache = distCache;
    }

    public async Task<T?> GetAsync<T>(string cacheKey, Func<Task<T?>> fetchFromDb, CancellationToken ct = default)
        where T : class
    {
        // Try L1 (local memory) first
        if (_localCache.TryGetValue(cacheKey, out T? localResult) && localResult is not null)
        {
            return localResult;
        }

        // L1 miss — try L2 (distributed, e.g., Redis)
        byte[]? distBytes = await _distCache.GetAsync(cacheKey, ct);
        if (distBytes is not null && distBytes.Length > 0)
        {
            T? deserialized = JsonSerializer.Deserialize<T>(distBytes);
            if (deserialized is not null)
            {
                // Populate L1 for future near-instant access
                _localCache.Set(cacheKey, deserialized, TimeSpan.FromMinutes(5)); // this overload sets an absolute 5-minute TTL for L1, not a sliding one
                return deserialized;
            }
        }

        // L2 miss — fetch from database
        T? result = await fetchFromDb();
        if (result is null) return null;

        // Store in both caches
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(result);
        var distOptions = new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
        };
        await _distCache.SetAsync(cacheKey, bytes, distOptions, ct);

        _localCache.Set(cacheKey, result, TimeSpan.FromMinutes(5));

        return result;
    }

    // Invalidate from both caches and broadcast to other servers
    public async Task InvalidateAsync(string cacheKey, IConnectionMultiplexer redis, CancellationToken ct = default)
    {
        _localCache.Remove(cacheKey);
        await _distCache.RemoveAsync(cacheKey, ct);

        var subscriber = redis.GetSubscriber();
        await subscriber.PublishAsync("cache:l1:evict", cacheKey);
    }
}

// --- Usage ---
// var product = await twoLevelCache.GetAsync("product:42",
//     () => _repo.GetByIdAsync(42), cancellationToken);
Output
// Sequence:
// [L1 MISS] -> [L2 HIT] -> Populate L1 -> Return
// Next request (same server): [L1 HIT] -> Return (no Redis call)
Performance Numbers
In production at a major e-commerce site, switching from pure Redis caching to a two-level (L1: MemoryCache, L2: Redis) pattern reduced average read latency from 8ms to 0.5ms for hot products (p99 from 45ms to 12ms). The Redis server load dropped by 70% because 85% of reads were served from local memory.
Production Insight
L1 eviction is silent: when a configured SizeLimit is exceeded, MemoryCache compacts entries, and the only visible symptom is a Redis load spike with no code change.
Monitor L1 hit ratio via GetCurrentStatistics() (.NET 7+, and only when MemoryCacheOptions.TrackStatistics is true). If it drops below 50%, increase L1 size or adjust entry priorities.
The invalidation broadcast is critical: without it, other servers keep serving stale L1 data until their local TTL expires, which is up to 5 minutes of inconsistency with the settings above.
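The service above publishes evictions but does not show the receiving side. A minimal sketch of a listener follows, assuming StackExchange.Redis and the generic host; the class name `L1EvictionListener` is hypothetical, and the channel name is taken from `InvalidateAsync`.

```csharp
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Hosting;
using StackExchange.Redis;

// Hypothetical hosted service: removes a key from this server's L1 cache
// whenever any server publishes it on the eviction channel.
public sealed class L1EvictionListener : BackgroundService
{
    // Same channel name that InvalidateAsync publishes to.
    private static readonly RedisChannel EvictChannel =
        new("cache:l1:evict", RedisChannel.PatternMode.Literal);

    private readonly IMemoryCache _localCache;
    private readonly IConnectionMultiplexer _redis;

    public L1EvictionListener(IMemoryCache localCache, IConnectionMultiplexer redis)
    {
        _localCache = localCache;
        _redis = redis;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        ISubscriber subscriber = _redis.GetSubscriber();

        await subscriber.SubscribeAsync(EvictChannel, (_, message) =>
        {
            string? key = message; // implicit RedisValue -> string conversion
            if (!string.IsNullOrEmpty(key))
            {
                _localCache.Remove(key);
            }
        });

        try
        {
            // Keep the subscription alive until the host shuts down.
            await Task.Delay(Timeout.Infinite, stoppingToken);
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path.
        }

        await subscriber.UnsubscribeAsync(EvictChannel);
    }
}
```

Register it with `builder.Services.AddHostedService<L1EvictionListener>();`. Redis pub/sub is fire-and-forget, so a server that is disconnected at publish time misses the message; the L1 TTL remains the backstop.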
Key Takeaway
Two-level cache cuts latency by 80-90% for hot data over pure distributed cache.
Requires invalidation coordination across servers via pub/sub.
Always monitor hit ratios at both levels — silent evictions hurt.
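As a rough illustration of the L1 side of that monitoring, the sketch below reads the built-in counters. It assumes .NET 7 or later and a standalone `MemoryCache` created with `TrackStatistics = true`; the keys and values are placeholders.

```csharp
using Microsoft.Extensions.Caching.Memory;

// Statistics are opt-in; without TrackStatistics the call below returns null.
var l1 = new MemoryCache(new MemoryCacheOptions { TrackStatistics = true });

l1.Set("product:42", "placeholder value", TimeSpan.FromMinutes(5));
l1.TryGetValue("product:42", out string? _); // counted as a hit
l1.TryGetValue("product:99", out string? _); // counted as a miss

MemoryCacheStatistics? stats = l1.GetCurrentStatistics();
if (stats is not null)
{
    long lookups = stats.TotalHits + stats.TotalMisses;
    double hitRatio = lookups == 0 ? 0 : (double)stats.TotalHits / lookups;
    Console.WriteLine($"hits={stats.TotalHits} misses={stats.TotalMisses} ratio={hitRatio:F2}");
}
```

In an ASP.NET Core app the equivalent opt-in would be `builder.Services.AddMemoryCache(o => o.TrackStatistics = true);`. The L2 hit ratio has to come from Redis itself, for example the keyspace hit and miss counters reported by `INFO stats`.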
Should You Implement Two-Level Caching?
If: Single server or very low traffic (<100 req/s)
Use: Not worth the complexity. Use IMemoryCache alone.
If: Multi-server, read-heavy, latency-sensitive (<10ms target)
Use: Strong yes. Two-level caching provides near-instant reads for hot data.
If: Multi-server, write-heavy with frequent invalidations
Use: Careful evaluation needed. Invalidation overhead may offset the benefits. Consider keeping L1 TTLs very short.
If: Already using Redis and p95 latency is acceptable (your target is looser than 50ms)
Use: You might not need the complexity. Two-level caching adds operational overhead.
● Production incident · POST-MORTEM · severity: high

Personalised Prices Served to Wrong Users — Response Caching Breaks Authentication

Symptom
Customers reporting that they see incorrect prices — sometimes prices from other accounts. Support tickets spike, data privacy concerns raised.
Assumption
Response caching only caches static content; it won't affect personalised data if authentication is in place.
Root cause
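The [ResponseCache] attribute with Location=Any was applied to an endpoint that returned user-specific prices. The middleware cached the response for the first user and served it to subsequent users, because its cache key is built from the request, not from the signed-in user. Location=Any also marks the response Cache-Control: public, so downstream proxies and CDNs are allowed to store it too.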
Fix
Removed [ResponseCache] from that endpoint and added [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)] for all authenticated endpoints. Added a policy to enforce NoStore on any endpoint that uses HttpContext.User.
Key lesson
  • Never use response caching on endpoints that return user-specific or authenticated data.
  • Always explicitly opt out of caching for personalised endpoints with [ResponseCache(NoStore = true)].
  • Review all [ResponseCache] attributes during code review — one wrong attribute can leak data across users.
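The contrast between a cacheable and a personalised endpoint can be sketched as follows. The controller, routes, and prices are hypothetical placeholders, and the sketch assumes an ASP.NET Core project (implicit usings) that already calls `AddControllers()` and `MapControllers()`.

```csharp
using Microsoft.AspNetCore.Mvc;

// Hypothetical controller for illustration only.
[ApiController]
[Route("api/prices")]
public class PricesController : ControllerBase
{
    // Identical for every caller, so a shared cached copy is safe.
    [HttpGet("list/{sku}")]
    [ResponseCache(Duration = 60, Location = ResponseCacheLocation.Any)]
    public IActionResult GetListPrice(string sku) =>
        Ok(new { sku, price = 9.99m }); // placeholder value

    // Depends on the signed-in user, so caching is explicitly switched off.
    [HttpGet("mine/{sku}")]
    [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)]
    public IActionResult GetMyPrice(string sku) =>
        Ok(new { sku, user = User.Identity?.Name, price = 8.49m }); // placeholder value
}
```

A real personalised endpoint would normally also carry `[Authorize]`; it is left out here to keep the sketch free of authentication setup.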
Production debug guide · Symptom-to-action guide for common caching problems · 4 entries
Symptom · 01
IMemoryCache entries never expire despite setting sliding expiration — data is stale for hours.
Fix
Check if you also set an absolute expiration. Sliding-only entries on busy endpoints live forever. Add SetAbsoluteExpiration() to enforce a hard limit.
Symptom · 02
IDistributedCache returns null but the key exists in Redis (verified via redis-cli).
Fix
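Check for a key mismatch first, then a serialiser mismatch. The Redis implementation of IDistributedCache prepends the configured InstanceName to every key and stores each entry as a Redis hash, so a key written by other code, or under a different prefix, will not be found. If bytes do come back but the object is empty or deserialisation throws, the writer and reader are using different serialisers (for example Newtonsoft.Json on write and System.Text.Json on read). Use one serialiser with the same options across all code paths.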
Symptom · 03
Response caching middleware not caching responses with query strings — every request still hits the controller.
Fix
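VaryByQueryKeys on [ResponseCache] requires the response caching middleware: register it with builder.Services.AddResponseCaching() and app.UseResponseCaching(). Without the middleware, setting VaryByQueryKeys throws an InvalidOperationException at runtime; it is not silently ignored. Place app.UseResponseCaching() before app.MapControllers(), and list every query key that changes the response.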
Symptom · 04
Distributed cache extremely slow under high read load — each call takes >50ms.
Fix
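StackExchange.Redis multiplexes commands over a shared connection rather than a pool, so there is no pool size to raise. Look instead for thread-pool starvation, sync-over-async calls, and large payloads, and make sure a single ConnectionMultiplexer is shared rather than created per request. Also consider using the two-level cache pattern (L1 local + L2 Redis) to reduce Redis round-trips for hot data.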
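One way to keep serialisation consistent (Symptom 02) is to route every read and write through a single helper. The extension class below is a hypothetical sketch, assuming System.Text.Json with web defaults; the method names `SetJsonAsync` and `GetJsonAsync` are not part of the framework.

```csharp
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

// Hypothetical helper: one serialiser and one options object for all cache traffic.
public static class DistributedCacheJsonExtensions
{
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web);

    public static async Task SetJsonAsync<T>(
        this IDistributedCache cache, string key, T value,
        DistributedCacheEntryOptions entryOptions, CancellationToken ct = default)
    {
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(value, JsonOptions);
        await cache.SetAsync(key, bytes, entryOptions, ct);
    }

    public static async Task<T?> GetJsonAsync<T>(
        this IDistributedCache cache, string key, CancellationToken ct = default)
    {
        byte[]? bytes = await cache.GetAsync(key, ct);
        if (bytes is null || bytes.Length == 0)
        {
            return default; // genuine cache miss
        }

        try
        {
            return JsonSerializer.Deserialize<T>(bytes, JsonOptions);
        }
        catch (JsonException)
        {
            // Payload written in another format: treat it as a miss so the caller refetches.
            return default;
        }
    }
}
```

Swallowing `JsonException` is a design choice made for brevity here; logging it would make format mismatches visible instead of hiding them behind extra database reads.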
★ Quick Cache Debugging Commands · Commands to diagnose caching issues in ASP.NET Core without leaving your terminal.
IMemoryCache entries not appearing — cache miss every time.
Immediate action
Log the cache key and check if Set() was called.
Commands
Add logging: `Console.WriteLine($"Cache key: {key}, hit: {_cache.TryGetValue(key, out var val)}");`
Check MemoryCache statistics: `_cache.GetCurrentStatistics()` (available in .NET 7+; returns null unless MemoryCacheOptions.TrackStatistics is true).
Fix now
Ensure you called _cache.Set() on cache miss and that cache options don't have zero expiration.
Redis distributed cache timeout / slow response.
Immediate action
Test Redis server connectivity from the app host.
Commands
`redis-cli -h <host> -p <port> ping` should return PONG.
`dotnet-counters monitor --process-id <pid> --counters Microsoft.AspNetCore.Hosting` to see current and per-second request counts.
Fix now
Tune the ConnectionMultiplexer timeouts (values in milliseconds): options.ConfigurationOptions.ConnectTimeout = 5000; options.ConfigurationOptions.SyncTimeout = 5000;
Response caching not working at all — every response hits the controller.
Immediate action
Inspect the HTTP response headers.
Commands
`curl -I http://localhost:5000/api/endpoint | grep -i cache`
Check middleware order: ensure app.UseResponseCaching() appears before app.MapControllers() in Program.cs.
Fix now
Add [ResponseCache(Duration = 60, Location = ResponseCacheLocation.Any)] on the endpoint, but only if it is public and anonymous, and verify that both AddResponseCaching() and UseResponseCaching() are registered.
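The registration order and query-key variation discussed above can be sketched in a single Program.cs. This assumes the Microsoft.NET.Sdk.Web project type with implicit usings; the controller, route, and `page` parameter are hypothetical.

```csharp
using Microsoft.AspNetCore.Mvc;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();
builder.Services.AddResponseCaching();   // services required by the middleware

var app = builder.Build();

app.UseResponseCaching();                // must run before the endpoints it caches
app.MapControllers();
app.Run();

// Hypothetical public catalogue endpoint, used only for illustration.
[ApiController]
[Route("api/catalogue")]
public class CatalogueController : ControllerBase
{
    // A separate cached copy is kept for each distinct "page" value.
    [HttpGet]
    [ResponseCache(Duration = 60, Location = ResponseCacheLocation.Any,
        VaryByQueryKeys = new[] { "page" })]
    public IActionResult Get([FromQuery] int page = 1) =>
        Ok(new { page, generatedAt = DateTime.UtcNow });
}
```

Requesting the same URL twice within 60 seconds should return the same `generatedAt` value if the middleware served the second response from its cache. Requests that carry an Authorization header are not cached by the middleware, so test without one.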
Caching Layer Comparison
Aspect | IMemoryCache (In-Memory) | IDistributedCache (Redis/SQL) | Response Caching
Where data lives | Server RAM (in-process) | External store (Redis/SQL) | Server RAM (HTTP response bytes)
Works across multiple servers | No: each server has its own cache | Yes: all servers share one store | Partially: server-side no; CDN yes
What gets cached | Any C# object | Any serialisable object | Entire HTTP response
Cache logic location | Your service/repository code | Your service/repository code | Middleware, before the controller runs
Serialisation required | No: stores live objects | Yes: must serialise to bytes/JSON | No: stores raw HTTP response
Best for | Single-server or small apps | Horizontally scaled APIs | Public, anonymous HTTP endpoints
Worst for | Multi-server deployments | Frequently changing data | Authenticated or personalised endpoints
Setup complexity | Low: one-line registration | Medium: needs Redis or SQL Server | Low: middleware + attribute
Relative speed | Fastest (in-process RAM) | Fast (but network round-trip to Redis) | Fastest for matched requests

Key takeaways

1. Use IMemoryCache for single-server deployments: it is the fastest cache available, but each server instance has its own isolated store, making it wrong for horizontally scaled apps.
2. Switch to IDistributedCache (Redis) the moment you have more than one server instance: shared state across all instances is non-negotiable for consistency, even at the cost of a network hop.
3. Always pair sliding expiration with an absolute expiration cap: sliding-only entries can live forever on busy endpoints, serving arbitrarily stale data indefinitely.
4. Response Caching operates at the HTTP layer before your controller runs: it is perfect for public, anonymous endpoints but must be explicitly disabled (NoStore) for any authenticated or user-specific response to prevent data leaking between users.

Common mistakes to avoid

3 patterns
×

Caching EF Core entities instead of DTOs

Symptom
After caching, the entity's lazy-loaded navigation properties throw ObjectDisposedException because the original DbContext has been disposed, or the cached entity becomes stale when related data changes.
Fix
Always map your entities to a simple POCO/DTO before caching. Never store EF Core tracked entities in cache. Use AutoMapper or manual mapping to create a snapshot.
×

Using IMemoryCache in a multi-server (load-balanced) deployment

Symptom
Users see inconsistent data depending on which server handles their request. Server A has the updated cache, Server B still serves old data.
Fix
Swap to IDistributedCache (Redis) as soon as you have more than one instance. If you must keep IMemoryCache for performance, pair it with a distributed cache in a two-level pattern and handle invalidation via pub/sub.
×

Setting only SlidingExpiration with no AbsoluteExpiration cap

Symptom
A cache entry that is read every minute with a 10-minute sliding window literally never expires. Outdated data can live in cache indefinitely, even after the database is updated.
Fix
Always pair SetSlidingExpiration() with SetAbsoluteExpiration() to guarantee a maximum staleness window regardless of access frequency.
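A minimal sketch of that pairing, assuming a standalone `MemoryCache`; the key, value, and durations are placeholders rather than recommendations.

```csharp
using Microsoft.Extensions.Caching.Memory;

var cache = new MemoryCache(new MemoryCacheOptions());

var entryOptions = new MemoryCacheEntryOptions()
    // Evict if nobody reads the entry for 10 minutes...
    .SetSlidingExpiration(TimeSpan.FromMinutes(10))
    // ...and evict after 1 hour regardless of how often it is read.
    .SetAbsoluteExpiration(TimeSpan.FromHours(1));

cache.Set("product:42", "cached value", entryOptions);

bool found = cache.TryGetValue("product:42", out string? value);
Console.WriteLine($"found={found}, value={value}");
```

With both limits set, a hot entry is refreshed from the source at least once an hour, while a cold entry is dropped after ten idle minutes.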
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR
What is the difference between IMemoryCache and IDistributedCache in ASP...
Q02 · SENIOR
Explain the cache-aside pattern. Why is it preferred over always writing...
Q03 · SENIOR
If Response Caching is configured correctly but cached responses are not...
Q01 of 03 · SENIOR

What is the difference between IMemoryCache and IDistributedCache in ASP.NET Core, and what specific scenario would force you to switch from one to the other?

ANSWER
IMemoryCache stores data in the RAM of the server process. It is the fastest option but isolated per server. IDistributedCache stores data in an external store (Redis, SQL Server) shared across all servers. You must switch when you scale out to multiple server instances behind a load balancer, because otherwise each server keeps an independent cache and users see inconsistent data depending on which instance answers. The switch costs a network hop and serialisation overhead, but every instance then reads the same cached state.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
What is the difference between AddMemoryCache and AddDistributedMemoryCache in ASP.NET Core?
02
How do I invalidate a cache entry in ASP.NET Core when my database record changes?
03
Can I use both IMemoryCache and IDistributedCache together in the same app?
04
How do I measure cache hit ratio in production?