Caching in ASP.NET Core: In-Memory, Distributed & Response Caching Explained
Every millisecond your API spends fetching the same database row it fetched three seconds ago is a millisecond wasted — and under load, those milliseconds stack up into seconds that cost you users. Caching is not a micro-optimisation; it is the difference between an app that collapses under real traffic and one that scales gracefully. High-traffic systems like e-commerce product pages, news feeds, and dashboards owe most of their performance not to faster hardware, but to well-designed caches.
The core problem caching solves is the cost of repetition. Database queries, HTTP calls to third-party APIs, and complex in-memory computations all take time proportional to their complexity — not proportional to how often you call them. Without caching, a product page hit 10,000 times a minute fires 10,000 identical SQL queries. With caching, it fires one, and returns the stored result for the other 9,999. The challenge — and the reason most developers get caching wrong — is deciding what to cache, for how long, and when to throw it away.
By the end of this article you will understand the three main caching layers available in ASP.NET Core (In-Memory, Distributed, and Response), know exactly which one to reach for in a given situation, and be able to implement each with production-grade patterns including cache-aside, sliding expiration, and cache invalidation. You will also walk away knowing the mistakes that silently destroy cache effectiveness in real apps.
In-Memory Caching with IMemoryCache — Fast, Simple, Single-Server
In-Memory caching stores data directly in the RAM of your web server process. It is the fastest cache available because there is zero network round-trip — the data lives in the same memory space as your application. ASP.NET Core exposes this through the IMemoryCache interface, which you register once and inject anywhere.
The pattern you will use 99% of the time is called cache-aside (also known as lazy loading): you try to get the value from cache first; if it is not there (a 'cache miss'), you fetch it from the real source, store it in cache, then return it. On the next call, you get a 'cache hit' and skip the expensive work entirely.
In-memory cache is the right choice when you have a single server deployment or when the cached data is local to one server (like a per-user preferences object). It is the wrong choice when you run multiple server instances behind a load balancer — because each server has its own isolated cache, and a user hitting Server A might get stale data that Server B already updated. That is the scenario where distributed caching becomes essential.
Memory cache entries support both absolute expiration (evict N minutes after the entry was written, regardless of how often it is read) and sliding expiration (evict only if nobody reads it for N minutes; each read resets the timer). Use sliding expiration for 'warm' data that is accessed frequently; use absolute for data that must stay fresh regardless of traffic.
```csharp
using Microsoft.Extensions.Caching.Memory;
using System;
using System.Threading.Tasks;

// Register IMemoryCache in Program.cs:
// builder.Services.AddMemoryCache();

public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly IProductRepository _repository;

    // Cache key constants prevent typos across the codebase
    private const string ProductCacheKeyPrefix = "product_";
    private static readonly TimeSpan ProductCacheDuration = TimeSpan.FromMinutes(10);

    public ProductService(IMemoryCache cache, IProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task<Product?> GetProductByIdAsync(int productId)
    {
        // Build a unique key per product so we can invalidate one without flushing all
        string cacheKey = $"{ProductCacheKeyPrefix}{productId}";

        // TryGetValue returns true on a cache HIT — we skip the database entirely
        if (_cache.TryGetValue(cacheKey, out Product? cachedProduct))
        {
            Console.WriteLine($"[CACHE HIT] Returning product {productId} from memory cache.");
            return cachedProduct;
        }

        // Cache MISS — go to the real data source
        Console.WriteLine($"[CACHE MISS] Fetching product {productId} from database.");
        Product? product = await _repository.GetByIdAsync(productId);

        if (product is not null)
        {
            // Configure cache entry options before storing
            var cacheOptions = new MemoryCacheEntryOptions()
                // Evict this entry if it hasn't been accessed in 5 minutes (sliding)
                .SetSlidingExpiration(TimeSpan.FromMinutes(5))
                // But always evict after 10 minutes regardless of access (absolute)
                .SetAbsoluteExpiration(ProductCacheDuration)
                // Mark as normal priority — the runtime can evict under memory pressure
                .SetPriority(CacheItemPriority.Normal);

            _cache.Set(cacheKey, product, cacheOptions);
            Console.WriteLine($"[CACHE SET] Product {productId} stored in memory cache.");
        }

        return product;
    }

    // Call this when a product is updated so the next read fetches fresh data
    public void InvalidateProductCache(int productId)
    {
        string cacheKey = $"{ProductCacheKeyPrefix}{productId}";
        _cache.Remove(cacheKey);
        Console.WriteLine($"[CACHE INVALIDATED] Removed product {productId} from cache.");
    }
}

// --- Simulated output for two sequential calls to GetProductByIdAsync(42) ---
// First call:
// [CACHE MISS] Fetching product 42 from database.
// [CACHE SET] Product 42 stored in memory cache.
//
// Second call (within 5 minutes):
// [CACHE HIT] Returning product 42 from memory cache.
```
Distributed Caching with IDistributedCache — Sharing State Across Multiple Servers
When you scale your app horizontally — multiple instances behind a load balancer — in-memory cache breaks down because each instance has its own isolated memory. User A might update a record on Server 1, but User B hits Server 2 which still has the old cached version. This is a consistency bug, not just a performance issue.
Distributed caching solves this by putting the cache outside the application in a shared store — typically Redis or SQL Server. All instances read from and write to the same cache, so everyone sees the same data. ASP.NET Core abstracts this behind IDistributedCache, meaning you can swap Redis for SQL Server (or vice versa) by changing one line in Program.cs without touching your service code.
Redis is the industry standard choice. It is an in-memory data store purpose-built for speed, supporting complex data types, pub/sub for cache invalidation, and cluster mode for high availability. SQL Server distributed cache exists for environments where you already have SQL infrastructure and cannot add Redis — but it is meaningfully slower.
The IDistributedCache API works with byte arrays, so you need to serialise your objects. The standard approach is JSON serialisation with System.Text.Json. A cleaner pattern is to wrap IDistributedCache in your own generic helper that handles serialisation transparently — which is exactly what the example below does.
```csharp
// Program.cs registration (Redis example):
// builder.Services.AddStackExchangeRedisCache(options =>
// {
//     options.Configuration = builder.Configuration.GetConnectionString("Redis");
//     options.InstanceName = "MyApp:"; // Prefix all keys to avoid collisions
// });
//
// For SQL Server instead, use:
// builder.Services.AddDistributedSqlServerCache(options => { ... });

using Microsoft.Extensions.Caching.Distributed;
using System;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// A generic wrapper that hides the byte-array pain of IDistributedCache
public class DistributedCacheService
{
    private readonly IDistributedCache _distributedCache;

    public DistributedCacheService(IDistributedCache distributedCache)
    {
        _distributedCache = distributedCache;
    }

    // Returns cached value or null on a miss — caller decides what to do
    public async Task<T?> GetAsync<T>(string cacheKey, CancellationToken cancellationToken = default)
        where T : class
    {
        byte[]? cachedBytes = await _distributedCache.GetAsync(cacheKey, cancellationToken);
        if (cachedBytes is null || cachedBytes.Length == 0)
        {
            return null; // Cache miss
        }

        // Deserialise from JSON bytes back to the strongly-typed object
        return JsonSerializer.Deserialize<T>(cachedBytes);
    }

    public async Task SetAsync<T>(
        string cacheKey,
        T value,
        TimeSpan absoluteExpiration,
        CancellationToken cancellationToken = default) where T : class
    {
        byte[] serialisedBytes = JsonSerializer.SerializeToUtf8Bytes(value);

        var cacheEntryOptions = new DistributedCacheEntryOptions
        {
            // Absolute expiration from now — always evict after this window
            AbsoluteExpirationRelativeToNow = absoluteExpiration
        };

        await _distributedCache.SetAsync(cacheKey, serialisedBytes, cacheEntryOptions, cancellationToken);
    }

    public async Task RemoveAsync(string cacheKey, CancellationToken cancellationToken = default)
    {
        await _distributedCache.RemoveAsync(cacheKey, cancellationToken);
    }
}

// --- Usage in a controller or service ---
public class OrderSummaryService
{
    private readonly DistributedCacheService _cache;
    private readonly IOrderRepository _orderRepository;
    private static readonly TimeSpan OrderSummaryCacheDuration = TimeSpan.FromMinutes(15);

    public OrderSummaryService(DistributedCacheService cache, IOrderRepository orderRepository)
    {
        _cache = cache;
        _orderRepository = orderRepository;
    }

    public async Task<OrderSummary?> GetOrderSummaryAsync(int customerId, CancellationToken cancellationToken)
    {
        string cacheKey = $"order_summary_{customerId}";

        // Step 1: Try the distributed cache first
        OrderSummary? cached = await _cache.GetAsync<OrderSummary>(cacheKey, cancellationToken);
        if (cached is not null)
        {
            Console.WriteLine($"[REDIS HIT] Order summary for customer {customerId} served from Redis.");
            return cached;
        }

        // Step 2: Cache miss — hit the database
        Console.WriteLine($"[REDIS MISS] Querying database for customer {customerId} order summary.");
        OrderSummary? summary = await _orderRepository.GetSummaryByCustomerIdAsync(customerId, cancellationToken);

        // Step 3: Store in Redis for the next caller — all server instances benefit
        if (summary is not null)
        {
            await _cache.SetAsync(cacheKey, summary, OrderSummaryCacheDuration, cancellationToken);
            Console.WriteLine($"[REDIS SET] Customer {customerId} summary cached for 15 minutes.");
        }

        return summary;
    }
}

// --- Simulated output for two sequential calls to GetOrderSummaryAsync(7, ...) ---
// First call:
// [REDIS MISS] Querying database for customer 7 order summary.
// [REDIS SET] Customer 7 summary cached for 15 minutes.
//
// Second call (within 15 minutes, from any server instance):
// [REDIS HIT] Order summary for customer 7 served from Redis.
```
Response Caching — Cache at the HTTP Layer Before Your Code Even Runs
In-Memory and Distributed caching are application-level caches — your C# code still runs and decides whether to call the database. Response Caching works at a completely different layer: it caches the entire HTTP response and serves it directly from middleware, before your controller action ever executes. This is the fastest possible cache because zero application code runs on a cache hit.
Response caching follows HTTP caching semantics via Cache-Control headers. When your action returns a response with Cache-Control: public, max-age=60, both the ASP.NET Core response cache middleware (server-side) and downstream proxies or CDNs (like Cloudflare or Azure CDN) know they can cache and replay that response for 60 seconds.
This makes it ideal for public, anonymous content: marketing pages, product catalogues, news articles — content that is the same for every user. It is completely wrong for authenticated or personalised content because cached responses ignore who the user is. A response cached for User A would be served to User B; for exactly this reason, the built-in middleware never caches requests that carry an Authorization header.
The [ResponseCache] attribute controls the Cache-Control header. The response caching middleware (registered with AddResponseCaching() and enabled with app.UseResponseCaching()) does the actual server-side caching. Both are needed for server-side caching to work — the attribute alone just sets the header for downstream caches like CDNs.
```csharp
// Program.cs — register and use the middleware (ORDER MATTERS):
// builder.Services.AddResponseCaching();
// ...
// app.UseResponseCaching(); // Must come before app.MapControllers()

using Microsoft.AspNetCore.Mvc;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

[ApiController]
[Route("api/[controller]")]
public class CatalogueController : ControllerBase
{
    private readonly ICatalogueService _catalogueService;

    public CatalogueController(ICatalogueService catalogueService)
    {
        _catalogueService = catalogueService;
    }

    // This response is cached for 60 seconds server-side AND tells CDNs they can cache it too.
    // 'public' means any cache (server, CDN, proxy) can store this response.
    // 'VaryByQueryKeys' means separate cache entries are kept per 'category' query param,
    // so /api/catalogue?category=shoes and ?category=hats each get their own cached version.
    [HttpGet]
    [ResponseCache(Duration = 60, Location = ResponseCacheLocation.Any, VaryByQueryKeys = new[] { "category" })]
    public async Task<IActionResult> GetProductsAsync([FromQuery] string category = "all")
    {
        Console.WriteLine($"[CONTROLLER HIT] Fetching products for category: {category}");

        // This line only runs on a cache MISS — during a 60-second window it runs once per category
        IEnumerable<ProductSummary> products = await _catalogueService.GetByCategoryAsync(category);
        return Ok(products);
    }

    // [ResponseCache(NoStore = true)] explicitly opts this endpoint OUT of caching.
    // Use this for any endpoint that returns user-specific or sensitive data.
    [HttpGet("my-orders")]
    [ResponseCache(NoStore = true, Location = ResponseCacheLocation.None)]
    public async Task<IActionResult> GetMyOrdersAsync()
    {
        // Each call always hits the controller — never cached
        var orders = await _catalogueService.GetOrdersForCurrentUserAsync();
        return Ok(orders);
    }
}

// HTTP response headers produced by the first endpoint:
// Cache-Control: public,max-age=60
// Vary: Accept-Encoding
//
// HTTP response headers produced by the second endpoint:
// Cache-Control: no-store,no-cache

// --- Simulated request trace ---
// GET /api/catalogue?category=shoes                 (first request)
// [CONTROLLER HIT] Fetching products for category: shoes
//   -> Response cached for 60 seconds
//
// GET /api/catalogue?category=shoes                 (within 60 seconds)
//   -> Served from response cache. Controller NOT called.
//
// GET /api/catalogue?category=hats
// [CONTROLLER HIT] Fetching products for category: hats
//   -> Separate cache entry created for 'hats'
```
| Aspect | IMemoryCache (In-Memory) | IDistributedCache (Redis/SQL) | Response Caching |
|---|---|---|---|
| Where data lives | Server RAM (in-process) | External store (Redis/SQL) | Server RAM (HTTP response bytes) |
| Works across multiple servers | No — each server has its own cache | Yes — all servers share one store | Partially — server-side no; CDN yes |
| What gets cached | Any C# object | Any serialisable object | Entire HTTP response |
| Cache logic location | Your service/repository code | Your service/repository code | Middleware — before controller runs |
| Serialisation required | No — stores live objects | Yes — must serialise to bytes/JSON | No — stores raw HTTP response |
| Best for | Single-server or small apps | Horizontally scaled APIs | Public, anonymous HTTP endpoints |
| Worst for | Multi-server deployments | Frequently changing data | Authenticated or personalised endpoints |
| Setup complexity | Low — one line registration | Medium — needs Redis or SQL Server | Low — middleware + attribute |
| Relative speed | Fastest (in-process RAM) | Fast (but network round-trip to Redis) | Fastest for matched requests |
🎯 Key Takeaways
- Use IMemoryCache for single-server deployments — it is the fastest cache available, but each server instance has its own isolated store, making it wrong for horizontally scaled apps.
- Switch to IDistributedCache (Redis) the moment you have more than one server instance — shared state across all instances is non-negotiable for consistency, even at the cost of a network hop.
- Always pair sliding expiration with an absolute expiration cap — sliding-only entries can live forever on busy endpoints, serving arbitrarily stale data indefinitely.
- Response Caching operates at the HTTP layer before your controller runs — it is perfect for public, anonymous endpoints but must be explicitly disabled (NoStore) for any authenticated or user-specific response to prevent data leaking between users.
⚠ Common Mistakes to Avoid
- ✕ Mistake 1: Caching objects that contain DbContext or EF Core entities directly — the cached entity stays attached to a disposed DbContext, causing ObjectDisposedException or stale navigation properties on the next access. Fix: Always cache plain DTOs or value types, never EF Core tracked entities. Map to a DTO before calling _cache.Set().
- ✕ Mistake 2: Using IMemoryCache in a multi-server (load-balanced) deployment and wondering why users see inconsistent data — each server's cache is independent, so Server A can serve stale data that Server B already invalidated. Fix: Swap to IDistributedCache (Redis) as soon as you have more than one instance, or use Response Caching with a shared CDN layer.
- ✕ Mistake 3: Setting only SlidingExpiration with no AbsoluteExpiration cap — a cache entry that is read every minute with a 10-minute sliding window literally never expires, so outdated data can live in cache indefinitely even after the database is updated. Fix: Always pair SetSlidingExpiration() with SetAbsoluteExpiration() to guarantee a maximum staleness window regardless of access frequency.
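To make the fix for Mistake 1 concrete, here is a minimal sketch of the entity-to-DTO mapping step. `ProductEntity`, `ProductDto`, and `ProductMapper` are illustrative names for this sketch, not types from the examples above; shape them to your own model.

```csharp
// Hypothetical EF Core entity: tracked by the change tracker, may hold lazy navigations.
// Never put this object into the cache directly.
public class ProductEntity
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
    // public Category Category { get; set; } // navigation property: the usual trap
}

// Detached, immutable snapshot: safe to cache because it holds no DbContext reference
public record ProductDto(int Id, string Name, decimal Price);

public static class ProductMapper
{
    public static ProductDto ToDto(ProductEntity entity)
        => new(entity.Id, entity.Name, entity.Price);
}

// In the service: map first, then cache the DTO, e.g.
//   _cache.Set(cacheKey, ProductMapper.ToDto(entity), cacheOptions);
```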
Interview Questions on This Topic
- Q: What is the difference between IMemoryCache and IDistributedCache in ASP.NET Core, and what specific scenario would force you to switch from one to the other?
- Q: Explain the cache-aside pattern. Why is it preferred over always writing to the cache on every database write, and what are its consistency trade-offs?
- Q: If Response Caching is configured correctly but cached responses are not being served for requests with query strings, what is the most likely cause and how would you diagnose it?
Frequently Asked Questions
What is the difference between AddMemoryCache and AddDistributedMemoryCache in ASP.NET Core?
AddMemoryCache registers IMemoryCache — a true in-process memory cache tied to the server's RAM. AddDistributedMemoryCache registers an IDistributedCache implementation that also uses in-process memory, but behind the distributed cache interface. It exists purely for local development and testing so you can code against IDistributedCache without needing a real Redis server running. Never use AddDistributedMemoryCache in production — it has the same multi-server isolation problem as IMemoryCache.
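To make that dev/prod split concrete, here is one way the registration might look in Program.cs. The "Redis" connection-string name and "MyApp:" prefix mirror the earlier registration comment and are assumptions about your configuration; this is a sketch, not the only valid arrangement.

```csharp
// Program.cs — code against IDistributedCache everywhere; only the registration changes.
var builder = WebApplication.CreateBuilder(args);

if (builder.Environment.IsDevelopment())
{
    // In-process stand-in for IDistributedCache: local dev/test only, not shared across servers
    builder.Services.AddDistributedMemoryCache();
}
else
{
    // Real shared store in production
    builder.Services.AddStackExchangeRedisCache(options =>
    {
        options.Configuration = builder.Configuration.GetConnectionString("Redis");
        options.InstanceName = "MyApp:";
    });
}
```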
How do I invalidate a cache entry in ASP.NET Core when my database record changes?
For IMemoryCache call _cache.Remove(cacheKey) immediately after your database update succeeds. For IDistributedCache call await _distributedCache.RemoveAsync(cacheKey). The cleanest architecture is to invalidate in the same service method that performs the write — update the database, then evict the cache key — so the next read triggers a fresh fetch. For complex scenarios with many related keys, use a cache key prefix strategy or Redis tag-based invalidation patterns.
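A minimal sketch of that update-then-evict ordering against IMemoryCache follows. `ProductWriteService` is a hypothetical name, and a `Func` stands in for the repository write so the snippet stays self-contained.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical write-side service showing the update-then-evict ordering.
// The Func<int, decimal, Task> stands in for a repository call such as an UpdatePriceAsync method.
public class ProductWriteService
{
    private readonly IMemoryCache _cache;
    private readonly Func<int, decimal, Task> _persistPrice;

    public ProductWriteService(IMemoryCache cache, Func<int, decimal, Task> persistPrice)
    {
        _cache = cache;
        _persistPrice = persistPrice;
    }

    public async Task UpdateProductPriceAsync(int productId, decimal newPrice)
    {
        // 1. Commit the write first. If it throws, the cache still holds valid (old) data.
        await _persistPrice(productId, newPrice);

        // 2. Only after a successful write, evict the now-stale entry.
        _cache.Remove($"product_{productId}");

        // The next read is a cache miss and re-fetches fresh data from the database.
    }
}
```

Evicting after the write (rather than before) avoids a window where a concurrent read repopulates the cache with the old value mid-update.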
Can I use both IMemoryCache and IDistributedCache together in the same app?
Yes, and this is actually a common production pattern called a two-level (L1/L2) cache. You check IMemoryCache first (L1 — fastest, no network), and only on a miss do you check IDistributedCache (L2 — Redis). On an L2 hit, you populate L1 so the next request on that same server is instant. This reduces Redis round-trips significantly under high read traffic while keeping multi-server consistency intact.
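A minimal sketch of such an L1/L2 helper, built on the same IMemoryCache and IDistributedCache abstractions used earlier. The `TwoLevelCache` name and the 30-second L1 window are illustrative choices, not a standard API; keep the L1 lifetime short, since L1 entries cannot see invalidations performed on other servers.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical L1/L2 read-through helper: check process memory first, then the shared store.
public class TwoLevelCache
{
    private readonly IMemoryCache _l1;       // per-server, no network hop, no serialisation
    private readonly IDistributedCache _l2;  // shared across all instances (e.g. Redis)

    // Short L1 lifetime bounds how stale a server can be relative to L2
    private static readonly TimeSpan L1Duration = TimeSpan.FromSeconds(30);

    public TwoLevelCache(IMemoryCache l1, IDistributedCache l2)
    {
        _l1 = l1;
        _l2 = l2;
    }

    public async Task<T?> GetAsync<T>(string key) where T : class
    {
        // L1 hit: fastest possible path
        if (_l1.TryGetValue(key, out T? local))
        {
            return local;
        }

        // L1 miss: fall through to the shared store
        byte[]? bytes = await _l2.GetAsync(key);
        if (bytes is null)
        {
            return null; // full miss: caller loads from the database and calls SetAsync
        }

        T? value = JsonSerializer.Deserialize<T>(bytes);
        if (value is not null)
        {
            // Promote to L1 so the next request on this server skips the network entirely
            _l1.Set(key, value, L1Duration);
        }
        return value;
    }

    public async Task SetAsync<T>(string key, T value, TimeSpan l2Duration) where T : class
    {
        _l1.Set(key, value, L1Duration);
        await _l2.SetAsync(key, JsonSerializer.SerializeToUtf8Bytes(value),
            new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = l2Duration });
    }
}
```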
Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.