Caching Patterns and Pitfalls
Introduction
Caching is the practice of storing copies of frequently accessed data in a faster storage layer to reduce latency, lower database load, and improve application throughput. While conceptually simple, caching introduces subtle complexity around data consistency, invalidation timing, and failure modes that can turn a performance optimization into a production nightmare. Understanding the core caching patterns — and the pitfalls that accompany each — is essential for any developer building scalable, reliable systems.
Core Concepts
What Is a Cache?
A cache is a high-speed data storage layer that sits between your application and a slower backing store (database, remote API, filesystem). When a request arrives, the application checks the cache first. If the data is present (a cache hit), it's returned immediately. If not (a cache miss), the data is fetched from the origin, stored in the cache for future use, and then returned.
Key Terminology
| Term | Definition |
|---|---|
| Cache Hit | Requested data found in cache |
| Cache Miss | Requested data not in cache; must fetch from origin |
| Hit Ratio | Percentage of requests served from cache |
| TTL (Time-To-Live) | Duration before a cached entry expires |
| Eviction | Removing entries from cache due to capacity limits |
| Invalidation | Explicitly removing or updating stale cache entries |
| Cold Start | Cache is empty, all requests are misses |
| Warm Cache | Cache populated with frequently accessed data |
Cache Layers in a Typical Architecture
Caching Patterns
1. Cache-Aside (Lazy Loading)
The most common caching pattern. The application is responsible for reading from and writing to the cache. On a miss, the application fetches from the database, populates the cache, and returns the result.
Pros: Only requested data is cached (no wasted memory). Cache failures don't break the application — it falls back to DB.
Cons: First request always slow (cache miss penalty). Data can become stale.
```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class CacheAsideExample {

    // Simulated cache and database
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();

    public CacheAsideExample() {
        // Seed the database
        database.put("user:1", "{\"id\":1,\"name\":\"Alice\"}");
        database.put("user:2", "{\"id\":2,\"name\":\"Bob\"}");
    }

    public Optional<String> getUser(String key) {
        // Step 1: Check cache
        String cached = cache.get(key);
        if (cached != null) {
            System.out.println("Cache HIT for " + key);
            return Optional.of(cached);
        }
        System.out.println("Cache MISS for " + key);

        // Step 2: Fetch from database
        String fromDb = database.get(key);
        if (fromDb == null) {
            System.out.println("Not found in database: " + key);
            return Optional.empty();
        }

        // Step 3: Populate cache
        cache.put(key, fromDb);
        System.out.println("Cached " + key);
        return Optional.of(fromDb);
    }

    public void updateUser(String key, String value) {
        // Update database first
        database.put(key, value);
        // Invalidate cache
        cache.remove(key);
        System.out.println("Updated DB and invalidated cache for " + key);
    }

    public static void main(String[] args) {
        CacheAsideExample service = new CacheAsideExample();
        service.getUser("user:1");  // First call: cache miss
        service.getUser("user:1");  // Second call: cache hit
        // Update triggers invalidation
        service.updateUser("user:1", "{\"id\":1,\"name\":\"Alice Updated\"}");
        service.getUser("user:1");  // Cache miss again, fetches updated data
    }
}
```
2. Read-Through Cache
The cache itself is responsible for loading data from the origin on a miss. The application only talks to the cache — never directly to the database for reads.
```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ReadThroughCache<K, V> {

    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        return store.computeIfAbsent(key, k -> {
            System.out.println("Cache MISS — loading from origin for: " + k);
            V value = loader.apply(k);
            if (value == null) {
                throw new RuntimeException("Key not found in origin: " + k);
            }
            return value;
        });
    }

    public void invalidate(K key) {
        store.remove(key);
    }

    public static void main(String[] args) {
        // Simulated database
        Map<String, String> db = Map.of(
                "product:1", "Laptop",
                "product:2", "Phone"
        );
        ReadThroughCache<String, String> cache = new ReadThroughCache<>(db::get);
        System.out.println(cache.get("product:1")); // MISS, loads from DB
        System.out.println(cache.get("product:1")); // HIT
        System.out.println(cache.get("product:2")); // MISS, loads from DB
    }
}
```
3. Write-Through Cache
Every write goes to the cache and the backing store synchronously. The write is only considered complete when both the cache and database have been updated.
Pros: Cache and DB are always consistent. No stale reads after writes.
Cons: Higher write latency (must wait for both). Caches data that may never be read.
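A minimal write-through sketch, using in-memory maps to stand in for the cache and backing store, mirroring the earlier examples (the class and key names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WriteThroughCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();

    // The write is complete only after BOTH stores are updated.
    public void put(String key, String value) {
        database.put(key, value); // synchronous write to the backing store
        cache.put(key, value);    // then mirror it into the cache
    }

    // Reads are served from the cache, which always reflects the last write.
    public String get(String key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        WriteThroughCache cache = new WriteThroughCache();
        cache.put("user:1", "Alice");
        System.out.println(cache.get("user:1")); // Alice — no stale window
    }
}
```

The ordering is deliberate: writing the database first means a crash between the two puts leaves the cache missing an entry (a harmless miss on the next read) rather than holding a value the database never saw.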
4. Write-Behind (Write-Back) Cache
Writes go to the cache immediately, and the cache asynchronously flushes changes to the database in the background. This dramatically reduces write latency.
Pros: Very low write latency. Can batch and coalesce writes.
Cons: Risk of data loss if cache fails before flushing. Complex to implement correctly.
```java
import java.util.Map;
import java.util.concurrent.*;

public class WriteBehindCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();
    private final BlockingQueue<Map.Entry<String, String>> writeQueue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

    public WriteBehindCache() {
        // Flush every 2 seconds
        flusher.scheduleAtFixedRate(this::flush, 2, 2, TimeUnit.SECONDS);
    }

    public void put(String key, String value) {
        cache.put(key, value);
        writeQueue.offer(Map.entry(key, value));
        System.out.println("Cached " + key + " (write queued)");
    }

    public String get(String key) {
        return cache.getOrDefault(key, database.get(key));
    }

    private void flush() {
        int count = 0;
        Map.Entry<String, String> entry;
        while ((entry = writeQueue.poll()) != null) {
            database.put(entry.getKey(), entry.getValue());
            count++;
        }
        if (count > 0) {
            System.out.println("Flushed " + count + " entries to database");
        }
    }

    public void shutdown() {
        flush(); // Final flush
        flusher.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        WriteBehindCache cache = new WriteBehindCache();
        cache.put("session:abc", "{user:'Alice',lastAccess:'now'}");
        cache.put("session:def", "{user:'Bob',lastAccess:'now'}");
        System.out.println("Read from cache: " + cache.get("session:abc"));
        // Wait for async flush
        Thread.sleep(3000);
        cache.shutdown();
    }
}
```
5. Refresh-Ahead
The cache proactively refreshes entries before they expire. If an entry is accessed and its TTL is close to expiring (within a configurable threshold), the cache triggers an asynchronous background refresh.
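A rough refresh-ahead sketch (the TTL, refresh window, and loader shown here are illustrative assumptions, not from the text): each entry records its expiry, and a read that lands inside the refresh window returns the current value immediately while scheduling an asynchronous reload.

```java
import java.util.Map;
import java.util.concurrent.*;
import java.util.function.Function;

public class RefreshAheadCache {

    private record Entry(String value, long expiresAtMillis) {}

    private final Map<String, Entry> store = new ConcurrentHashMap<>();
    private final Function<String, String> loader;
    private final ExecutorService refresher = Executors.newSingleThreadExecutor();
    private final long ttlMillis;
    private final long refreshWindowMillis; // refresh when this close to expiry

    public RefreshAheadCache(Function<String, String> loader, long ttlMillis, long refreshWindowMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.refreshWindowMillis = refreshWindowMillis;
    }

    public String get(String key) {
        Entry entry = store.get(key);
        long now = System.currentTimeMillis();
        if (entry == null || entry.expiresAtMillis() <= now) {
            return load(key); // miss or expired: synchronous load
        }
        if (entry.expiresAtMillis() - now < refreshWindowMillis) {
            refresher.submit(() -> load(key)); // near expiry: refresh in background
        }
        return entry.value(); // serve the current value without waiting
    }

    private String load(String key) {
        String value = loader.apply(key);
        store.put(key, new Entry(value, System.currentTimeMillis() + ttlMillis));
        return value;
    }

    public void shutdown() {
        refresher.shutdown();
    }
}
```

With this shape, hot keys rarely pay the miss penalty: as long as they are read more often than the refresh window, the background reload keeps them perpetually fresh.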
Pattern Comparison
Eviction Strategies
When a cache reaches capacity, it must decide which entries to remove.
```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;

    public LRUCache(int maxSize) {
        // accessOrder=true makes it LRU
        super(maxSize, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean shouldRemove = size() > maxSize;
        if (shouldRemove) {
            System.out.println("Evicting: " + eldest.getKey() + " -> " + eldest.getValue());
        }
        return shouldRemove;
    }

    public static void main(String[] args) {
        LRUCache<String, String> cache = new LRUCache<>(3);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");
        System.out.println("Cache: " + cache);                // {a=1, b=2, c=3}
        cache.get("a");      // Access 'a' — moves it to most recent
        cache.put("d", "4"); // Evicts 'b' (least recently used)
        System.out.println("Cache after adding d: " + cache); // {c=3, a=1, d=4}
        cache.put("e", "5"); // Evicts 'c'
        System.out.println("Cache after adding e: " + cache); // {a=1, d=4, e=5}
    }
}
```
Common Pitfalls
Pitfall 1: Cache Stampede (Thundering Herd)
When a popular cache entry expires, hundreds of concurrent requests all experience a cache miss simultaneously and flood the database.
Solution: Locking / Single-Flight
Only one request fetches from the origin; others wait for the result.
```java
import java.util.Map;
import java.util.concurrent.*;

public class StampedeProtectedCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, CompletableFuture<String>> inflightRequests = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();

    public StampedeProtectedCache() {
        database.put("hot-item", "{\"name\":\"Popular Widget\",\"price\":29.99}");
    }

    public String get(String key) throws Exception {
        // Check cache
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        // Single-flight: only one request loads from DB
        CompletableFuture<String> future = inflightRequests.computeIfAbsent(key, k -> {
            System.out.println(Thread.currentThread().getName() + " is the loader for " + k);
            return CompletableFuture.supplyAsync(() -> {
                try {
                    Thread.sleep(500); // Simulate slow DB query
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                String value = database.get(k);
                if (value != null) {
                    cache.put(k, value);
                }
                inflightRequests.remove(k);
                return value;
            });
        });
        System.out.println(Thread.currentThread().getName() + " waiting for result of " + key);
        return future.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        StampedeProtectedCache cache = new StampedeProtectedCache();
        ExecutorService pool = Executors.newFixedThreadPool(5);
        // Simulate 5 concurrent requests for the same key
        for (int i = 0; i < 5; i++) {
            pool.submit(() -> {
                try {
                    String result = cache.get("hot-item");
                    System.out.println(Thread.currentThread().getName() + " got: " + result);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```
Pitfall 2: Cache Penetration
Requests for keys that don't exist in the database bypass the cache every time, because there's nothing to cache. Attackers can exploit this to overwhelm the database.
Solution: Cache negative results (null markers) with short TTLs, or use a Bloom filter.
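The Bloom filter option can be sketched as follows — a tiny hand-rolled filter for illustration only (a production system would typically reach for a library implementation such as Guava's BloomFilter). Keys that were never inserted are rejected before the cache or database is touched, and the filter never produces false negatives:

```java
import java.util.BitSet;

public class BloomFilterGate {

    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public BloomFilterGate(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive hashCount indices from two hashes (double-hashing scheme).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | (h1 << 16);
        return Math.floorMod(h1 + i * h2, size);
    }

    // Call when a key is known to exist (e.g., on insert into the database).
    public void add(String key) {
        for (int i = 0; i < hashCount; i++) bits.set(index(key, i));
    }

    // false => key definitely absent: skip cache AND database entirely.
    // true  => key probably present (small false-positive rate).
    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterGate gate = new BloomFilterGate(1024, 3);
        gate.add("user:1");
        System.out.println(gate.mightContain("user:1"));           // true
        System.out.println(gate.mightContain("no-such-user-xyz")); // almost certainly false
    }
}
```

The trade-off versus negative caching: the filter costs constant memory regardless of how many bogus keys attackers probe, but it must be rebuilt or updated as new keys are created.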
Pitfall 3: Cache Avalanche
Many cache entries expire at the same time (e.g., all set with the same TTL during a cold start), causing a massive spike in database load.
Solution: Add random jitter to TTL values.
```java
import java.util.concurrent.ThreadLocalRandom;

public class TTLJitter {

    private static final int BASE_TTL_SECONDS = 300;    // 5 minutes
    private static final int JITTER_RANGE_SECONDS = 60; // ±60 seconds

    public static int calculateTTL() {
        int jitter = ThreadLocalRandom.current().nextInt(-JITTER_RANGE_SECONDS, JITTER_RANGE_SECONDS + 1);
        return BASE_TTL_SECONDS + jitter;
    }

    public static void main(String[] args) {
        System.out.println("TTL values with jitter:");
        for (int i = 0; i < 10; i++) {
            System.out.println("  Entry " + i + ": " + calculateTTL() + "s");
        }
    }
}
```
Pitfall 4: Stale Data and Inconsistency
The classic cache invalidation problem. After a database update, the cache may still serve old data.
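A minimal demonstration of the stale window, using in-memory stand-ins like the earlier examples: a write that updates the database but skips invalidation leaves the cache serving the old value until the entry expires or is evicted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleReadDemo {
    public static void main(String[] args) {
        Map<String, String> cache = new ConcurrentHashMap<>();
        Map<String, String> database = new ConcurrentHashMap<>();

        database.put("price:1", "10.00");
        cache.put("price:1", database.get("price:1")); // populated on a miss

        // Writer updates the database but forgets to invalidate the cache.
        database.put("price:1", "12.00");

        System.out.println("DB:    " + database.get("price:1")); // 12.00
        System.out.println("Cache: " + cache.get("price:1"));    // 10.00 — stale

        // The fix: delete the cache key as part of the write path.
        cache.remove("price:1");
        String refreshed = cache.computeIfAbsent("price:1", database::get);
        System.out.println("After invalidation: " + refreshed);  // 12.00
    }
}
```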
Pitfall 5: Hot Key Problem
A single key receives disproportionately high traffic, overwhelming the cache node that stores it.
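One common mitigation, sketched here under assumed names: spread the hot key across N replica keys by appending a suffix, so reads fan out over N cache nodes instead of hammering one (writes must then populate every replica). An in-process local cache in front of the distributed cache is another frequent answer.

```java
import java.util.concurrent.ThreadLocalRandom;

public class HotKeySpreader {

    private final int replicas;

    public HotKeySpreader(int replicas) {
        this.replicas = replicas;
    }

    // Writes populate every replica key so any of them can serve a read.
    public String[] writeKeys(String key) {
        String[] keys = new String[replicas];
        for (int i = 0; i < replicas; i++) {
            keys[i] = key + "#" + i;
        }
        return keys;
    }

    // Reads pick one replica at random, spreading load across cache nodes.
    public String readKey(String key) {
        int i = ThreadLocalRandom.current().nextInt(replicas);
        return key + "#" + i;
    }

    public static void main(String[] args) {
        HotKeySpreader spreader = new HotKeySpreader(4);
        // Prints: hot-item#0, hot-item#1, hot-item#2, hot-item#3
        System.out.println(String.join(", ", spreader.writeKeys("hot-item")));
        System.out.println("Read goes to: " + spreader.readKey("hot-item"));
    }
}
```

The cost is N-fold write amplification and N copies in memory, which is why this is applied only to the handful of keys that profiling identifies as hot.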
Redis Caching with Jedis
A practical example targeting AWS ElastiCache (Redis-compatible) through the Jedis client:
```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class RedisCacheService {

    private final JedisPool pool;
    private static final int DEFAULT_TTL = 300;

    public RedisCacheService(String host, int port) {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxTotal(50);
        config.setMaxIdle(10);
        config.setMinIdle(2);
        config.setTestOnBorrow(true);
        this.pool = new JedisPool(config, host, port);
    }

    public String getOrLoad(String key, DatabaseLoader loader) {
        try (Jedis jedis = pool.getResource()) {
            // Try cache
            String cached = jedis.get(key);
            if (cached != null) {
                if ("__NULL__".equals(cached)) {
                    return null; // Negative cache — prevents cache penetration
                }
                return cached;
            }
            // Load from database
            String value = loader.load(key);
            if (value != null) {
                int ttl = DEFAULT_TTL + (int) (Math.random() * 60); // Jitter
                jedis.setex(key, ttl, value);
            } else {
                // Cache negative result with short TTL
                jedis.setex(key, 30, "__NULL__");
            }
            return value;
        } catch (Exception e) {
            System.err.println("Cache error, falling back to DB: " + e.getMessage());
            // Graceful degradation: skip cache on failure
            return loader.load(key);
        }
    }

    public void invalidate(String key) {
        try (Jedis jedis = pool.getResource()) {
            jedis.del(key);
        }
    }

    @FunctionalInterface
    public interface DatabaseLoader {
        String load(String key);
    }

    public void close() {
        pool.close();
    }
}
```
Cache Invalidation Strategies
Monitoring and Observability
Key metrics to track for any cache:
| Metric | Target | Action If Violated |
|---|---|---|
| Hit Ratio | > 80% | Check access patterns, increase TTL |
| Latency (p99) | < 5ms | Scale cache cluster, check network |
| Eviction Rate | Low/stable | Increase cache size |
| Memory Usage | < 80% capacity | Add nodes or reduce TTL |
| Connection Pool Utilization | < 70% | Increase pool size |
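As a concrete example of the first metric, hit ratio is simply hits / (hits + misses). A minimal thread-safe counter (illustrative, not tied to any particular cache client) might look like:

```java
import java.util.concurrent.atomic.LongAdder;

public class CacheMetrics {

    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHit()  { hits.increment(); }
    public void recordMiss() { misses.increment(); }

    // Hit ratio = hits / (hits + misses); reports 0.0 before any requests.
    public double hitRatio() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }

    public static void main(String[] args) {
        CacheMetrics metrics = new CacheMetrics();
        for (int i = 0; i < 90; i++) metrics.recordHit();
        for (int i = 0; i < 10; i++) metrics.recordMiss();
        System.out.printf("Hit ratio: %.0f%%%n", metrics.hitRatio() * 100); // Hit ratio: 90%
    }
}
```

In practice these counters would be exported to a metrics backend (CloudWatch, Prometheus, etc.) and alerted on, rather than printed.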
Best Practices
Start with Cache-Aside: It's the simplest pattern and works for most use cases. Only adopt more complex patterns when you have specific requirements.
Always set TTLs: Never cache data indefinitely. Even long TTLs (hours) protect against unbounded staleness and memory growth.
Add jitter to TTLs: Prevent cache avalanche by randomizing expiration times. A simple `baseTTL + random(0, jitterRange)` is sufficient.
Implement graceful degradation: When the cache is unavailable, fall back to the database. A cache failure should never become an application failure.
Use negative caching sparingly: Cache "not found" results with very short TTLs (15–30 seconds) to prevent cache penetration without permanently hiding newly created records.
Protect against stampedes: Use locking, single-flight patterns, or early refresh to prevent concurrent cache misses from overwhelming the origin.
Monitor your hit ratio obsessively: A declining hit ratio is the earliest indicator of misconfigured TTLs, insufficient cache size, or changed access patterns.
Prefer invalidation over update: When data changes, delete the cache key rather than trying to update it. This avoids race conditions between concurrent writers.
Size your cache based on working set: Cache the 20% of data that serves 80% of requests. Caching everything is wasteful and counterproductive.
Never cache sensitive data without encryption: Session tokens, PII, and financial data in cache must be encrypted at rest and in transit (TLS to Redis, encrypted ElastiCache).
Related Concepts
- Eventual Consistency — Caching introduces eventual consistency between cache and database; understanding this model is essential.
- Asynchronous Programming — Write-behind and refresh-ahead patterns rely on async processing.
- High-Performance Streaming Operations — Caching complements streaming for high-throughput data pipelines.
- Serverless and Container Workloads — Cache cold starts are especially impactful in serverless environments where instances scale to zero.