
Caching Patterns and Pitfalls

Introduction

Caching is the practice of storing copies of frequently accessed data in a faster storage layer to reduce latency, lower database load, and improve application throughput. While conceptually simple, caching introduces subtle complexity around data consistency, invalidation timing, and failure modes that can turn a performance optimization into a production nightmare. Understanding the core caching patterns — and the pitfalls that accompany each — is essential for any developer building scalable, reliable systems.

Core Concepts

What Is a Cache?

A cache is a high-speed data storage layer that sits between your application and a slower backing store (database, remote API, filesystem). When a request arrives, the application checks the cache first. If the data is present (a cache hit), it's returned immediately. If not (a cache miss), the data is fetched from the origin, stored in the cache for future use, and then returned.

Key Terminology

Term                        | Definition
----------------------------|-----------------------------------------------------
Cache Hit                   | Requested data found in cache
Cache Miss                  | Requested data not in cache; must fetch from origin
Hit Ratio                   | Percentage of requests served from cache
TTL (Time-To-Live)          | Duration before a cached entry expires
Eviction                    | Removing entries from cache due to capacity limits
Invalidation                | Explicitly removing or updating stale cache entries
Cold Start                  | Cache is empty; all requests are misses
Warm Cache                  | Cache populated with frequently accessed data

Cache Layers in a Typical Architecture

Caching typically happens at several layers, each closer to the user than the last: the browser cache on the client, a CDN at the edge, an application-level cache such as Redis or Memcached in front of the database, and finally the database's own buffer pool. The patterns below apply mainly to the application-level cache.

Caching Patterns

1. Cache-Aside (Lazy Loading)

The most common caching pattern. The application is responsible for reading from and writing to the cache. On a miss, the application fetches from the database, populates the cache, and returns the result.

Pros: Only requested data is cached (no wasted memory). Cache failures don't break the application — it falls back to DB.

Cons: First request always slow (cache miss penalty). Data can become stale.

java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.Optional;

public class CacheAsideExample {

    // Simulated cache and database
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();

    public CacheAsideExample() {
        // Seed the database
        database.put("user:1", "{\"id\":1,\"name\":\"Alice\"}");
        database.put("user:2", "{\"id\":2,\"name\":\"Bob\"}");
    }

    public Optional<String> getUser(String key) {
        // Step 1: Check cache
        String cached = cache.get(key);
        if (cached != null) {
            System.out.println("Cache HIT for " + key);
            return Optional.of(cached);
        }

        System.out.println("Cache MISS for " + key);

        // Step 2: Fetch from database
        String fromDb = database.get(key);
        if (fromDb == null) {
            System.out.println("Not found in database: " + key);
            return Optional.empty();
        }

        // Step 3: Populate cache
        cache.put(key, fromDb);
        System.out.println("Cached " + key);

        return Optional.of(fromDb);
    }

    public void updateUser(String key, String value) {
        // Update database first
        database.put(key, value);
        // Invalidate cache
        cache.remove(key);
        System.out.println("Updated DB and invalidated cache for " + key);
    }

    public static void main(String[] args) {
        CacheAsideExample service = new CacheAsideExample();

        // First call: cache miss
        service.getUser("user:1");
        // Second call: cache hit
        service.getUser("user:1");
        // Update triggers invalidation
        service.updateUser("user:1", "{\"id\":1,\"name\":\"Alice Updated\"}");
        // Next call: cache miss again, fetches updated data
        service.getUser("user:1");
    }
}

2. Read-Through Cache

The cache itself is responsible for loading data from the origin on a miss. The application only talks to the cache — never directly to the database for reads.

java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ReadThroughCache<K, V> {

    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        return store.computeIfAbsent(key, k -> {
            System.out.println("Cache MISS — loading from origin for: " + k);
            V value = loader.apply(k);
            if (value == null) {
                throw new RuntimeException("Key not found in origin: " + k);
            }
            return value;
        });
    }

    public void invalidate(K key) {
        store.remove(key);
    }

    public static void main(String[] args) {
        // Simulated database
        Map<String, String> db = Map.of(
            "product:1", "Laptop",
            "product:2", "Phone"
        );

        ReadThroughCache<String, String> cache = new ReadThroughCache<>(db::get);

        System.out.println(cache.get("product:1")); // MISS, loads from DB
        System.out.println(cache.get("product:1")); // HIT
        System.out.println(cache.get("product:2")); // MISS, loads from DB
    }
}

3. Write-Through Cache

Every write goes to the cache and the backing store synchronously. The write is only considered complete when both the cache and database have been updated.

Pros: Cache and DB are always consistent. No stale reads after writes.

Cons: Higher write latency (must wait for both). Caches data that may never be read.
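The pattern can be sketched with the same in-memory maps used in the earlier examples; a real implementation would write to Redis and a database, but the synchronous double-write is the essential part:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WriteThroughCache {

    // Simulated cache and database, as in the earlier examples
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();

    public void put(String key, String value) {
        // Write the database first: if it fails, the cache never
        // holds a value the database doesn't have.
        database.put(key, value);
        cache.put(key, value);
        // The call returns only after BOTH layers hold the new value.
    }

    public String get(String key) {
        // Reads come from the cache, which is always current after a write.
        String cached = cache.get(key);
        return cached != null ? cached : database.get(key);
    }

    public static void main(String[] args) {
        WriteThroughCache store = new WriteThroughCache();
        store.put("config:theme", "dark");
        System.out.println("Read after write: " + store.get("config:theme")); // dark
    }
}
```

Writing the database before the cache is a deliberate ordering choice: a failed database write leaves the cache untouched rather than holding a value that was never persisted.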

4. Write-Behind (Write-Back) Cache

Writes go to the cache immediately, and the cache asynchronously flushes changes to the database in the background. This dramatically reduces write latency.

Pros: Very low write latency. Can batch and coalesce writes.

Cons: Risk of data loss if cache fails before flushing. Complex to implement correctly.

java
import java.util.Map;
import java.util.concurrent.*;

public class WriteBehindCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();
    private final BlockingQueue<Map.Entry<String, String>> writeQueue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

    public WriteBehindCache() {
        // Flush every 2 seconds
        flusher.scheduleAtFixedRate(this::flush, 2, 2, TimeUnit.SECONDS);
    }

    public void put(String key, String value) {
        cache.put(key, value);
        writeQueue.offer(Map.entry(key, value));
        System.out.println("Cached " + key + " (write queued)");
    }

    public String get(String key) {
        return cache.getOrDefault(key, database.get(key));
    }

    private void flush() {
        int count = 0;
        Map.Entry<String, String> entry;
        while ((entry = writeQueue.poll()) != null) {
            database.put(entry.getKey(), entry.getValue());
            count++;
        }
        if (count > 0) {
            System.out.println("Flushed " + count + " entries to database");
        }
    }

    public void shutdown() {
        flush(); // Final flush
        flusher.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        WriteBehindCache cache = new WriteBehindCache();

        cache.put("session:abc", "{user:'Alice',lastAccess:'now'}");
        cache.put("session:def", "{user:'Bob',lastAccess:'now'}");

        System.out.println("Read from cache: " + cache.get("session:abc"));

        // Wait for async flush
        Thread.sleep(3000);
        cache.shutdown();
    }
}

5. Refresh-Ahead

The cache proactively refreshes entries before they expire. If an entry is accessed and its TTL is close to expiring (within a configurable threshold), the cache triggers an asynchronous background refresh.
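A minimal sketch of this behavior, assuming a caller-supplied loader function and millisecond TTLs (the RefreshAheadCache name and the refresh-threshold parameter are illustrative, not from any specific library):

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class RefreshAheadCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAt;
        Entry(T value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final Map<K, Boolean> refreshing = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final long ttlMillis;
    private final long refreshThresholdMillis;

    public RefreshAheadCache(Function<K, V> loader, long ttlMillis, long refreshThresholdMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.refreshThresholdMillis = refreshThresholdMillis;
    }

    public V get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> entry = store.get(key);

        // Miss or fully expired: load synchronously (ordinary cache-aside path).
        if (entry == null || entry.expiresAt <= now) {
            V value = loader.apply(key);
            store.put(key, new Entry<>(value, now + ttlMillis));
            return value;
        }

        // Hit, but close to expiry: kick off ONE background refresh so
        // later callers still find a warm entry.
        if (entry.expiresAt - now < refreshThresholdMillis
                && refreshing.putIfAbsent(key, Boolean.TRUE) == null) {
            CompletableFuture.runAsync(() -> {
                try {
                    V fresh = loader.apply(key);
                    store.put(key, new Entry<>(fresh, System.currentTimeMillis() + ttlMillis));
                } finally {
                    refreshing.remove(key);
                }
            });
        }
        return entry.value;
    }

    public static void main(String[] args) {
        RefreshAheadCache<String, String> cache =
                new RefreshAheadCache<>(k -> "value-for-" + k, 10_000, 2_000);
        System.out.println(cache.get("product:1")); // synchronous load
        System.out.println(cache.get("product:1")); // served from cache
    }
}
```

The `refreshing` map plays the same single-flight role as in the stampede example below: only one background refresh runs per key at a time.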

Pattern Comparison

Pattern        | Read Path                          | Write Path                        | Main Risk
---------------|------------------------------------|-----------------------------------|-------------------------------
Cache-Aside    | App checks cache, falls back to DB | App writes DB, invalidates cache  | Stale data until invalidation
Read-Through   | Cache loads from origin on miss    | Handled separately                | First-request miss penalty
Write-Through  | Served from cache                  | Synchronous to cache and DB       | Higher write latency
Write-Behind   | Served from cache                  | To cache, async flush to DB       | Data loss before flush
Refresh-Ahead  | Cache refreshes before expiry      | Handled separately                | Wasted refreshes on cold keys

Eviction Strategies

When a cache reaches capacity, it must decide which entries to remove. Common policies include LRU (least recently used), LFU (least frequently used), and FIFO; the example below implements LRU on top of Java's LinkedHashMap with access ordering enabled.

java
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;

    public LRUCache(int maxSize) {
        // accessOrder=true makes it LRU
        super(maxSize, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean shouldRemove = size() > maxSize;
        if (shouldRemove) {
            System.out.println("Evicting: " + eldest.getKey() + " -> " + eldest.getValue());
        }
        return shouldRemove;
    }

    public static void main(String[] args) {
        LRUCache<String, String> cache = new LRUCache<>(3);

        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");
        System.out.println("Cache: " + cache); // {a=1, b=2, c=3}

        cache.get("a"); // Access 'a' — moves it to most recent

        cache.put("d", "4"); // Evicts 'b' (least recently used)
        System.out.println("Cache after adding d: " + cache); // {c=3, a=1, d=4}

        cache.put("e", "5"); // Evicts 'c'
        System.out.println("Cache after adding e: " + cache); // {a=1, d=4, e=5}
    }
}

Common Pitfalls

Pitfall 1: Cache Stampede (Thundering Herd)

When a popular cache entry expires, hundreds of concurrent requests all experience a cache miss simultaneously and flood the database.

Solution: Locking / Single-Flight

Only one request fetches from the origin; others wait for the result.

java
import java.util.Map;
import java.util.concurrent.*;

public class StampedeProtectedCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, CompletableFuture<String>> inflightRequests = new ConcurrentHashMap<>();
    private final Map<String, String> database = new ConcurrentHashMap<>();

    public StampedeProtectedCache() {
        database.put("hot-item", "{\"name\":\"Popular Widget\",\"price\":29.99}");
    }

    public String get(String key) throws Exception {
        // Check cache
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }

        // Single-flight: only one request loads from DB
        CompletableFuture<String> future = inflightRequests.computeIfAbsent(key, k -> {
            System.out.println(Thread.currentThread().getName() + " is the loader for " + k);
            return CompletableFuture.supplyAsync(() -> {
                try {
                    Thread.sleep(500); // Simulate slow DB query
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                String value = database.get(k);
                if (value != null) {
                    cache.put(k, value);
                }
                inflightRequests.remove(k);
                return value;
            });
        });

        System.out.println(Thread.currentThread().getName() + " waiting for result of " + key);
        return future.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        StampedeProtectedCache cache = new StampedeProtectedCache();
        ExecutorService pool = Executors.newFixedThreadPool(5);

        // Simulate 5 concurrent requests for the same key
        for (int i = 0; i < 5; i++) {
            pool.submit(() -> {
                try {
                    String result = cache.get("hot-item");
                    System.out.println(Thread.currentThread().getName() + " got: " + result);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}

Pitfall 2: Cache Penetration

Requests for keys that don't exist in the database bypass the cache every time, because there's nothing to cache. Attackers can exploit this to overwhelm the database.

Solution: Cache negative results (null markers) with short TTLs, or use a Bloom filter.
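A Bloom filter answers "definitely absent" or "possibly present" in constant space, so requests for keys that cannot exist are rejected before they ever touch the cache or the database. The hand-rolled filter below is a teaching sketch; production systems typically use a library implementation (for example, Guava's BloomFilter):

```java
import java.util.BitSet;

public class SimpleBloomFilter {

    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes
    // (the Kirsch-Mitzenmacher double-hashing trick).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false => key is definitely absent; true => key MAY be present
    // (false positives are possible, false negatives are not).
    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(10_000, 3);
        filter.add("user:1");
        filter.add("user:2");

        // A key that was added is always reported as possibly present.
        System.out.println(filter.mightContain("user:1")); // true
        // An unknown key is usually rejected outright (small false-positive chance).
        System.out.println(filter.mightContain("user:99999"));
    }
}
```

On application startup, the filter is seeded with all valid keys; a lookup that fails the filter can return "not found" immediately without a database round trip.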

Pitfall 3: Cache Avalanche

Many cache entries expire at the same time (e.g., all set with the same TTL during a cold start), causing a massive spike in database load.

Solution: Add random jitter to TTL values.

java
import java.util.concurrent.ThreadLocalRandom;

public class TTLJitter {

    private static final int BASE_TTL_SECONDS = 300; // 5 minutes
    private static final int JITTER_RANGE_SECONDS = 60; // ±60 seconds

    public static int calculateTTL() {
        int jitter = ThreadLocalRandom.current().nextInt(-JITTER_RANGE_SECONDS, JITTER_RANGE_SECONDS + 1);
        return BASE_TTL_SECONDS + jitter;
    }

    public static void main(String[] args) {
        System.out.println("TTL values with jitter:");
        for (int i = 0; i < 10; i++) {
            System.out.println("  Entry " + i + ": " + calculateTTL() + "s");
        }
    }
}

Pitfall 4: Stale Data and Inconsistency

The classic cache invalidation problem. After a database update, the cache may continue to serve old data until the entry expires or is explicitly invalidated.

Solution: Keep TTLs short for mutable data, delete the cache entry on every write (as in the cache-aside example above), or use write-through when strict read-after-write consistency is required.

Pitfall 5: Hot Key Problem

A single key receives disproportionately high traffic, overwhelming the cache node or shard that stores it.

Solution: Replicate the hot key across several logical keys (e.g., by appending a replica suffix so the copies land on different nodes), or serve the hottest keys from a small in-process cache in front of the distributed cache.
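Key replication can be sketched as follows; a single map stands in for a distributed cache here, but in a real cluster the replica keys would hash to different nodes (the class name and replica count are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

public class HotKeyReplicator {

    private static final int REPLICAS = 4;

    // Stands in for a distributed cache; in a real cluster each
    // replica key would hash to a different node.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Write the value under every replica key so reads can be spread out.
    public void put(String key, String value) {
        for (int i = 0; i < REPLICAS; i++) {
            cache.put(key + "#" + i, value);
        }
    }

    // Read from a randomly chosen replica, distributing the load.
    public String get(String key) {
        int replica = ThreadLocalRandom.current().nextInt(REPLICAS);
        return cache.get(key + "#" + replica);
    }

    public static void main(String[] args) {
        HotKeyReplicator cache = new HotKeyReplicator();
        cache.put("flash-sale:item", "{\"stock\":100}");
        System.out.println(cache.get("flash-sale:item"));
    }
}
```

The trade-off is write amplification: every update must touch all replicas, so this technique suits read-heavy hot keys.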

Redis Caching with Jedis

A practical example targeting a Redis-compatible cache (such as AWS ElastiCache) through the Jedis client. Note that ElastiCache is reached with a standard Redis client, not the AWS SDK:

java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class RedisCacheService {

    private final JedisPool pool;
    private static final int DEFAULT_TTL = 300;

    public RedisCacheService(String host, int port) {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxTotal(50);
        config.setMaxIdle(10);
        config.setMinIdle(2);
        config.setTestOnBorrow(true);
        this.pool = new JedisPool(config, host, port);
    }

    public String getOrLoad(String key, DatabaseLoader loader) {
        try (Jedis jedis = pool.getResource()) {
            // Try cache
            String cached = jedis.get(key);
            if (cached != null) {
                if ("__NULL__".equals(cached)) {
                    return null; // Negative cache — prevents cache penetration
                }
                return cached;
            }

            // Load from database
            String value = loader.load(key);

            if (value != null) {
                int ttl = DEFAULT_TTL + (int) (Math.random() * 60); // Jitter
                jedis.setex(key, ttl, value);
            } else {
                // Cache negative result with short TTL
                jedis.setex(key, 30, "__NULL__");
            }

            return value;

        } catch (Exception e) {
            System.err.println("Cache error, falling back to DB: " + e.getMessage());
            // Graceful degradation: skip cache on failure
            return loader.load(key);
        }
    }

    public void invalidate(String key) {
        try (Jedis jedis = pool.getResource()) {
            jedis.del(key);
        }
    }

    @FunctionalInterface
    public interface DatabaseLoader {
        String load(String key);
    }

    public void close() {
        pool.close();
    }
}

Cache Invalidation Strategies

The main approaches, roughly in order of complexity:

  1. TTL-based expiry: Let entries age out naturally. Simple, but staleness is bounded only by the TTL.

  2. Explicit invalidation on write: Delete the cache key whenever the underlying data changes, as in the cache-aside example.

  3. Event-driven invalidation: Propagate change events (e.g., via pub/sub or change data capture) so every cache node can drop stale entries.

  4. Versioned keys: Embed a version in the key (user:1:v42) so updates write a new key and old entries simply expire.

Monitoring and Observability

Key metrics to track for any cache:

Metric                      | Target         | Action If Violated
----------------------------|----------------|------------------------------------
Hit Ratio                   | > 80%          | Check access patterns, increase TTL
Latency (p99)               | < 5 ms         | Scale cache cluster, check network
Eviction Rate               | Low/stable     | Increase cache size
Memory Usage                | < 80% capacity | Add nodes or reduce TTL
Connection Pool Utilization | < 70%          | Increase pool size
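Hit ratio is straightforward to instrument in application code; a minimal thread-safe tracker might look like this (the CacheMetrics name is illustrative):

```java
import java.util.concurrent.atomic.LongAdder;

public class CacheMetrics {

    // LongAdder avoids contention under concurrent increments.
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHit() { hits.increment(); }

    public void recordMiss() { misses.increment(); }

    // Fraction of requests served from cache; 0.0 if nothing recorded yet.
    public double hitRatio() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }

    public static void main(String[] args) {
        CacheMetrics metrics = new CacheMetrics();
        for (int i = 0; i < 8; i++) metrics.recordHit();
        metrics.recordMiss();
        metrics.recordMiss();
        System.out.printf("Hit ratio: %.2f%n", metrics.hitRatio()); // 0.80
    }
}
```

The two counters can be sampled periodically and exported to whatever metrics system is in use, so a declining hit ratio is caught early.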

Best Practices

  1. Start with Cache-Aside: It's the simplest pattern and works for most use cases. Only adopt more complex patterns when you have specific requirements.

  2. Always set TTLs: Never cache data indefinitely. Even long TTLs (hours) protect against unbounded staleness and memory growth.

  3. Add jitter to TTLs: Prevent cache avalanche by randomizing expiration times. A simple baseTTL + random(0, jitterRange) is sufficient.

  4. Implement graceful degradation: When the cache is unavailable, fall back to the database. A cache failure should never become an application failure.

  5. Use negative caching sparingly: Cache "not found" results with very short TTLs (15–30 seconds) to prevent cache penetration without permanently hiding newly created records.

  6. Protect against stampedes: Use locking, single-flight patterns, or early refresh to prevent concurrent cache misses from overwhelming the origin.

  7. Monitor your hit ratio obsessively: A declining hit ratio is the earliest indicator of misconfigured TTLs, insufficient cache size, or changed access patterns.

  8. Prefer invalidation over update: When data changes, delete the cache key rather than trying to update it. This avoids race conditions between concurrent writers.

  9. Size your cache based on working set: Cache the 20% of data that serves 80% of requests. Caching everything is wasteful and counterproductive.

  10. Never cache sensitive data without encryption: Session tokens, PII, and financial data in cache must be encrypted at rest and in transit (TLS to Redis, encrypted ElastiCache).