Caching Strategies with Redis
Introduction
Caching is the practice of storing frequently accessed data in a high-speed data store to reduce latency, lower database load, and improve application throughput. Redis (Remote Dictionary Server) is the most widely adopted in-memory data structure store, functioning as a cache, message broker, and database. Understanding the various caching strategies — and when to apply each — is essential for building scalable, performant distributed systems.
Core Concepts
Why Cache?
Every time an application fetches data from a primary data store (e.g., a relational database), it incurs network latency, query parsing, disk I/O, and serialization overhead. For read-heavy workloads where the same data is requested repeatedly, this is wasteful. A cache sits between your application and the primary store, serving pre-computed or recently-fetched results in microseconds rather than milliseconds.
Redis as a Cache
Redis stores data entirely in memory, providing sub-millisecond read and write latencies. Key Redis features that make it ideal for caching include:
- TTL (Time-To-Live): Automatic key expiration
- Eviction Policies: LRU, LFU, random, and volatile strategies
- Data Structures: Strings, hashes, lists, sets, sorted sets, streams
- Persistence Options: RDB snapshots and AOF logs (optional for caching)
- Clustering: Horizontal partitioning across multiple nodes
- Pub/Sub: Cache invalidation event propagation
The Five Core Caching Strategies
There are five primary caching strategies, each suited to different access patterns:
| Strategy | Read Path | Write Path | Best For |
|---|---|---|---|
| Cache-Aside | App reads cache, misses go to DB | App writes to DB, then invalidates cache | General purpose, read-heavy |
| Read-Through | Cache reads from DB on miss | Same as Cache-Aside | Simplified app logic |
| Write-Through | Same as Cache-Aside | App writes to cache AND DB synchronously | Strong consistency needs |
| Write-Behind | Same as Cache-Aside | App writes to cache, cache async writes to DB | Write-heavy workloads |
| Write-Around | Same as Cache-Aside | App writes directly to DB, skips cache | Infrequently re-read writes |
Cache-Aside (Lazy Loading)
Cache-Aside is the most common caching pattern. The application is responsible for managing the cache: it checks the cache first, and on a miss, loads data from the database and populates the cache.
Java Implementation
```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.sql.*;
import java.time.Duration;

public class CacheAsideExample {
    private final JedisPool jedisPool;
    private final Connection dbConnection;
    private final ObjectMapper objectMapper = new ObjectMapper();
    private static final int CACHE_TTL_SECONDS = 3600; // 1 hour

    public CacheAsideExample(String redisHost, int redisPort, String dbUrl) throws SQLException {
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        poolConfig.setMaxTotal(50);
        poolConfig.setMaxIdle(10);
        poolConfig.setMinIdle(5);
        poolConfig.setMaxWait(Duration.ofMillis(2000));
        this.jedisPool = new JedisPool(poolConfig, redisHost, redisPort);
        this.dbConnection = DriverManager.getConnection(dbUrl);
    }

    // --- Cache-Aside READ ---
    public User getUser(String userId) {
        String cacheKey = "user:" + userId;
        // Step 1: Check cache
        try (Jedis jedis = jedisPool.getResource()) {
            String cached = jedis.get(cacheKey);
            if (cached != null) {
                System.out.println("Cache HIT for " + cacheKey);
                return objectMapper.readValue(cached, User.class);
            }
        } catch (Exception e) {
            System.err.println("Cache read failed, falling back to DB: " + e.getMessage());
        }
        // Step 2: Cache miss — read from database
        System.out.println("Cache MISS for " + cacheKey);
        User user = fetchUserFromDb(userId);
        // Step 3: Populate cache
        if (user != null) {
            try (Jedis jedis = jedisPool.getResource()) {
                String json = objectMapper.writeValueAsString(user);
                jedis.setex(cacheKey, CACHE_TTL_SECONDS, json);
            } catch (Exception e) {
                System.err.println("Cache write failed: " + e.getMessage());
            }
        }
        return user;
    }

    // --- Cache-Aside WRITE (invalidation) ---
    public void updateUser(User user) throws SQLException {
        String cacheKey = "user:" + user.getId();
        // Step 1: Write to database first
        try (PreparedStatement stmt = dbConnection.prepareStatement(
                "UPDATE users SET name=?, email=? WHERE id=?")) {
            stmt.setString(1, user.getName());
            stmt.setString(2, user.getEmail());
            stmt.setString(3, user.getId());
            stmt.executeUpdate();
        }
        // Step 2: Invalidate cache (delete, not update)
        try (Jedis jedis = jedisPool.getResource()) {
            jedis.del(cacheKey);
            System.out.println("Invalidated cache for " + cacheKey);
        } catch (Exception e) {
            System.err.println("Cache invalidation failed: " + e.getMessage());
        }
    }

    private User fetchUserFromDb(String userId) {
        try (PreparedStatement stmt = dbConnection.prepareStatement(
                "SELECT id, name, email FROM users WHERE id = ?")) {
            stmt.setString(1, userId);
            ResultSet rs = stmt.executeQuery();
            if (rs.next()) {
                return new User(rs.getString("id"), rs.getString("name"), rs.getString("email"));
            }
        } catch (SQLException e) {
            System.err.println("Database read failed: " + e.getMessage());
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        CacheAsideExample cache = new CacheAsideExample(
                "localhost", 6379, "jdbc:postgresql://localhost:5432/mydb");
        // First call: cache miss, loads from DB
        User user = cache.getUser("123");
        System.out.println("User: " + user);
        // Second call: cache hit
        User cachedUser = cache.getUser("123");
        System.out.println("Cached: " + cachedUser);
        // Update: invalidates cache
        user.setEmail("newemail@example.com");
        cache.updateUser(user);
        // Next read: cache miss again, fresh from DB
        User freshUser = cache.getUser("123");
        System.out.println("Fresh: " + freshUser);
    }
}

class User {
    private String id;
    private String name;
    private String email;

    public User() {}

    public User(String id, String name, String email) {
        this.id = id;
        this.name = name;
        this.email = email;
    }

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    @Override
    public String toString() {
        return "User{id='" + id + "', name='" + name + "', email='" + email + "'}";
    }
}
```
Write-Through Strategy
With Write-Through, every write goes to both the cache and the database synchronously. The application writes to the cache, and the cache layer (or the application itself) immediately persists the data to the primary store.
Java Implementation
```java
// Uses the same imports as CacheAsideExample (Jedis, Jackson, java.sql.*).
public class WriteThroughCache {
    private final JedisPool jedisPool;
    private final Connection dbConnection;
    private final ObjectMapper objectMapper = new ObjectMapper();
    private static final int CACHE_TTL_SECONDS = 3600;

    public WriteThroughCache(JedisPool jedisPool, Connection dbConnection) {
        this.jedisPool = jedisPool;
        this.dbConnection = dbConnection;
    }

    public void saveUser(User user) throws Exception {
        String cacheKey = "user:" + user.getId();
        String json = objectMapper.writeValueAsString(user);
        // Step 1: Write to cache
        try (Jedis jedis = jedisPool.getResource()) {
            jedis.setex(cacheKey, CACHE_TTL_SECONDS, json);
        }
        // Step 2: Write to database (synchronous)
        try (PreparedStatement stmt = dbConnection.prepareStatement(
                "INSERT INTO users (id, name, email) VALUES (?, ?, ?) " +
                "ON CONFLICT (id) DO UPDATE SET name=?, email=?")) {
            stmt.setString(1, user.getId());
            stmt.setString(2, user.getName());
            stmt.setString(3, user.getEmail());
            stmt.setString(4, user.getName());
            stmt.setString(5, user.getEmail());
            stmt.executeUpdate();
        } catch (SQLException e) {
            // Roll back the cache entry on DB failure
            try (Jedis jedis = jedisPool.getResource()) {
                jedis.del(cacheKey);
            }
            throw new RuntimeException("Write-through failed, cache rolled back", e);
        }
    }

    public User getUser(String userId) throws Exception {
        String cacheKey = "user:" + userId;
        try (Jedis jedis = jedisPool.getResource()) {
            String cached = jedis.get(cacheKey);
            if (cached != null) {
                return objectMapper.readValue(cached, User.class);
            }
        }
        // Fallback to DB on miss (e.g., after TTL expiry) —
        // same as the Cache-Aside read path
        return null;
    }
}
```
Write-Behind (Write-Back) Strategy
Write-Behind decouples the database write from the application write. The application writes only to the cache, and a background process asynchronously flushes changes to the database. This dramatically improves write latency but introduces the risk of data loss if Redis fails before the flush.
Java Implementation with Redis Streams
```java
import redis.clients.jedis.*;
import redis.clients.jedis.params.XReadGroupParams;
import redis.clients.jedis.resps.StreamEntry;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.*;
import java.util.concurrent.*;

public class WriteBehindCache {
    private final JedisPool jedisPool;
    private final Connection dbConnection;
    private final ObjectMapper objectMapper = new ObjectMapper();
    private static final String WRITE_STREAM = "cache:write-behind";
    private static final String CONSUMER_GROUP = "db-writers";
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);

    public WriteBehindCache(JedisPool jedisPool, Connection dbConnection) {
        this.jedisPool = jedisPool;
        this.dbConnection = dbConnection;
        initializeStream();
        startBackgroundWriter();
    }

    private void initializeStream() {
        try (Jedis jedis = jedisPool.getResource()) {
            try {
                // makeStream=true creates the stream if it doesn't exist yet
                jedis.xgroupCreate(WRITE_STREAM, CONSUMER_GROUP, new StreamEntryID("0"), true);
            } catch (Exception e) {
                // Group already exists
            }
        }
    }

    // Fast write — only touches Redis
    public void saveUser(User user) throws Exception {
        String cacheKey = "user:" + user.getId();
        String json = objectMapper.writeValueAsString(user);
        try (Jedis jedis = jedisPool.getResource()) {
            // Write to cache immediately
            jedis.set(cacheKey, json);
            // Enqueue for async DB write
            Map<String, String> event = new HashMap<>();
            event.put("key", cacheKey);
            event.put("data", json);
            event.put("operation", "UPSERT");
            event.put("timestamp", String.valueOf(System.currentTimeMillis()));
            jedis.xadd(WRITE_STREAM, StreamEntryID.NEW_ENTRY, event);
        }
    }

    // Background worker flushes to DB in batches
    private void startBackgroundWriter() {
        executor.scheduleWithFixedDelay(() -> {
            try (Jedis jedis = jedisPool.getResource()) {
                var entries = jedis.xreadGroup(
                        CONSUMER_GROUP, "worker-1",
                        XReadGroupParams.xReadGroupParams().count(100).block(1000),
                        Map.of(WRITE_STREAM, StreamEntryID.UNRECEIVED_ENTRY) // ">" — undelivered entries
                );
                if (entries != null) {
                    for (var streamEntries : entries) {
                        for (StreamEntry entry : streamEntries.getValue()) {
                            processWriteEvent(entry);
                            jedis.xack(WRITE_STREAM, CONSUMER_GROUP, entry.getID());
                        }
                    }
                }
            } catch (Exception e) {
                System.err.println("Write-behind flush error: " + e.getMessage());
            }
        }, 0, 500, TimeUnit.MILLISECONDS);
    }

    private void processWriteEvent(StreamEntry entry) {
        String data = entry.getFields().get("data");
        try {
            User user = objectMapper.readValue(data, User.class);
            try (PreparedStatement stmt = dbConnection.prepareStatement(
                    "INSERT INTO users (id, name, email) VALUES (?, ?, ?) " +
                    "ON CONFLICT (id) DO UPDATE SET name=?, email=?")) {
                stmt.setString(1, user.getId());
                stmt.setString(2, user.getName());
                stmt.setString(3, user.getEmail());
                stmt.setString(4, user.getName());
                stmt.setString(5, user.getEmail());
                stmt.executeUpdate();
            }
        } catch (Exception e) {
            System.err.println("Failed to flush to DB: " + e.getMessage());
            // In production: dead-letter queue or retry logic
        }
    }
}
```
Write-Around Strategy
Write-Around bypasses the cache entirely on writes. Data is written directly to the database, and the cache is only populated on subsequent reads (via Cache-Aside). This avoids polluting the cache with data that may never be read.
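To make the pattern concrete, here is a minimal, self-contained sketch. Plain `HashMap`s stand in for Redis and the database so it runs without live services; with Jedis, the cache operations would be `get`, `setex`, and `del`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class WriteAroundCache {
    private final Map<String, String> cache = new HashMap<>();    // stand-in for Redis
    private final Map<String, String> database = new HashMap<>(); // stand-in for the DB

    // WRITE: goes straight to the database; the cache is never populated here.
    // Any previously cached copy is deleted so readers cannot see stale data.
    public void write(String key, String value) {
        database.put(key, value);
        cache.remove(key); // invalidate, don't update
    }

    // READ: plain Cache-Aside — check the cache, fall back to the DB, then populate.
    public Optional<String> read(String key) {
        String cached = cache.get(key);
        if (cached != null) return Optional.of(cached);
        String fromDb = database.get(key);
        if (fromDb != null) cache.put(key, fromDb); // populated only on read
        return Optional.ofNullable(fromDb);
    }

    public boolean isCached(String key) { return cache.containsKey(key); }
}
```

The key point is that `write` never touches the cache — only `read` does — so a burst of writes to rarely read rows costs no cache memory.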
Cache Invalidation Strategies
Cache invalidation is famously one of the "two hard things" in computer science. Redis offers several mechanisms: TTL-based expiry, explicit deletion on write (as in Cache-Aside), keyspace notifications, and Pub/Sub broadcast across service instances.
Pub/Sub Cache Invalidation Across Services
```java
// Uses the same Jedis imports as the earlier examples.
public class CacheInvalidationService {
    private static final String INVALIDATION_CHANNEL = "cache:invalidation";
    private final JedisPool jedisPool;

    public CacheInvalidationService(JedisPool jedisPool) {
        this.jedisPool = jedisPool;
    }

    // Publisher: called by any service that modifies data
    public void publishInvalidation(String cacheKey) {
        try (Jedis jedis = jedisPool.getResource()) {
            jedis.publish(INVALIDATION_CHANNEL, cacheKey);
            System.out.println("Published invalidation for: " + cacheKey);
        }
    }

    // Subscriber: runs in each application instance. In a real deployment each
    // instance would evict its own in-process (near) cache here; deleting the
    // shared Redis key is shown for illustration.
    public void startInvalidationListener() {
        new Thread(() -> {
            try (Jedis jedis = jedisPool.getResource()) {
                jedis.subscribe(new JedisPubSub() { // subscribe() blocks this thread
                    @Override
                    public void onMessage(String channel, String cacheKey) {
                        System.out.println("Invalidating cache entry: " + cacheKey);
                        try (Jedis j = jedisPool.getResource()) {
                            j.del(cacheKey);
                        }
                    }
                }, INVALIDATION_CHANNEL);
            }
        }, "cache-invalidation-listener").start();
    }

    public static void main(String[] args) throws InterruptedException {
        JedisPool pool = new JedisPool("localhost", 6379);
        CacheInvalidationService service = new CacheInvalidationService(pool);
        // Start listener in background
        service.startInvalidationListener();
        Thread.sleep(500); // give the subscriber time to register before publishing
        // Simulate invalidation events
        service.publishInvalidation("user:123");
        service.publishInvalidation("product:456");
    }
}
```
Redis Eviction Policies
When Redis reaches its configured `maxmemory` limit, it must evict keys. The eviction policy determines which keys are removed.
For pure caching use cases, `allkeys-lru` or `allkeys-lfu` are the most common choices. LFU works better when you have a mix of frequently and infrequently accessed keys.
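The policy is set via `maxmemory-policy` in `redis.conf` (or at runtime with `CONFIG SET`). A typical cache-only configuration — the values below are illustrative — might look like:

```
# redis.conf — dedicated cache instance
maxmemory 2gb
maxmemory-policy allkeys-lfu
# alternatives: allkeys-lru, volatile-lru, volatile-lfu,
#               allkeys-random, volatile-random, volatile-ttl, noeviction

# Persistence is optional for a pure cache:
save ""          # disable RDB snapshots
appendonly no    # disable the AOF log
```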
Architecture: Multi-Tier Caching
Production systems often use multiple layers of caching to maximize hit rates and minimize latency.
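A minimal sketch of that lookup path, assuming two tiers: an in-process map (L1 — in production typically a library like Caffeine) in front of a shared cache (L2 — Redis). Both tiers are plain `ConcurrentHashMap`s here so the example runs standalone.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class TwoTierCache {
    private final Map<String, String> l1 = new ConcurrentHashMap<>(); // per-instance, no network
    private final Map<String, String> l2 = new ConcurrentHashMap<>(); // shared tier (Redis in production)

    public String get(String key, Supplier<String> dbLoader) {
        String v = l1.get(key);              // 1) in-process hit — nanoseconds, no network
        if (v != null) return v;
        v = l2.get(key);                     // 2) shared-cache hit — one network round trip
        if (v == null) {
            v = dbLoader.get();              // 3) miss in both tiers — load from the DB
            if (v != null) l2.put(key, v);   //    populate L2 for other instances
        }
        if (v != null) l1.put(key, v);       //    always warm L1 for this instance
        return v;
    }
}
```

On a repeat request the L1 hit avoids even the Redis round trip; the trade-off is that each instance's L1 can serve slightly stale data until it is evicted or invalidated (e.g., via the Pub/Sub pattern shown earlier).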
Handling Cache Stampede (Thundering Herd)
When a popular cache key expires, many threads may simultaneously attempt to rebuild it, overwhelming the database. This is called a cache stampede.
Distributed Lock for Stampede Prevention
```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.params.SetParams;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StampedeProtectedCache {
    private final JedisPool jedisPool;
    private static final int LOCK_TIMEOUT_MS = 5000;
    private static final int RETRY_DELAY_MS = 50;

    public StampedeProtectedCache(JedisPool jedisPool) {
        this.jedisPool = jedisPool;
    }

    public String getWithStampedeProtection(String key, int ttlSeconds,
            java.util.function.Supplier<String> dbLoader) {
        try (Jedis jedis = jedisPool.getResource()) {
            // Try cache first
            String value = jedis.get(key);
            if (value != null) {
                return value;
            }
            // Cache miss — try to acquire lock
            String lockKey = "lock:" + key;
            String lockValue = UUID.randomUUID().toString();
            String lockResult = jedis.set(lockKey, lockValue,
                    new SetParams().nx().px(LOCK_TIMEOUT_MS));
            if ("OK".equals(lockResult)) {
                try {
                    // Won the lock — load from DB
                    String dbValue = dbLoader.get();
                    if (dbValue != null) {
                        jedis.setex(key, ttlSeconds, dbValue);
                    }
                    return dbValue;
                } finally {
                    // Release lock (only if we still own it)
                    String script = "if redis.call('get',KEYS[1])==ARGV[1] then " +
                            "return redis.call('del',KEYS[1]) else return 0 end";
                    jedis.eval(script, List.of(lockKey), List.of(lockValue));
                }
            } else {
                // Lost the race — wait for the winner to populate the cache
                for (int i = 0; i < LOCK_TIMEOUT_MS / RETRY_DELAY_MS; i++) {
                    Thread.sleep(RETRY_DELAY_MS);
                    value = jedis.get(key);
                    if (value != null) {
                        return value;
                    }
                }
                // Fallback: load from DB directly
                return dbLoader.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return dbLoader.get();
        }
    }

    public static void main(String[] args) {
        JedisPool pool = new JedisPool("localhost", 6379);
        StampedeProtectedCache cache = new StampedeProtectedCache(pool);
        // Simulate concurrent access
        ExecutorService executor = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10; i++) {
            executor.submit(() -> {
                String result = cache.getWithStampedeProtection(
                        "popular:key", 300,
                        () -> {
                            System.out.println(Thread.currentThread().getName() + " loading from DB");
                            try { Thread.sleep(200); } catch (InterruptedException e) {}
                            return "{\"data\": \"expensive result\"}";
                        }
                );
                System.out.println(Thread.currentThread().getName() + " got: " + result);
            });
        }
        executor.shutdown();
    }
}
```
Strategy Decision Matrix

| If you need… | Use |
|---|---|
| A general-purpose default for read-heavy workloads | Cache-Aside |
| The cache layer to own DB loading on misses | Read-Through |
| Cache and database to stay consistent on every write | Write-Through |
| Maximum write throughput, tolerating a small loss window | Write-Behind |
| To avoid caching data that is rarely re-read | Write-Around |
Best Practices
- Always set TTLs: Every cache key should have an expiration time. Without TTLs, stale data accumulates and memory fills, eventually requiring manual intervention or causing outages.
- Prefer deletion over update on writes: In Cache-Aside, delete the key after a database write rather than updating it. This avoids race conditions where a stale read overwrites a newer cache entry.
- Use connection pooling: Create a `JedisPool` (or Lettuce connection pool) and reuse connections. Creating a new Redis connection per request adds 1–3 ms of overhead and risks exhausting file descriptors.
- Handle cache failures gracefully: Redis being unavailable should not bring down your application. Always fall back to the database when the cache is unreachable — treat the cache as a performance enhancement, not a required dependency.
- Use consistent key naming conventions: Adopt a pattern like `entity:id:subfield` (e.g., `user:123:profile`). This makes debugging, monitoring, and bulk invalidation straightforward.
- Monitor cache hit ratios: Track hit rate, miss rate, eviction count, and memory usage. A hit ratio below 80% usually signals that your TTLs are too short or your access patterns don't benefit from caching.
- Protect against cache stampede: Use distributed locks or probabilistic early expiration for hot keys. A single popular key expiring can generate thousands of concurrent database queries.
- Serialize efficiently: Use compact formats like MessagePack or Protocol Buffers instead of JSON for high-throughput caches. Serialization overhead becomes significant at scale.
- Separate cache and persistent Redis: Don't use the same Redis instance for caching and durable data (e.g., sessions, rate limits). Caches should use `allkeys-lru`-style eviction, which would destroy persistent data.
- Use Redis Cluster for horizontal scaling: When a single Redis instance's memory or throughput is insufficient, shard data across a Redis Cluster. Ensure your key design supports hash-slot distribution.
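As a lock-free complement to the distributed lock shown earlier, probabilistic early expiration (the "XFetch" algorithm from Vattani, Chierichetti, and Lowenstein's work on cache stampede prevention) has each reader independently decide to refresh slightly before the TTL expires, so rebuilds are staggered instead of synchronized. Below is a sketch of just the refresh decision; the parameter names are illustrative.

```java
import java.util.Random;

public class EarlyExpiry {
    // Returns true if this reader should refresh the cached value now.
    // deltaMs = observed cost of recomputing the value; beta ≈ 1.0 tunes eagerness.
    static boolean shouldRefresh(long nowMs, long expiryMs, long deltaMs,
                                 double beta, Random rng) {
        // -ln(u) for u in (0,1) is a positive random factor: usually small,
        // occasionally large — so refreshes are spread out across readers.
        double gap = deltaMs * beta * -Math.log(rng.nextDouble());
        return nowMs + gap >= expiryMs;
    }
}
```

In practice you store `deltaMs` alongside the cached value so readers can evaluate the formula on each hit; `beta` greater than 1 makes refreshes happen earlier and more aggressively.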
Related Concepts
- Eventual Consistency: Caching inherently introduces eventual consistency; understanding the tradeoffs is critical.
- Serverless and Container Workloads: Deploying Redis as ElastiCache alongside serverless or containerized applications.
- High-Performance Streaming Operations: Redis Streams for event-driven write-behind patterns.
- Asynchronous Programming: Write-Behind strategies depend heavily on async processing patterns.