DynamoDB Indexes and High Availability

Introduction

Amazon DynamoDB is a fully managed NoSQL database service that delivers single-digit millisecond performance at any scale. Understanding how DynamoDB organizes data through its indexing mechanisms—and how it achieves extraordinary availability through its distributed architecture—is essential for designing resilient, performant systems. This article explores partition keys, sort keys, Global Secondary Indexes (GSIs), Local Secondary Indexes (LSIs), and the replication and consistency strategies that make DynamoDB one of the most highly available databases in the cloud.

Core Concepts

The DynamoDB Data Model

DynamoDB is a key-value and document database. Every item in a table is uniquely identified by its primary key. There are two types of primary keys:

Simple Primary Key (Partition Key only): A single attribute that DynamoDB uses to distribute data across partitions. Each partition key value must be unique.
Composite Primary Key (Partition Key + Sort Key): Two attributes together uniquely identify an item. Multiple items can share the same partition key, but each must have a unique sort key within that partition.

The partition key determines which physical partition stores the item. The sort key determines the order of items within that partition, enabling efficient range queries.
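
To make the role of each key concrete, here is a minimal in-memory sketch of a table with a composite primary key. The `TinyTable` class, the `cust#...` key values, and the item contents are hypothetical names chosen for illustration; this is a toy model of the behavior described above, not how DynamoDB stores data internally.

```python
from bisect import bisect_left, bisect_right, insort


class TinyTable:
    """Toy in-memory model of a table with a composite primary key."""

    def __init__(self):
        self._partitions = {}  # partition key -> sorted list of sort keys
        self._items = {}       # (partition key, sort key) -> item

    def put_item(self, pk, sk, item):
        # A new sort key is inserted in order; an existing primary key is overwritten.
        if (pk, sk) not in self._items:
            insort(self._partitions.setdefault(pk, []), sk)
        self._items[(pk, sk)] = item

    def query(self, pk, sk_from, sk_to):
        """Return items for one partition key with sk_from <= sort key <= sk_to."""
        keys = self._partitions.get(pk, [])
        lo, hi = bisect_left(keys, sk_from), bisect_right(keys, sk_to)
        return [self._items[(pk, sk)] for sk in keys[lo:hi]]


table = TinyTable()
table.put_item("cust#1", "2024-01-15", {"total": 40})
table.put_item("cust#1", "2024-03-02", {"total": 75})
table.put_item("cust#1", "2024-07-19", {"total": 12})
table.put_item("cust#2", "2024-02-01", {"total": 99})

# Range query within a single partition: orders for cust#1 in the first half of 2024.
first_half = table.query("cust#1", "2024-01-01", "2024-06-30")
```

Because the sort keys within a partition are kept in order, the range query only touches the matching slice rather than scanning every item.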

Partitioning Strategy

DynamoDB automatically partitions data across multiple storage nodes using a hash function applied to the partition key. A well-chosen partition key distributes traffic evenly, avoiding hot partitions that can throttle performance.

Each partition supports up to:

3,000 Read Capacity Units (RCUs) per second
1,000 Write Capacity Units (WCUs) per second
10 GB of data

When a partition exceeds these limits, DynamoDB automatically splits it.
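
The effect of key cardinality on distribution can be sketched with a simple hash-and-modulo scheme. DynamoDB's internal hash function and partition layout are not public, so `partition_for`, the MD5 hash, and the fixed count of four partitions are all assumptions made for illustration only.

```python
import hashlib
from collections import Counter


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index by hashing (illustrative only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# High-cardinality keys: 10,000 distinct user IDs spread over 4 partitions.
spread = Counter(partition_for(f"user#{i}", 4) for i in range(10_000))

# Low-cardinality key: every write uses the same value, so one partition takes all traffic.
hot = Counter(partition_for("status#active", 4) for _ in range(10_000))
```

With many distinct key values, each partition receives a roughly equal share of the requests. With a single key value, all 10,000 requests land on one partition, which is the hot-partition case the text warns about.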

Global Secondary Indexes (GSIs)

A Global Secondary Index lets you query data using a partition key and optional sort key that differ from those of the base table. GSIs are "global" because a query on the index can reach items from every partition of the base table.

How GSIs Work

When you create a GSI, DynamoDB maintains a separate, fully replicated copy of the projected data. Writes to the base table are asynchronously propagated to all GSIs. This means:

GSIs are eventually consistent — there is a slight replication lag
GSIs have their own provisioned throughput (separate from the base table)
You can project all attributes, only keys, or a custom subset of attributes

GSI Key Properties

Property	Detail
Max per table	20 (default, can request increase)
Consistency	Eventually consistent reads only
Key schema	Any top-level String, Number, or Binary attribute as partition key; optional sort key
Throughput	Independent RCU/WCU from base table
Projection	ALL, KEYS_ONLY, or INCLUDE specific attributes
Backfill	Created asynchronously; existing items are backfilled
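
As a sketch of how these properties map onto a request, the function below builds the keyword arguments for adding a GSI to an existing table. The parameter names follow the DynamoDB `UpdateTable` API; the table name, the `StatusByDate` index name, the attribute names, and the capacity numbers are hypothetical.

```python
def add_gsi_params(table_name: str) -> dict:
    """Build UpdateTable arguments that create a GSI with an INCLUDE projection."""
    return {
        "TableName": table_name,
        # Only attributes used in a key schema need to be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "OrderStatus", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": "StatusByDate",
                    "KeySchema": [
                        {"AttributeName": "OrderStatus", "KeyType": "HASH"},  # GSI partition key
                        {"AttributeName": "OrderDate", "KeyType": "RANGE"},   # GSI sort key
                    ],
                    # Copy only the keys plus one extra attribute into the index.
                    "Projection": {
                        "ProjectionType": "INCLUDE",
                        "NonKeyAttributes": ["OrderTotal"],
                    },
                    # Throughput is provisioned independently of the base table.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 5,
                        "WriteCapacityUnits": 5,
                    },
                }
            }
        ],
    }


gsi_params = add_gsi_params("Orders")
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("dynamodb").update_table(**gsi_params)`; the index then backfills asynchronously. For an on-demand table, the `ProvisionedThroughput` entry would be left out.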

GSI Write Throttling

A critical design concern: if a GSI's write capacity is insufficient, the base table writes will be throttled even if the base table has capacity. This is because DynamoDB must propagate changes to the GSI and will back-pressure the base table.
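
One way to catch this before it causes throttling is to compare each GSI's provisioned write capacity with the base table's. The helper below is a minimal sketch that inspects a dictionary shaped like a `DescribeTable` response; the function name `underprovisioned_gsis` and the sample values are hypothetical, and the check applies only to provisioned-mode tables.

```python
def underprovisioned_gsis(table_description: dict) -> list:
    """Return names of GSIs whose provisioned WCUs are below the base table's."""
    table = table_description["Table"]
    table_wcu = table["ProvisionedThroughput"]["WriteCapacityUnits"]
    return [
        gsi["IndexName"]
        for gsi in table.get("GlobalSecondaryIndexes", [])
        if gsi["ProvisionedThroughput"]["WriteCapacityUnits"] < table_wcu
    ]


# Sample data in the shape of a DescribeTable response (values are made up).
description = {
    "Table": {
        "TableName": "Orders",
        "ProvisionedThroughput": {"ReadCapacityUnits": 100, "WriteCapacityUnits": 50},
        "GlobalSecondaryIndexes": [
            {"IndexName": "StatusByDate",
             "ProvisionedThroughput": {"ReadCapacityUnits": 20, "WriteCapacityUnits": 10}},
            {"IndexName": "ByRegion",
             "ProvisionedThroughput": {"ReadCapacityUnits": 20, "WriteCapacityUnits": 50}},
        ],
    }
}
at_risk = underprovisioned_gsis(description)
```

This is a coarse rule of thumb. The write load a GSI actually sees depends on which items carry its key attributes and on what it projects, so a sparse index may need far less capacity than the base table.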

Local Secondary Indexes (LSIs)

A Local Secondary Index uses the same partition key as the base table but a different sort key. LSIs are "local" because they are co-located with the base table partition.

How LSIs Work

LSIs must be defined at table creation time — they cannot be added later
They share throughput capacity with the base table
They support both strongly consistent and eventually consistent reads
They impose a 10 GB limit on each item collection, that is, all base table items and LSI entries that share one partition key value
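
A strongly consistent read against an LSI can be sketched as a `Query` request with `ConsistentRead` enabled. The parameter names follow the DynamoDB `Query` API; the `Orders` table, the `ByOrderTotal` index, and the attribute names are hypothetical.

```python
def lsi_query_params(customer_id: str, min_total: int) -> dict:
    """Build Query arguments for a strongly consistent read on an LSI."""
    return {
        "TableName": "Orders",
        "IndexName": "ByOrderTotal",
        # Same partition key as the base table, but the LSI's own sort key.
        "KeyConditionExpression": "CustomerId = :c AND OrderTotal >= :min",
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":min": {"N": str(min_total)},  # numbers are sent as strings in the low-level API
        },
        # Allowed on LSIs; the same flag is rejected for a GSI.
        "ConsistentRead": True,
    }


query_params = lsi_query_params("cust#1", 50)
```

With boto3 available, this could be passed as `boto3.client("dynamodb").query(**query_params)`.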

GSI vs. LSI Comparison

Feature	GSI	LSI
Partition key	Different from base table	Same as base table
Sort key	Optional, different	Required, different
Throughput	Independent	Shared with base table
Consistency	Eventually consistent only	Strong or eventual
Creation	Anytime	Table creation only
Size limit	No per-partition limit	10 GB per partition key
Max per table	20	5

Sparse Indexes

A powerful pattern in DynamoDB: if an item does not contain the attribute used as the GSI's key, that item is not included in the index. This creates a naturally filtered, smaller index.

For example, if you have an IsActive attribute that only exists on active users, a GSI on IsActive will only contain active users — making queries against it fast and cheap.
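
The filtering behavior can be sketched in a few lines. The `build_sparse_index` function and the sample users are hypothetical; the code only models the rule that items without the index key attribute are left out of the index.

```python
def build_sparse_index(items, index_key):
    """Keep only items that carry the index key attribute, grouped by its value."""
    index = {}
    for item in items:
        if index_key in item:  # items without the attribute are skipped
            index.setdefault(item[index_key], []).append(item)
    return index


users = [
    {"UserId": "u1", "IsActive": "Y"},
    {"UserId": "u2"},                    # inactive: the attribute is absent
    {"UserId": "u3", "IsActive": "Y"},
]
active_index = build_sparse_index(users, "IsActive")
active_ids = [user["UserId"] for user in active_index["Y"]]
```

Note that the pattern relies on removing the attribute from inactive users rather than setting it to a false value, since any item that has the attribute is indexed.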

High Availability Architecture

Multi-AZ Replication

DynamoDB automatically replicates data across three Availability Zones (AZs) within a single AWS Region. Every write is synchronously replicated to at least two of the three AZs before being acknowledged to the client. This design provides:

Durability: Data survives the loss of an entire AZ
Availability: Reads and writes continue even if one AZ fails
SLA: 99.999% for Global Tables, 99.99% for single-Region tables

Consistency Models

DynamoDB offers two read consistency options:

Eventually Consistent Reads (default): May return stale data because the read can be served by any replica. Costs 0.5 RCU per read of an item up to 4 KB.
Strongly Consistent Reads: Served by the leader replica, so the result reflects every write acknowledged before the read. Costs 1 RCU per read of an item up to 4 KB. Not available on GSIs.
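
The cost difference is simple arithmetic: item size is rounded up to the next 4 KB block, and an eventually consistent read is charged half of a strongly consistent one. The sketch below assumes 4 KB means 4,096 bytes; the function name is hypothetical.

```python
import math


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool) -> float:
    """RCUs consumed by reading one item, rounding its size up to 4 KB blocks."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    return blocks * (1.0 if strongly_consistent else 0.5)


small_eventual = read_capacity_units(1_000, strongly_consistent=False)  # 1 block * 0.5
large_strong = read_capacity_units(6_000, strongly_consistent=True)     # 2 blocks * 1.0
large_eventual = read_capacity_units(6_000, strongly_consistent=False)  # 2 blocks * 0.5
```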

Global Tables — Multi-Region Replication

DynamoDB Global Tables extend high availability across AWS Regions, providing active-active replication. Every replica table can accept reads and writes. Changes are propagated asynchronously to all other regions, typically within one second.

Conflict Resolution in Global Tables

Global Tables use a last-writer-wins conflict resolution strategy based on timestamps. When concurrent writes to the same item occur in different regions, the write with the latest timestamp prevails. This is simple but requires careful design:

Avoid writing the same item from multiple regions simultaneously
Use conditional writes or application-level conflict detection where necessary
Consider region-affinity routing for write-heavy workloads
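
The following sketch shows why last-writer-wins needs this care. It models two Regions that each read a stock level of 10 and decrement it at nearly the same time. The `Write` class, the Region names, and the timestamps are hypothetical, and the resolver is a simplified model of the strategy rather than DynamoDB's internal implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    region: str
    timestamp_ms: int
    value: dict


def resolve_last_writer_wins(writes):
    """Pick the write with the latest timestamp; all others are discarded."""
    return max(writes, key=lambda w: w.timestamp_ms)


# Both Regions read stock = 10, sell one unit, and write back 9.
concurrent = [
    Write("us-east-1", 1_000, {"stock": 9}),
    Write("eu-west-1", 1_005, {"stock": 9}),
]
winner = resolve_last_writer_wins(concurrent)
```

The replicas converge on the `eu-west-1` write with `stock` equal to 9, even though two units were sold. Convergence is guaranteed, but the earlier update is silently lost, which is why routing writes for a given item to one Region is usually the safer design.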

DynamoDB Streams and Change Data Capture

DynamoDB Streams capture a time-ordered sequence of item-level modifications. This is critical for high availability patterns such as:

Cross-region replication (underlying mechanism for Global Tables)
Event-driven architectures (triggering Lambda on data changes)
Materialized views (keeping derived tables or external systems in sync)

Each stream is configured with a view type that determines what its records contain:

KEYS_ONLY: Only the key attributes of the modified item
NEW_IMAGE: The entire item as it appears after the modification
OLD_IMAGE: The entire item as it appeared before the modification
NEW_AND_OLD_IMAGES: Both the old and new images
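
A consumer of a stream with the `NEW_AND_OLD_IMAGES` view type can compare the two images to see what changed. The sketch below processes an event in the shape Lambda delivers for DynamoDB Streams (`Records`, `eventName`, and the `dynamodb` section with `Keys`, `OldImage`, and `NewImage`); the function name and the sample item are hypothetical.

```python
def summarize_stream_event(event: dict) -> list:
    """Turn a stream event into (event name, keys, changed attribute names) tuples."""
    changes = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        old = data.get("OldImage", {})  # absent for INSERT or when the view type omits it
        new = data.get("NewImage", {})  # absent for REMOVE or when the view type omits it
        changed = sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
        changes.append((record["eventName"], data["Keys"], changed))
    return changes


sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"UserId": {"S": "u1"}},
                "OldImage": {"UserId": {"S": "u1"}, "Plan": {"S": "free"}},
                "NewImage": {"UserId": {"S": "u1"}, "Plan": {"S": "pro"}},
            },
        }
    ]
}
summary = summarize_stream_event(sample_event)
```

With a `KEYS_ONLY` stream, both images would be missing and the changed-attribute list would be empty, so the consumer would need to read the item from the table to learn its contents.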

Capacity Modes and Availability

DynamoDB offers two capacity modes that impact both cost and availability:

On-Demand Mode

Automatically scales to accommodate traffic
No capacity planning required
Charges per request
Ideal for unpredictable workloads
Accommodates up to double the previous traffic peak immediately; traffic that grows beyond double the previous peak within 30 minutes may be throttled while the table scales

Provisioned Mode

You specify RCUs and WCUs
Supports Auto Scaling with target utilization policies
More cost-effective for predictable workloads
Reserved capacity available for further savings (1-year or 3-year terms)
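
Auto Scaling for a provisioned table is configured through Application Auto Scaling with a scalable target and a target-tracking policy. The sketch below builds the arguments for both calls. The parameter names follow the `RegisterScalableTarget` and `PutScalingPolicy` APIs; the function name, policy name, capacity bounds, and the 70% target are hypothetical choices.

```python
def write_autoscaling_params(table_name, min_wcu, max_wcu, target_utilization=70.0):
    """Build arguments for scaling a table's write capacity toward a target utilization."""
    common = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    }
    # Bounds within which Auto Scaling may adjust provisioned WCUs.
    target = {**common, "MinCapacity": min_wcu, "MaxCapacity": max_wcu}
    # Policy that keeps consumed/provisioned write capacity near the target percentage.
    policy = {
        **common,
        "PolicyName": f"{table_name}-write-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    }
    return target, policy


scalable_target, scaling_policy = write_autoscaling_params("Orders", 5, 500)
```

With boto3 available, these could be passed to `register_scalable_target(**scalable_target)` and `put_scaling_policy(**scaling_policy)` on an `application-autoscaling` client. Read capacity and each GSI need their own target and policy.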

Backup and Recovery

DynamoDB provides multiple backup mechanisms that contribute to its high availability story:

Point-in-Time Recovery (PITR)

Continuous backups for the last 35 days
Restore to any second within that window
Protects against accidental writes or deletes
Restores to a new table (not in-place)
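
The constraints above can be checked before a restore is requested. The sketch below builds arguments for `RestoreTableToPointInTime` and rejects requests outside the 35-day window or aimed at the source table itself. The function name and table names are hypothetical, and the window check is an approximation: the exact earliest and latest restorable times for a table are reported by `DescribeContinuousBackups`.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)


def pitr_restore_params(source_table, target_table, restore_to, now=None):
    """Build RestoreTableToPointInTime arguments after basic sanity checks."""
    now = now or datetime.now(timezone.utc)
    if not (now - PITR_WINDOW <= restore_to <= now):
        raise ValueError("restore time is outside the 35-day PITR window")
    if target_table == source_table:
        raise ValueError("PITR restores into a new table, not in place")
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,
        "RestoreDateTime": restore_to,
    }


# Fixed timestamps keep the example deterministic.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
restore_params = pitr_restore_params(
    "Orders", "Orders-restored", datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc), now=now
)
```

PITR must already be enabled on the source table, which is done through `UpdateContinuousBackups` with `PointInTimeRecoveryEnabled` set to true.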

On-Demand Backups

Full table backups created at any time
No impact on table performance
Retained until explicitly deleted
Useful for long-term archival and compliance

DAX (DynamoDB Accelerator)

DAX is a fully managed, in-memory cache for DynamoDB that delivers microsecond read latency. DAX sits between your application and DynamoDB, transparently caching reads.

Write-through cache: Writes go through DAX to DynamoDB
Item cache: Caches individual GetItem and BatchGetItem results
Query cache: Caches Query and Scan results
Deployed across multiple AZs in a cluster for high availability
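
The write-through and item-cache behavior can be illustrated with a small in-memory model. The `WriteThroughCache` class is hypothetical and uses a plain dictionary in place of DynamoDB; it leaves out features a real DAX cluster has, such as TTL-based expiry and the separate query cache.

```python
class WriteThroughCache:
    """Toy write-through item cache in front of a dictionary-backed store."""

    def __init__(self, backing_store: dict):
        self._store = backing_store
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._store.get(key)  # cache miss: read from the backing store
        if value is not None:
            self._cache[key] = value
        return value

    def put(self, key, value):
        self._store[key] = value  # the write reaches the backing store first
        self._cache[key] = value  # then the cached copy is updated


store = {"u1": {"Plan": "free"}}
cache = WriteThroughCache(store)
cache.get("u1")                    # miss, loaded from the store
cache.get("u1")                    # hit, served from memory
cache.put("u1", {"Plan": "pro"})   # updates the store and the cache together
latest = cache.get("u1")           # hit, already reflects the write
```

Writes that bypass the cache and go straight to the store would not be visible to readers until the cached entry is replaced, which is the main consistency trade-off of placing a cache in the read path.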

Best Practices

Choose high-cardinality partition keys: Select attributes with many distinct values (e.g., UserID, SessionID) to distribute load evenly across partitions and avoid hot spots.
Design GSIs to match your query patterns: Model your indexes around your access patterns, not your data structure. Each GSI should serve a specific query requirement.
Monitor GSI write capacity separately: Since GSI throttling cascades to the base table, always provision GSI write capacity equal to or greater than the base table's write capacity.
Use sparse indexes for filtered access: Leverage DynamoDB's behavior of excluding items without the index attribute to create efficient, naturally filtered indexes.
Enable Point-in-Time Recovery: PITR adds minimal cost but provides critical protection against accidental data corruption or deletion.
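Use Global Tables for disaster recovery: With active-active multi-region replication, every replica already accepts writes, so no database failover step is needed; the application still has to route traffic away from an impaired Region. Global Tables also carry the highest availability SLA (99.999%).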
Prefer eventually consistent reads when possible: Eventually consistent reads cost half the RCUs and reduce load on the leader replica, improving overall throughput.
Project only needed attributes in GSIs: Use INCLUDE projections rather than ALL to reduce storage costs and write amplification. Fetching unprojected attributes requires a separate base table read.
Implement exponential backoff for throttled requests: DynamoDB SDKs do this by default, but ensure custom HTTP clients handle ProvisionedThroughputExceededException with retries.
Use write sharding for hot partition keys: For inherently hot keys (e.g., a global counter), append a random suffix to distribute writes across multiple partitions, then aggregate on read.
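
The last two practices lend themselves to a short sketch: write sharding for a hot counter, and the delay schedule for exponential backoff with jitter. The shard count of 10, the `page_views` key, and the delay parameters are hypothetical, and a dictionary stands in for the table.

```python
import random

SHARD_COUNT = 10  # hypothetical; size it to the write rate of the hot key


def sharded_key(base_key: str, rng=random) -> str:
    """Pick a random shard so writes spread over SHARD_COUNT partition key values."""
    return f"{base_key}#{rng.randrange(SHARD_COUNT)}"


def all_shard_keys(base_key: str) -> list:
    """Every shard key that must be read to aggregate the counter."""
    return [f"{base_key}#{n}" for n in range(SHARD_COUNT)]


def backoff_delays(max_retries, base_seconds=0.05, cap_seconds=5.0, rng=random):
    """Full-jitter backoff: delay n is drawn uniformly from [0, min(cap, base * 2**n)]."""
    return [
        rng.uniform(0, min(cap_seconds, base_seconds * 2 ** attempt))
        for attempt in range(max_retries)
    ]


# Write side: 1,000 increments scattered across the shards.
counters = {}
for _ in range(1000):
    key = sharded_key("page_views")
    counters[key] = counters.get(key, 0) + 1

# Read side: sum every shard to recover the total.
total = sum(counters.get(key, 0) for key in all_shard_keys("page_views"))
delays = backoff_delays(max_retries=5)
```

Sharding trades cheaper writes for more expensive reads, since every read of the counter now touches all shards. The jitter in the backoff schedule keeps many throttled clients from retrying at the same instant.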

Related Concepts

Eventual Consistency: Deep dive into consistency models and how they apply to distributed databases like DynamoDB.
Serverless and Container Workloads: DynamoDB is the natural persistence layer for serverless architectures with AWS Lambda.
REST HTTP Verbs and Status Codes: Understanding API design patterns that sit in front of DynamoDB-backed services.
High-Performance Streaming Operations: Patterns for processing DynamoDB Streams at scale.
