DynamoDB Indexes and High Availability

Introduction

Amazon DynamoDB is a fully managed NoSQL database service designed for single-digit millisecond latency at any scale. Understanding how DynamoDB organizes data through its indexes, and how its distributed architecture keeps that data available, is essential for designing resilient, performant systems. This article covers partition keys, sort keys, Global Secondary Indexes (GSIs), Local Secondary Indexes (LSIs), and the replication and consistency strategies behind DynamoDB's availability guarantees.


Core Concepts

The DynamoDB Data Model

DynamoDB is a key-value and document database. Every item in a table is uniquely identified by its primary key. There are two types of primary keys:

  • Simple Primary Key (Partition Key only): A single attribute that DynamoDB uses to distribute data across partitions. Each partition key value must be unique.
  • Composite Primary Key (Partition Key + Sort Key): Two attributes together uniquely identify an item. Multiple items can share the same partition key, but each must have a unique sort key within that partition.

The partition key determines which physical partition stores the item. The sort key determines the order of items within that partition, enabling efficient range queries.
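As a rough illustration of a composite primary key, the sketch below builds the request parameters for a table and for a range query on its sort key. The table name `Orders` and the attributes `CustomerId` and `OrderDate` are hypothetical, chosen only for this example. The dictionaries follow the shape of the low-level `CreateTable` and `Query` requests, so nothing here contacts AWS.

```python
def build_orders_table_params(table_name="Orders"):
    """Build CreateTable parameters for a table with a composite primary key."""
    return {
        "TableName": table_name,
        # Only attributes that appear in a key schema are declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},   # partition key
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},   # sort key
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


def build_date_range_query(table_name, customer_id, start_date, end_date):
    """Build Query parameters: one partition key value, a range of sort keys."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": (
            "CustomerId = :c AND OrderDate BETWEEN :start AND :end"
        ),
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }


table_params = build_orders_table_params()
query_params = build_date_range_query("Orders", "cust-42", "2024-01-01", "2024-01-31")
# With boto3 installed and credentials configured, these could be sent as:
#   client = boto3.client("dynamodb")
#   client.create_table(**table_params)
#   client.query(**query_params)
```

Storing dates as ISO 8601 strings is an assumption of this sketch; it works because such strings sort lexicographically in chronological order.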

Partitioning Strategy

DynamoDB automatically partitions data across multiple storage nodes using a hash function applied to the partition key. A well-chosen partition key distributes traffic evenly and avoids hot partitions, which cause requests to be throttled.
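The effect of key choice on distribution can be sketched with a toy model. DynamoDB's real hash function is internal and not public, so MD5 is used here purely as a stand-in, and the partition count of 4 is an arbitrary assumption.

```python
import hashlib
from collections import Counter


def partition_for(key, num_partitions):
    """Map a partition key to a partition number (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribution(keys, num_partitions=4):
    """Count how many items land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# High-cardinality key: 10,000 distinct user IDs.
high_cardinality = [f"user-{i}" for i in range(10_000)]
# Low-cardinality key: only two distinct values, heavily skewed.
low_cardinality = ["active" if i % 10 else "inactive" for i in range(10_000)]

spread = distribution(high_cardinality)
skewed = distribution(low_cardinality)

print("high cardinality:", dict(spread))
print("low cardinality: ", dict(skewed))
```

In this model the user IDs spread across all four partitions, while the two-valued key can never occupy more than two of them, however many partitions exist.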

Each partition supports up to:

  • 3,000 Read Capacity Units (RCUs) per second
  • 1,000 Write Capacity Units (WCUs) per second
  • 10 GB of data

When a partition exceeds these limits, DynamoDB automatically splits it.
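The per-partition limits above suggest a back-of-the-envelope estimate of how many partitions a table needs at minimum. The function below is a rough heuristic built from those three numbers, not a description of DynamoDB's actual splitting logic, which is internal to the service.

```python
import math

MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000
MAX_GB_PER_PARTITION = 10


def estimate_min_partitions(rcu, wcu, size_gb):
    """Estimate the minimum partition count implied by throughput and size."""
    by_throughput = math.ceil(
        rcu / MAX_RCU_PER_PARTITION + wcu / MAX_WCU_PER_PARTITION
    )
    by_size = math.ceil(size_gb / MAX_GB_PER_PARTITION)
    return max(1, by_throughput, by_size)


# Throughput-bound table: 6,000 RCU and 2,000 WCU, only 5 GB of data.
print(estimate_min_partitions(6000, 2000, 5))
# Size-bound table: modest throughput, 45 GB of data.
print(estimate_min_partitions(1000, 500, 45))
```

The first table is limited by throughput (6000/3000 + 2000/1000 = 4) and the second by size (45 GB needs 5 partitions of 10 GB).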


Global Secondary Indexes (GSIs)

A Global Secondary Index lets you query data using a partition key and an optional sort key that differ from those of the base table. GSIs are "global" because a query on the index can span all partitions of the base table.

How GSIs Work

When you create a GSI, DynamoDB maintains a separate, fully replicated copy of the projected data. Writes to the base table are asynchronously propagated to all GSIs. This means:

  • GSIs are eventually consistent: a query on the index may briefly miss a write that the base table has already acknowledged
  • GSIs have their own provisioned throughput (separate from the base table)
  • You can project all attributes, only keys, or a custom subset of attributes
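A minimal sketch of how a GSI might be added to an existing table follows. It only builds the parameters for an `UpdateTable` request. The table, index, and attribute names are hypothetical, all key attributes are assumed to be strings, and the throughput values are placeholders.

```python
def build_gsi_create_params(table_name, index_name, partition_key,
                            sort_key=None, include_attributes=None,
                            read_units=5, write_units=5):
    """Build UpdateTable parameters that create one GSI on an existing table."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    # Assumption: every index key attribute is a string ("S").
    attribute_definitions = [
        {"AttributeName": partition_key, "AttributeType": "S"}
    ]
    if sort_key:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key, "AttributeType": "S"}
        )

    if include_attributes:
        projection = {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(include_attributes),
        }
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}

    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "GlobalSecondaryIndexUpdates": [{
            "Create": {
                "IndexName": index_name,
                "KeySchema": key_schema,
                "Projection": projection,
                # The index has its own throughput, separate from the table.
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": read_units,
                    "WriteCapacityUnits": write_units,
                },
            }
        }],
    }


gsi_params = build_gsi_create_params(
    "Orders", "StatusIndex", "OrderStatus",
    sort_key="OrderDate", include_attributes=["OrderTotal"],
)
# With boto3: boto3.client("dynamodb").update_table(**gsi_params)
```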

GSI Key Properties

Property      | Detail
--------------|---------------------------------------------------
Max per table | 20 (default, can request increase)
Consistency   | Eventually consistent reads only
Key schema    | Any attribute as PK; optional different SK
Throughput    | Independent RCU/WCU from base table
Projection    | ALL, KEYS_ONLY, or INCLUDE specific attributes
Backfill      | Created asynchronously; existing items are backfilled

GSI Write Throttling

A critical design concern: if a GSI's write capacity is insufficient, writes to the base table are throttled even when the base table itself has spare capacity. DynamoDB must propagate every change to the GSI, so an index that cannot keep up applies back-pressure to the base table.
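One way to catch this risk early is to compare each index's write capacity with the table's. The helper below is a sketch that inspects a dictionary shaped like a `DescribeTable` response for a provisioned-mode table. The sample table and index names are made up for the example.

```python
def find_underprovisioned_gsis(table_description):
    """Return names of GSIs whose write capacity is below the base table's."""
    table = table_description["Table"]
    base_wcu = table["ProvisionedThroughput"]["WriteCapacityUnits"]
    return [
        gsi["IndexName"]
        for gsi in table.get("GlobalSecondaryIndexes", [])
        if gsi["ProvisionedThroughput"]["WriteCapacityUnits"] < base_wcu
    ]


# Hypothetical DescribeTable-style response, trimmed to the relevant fields.
sample_description = {
    "Table": {
        "TableName": "Orders",
        "ProvisionedThroughput": {"ReadCapacityUnits": 100, "WriteCapacityUnits": 50},
        "GlobalSecondaryIndexes": [
            {"IndexName": "StatusIndex",
             "ProvisionedThroughput": {"ReadCapacityUnits": 50, "WriteCapacityUnits": 50}},
            {"IndexName": "RegionIndex",
             "ProvisionedThroughput": {"ReadCapacityUnits": 50, "WriteCapacityUnits": 10}},
        ],
    }
}

at_risk = find_underprovisioned_gsis(sample_description)
print(at_risk)
```

An index that receives only a fraction of the table's writes (a sparse index, for instance) may safely run with less capacity, so this check flags candidates for review rather than definite problems.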


Local Secondary Indexes (LSIs)

A Local Secondary Index uses the same partition key as the base table but a different sort key. LSIs are "local" because they are co-located with the base table partition.

How LSIs Work

  • LSIs must be defined at table creation time — they cannot be added later
  • They share throughput capacity with the base table
  • They support both strongly consistent and eventually consistent reads
  • They impose a 10 GB limit on each item collection, that is, all items sharing one partition key value (base table data plus all LSI data)
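Because an LSI must be declared when the table is created, its definition travels with the `CreateTable` request. The sketch below builds such a request and a strongly consistent query against the index. The `Orders` table, its attributes, and the index name `TotalIndex` are hypothetical.

```python
def build_table_with_lsi_params():
    """Build CreateTable parameters that declare an LSI alongside the table."""
    return {
        "TableName": "Orders",
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "OrderTotal", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [{
            "IndexName": "TotalIndex",
            # Same partition key as the table, different sort key.
            "KeySchema": [
                {"AttributeName": "CustomerId", "KeyType": "HASH"},
                {"AttributeName": "OrderTotal", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
            # No throughput settings here: an LSI shares the table's capacity.
        }],
        "BillingMode": "PAY_PER_REQUEST",
    }


def build_lsi_query_params(customer_id, min_total):
    """Build a strongly consistent Query against the LSI."""
    return {
        "TableName": "Orders",
        "IndexName": "TotalIndex",
        "ConsistentRead": True,  # allowed on an LSI, not on a GSI
        "KeyConditionExpression": "CustomerId = :c AND OrderTotal >= :min",
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":min": {"N": str(min_total)},
        },
    }


lsi_table_params = build_table_with_lsi_params()
lsi_query_params = build_lsi_query_params("cust-42", 100)
# With boto3: client.create_table(**lsi_table_params); client.query(**lsi_query_params)
```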

GSI vs. LSI Comparison

Feature       | GSI                        | LSI
--------------|----------------------------|------------------------
Partition key | Different from base table  | Same as base table
Sort key      | Optional, different        | Required, different
Throughput    | Independent                | Shared with base table
Consistency   | Eventually consistent only | Strong or eventual
Creation      | Anytime                    | Table creation only
Size limit    | No per-partition limit     | 10 GB per partition key
Max per table | 20                         | 5

Sparse Indexes

A powerful pattern in DynamoDB: if an item does not contain the attribute used as the GSI's key, that item is not included in the index. This creates a naturally filtered, smaller index.

For example, if you have an IsActive attribute that only exists on active users, a GSI on IsActive will only contain active users — making queries against it fast and cheap.
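The filtering behavior can be modeled in a few lines of plain Python. This is a conceptual sketch of which items would appear in the index, not DynamoDB code. The sample users are invented for the example.

```python
def build_sparse_index(items, index_key):
    """Return only the items that carry the index key attribute."""
    return [item for item in items if index_key in item]


users = [
    {"UserId": "u1", "Name": "Ana", "IsActive": "true"},
    {"UserId": "u2", "Name": "Ben"},                      # no IsActive attribute
    {"UserId": "u3", "Name": "Chen", "IsActive": "true"},
    {"UserId": "u4", "Name": "Dee"},                      # no IsActive attribute
]

active_index = build_sparse_index(users, "IsActive")
print([user["UserId"] for user in active_index])
```

The pattern depends on removing the attribute when a user becomes inactive. Setting `IsActive` to a "false" value would keep the item in the index.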


High Availability Architecture

Multi-AZ Replication

DynamoDB automatically replicates data across three Availability Zones (AZs) within a single AWS Region. Every write is synchronously replicated to at least two of the three AZs before being acknowledged to the client. This design provides:

  • Durability: Data survives the loss of an entire AZ
  • Availability: Reads and writes continue even if one AZ fails
  • SLA: 99.999% availability for Global Tables and 99.99% for single-Region tables
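To put the two SLA figures in perspective, the short calculation below converts an availability percentage into the downtime it permits. The 30-day month is an assumption made for simplicity.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted by an availability percentage."""
    total_minutes = days * 24 * 60
    return (1 - sla_percent / 100) * total_minutes


regional = allowed_downtime_minutes(99.99)
global_tables = allowed_downtime_minutes(99.999)

print(f"99.99%  allows about {regional:.2f} minutes per 30 days")
print(f"99.999% allows about {global_tables * 60:.0f} seconds per 30 days")
```

Over 30 days, 99.99% permits about 4.3 minutes of downtime, while 99.999% permits about 26 seconds.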

Consistency Models

DynamoDB offers two read consistency options:

  • Eventually Consistent Reads (default): May return stale data; reads from any replica. Costs 0.5 RCU per 4 KB.
  • Strongly Consistent Reads: Always reads from the leader replica, guaranteeing the most recent write. Costs 1 RCU per 4 KB.
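The cost difference between the two read modes follows directly from the figures above. The sketch below applies them to a single item read, assuming 4 KB means 4,096 bytes and that item size is rounded up to the next 4 KB boundary.

```python
import math

READ_BLOCK_BYTES = 4096  # 4 KB


def read_capacity_units(item_size_bytes, strongly_consistent):
    """RCUs consumed by reading one item of the given size."""
    blocks = max(1, math.ceil(item_size_bytes / READ_BLOCK_BYTES))
    return blocks * (1.0 if strongly_consistent else 0.5)


# A 10 KB item spans three 4 KB blocks.
strong_cost = read_capacity_units(10 * 1024, strongly_consistent=True)
eventual_cost = read_capacity_units(10 * 1024, strongly_consistent=False)
print(strong_cost, eventual_cost)
```

For a 10 KB item this gives 3 RCUs for a strongly consistent read and 1.5 RCUs for an eventually consistent one.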

Global Tables — Multi-Region Replication

DynamoDB Global Tables extend high availability across AWS Regions, providing active-active replication. Every replica table can accept reads and writes. Changes are propagated asynchronously to all other regions, typically within one second.

Conflict Resolution in Global Tables

Global Tables use a last-writer-wins conflict resolution strategy based on timestamps. When concurrent writes to the same item occur in different regions, the write with the latest timestamp prevails. This is simple but requires careful design:

  • Avoid writing the same item from multiple regions simultaneously
  • Use conditional writes or application-level conflict detection where necessary
  • Consider region-affinity routing for write-heavy workloads
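The consequence of last-writer-wins is easiest to see in a toy model. The sketch below is not how Global Tables are implemented. It only illustrates the rule that the write with the latest timestamp survives and the other is discarded without an error. The `Write` class and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Write:
    region: str
    timestamp: float  # seconds since some shared epoch
    item: dict


def resolve_last_writer_wins(writes):
    """Pick the write with the latest timestamp; all others are discarded."""
    return max(writes, key=lambda w: w.timestamp)


# Two regions update the same cart within 200 ms of each other.
concurrent = [
    Write("us-east-1", 1000.100, {"CartId": "c1", "Items": ["book"]}),
    Write("eu-west-1", 1000.300, {"CartId": "c1", "Items": ["lamp"]}),
]

winner = resolve_last_writer_wins(concurrent)
print(winner.region, winner.item)
```

The us-east-1 update is lost entirely: the final cart contains only the lamp, not both items. This is why the bullets above recommend avoiding concurrent multi-region writes to the same item.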

DynamoDB Streams and Change Data Capture

DynamoDB Streams capture a time-ordered sequence of item-level modifications. This is critical for high availability patterns such as:

  • Cross-region replication (underlying mechanism for Global Tables)
  • Event-driven architectures (triggering Lambda on data changes)
  • Materialized views (keeping derived tables or external systems in sync)

Depending on the stream view type configured for the table, each stream record contains one of the following:

  • KEYS_ONLY: Only the key attributes of the modified item
  • NEW_IMAGE: The entire item as it appears after the modification
  • OLD_IMAGE: The entire item as it appeared before the modification
  • NEW_AND_OLD_IMAGES: Both the old and new images
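A stream consumer, such as a Lambda function, receives these records in batches. The sketch below shows one way a handler might summarize a batch, assuming the NEW_AND_OLD_IMAGES view type. The function name and the sample event are made up for illustration. The record fields used (`eventName`, `dynamodb.Keys`, `dynamodb.OldImage`, `dynamodb.NewImage`) follow the stream record format.

```python
def summarize_stream_event(event):
    """Extract the change type, keys, and images from each stream record."""
    summary = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summary.append({
            "event": record["eventName"],    # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],
            "old": change.get("OldImage"),   # absent for INSERT
            "new": change.get("NewImage"),   # absent for REMOVE
        })
    return summary


# Hypothetical batch with one update and one delete.
sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"UserId": {"S": "u1"}},
                "OldImage": {"UserId": {"S": "u1"}, "Plan": {"S": "free"}},
                "NewImage": {"UserId": {"S": "u1"}, "Plan": {"S": "pro"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {
                "Keys": {"UserId": {"S": "u2"}},
                "OldImage": {"UserId": {"S": "u2"}, "Plan": {"S": "free"}},
            },
        },
    ]
}

changes = summarize_stream_event(sample_event)
for change in changes:
    print(change["event"], change["keys"])
```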

Capacity Modes and Availability

DynamoDB offers two capacity modes that impact both cost and availability:

On-Demand Mode

  • Automatically scales to accommodate traffic
  • No capacity planning required
  • Charges per request
  • Ideal for unpredictable workloads
  • Instantly accommodates up to double the table's previous traffic peak; beyond that, requests may be throttled for a few minutes while DynamoDB scales

Provisioned Mode

  • You specify RCUs and WCUs
  • Supports Auto Scaling with target utilization policies
  • More cost-effective for predictable workloads
  • Reserved capacity available for further savings (1-year or 3-year terms)
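Auto Scaling for a provisioned table is configured through Application Auto Scaling rather than DynamoDB itself. The sketch below builds the two requests involved for a table's write capacity. The table name, policy name, capacity bounds, and the 70% target are placeholder assumptions, not recommendations.

```python
def build_write_autoscaling_params(table_name, min_wcu, max_wcu, target_percent):
    """Build the scalable-target and scaling-policy requests for table writes."""
    resource = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    }
    scalable_target = {
        **resource,
        "MinCapacity": min_wcu,
        "MaxCapacity": max_wcu,
    }
    scaling_policy = {
        **resource,
        "PolicyName": f"{table_name}-write-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Capacity is adjusted to keep utilization near this percentage.
            "TargetValue": float(target_percent),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    }
    return scalable_target, scaling_policy


target, policy = build_write_autoscaling_params("Orders", 5, 500, 70)
# With boto3:
#   autoscaling = boto3.client("application-autoscaling")
#   autoscaling.register_scalable_target(**target)
#   autoscaling.put_scaling_policy(**policy)
```

A GSI would need its own pair of requests with the `dynamodb:index:WriteCapacityUnits` dimension and a resource ID of the form `table/<table>/index/<index>`.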

Backup and Recovery

DynamoDB provides multiple backup mechanisms that contribute to its high availability story:

Point-in-Time Recovery (PITR)

  • Continuous backups for the last 35 days
  • Restore to any second within that window
  • Protects against accidental writes or deletes
  • Restores to a new table (not in-place)
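The constraints in this list can be checked before a restore is requested. The helper below is a sketch that validates a restore time against the 35-day window and builds parameters shaped like a `RestoreTableToPointInTime` request. The table names are hypothetical, and the real earliest restorable time also depends on when PITR was enabled, which this check does not know.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)


def build_pitr_restore_params(source_table, target_table, restore_time, now=None):
    """Validate the restore time and build the restore request parameters."""
    now = now or datetime.now(timezone.utc)
    if target_table == source_table:
        raise ValueError("PITR restores to a new table, not in place")
    if restore_time > now:
        raise ValueError("restore time is in the future")
    if now - restore_time > PITR_WINDOW:
        raise ValueError("restore time is outside the 35-day window")
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,
        "RestoreDateTime": restore_time,
    }


now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
just_before_bad_deploy = datetime(2024, 6, 29, 8, 59, 59, tzinfo=timezone.utc)
restore_params = build_pitr_restore_params(
    "Orders", "Orders-restored", just_before_bad_deploy, now=now
)
# With boto3:
#   boto3.client("dynamodb").restore_table_to_point_in_time(**restore_params)
```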

On-Demand Backups

  • Full table backups created at any time
  • No impact on table performance
  • Retained until explicitly deleted
  • Useful for long-term archival and compliance

DAX (DynamoDB Accelerator)

DAX is a fully managed, in-memory cache for DynamoDB that delivers microsecond read latency. DAX sits between your application and DynamoDB, transparently caching reads.

  • Write-through cache: Writes go through DAX to DynamoDB
  • Item cache: Caches individual GetItem and BatchGetItem results
  • Query cache: Caches Query and Scan results
  • Deployed across multiple AZs in a cluster for high availability
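The write-through and item-cache behavior can be illustrated with a toy in-memory model. This is not the DAX client and it omits TTLs, eviction, and the query cache. A plain dictionary stands in for the DynamoDB table, and the class name is invented for the example.

```python
class WriteThroughCache:
    """Toy model of a write-through item cache in front of a table."""

    def __init__(self, table):
        self.table = table        # dict standing in for the DynamoDB table
        self.item_cache = {}
        self.table_reads = 0      # counts reads that reached the table

    def get_item(self, key):
        if key in self.item_cache:
            return self.item_cache[key]       # cache hit: table is not touched
        self.table_reads += 1                 # cache miss: read through
        item = self.table.get(key)
        if item is not None:
            self.item_cache[key] = item
        return item

    def put_item(self, key, item):
        self.table[key] = item                # write goes to the table first
        self.item_cache[key] = item           # then the cache is updated


backing_table = {"u1": {"Name": "Ana"}}
cache = WriteThroughCache(backing_table)

cache.get_item("u1")                          # miss, reads the table
cache.get_item("u1")                          # hit, served from memory
cache.put_item("u1", {"Name": "Ana Maria"})   # updates table and cache
latest = cache.get_item("u1")                 # hit, already reflects the write

print(cache.table_reads, latest)
```

Only the first read reaches the table. The read after the write is served from the cache and already sees the new value.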

Best Practices

  1. Choose high-cardinality partition keys: Select attributes with many distinct values (e.g., UserID, SessionID) to distribute load evenly across partitions and avoid hot spots.

  2. Design GSIs to match your query patterns: Model your indexes around your access patterns, not your data structure. Each GSI should serve a specific query requirement.

  3. Monitor GSI write capacity separately: Since GSI throttling cascades to the base table, always provision GSI write capacity equal to or greater than the base table's write capacity.

  4. Use sparse indexes for filtered access: Leverage DynamoDB's behavior of excluding items without the index attribute to create efficient, naturally filtered indexes.

  5. Enable Point-in-Time Recovery: PITR adds minimal cost but provides critical protection against accidental data corruption or deletion.

  6. Use Global Tables for disaster recovery: Active-active multi-region replication removes the need to promote a standby database during a regional failure and provides the highest availability SLA (99.999%). The application must still route traffic to a healthy Region.

  7. Prefer eventually consistent reads when possible: Eventually consistent reads cost half the RCUs and reduce load on the leader replica, improving overall throughput.

  8. Project only needed attributes in GSIs: Use INCLUDE projections rather than ALL to reduce storage costs and write amplification. Fetching unprojected attributes requires a separate base table read.

  9. Implement exponential backoff for throttled requests: DynamoDB SDKs do this by default, but ensure custom HTTP clients handle ProvisionedThroughputExceededException with retries.

  10. Use write sharding for hot partition keys: For inherently hot keys (e.g., a global counter), append a random suffix to distribute writes across multiple partitions, then aggregate on read.
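Practices 9 and 10 can be sketched together in plain Python. `ThrottledError` is a local stand-in for ProvisionedThroughputExceededException, and the delay values, shard count, and `#` separator are illustrative assumptions. The backoff variant shown uses full jitter, which is one common choice among several.

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Retry a throttled operation with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


def sharded_key(base_key, shard_count=10):
    """Pick a random shard of a hot key for a write."""
    return f"{base_key}#{random.randrange(shard_count)}"


def all_shard_keys(base_key, shard_count=10):
    """List every shard key so a reader can fetch and aggregate them."""
    return [f"{base_key}#{i}" for i in range(shard_count)]


# Demo: an operation that is throttled twice, then succeeds.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("throttled")
    return "ok"


recorded_delays = []
result = call_with_backoff(flaky_write, sleep=recorded_delays.append)
print(result, attempts["count"], len(recorded_delays))

# Demo: writes to a global counter are spread over 10 shard keys.
write_key = sharded_key("page-views")
read_keys = all_shard_keys("page-views")
```

Passing `sleep` as a parameter keeps the demo fast. In real use the default `time.sleep` applies. Reading the counter means fetching every key in `read_keys` and summing the values.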