5 Lessons I’ve Learned From Building Scalable DynamoDB Databases
From small to large-scale databases, here are the most valuable concepts I’ve learned over the past year.
It’s been an incredible year filled with growth, learning, and success.
Much of that success comes from helping and teaching others through my technical writing — after all, we often learn most when we teach others.
To wrap up what has been an amazing 2024, I’ll share with you the 5 biggest lessons I’ve learned from building scalable databases with DynamoDB.
Let’s get right into it.
1. Understanding primary keys is key
If you understand nothing else, grasping this concept will change everything for you.
Primary key design is the single most important element of any DynamoDB data model.
Most databases I’ve worked with that couldn’t scale or had query inefficiencies had this problem in common:
Bad primary key design.
The problem is two-fold:
The partition key needs high cardinality (e.g. a userID or productID) so data is distributed evenly.
The sort key is what unlocks powerful queries.
There are many different and efficient queries you can make to your database by using a good sort key.
Some examples include prefixing the data entity type like “user#<userID>” or even “u#<userID>”.
Other effective strategies include prefixing the sort key with a date-time string so that you can sort multiple query results by date or another prefix you require.
Yet another effective use of the sort key is to use hierarchical data to enable multiple filtering.
For example, you can design a sort key to hold the state, city, and zip code of a store item. With this design pattern, you can efficiently filter stores in your queries like so: sk = “CA#los angeles#103900”
Sort key (and primary key) design is limited only by your imagination, and getting it right enables powerfully efficient queries in DynamoDB.
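Here’s a small sketch of that hierarchical pattern. The key format and helper names are illustrative, not a library API — in a real query you’d pass the prefix to DynamoDB’s begins_with() key condition, which behaves the same way:

```python
# Sketch: composite sort keys for hierarchical filtering (names are illustrative).

def make_store_sk(state: str, city: str, zip_code: str) -> str:
    """Build a hierarchical sort key like 'CA#los angeles#103900'."""
    return f"{state}#{city}#{zip_code}"

def begins_with(sort_key: str, prefix: str) -> bool:
    """Mimics DynamoDB's begins_with() key condition locally."""
    return sort_key.startswith(prefix)

stores = [
    make_store_sk("CA", "los angeles", "103900"),
    make_store_sk("CA", "san diego", "921010"),
    make_store_sk("NY", "new york", "100010"),
]

# Filter by state only, by state + city, or by the full hierarchy:
ca_stores = [sk for sk in stores if begins_with(sk, "CA#")]
la_stores = [sk for sk in stores if begins_with(sk, "CA#los angeles#")]
```

Because each level of the hierarchy is a prefix of the next, one sort key supports three different filters with zero extra indexes.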
2. Design for access patterns first
The second biggest mistake I see people making with DynamoDB is designing their data according to how they envision it without planning out their access patterns first.
No matter how you design your data, it will always need a redesign if you don’t consider your access patterns.
Effective (and efficient) DynamoDB database design can only come from modeling your data based on your data access patterns.
Here’s an example: Say you have a core feature that needs to fetch students who are attending any given course.
Your data model must treat this as a primary access pattern: use the courseID as the partition key and a sort key prefixed with “student#” followed by the student’s ID.
If instead the primary access pattern is to fetch all students an instructor teaches, rather than students enrolled in a course, you should use the instructorID as the partition key, with the same “student#<studentID>” sort key pattern.
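The course example above can be sketched like this (the key format is illustrative; the list comprehension stands in for a DynamoDB Query with a begins_with() condition):

```python
# Sketch of access-pattern-first key design (key formats are illustrative).

def enrollment_item(course_id: str, student_id: str) -> dict:
    """Key an enrollment so 'all students in a course' is a single Query."""
    return {"pk": f"course#{course_id}", "sk": f"student#{student_id}"}

items = [
    enrollment_item("MATH101", "u1"),
    enrollment_item("MATH101", "u2"),
    enrollment_item("BIO201", "u3"),
]

# Equivalent of: Query(pk = "course#MATH101", sk begins_with "student#")
math_students = [i for i in items
                 if i["pk"] == "course#MATH101" and i["sk"].startswith("student#")]
```

Flipping the primary access pattern to “students per instructor” would only change which ID goes into the partition key — the query shape stays identical.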
Access patterns matter more than ever in DynamoDB. Remember this to design effective DynamoDB databases.
I wrote a post on an airline data model here; you’ll find it quite interesting.
3. TTL to optimize storage costs
Storage costs never come up in the short term but return with a vengeance in the long term.
I’ve previously written about a company that lost thousands of dollars because they hadn’t considered TTL in their solutions architecture from the start.
Setting TTLs from the start lets DynamoDB automatically delete stale data from your database, greatly reducing your storage costs in the long run.
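In practice, a TTL is just a Unix epoch timestamp (in seconds) stored on the item; you enable TTL on that attribute in the table settings. A minimal sketch, where the attribute name “expiresAt” and the session key are my own illustrative choices:

```python
import time

# Sketch: computing a DynamoDB TTL attribute. DynamoDB expects a Unix epoch
# timestamp in seconds; the attribute name is whatever you enable TTL on.

def ttl_in_days(days: int, now=None) -> int:
    """Epoch-seconds timestamp after which DynamoDB may delete the item."""
    base = time.time() if now is None else now
    return int(base + days * 24 * 60 * 60)

item = {
    "pk": "session#abc123",        # illustrative key
    "expiresAt": ttl_in_days(30),  # TTL enabled on the 'expiresAt' attribute
}
```

Note that deletion is not instantaneous — DynamoDB typically removes expired items within a few days of expiry, so treat the TTL as a cost optimization, not an access-control mechanism.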
I highly recommend this article to learn about TTL in DynamoDB.
4. Data that is accessed together must be stored together
This relates directly to the single table design. This design pattern states that data that is often fetched together should be stored together.
In DynamoDB, this means you should store multiple data entities in one table. Additionally, data that is fetched together should be stored in the same partition (item collection) in your database.
This enables efficient queries and optimizes DynamoDB request costs.
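As a sketch of what one item collection looks like (entity names and key formats are illustrative), here an order and its line items share a partition key, so a single Query retrieves everything:

```python
# Sketch: one partition (item collection) holding an order plus its line items,
# so data accessed together is fetched together in one request.

table = [
    {"pk": "order#1001", "sk": "meta",    "status": "shipped"},
    {"pk": "order#1001", "sk": "item#01", "productId": "p1", "qty": 2},
    {"pk": "order#1001", "sk": "item#02", "productId": "p2", "qty": 1},
    {"pk": "order#1002", "sk": "meta",    "status": "pending"},
]

# Equivalent of: Query(pk = "order#1001") — one request returns the whole collection.
order_1001 = [row for row in table if row["pk"] == "order#1001"]
```

Fetching the same data from two normalized tables would cost two round trips; here the “meta” and “item#” rows come back together in one.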
I wrote extensively about the single table design, with various examples here.
5. Leverage Write Sharding for scalability
I created a lot of DynamoDB challenges this year, whether in a LinkedIn post or a Medium article.
They all had the same theme — try to find a scalable solution to a data model problem.
For scalability, there are plenty of techniques: ensuring high partition key cardinality, modeling data intelligently with sort keys, using the single table design, and more.
However, one critical technique is to shard the writes of a hot partition (a high-traffic partition) across multiple partitions.
Sharding writes involves breaking up the writes of a partition.
For example, a popular user may post many times a day on a social media platform.
Instead of writing all post items to one busy partition, we can break them up into partitions based on the week or the month. Hence, instead of one giant partition, you have 12, one for each month, alleviating read and write request traffic.
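The monthly sharding idea above can be sketched in a few lines (the key format and user ID are illustrative):

```python
from datetime import datetime, timezone

# Sketch: sharding a hot user's posts into monthly partitions
# (the "user#<id>#<YYYY-MM>" key format is an illustrative choice).

def sharded_pk(user_id: str, posted_at: datetime) -> str:
    """Spread one user's posts across one partition per month."""
    return f"user#{user_id}#{posted_at.strftime('%Y-%m')}"

jan_post = datetime(2024, 1, 15, tzinfo=timezone.utc)
feb_post = datetime(2024, 2, 3, tzinfo=timezone.utc)

pk_jan = sharded_pk("celebrity42", jan_post)  # "user#celebrity42#2024-01"
pk_feb = sharded_pk("celebrity42", feb_post)  # "user#celebrity42#2024-02"
```

Reading one month of posts now targets a single, smaller partition; reading a full year fans out to 12 parallel Queries instead of hammering one giant partition.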
Check out this challenge on designing a scalable voting system database with DynamoDB, using write sharding.
Conclusion
After another year of designing DynamoDB databases, I’ve come up with these 5 effective lessons.
From mastering primary keys to implementing advanced techniques like write sharding, I’ve learned a lot and in the process have been able to help others build better and more efficient DynamoDB databases.
Here’s to 2025, another year of growth, innovation, and solving the needs of DynamoDB users with efficient data modeling strategies.
👋 My name is Uriel Bitton and I hope you learned something in this edition of Excelling With DynamoDB.
📅 If you're looking for help with DynamoDB, let's have a quick chat.
🙌 I hope to see you in next week's edition!