How I Efficiently Handle Large Item Sizes In DynamoDB

The problems with large item sizes and some strategies for handling them efficiently.

Oftentimes you will run into scenarios with your DynamoDB data that force you to have large item sizes.

Some attributes may contain arrays or objects, and others may hold large strings.

We know DynamoDB limits a single item’s size to 400KB, and a single Query or Scan call returns at most 1MB of data.

That 400KB limit is hard to reach, but large items will cost you in capacity and latency long before you hit it.

Now, when it comes to large string values, the easy solution is to write the string contents to a text file and upload it to Amazon S3. You can then store the object’s URL in your item and use it to reference the content from your database.
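Here is a minimal sketch of that idea. It assumes `s3_client` is a boto3 S3 client (anything with a compatible `put_object` method works), and the function name, the `large-text/` key prefix, and the `bodyLocation` / `bodySizeBytes` attribute names are all hypothetical choices for illustration. It stores an `s3://` URI as the reference; an HTTPS URL would work just as well.

```python
import hashlib


def offload_text_to_s3(s3_client, bucket: str, item_id: str, text: str) -> dict:
    """Upload `text` to S3 and return a small pointer to store in the DynamoDB item."""
    body = text.encode("utf-8")
    # Hash-based key (an assumption): identical text always maps to the same object.
    key = f"large-text/{item_id}/{hashlib.sha256(body).hexdigest()}.txt"
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    # Only this small pointer goes into DynamoDB, not the text itself.
    return {"bodyLocation": f"s3://{bucket}/{key}", "bodySizeBytes": len(body)}
```

The returned dictionary can be merged into the item you write, so the item stays small no matter how long the text gets.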

But what can you do when your data is structured as an array or an object?

As this is a common, recurring issue, I want to discuss some strategies I’ve used to solve it on past projects.

1. Denormalizing Data as Individual Items

The first strategy is the most effective one. It’s about denormalizing or breaking up your arrays or objects into their own items.

Let’s take a look at a student database example.

Imagine we store student items in this table. We might be tempted to also store the courses that student is enrolled in as an array attribute (same with grades).

However, this is where our items can start growing in size. With this design, if we need to store a few years of enrolled courses for each student, the item can grow too large.

It would also be impractical to query these items by course or grade if we had an access pattern for that, since values nested inside an array cannot be used as keys.

Instead, we can use a single table design and denormalize the courses array so that each course enrollment becomes its own item in the same table as the student item.
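To make the layout concrete, here is a rough sketch of the split. The key names `pk` and `sk`, the `STUDENT#` / `COURSE#` / `PROFILE` values, and the sample attributes are assumptions for illustration, not a required schema.

```python
def denormalize_student(student: dict) -> list[dict]:
    """Split one large student record into a profile item plus one item per course."""
    pk = f"STUDENT#{student['studentId']}"
    # The profile item keeps only the small attributes that do not grow over time.
    profile = {k: v for k, v in student.items() if k != "courses"}
    items = [{"pk": pk, "sk": "PROFILE", **profile}]
    for course in student.get("courses", []):
        # Each enrollment becomes its own item under the same partition key.
        items.append({"pk": pk, "sk": f"COURSE#{course['courseId']}", **course})
    return items


student = {
    "studentId": "123",
    "name": "Ada",
    "courses": [
        {"courseId": "MATH101", "grade": "A"},
        {"courseId": "CS201", "grade": "B+"},
    ],
}
items = denormalize_student(student)  # one profile item and two course items
```

Each returned dictionary would be written as a separate item, so adding a course later means writing one new small item rather than rewriting a growing array.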

We can easily query for a student’s courses, grades, etc.
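As a sketch of what such a query could look like, the helper below builds the parameters for a Query call. It assumes course items share the student's partition key `pk` and have a sort key `sk` that starts with `COURSE#`; those names, the table name, and the helper itself are hypothetical.

```python
def student_courses_query(table_name: str, student_id: str) -> dict:
    """Build keyword arguments for a Query that returns one student's course items."""
    return {
        "TableName": table_name,
        # Match the student's partition and only sort keys with the course prefix.
        "KeyConditionExpression": "pk = :pk AND begins_with(sk, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"STUDENT#{student_id}"},
            ":prefix": {"S": "COURSE#"},
        },
    }


params = student_courses_query("students", "123")
```

With boto3 installed and credentials configured, these parameters could be passed along as `boto3.client("dynamodb").query(**params)`.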

Now our original student item is much smaller in size and will not grow with time.

2. Using Projection Expressions

Projection Expressions allow you to specify only the attributes you want to return in your query.

This lets you greatly reduce the size of the response you get back.

In cases where you have no choice but to store large items, projection expressions offer a more efficient way to read your data.

Most of the time, you don’t need all of the attributes inside an item. With projection expressions you can specify only the attributes your UI needs to display and nothing more.

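This can greatly reduce the amount of data transferred at scale. Keep in mind that read capacity is still calculated from the full size of each item read, not from the projected attributes, so the savings are in response size and latency rather than in capacity units.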
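Here is a small sketch of building the projection parameters. The helper name and the `#a0`-style placeholders are my own choices; aliasing every attribute through `ExpressionAttributeNames` is just a simple way to avoid clashes with DynamoDB reserved words such as `name`. It only handles top-level attributes, not nested paths.

```python
def build_projection(attribute_names: list[str]) -> dict:
    """Build ProjectionExpression parameters, aliasing every name with a # placeholder."""
    placeholders = {f"#a{i}": name for i, name in enumerate(attribute_names)}
    return {
        # Joining the dict joins its keys, i.e. the placeholders, in order.
        "ProjectionExpression": ", ".join(placeholders),
        "ExpressionAttributeNames": placeholders,
    }


projection = build_projection(["name", "email"])
```

The resulting dictionary can be merged into the parameters of a GetItem or Query call so that only those attributes come back.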

3. Shorter Attribute Names

A more dramatic optimization for item sizes is to shorten your attribute names.

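With DynamoDB, you pay by item size rather than per request or per item: reads are metered in 4KB increments and writes in 1KB increments, and attribute names count toward an item’s size.

Shortening your attribute names will therefore reduce item sizes and, across enough items, the cost of requests and storage.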
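To see why a few bytes can matter, here is a rough sketch of the capacity arithmetic for a single item. It assumes reads are rounded up to 4KB steps (halved for eventually consistent reads) and writes to 1KB steps, treating 1KB as 1,024 bytes; the function names are hypothetical.

```python
import math


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool = True) -> float:
    """Capacity consumed by reading one item, rounded up to 4KB steps."""
    units = max(1, math.ceil(item_size_bytes / 4096))
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return units if strongly_consistent else units / 2


def write_capacity_units(item_size_bytes: int) -> int:
    """Capacity consumed by writing one item, rounded up to 1KB steps."""
    return max(1, math.ceil(item_size_bytes / 1024))


# An item just over a 4KB boundary, before and after trimming 200 bytes of names.
before = (read_capacity_units(4200), write_capacity_units(4200))
after = (read_capacity_units(4000), write_capacity_units(4000))
```

In this example the 200 bytes saved drop the read from 2 units to 1 and the write from 5 units to 4. When an item is nowhere near a boundary, the same trimming changes nothing for single-item requests.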

This is a more “desperate” strategy and only recommended to be used at large scales when optimizing item sizes becomes critical.

Here’s the strategy:

  • Use short abbreviations of one to three letters instead of the full attribute name: e.g. “user” => “u”, “firstName” => “n”, “userID” => “uid”.

  • Write all items to DynamoDB in this short form.

  • On the client side, keep a map of these transformations: shorten the attribute names when you write data and rehydrate them to their long versions when you read it.

  • This lets you query and write data in your application code as you normally would with regular named attributes.
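The steps above can be sketched as follows. The map reuses the example abbreviations from the list, the function names are hypothetical, and the sketch only renames top-level attributes.

```python
# Long-to-short attribute name map, kept in one shared place on the client side.
LONG_TO_SHORT = {"user": "u", "firstName": "n", "userID": "uid"}
SHORT_TO_LONG = {short: long for long, short in LONG_TO_SHORT.items()}


def shorten(item: dict) -> dict:
    """Rename attributes to their short form before writing to DynamoDB."""
    return {LONG_TO_SHORT.get(name, name): value for name, value in item.items()}


def rehydrate(item: dict) -> dict:
    """Restore the long attribute names after reading from DynamoDB."""
    return {SHORT_TO_LONG.get(name, name): value for name, value in item.items()}


stored = shorten({"userID": "42", "firstName": "Ada", "age": 30})
restored = rehydrate(stored)
```

Attributes missing from the map, like `age` here, pass through unchanged. Every short name must be unique, otherwise rehydrating becomes ambiguous.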

Finally, this can also be done with values, but it may not always be ideal. Use what makes the most sense for your use case.

Balancing simplicity and costs is critical.

👋 My name is Uriel Bitton and I hope you learned something in this edition of Excelling With DynamoDB.

📅 If you're looking for help with DynamoDB, let's have a quick chat.

🙌 I hope to see you in next week's edition!