Enhancing DynamoDB With Full-Text Search
The best way to make your DynamoDB data searchable
Can you use DynamoDB queries (or Scan) to run full-text searches on your data?
Like many things, there's an efficient (and recommended) way to do it as well as an inefficient way.
Let's take a look at both.
Inefficient full-text search
DynamoDB provides a "contains" function that you can use in a filter expression on queries.
How it works.
When you use QueryCommand, you must specify the partition key (and optionally a sort key condition) in the KeyConditionExpression. Then, you can add a FilterExpression that uses the "contains" function.
Here's an example:
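import { DynamoDBClient, QueryCommand } from "@aws-sdk/client-dynamodb";
import { marshall } from "@aws-sdk/util-dynamodb";

const dbClient = new DynamoDBClient({});

// Query a single partition, then filter its items on the "interests" attribute.
const params = {
  TableName: "users",
  KeyConditionExpression: "pk = :pk",
  ExpressionAttributeValues: marshall({
    ":pk": "user#101",
    ":searchVal": "sports",
  }),
  // The filter runs after the items are read, so it does not reduce read cost.
  FilterExpression: "contains(interests, :searchVal)",
};
const command = new QueryCommand(params);
const result = await dbClient.send(command);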
With this code we can search the items in that user's partition for an interest of "sports".
We can store user interests as keywords in a string or an array, and when we search for an interest keyword, DynamoDB will return any items whose "interests" attribute contains the value we search for.
This enables a basic form of full-text search by returning every item whose attribute contains our search string. Keep in mind that the match is case-sensitive.
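To make the matching rules concrete, here is a minimal local sketch of how "contains" behaves as I understand the documented semantics: a substring match on strings and a whole-element match on lists and sets. The containsLike function is a hypothetical helper written for illustration; it is not part of DynamoDB or the AWS SDK.

```javascript
// Local emulation of contains(path, operand), for illustration only.
// containsLike is a hypothetical name, not an AWS SDK function.
function containsLike(attribute, operand) {
  if (typeof attribute === "string") {
    // String attribute: case-sensitive substring match.
    return typeof operand === "string" && attribute.includes(operand);
  }
  if (attribute instanceof Set) {
    // Set attribute: the operand must be a member of the set.
    return attribute.has(operand);
  }
  if (Array.isArray(attribute)) {
    // List attribute: the operand must equal one whole element.
    return attribute.includes(operand);
  }
  return false;
}

console.log(containsLike("sports, music, cooking", "sports"));   // true
console.log(containsLike("Sports, music", "sports"));            // false
console.log(containsLike(["water sports", "music"], "sports"));  // false
console.log(containsLike(["sports", "music"], "sports"));        // true
```

The practical consequence is that keywords stored as a list only match on exact elements, while keywords stored in one string match on any substring, and neither handles casing, stemming or typos.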
Now why is it inefficient?
For two reasons:
First, we are only searching within a single partition. In the code above we specify "pk" as the partition key, so the result set is limited to the items in that partition. The filter is also applied after the items are read, so you still pay to read every item in the partition, not only the matches.
Second, to get around this limitation we can use the Scan operation to extend the search to the entire table.
But that is highly inefficient: Scan reads every item in the table, which significantly increases cost and latency.
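To see that cost in practice, here is a rough sketch of a table-wide search with Scan. It assumes the same "users" table and "interests" attribute as the Query example, and it takes a hypothetical sendScan callback so the paging logic can be read (and tested) without a live table. In real code that callback would be something like (params) => dbClient.send(new ScanCommand(params)). The response fields used here (Items, ScannedCount, LastEvaluatedKey) are the ones Scan returns.

```javascript
// Builds the parameters for one Scan page. Values use the low-level
// AttributeValue format ({ S: "..." }) instead of marshall().
function buildScanParams(searchVal, startKey) {
  const params = {
    TableName: "users",
    FilterExpression: "contains(interests, :searchVal)",
    ExpressionAttributeValues: { ":searchVal": { S: searchVal } },
  };
  // Continue from where the previous page stopped, if there was one.
  if (startKey) params.ExclusiveStartKey = startKey;
  return params;
}

// sendScan is a hypothetical callback that sends one Scan request
// and resolves with its response.
async function scanForInterest(sendScan, searchVal) {
  const items = [];
  let scanned = 0;
  let startKey;
  do {
    const page = await sendScan(buildScanParams(searchVal, startKey));
    items.push(...(page.Items ?? []));
    // ScannedCount counts the items read before the filter was applied.
    scanned += page.ScannedCount ?? 0;
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return { items, scanned };
}
```

Comparing scanned with items.length shows the problem: the number of items read grows with the size of the table, no matter how few of them match.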
Efficient full-text search
The efficient alternative to this is using an external integration.
There are various solutions that enable true and efficient full-text searching, such as search engines and vector databases.
Some examples are:
Amazon OpenSearch Service
Algolia
Elasticsearch
These solutions let you synchronize your DynamoDB data in near real time and act as a replicated data store on which you can run more complex full-text search and filtering.
Here's how you can enable this.
Enable Streams on your DynamoDB table.
Once enabled, create a Lambda function that is triggered by new writes, edits and deletes of items in your DynamoDB table.
In that function, write, update or delete the corresponding items in your external solution.
Once the items are in the secondary database or search engine, you can run your search queries against it.
This process is simple and maintains a reliable synchronized copy of your data on the search database.
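The steps above can be sketched as a Lambda handler. This is a minimal illustration, not a production implementation. It assumes the stream is configured to include new item images, that the table has a string partition key named "pk", and that searchIndex is a hypothetical client exposing upsert(id, doc) and remove(id); a real search service will have its own API. The toPlain helper is a small stand-in for unmarshall from @aws-sdk/util-dynamodb that covers only a few attribute types.

```javascript
// Minimal stand-in for unmarshall(); handles only the types used here.
function toPlain(attr) {
  if ("S" in attr) return attr.S;
  if ("N" in attr) return Number(attr.N);
  if ("BOOL" in attr) return attr.BOOL;
  if ("SS" in attr) return [...attr.SS];
  if ("L" in attr) return attr.L.map(toPlain);
  if ("M" in attr) return imageToDoc(attr.M);
  return null;
}

// Converts an item image in AttributeValue format to a plain object.
function imageToDoc(image) {
  return Object.fromEntries(
    Object.entries(image).map(([name, attr]) => [name, toPlain(attr)])
  );
}

// Turns one stream record into an operation for the search index.
function toIndexOperation(record) {
  // Assumption: a string partition key named "pk" identifies the document.
  const id = record.dynamodb.Keys.pk.S;
  if (record.eventName === "REMOVE") return { action: "delete", id };
  // INSERT and MODIFY both become an upsert of the latest item image.
  return { action: "upsert", id, doc: imageToDoc(record.dynamodb.NewImage) };
}

// searchIndex is hypothetical: any client with upsert(id, doc) and remove(id).
function createHandler(searchIndex) {
  return async function handler(event) {
    for (const record of event.Records) {
      const op = toIndexOperation(record);
      if (op.action === "delete") await searchIndex.remove(op.id);
      else await searchIndex.upsert(op.id, op.doc);
    }
  };
}
```

Treating INSERT and MODIFY as the same upsert keeps the handler safe to re-run if Lambda retries a batch, since replaying a record leaves the index in the same state.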
If you need a serverless solution, Algolia provides a fully serverless search engine. You can create an index and store your table's data in it, keeping the architecture serverless alongside DynamoDB and Lambda.
Next week I'll take you through a demo of integrating DynamoDB with Algolia.
Stay tuned!
👋 My name is Uriel Bitton and I hope you learned something in this edition of Excelling With DynamoDB.
📅 If you're looking for help with DynamoDB, let's have a quick chat.
🙌 I hope to see you in next week's edition!