Redis as a vector database quick start guide

Understand how to use Redis as a vector database

This quick start guide helps you to:

  1. Understand what a vector database is
  2. Set up a vector database
  3. Create vector embeddings and store vectors
  4. Query data and perform a vector search

Understand vector databases

Data is often unstructured, which means that it isn't described by a well-defined schema. Examples of unstructured data include text passages, images, videos, or music titles. An approach to dealing with unstructured data is to vectorize it. Vectorizing means to map unstructured data to a flat sequence of numbers. Such a vector represents the data embedded in an N-dimensional space. Machine learning models have facilitated the rise of embeddings as a widely embraced method for generating dense, low-dimensional vector representations. Given a suitable machine learning model, the generated embeddings can encapsulate complex patterns and semantic meanings inherent in data.

You can use Redis Stack as a vector database. It allows you to:

  • Store vectors and the associated metadata within hashes or JSON documents
  • Retrieve vectors
  • Perform vector searches

Set up a vector database

The easiest way to get started with Redis Stack is to use Redis Cloud:

  1. Create a free account.

  2. Follow the instructions to create a free database.

This free Redis Cloud database comes out of the box with all the Redis Stack features.

You can alternatively use the installation guides to install Redis Stack on your local machine.

Your Redis server needs two features enabled: JSON, and search and query.

Install the required Python packages

The code examples are currently provided for Redis CLI and Python. For Python, you will need to create a virtual environment and install the following Python packages:

  • redis: You can find further details about the redis-py client library in the clients section of this documentation site.
  • pandas: Pandas is a data analysis library.
  • sentence-transformers: You will use the SentenceTransformers framework to generate embeddings on full text. Sentence-BERT (SBERT) is a BERT model modification that produces consistent and contextually rich sentence embeddings. SBERT improves tasks like semantic search and text grouping by allowing for efficient and meaningful comparison of sentence-level semantic similarity.
  • tabulate: This package is optional. Pandas uses it to render Markdown.

You will also need the following imports in your Python code:

Connect

Instantiate the Redis client. By default, Redis returns binary responses. To decode them, you pass the decode_responses parameter set to True:


Tip:
Instead of using a local Redis Stack server, you can copy and paste the connection details from the Redis Cloud database configuration page. Here is an example connection string of a Cloud database that is hosted in the AWS region us-east-1 and listens on port 16379: redis-16379.c283.us-east-1-4.ec2.cloud.redislabs.com:16379. The connection string has the format host:port. You must also copy and paste the username and password of your Cloud database. The line of code for connecting with the default user then changes to client = redis.Redis(host="redis-16379.c283.us-east-1-4.ec2.cloud.redislabs.com", port=16379, password="your_password_here", decode_responses=True).
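The host:port format described in the tip can be split into client arguments with a small helper. This is a minimal sketch: the function name redis_kwargs_from_endpoint is hypothetical, and it only builds a dictionary, so it runs without a Redis server.

```python
def redis_kwargs_from_endpoint(endpoint, password=None, username="default"):
    """Split a 'host:port' endpoint into keyword arguments for redis.Redis."""
    host, _, port = endpoint.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {endpoint!r}")
    kwargs = {"host": host, "port": int(port), "decode_responses": True}
    if password is not None:
        # Only send credentials when a password was actually supplied.
        kwargs["username"] = username
        kwargs["password"] = password
    return kwargs


kwargs = redis_kwargs_from_endpoint(
    "redis-16379.c283.us-east-1-4.ec2.cloud.redislabs.com:16379",
    password="your_password_here",
)
# With redis-py installed: client = redis.Redis(**kwargs)
```

Keeping the parsing separate from the connection makes it easy to switch between a local server and a Cloud database by changing one string.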

Create vector embeddings from the demo data

This quick start guide uses the bikes dataset. Here is an example document from it:

{
  "model": "Jigger",
  "brand": "Velorim",
  "price": 270,
  "type": "Kids bikes",
  "specs": {
    "material": "aluminium",
    "weight": "10"
  },
  "description": "Small and powerful, the Jigger is the best ride for the smallest of tikes! ...
}

The description field is particularly interesting since it contains a free-form textual description of a bicycle.

1. Fetch the demo data

You need to first fetch the demo dataset as a JSON array:
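As a rough sketch of the fetch step, the helper below downloads a URL with the standard library and checks that the body is a JSON array. The name fetch_json_array and the BIKES_URL placeholder are assumptions for illustration; the guide's own example may use a different HTTP library.

```python
import json
import urllib.request


def fetch_json_array(url, timeout=10):
    """Download a URL and parse the body as a JSON array."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = json.loads(response.read().decode("utf-8"))
    if not isinstance(data, list):
        raise ValueError("expected a JSON array at the top level")
    return data


# BIKES_URL is a placeholder; point it at wherever the bikes dataset is hosted.
# bikes = fetch_json_array(BIKES_URL)
```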

The following code allows you to look at the structure of one of our bike JSON documents.

2. Store the demo data in your database

Then, iterate over the bikes array and store each element as a JSON document in the database by using the JSON.SET command. The code below uses a pipeline so that all commands are sent in one batch, which minimizes the number of network round trips:
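A minimal sketch of the pipelined store, assuming a connected redis-py client is passed in. The function name store_bikes and the zero-padded key format (bikes:001, bikes:002, ...) are assumptions; only the bikes: prefix comes from this guide.

```python
def store_bikes(client, bikes, prefix="bikes:"):
    """Queue one JSON.SET per document and send them in a single batch."""
    pipeline = client.pipeline()
    for i, bike in enumerate(bikes, start=1):
        # Zero-padded keys such as bikes:001 sort in insertion order.
        pipeline.json().set(f"{prefix}{i:03}", "$", bike)
    # execute() sends the queued commands and returns one reply per command.
    return pipeline.execute()
```

Typical usage would be store_bikes(client, bikes) right after fetching the dataset.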

You can now retrieve a specific value from one of the JSON documents in Redis using a JSONPath expression:
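For illustration, a JSONPath lookup through redis-py might look like the sketch below. The wrapper name get_json_value is hypothetical, and the key in the usage comment is an assumed example rather than one taken from this guide.

```python
def get_json_value(client, key, path):
    """Run JSON.GET with a JSONPath and return the list of matches."""
    # A path starting with `$` yields a list, even for a single match.
    return client.json().get(key, path)


# With a connected client (the key name is an assumption):
# get_json_value(client, "bikes:010", "$.model")
```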

3. Select a machine-learning model

This quick start guide uses a pre-trained MS MARCO model. Models of this kind are widely used in search engines, chatbots, and other AI applications.

from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer('msmarco-distilbert-base-v4')

4. Create the vector embeddings

In the next step, iterate over all the Redis keys that start with the prefix bikes: (the trailing colon is part of the prefix):

Use the keys as a parameter to the JSON.MGET command, along with the JSONPath expression $.description to collect the descriptions in a list. Then, pass the list to the encode method to get a list of vectorized embeddings:
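The two steps above can be sketched as follows. The helper names are hypothetical, the code assumes every document has a description field, and the encoder is passed in as a callable so that embedder.encode from the model selected earlier can be plugged in.

```python
def collect_descriptions(client, pattern="bikes:*"):
    """Return the matching keys (sorted) and one description per key."""
    keys = sorted(client.keys(pattern))
    # JSON.MGET with a JSONPath returns one list of matches per key.
    matches = client.json().mget(keys, "$.description")
    descriptions = [text for per_key in matches for text in per_key]
    return keys, descriptions


def embed_descriptions(encode, descriptions):
    """Apply an encoder callable and return plain lists of Python floats."""
    return [[float(x) for x in vector] for vector in encode(descriptions)]


# With a connected client and the embedder from the previous step:
# keys, descriptions = collect_descriptions(client)
# embeddings = embed_descriptions(embedder.encode, descriptions)
```

Converting to plain floats matters because the embeddings are stored as JSON arrays in the next step, and JSON serializers do not accept NumPy scalar types.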

You now need to add the vectorized descriptions to the JSON documents in Redis using the JSON.SET command. The following command inserts a new field in each of the documents under the JSONPath $.description_embeddings. Once again, you'll do this using a pipeline:
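A possible shape for this update, assuming keys and embeddings are parallel lists produced in the previous step. The function name store_embeddings is hypothetical.

```python
def store_embeddings(client, keys, embeddings, path="$.description_embeddings"):
    """Write one embedding per key under the given JSONPath, in one batch."""
    if len(keys) != len(embeddings):
        raise ValueError("keys and embeddings must have the same length")
    pipeline = client.pipeline()
    for key, embedding in zip(keys, embeddings):
        # JSON.SET on a sub-path adds the field to the existing document.
        pipeline.json().set(key, path, embedding)
    return pipeline.execute()
```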

Inspect one of the vectorized bike documents using the JSON.GET command:

When storing a vector embedding within a JSON document, the embedding is stored as a JSON array.

Note:
In the example above, the array was shortened considerably for the sake of readability.

Create an index

1. Create an index with a vector field

You must create an index to query based on vector metadata or perform vector searches. Use the FT.CREATE command:
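One way to issue the command from Python is to build the raw argument list and hand it to the client. This is a reduced sketch: it declares only the vector field, the index name idx:bikes_vss is an assumption, and a real schema would also list the metadata fields you want to filter on or return.

```python
def build_ft_create_args(index_name, prefix, dim):
    """Return FT.CREATE arguments for a JSON index with one vector field."""
    vector_options = [
        "TYPE", "FLOAT32",
        "DIM", str(dim),
        "DISTANCE_METRIC", "COSINE",
    ]
    return [
        "FT.CREATE", index_name,
        "ON", "JSON",
        "PREFIX", "1", prefix,
        "SCHEMA",
        "$.description_embeddings", "AS", "vector",
        # FLAT is followed by the number of option tokens that come after it.
        "VECTOR", "FLAT", str(len(vector_options)), *vector_options,
    ]


args = build_ft_create_args("idx:bikes_vss", "bikes:", 768)
# With a connected redis-py client: client.execute_command(*args)
```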

Here is a breakdown of the VECTOR schema field definition:

  • $.description_embeddings AS vector: The vector field's JSON path and its field alias vector.
  • FLAT: Specifies the indexing method, which is either a flat index or a hierarchical navigable small world graph (HNSW).
  • TYPE FLOAT32: Sets the type of a vector component, in this case a 32-bit floating point number.
  • DIM 768: The length or dimension of the embeddings, which you determined previously to be 768.
  • DISTANCE_METRIC COSINE: The distance function is, in this example, cosine distance.

You can find further details about all these options in the vector reference documentation.

2. Check the state of the index

As soon as you execute the FT.CREATE command, the indexing process runs in the background. In a short time, all JSON documents should be indexed and ready to be queried. To validate that, you can use the FT.INFO command, which provides details and statistics about the index. Of particular interest are the number of documents successfully indexed and the number of failures:
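A small sketch of reading those two counters from the reply, assuming it arrives as a dictionary with the keys num_docs and hash_indexing_failures (the exact reply shape can vary with client and protocol version). The function name summarize_index_info is hypothetical.

```python
def summarize_index_info(info):
    """Pick the indexed-document and failure counters out of an FT.INFO reply."""
    return {
        # Values may arrive as strings, so normalize them to integers.
        "num_docs": int(info["num_docs"]),
        "indexing_failures": int(info["hash_indexing_failures"]),
    }


# With a connected client: summarize_index_info(client.ft(INDEX_NAME).info())
```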

Search and query

This quick start guide focuses on the vector search aspect. Still, you can learn more about how to query based on vector metadata in the document database quick start guide.

1. Embed your prompts

The following code snippet shows a list of textual prompts:

First, encode the query prompts with the same SentenceTransformers model that you used for the bike descriptions:

2. Perform a K-nearest neighbors (KNN) query

KNN is a foundational algorithm that aims to find the most similar items to a given input. The KNN algorithm calculates the distance between the query vector and each vector in the database based on the chosen distance function. It then returns the K items with the smallest distances to the query vector. These are the most similar items.
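To make the idea concrete, here is a brute-force version in plain Python with cosine distance. It is an illustration of the concept on toy data, not how Redis implements the search internally.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn(query, vectors, k):
    """Return the k (key, distance) pairs closest to the query vector."""
    scored = [(key, cosine_distance(query, vec)) for key, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


toy = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = knn([1.0, 0.1], toy, k=2)
# The query points almost along "a", so "a" ranks first and "c" second.
```

This exhaustive scan compares the query against every stored vector, which is the same strategy a FLAT index uses; HNSW trades some accuracy for speed on larger datasets.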

The following example shows a query that doesn't apply a pre-filter. The pre-filter expression (*) means all, but you could replace it with a query expression that filters by additional metadata.

The KNN part of the query searches for the three nearest neighbors. The distance to the query vector is returned as vector_score, and the results are sorted by this score. Finally, the query returns the fields vector_score, id, $.brand, $.model, and $.description within the result set.

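from redis.commands.search.query import Query

# Match every document (*), then keep the 3 nearest neighbors of $query_vector.
query = (
    Query('(*)=>[KNN 3 @vector $query_vector AS vector_score]')
    .sort_by('vector_score')
    .return_fields('vector_score', 'id', 'brand', 'model', 'description')
    .dialect(2)  # vector queries need DIALECT 2 or greater
)

Note: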
To utilize a vector query with the FT.SEARCH command, you must specify DIALECT 2 or greater.

You must pass the vectorized query as $query_vector as a byte array. The following code shows an example of creating a Python NumPy array from a vectorized query prompt (encoded_query) as a single precision floating point array and converting it into a compact, byte-level representation that can be passed as a parameter to the query:

client.ft(INDEX_NAME).search(query, { 'query_vector': np.array(encoded_query, dtype=np.float32).tobytes() }).docs
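To show what that byte-level representation looks like, the sketch below packs a vector with the standard library instead of NumPy. It assumes a little-endian machine, where the result matches the float32 byte layout that tobytes() produces; the helper name to_float32_bytes is hypothetical.

```python
import struct


def to_float32_bytes(vector):
    """Pack floats as consecutive little-endian 32-bit values."""
    return struct.pack(f"<{len(vector)}f", *vector)


blob = to_float32_bytes([0.5, -1.0, 2.0])
# Each component takes 4 bytes, so a 768-dimensional vector is 3072 bytes.
```

The byte length must equal 4 times the DIM value declared in the index, otherwise the query vector does not match the indexed field.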

With the query template in place, you can execute all query prompts in a loop, passing in each vectorized prompt. Notice that the script reports the score for each result as 1 - doc.vector_score. Because cosine distance is the metric, the items with the smallest distance are the most similar to the query, and subtracting the distance from 1 turns it into a similarity score where higher is better.

Then, loop over the matched documents and create a list of results that can be converted into a Pandas table to visualize the results:
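The row-building part of that loop might look like this sketch. It assumes each result document exposes id, brand, model, and vector_score as attributes, as requested in return_fields; the function name docs_to_rows is hypothetical.

```python
def docs_to_rows(query_text, docs):
    """Turn search result documents into row dicts with a similarity score."""
    rows = []
    for doc in docs:
        rows.append({
            "query": query_text,
            # The search returns a cosine distance; 1 - distance is a similarity.
            "score": round(1 - float(doc.vector_score), 2),
            "id": doc.id,
            "brand": doc.brand,
            "model": doc.model,
        })
    return rows


# With pandas installed, the rows convert directly into a table:
# table = pd.DataFrame(docs_to_rows(prompt, result_docs))
```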

The query results show the top three matches (our K parameter) for each query, along with the bike's id, brand, and model. For example, for the query "Best Mountain bikes for kids", the highest similarity score (0.54), and therefore the closest match, was the 'Nord' brand 'Chook air 5' bike model, described as:

The Chook Air 5 gives kids aged six years and older a durable and uberlight mountain bike for their first experience on tracks and easy cruising through forests and fields. The lower top tube makes it easy to mount and dismount in any situation, giving your kids greater safety on the trails. The Chook Air 5 is the perfect intro to mountain biking.

From the description, this bike is an excellent match for younger children, and the embeddings have accurately captured the semantics of the description.