
Upstash Vector: Serverless Vector Database for AI and LLMs

Mehmet Dogan, CTO @Upstash

Today, we're introducing Upstash Vector, a serverless vector database built for working with vector embeddings for AI models and LLMs. Upstash Vector will help you efficiently store and query high-dimensional vector embeddings with a serverless model, eliminating hosting and management complexities.

What's a Vector Database really?

A vector database is a specialized database designed to store, manage, and query/search vector embeddings, which are mathematical representations of data points in a high-dimensional space. A vector database supports various distance metrics (such as cosine or Euclidean) to measure the similarity between vectors. While storing and managing vector data is straightforward, querying for vector similarity poses significant challenges.

A simple and basic approach to searching in a vector database is to perform an exhaustive search, comparing the query vector to every other vector stored in the database one by one. However, this consumes too many resources and results in very high latencies, making it impractical at scale. To address this problem, Approximate Nearest Neighbor (ANN) algorithms are used. ANN search approximates the true nearest neighbor, which means it might not find the absolute closest point, but it will find one that's close enough, with much faster performance and fewer resources.
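
For intuition, the exhaustive approach is just a linear scan that scores every stored vector against the query, which is easy to write but whose cost grows with the dataset. Here is a minimal NumPy sketch of that brute-force scan (a toy illustration, not how Upstash Vector executes queries):

import numpy as np

def brute_force_top_k(query, vectors, k=10):
    # Cosine similarity between the query and every stored vector: O(n) work per query.
    scores = vectors @ query / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
    top = np.argsort(-scores)[:k]  # indices of the k most similar vectors, best first
    return list(zip(top.tolist(), scores[top].tolist()))

# 100,000 random 1536-dimensional embeddings as stand-in data.
vectors = np.random.rand(100_000, 1536).astype(np.float32)
query = np.random.rand(1536).astype(np.float32)
print(brute_force_top_k(query, vectors, k=5))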

A vector database indexes the stored vector embeddings using an ANN algorithm and a distance metric. In other words, an index is a data structure optimized for searching the stored vectors for a given query vector.

ANN Algorithms

Several ANN algorithms, such as HNSW, NSG, and DiskANN, are available, each with its own advantages, problems, and trade-offs. One of the difficult problems with ANN algorithms is that indexing and querying vectors may require keeping the entire dataset in memory. When the dataset is huge, the memory requirements for indexing may exceed the available memory.

The DiskANN algorithm tries to solve this problem by using disk as the main storage for indexes and performing queries directly on disk. The DiskANN paper acknowledges that HNSW and NSG can also achieve acceptable latencies when vectors are stored on disk. However, it argues that their latencies can be significantly higher than DiskANN's, especially for large datasets and high recall requirements.

Even though DiskANN has its advantages, it requires more work to be practical. The main problem is that you can't insert new vectors or update existing ones without rebuilding the whole index. To solve this problem, there is a follow-up paper: FreshDiskANN. FreshDiskANN improves DiskANN by introducing a temporary in-memory index for the most recent data. Queries are served from both the temporary index and the disk index. The temporary indexes are merged into the disk index when needed, using the StreamingMerge algorithm described in the paper.

Upstash Vector with DiskANN

Upstash Vector uses the DiskANN and FreshDiskANN algorithms with a few more enhancements. We based our work on the amazing JVector library and implemented FreshDiskANN's StreamingMerge algorithm on top of it.

For each vector database, we create a transient index in memory (similar to FreshDiskANN's temporary index) and a disk-based index. All writes (except huge bursts) are performed on the transient index initially, and background procedures then merge the transient index into the disk-based index when certain criteria are met, such as memory usage or the number of vectors. Each query is executed on both the transient and disk indexes, and their results are merged by comparing similarity scores.

One exception for writes: when a huge burst is detected, such as hundreds of thousands of vector upserts in a very short period, we skip the transient index and merge the updates into the disk index directly. According to our internal tests, this significantly reduces the total time to index burst writes compared to writing to the transient index first and then merging it into the disk index.
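
To make the idea concrete, here is a much-simplified sketch of such a two-tier index. Both tiers are plain in-memory dictionaries here (in the real system the second tier is a DiskANN-style index on disk), and all names and thresholds are illustrative rather than Upstash's actual values:

import numpy as np

MERGE_THRESHOLD = 10_000   # merge transient -> disk once the transient index grows past this
BURST_THRESHOLD = 100_000  # very large batches skip the transient index entirely

def cosine(q, v):
    return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

class TwoTierIndex:
    def __init__(self):
        self.transient = {}  # id -> vector, holds the most recent writes
        self.disk = {}       # id -> vector, stand-in for the disk-based index

    def upsert(self, batch):
        # Burst writes go straight to the "disk" index; normal writes land in the transient index.
        target = self.disk if len(batch) >= BURST_THRESHOLD else self.transient
        target.update(batch)
        if len(self.transient) >= MERGE_THRESHOLD:
            self.disk.update(self.transient)  # done by background procedures in the real system
            self.transient.clear()

    def _search(self, tier, vector, top_k):
        hits = [(vid, cosine(vector, v)) for vid, v in tier.items()]
        return sorted(hits, key=lambda h: h[1], reverse=True)[:top_k]

    def query(self, vector, top_k=10):
        # Search both tiers, then merge the two result lists by similarity score.
        merged = self._search(self.transient, vector, top_k) + self._search(self.disk, vector, top_k)
        merged.sort(key=lambda h: h[1], reverse=True)
        return merged[:top_k]

index = TwoTierIndex()
index.upsert({"id1": np.array([0.1, 0.2]), "id2": np.array([0.3, 0.4])})
print(index.query(np.array([0.6, 0.9]), top_k=2))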

Similarity Functions

Vector similarity functions measure how "close" or "similar" two vectors are to each other, helping us understand relationships and patterns within the data. Each of these functions yields distinct query results, catering to specific use cases.

Here are the three supported similarity functions:

  • Cosine Similarity: Measures the cosine of the angle between two vectors. It is particularly useful when the magnitude of the vectors is not essential, and the focus is on the orientation.

  • Euclidean Distance: Measures the straight-line distance between two vectors in the multi-dimensional space. It is well-suited for scenarios where the magnitude of vectors is crucial, providing a measure of their spatial separation.

  • Dot Product: Measures the similarity by multiplying the corresponding components of two vectors and summing the results.

You can read more about them and their main use cases in our docs: Vector Similarity Functions
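
For reference, the three metrics can be computed from two vectors as follows (a plain NumPy sketch; Upstash Vector computes these server-side when you query):

import numpy as np

a = np.array([0.1, 0.2, 0.7])
b = np.array([0.3, 0.4, 0.1])

# Cosine similarity: angle between the vectors, ignoring magnitude.
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance: straight-line distance; smaller means more similar.
euclidean = np.linalg.norm(a - b)

# Dot product: sum of element-wise products; sensitive to magnitude.
dot = np.dot(a, b)

print(cosine, euclidean, dot)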

Upstash Vector allows storing JSON metadata alongside the vector embeddings, so you can insert embedding-related metadata and get it back when querying the database. However, Upstash Vector doesn't yet allow filtering based on that metadata. Filtering is on our roadmap, and it will enable you to refine vector search results even further.

See Metadata docs and Roadmap for more information.

Where will your data be stored?

Initially, Upstash Vector will be deployed in two AWS regions: us-east-1 (N. Virginia) and eu-west-1 (Ireland). We are planning to add more regions (and maybe more cloud providers) in the coming days.

Multitenancy

As a serverless database, Upstash Vector is a multi-tenant service too. We already have extensive experience providing multi-tenant services across multiple cloud providers and regions. See our Redis and Kafka offerings.

Similarly, Upstash Vector is deployed on multiple instances in each region, and databases are load-balanced among them based on their size, usage, and traffic. To avoid noisy-neighbor problems, we implemented strict CPU parallelism and memory-mapping restrictions for each database.

Plans & Limits & Quotas

Every cloud service has limitations and quotas. While flexibility is important, limitations and quotas ensure fair and sustainable service use.

Upstash Vector will have different-sized plans, and each plan has its own limits and quotas, such as database size, number of daily requests, and max vector dimensions. You can choose a plan that fits your current needs, knowing you can always upgrade or downgrade as your project grows.

Initially, we will have free, pay-as-you-go, fixed, and pro plans. The pay-as-you-go plan is purely serverless at $0.4 per 100K requests (for example, 1 million requests cost $4), while the fixed plan is $60 per month.

See the Upstash Vector pricing & plans page for more information.

API & SDKs

Upstash Vector provides a REST API to upsert, fetch, delete, and, most importantly, query vector embeddings. The REST API works with JSON messages. You can learn more on the REST API page.
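
For example, upserts and queries are plain HTTPS POSTs with a JSON body and a bearer token. Here is a short sketch using Python's requests library, assuming the /upsert and /query endpoints described in the REST API docs, with placeholders standing in for your database's URL and token:

import requests

UPSTASH_VECTOR_REST_URL = "<UPSTASH_VECTOR_REST_URL>"
UPSTASH_VECTOR_REST_TOKEN = "<UPSTASH_VECTOR_REST_TOKEN>"
headers = {"Authorization": f"Bearer {UPSTASH_VECTOR_REST_TOKEN}"}

# Upsert a vector together with its JSON metadata.
requests.post(
    f"{UPSTASH_VECTOR_REST_URL}/upsert",
    headers=headers,
    json={"id": "id1", "vector": [0.1, 0.2], "metadata": {"metadata_field": "metadata_value1"}},
)

# Query for the 10 most similar vectors, asking for metadata in the response.
res = requests.post(
    f"{UPSTASH_VECTOR_REST_URL}/query",
    headers=headers,
    json={"vector": [0.6, 0.9], "topK": 10, "includeMetadata": True},
)
print(res.json())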

We have also released three official SDKs, for Python, TypeScript, and Go, built around our REST API:

Python:

from upstash_vector import Index

index = Index(url="<UPSTASH_VECTOR_REST_URL>", token="<UPSTASH_VECTOR_REST_TOKEN>")
 
index.upsert(
    vectors=[
        ("id1", [0.1, 0.2], {"metadata_field": "metadata_value1"}),
        ("id2", [0.3, 0.4], {"metadata_field": "metadata_value2"}),
    ]
)
 
query_vector = [0.6, 0.9]
query_res = index.query(
    vector=query_vector,
    top_k=10,
    include_vectors=True,
    include_metadata=True,
)
# ...

TypeScript:

import { Index } from "@upstash/vector";
 
const index = new Index({
  url: "<UPSTASH_VECTOR_REST_URL>",
  token: "<UPSTASH_VECTOR_REST_TOKEN>",
});
 
await index.upsert([
  { id: "id1", vector: [0.1, 0.2], metadata: { metadata_field: "metadata_value1" } },
  { id: "id2", vector: [0.3, 0.4], metadata: { metadata_field: "metadata_value2" } },
]);

const results = await index.query({ topK: 10, vector: [0.6, 0.9] });
// ...
Go:

package main

import (
	"fmt"

	"github.com/upstash/vector-go"
)
 
func main() {
	index := vector.NewIndex("<UPSTASH_VECTOR_REST_URL>", "<UPSTASH_VECTOR_REST_TOKEN>")
 
	upserts := []vector.Upsert{
		{
			Id:       "id1",
			Vector:   []float32{0.1, 0.2},
			Metadata: map[string]any{"metadata_field": "metadata_value1"},
		},
		{
			Id:       "id2",
			Vector:   []float32{0.3, 0.4},
			Metadata: map[string]any{"metadata_field": "metadata_value2"},
		},
	}
	if err := index.UpsertMany(upserts); err != nil {
		panic(err)
	}

	scores, err := index.Query(vector.Query{
		Vector:          []float32{0.6, 0.9},
		TopK:            10,
		IncludeVectors:  true,
		IncludeMetadata: true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(scores)

	// ...
}

Still a way to go: Our roadmap

There are still several features and enhancements planned in the Upstash Vector roadmap:

  • Metadata Filtering: The filtering feature will add a hybrid search capability, enabling you to refine vector search results even further by filtering out irrelevant noise based on keywords or metadata, in addition to vector similarity.

  • Index Replication: With replication, there will be multiple read replicas of each index, ensuring:

    • High Availability: Even if one replica fails, others will remain accessible, maintaining index availability.
    • Reduced Latency: Replicas can be placed in different regions, bringing the index closer to clients and reducing latency for geographically distributed users.
  • Index Namespaces: Namespaces will allow you to divide an index into multiple, isolated partitions. Each namespace will function as a self-contained subset of the index.

See our Roadmap page to learn more.

What makes Upstash Vector different?

Upstash is already a pioneer of the serverless database model, allowing users to avoid infrastructure work and only pay for what they use. Despite the crowded vector database market, we believe it lacks a solution that is reasonably priced, powerful, and capable of supporting large amounts of data without compromising performance.

Some solutions offer a low barrier to entry and can be easily used as an extension to your existing database (bolt-on solutions). However, they struggle under high-capacity loads. Most use the HNSW algorithm, which is memory-intensive, so infrastructure costs grow in proportion to the amount of data stored.

On the other hand, there are powerful, purpose-built solutions. Unfortunately, many of them offer a poor development experience due to complicated APIs and configuration options. A few offer an easy-to-use managed service, but their complex pricing models can lead to sudden cost increases.

We created Upstash Vector, a purpose-built, powerful vector database that aims for an excellent developer experience and a flexible pricing model suitable for both startups and enterprises.

If you have any questions or need assistance, you can reach out to our support team at support@upstash.com or join our community on Discord. Stay updated by following us on Twitter.