From my research, the Vector Search landscape is converging around these five offerings:
- String Lookups (Database)
- Vector Embedding (ML)
- ANN/kNN (Vector Search)
- Filtering (Token Search - TF-IDF/BM25)
- Re-Ranking (Feedback Loops, Model Re-Training, Learn-To-Rank, etc.)

This repository discusses the criteria to use when evaluating vector search engine technologies.

| Engine | Hosting | Performance | MLOps | Availability | Security | Cost | Community | Dense vs Sparse |
|---|---|---|---|---|---|---|---|---|
| Pinecone | Cloud Only | | | | | | N/A | |
| Weaviate | Both | | | | | | 31 | |
| | Cloud Only | | | | | | N/A | Dense |
| Elastic | Both | | | | | | 161 | Both |
| Algolia | Cloud Only | | | | | | N/A | Both |
| Vespa | Both | | | | | | N/A | |
| Milvus | Both | | | | | | 194 | |
| Redis | Both | | | | | | 617 | Both |
| Qdrant | Both | | | | | | 28 | |
| OpenSearch | Both | | | | | | 135 | Both |
| LucidWorks | Cloud Only | | | | | | | |

Where is the search engine deployable?
- Both: Can be deployed as a managed service in the cloud or self-managed on premises.
- Cloud Only: Can only be deployed as a managed service in the cloud.
Comparing performance is challenging because some of the technologies are cloud-exclusive while others can also be self-hosted. We'll have to devise a benchmark that compares read/write SLAs on comparable hardware. Ideally, we'll use Locust.io to simulate fixed-throughput concurrency.
Proposed benchmarks below:
Hardware Selection:
- Must select single-node instances with comparable CPU, RAM, disk, and IOPS
Write:
- Insert 1,000,000 vectors of fixed dimensionality (384), as output by the most popular sentence-similarity transformer
- Measure the time to completion/durability
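
A minimal sketch of how the write measurement could be timed. It assumes a hypothetical `InMemoryStore` class standing in for a real engine client, and random vectors standing in for transformer embeddings; neither comes from any of the engines listed. The vector count is scaled down so the example runs quickly.

```python
import random
import time

DIM = 384  # fixed dimensionality from the proposal


class InMemoryStore:
    """Hypothetical stand-in for an engine client; not a real library API."""

    def __init__(self):
        self.vectors = {}

    def insert_batch(self, batch):
        # batch is a list of (id, vector) pairs
        for vec_id, vec in batch:
            self.vectors[vec_id] = vec


def random_vectors(n, dim=DIM, seed=0):
    """Generate n random vectors; real runs would use transformer embeddings."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]


def time_bulk_insert(store, vectors, batch_size=500):
    """Insert all vectors in batches and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    for offset in range(0, len(vectors), batch_size):
        chunk = vectors[offset:offset + batch_size]
        store.insert_batch([(offset + i, v) for i, v in enumerate(chunk)])
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000  # scaled down from 1,000,000 for illustration
    store = InMemoryStore()
    elapsed = time_bulk_insert(store, random_vectors(n))
    print(f"inserted {len(store.vectors)} vectors in {elapsed:.3f}s")
```

This only times the client-side insert loop. Measuring durability would additionally require whatever engine-specific confirmation (flush, commit, or index-ready signal) each product exposes, which is not modeled here.
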
Read:
- Measure max reads/sec and query response latency
- Fixed K value (10)
- Use only cosine similarity, to limit scope
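
The read measurements above can be sketched as follows. This is an illustration under stated assumptions, not the planned harness: it uses an exact brute-force search over an in-memory dictionary (a real engine would use an ANN index), issues queries sequentially from a single thread, and the function names `knn_search` and `measure_reads` are hypothetical.

```python
import heapq
import math
import random
import statistics
import time

DIM = 384  # fixed dimensionality from the proposal
K = 10     # fixed K from the proposal


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(index, query, k=K):
    """Exact kNN: return the k (score, id) pairs with the highest cosine similarity."""
    scored = ((cosine_similarity(query, vec), vec_id) for vec_id, vec in index.items())
    return heapq.nlargest(k, scored)


def measure_reads(index, queries, k=K):
    """Run queries one at a time and report throughput and latency figures."""
    latencies = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        knn_search(index, query, k)
        latencies.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return {
        "reads_per_sec": len(queries) / total,
        "p50_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # Small illustrative corpus; the proposal calls for 1,000,000 vectors.
    index = {i: [rng.uniform(-1, 1) for _ in range(DIM)] for i in range(500)}
    queries = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(10)]
    print(measure_reads(index, queries))
```

Because the loop is sequential, `reads_per_sec` here is a single-client figure. Finding the maximum sustainable reads/sec would need concurrent clients, which is where a load generator such as Locust comes in.
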
For the Community column, we've taken the total number of contributors to each repository as of 9/19/2022, with the caveat that some of these vector search engines are baked into a parent repository (e.g., Redis vector search within core Redis).
Can you perform dense vector searches in conjunction with sparse vector searches?
TBA
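
To illustrate what running dense and sparse searches in conjunction can mean in practice, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a dense (vector) result list with a sparse (token-based) one. It is not a description of how any engine in the table implements hybrid search; the function name, the document IDs, and the constant `k=60` (a conventional default) are assumptions made for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Ties are broken by document ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense_results = ["a", "b", "c"]   # hypothetical IDs from a vector search
    sparse_results = ["b", "d", "a"]  # hypothetical IDs from a BM25 search
    print(reciprocal_rank_fusion([dense_results, sparse_results]))
```

Documents that rank well in both lists rise to the top, while documents found by only one method are still retained further down.
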