
You're building an AI customer service chatbot that needs to answer questions about your company's 50,000 support articles. Traditional databases can find exact matches for keywords, but they can't understand that "How do I reset my password?" and "I forgot my login credentials" mean essentially the same thing. This is where vector databases come in — they store and search information based on semantic meaning rather than exact text matches.
By the end of this lesson, you'll understand how vector databases work, when to use them, and how to choose between the three most popular options for production systems. You'll also build a working example that demonstrates the key differences between these platforms.
What you'll learn: how vector databases store and search information by semantic meaning, how Pinecone, Weaviate, and pgvector differ in architecture and operations, and how to choose between them for a production system.
You should be comfortable with basic Python, installing packages with pip, and running commands in a terminal.
No prior experience with vector databases, embeddings, or machine learning is required.
Think of a traditional database as a filing cabinet organized alphabetically. If you're looking for "Smith," you know exactly where to find it. But what if you need to find all documents related to "customer satisfaction" when some files are labeled "happy clients," "user feedback," or "service quality"?
Vector databases solve this problem by converting information into numerical representations called vectors — essentially coordinates in a multi-dimensional space. Similar concepts end up close together in this space, allowing you to find related information even when the exact words don't match.
Here's how the process works: an embedding model converts each piece of text into a vector; those vectors are stored in the database along with any metadata; at query time, the same model converts the user's question into a vector; and the database returns the stored items whose vectors are closest to it.
Let's see this in action with a simple example:
# Conceptual example - don't run this yet
query = "password reset help"
query_vector = [0.2, 0.8, -0.1, 0.5, ...] # 1536 dimensions typically
stored_vectors = {
"How to reset forgotten password": [0.3, 0.7, -0.2, 0.4, ...],
"Changing login credentials": [0.1, 0.9, -0.1, 0.6, ...],
"Pizza delivery menu": [-0.8, 0.1, 0.9, -0.3, ...]
}
# The system calculates distances and returns the closest matches
# Password-related articles will be much closer than the pizza menu
The magic happens in that multi-dimensional space. Similar concepts cluster together, enabling semantic search that understands meaning rather than just matching keywords.
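Closeness in that space is usually measured with cosine similarity. Here's a tiny, self-contained sketch (toy 4-dimensional vectors, not real embeddings) showing why the password articles would outrank the pizza menu:
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

query = [0.2, 0.8, -0.1, 0.5]
password_article = [0.3, 0.7, -0.2, 0.4]
pizza_menu = [-0.8, 0.1, 0.9, -0.3]

print(cosine_similarity(query, password_article))  # high: the vectors point the same way
print(cosine_similarity(query, pizza_menu))        # low: unrelated content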
Pinecone is a fully managed vector database service that handles all the infrastructure complexity for you. Think of it as the "AWS RDS of vector databases" — you get enterprise-grade performance without managing servers.
Architecture highlights: a fully managed, API-first service with automatic index management and scaling, metadata filtering alongside vector search, multiple deployment regions, and built-in monitoring.
Best for: Teams that want to focus on application development rather than database administration, especially those building customer-facing applications that need reliable performance.
Weaviate is an open-source vector database that you can self-host or use as a managed service. It's designed as a complete AI-native database with built-in vectorization and advanced querying capabilities.
Architecture highlights: an open-source core you can run with Docker or Kubernetes or consume as a managed service (Weaviate Cloud Services), GraphQL and REST APIs, optional built-in vectorization modules, WHERE-clause filtering, and hybrid search that combines vector similarity with BM25 keyword matching.
Best for: Organizations that need full control over their data and infrastructure, or those requiring advanced querying capabilities and custom model integration.
pgvector extends PostgreSQL with vector similarity search capabilities. If you're already using PostgreSQL, this adds vector search without introducing a new database system.
Architecture highlights: a PostgreSQL extension that adds a vector column type, distance operators, and approximate indexes such as ivfflat, so vector search runs inside the same database as your relational data with full access to SQL joins, transactions, and existing tooling.
Best for: Teams already invested in PostgreSQL who want to add vector search capabilities to existing applications without architectural changes.
Let's get hands-on experience with each platform. We'll build a simple document search system to compare their capabilities.
First, install the required packages:
pip install pinecone-client weaviate-client psycopg2-binary pgvector numpy openai python-dotenv
Create a .env file for your API keys (we'll add these as we go):
# .env file
OPENAI_API_KEY=your_openai_api_key_here
PINECONE_API_KEY=your_pinecone_api_key_here
WEAVIATE_URL=your_weaviate_url_here
WEAVIATE_API_KEY=your_weaviate_api_key_here
DATABASE_URL=postgresql://username:password@localhost:5432/vectordb
Here's our sample dataset — customer support articles that we'll use across all three platforms:
# sample_data.py
import openai
import os
from dotenv import load_dotenv
load_dotenv()
openai.api_key = os.getenv('OPENAI_API_KEY')
# Sample support articles
SUPPORT_ARTICLES = [
{
"id": "art_001",
"title": "How to Reset Your Password",
"content": "If you've forgotten your password, click the 'Forgot Password' link on the login page. Enter your email address and we'll send you a reset link.",
"category": "authentication",
"tags": ["password", "login", "security"]
},
{
"id": "art_002",
"title": "Troubleshooting Payment Issues",
"content": "Payment problems can occur due to expired cards, insufficient funds, or bank security measures. Check your payment method and try again.",
"category": "billing",
"tags": ["payment", "billing", "credit card"]
},
{
"id": "art_003",
"title": "Managing Your Account Settings",
"content": "Access your account settings by clicking your profile icon. Here you can update personal information, notification preferences, and privacy settings.",
"category": "account",
"tags": ["settings", "profile", "account"]
},
{
"id": "art_004",
"title": "Understanding Your Monthly Bill",
"content": "Your monthly bill includes subscription fees, usage charges, and applicable taxes. Download detailed billing statements from your account dashboard.",
"category": "billing",
"tags": ["billing", "charges", "invoice"]
}
]
def get_embedding(text):
"""Generate embeddings using OpenAI's API"""
response = openai.Embedding.create(
model="text-embedding-ada-002",
input=text
)
return response['data'][0]['embedding']
def prepare_articles_with_embeddings():
"""Add embeddings to our sample articles"""
articles_with_embeddings = []
for article in SUPPORT_ARTICLES:
# Combine title and content for embedding
text_to_embed = f"{article['title']} {article['content']}"
embedding = get_embedding(text_to_embed)
articles_with_embeddings.append({
**article,
"embedding": embedding,
"text_to_embed": text_to_embed
})
return articles_with_embeddings
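Before loading anything into a database, it helps to run this module once and confirm the embeddings come back with the expected shape (this assumes OPENAI_API_KEY is set in your .env file):
if __name__ == "__main__":
    articles = prepare_articles_with_embeddings()
    print(f"Embedded {len(articles)} articles")
    print(f"Vector length: {len(articles[0]['embedding'])}")  # expect 1536 for text-embedding-ada-002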
Let's start with Pinecone since it's the most straightforward to set up. First, create a free account at pinecone.io and get your API key.
# pinecone_example.py
import pinecone
import openai
import os
from dotenv import load_dotenv
from sample_data import prepare_articles_with_embeddings, get_embedding
load_dotenv()
# Initialize Pinecone
pinecone.init(
api_key=os.getenv('PINECONE_API_KEY'),
environment='us-west1-gcp-free' # Use your environment
)
class PineconeVectorDB:
def __init__(self, index_name="support-articles"):
self.index_name = index_name
self.dimension = 1536 # OpenAI embedding dimension
self.metric = "cosine"
# Create index if it doesn't exist
if index_name not in pinecone.list_indexes():
pinecone.create_index(
name=index_name,
dimension=self.dimension,
metric=self.metric
)
self.index = pinecone.Index(index_name)
def upsert_articles(self, articles):
"""Insert or update articles in the index"""
vectors_to_upsert = []
for article in articles:
vectors_to_upsert.append({
"id": article["id"],
"values": article["embedding"],
"metadata": {
"title": article["title"],
"content": article["content"],
"category": article["category"],
"tags": ",".join(article["tags"])
}
})
# Upsert all vectors in a single call (batch this for larger datasets)
self.index.upsert(vectors=vectors_to_upsert)
print(f"Upserted {len(vectors_to_upsert)} articles to Pinecone")
def search(self, query_text, top_k=3, filter_dict=None):
"""Search for similar articles"""
query_embedding = get_embedding(query_text)
results = self.index.query(
vector=query_embedding,
top_k=top_k,
include_metadata=True,
filter=filter_dict
)
return results["matches"]
def search_with_category_filter(self, query_text, category, top_k=3):
"""Search within a specific category"""
return self.search(
query_text,
top_k=top_k,
filter_dict={"category": {"$eq": category}}
)
# Usage example
if __name__ == "__main__":
# Setup
db = PineconeVectorDB()
articles = prepare_articles_with_embeddings()
# Insert articles
db.upsert_articles(articles)
# Test searches
print("=== General Search ===")
results = db.search("I can't log into my account")
for match in results:
print(f"Score: {match.score:.3f}")
print(f"Title: {match.metadata['title']}")
print(f"Category: {match.metadata['category']}")
print()
print("=== Category-Filtered Search ===")
results = db.search_with_category_filter("monthly charges", "billing")
for match in results:
print(f"Score: {match.score:.3f}")
print(f"Title: {match.metadata['title']}")
print()
Pinecone's strength lies in its simplicity and reliability. Notice how the metadata filtering works — you can combine semantic search with traditional filtering to narrow down results by category, date ranges, or any other structured data.
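For example, Pinecone's filter language supports operators such as $eq and $in, so you can scope a semantic query to several categories at once. A minimal sketch using the PineconeVectorDB class above (the query text is just an example):
db = PineconeVectorDB()

results = db.search(
    "I was charged after changing my plan",
    top_k=3,
    filter_dict={"category": {"$in": ["billing", "account"]}}  # match either category
)
for match in results:
    print(match.metadata["title"], round(match.score, 3))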
Weaviate offers more advanced querying capabilities through its GraphQL interface. You can use Weaviate Cloud Services (WCS) for a managed instance or run it locally with Docker.
For local setup:
docker run -d \
--name weaviate \
-p 8080:8080 \
-e QUERY_DEFAULTS_LIMIT=25 \
-e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
-e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
semitechnologies/weaviate:latest
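Before running the example below, it's worth confirming the local instance is reachable; a minimal check with the weaviate-client library:
import weaviate

client = weaviate.Client("http://localhost:8080")
print(client.is_ready())  # True once the container has finished starting up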
# weaviate_example.py
import weaviate
import openai
import os
from dotenv import load_dotenv
from sample_data import prepare_articles_with_embeddings, get_embedding
load_dotenv()
class WeaviateVectorDB:
def __init__(self, url="http://localhost:8080"):
self.client = weaviate.Client(url=url)
self.class_name = "SupportArticle"
# Create schema if it doesn't exist
self._create_schema()
def _create_schema(self):
"""Create the schema for our articles"""
# Check if class exists
try:
self.client.schema.get(self.class_name)
return # Schema already exists
except Exception:
pass  # class doesn't exist yet; create it below
schema = {
"class": self.class_name,
"description": "Support articles for customer service",
"vectorizer": "none", # We'll provide our own vectors
"properties": [
{
"name": "title",
"dataType": ["string"],
"description": "Article title"
},
{
"name": "content",
"dataType": ["text"],
"description": "Article content"
},
{
"name": "category",
"dataType": ["string"],
"description": "Article category"
},
{
"name": "tags",
"dataType": ["string[]"],
"description": "Article tags"
},
{
"name": "articleId",
"dataType": ["string"],
"description": "Unique article identifier"
}
]
}
self.client.schema.create_class(schema)
print(f"Created schema for {self.class_name}")
def insert_articles(self, articles):
"""Insert articles with their vectors"""
with self.client.batch as batch:
for article in articles:
properties = {
"title": article["title"],
"content": article["content"],
"category": article["category"],
"tags": article["tags"],
"articleId": article["id"]
}
batch.add_data_object(
data_object=properties,
class_name=self.class_name,
vector=article["embedding"]
)
print(f"Inserted {len(articles)} articles into Weaviate")
def search(self, query_text, top_k=3):
"""Search for similar articles using GraphQL"""
query_embedding = get_embedding(query_text)
result = (
self.client.query
.get(self.class_name, ["title", "content", "category", "tags", "articleId"])
.with_near_vector({
"vector": query_embedding,
"certainty": 0.7 # Minimum similarity threshold
})
.with_limit(top_k)
.with_additional(["certainty", "distance"])
.do()
)
return result["data"]["Get"][self.class_name]
def search_with_where_filter(self, query_text, category, top_k=3):
"""Search with WHERE clause filtering"""
query_embedding = get_embedding(query_text)
result = (
self.client.query
.get(self.class_name, ["title", "content", "category", "tags"])
.with_where({
"path": ["category"],
"operator": "Equal",
"valueString": category
})
.with_near_vector({
"vector": query_embedding,
"certainty": 0.7
})
.with_limit(top_k)
.with_additional(["certainty"])
.do()
)
return result["data"]["Get"][self.class_name]
def hybrid_search(self, query_text, top_k=3):
"""Combine vector search with BM25 keyword search"""
result = (
self.client.query
.get(self.class_name, ["title", "content", "category"])
.with_hybrid(
query=query_text,
vector=get_embedding(query_text), # class uses vectorizer "none", so pass the query vector explicitly
alpha=0.7 # Weight between vector (1.0) and keyword (0.0) search
)
.with_limit(top_k)
.with_additional(["score"])
.do()
)
return result["data"]["Get"][self.class_name]
# Usage example
if __name__ == "__main__":
db = WeaviateVectorDB()
articles = prepare_articles_with_embeddings()
# Insert articles
db.insert_articles(articles)
print("=== Vector Search ===")
results = db.search("forgot my login information")
for result in results:
print(f"Certainty: {result['_additional']['certainty']:.3f}")
print(f"Title: {result['title']}")
print(f"Category: {result['category']}")
print()
print("=== Hybrid Search ===")
results = db.hybrid_search("billing payment issues")
for result in results:
print(f"Score: {result['_additional']['score']:.3f}")
print(f"Title: {result['title']}")
print()
Weaviate's GraphQL interface gives you considerable querying flexibility. The hybrid search feature is particularly useful: it combines semantic similarity with traditional keyword (BM25) matching, so a query benefits from both exact term matches and meaning-based retrieval.
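To see how alpha shifts that balance, you can call with_hybrid directly with different weights. A small sketch, assuming the WeaviateVectorDB instance and the get_embedding helper from above (the vector is passed explicitly because the class uses vectorizer "none"):
def hybrid_search_with_alpha(db, query_text, alpha, top_k=3):
    # alpha=0.0 leans on BM25 keyword matching, alpha=1.0 on pure vector similarity
    result = (
        db.client.query
        .get(db.class_name, ["title", "category"])
        .with_hybrid(query=query_text, vector=get_embedding(query_text), alpha=alpha)
        .with_limit(top_k)
        .with_additional(["score"])
        .do()
    )
    return result["data"]["Get"][db.class_name]

for alpha in (0.25, 0.75):
    titles = [r["title"] for r in hybrid_search_with_alpha(db, "invoice", alpha)]
    print(f"alpha={alpha}: {titles}")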
pgvector requires a PostgreSQL database with the pgvector extension installed. Here's how to set it up locally:
# Install PostgreSQL and pgvector (Ubuntu/Debian)
sudo apt-get install postgresql postgresql-contrib
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
# Create database and enable extension
sudo -u postgres createdb vectordb
sudo -u postgres psql -d vectordb -c "CREATE EXTENSION vector;"
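To confirm the extension is actually enabled before running the example, you can query pg_extension; a small sketch using the DATABASE_URL from the .env file:
import os
import psycopg2
from dotenv import load_dotenv

load_dotenv()
conn = psycopg2.connect(os.getenv("DATABASE_URL"))
with conn.cursor() as cur:
    cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector';")
    print(cur.fetchone())  # None means CREATE EXTENSION vector hasn't been run in this database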
# pgvector_example.py
import psycopg2
from pgvector.psycopg2 import register_vector # adapts numpy arrays to the pgvector vector type
import numpy as np
import os
from dotenv import load_dotenv
from sample_data import prepare_articles_with_embeddings, get_embedding
load_dotenv()
class PgVectorDB:
def __init__(self, database_url=None):
self.database_url = database_url or os.getenv('DATABASE_URL')
self.conn = psycopg2.connect(self.database_url)
register_vector(self.conn) # register the vector type on this connection
self.conn.autocommit = True
# Create table if it doesn't exist
self._create_table()
def _create_table(self):
"""Create the articles table with vector column"""
with self.conn.cursor() as cur:
cur.execute("""
CREATE TABLE IF NOT EXISTS support_articles (
id VARCHAR PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
category VARCHAR NOT NULL,
tags TEXT[],
embedding vector(1536), -- OpenAI embedding dimension
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
""")
# Create vector index for performance
cur.execute("""
CREATE INDEX IF NOT EXISTS support_articles_embedding_idx
ON support_articles
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
""")
print("Created support_articles table with vector index")
def insert_articles(self, articles):
"""Insert articles with their embeddings"""
with self.conn.cursor() as cur:
for article in articles:
cur.execute("""
INSERT INTO support_articles (id, title, content, category, tags, embedding)
VALUES (%s, %s, %s, %s, %s, %s)
ON CONFLICT (id) DO UPDATE SET
title = EXCLUDED.title,
content = EXCLUDED.content,
category = EXCLUDED.category,
tags = EXCLUDED.tags,
embedding = EXCLUDED.embedding;
""", (
article["id"],
article["title"],
article["content"],
article["category"],
article["tags"],
article["embedding"]
))
print(f"Inserted {len(articles)} articles into PostgreSQL")
def search(self, query_text, top_k=3):
"""Search for similar articles using cosine similarity"""
query_embedding = np.array(get_embedding(query_text)) # numpy array so pgvector can adapt it
with self.conn.cursor() as cur:
cur.execute("""
SELECT
id, title, content, category, tags,
1 - (embedding <=> %s) AS similarity
FROM support_articles
ORDER BY embedding <=> %s
LIMIT %s;
""", (query_embedding, query_embedding, top_k))
results = cur.fetchall()
# Convert to dictionary format
columns = ["id", "title", "content", "category", "tags", "similarity"]
return [dict(zip(columns, row)) for row in results]
def search_with_category_filter(self, query_text, category, top_k=3):
"""Search within a specific category"""
query_embedding = np.array(get_embedding(query_text))
with self.conn.cursor() as cur:
cur.execute("""
SELECT
id, title, content, category, tags,
1 - (embedding <=> %s) AS similarity
FROM support_articles
WHERE category = %s
ORDER BY embedding <=> %s
LIMIT %s;
""", (query_embedding, category, query_embedding, top_k))
results = cur.fetchall()
columns = ["id", "title", "content", "category", "tags", "similarity"]
return [dict(zip(columns, row)) for row in results]
def search_with_sql_conditions(self, query_text, min_similarity=0.8, top_k=3):
"""Advanced search with SQL conditions"""
query_embedding = np.array(get_embedding(query_text))
with self.conn.cursor() as cur:
cur.execute("""
SELECT
id, title, content, category, tags,
1 - (embedding <=> %s) AS similarity
FROM support_articles
WHERE 1 - (embedding <=> %s) >= %s
ORDER BY embedding <=> %s
LIMIT %s;
""", (query_embedding, query_embedding, min_similarity, query_embedding, top_k))
results = cur.fetchall()
columns = ["id", "title", "content", "category", "tags", "similarity"]
return [dict(zip(columns, row)) for row in results]
def get_statistics(self):
"""Get database statistics"""
with self.conn.cursor() as cur:
cur.execute("""
SELECT
COUNT(*) as total_articles,
COUNT(DISTINCT category) as unique_categories,
AVG(array_length(tags, 1)) as avg_tags_per_article
FROM support_articles;
""")
result = cur.fetchone()
return {
"total_articles": result[0],
"unique_categories": result[1],
"avg_tags_per_article": float(result[2]) if result[2] else 0
}
# Usage example
if __name__ == "__main__":
db = PgVectorDB()
articles = prepare_articles_with_embeddings()
# Insert articles
db.insert_articles(articles)
print("=== Basic Vector Search ===")
results = db.search("can't access my account")
for result in results:
print(f"Similarity: {result['similarity']:.3f}")
print(f"Title: {result['title']}")
print(f"Category: {result['category']}")
print()
print("=== High-Similarity Search ===")
results = db.search_with_sql_conditions("billing questions", min_similarity=0.7)
for result in results:
print(f"Similarity: {result['similarity']:.3f}")
print(f"Title: {result['title']}")
print()
print("=== Database Statistics ===")
stats = db.get_statistics()
print(f"Total articles: {stats['total_articles']}")
print(f"Unique categories: {stats['unique_categories']}")
print(f"Average tags per article: {stats['avg_tags_per_article']:.1f}")
pgvector's advantage is its tight integration with PostgreSQL. You can use standard SQL operations, joins with other tables, transactions, and all the PostgreSQL features you already know.
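Because the vectors live next to your relational data, you can join similarity results against other tables in a single query. A sketch, assuming the PgVectorDB instance from above and a hypothetical page_views table that tracks how often each article is read:
import numpy as np
from sample_data import get_embedding

def search_with_popularity(db, query_text, top_k=3):
    """Rank by similarity but also surface view counts from another table."""
    query_embedding = np.array(get_embedding(query_text))
    with db.conn.cursor() as cur:
        cur.execute("""
            SELECT a.title, a.category,
                   1 - (a.embedding <=> %s) AS similarity,
                   COALESCE(v.view_count, 0) AS view_count
            FROM support_articles a
            LEFT JOIN page_views v ON v.article_id = a.id  -- hypothetical table
            ORDER BY a.embedding <=> %s
            LIMIT %s;
        """, (query_embedding, query_embedding, top_k))
        return cur.fetchall()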
Let's build a complete comparison system that demonstrates the key differences between all three platforms. This exercise will help you understand the practical implications of choosing each technology.
# comparison_demo.py
import time
import statistics
from pinecone_example import PineconeVectorDB
from weaviate_example import WeaviateVectorDB
from pgvector_example import PgVectorDB
from sample_data import prepare_articles_with_embeddings
class VectorDBComparison:
def __init__(self):
self.databases = {
"Pinecone": PineconeVectorDB("comparison-test"),
"Weaviate": WeaviateVectorDB(),
"pgvector": PgVectorDB()
}
self.test_queries = [
"I forgot my password and can't log in",
"Why was I charged twice this month?",
"How do I change my notification settings?",
"My payment was declined",
"Where can I download my invoice?"
]
def setup_all_databases(self):
"""Initialize all databases with the same data"""
print("Setting up test data...")
articles = prepare_articles_with_embeddings()
for name, db in self.databases.items():
print(f"Setting up {name}...")
if name == "Pinecone":
db.upsert_articles(articles)
elif name == "Weaviate":
db.insert_articles(articles)
else: # pgvector
db.insert_articles(articles)
# Small delay to ensure consistency
time.sleep(1)
def benchmark_search_speed(self):
"""Compare search performance across platforms"""
print("\n=== Search Performance Benchmark ===")
results = {}
for db_name, db in self.databases.items():
times = []
for query in self.test_queries:
start_time = time.time()
# All three wrapper classes expose the same search() signature
db.search(query, top_k=3)
end_time = time.time()
times.append(end_time - start_time)
avg_time = statistics.mean(times)
std_time = statistics.stdev(times) if len(times) > 1 else 0
results[db_name] = {
"avg_time": avg_time,
"std_time": std_time,
"times": times
}
print(f"{db_name}: {avg_time:.3f}s ± {std_time:.3f}s")
return results
def compare_search_results(self):
"""Compare search result quality and consistency"""
print("\n=== Search Result Comparison ===")
test_query = "I need help with billing issues"
for db_name, db in self.databases.items():
print(f"\n{db_name} Results:")
if db_name == "Pinecone":
results = db.search(test_query, top_k=3)
for i, match in enumerate(results, 1):
print(f"{i}. {match.metadata['title']} (Score: {match.score:.3f})")
elif db_name == "Weaviate":
results = db.search(test_query, top_k=3)
for i, result in enumerate(results, 1):
certainty = result.get('_additional', {}).get('certainty', 0)
print(f"{i}. {result['title']} (Certainty: {certainty:.3f})")
else: # pgvector
results = db.search(test_query, top_k=3)
for i, result in enumerate(results, 1):
print(f"{i}. {result['title']} (Similarity: {result['similarity']:.3f})")
def test_filtering_capabilities(self):
"""Test metadata filtering across platforms"""
print("\n=== Filtering Capabilities ===")
query = "payment problem"
category = "billing"
for db_name, db in self.databases.items():
print(f"\n{db_name} - Billing Category Filter:")
if db_name == "Pinecone":
results = db.search_with_category_filter(query, category, top_k=2)
for match in results:
print(f" • {match.metadata['title']}")
elif db_name == "Weaviate":
results = db.search_with_where_filter(query, category, top_k=2)
for result in results:
print(f" • {result['title']}")
else: # pgvector
results = db.search_with_category_filter(query, category, top_k=2)
for result in results:
print(f" • {result['title']}")
def analyze_advanced_features(self):
"""Demonstrate unique features of each platform"""
print("\n=== Advanced Features Demonstration ===")
# Weaviate hybrid search
print("Weaviate Hybrid Search (combines vector + keyword):")
weaviate_db = self.databases["Weaviate"]
hybrid_results = weaviate_db.hybrid_search("password reset", top_k=2)
for result in hybrid_results:
score = result.get('_additional', {}).get('score', 0)
print(f" • {result['title']} (Hybrid Score: {score:.3f})")
# pgvector SQL flexibility
print("\nPgvector SQL Analysis:")
pgvector_db = self.databases["pgvector"]
stats = pgvector_db.get_statistics()
print(f" • Database contains {stats['total_articles']} articles")
print(f" • Across {stats['unique_categories']} categories")
# Pinecone namespace demonstration (would require setup)
print("\nPinecone Features:")
print(" • Fully managed scaling")
print(" • Multiple deployment regions")
print(" • Built-in monitoring and analytics")
# Run the comparison
if __name__ == "__main__":
comparison = VectorDBComparison()
# Setup all databases
comparison.setup_all_databases()
# Run comparisons
comparison.benchmark_search_speed()
comparison.compare_search_results()
comparison.test_filtering_capabilities()
comparison.analyze_advanced_features()
Run this comparison to see how each database performs with your specific use case. Pay attention to the search speed, result consistency, and how easy it is to implement filtering.
Pinecone typically offers the fastest query times due to its purpose-built infrastructure and optimized indexing algorithms. Expect sub-100ms response times for most queries.
Weaviate performance depends on your deployment method. The managed service offers consistent performance, while self-hosted instances can be tuned for your specific workload.
pgvector performance varies significantly based on your PostgreSQL configuration and hardware. The ivfflat index provides good performance for most use cases, but may be slower than specialized vector databases for large datasets.
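One pgvector-specific knob worth knowing: with an ivfflat index you can trade speed for recall per session by raising ivfflat.probes. A minimal sketch, assuming the PgVectorDB connection and a query embedding prepared as in the earlier example:
query_embedding = np.array(get_embedding("refund for duplicate charge"))

with db.conn.cursor() as cur:
    cur.execute("SET ivfflat.probes = 10;")  # default is 1; higher values give better recall but slower queries
    cur.execute("""
        SELECT title, 1 - (embedding <=> %s) AS similarity
        FROM support_articles
        ORDER BY embedding <=> %s
        LIMIT 3;
    """, (query_embedding, query_embedding))
    print(cur.fetchall())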
Each platform scales differently:
# Scaling characteristics example
scaling_profiles = {
"Pinecone": {
"horizontal_scaling": "Automatic",
"max_vectors": "Billions",
"index_management": "Managed",
"cost_model": "Pay per query + storage"
},
"Weaviate": {
"horizontal_scaling": "Manual sharding",
"max_vectors": "Depends on cluster size",
"index_management": "Self-managed",
"cost_model": "Infrastructure costs"
},
"pgvector": {
"horizontal_scaling": "PostgreSQL limitations",
"max_vectors": "Millions per instance",
"index_management": "Manual optimization",
"cost_model": "Database hosting costs"
}
}
Let's break down the real costs for a typical application serving 1 million vectors with 10,000 queries per day:
Pinecone: ~$70/month for a p1.x1 pod, plus query costs
Weaviate Cloud: ~$25/month for the basic tier, scaling with usage
pgvector: ~$20-100/month depending on your PostgreSQL hosting provider
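A quick back-of-envelope calculation helps sanity-check those quotes; this only estimates raw vector storage, and indexes plus metadata will add to it:
vectors = 1_000_000
dimensions = 1536
bytes_per_float = 4  # float32

raw_gb = vectors * dimensions * bytes_per_float / (1024 ** 3)
print(f"Raw vector data: ~{raw_gb:.1f} GB")  # about 5.7 GB before indexes and metadata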
Remember that costs extend beyond the database itself. Consider development time, operational overhead, and integration complexity when making your decision.
The most common error is trying to insert vectors with the wrong dimensions:
# Wrong - mixing embedding models
index.upsert([{
"id": "test",
"values": openai_embedding, # 1536 dimensions
}])
# Later...
index.upsert([{
"id": "test2",
"values": sentence_transformer_embedding, # 768 dimensions - ERROR!
}])
Solution: Always verify your embedding dimensions match your index configuration:
def validate_embedding_dimension(embedding, expected_dim=1536):
if len(embedding) != expected_dim:
raise ValueError(f"Expected {expected_dim} dimensions, got {len(embedding)}")
return embedding
Vector searches can become slow with poor indexing strategies:
-- Inefficient: pgvector table with no vector index
CREATE TABLE articles (
id VARCHAR PRIMARY KEY,
embedding vector(1536) -- No index!
);
-- Efficient: with a proper vector index
CREATE INDEX articles_embedding_idx
ON articles
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
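If you're on pgvector 0.5.0 or newer, an HNSW index is an alternative to ivfflat that generally gives better recall at the cost of slower builds and higher memory use; a sketch, assuming a psycopg2 connection like the one used earlier:
with conn.cursor() as cur:
    # HNSW index (pgvector >= 0.5.0): no "lists" parameter to tune, slower to build than ivfflat
    cur.execute("""
        CREATE INDEX IF NOT EXISTS articles_embedding_hnsw_idx
        ON articles
        USING hnsw (embedding vector_cosine_ops);
    """)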
Overly restrictive filters can return no results even when similar vectors exist:
# Too restrictive
results = pinecone_index.query(
vector=query_vector,
top_k=10,
filter={"category": "billing", "tags": "urgent", "date": "2024-01-01"} # Too specific
)
# Better approach - hierarchical filtering
results = pinecone_index.query(
vector=query_vector,
filter={"category": "billing"}, # Start broad
top_k=10
)
# Then apply additional filtering in application code
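One way to apply that second-stage filtering in application code is to over-fetch and then narrow the matches yourself; a sketch, assuming Pinecone results with the comma-joined tags metadata used earlier (the "urgent" tag is just illustrative):
results = pinecone_index.query(
    vector=query_vector,
    filter={"category": "billing"},  # keep the database-side filter broad
    top_k=10,
    include_metadata=True
)

# Narrow further in Python instead of pushing every condition into the filter
urgent_matches = [
    m for m in results["matches"]
    if "urgent" in m.metadata.get("tags", "")
]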
Vector databases can experience connection issues during high load:
import time
from typing import List, Dict, Any
def robust_vector_search(db, query_vector: List[float], retries: int = 3) -> Dict[str, Any]:
"""Implement retry logic for vector searches"""
for attempt in range(retries):
try:
results = db.search(query_vector, top_k=5)
return results
except Exception as e:
if attempt == retries - 1:
raise e
print(f"Attempt {attempt + 1} failed: {e}")
time.sleep(2 ** attempt) # Exponential backoff
return {"matches": []}
Large vector datasets can consume significant memory:
# Bad - loading everything into memory
all_embeddings = []
for document in documents: # 1 million documents
embedding = get_embedding(document)
all_embeddings.append(embedding) # Memory explosion!
# Good - batch processing
def process_documents_in_batches(documents, batch_size=100):
for i in range(0, len(documents), batch_size):
batch = documents[i:i + batch_size]
embeddings = [get_embedding(doc) for doc in batch]
# Process batch immediately
db.insert_batch(embeddings)
# Clear memory
del embeddings
You now understand the core differences between the three major vector database options:
Choose Pinecone if you want a fully managed service with minimal operational overhead, so your team can focus on application development rather than database administration.
Choose Weaviate if you need full control over your data and infrastructure, advanced querying such as hybrid vector-plus-keyword search, or custom model integration.
Choose pgvector if you have an existing PostgreSQL deployment and want to add vector search to current applications without introducing a new database system.
The vector database landscape is evolving rapidly. As a next step, try extending the comparison script to your own documents and queries, experiment with larger datasets, and revisit your choice as these platforms add new capabilities.
Vector databases are becoming the foundation for AI-powered applications. Understanding their strengths and limitations will help you build more effective, scalable systems that truly understand the meaning behind your data.