Semantic search over your own content using vector embeddings, pgvector, and a similarity query layer.
March 31, 2026

Keyword search fails in ways that are obvious to users but hard to fix with traditional approaches. A post about "deploying Next.js to the cloud" won't appear when someone searches for "hosting a React app" — the words don't match even though the intent does. Semantic search using embeddings solves this by understanding meaning rather than matching strings.
This tutorial builds a semantic search layer for a blog using text embeddings, pgvector in PostgreSQL, and a similarity query in TypeScript.
An embedding model converts text into a high-dimensional vector — a list of floating-point numbers that represents the meaning of the text. Texts with similar meaning produce vectors that are close together in this space. "Deploying Next.js" and "hosting a React app" end up near each other; "baking bread" ends up far away.
To search semantically:
1. Embed every document once and store its vector.
2. Embed the user's query at search time.
3. Return the stored vectors closest to the query vector.
"Closest" is measured by cosine similarity or dot product. The results are the most semantically similar documents, regardless of word overlap.
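To make "closest" concrete, here is a minimal cosine-similarity sketch in plain TypeScript — no pgvector involved, and `cosineSimilarity` is our own helper name, not a library function:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). For arbitrary vectors the
// result lies in [-1, 1]; embedding models typically return unit-length
// vectors, so natural-language pairs land roughly in [0, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Same direction -> 1, orthogonal -> 0:
console.log(cosineSimilarity([1, 0], [2, 0])) // 1
console.log(cosineSimilarity([1, 0], [0, 1])) // 0
```

pgvector computes the same quantity database-side, so you never need this in production code — it is only here to show what the `<=>` results mean.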
pgvector is a PostgreSQL extension that adds a vector column type and vector similarity operators. If you're already using PostgreSQL (as this project is), it's the most practical option for adding vector search without a separate database.
Enable the extension:
CREATE EXTENSION IF NOT EXISTS vector;
If you're using Docker locally:
# Use pgvector's official image instead of plain postgres
docker pull pgvector/pgvector:pg16
In your docker-compose.yml:
services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: blogdb
      POSTGRES_USER: blog
      POSTGRES_PASSWORD: blogpass
    ports:
      - "5432:5432"
Add a table to store post embeddings alongside your posts:
CREATE TABLE post_embeddings (
  id SERIAL PRIMARY KEY,
  -- UNIQUE is required for the ON CONFLICT (post_id) upsert used later
  post_id INTEGER NOT NULL UNIQUE REFERENCES posts(id) ON DELETE CASCADE,
  content TEXT NOT NULL,
  embedding vector(1536), -- 1536 dimensions for text-embedding-3-small
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Index for fast similarity search
CREATE INDEX ON post_embeddings
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
The ivfflat index enables approximate nearest-neighbor search, which is fast enough for thousands of documents. Because ivfflat clusters the rows that exist at build time, create it after the table has data. For millions of rows, hnsw generally gives better recall, at the cost of slower index builds and more memory.
Use OpenAI's text-embedding-3-small model (1536 dimensions, cheap, fast):
import OpenAI from 'openai'
import { Pool } from 'pg'

const openai = new OpenAI()
const pool = new Pool({ connectionString: process.env.DATABASE_URI })

async function embedPost(postId: number, content: string) {
  // Generate embedding
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: content,
  })
  const embedding = response.data[0].embedding

  // Store in database -- pgvector accepts a '[1,2,3]' string literal
  await pool.query(
    `INSERT INTO post_embeddings (post_id, content, embedding)
     VALUES ($1, $2, $3)
     ON CONFLICT (post_id) DO UPDATE
     SET content = $2, embedding = $3, created_at = NOW()`,
    [postId, content, `[${embedding.join(',')}]`]
  )
}
What to embed for a blog post: title + description + the first ~2,000 characters of content. This gives the embedding enough signal without hitting token limits or wasting money on the full post body.
async function indexPost(post: { id: number; title: string; description: string; content: string }) {
  const textToEmbed = `${post.title}\n\n${post.description}\n\n${post.content.slice(0, 2000)}`
  await embedPost(post.id, textToEmbed)
}
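When backfilling an existing archive, note that the OpenAI embeddings endpoint accepts an array of inputs, so posts can be embedded in batches instead of one request each. A sketch under stated assumptions — `chunk`, `Embedder`, and `backfillEmbeddings` are our own names, and the `Embedder` function is where you would wrap the `openai.embeddings.create` call with an array `input`:

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// An Embedder turns a batch of texts into one vector per text.
type Embedder = (inputs: string[]) => Promise<number[][]>

// Backfill in batches: one embedder call per batch of at most `size` posts.
async function backfillEmbeddings(
  posts: { id: number; text: string }[],
  embed: Embedder,
  size = 100,
): Promise<Map<number, number[]>> {
  const vectorsByPost = new Map<number, number[]>()
  for (const batch of chunk(posts, size)) {
    const vectors = await embed(batch.map((p) => p.text))
    // vectors[i] corresponds to batch[i]
    batch.forEach((p, i) => vectorsByPost.set(p.id, vectors[i]))
  }
  return vectorsByPost
}
```

Injecting the embedder as a parameter keeps the batching logic testable without hitting the API; in production you would pass a function that calls `openai.embeddings.create` and stores each result as in embedPost above.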
With embeddings stored, similarity search is a single SQL query:
async function semanticSearch(query: string, limit = 10) {
  // Embed the search query
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query,
  })
  const queryEmbedding = response.data[0].embedding

  // Find similar posts
  const result = await pool.query(
    `SELECT
       p.id,
       p.title,
       p.slug,
       p.description,
       1 - (pe.embedding <=> $1) AS similarity
     FROM post_embeddings pe
     JOIN posts p ON p.id = pe.post_id
     WHERE p.status = 'published'
       AND 1 - (pe.embedding <=> $1) > 0.7 -- minimum similarity threshold
     ORDER BY pe.embedding <=> $1
     LIMIT $2`,
    [`[${queryEmbedding.join(',')}]`, limit]
  )
  return result.rows
}
The <=> operator computes cosine distance. Subtracting from 1 gives cosine similarity (higher = more similar). The > 0.7 threshold filters out weak matches — tune this based on your content.
Wire it into a Next.js API route:
// app/api/search/route.ts
import { NextRequest } from 'next/server'

export async function GET(req: NextRequest) {
  const query = req.nextUrl.searchParams.get('q')
  if (!query || query.length < 3) {
    return Response.json({ results: [] })
  }
  try {
    const results = await semanticSearch(query, 8)
    return Response.json({ results })
  } catch (err) {
    console.error('Search error:', err)
    return Response.json({ results: [] }, { status: 500 })
  }
}
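Every request to this route embeds the query, which costs an API call. Real users repeat and re-type queries, so a small in-memory cache in front of the embedding call saves money and latency. A minimal sketch — `QueryCache` and the 500-entry limit are our own choices, not from any library, and `embedQuery` in the usage comment is a hypothetical wrapper around the OpenAI call:

```typescript
// Tiny FIFO cache: a Map preserves insertion order, so deleting the
// first key when full evicts the oldest entry.
class QueryCache<V> {
  private store = new Map<string, V>()
  constructor(private maxSize = 500) {}

  get(key: string): V | undefined {
    return this.store.get(key)
  }

  set(key: string, value: V): void {
    if (this.store.size >= this.maxSize && !this.store.has(key)) {
      const oldest = this.store.keys().next().value
      if (oldest !== undefined) this.store.delete(oldest)
    }
    this.store.set(key, value)
  }
}

const embeddingCache = new QueryCache<number[]>(500)

// Usage inside semanticSearch (embedQuery is hypothetical):
//   const cached = embeddingCache.get(query)
//   const vector = cached ?? await embedQuery(query)
//   if (!cached) embeddingCache.set(query, vector)
```

This only helps a single server process; for multiple instances you would move the cache to something shared like Redis.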
Pure semantic search sometimes misses exact title matches that users expect. Combine semantic search with full-text search using PostgreSQL's tsvector:
async function hybridSearch(query: string, limit = 10) {
  // Embed the search query
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query,
  })
  const embedding = response.data[0].embedding

  const result = await pool.query(
    `SELECT
       p.id, p.title, p.slug, p.description,
       -- Combine semantic and keyword scores
       (0.7 * (1 - (pe.embedding <=> $1)) +
        0.3 * ts_rank(to_tsvector('english', p.title || ' ' || p.description),
                      plainto_tsquery('english', $2))) AS score
     FROM post_embeddings pe
     JOIN posts p ON p.id = pe.post_id
     WHERE p.status = 'published'
     ORDER BY score DESC
     LIMIT $3`,
    [`[${embedding.join(',')}]`, query, limit]
  )
  return result.rows
}
The 70/30 weight split (semantic/keyword) is a reasonable starting point. Adjust based on your content and how users search.
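If you prefer to tune the blend in application code rather than in SQL, the weighting reduces to a one-line helper — `hybridScore` is our own name, and both inputs are assumed normalized to [0, 1] (note that raw ts_rank values are small but not strictly bounded by 1):

```typescript
// Weighted blend of a semantic similarity score and a keyword rank.
function hybridScore(semantic: number, keyword: number, semanticWeight = 0.7): number {
  return semanticWeight * semantic + (1 - semanticWeight) * keyword
}

// A perfect semantic match with zero keyword overlap still scores 0.7:
console.log(hybridScore(1, 0)) // 0.7
```

Pulling the weight into a parameter makes it easy to A/B different splits without rewriting the query.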
Index new posts automatically using a Payload hook:
// In your Posts collection config
afterChange: [
  async ({ doc }) => {
    if (doc.status === 'published') {
      // Same shape as indexPost: title + description + truncated body
      const content = `${doc.title}\n\n${doc.description}\n\n${plainTextFromLexical(doc.content).slice(0, 2000)}`
      await embedPost(doc.id, content).catch(console.error)
    }
  },
]
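The plainTextFromLexical helper used in the hook is not something Payload provides. A minimal sketch that walks a serialized Lexical editor state and concatenates its text nodes might look like this — the node shape is an assumption about Lexical's JSON format (text nodes carry a `text` field, element nodes carry `children`):

```typescript
// Assumed minimal shape of Lexical's serialized nodes.
interface LexicalNode {
  type?: string
  text?: string
  children?: LexicalNode[]
}

function plainTextFromLexical(state: { root: LexicalNode }): string {
  const parts: string[] = []
  const walk = (node: LexicalNode): void => {
    if (typeof node.text === 'string') parts.push(node.text)
    node.children?.forEach(walk) // recurse into paragraphs, headings, lists, etc.
  }
  walk(state.root)
  return parts.join(' ')
}
```

This loses formatting on purpose: the embedding only needs the words, not the markup.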
Finally, keep the minimum similarity threshold (> 0.7) in place to filter weak matches before returning results.