AI & LLMs · Embeddings · Semantic Search · pgvector · AI

Building AI-Powered Semantic Search with Embeddings

Semantic search over your own content using vector embeddings, pgvector, and a similarity query layer.

March 31, 2026


Keyword search fails in ways that are obvious to users but hard to fix with traditional approaches. A post about "deploying Next.js to the cloud" won't appear when someone searches for "hosting a React app" — the words don't match even though the intent does. Semantic search using embeddings solves this by understanding meaning rather than matching strings.

This tutorial builds a semantic search layer for a blog using text embeddings, pgvector in PostgreSQL, and a similarity query in TypeScript.

How Embeddings Work

An embedding model converts text into a high-dimensional vector — a list of floating-point numbers that represents the meaning of the text. Texts with similar meaning produce vectors that are close together in this space. "Deploying Next.js" and "hosting a React app" end up near each other; "baking bread" ends up far away.

To search semantically:

  1. Embed every document at index time and store the vectors
  2. At query time, embed the search query
  3. Find the documents whose vectors are closest to the query vector

"Closest" is measured by cosine similarity or dot product. The results are the most semantically similar documents, regardless of word overlap.
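Cosine similarity itself is only a few lines of arithmetic. A minimal sketch for intuition (pgvector computes this server-side, so you won't write this yourself in practice):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
// Most embedding models produce near-unit-length vectors, so this is
// close to a plain dot product.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

Identical vectors score 1, orthogonal (unrelated) vectors score 0, and opposite vectors score -1.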

Setting Up pgvector

pgvector is a PostgreSQL extension that adds a vector column type and vector similarity operators. If you're already using PostgreSQL (as this project is), it's the most practical option for adding vector search without a separate database.

Enable the extension:

CREATE EXTENSION IF NOT EXISTS vector;

If you're using Docker locally:

# Use pgvector's official image instead of plain postgres
docker pull pgvector/pgvector:pg16

In your docker-compose.yml:

services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: blogdb
      POSTGRES_USER: blog
      POSTGRES_PASSWORD: blogpass
    ports:
      - "5432:5432"

Creating the Embeddings Table

Add a table to store post embeddings alongside your posts:

CREATE TABLE post_embeddings (
  id          SERIAL PRIMARY KEY,
  -- UNIQUE so the upsert below can target ON CONFLICT (post_id)
  post_id     INTEGER NOT NULL UNIQUE REFERENCES posts(id) ON DELETE CASCADE,
  content     TEXT NOT NULL,
  embedding   vector(1536),  -- 1536 dimensions for text-embedding-3-small
  created_at  TIMESTAMPTZ DEFAULT NOW()
);

-- Index for fast similarity search
CREATE INDEX ON post_embeddings
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

The ivfflat index enables approximate nearest-neighbor search, which is fast enough for thousands of documents. For millions of rows, use hnsw instead.
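An hnsw index is created the same way; it has no training step, builds more slowly, and generally gives better recall and query speed. A sketch using pgvector's default parameters:

```sql
-- HNSW index: slower to build than ivfflat, better recall at query time
CREATE INDEX ON post_embeddings
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
```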

Generating Embeddings

Use OpenAI's text-embedding-3-small model (1536 dimensions, cheap, fast):

import OpenAI from 'openai'
import { Pool } from 'pg'

const openai = new OpenAI()
const pool = new Pool({ connectionString: process.env.DATABASE_URI })

async function embedPost(postId: number, content: string) {
  // Generate embedding
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: content,
  })
  const embedding = response.data[0].embedding

  // Store in database
  await pool.query(
    `INSERT INTO post_embeddings (post_id, content, embedding)
     VALUES ($1, $2, $3)
     ON CONFLICT (post_id) DO UPDATE
       SET content = $2, embedding = $3, created_at = NOW()`,
    [postId, content, `[${embedding.join(',')}]`]
  )
}

What to embed for a blog post: title + description + the first ~2,000 characters of content. This gives the embedding enough signal without hitting token limits or wasting money on the full post body.

async function indexPost(post: { id: number; title: string; description: string; content: string }) {
  const textToEmbed = `${post.title}\n\n${post.description}\n\n${post.content.slice(0, 2000)}`
  await embedPost(post.id, textToEmbed)
}

The Search Query

With embeddings stored, similarity search is a single SQL query:

async function semanticSearch(query: string, limit = 10) {
  // Embed the search query
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query,
  })
  const queryEmbedding = response.data[0].embedding

  // Find similar posts
  const result = await pool.query(
    `SELECT
       p.id,
       p.title,
       p.slug,
       p.description,
       1 - (pe.embedding <=> $1) AS similarity
     FROM post_embeddings pe
     JOIN posts p ON p.id = pe.post_id
     WHERE p.status = 'published'
       AND 1 - (pe.embedding <=> $1) > 0.7  -- minimum similarity threshold
     ORDER BY pe.embedding <=> $1
     LIMIT $2`,
    [`[${queryEmbedding.join(',')}]`, limit]
  )

  return result.rows
}

The <=> operator computes cosine distance. Subtracting from 1 gives cosine similarity (higher = more similar). The > 0.7 threshold filters out weak matches — tune this based on your content.
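Both the indexing and search code format the JavaScript array as a pgvector text literal inline. Pulling that into a small helper (a name introduced here, not part of pgvector) keeps the queries readable:

```typescript
// pgvector accepts vectors as text literals like '[0.1,0.2,0.3]'
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(',')}]`
}

// Usage in place of the inline template strings:
// pool.query('... ORDER BY pe.embedding <=> $1 LIMIT $2',
//   [toVectorLiteral(queryEmbedding), limit])
```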

Next.js API Route

Wire it into a Next.js API route:

// app/api/search/route.ts
import { NextRequest } from 'next/server'
import { semanticSearch } from '@/lib/search' // adjust to wherever semanticSearch lives

export async function GET(req: NextRequest) {
  const query = req.nextUrl.searchParams.get('q')
  if (!query || query.length < 3) {
    return Response.json({ results: [] })
  }

  try {
    const results = await semanticSearch(query, 8)
    return Response.json({ results })
  } catch (err) {
    console.error('Search error:', err)
    return Response.json({ results: [] }, { status: 500 })
  }
}
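On the client, a small helper can mirror the route's minimum-length check before making a request. A sketch; buildSearchUrl and searchPosts are illustrative names, not part of Next.js:

```typescript
// Mirror the API route's 3-character minimum on the client to avoid
// pointless round-trips for queries the server would reject anyway.
function buildSearchUrl(query: string): string | null {
  const trimmed = query.trim()
  if (trimmed.length < 3) return null
  return `/api/search?q=${encodeURIComponent(trimmed)}`
}

async function searchPosts(query: string): Promise<unknown[]> {
  const url = buildSearchUrl(query)
  if (!url) return []
  const res = await fetch(url)
  if (!res.ok) return []
  const { results } = await res.json()
  return results
}
```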

Hybrid Search

Pure semantic search sometimes misses exact title matches that users expect. Combine semantic search with full-text search using PostgreSQL's tsvector:

// Thin wrapper around the embeddings call shown earlier
async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  })
  return response.data[0].embedding
}

async function hybridSearch(query: string, limit = 10) {
  const embedding = await getEmbedding(query)

  const result = await pool.query(
    `SELECT
       p.id, p.title, p.slug, p.description,
       -- Combine semantic and keyword scores
       (0.7 * (1 - (pe.embedding <=> $1)) +
        0.3 * ts_rank(to_tsvector('english', p.title || ' ' || p.description),
                      plainto_tsquery('english', $2))) AS score
     FROM post_embeddings pe
     JOIN posts p ON p.id = pe.post_id
     WHERE p.status = 'published'
     ORDER BY score DESC
     LIMIT $3`,
    [`[${embedding.join(',')}]`, query, limit]
  )

  return result.rows
}

The 70/30 weight split (semantic/keyword) is a reasonable starting point. Adjust based on your content and how users search.

Keeping Embeddings Fresh

Index new posts automatically using a Payload hook:

// In your Posts collection config
hooks: {
  afterChange: [
    async ({ doc }) => {
      // Re-index whenever a published post is created or updated;
      // plainTextFromLexical is your helper for flattening rich text
      if (doc.status === 'published') {
        const content = `${doc.title}\n\n${doc.description}\n\n${plainTextFromLexical(doc.content)}`
        await embedPost(doc.id, content).catch(console.error)
      }
    },
  ],
}
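The hook covers new and updated posts, but existing posts need a one-time backfill. A sketch that batches requests to stay under the embedding API's rate limits; the function names and the batch size of 20 are arbitrary starting points, and the fetch/index functions stand in for the query and indexPost code from earlier:

```typescript
type PostRow = { id: number; title: string; description: string; content: string }

// Split an array into fixed-size batches
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

async function backfillEmbeddings(
  fetchPublishedPosts: () => Promise<PostRow[]>,
  indexPost: (post: PostRow) => Promise<void>,
  batchSize = 20,
  delayMs = 1000
) {
  const posts = await fetchPublishedPosts()
  for (const batch of chunk(posts, batchSize)) {
    // Embed one batch concurrently, then pause before the next
    await Promise.all(batch.map((post) => indexPost(post)))
    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }
}
```

Run it once after creating the table, then let the hook keep things current.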

Key Takeaways

  • Embeddings find semantic matches that keyword search misses — the same idea expressed differently
  • pgvector brings vector search into your existing PostgreSQL setup; no separate vector database needed
  • Embed title + description + first ~2000 chars; embedding the full post body is usually wasteful
  • Use a similarity threshold (> 0.7) to filter weak matches before returning results
  • Hybrid search (semantic + keyword) outperforms either approach alone for most use cases