Free lesson

Install pgvector and create vector-enabled tables

You enable the pgvector extension, design tables with the right vector dimension, and choose between SERIAL and UUID primary keys for distributed GenAI workloads.

~25 min read · Free to read — no subscription required.

Install and configure pgvector extension for vector storage in PostgreSQL

PostgreSQL has long served as the backbone of transactional workloads, but the rise of GenAI pipelines demands that your database also store and query high-dimensional vector embeddings efficiently. The pgvector extension bridges this gap by adding a native vector data type, distance operators for cosine similarity, L2 distance, and inner product, and indexing strategies like HNSW index and IVFFlat index — all within the same PostgreSQL instance that already manages your relational data. This section walks you through every step: from compiling and enabling pgvector, to creating vector-enabled tables, to validating your setup with real embedding dimensions used by production models such as OpenAI's text-embedding-ada-002 (1536 dimensions) and Google's Gemini embedding models (768 dimensions). By the end, you will have a running PostgreSQL instance that accepts vector embeddings alongside traditional columns, ready for similarity search and hybrid retrieval queries explored in later goals.

Introduction

When you add vector search to an existing PostgreSQL application, a separate vector database introduces a new operational boundary and breaks ACID guarantees across your relational and vector data — a user profile update and its regenerated embedding can no longer commit atomically. pgvector eliminates that trade-off by living inside PostgreSQL itself, giving you one connection string, one backup pipeline, and full transactional consistency between your rows and their embeddings. By the end of this lesson, you will be able to enable the pgvector extension, create a table with a properly dimensioned vector column, and validate your PostgreSQL memory configuration for vector workloads.

Key Terminology

Extension — A PostgreSQL plug-in that adds new data types, operators, index access methods, or functions without modifying the core engine. pgvector registers the vector type and distance operators through the CREATE EXTENSION mechanism.
Shared library — The compiled .so file (e.g., vector.so) that PostgreSQL loads into its backend process address space. The library must be present on disk before CREATE EXTENSION succeeds.
Control file — A metadata file (vector.control) that tells PostgreSQL the extension's default version, schema, and whether it is relocatable. This file lives in the SHAREDIR/extension/ directory.

Concepts

Installing pgvector

pgvector ships as a PostgreSQL extension written in C. On most Linux distributions, you install it via the system package manager. On Docker-based deployments — the standard for GenAI infrastructure — you either use the official pgvector/pgvector Docker image or add the extension to your existing PostgreSQL image.

For production environments, pin the pgvector version to avoid unexpected behavior changes across upgrades. At the time of writing, pgvector 0.7.x is the stable branch with full HNSW index support and improved memory management for large vector dimensions.

Dimension selection strategy

Choosing the right vector dimensions is not a free parameter — it is dictated by the embedding model you select and has direct consequences for storage, index size, and query latency.

Vector dimensions — The number of floating-point components in each embedding. Higher dimensions capture more semantic nuance but consume more storage (4 bytes per float × dimension count per row) and slow down distance calculations.
Cosine similarity — A distance metric that measures the angle between two vectors, ignoring magnitude. pgvector implements it via the <=> operator, returning values between 0 (identical direction) and 2 (opposite direction). Most semantic search applications use cosine similarity because embeddings from transformer models are not normalized to unit length by default.
L2 distance — Euclidean distance between two vectors. pgvector uses the <-> operator. Sensitive to vector magnitude, which makes it less suitable for raw transformer embeddings unless you normalize them before insertion.
Inner product — The dot product of two vectors, accessed via <#> in pgvector. Returns the negative inner product so that PostgreSQL's ORDER BY ... ASC convention returns the highest-similarity results first. Useful when embeddings are already normalized.

The table below summarizes common embedding models and their dimensions:

Model	Dimensions	Recommended metric
OpenAI `text-embedding-ada-002`	1536	Cosine similarity
OpenAI `text-embedding-3-small`	1536	Cosine similarity
OpenAI `text-embedding-3-large`	3072	Cosine similarity
Google Gemini `text-embedding-004`	768	Cosine similarity
Cohere embed-english-v3.0	1024	Cosine similarity

When you define a vector(N) column, PostgreSQL enforces that every inserted row contains exactly N components. Attempting to insert a vector with a different length raises a SQL error similar to ValueError (ERROR: expected N dimensions, not M). This strictness is a feature: it prevents silent corruption from model version mismatches.

Code Walkthrough

Now that you understand how the vector extension registers its shared library, distance operators, and index access methods through PostgreSQL's extension mechanism, you can wire up the installation and validation steps in Python.

The setup_pgvector_database function uses psycopg to enable pgvector and create a documents table with a vector column. verify_pgvector_installation then queries the pg_extension system catalog to confirm the extension loaded correctly.

Code snippetpython
1import psycopg
2from psycopg.rows import dict_row
3
4def setup_pgvector_database(conninfo: str, embedding_dim: int = 1536) -> dict:
5    """Enable pgvector and create the documents table."""
6    with psycopg.connect(conninfo, row_factory=dict_row) as conn:
7        conn.autocommit = True
8        with conn.cursor() as cur:
9            cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
10            cur.execute(f"""
11                CREATE TABLE IF NOT EXISTS documents (
12                    id          BIGSERIAL PRIMARY KEY,
13                    content     TEXT NOT NULL,
14                    embedding   vector({embedding_dim}),
15                    metadata    JSONB DEFAULT '{{}}'::jsonb,
16                    created_at  TIMESTAMPTZ DEFAULT now()
17                )
18            """)
19            cur.execute("""
20                SELECT column_name, udt_name
21                FROM information_schema.columns
22                WHERE table_name = 'documents'
23                  AND column_name = 'embedding'
24            """)
25            column_info = cur.fetchone()
26    return {"status": "ready", "embedding_dim": embedding_dim, "column_info": column_info}
27
28def verify_pgvector_installation(conninfo: str) -> dict | None:
29    """Return pgvector version info or None if not installed."""
30    with psycopg.connect(conninfo, row_factory=dict_row) as conn:
31        with conn.cursor() as cur:
32            cur.execute("""
33                SELECT extname, extversion
34                FROM pg_extension
35                WHERE extname = 'vector'
36            """)
37            row = cur.fetchone()
38            return {"name": row["extname"], "version": row["extversion"]} if row else None

autocommit is set to True because DDL statements — CREATE EXTENSION and CREATE TABLE — cannot run inside an implicit transaction block in some PostgreSQL configurations. The embedding_dim default of 1536 matches OpenAI's text-embedding-ada-002 output; adjust it to match whichever embedding model your application uses, since the vector(n) type enforces that every inserted row contains exactly n floating-point components. Once CREATE EXTENSION succeeds, the <=> cosine similarity operator and the <-> L2 distance operator are immediately available for queries.

PostgreSQL memory settings directly affect vector index build performance, so validate them before creating any HNSW or IVFFlat index. The function below reads maintenance_work_mem — the setting most critical for index builds — using current_setting():

Code snippetpython
1def check_maintenance_work_mem(conninfo: str) -> str:
2    """Return the current maintenance_work_mem setting."""
3    with psycopg.connect(conninfo) as conn:
4        with conn.cursor() as cur:
5            cur.execute("SELECT current_setting('maintenance_work_mem')")
6            return cur.fetchone()[0]

An undersized maintenance_work_mem causes HNSW builds to spill to disk and run orders of magnitude slower; the recommended minimum is 512 MB. If the returned value is smaller, update postgresql.conf or issue SET maintenance_work_mem = '512MB' in your session before building indexes.

Confirm that verify_pgvector_installation returns a non-None result and that check_maintenance_work_mem reports at least 512MB before proceeding to index creation.

Architecture overview

Before touching any configuration files, understand where pgvector fits in the broader data pipeline. Embeddings are generated by an external model API, transmitted to your application layer, and persisted into PostgreSQL alongside the source text and metadata. At query time, the same model encodes the user's query into a vector, and PostgreSQL uses its index to find the closest stored vectors.

Loading diagram...

The diagram highlights a critical architectural constraint: every vector you store and every vector you query must come from the same embedding model. Mixing text-embedding-ada-002 vectors (1536 vector dimensions) with Gemini vectors (768 vector dimensions) in the same column produces meaningless distance calculations because the geometric spaces are incompatible.

In a typical GenAI data pipeline, the application layer is a Python service that calls an embedding API to encode incoming documents, then opens a single PostgreSQL connection to write the vector and its source metadata in one transaction. The same service later issues read queries that ORDER BY a distance operator to retrieve the top-k matches for a user prompt. Co-locating the embedding store with the relational tables — as the diagram shows — keeps this round-trip on a single connection pool and avoids the dual-write consistency problems that arise when embeddings live in a separate vector database.

Do's and Don'ts

Do's

✓
Do pin the pgvector version in your Docker image or package manager — Extension upgrades can change index internals. A minor version bump should not silently alter your production index behavior. Use apt-get install postgresql-16-pgvector=0.7.0-1 or equivalent version pinning.
✓
Do set maintenance_work_mem to at least 512 MB before building indexes — HNSW index construction is memory-intensive. The default 64MB forces the build to spill temporary data to disk, inflating build time from minutes to hours on tables with more than 500K rows.
✓
Do use CREATE EXTENSION IF NOT EXISTS vector in migration scripts — The IF NOT EXISTS clause makes the migration idempotent, safe to re-run in CI/CD pipelines without failing on environments where pgvector is already enabled.

Don'ts

✗
Don't mix embeddings from different models in the same vector column — A 1536-dimensional vector from OpenAI and a 1536-dimensional vector from Cohere occupy the same geometric space syntactically but represent entirely different semantic spaces. Distance calculations between them are meaningless.
✗
Don't leave shared_buffers at the default 128MB for vector workloads — Vector queries scan large contiguous memory regions. Undersized shared buffers force PostgreSQL to read index pages from disk on every query, destroying latency.
✗
Don't use superuser credentials for application connections — Create a dedicated role with USAGE on the schema and SELECT, INSERT, UPDATE, DELETE on the tables. The CREATE EXTENSION command requires superuser or the pg_extension role, but that should run once in a migration — not on every application startup.

Keep going with GenAI Platform Engineering

Create a free account to track your progress and open this lesson in the full learning view. Subscribe to unlock the entire path — every goal, the hands-on labs, quizzes, and your verifiable skill graph — from . Cancel anytime.

Create a free account Subscribe — →