New PostgreSQL Extension pg_repack Optimizes Tables Without Downtime

Apps & Games / Desktop / New PostgreSQL Extension pg_repack Optimizes Tables Without Downtime
08 Jul 2024

PostgreSQL users frequently encounter the issue of database bloat, which can severely degrade performance. Traditional methods such as VACUUM FULL and CLUSTER can be quite disruptive, but pg_repack offers an online solution that minimizes downtime. Here, we delve into the installation and configuration of pg_repack on a PostgreSQL 14 setup running on Red Hat Enterprise Linux 9 (RHEL 9).

What is pg_repack?

pg_repack is a PostgreSQL extension designed to eliminate bloat from tables and indexes. Unlike CLUSTER and VACUUM FULL, pg_repack works online without requiring an exclusive lock on the tables being processed, making it a more efficient and less intrusive option.

Installation

  1. Download the RPM Package
    First, download the pg_repack RPM package for PostgreSQL 14 on RHEL 9:
    wget https://download.postgresql.org/pub/repos/yum/14/redhat/rhel-9-x86_64/pg_repack_14-1.5.0-1PGDG.rhel9.x86_64.rpm
  2. Install the RPM Package
    Next, install the downloaded RPM package:
    rpm -iv pg_repack_14-1.5.0-1PGDG.rhel9.x86_64.rpm
  3. Verify Installation
    Verify the installation by listing the files included in the package:
    rpm -ql pg_repack_14-1.5.0-1PGDG.rhel9.x86_64.rpm | grep bin
    Alternatively, use the find command to locate the pg_repack binary:
    find / -name pg_repack
  4. Update PATH Environment Variable
    Add the PostgreSQL bin directory to your PATH:
    export PATH=$PATH:/usr/pgsql-14/bin
    Make this change permanent by adding it to your ~/.bashrc file:
    echo 'export PATH=$PATH:/usr/pgsql-14/bin' >> ~/.bashrc
    source ~/.bashrc

Configuration

  1. Create the Extension
    Connect to your PostgreSQL database and create the pg_repack extension:
    psql -c "CREATE EXTENSION pg_repack" -d employee
  2. Verify pg_repack Availability
    To ensure pg_repack is available, list the PostgreSQL binaries:
    pg_re
    You should see pg_repack listed among the other PostgreSQL binaries:
    pg_receivewal   pg_recvlogical  pg_repack       pg_resetwal     pg_restore      pg_rewind
    If you don't see it, you may have forgotten to source your bashrc (source ~/.bashrc)

Usage

Basic Usage
To repack a specific database (e.g., employee), use the following command:

pg_repack -d employee

Advanced Options
pg_repack offers several options to customize its behavior. Here are some useful ones:

  • Repack all databases:
    pg_repack -a
  • Repack a specific table:
    pg_repack -t table_name -d employee
  • Repack tables in a specific schema:
    pg_repack -s schema_name -d employee
  • Move repacked tables to a new tablespace:
    pg_repack --tablespace=new_tablespace -d employee
  • Order by specific columns:
    pg_repack --order-by=column_name -d employee

For a complete list of options, refer to the pg_repack help:

pg_repack --help

Update: 08 Jul 2024