Why I Married pg_migrator

Page content

With the introduction of PostgreSQL 8.4, Bruce Momjian, a significant core developer, contributed a tool that can actually upgrade an entire database cluster in place. The time required is essentially only that necessary to copy the data files from the old installation to the new one. On a quick RAID system, this can be an order of magnitude faster than a dump/restore. The main drawback is similar to Slony: disk space must effectively be doubled for this upgrade method.

This effectively reduces an upgrade to:

  1. Install a copy of the old version in a different location.
  2. Install a copy of the new version.
  3. Stop the postgresql service.
  4. Move the old data directory.
  5. Run initdb.
  6. Run pg_migrator.
  7. Start the postgresql service.

While it’s not quite this simple, most of the other steps are actually preparation work, and need only be done once. It’s not as simple as a dump/restore, but also not as fragile or complicated as a full database mirror. So what’s involved, exactly? Well, here’s a really high-level overview of what’s involved, assuming you have functional knowledge in RPM packaging, minor system administration skills, and a strong comfort level tweaking PostgreSQL config settings.

Setting Up

Using pg_migrator isn’t all fun and games. There are certain constraints that must be followed due to assumptions it makes about the data and the server compilation options. For the purposes of these samples, we’ll assume we’re migrating from 8.3 to 8.4.

Building New RPMs

The official binary packages supplied by Postgres are insufficient for our needs. For pg_migrator to work, we’ll need to build new ones. For full compatibility with patches and your chosen server OS, it’s best to obtain a source RPM from The YUM repository, install it to a local build environment, and alter the spec file. So go grab a source RPM and do this:

mkdir pgtemp
rpm2cpio postgresql-8.4*.src.rpm > pgtemp/pg.cpio
cd pgtemp
cpio -i < pg.cpio
rm pg.cpio
mv *.spec /your/home/redhat/SPECS
mv * /your/home/redhat/SOURCES

Now edit /your/home/redhat/SPECS/postgresql-8.4.spec and change the following:

%{!?intdatetimes:%define intdatetimes 1}

To:

%{!?intdatetimes:%define intdatetimes 0}

Which will disable integers from having datetimes assigned, which is apparently an insurmountable obstacle to the migration process. Now:

cd /your/home/redhat/SPECS
rpmbuild -ba postgresql-8.4.spec

When building is done, do not install the resulting RPMs. At least, not yet.

Installing Migration Source

The pg_migrator tool requires both old and new binaries to exist for the conversion process. But it’s clear both sets of RPMs can’t be installed simultaneously, and we’ve already seen with our last section, the official RPMs don’t set the right configure options. This is easy to fix. To avoid any confusion, and make it possible to discard the migration harness after we’re done, we’ll set up two installations.

Simply obtain a tarball from PostgreSQL’s source archive for the installed version, and the new version. In this case, the most recent branches of 8.3 and 8.4.

Now complete the following:

tar -xzf postgresql-8.3.8.tar.gz
cd postgresql-8.3.8
./configure --prefix=/opt/postgres-8.3 --disable-integer-datetimes
make -j2
make install
cd ..
tar -xzf postgresql-8.4.2.tar.gz
cd postgresql-8.4.2
./configure --prefix=/opt/postgres-8.4 --disable-integer-datetimes
make -j2
make install

The major benefit here is that this can be done far in advance of the actual upgrade.

Installing pg_migrator

Now pg_migrator itself must be installed. First, obtain the newest pg_migrator source from pgfoundry. The INSTALL file the source contains discusses all the necessary steps, but we’ll summarize here:

tar -xzf pg_migrator-*.tgz
cd pg_migrator-[version]
gmake USE_PGXS=1 PG_CONFIG=/opt/postgres-8.4/bin/pg_config install

This should install pg_migrator in the same area as the new version of your PostgreSQL migration directories. pg_migrator looks for its own binaries here, so this should complete our setup.

The Migration Process

Once everything is ready, upgrading is deceptively easy. pg_migrator has options to avoid duplicating disk space, but we don’t want to use that, because it saves space by creating symbolic links from the old data directory to the new one, and that’s definitely not what we want. That said, you should be able to follow these steps:

source /etc/sysconfig/pgsql/postgresql
sudo /sbin/service postgresql stop
sudo rpm -Uvh /your/home/redhat/RPMS/i386/postgresql-*8.4*.rpm
sudo -u postgres mv $PGDATA ${PGDATA}-old
sudo /sbin/service postgresql initdb
sudo cp -a ${PGDATA}-old/pg_hba.conf ${PGDATA}
sudo cp -a ${PGDATA}-old/postgresql.conf ${PGDATA}
sudo -u postgres pg_migrator -d ${PGDATA}-old -D ${PGDATA} \
     -b /opt/postgres-8.3/bin -B /opt/postgres-8.4/bin

Carefully watch the pg_migrator output for errors. If any occur, attempt to resolve them and try again. Specifically, you may need to alter the pg_hba.conf file and remove any references to ‘sameuser’ as this parameter has been deprecated in newer releases of PostgreSQL. In addition, the max_fsm_pages and max_fsm_relations parameters of postgresql.conf have been deprecated, and will prevent the migration from succeeding.

Assuming everything worked properly, finish with the following:

sudo /sbin/service postgresql start
sudo -u postgres vacuumdb -az

Database queries may get invalid query plans and act oddly until the vacuum analyze completes, but the data will be available under the new version.

Congratulations, you’ve upgraded PostgreSQL without a full dump/restore!