Here’s a quick manual on how to compile, install and set up PostgreSQL 9.5. I will also explain how to set up asynchronous replication between two nodes.

This has been tested on a 64-bit Debian Jessie.

Note: the binaries will be installed to /opt/postgres, and the scripts and data will be installed to /home/postgres. Feel free the change the paths from the script according to your needs.

Compiling and installing PostgreSQL

Install all needed dependencies:

apt-get install wget bzip2 g++ flex bison libreadline-dev zlib1g-dev make

Get the source archive from the official website, un-tar it, compile and install the beast:

tar xjvf postgresql-9.5.0.tar.bz2
cd postgresql-9.5.0
./configure --prefix=/opt/postgres && make install && cd contrib && make install

Then, create a user and its environment:

groupadd postgres ; useradd -g postgres -s /bin/bash -d /home/postgres postgres ; mkdir /home/postgres ; cp /etc/skel/.* /home/postgres ; chown -R postgres:postgres /home/postgres

Change to the postgres user:

su - postgres

Add the following lines to .bashrc (make sure all paths are correct):

export PG_HOME="/opt/postgres"
export PG_DATA="/home/postgres/data"
export PATH=$PG_HOME/bin:$PATH
export LD_LIBRARY_PATH=$PG_HOME/lib:$LD_LIBRARY_PATH

Create a file called ~/vars with the following (make sure all paths are correct):

export PG_DATA="/home/postgres/data"
export PG_WALPATH="/home/postgres/walarchives"
export PG_THISNODE="pg01"
export PG_OTHERNODE="pg02"

Configuring PostgreSQL

Make sure you’re still logged in as postgres, and initialize the database:

initdb -D $PG_DATA

This will create an empty database, as well as default configuration files. Here’s where you start tweaking things on your own, but here’s something to get you started; add the following lines to $PG_DATA/postgresql.conf:

listen_addresses = '*'
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d.log'
log_rotation_age = 1d

Starting and stopping PostgreSQL

You are now ready to use your PostgreSQL instance. Here’s how it goes:

pg_ctl -D $PG_DATA [start|stop|reload]

Setting up asynchronous replication

This set-up will assume you have two different servers with no shared filesystem between them. If you do happen to have a shared filesystem, you might want to change the ugly “scp” from archive_command.sh below to a simple “cp” (still ugly, but should be faster).

How replication works in PostgreSQL

Many find PG replication a dirty business to set-up. Granted, things look kind of messy when you start calling shell scripts from your database configuration files. But once you get the hang of it, it actually becomes simple, and even robust.

Bear in mind that a PG cluster is far more suitable in a dedicated infrastructure (like two servers on the same LAN in a datacenter with a SAN attached to them). It can still work in other infrastructure types, but you definitely won’t get the same performance out of it.

Basically, in a nominal state, a cluster will have a master node, which you can use to do SELECTs and INSERTs/UPDATEs/DELETEs, and the like… The slave node(s) will maintain a connection with the master node to replicate all operations made by the master. If you do anything on the master databases, it will be replicated to the slave databases.

This is the purpose of having “recovery.conf” in your $PG_DATA directory. When you boot up PostgreSQL, and it finds recovery.conf, it will start as a slave, and immediately connect to the master node to be kept updated of all changes that are made. Being a slave PG node also means the database will have to be created from a backup of the master node. This can be done on-the-fly by using the init_as_slave.sh script given below. You’ll notice that this step can be achieved with a simple command run from the slave node: pg_basebackup.

In case communication fails between the master and the slave, redo logs are created by the master node, and sent over to the slave node. These are called WAL archives. This is done by the “archive_command.sh” script below. You’ll notice that it uses the “scp” command to copy the WAL archive over to the slave. Make sure that the “postgres” user from the master node can SSH to the “postgres” user of the slave node. And the other way around.

Once the WAL archives have been assimilated by the slave node, it runs the pg_archivecleanup command, as declared in recovery.conf. WAL archives are then deleted from the slave. Make sure you monitor disk space, though: if the slave node fails, the WAL archives will keep accumulating and will eventually (and rather quickly) gobble up all your storage space.

And that’s pretty much all there is to understand about replication in PostgreSQL :)

Configuration

We’ll assume you’re setting up two replicated servers, one of which will be the master, the other one will be the slave. Start by setting up the slave server the same way you did with the master. Then, add the following to $PG_DATA/postgresql.conf on both servers (yes, both - also, make sure the paths are correct):

hot_standby = on
wal_level = hot_standby
archive_mode = on
archive_command = '/home/postgres/archive_command.sh "%p" "%f"'
archive_timeout = 60
max_wal_senders = 5

Execute the following script to create ~/archive_command.sh on both servers:

cat > ~/archive_command.sh<<EOF
#!/bin/sh
. ~/vars

if [ -z "\$PG_OTHERNODE" ]; then
  echo "PG_OTHERNODE undefined!"
  exit 1
fi
fullpath="\$1"
walfile="\$2"
scp "\$fullpath" \$PG_OTHERNODE:\$PG_WALPATH/\$walfile > /dev/null 2>&1
exit 0
EOF
chmod +x ~/archive_command.sh

Then, create ~/recovery.conf on both servers (make sure all paths are correct, and don’t forget to replace ‘other_node’ and ‘this_node_name’ with the appropriate values):

cat > ~/recovery.conf<<EOF
standby_mode = 'on'
primary_conninfo = 'host=other_node application_name=this_node_name'
trigger_file = '/tmp/pgtrigger'
restore_command = 'cp /home/postgres/walarchives/%f %p'
archive_cleanup_command = '/opt/postgres/bin/pg_archivecleanup /home/postgres/walarchives %r'
EOF

Add the following line to $PG_DATA/pg_hba.conf (again, on both servers):

host    replication     all     0.0.0.0/0               trust

Create the final utility script on both servers:

cat > ~/init_as_slave.sh<<EOF
#!/bin/sh

. ~/vars

if [ -z \$PG_DATA ]; then
  echo "PG_DATA undeclared!"
  exit 1
fi
if [ -z \$PG_OTHERNODE ]; then
  echo "PG_OTHERNODE undeclared!"
  exit 1
fi
if [ ! -f \$PG_DATA/postgresql.conf ]; then
  echo "Wrong directory, mate!"
  exit 1
fi
pg_ctl -D \$PG_DATA stop > /dev/null 2>&1
rm -rf \$PG_DATA/*
rm -rf \$HOME/walarchives/*
pg_basebackup -D \$PG_DATA --host=\$PG_OTHERNODE --port=5432
if [ \$? -eq 0 ]; then
  cp ~/*.conf \$PG_DATA/
else
  echo "Could not back up database!"
  exit 1
fi
EOF
chmod +x ~/init_as_slave.sh

Ok, so what are all these scripts for ?

You’ve probably guessed by now… For the moment, PostgreSQL bases all of its replication work on scripts and actions on the filesystem:

  • copying the redo logs from the master node to the slave node(s) is done by a script, archive_command.sh
  • automatic purging of the redo logs on the slave nodes is also done by a script: the archive_cleanup_command and restore_command settings defined in recovery.conf
  • promoting a node from slave to master is also done by creating “/tmp/pgtrigger” on the slave node; this is defined in recovery.conf as well

Usage

You can start the instance on the master node:

pg_ctl -D $PG_DATA start

And replication from the slave side can be initiated as follows:

~/init_as_slave.sh
pg_ctl -D $PG_DATA start

This concludes part 1/2

If everything went according to plan, you should have a working simple PostgreSQL cluster, where:

  • One node is a master, and can be used to read from and write to
  • The second node is a slave, and can only be used to read from it
  • Replication is asynchronous; the performance impact will be kept to a minimum. Also, if the slave node fails, the master won’t mind.

Now, we’ve paved the way for all the replication work, here. But keep in mind that all of this doesn’t make for a full-blown, usable cluster; your application will need to use some kind of PostgreSQL proxy in order to always be sure to connect to a master node. This can be done with PGPool, and will be covered in part 2 :)