Tuesday, December 06, 2011

Postgresql replication

There are many solutions to postgresql replication available in the market. Almost all of them are third party solutions, since there was no inbuilt replication in postgresql. Postgresql 9.0 introduced replication into the database - it is also known as streaming replication.And it can be used only for master-slave replication. There is no master-master or clustering feature available with postgresql SR (streaming replication).

The way SR works is that there are log files (known as XLOG files) which are shipped to the standby or slave server via network. Multiple slave servers can connect to the master over the network. The stand by servers continuously replay the XLOG records shipped in continuous recovery mode.As soon as XLOG files are shipped, they are replayed on the slave. This makes latest data available on slave almost immediately. Log shipping does not interfere with any query execution on master. In case the primary goes offline, the standby server will wait for the primary to become active.

Here is how i did a test setup of master-slave replication using postgresql.

I had 2 machines 241 and 242. I downloaded postgresql-9.1.1.tar.bz2 on both.

Steps to setup replication :

1. untar, compile and install

241/242 ]#  tar -xvjf postgresql-9.1.1.tar.bz2
241/242 ]#  cd postgresql-9.1.1
241/242 postgresql-9.1.1]#  ./configure
241/242 postgresql-9.1.1]#  make
241/242 postgresql-9.1.1]#  sudo make install

This will install postgresql in /usr/local/pgsql folder

2. Setup 241 as master. Initialize the database cluster on 241

241 ]# adduser postgres
241 ]# mkdir /usr/local/pgsql/data
241 ]# chown postgres /usr/local/pgsql/data
241 ]# su - postgres
241 ]# /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

Do not start the postgres database server now.

3. configure master server to listen on all ip addresses.

241 ]# vim /usr/local/pgsql/data/postgresql.conf
  
    listen_addresses = '*'
  
4. Allow standby server to connect to postgresql on master with replication privilege

241 ]# vim /usr/local/pgsql/data/pg_hba.conf
  
    host   replication   postgres    192.168.1.242/22    trust
  
5. Setup replication related parameters in the master server

241 ]# vim /usr/local/pgsql/data/postgresql.conf

    # To enable read-only queries on a standby server, wal_level must be set to
    # "hot_standby". But you can choose "archive" if you never connect to the
    # server in standby mode.
    wal_level = hot_standby

    # Set the maximum number of concurrent connections from the standby servers.
    max_wal_senders = 5

    # To prevent the primary server from removing the WAL segments required for
    # the standby server before shipping them, set the minimum number of segments
    # retained in the pg_xlog directory. At least wal_keep_segments should be
    # larger than the number of segments generated between the beginning of
    # online-backup and the startup of streaming replication. If you enable WAL
    # archiving to an archive directory accessible from the standby, this may
    # not be necessary.
    wal_keep_segments = 128
  
    # Enable WAL archiving on the primary to an archive directory accessible from
    # the standby. If wal_keep_segments is a high enough number to retain the WAL
    # segments required for the standby server, this is not necessary.
    archive_mode    = on
    archive_command = 'cp %p /usr/local/pgsql/data/pg_archive/%f'

6. start postgresql on master

241 ]# /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &

7. copy the master server's data to standby server

241 ]# /usr/local/pgsql/bin/psql -c "SELECT pg_start_backup('label', true)"
 pg_start_backup
-----------------
 0/4000020
(1 row)

241 ]# rsync -a /usr/local/pgsql/data/ root@192.168.1.242:/usr/local/pgsql/data --exclude postmaster.pid

241 ]# /usr/local/pgsql/bin/psql -c "SELECT pg_stop_backup()"
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup
----------------
 0/40000D8
(1 row)

This will also copy all the configuration parameters and authentication related stuff from primary to standby slave.
Ensuring that the slave can be converted to a master/primary in case of a failover.

8. Change postgresql.conf to enable readonly queries on standby server

242 ]# vim /usr/local/pgsql/data/postgresql.conf

    hot_standby = on
  
9. Enable recovery on the standby server and change configuration.

242 ]# cp /usr/local/pgsql/share/recovery.conf.sample /usr/local/pgsql/data/recovery.conf
242 ]# vim /usr/local/pgsql/data/recovery.conf


    # Specifies whether to start the server as a standby. In streaming replication,
    # this parameter must to be set to on.
    standby_mode          = 'on'

    # Specifies a connection string which is used for the standby server to connect
    # with the primary.
    primary_conninfo      = 'host=192.168.1.241 port=5432 user=postgres'

    # Specifies a trigger file whose presence should cause streaming replication to
    # end (i.e., failover). Once the trigger file is found the server acts as a primary server.
    trigger_file = '/home/postgres/failover'

    # Specifies a command to load archive segments from the WAL archive. If
    # wal_keep_segments is a high enough number to retain the WAL segments
    # required for the standby server, this may not be necessary. But
    # a large workload can cause segments to be recycled before the standby
    # is fully synchronized, requiring you to start again from a new base backup.
    restore_command = 'cp /usr/local/pgsql/data/pg_archive/%f "%p"'

10. Start postgres on standby server. This will start streaming replication on the standby server.

242 ]# /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &

11. You can check the status of streaming replication using either the ps command or through psql - postgresql command prompt

241 (primary) ]# /usr/local/pgsql/bin/psql -c "SELECT pg_current_xlog_location()"
 pg_current_xlog_location
--------------------------
 0/5000EC0
(1 row)

242 (standby) ]# /usr/local/pgsql/bin/psql -c "select pg_last_xlog_receive_location()"
 pg_last_xlog_receive_location
-------------------------------
 0/5000EC0
(1 row)

242 (standby) ]$ /usr/local/pgsql/bin/psql -c "select pg_last_xlog_replay_location()"
 pg_last_xlog_replay_location
------------------------------
 0/5000EC0
(1 row)

To check using ps use the following commands

241 (master)]# ps ax | grep sender
 2728 ?        Ss     0:00 postgres: wal sender process postgres 192.168.1.242(54792) streaming 0/5000EC0
 2768 pts/1    R+     0:00 grep sender

242 (standby)]# ps ax| grep receiver
 28125 ?        Ss     0:00 postgres: wal receiver process   streaming 0/5000EC0
 28154 pts/1    S+     0:00 grep receiver


To do a failover, all that needs to be done is to create the 'trigger' file at the specified location. This will automatically turn off standby mode and the postgres server will start acting as a primary or master.

Do remember to use the "pg_ctl stop" command to stop either the primary or standby server. This will ensure graceful shutdown and no records will be missed being replicated.

In order to create another standby server repeat steps from 7 onwards - after adding the ip of the standby server in master configuration as in step 4