Postgresql replication

There are many replication solutions available for postgresql. Almost all of them are third-party solutions, since postgresql had no built-in replication for a long time. Postgresql 9.0 introduced replication into the database itself; it is known as streaming replication (SR). It can be used only for master-slave replication. There is no master-master or clustering feature available with postgresql SR.

The way SR works is that write-ahead log files (known as XLOG files) are shipped from the master to the standby (slave) server over the network. Multiple slave servers can connect to the master. The standby servers run in continuous recovery mode and replay the XLOG records as soon as they are shipped, which makes the latest data available on the slave almost immediately. Log shipping does not interfere with query execution on the master. In case the primary goes offline, the standby server simply waits for the primary to become active again.

Here is how I did a test setup of master-slave replication using postgresql.

I had two machines, 241 and 242, and downloaded postgresql-9.1.1.tar.bz2 on both.

Steps to set up replication:

1. Untar, compile and install

241/242 ]#  tar -xvjf postgresql-9.1.1.tar.bz2
241/242 ]#  cd postgresql-9.1.1
241/242 postgresql-9.1.1]#  ./configure
241/242 postgresql-9.1.1]#  make
241/242 postgresql-9.1.1]#  sudo make install

This will install postgresql in the /usr/local/pgsql folder.

2. Set up 241 as the master. Initialize the database cluster on 241.

241 ]# adduser postgres
241 ]# mkdir /usr/local/pgsql/data
241 ]# chown postgres /usr/local/pgsql/data
241 ]# su - postgres
241 ]# /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

Do not start the postgres database server now.

3. Configure the master server to listen on all IP addresses.

241 ]# vim /usr/local/pgsql/data/postgresql.conf
  
    listen_addresses = '*'
  
4. Allow the standby server to connect to postgresql on the master with replication privileges

241 ]# vim /usr/local/pgsql/data/pg_hba.conf
  
    host   replication   postgres    192.168.1.242/22    trust
  
5. Set up replication-related parameters on the master server

241 ]# vim /usr/local/pgsql/data/postgresql.conf

    # To enable read-only queries on a standby server, wal_level must be set to
    # "hot_standby". But you can choose "archive" if you never connect to the
    # server in standby mode.
    wal_level = hot_standby

    # Set the maximum number of concurrent connections from the standby servers.
    max_wal_senders = 5

    # To prevent the primary server from removing the WAL segments required for
    # the standby server before shipping them, set the minimum number of segments
    # retained in the pg_xlog directory. At least wal_keep_segments should be
    # larger than the number of segments generated between the beginning of
    # online-backup and the startup of streaming replication. If you enable WAL
    # archiving to an archive directory accessible from the standby, this may
    # not be necessary.
    wal_keep_segments = 128
  
    # Enable WAL archiving on the primary to an archive directory accessible from
    # the standby. If wal_keep_segments is a high enough number to retain the WAL
    # segments required for the standby server, this is not necessary.
    archive_mode    = on
    archive_command = 'cp %p /usr/local/pgsql/data/pg_archive/%f'

6. Start postgresql on the master

241 ]# /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &

7. Copy the master server's data to the standby server

241 ]# /usr/local/pgsql/bin/psql -c "SELECT pg_start_backup('label', true)"
 pg_start_backup
------------------
 0/4000020
(1 row)

241 ]# rsync -a /usr/local/pgsql/data/ root@192.168.1.242:/usr/local/pgsql/data --exclude postmaster.pid

241 ]# /usr/local/pgsql/bin/psql -c "SELECT pg_stop_backup()"
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup
----------------
 0/40000D8
(1 row)

This also copies all the configuration parameters and authentication-related settings from the primary to the standby, ensuring that the slave can be converted to a master/primary in case of a failover.

8. Change postgresql.conf to enable read-only queries on the standby server

242 ]# vim /usr/local/pgsql/data/postgresql.conf

    hot_standby = on
  
9. Enable recovery on the standby server and change configuration.

242 ]# cp /usr/local/pgsql/share/recovery.conf.sample /usr/local/pgsql/data/recovery.conf
242 ]# vim /usr/local/pgsql/data/recovery.conf

    # Specifies whether to start the server as a standby. In streaming replication,
    # this parameter must be set to on.
    standby_mode          = 'on'

    # Specifies a connection string which is used for the standby server to connect
    # with the primary.
    primary_conninfo      = 'host=192.168.1.241 port=5432 user=postgres'

    # Specifies a trigger file whose presence should cause streaming replication to
    # end (i.e., failover). Once the trigger file is found the server acts as a primary server.
    trigger_file = '/home/postgres/failover'

    # Specifies a command to load archive segments from the WAL archive. If
    # wal_keep_segments is a high enough number to retain the WAL segments
    # required for the standby server, this may not be necessary. But
    # a large workload can cause segments to be recycled before the standby
    # is fully synchronized, requiring you to start again from a new base backup.
    restore_command = 'cp /usr/local/pgsql/data/pg_archive/%f "%p"'

10. Start postgres on the standby server. This will start streaming replication.

242 ]# /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &

11. You can check the status of streaming replication using either the ps command or psql, the postgresql command prompt.

241 (primary) ]# /usr/local/pgsql/bin/psql -c "SELECT pg_current_xlog_location()"
 pg_current_xlog_location
--------------------------
 0/5000EC0
(1 row)

242 (standby) ]# /usr/local/pgsql/bin/psql -c "select pg_last_xlog_receive_location()"
 pg_last_xlog_receive_location
-------------------------------
 0/5000EC0
(1 row)

242 (standby) ]$ /usr/local/pgsql/bin/psql -c "select pg_last_xlog_replay_location()"
 pg_last_xlog_replay_location
------------------------------
 0/5000EC0
(1 row)

To check using ps, use the following commands:

241 (master)]# ps ax | grep sender
 2728 ?        Ss     0:00 postgres: wal sender process postgres 192.168.1.242(54792) streaming 0/5000EC0
 2768 pts/1    R+     0:00 grep sender

242 (standby)]# ps ax| grep receiver
 28125 ?        Ss     0:00 postgres: wal receiver process   streaming 0/5000EC0
 28154 pts/1    S+     0:00 grep receiver

To do a failover, all that needs to be done is to create the 'trigger' file at the specified location. This will automatically turn off standby mode and the postgres server will start acting as a primary or master.
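
With the recovery.conf above, that just means creating the file on the standby:

242 ]# touch /home/postgres/failover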

Do remember to use the "pg_ctl stop" command to stop either the primary or standby server. This ensures a graceful shutdown, so that no records are missed from replication.
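
For example, to stop the master gracefully with the data directory used in this setup:

241 ]# /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data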

In order to create another standby server, repeat the steps from step 7 onwards, after adding the IP of the new standby server to the master configuration as in step 4 (an example entry is shown below).
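
For example, if the second standby were at 192.168.1.243 (a hypothetical address), the extra pg_hba.conf entry on the master would look like this:

241 ]# vim /usr/local/pgsql/data/pg_hba.conf

    host   replication   postgres    192.168.1.243/32    trust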

Database speed tests (mysql and postgresql) – part 3 – code

Here is the code structure

dbfuncs.php : is the file which contains classes and functions for firing queries on mysql and pgsql
mysqlinsert.php : creates and fires inserts on mysql
mysqlselect.php : creates and fires selects on mysql
pgsqlinsert.php : creates and fires inserts on pgsql
pgsqlselect.php : creates and fires selects on pgsql
benchmark.php : script used to control concurrency and number of queries per script

Please refer to http://www.jayantkumar.in/index.php/2010/09/29/database-speed-tests-mysql-and_29/ and http://www.jayantkumar.in/index.php/2010/09/27/database-speed-tests-mysql-and-2/ for benchmarks of selects and inserts respectively.

And the code….

dbfuncs.php

abstract class dbfuncs
{
  abstract function insertqry($qry);
  abstract function selectqry($qry);
  abstract function connectdb();

  public function log($str)
  {
    $file = "error.log";
    $fp = fopen($file, "a");
    fwrite($fp, "$str\n");
    fclose($fp);
  }
}

class mysqldb extends dbfuncs
{
  private $user = "root";
  private $pass = "jayant";
  private $host = "localhost";
  //private $port = 3307;
  private $socket = "/tmp/mysql.sock";
  private $database = "benchmark";

  public $db;

  function __construct()
  {
    $this->connectdb();
  }

  public function connectdb()
  {
    $this->db = mysql_connect($this->host.':'.$this->socket, $this->user, $this->pass) or die(mysql_error());
    mysql_select_db($this->database, $this->db);
  }

  public function insertqry($qry)
  {
    if(!mysql_ping($this->db))
      $this->connectdb();

    mysql_query($qry, $this->db) or $this->log(mysql_error());
  }

  public function selectqry($qry)
  {
    if(!mysql_ping($this->db))
      $this->connectdb();

    $rs = mysql_query($qry, $this->db) or $this->log(mysql_error());
    return $rs;
  }
}

class pgsqldb extends dbfuncs
{
  private $dns = "host=localhost port=5432 user=jayant password=12qwaszx dbname=benchmark";

  public $db;

  function __construct()
  {
    $this->connectdb();
  }

  public function connectdb()
  {
    $this->db = pg_connect($this->dns) or die(pg_last_error());
  }

  public function insertqry($qry)
  {
    if(!pg_ping($this->db))
      $this->connectdb();

    pg_query($this->db, $qry) or $this->log(pg_last_error($this->db));
  }

  public function selectqry($qry)
  {
    if(!pg_ping($this->db))
      $this->connectdb();

    $rs = pg_query($this->db, $qry) or $this->log(pg_last_error($this->db));
    return $rs;
  }
}

function logtime($str)
{
    $file = "benchmark.log";
    $fp = fopen($file, "a");
    fputs($fp, $str);
    fclose($fp);
}

mysqlinsert.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new mysqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "insert into data (val, txt) values ('$xx','$x in $scriptno')";
  $mysql->insertqry($qry);
}
$end = microtime(true);
$log = "\nMysql innodb Time to insert $count in run $scriptno = ".($end-$start);
logtime($log);

pgsqlinsert.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new pgsqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "insert into data (val, txt) values ('$xx','$x in $scriptno')";
  $mysql->insertqry($qry);
}
$end = microtime(true);
$log = "\nAvg Pgsql Time to insert $count in run $scriptno = ".($end-$start);
logtime($log);

mysqlselect.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new mysqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "select * from `data` where val ='$xx'";
  $mysql->selectqry($qry);
}
$end = microtime(true);
$log = "\nMysql innodb Time to select $count in run $scriptno = ".($end-$start);
logtime($log);

pgsqlselect.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new pgsqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "select * from data where val ='$xx'";
  $mysql->selectqry($qry);
}
$end = microtime(true);
$log = "\nPgsql Time to select $count in run $scriptno = ".($end-$start);
logtime($log);

benchmark.php

$count = 100000;
$concurrency = 40;

for($i=0; $i<$concurrency; $i++)
{
   exec("php -q mysqlselect.php $i $count > /dev/null &");
//   exec("php -q pgsqlselect.php $i $count > /dev/null &");
//   exec("php -q mysqlinsert.php $i $count > /dev/null &");
//   exec("php -q pgsqlinsert.php $i $count > /dev/null &");
}

All test runs were individual, meaning that only one type of script was run at a time with a given count and concurrency. For inserts $count was set to 10,000 and for selects $count was set to 100,000, while $concurrency was varied.

I am using ubuntu 10.04 with a 32-bit 2.6.32-24 kernel and an ext4 file system. My system has around 3 GB of RAM and an Intel Core 2 Duo T5870 @ 2.0 GHz.

Database speed tests (mysql and postgresql) – part 2

Here is the comparison between mysql and postgresql for selects (only). I used the same table that I had created earlier (http://www.jayantkumar.in/index.php/2010/09/27/database-speed-tests-mysql-and-2/) while comparing insertion speed. I created approximately 1,000,000 records in the table and ran selects on them. I also modified the configuration of both mysql and postgresql to enable faster selects.

Mysql

In mysql I specifically disabled the query cache. The reason is that I would use innodb for tables with a large number of inserts, due to its support for row-level locking, and with every insert the query cache is marked as dirty. So enabling the query cache would lead to caching of queries, which would not be the scenario in a live setup.

Mysql Configuration :
innodb_buffer_pool_size = 256MB
key_buffer_size = 256MB
read_buffer_size = 512KB
sort_buffer_size = 512KB
query_cache_size = 0
thread_cache_size = 32
table_open_cache = 64

table information :
No of records : 9755366
data (.ibd) size : 692 MB

Mysql : (time for 100000 selects)

Avg time for 500000 selects with concurrency = 5 : 58.67
Avg time for 1000000 selects with concurrency = 10 : 122.8
Avg time for 2000000 selects with concurrency = 20 : 225.67
Avg time for 3000000 selects with concurrency = 30 : 351.66
Avg time for 4000000 selects with concurrency = 40 : 452.3

PostgreSQL :

Mysql has better table compression as compared to postgres. The same data takes around 700 MB in innodb, while in postgres it takes around 900 MB.

Postgres configuration :
shared_buffers = 128MB
work_mem = 1MB
random_page_cost = 4.0
effective_cache_size = 256MB

table information :
No of records : 9755366
data size : 912 MB

Pgsql : (time for 100000 selects)

Avg time for 500000 selects with concurrency = 5 : 86.8
Avg time for 1000000 selects with concurrency = 10 : 144.74
Avg time for 2000000 selects with concurrency = 20 : 274.37
Avg time for 3000000 selects with concurrency = 30 : 402.92
Avg time for 4000000 selects with concurrency = 40 : 528.17

Mysql seems to perform better with selects. The graph also shows that as concurrency increases, selects in innodb take less time than in postgresql.

So why would you switch from mysql to postgresql? Only if you have a very high ratio of inserts as compared to selects, where the benefit in inserts outweighs the loss in selects to some extent.

Database speed tests (mysql and postgresql) – part 1

There have been major changes in mysql and postgres over the last couple of years. Mysql has been focusing on improving and optimizing innodb. Postgres, on the other hand, has been focusing on database replication and hot standby.

Recently postgres came out with version 9.0, which has built-in replication and hot standby, the two most requested features in postgresql. Earlier people used to shy away from postgres because there was no proper "easily deployable" solution available for replication. With this release, postgres has taken a major step forward. Here is a list of the features introduced in postgres 9.0: http://www.postgresql.org/docs/9.0/static/release-9-0

Mysql has released the RC version of MySQL 5.5, which has a bunch of improvements over the previous version of mysql. Support for multi-core CPUs, changes in InnoDB for effective use of available I/O capacity, and semisynchronous replication are some of the features that mysql 5.5 promises. Here is a list of all the new features in MySQL 5.5: http://dev.mysql.com/doc/refman/5.5/en/mysql-nutshell.html

It has been a long time since I posted my last benchmark (http://www.jayantkumar.in/index.php/2008/06/03/mysql-versus-postgresq/), and I believe it is time I did some rough benchmarks and posted them. The scope is to check out innodb tables in mysql 5.5.6 versus the tables in postgresql 9.0. I am benchmarking pure inserts and selects only. This blog post focuses on inserts; I will focus on selects in my next post. I am running these tests on my laptop, which has an Intel Core 2 Duo T5870 @ 2.00 GHz and 3 GB of RAM.

I have created a simple php script to perform the benchmark, which spawns multiple php scripts that work in the background. Let me know if you need the scripts and I will share them with you.

Mysql:

Innodb Settings in my.cnf :

innodb_buffer_pool_size = 256M
innodb_additional_mem_pool_size = 8M
innodb_log_file_size = 64M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
innodb_file_per_table

Table structure :
CREATE TABLE `data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`val` int(11) NOT NULL,
`txt` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_val` (`val`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Avg Time for 20000 inserts with concurrency of 2 : 1367 seconds
Load on machine : 3.4
size on disk : 9 MB

Avg Time for 50000 inserts with concurrency of 5 : 1537.2 seconds
Load on machine : 4.6
Size on disk : 13 MB

Avg Time for 100000 inserts with concurrency of 10 : 1255
Load on machine : 3.5
Size on disk : 17 MB

Avg Time for 200000 inserts with concurrency of 20 : 1403
Load on machine : 4.2
Size on disk : 26 MB

Time for 400000 inserts with concurrency of 40 : 1730
Load in machine : 6.6
Size on disk : 52 MB

Postgresql :

Settings in postgresql.conf:
shared_buffers = 64MB
work_mem = 1MB
synchronous_commit = on

Table structure :
 Column |         Type          |                     Modifiers                      | Storage  | Description
--------+-----------------------+----------------------------------------------------+----------+-------------
 id     | integer               | not null default nextval('data_id_seq'::regclass)  | plain    |
 val    | integer               | not null                                            | plain    |
 txt    | character varying(20) |                                                     | extended |
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "idx_val" btree (val)
Has OIDs: no

Avg Time for 20000 inserts with concurrency of 2 : 221.3 seconds
Load on machine : 2.0
size on disk : 2 MB

Avg Time for 50000 inserts with concurrency of 5 : 232 seconds
Load on machine : 2.0
Size on disk : 4.8 MB

Avg Time for 100000 inserts with concurrency of 10 : 248.75 seconds
Load on machine : 4.0
Size on disk : 9.8 MB

Avg Time for 200000 inserts with concurrency of 20 : 276.34
Load on machine : 3.6
Size on disk : 19 MB

Time for 400000 inserts with concurrency of 40 : 350.11
Load in machine : 5.8
size on disk : 38 MB

The graph shows that mysql is heavy as compared to pgsql. The base timings are almost 5 times higher in mysql than in pgsql. Also, as the concurrency goes up, the time required for inserts in mysql rises more steeply than that required for postgres.

I did a sample run on mysql with innodb_flush_log_at_trx_commit=2 and the benefit I got was huge:

Avg Time for 20000 inserts with concurrency of 2 (innodb_flush_log_at_trx_commit=2) : 5.2 seconds
Avg Time for 100000 inserts with concurrency of 10 (innodb_flush_log_at_trx_commit=2) : 18.69 seconds

Similarly I disabled synchronous_commit on postgres and did a sample run:

Avg Time for 20000 inserts with concurrency of 2 (synchronous_commit = off) : 2.95 seconds
Avg Time for 100000 inserts with concurrency of 10 (synchronous_commit = off) : 15.06 seconds

PS : The average time is the time for 10000 inserts (inserts per instance)

Let's see what the selects tell us, in the next post.

Postgresql tuning – quick tips

The configuration file for postgresql is located in <postgresql_install_dir>/data/postgresql.conf

You can alter the following settings in the config for better performance of postgresql.

# listen to all ip addresses
listen_addresses = '*'

# should not exceed 1000. Requires about 400 bytes per connection
max_connections = 500

# used for managing connection, active operations
# should be 1/4 of the available memory

shared_buffers = 1024MB

# dedicated memory for each operation. Used basically for sorting
# should be available_memory/max_connections for normal operations; max available_memory/(2*max_connections)
# another way of getting this number is using EXPLAIN ANALYZE. If the plan shows "Sort Method: external merge Disk: 7532kB",
# then a work_mem of 8MB can do wonders (see the EXPLAIN ANALYZE example after this listing).

work_mem = 1MB

# same as work_mem but for vacuum, alter and other DDL queries.
# should be around 256MB

maintenance_work_mem = 256MB

# size of write ahead log files
# default 8 KB. 1 MB is enough for large systems.
# SMP machines are better with 8 MB

wal_buffers = 1MB

# After every transaction, pgsql forces its write-ahead log out to disk.
# defaults to fsync.
# generally switched to open_sync, but that is buggy on many platforms.
# Should be benchmarked with a very heavy workload before switching.

wal_sync_method = fsync

# estimate of how much memory is available for disk caching by the OS and within the DB itself
# recommended to be 1/2 of available memory. On unix it can be set to free+cached from "free".

effective_cache_size = 512MB

# by default 3 segments of 16 MB each = 48 MB, which can be resource intensive on modern systems.
# setting it to 32 (a checkpoint every 512 MB) can be effective in reducing disk io

checkpoint_segments = 32

# checkpoint occurs every 5 minutes. can be increased
checkpoint_timeout = 5min

# should be increased if you want to collect a lot of information for helping pgsql to create query plans
default_statistics_target=100

# Synchronous commit introduced in pgsql 8.3 allows a small amount of data loss (in case of failure) for large boost in number of updates on the database per second.
synchronous_commit=on

# If after tweaking every variable, your query execution plan still is not acceptable,
# you can bring down the random page cost to 3.0 or 2.0.
# setting this variable lower will encourage the query optimizer to use random access index scans.

random_page_cost=4.0
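
As an aside to the work_mem note above, here is roughly what checking a sort with EXPLAIN ANALYZE might look like (the table and query are only illustrative, not from this post):

test=> EXPLAIN ANALYZE SELECT * FROM data ORDER BY txt;
-- look for a line like:  Sort Method:  external merge  Disk: 7532kB
test=> SET work_mem = '8MB';
test=> EXPLAIN ANALYZE SELECT * FROM data ORDER BY txt;
-- the same sort should now report something like:  Sort Method:  quicksort  Memory: ...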

Setting up replication on postgresql 9.0 beta

Postgresql has finally come out with its built-in replication solution in the 9.0 beta. Setting up replication is quite simple. We will look at setting up simple master-slave replication between two postgresql servers.

The way replication works in postgres is that the master keeps writing "write ahead log" (WAL) files. The slave can be configured to run either in recovery mode or as a hot standby. The difference is that in recovery mode the WAL files have to be shipped using something like a cron job, and the slave then applies them to its database; this is not smooth and the delay between master and slave can be big. With hot standby, the slave connects to the master, fetches the log files from it, and applies them to its own database. Here the lag between master and slave is small, since the slave tries to keep up with the master. This is also known as streaming replication, because the WAL files are continuously shipped to the slave and applied there. There can be multiple slaves fetching WAL files from the master and applying them to their own databases. As opposed to mysql, the slave here is read-only; it cannot be modified and can be used only as a read-only slave for select queries.

Let's look at deploying master-slave replication between two servers.

Master server : 172.16.3.173
Slave server : 172.16.3.211

Step 1:
Install postgresql 9.0 on both servers. Simple steps for installation are
$ ./configure
$ make
$ make install
$ adduser postgres

Step 2:
Initialize the data directory on master. No need to initialize it on slave – since we will be copying the master directory to slave for starting replication.
$ mkdir /usr/local/pgsql/data
$ chown -R postgres.postgres /usr/local/pgsql/data
$ su - postgres
$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

In case you are trying to create a slave of an existing postgres database, you can simply skip these steps: install postgres on the slave machine, but do not initialize the data directory there.

Step 3:
Change the config on the master so that the slave is authorized to connect to it. Add the following line to pg_hba.conf of the master.
$ vim data/pg_hba.conf

host    replication     postgres        172.16.3.211/22         trust

This says that 172.16.3.211 (the slave) is allowed to connect to the master for replication without any password. Configure this for all slaves if you plan to have multiple slaves (an example is shown below).
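
For instance, if there were a second slave at 172.16.3.212 (a made-up address for illustration), another line would be added alongside the one above:

host    replication     postgres        172.16.3.212/32         trust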

Step 4:
Change parameters on the master postgresql for streaming replication.
$ vim data/postgresql.conf

# allow all ip addresses to connect to the master.
listen_addresses = '*'

# set wal_level for hot standby replication.
wal_level = hot_standby

# enable archiving to an archive directory accessible from the standby. Not required for streaming replication. But set it for "just in case" scenarios 
archive_mode = on
archive_command = 'cp "%p" /path/to/archive_directory/"%f"'

# set the number of concurrent slave connections
max_wal_senders = 5

# set the minimum no of WAL segments to be retained in the pg_xlog directory from where the slave can pick up the WAL segments.
# This prevents removal of WAL segments before they are shipped to the slave.
wal_keep_segments = 32


Step 5:
Start postgres on primary/master server.
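
The usual way of starting it works here (the same command used elsewhere on this blog):

$ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data > logfile 2>&1 &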

Step 6:
Create a backup of the master and ship it to the slave server.
$ ./bin/psql -c "SELECT pg_start_backup('postgres',true)"
$ rsync -a /usr/local/pgsql/data/ postgres@172.16.3.211:/usr/local/pgsql/data
$ ./bin/psql -c "SELECT pg_stop_backup()"

Step 7:
Clean the data directory in slave and set replication parameters.
$ rm /usr/local/pgsql/data/postmaster.pid
Here the postgresql.conf is the same as on the master, so just set the parameters required by the slave.
$ vim /usr/local/pgsql/data/postgresql.conf

# disable archiving
archive_mode = off
# enable read-only queries on slave - during replication/recovery
hot_standby = on

Step 8:
Setup the parameters in recovery.conf required for streaming replication
$ vim recovery.conf

# shell command for copying log files back from the archive. Should not be required.
restore_command = 'cp -i /home/postgres/archive/%f "%p" </dev/null'
# enable standby
standby_mode = 'on'
# connection information for connecting to master
primary_conninfo = 'host=172.16.3.173 port=5432 user=postgres'
# stop streaming and enable the server to work in read/write mode, if this file is found.
# server keeps on polling for the trigger file and stops replication once this file is found. The server can then be used in read/write mode.
trigger_file = '/usr/local/pgsql9.0/data/stop.replication'

Step 9:
Start postgres on the slave. This will start streaming replication between the master and the slave.
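
Starting the slave uses the same command as on the master (assuming the same install paths):

$ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data > logfile 2>&1 &
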
To check replication, see if the sender process is active on the master:

$ ps -ef | grep sender
postgres 12332 12158  0 10:13 ?        00:00:00 postgres: wal sender process postgres 172.16.3.211(41497) streaming 0/90149F0
postgres 16233  9474  0 11:36 pts/2    00:00:00 grep sender

On slave check if receiver process is working

$ ps -ef | grep receiver
postgres 27952 27945  0 10:12 ?        00:00:00 postgres: wal receiver process   streaming 0/9014974        
postgres 30354 23492  0 11:36 pts/0    00:00:00 grep receiver


That is it. Run a few DML/DDL queries on the master and check that they are being replicated on the slave. The lag between master and slave is barely noticeable.
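
A quick sanity check could look like this (the table name is just an example):

master $ ./bin/psql -c "create table repl_test (id int)"
master $ ./bin/psql -c "insert into repl_test values (1)"
slave  $ ./bin/psql -c "select * from repl_test"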

postgresql replication using slony-I

As most postgresql users would know, postgresql does not provide any built-in replication solution. There are lots of third-party replication products available for postgresql, and Slony is one of them. Slony is a trigger-based replication solution, that is, it uses triggers to push data to the slave. Slony is supposed to be one of the most stable replication solutions for postgresql.

You can download slony from www.slony.info. There are two major versions of slony: Slony-I and Slony-II. Slony-I is a simple master-slave replication solution, whereas Slony-II is an advanced multi-master replication solution. We will go ahead with simple master-slave replication, so we will download Slony-I. The latest version available is Slony-I 1.2.15. Slony-I 2.0 is in RC and should be released soon, but we will go with a stable release: 1.2.15.

The postgresql version being used is 8.3.3. To install slony, simply untar the downloaded file and run
./configure --with-pgconfigdir=<path to pg_config>
make
sudo make install

I have used two machines for setting up replication, and installed postgresql and Slony-I on both of them.

master server ip : 172.16.3.211
slave server ip : 172.16.3.173

We will be working with the superuser postgres which is used to start and stop the postgresql server.

Quick steps

  • Define environment variables on master & slave. The main purpose is to make our task easier. Let's create an env.sh file containing all the definitions.

    #!/bin/sh

    REPLICATIONUSER=postgres
    CLUSTERNAME=replcluster
    MASTERDBNAME=repltestdb
    SLAVEDBNAME=repltestdb
    MASTERHOST=172.16.3.211
    SLAVEHOST=172.16.3.173
    MASTERPORT=5432
    SLAVEPORT=5432
    MASTERDBA=postgres
    SLAVEDBA=postgres
    PSQL=/usr/local/pgsql/bin/psql
    CREATEDB=/usr/local/pgsql/bin/createdb
    CREATELANG=/usr/local/pgsql/bin/createlang
    CREATEUSER=/usr/local/pgsql/bin/createuser
    PGDUMP=/usr/local/pgsql/bin/pg_dump

    export REPLICATIONUSER CLUSTERNAME MASTERDBNAME SLAVEDBNAME MASTERHOST SLAVEHOST PSQL CREATEDB CREATELANG CREATEUSER PGDUMP MASTERPORT SLAVEPORT MASTERDBA SLAVEDBA

    As you can see here, my postgresql is installed in /usr/local/pgsql. I have defined the IP addresses & ports of the master and slave servers, used the superuser postgres for replication, and defined the master and slave databases to be used for replication. You can replicate between databases with different names on the master and slave; just change the names in all the scripts.

  • Create database on master & slave
    On master run
    /usr/local/pgsql/bin/createdb -O $REPLICATIONUSER -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME
    On slave run
    /usr/local/pgsql/bin/createdb -O $REPLICATIONUSER -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME
  • Since slony-I depends on triggers for replication, you will need to install the plpgsql procedural language on the master to generate and run the triggers & stored procedures for pushing data to the slave.
    /usr/local/pgsql/bin/createlang -h $MASTERHOST -p $MASTERPORT plpgsql $MASTERDBNAME
  • Put the tables which you want to replicate into $MASTERDBNAME on the master, and port the table definitions to the slave. This has to be done manually.

    Dump the tables on master
    /usr/local/pgsql/bin/pg_dump -s -U $MASTERDBA -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME > replmaster.sql

    Import the tables on slave
    /usr/local/pgsql/bin/psql -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME < replmaster.sql

  • And now configure the databases for replication. When you install Slony-I, it puts two binaries, slonik and slon, in the pgsql/bin directory. Slonik is the tool used for creating the configuration tables, stored procedures and triggers. All we need to do is create a configuration file and pass it to the slonik tool. Here I am assuming that there are two tables which need to be replicated: parent & child.

    vim replconfig.cnf
    # define the namespace the replication system uses in our example it is
    # replcluster
    cluster name = replcluster;
    # admin conninfo's are used by slonik to connect to the nodes one for each
    # node on each side of the cluster, the syntax is that of PQconnectdb in
    # the C-API
    node 1 admin conninfo = 'dbname=repltestdb host=172.16.3.211 port=5432 user=postgres';
    node 2 admin conninfo = 'dbname=repltestdb host=172.16.3.173 port=5432 user=postgres';
    # init the first node. Its id MUST be 1. This creates the schema
    # _$CLUSTERNAME containing all replication system specific database
    # objects.
    init cluster ( id=1, comment = 'Master Node');
    # Add unique keys to tables that do not have one.
    # This command adds a bigint column named "_Slony-I_$CLUSTERNAME_rowID" to the table which will have a default value of nextval('_$CLUSTERNAME.s1_rowid_seq') and have UNIQUE & NOT NULL constraints applied on it.
    # table add key (node id = 1, fully qualified name = 'table_name');
    # Slony-I organizes tables into sets. The smallest unit a node can
    # subscribe is a set.
    # you need to have a set add table() for each table you wish to replicate
    create set (id=1, origin=1, comment='parent child table');
    set add table (set id=1, origin=1, id=1, fully qualified name = 'public.parent', comment='parent table');
    set add table (set id=1, origin=1, id=2, fully qualified name = 'public.child', comment='child table');
    # Create the second node (the slave) tell the 2 nodes how to connect to
    # each other and how they should listen for events.
    store node (id=2, comment = 'Slave node');
    store path (server = 1, client = 2, conninfo='dbname=repltestdb host=172.16.3.211 port=5432 user=postgres');
    store path (server = 2, client = 1, conninfo='dbname=repltestdb host=172.16.3.173 port=5432 user=postgres');
    store listen (origin=1, provider = 1, receiver =2);
    store listen (origin=2, provider = 2, receiver =1);

    Pass the config file to slonik for creating required triggers & config tables.

    /usr/local/pgsql/bin/slonik replconfig.cnf

  • Let's start the replication daemons on the master & slave

    On master run
    /usr/local/pgsql/bin/slon $CLUSTERNAME "dbname=$MASTERDBNAME user=$MASTERDBA host=$MASTERHOST port=$MASTERPORT" > slon.log &

    On slave run
    /usr/local/pgsql/bin/slon $CLUSTERNAME "dbname=$SLAVEDBNAME user=$SLAVEDBA host=$SLAVEHOST port=$SLAVEPORT" > slon.log &

    Check out the output in slon.log files

  • Now everything is set up, and from the slon.log files on the master and slave you can see that both servers are trying to sync with each other. But replication is still not under way. To start replication we need to make the slave subscribe to the master. Here is the required config file for doing this:

    startrepl.cnf
    # This defines which namespace the replication system uses
    cluster name = replcluster;
    # connection info for slonik to connect to master & slave
    node 1 admin conninfo = 'dbname=repltestdb host=172.16.3.211 port=5432 user=postgres';
    node 2 admin conninfo = 'dbname=repltestdb host=172.16.3.173 port=5432 user=postgres';
    # Node 2 subscribes set 1
    subscribe set ( id = 1, provider = 1, receiver = 2, forward = no);

    Passing this file to slonik will do the trick and replication would start happening.

    /usr/local/pgsql/bin/slonik startrepl.cnf

Now simply make some inserts, updates and deletes on the master and check whether they show up on the slave as well. Since replication is now in full swing, all changes to the replicated tables on the master should be replicated on the slave.
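
One simple way to verify is to compare the row counts of a replicated table on both sides, e.g. for the parent table from the set above:

/usr/local/pgsql/bin/psql -h $MASTERHOST -p $MASTERPORT -U $MASTERDBA $MASTERDBNAME -c "select count(*) from public.parent"
/usr/local/pgsql/bin/psql -h $SLAVEHOST -p $SLAVEPORT -U $SLAVEDBA $SLAVEDBNAME -c "select count(*) from public.parent"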

Please note that new tables & changes to table structures won't be replicated automatically. So whenever a new table is created or an existing table is altered, the changes have to be manually propagated to the slave, and the scripts need to be run again to make the appropriate changes in the triggers and config tables.

Another important thing to note is that postgresql on the master and slave should be able to accept connections from both IP addresses. For this, add the IP addresses to pgsql/data/pg_hba.conf.

For the above replication I had added the lines
host all all 172.16.3.211/32 trust
host all all 172.16.3.173/32 trust
to the pg_hba.conf file in both master & slave.

MySQL versus PostgreSQL – part II

My earlier post, mysql versus postgresql, brought me lots of negative comments saying that I did not compare the transactional database of pgsql with the transactional engine (innodb) of mysql. The main reason I did not do that was that I had found InnoDB to be very slow as compared to MyISAM.

But after all those comments I ran the benchmarks again using the same scripts and the same technology on the same machine (my laptop), and here are the results. I created a new table in both mysql (using the InnoDB engine) and pgsql, and I disabled binary logging in mysql to speed up insert/update/delete queries. Please refer to the earlier post for the setup information.

The following notation will be used:

<operation(select/insert/update/delete)> : <no_of_threads> X <operations_per_thread>

  • Firstly I ran a single thread of inserts, both before and after disabling binary logging in mysql
    Mysql Insert : 1 X 100000
    Time : 65.22 Sec (binary logging enabled)
    Time : 32.62 Sec (binary logging disabled)
    So disabling binary logging in mysql would make your insert/update/delete queries take half the time.
    Pgsql Insert : 1 X 100000
    Time : 53.07 Sec
    Inserts in mysql are very fast.
  • Selects : 2 X 100000
    Mysql time : 30.1 Sec
    Pgsql time : 29.92 Sec
    Both are about the same
  • Updates : 2 X 50000
    Mysql time : 29.38 Sec
    Pgsql time : 36.98 Sec
    Mysql updates are faster
  • Ran 4 threads with different no_of_operations/thread
    Run 1 [Select : 1 X 100000, Insert : 1 X 50000, Update : 1 X 50000, Delete : 1 X 20000]
    Mysql time : 40.86 Sec
    Pgsql time : 45.03 Sec
    Run 2 [Select : 1 X 100000, Insert : 1 X 100000, Update : 1 X 50000, Delete : 1 X 10000]
    Mysql time : 49.91 Sec
    Pgsql time : 63.38 Sec
    Run 3 [Select : 1 X 100000, Insert : 1 X 20000, Update : 1 X 20000, Delete : 1 X 1000]
    Mysql time : 29.83 Sec
    Pgsql time : 29.3 Sec
    It can be seen that increasing the proportion of insert/update/delete queries affects the performance of pgsql. Pgsql would perform better if the number of selects is very high, whereas mysql-innodb performs better in all these cases
  • Had 4 runs with different no of threads.
    Run 1: 12 threads [Select : 2X30000 + 3X20000, Insert : 1X20000 + 2X10000, Update : 2X10000, Delete : 2X1000]
    Mysql time : 31.16 Sec
    Pgsql time : 30.46 Sec
    Run 2: 12 threads [Select : 2X50000 + 2X40000 + 1X30000, Insert : 1X20000 + 2X15000, Update : 2X15000, Delete : 2X2000]
    Mysql time : 52.25 Sec
    Pgsql time : 53.03 Sec
    Run 3: 20 Threads [Select : 4X50000 + 4X40000 + 2X30000, Insert : 2X20000 + 3X15000, Update : 2X20000 + 1X15000, Delete : 2X5000]
    Mysql time : 169.81 Sec
    Pgsql time : 136.04 Sec
    Run 4: 30 Threads [Select : 2X50000 + 3X40000 + 3X30000 + 3X20000 + 4X10000, Insert : 1X30000 + 2X20000 + 3X10000, Update : 3X20000 + 3X10000, Delete : 1X10000 + 2X5000]
    Mysql time : 200.25 Sec
    Pgsql time : 156.9 Sec
    So, it can be said that for a small system with low concurrency, mysql would perform better. But as concurrency increases, pgsql performs better. I also saw that while running the pgsql benchmark, the system load was about twice that of the mysql benchmark.

Enabling mysql binary logging for replication would of course add an overhead, and similarly, enabling trigger-based replication in pgsql would be another overhead. The fact that replication in mysql is very closely linked with the database server makes building a high-availability system easier, whereas creating slaves using replication in pgsql is not that easy; all available products for replication in pgsql are external, third-party software. Still, for a high-concurrency system pgsql would be a better choice.

MySQL versus PostgreSQL

I created and ran some simple tests on mysql and postgresql to figure out which one is faster. It is already known that postgresql is more stable and reliable than mysql. pgsql has a rich set of features. It is a complete RDBMS and also supports fulltext search.

All benchmarks were done on my laptop: an Intel Core 2 Duo (2.0 GHz) with 4 MB L2 cache & 2 GB RAM. I have a 64-bit ubuntu system loaded with MySQL 5.1.24-rc (64-bit binary) and PostgreSQL 8.3.1 (compiled from source).

I used python as a scripting language for writing down my benchmark scripts. I used psycopg2 as a connector from python to postgres and mysql-python as a connector from python to mysql.

The benchmarking was done in phases. First, simple insert, update and select queries were run to check the raw speed of these queries. Then threads were created to run simultaneous insert, update, select and delete queries. I checked the benchmark times for different numbers of concurrent threads.

I created a simple table in both mysql and pgsql. I used the MyISAM engine to create the table in mysql:

ABC(id int not null auto_increment primary key, value varchar(250));

Queries that were run are:

Insert(I) : Insert ignore into ABC (id, value) … (For pgsql, a rule has to be created to ignore duplicate inserts; a sample rule is sketched below)
Update(U) : Update ABC set value=<something> where id=<random_id>
Select(S) : Select * from ABC where id=<random_id>
Delete(D) : Delete from ABC where id=<random_id>
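
For reference, such an "insert ignore" rule on the pgsql side could look roughly like this (the rule name is arbitrary; the pattern is the same one described in the "From MySQL to PostgreSQL" post below):

CREATE OR REPLACE RULE insert_ignore_abc AS ON INSERT TO ABC
  WHERE EXISTS (SELECT 1 FROM ABC WHERE id = NEW.id) DO INSTEAD NOTHING;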

  • Insert – 100000 rows in 1 thread
    Time taken for Mysql : 20.8 seconds
    Time taken for Pgsql : 58.1 seconds
    So, raw insert speed of mysql is much better as compared to pgsql
  • 100000 selects in 1 thread
    Time taken for Mysql : 21.76 seconds
    Time taken for Pgsql : 20.15 seconds
    Raw selects are better in pgsql as compared to mysql
  • Selects – 2 threads of 100000 selects
    Time taken for Mysql : 40.46 seconds
    Time taken for Pgsql : 27.38 seconds
    So, if I increase the concurrency of selects, pgsql performs much better than mysql
  • Update – 2 threads of 50000
    Time taken for Mysql : 23.97 seconds
    Time taken for Pgsql : 34.03 seconds
    Mysql looks better in handling updates here.
  • 4 Threads
    Run 1 : [100000 Selects, 50000 Inserts, 50000 Updates, 20000 Deletes]
    Time taken for Mysql : 45.25 seconds
    Time taken for Pgsql : 54.58 seconds
    Run 2 : [100000 Selects, 100000 Inserts, 50000 Updates, 10000 Deletes]
    Time taken for Mysql : 59.05 seconds
    Time taken for Pgsql : 69.38 seconds
    Run 3 : [100000 Selects, 20000 Inserts, 20000 Updates, 1000 Deletes]
    Time taken for Mysql : 35.54 seconds
    Time taken for Pgsql : 31.23 seconds
    These runs show that Mysql is good when you have a very large number of inserts/updates/deletes as compared to selects. But pgsql's performance surpasses that of mysql when the number of selects is much higher.
  • Finally, let's approach the real-life scenario where generally the number of selects is much higher than the number of inserts, and there are multiple threads performing selects and inserts.
    I will use the following notation here: <no_of_threads> X <no_of_operations(select/insert/update/delete)_per_thread>
    So, for example 3 X 20 Selects = 3 threads of 20 Selects in each thread

    Run 1 : [2 X 30000 selects, 3 X 20000 selects, 1 X 20000 inserts, 2 X 10000 inserts, 2 X 100000 updates, 2 X 1000 deletes] Total – 12 threads
    Time taken for Mysql : 42.28 seconds
    Time taken for Pgsql : 44.28 seconds
    Both Mysql and Pgsql are almost at par.

    Run 2 : [2 X 50000 selects, 2 X 40000 selects, 1 X 30000 selects, 1 X 20000 inserts, 2 X 15000 inserts, 2 X 15000 updates, 2 X 2000 deletes] Total – 12 threads but number of selects are quite high
    Time taken for Mysql : 61.02 seconds
    Time taken for Pgsql : 48.60 seconds
    So, as we increase the number of operations (especially selects) mysql's performance degrades, whereas pgsql's performance remains almost the same

    Run 3 : [4 X 50000 selects, 4 X 40000 selects, 2 X 30000 selects, 2 X 20000 inserts, 3 X 15000 inserts, 3 X 15000 updates, 2 X 3000 deletes] Total – 20 threads (10 threads for select, 5 for insert, 3 for update and 2 for delete) Which is the normal trend in database servers.
    Time taken for Mysql : 169.31 seconds
    Time taken for Pgsql : 128.7 seconds
    Bingo, so as concurrency increases pgsql becomes faster than mysql.

My earlier benchmarks with pgsql 7.x were not as good as this one. With postgresql 8.3.1, the speed of serving concurrent requests has increased a lot. So, in a high-concurrency environment, I would generally recommend going ahead with postgresql rather than mysql.

Please check the comments section. We have some really interesting comments there…

From MySQL to PostgreSQL

Why? In brief, it is said that mysql is not as stable as postgresql. Postgresql (pgsql) focuses on a single database engine, as compared to mysql which has a pluggable engine architecture and multiple engines. Postgresql is also better designed than mysql, and the psql console is much better than the mysql console (you will realize it when you use it). It is supposed to be much more scalable and have better performance than mysql. Pgsql also uses more standard SQL than mysql.

Both have fulltext search capabilities. It is said that the fulltext search of pgsql is better than that of mysql, but I still have to look into it. And most important of all, pgsql is a full-fledged RDBMS, whereas mysql (using the default MyISAM engine) is a DBMS. Data integrity is better in pgsql as compared to mysql. Also, I have seen lots of corrupt tables in mysql (which might be because I have used mysql throughout my career), and have heard (on the web) that mysql is more crash prone than pgsql.

The final question then comes down to speed. Which one is faster? Logically, since mysql is a DBMS and it does not maintain foreign key relations, it should be faster than pgsql. I will benchmark and blog the results sometime later.

Basically, pgsql is more stable & reliable than mysql. Pgsql has a rich set of features and data integrity is well maintained in pgsql.

Let's look at some of the steps you need to follow if you decide to switch to pgsql from mysql:

  • First of all, download the database engine from www.postgresql.org. Untar it and compile it.

    $ tar -xvjf postgresql-8.3.1.tar.bz2
    $ cd postgresql-8.3.1
    $ ./configure --with-openssl --enable-thread-safety --with-libxml
    (check out the options that you need using ./configure --help)
    $ make
    $ sudo make install

    By default this will install pgsql in the /usr/local/pgsql directory

  • Next, create a user postgres and the database directory for pgsql
    $ adduser postgres
    $ mkdir /usr/local/pgsql/data
    $ chown postgres.postgres /usr/local/pgsql/data
  • Create the pgsql database:
    Firstly, log into mysql and do
    mysql> show variables like '%character%';
    +--------------------------+------------------------------------------------------------------+
    | Variable_name            | Value                                                            |
    +--------------------------+------------------------------------------------------------------+
    | character_set_client     | latin1                                                           |
    | character_set_connection | latin1                                                           |
    | character_set_database   | latin1                                                           |
    | character_set_filesystem | binary                                                           |
    | character_set_results    | latin1                                                           |
    | character_set_server     | latin1                                                           |
    | character_set_system     | utf8                                                             |
    | character_sets_dir       | /usr/local/mysql-5.1.24-rc-linux-x86_64-glibc23/share/charsets/  |
    +--------------------------+------------------------------------------------------------------+
    You will see here that the character set for the database & server is latin1, even though the default encoding for pgsql is utf8. If you create the pgsql database using the default command, it will create a utf8 database and you will face problems importing your latin1 data into it.

    So, create a latin1 database for pgsql
    $ su - postgres
    $ initdb -D /usr/local/pgsql/data -E latin1 --locale=C

  • Now you have postgresql installed and you can create your own databases and tables. But before proceeding, run psql from the postgres user's shell.

    start pgsql:

    $ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >~/logfile 2>&1 &

    $ psql
    Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

    Type: \copyright for distribution terms
    \h for help with SQL commands
    \? for help with psql commands
    \g or terminate with semicolon to execute query
    \q to quit

    postgres=#

    The =# prompt says that you are logged in as a superuser. To log in as another user, create the user and give it the required privileges, then su to that user and log in to pgsql from there.

    postgres=# CREATE USER jayant WITH CREATEDB INHERIT LOGIN;
    postgres=# \q
    $ su - jayant
    $ createdb test
    $ psql test
    test=>

    Here => says that I am not a superuser. Now I can create my tables in the test database or create more databases.

    For help on how to create users do

    test=> \h CREATE USER

    And it will give you the syntax for creating a user in postgres

  • To convert your table from mysql format to pgsql format, just dump the table structure and data separately from mysql using mysqldump --no-data and mysqldump --no-create-info respectively (a combined example appears after this list).

    And you will get two files <mysql_table>.sql & <mysql_table_data>.sql

    Download a perl script, mysql2pgsql.perl, which will convert the sql for your table into the appropriate format for pgsql.

  • Now load the table and data into pgsql

    $ psql -f <mysql_table>.sql
    $ psql -f <mysql_table_data>.sql
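
Putting the dump-and-convert steps above together, the whole sequence might look roughly like this (database, table and file names are placeholders, and the exact mysql2pgsql.perl invocation may differ between versions of the script):

$ mysqldump --no-data -u root -p mydb mytable > mytable.sql
$ mysqldump --no-create-info -u root -p mydb mytable > mytable_data.sql
$ perl mysql2pgsql.perl mytable.sql mytable_pg.sql
$ psql -f mytable_pg.sql
$ psql -f mytable_data.sql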

It looks simple, but it is not that simple. Some points that need to be remembered while using pgsql:

  • auto_increment column is defined using the SERIAL word. So, to create an auto_increment primary key column the syntax would be

    CREATE TABLE MYTABLE(ID SERIAL NOT NULL PRIMARY KEY, ….);

    This would also create a sequence mytable_id_seq.

  • While importing data from the mysql dump you may get some warning about backslash escapes in your sql. To remove this warning you will have to set the escape_string_warning to off.

    edit your postgresql.conf file which can be found in the /usr/local/pgsql/data directory and change the variable to off. Now restart pgsql.

  • To start and stop pgsql following commands can be used

    start pgsql
    $ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >~/logfile 2>&1 &

    stop pgsql
    $ /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data

  • Another thing to note is that you don't have custom queries like 'insert ignore into', 'replace into' or 'insert into … on duplicate key …' in pgsql. You will need to create rules on the table to handle these cases.

    test=> \h create rule
    Command: CREATE RULE
    Description: define a new rewrite rule
    Syntax:
    CREATE [ OR REPLACE ] RULE name AS ON event
    TO table [ WHERE condition ]
    DO [ ALSO | INSTEAD ] { NOTHING | command | ( command ; command … ) }

    So for example, to create an 'insert ignore into' type of rule, the following rule needs to be created:

    test=> CREATE OR REPLACE RULE "insert_ignore_mytable" AS ON INSERT TO "mytable" WHERE EXISTS (SELECT 1 FROM "mytable" WHERE id = NEW.id) DO INSTEAD NOTHING;

  • Remember SHOW PROCESSLIST in mysql, which lists all the running processes.

    To list all the processes in pgsql, the following command needs to be run:

    test=> select * from pg_stat_activity;

We will be looking more and more into pgsql.