A book every php developer should read

Once upon a time, a long long time ago, when there was no Solr and lucene used to be a search engine api available for php developers to use, they used to struggle for using lucene. Most people reverted back to Mysql full text search. And some ventured into using sphinx – another free full text search engine. Then came Solr and php developers were thrilled with the ease of use over the http interface for both indexing and searching text using lucene abstracted by Solr.

Even in those days, it was difficult to fully explore and use the features provided by lucene and Solr through php. There is an extension in Php to communicate to Solr. But the extension has not been in active development. As Solr came out with more and more features, the extension became very basic. Most of the advanced features provided by Solr were not available in the php Solr extension. Then came Solarium, an open source library which is being very actively developed and had support for the latest features of Solr.

But as the features of Solarium and Solr kept on increasing, php developers find it difficult to keep up to date with them. The book Apache Php Solr Integration provides an up to date and in depth view of the latest features provided by Solr and how it can be explored in php via Solarium. Php developers are generally very comfortable in writing code and in setting up systems. But in case you are a developer and are not very familiar with how to Setup solr or how to connect to Solr using php, the book hand holds you with configurations, examples and screen shots.

In addition to discussing simple topics like indexing and search which are very basic to Solr, the book also goes in depth on advanced queries in Solr like filter queries and faceting. The book also guides a developer on setting up Solr for highlighting hits in the results and goes into the implementation with sample codes. Other advanced functionalities discussed in the book are development and implementation of spell check in Solr and php, grouping of results, implementing the more like this feature in Php and Solr. The book also discusses distributed search – a feature used for Scaling Solr horizontally. Setting up of Master-Slave on Solr is discussed with sample configuration files. Load balancing of queries using php and Solarium is also discussed with sample code.

As a php developer, you may have some questions like

Q: Why should i read this book?
A: The book would make you an expert in search using Solr. That would be an additional skill that you can show off.

Q: I know Solr. What additional does the book provide ?
A: Are you up to date with the latest features provided by Solr ?  Have you implemented featues like spell check, suggestions, result grouping, more like this ?

Q: I am an expert in all the above features of Solr. What else does the book have ?
A: Are you comfortable implementing Solr on large scale sites with index which has millions of documents ? Do you know how Solr calculates relevance and how it can be tweaked ? Can you provide index statistics of Solr using php ?

If you are still undecided, the following article and table of contents of the book will help you make your mind.

http://www.packtpub.com/article/apache-solr-php-integration
http://www.packtpub.com/apache-solr-php-integration/book

How bigpipe works

Bigpipe is a concept invented by facebook to help speed up page load times. It paralellizes browser rendering and server processing to achieve maximum efficiency. To understand bigpipe lets see how the a user request-response cycle is executed in the current scenario

  • Browser sends an HTTP request to web server.
  • Web server parses the request, pulls data from storage tier then formulates an HTML document and sends it to the client in an HTTP response.
  • HTTP response is transferred over the Internet to browser.
  • Browser parses the response from web server, constructs a DOM tree representation of the HTML document, and downloads CSS and JavaScript resources referenced by the document.
  • After downloading CSS resources, browser parses them and applies them to the DOM tree.
  • After downloading JavaScript resources, browser parses and executes them.

In this scenario, while the web server is processing and creating the HTML document, the browser is idle and when the browser is rendering the html page, the web server remains idle.

Bigpipe concept breaks the page into smaller chunks known as pagelets. And makes page rendering on browser and processing on server side as parallel processes speeding up the page load time.

The request response cycle in the bigpipe scenario is as follows.

  • The browser sends an HTTP request to web server.
  • Server quickly renders a page skeleton containing the tags and a body with empty div elements which act as containers to the pagelets. The HTTP connection to the browser stays open as the page is not yet finished.
  • Browser will start downloading the bigpipe javascript library and after that it’ll start rendering the page
  • The PHP server process is still executing and its building the pagelets. Once a pagelet has been completed it’s results are sent to the browser inside a BigPipe.onArrive(…) javascript tag.
  • Browser injects the html code for the pagelet received into the correct place. If the pagelet needs any CSS resources those are also downloaded.
  • After all pagelets have been received the browser starts to load all external javascript files needed by those pagelets asynchronously.
  • After javascripts are downloaded browser executes all inline javascripts.

This results in a parallel system where as the pagelets are being generated the browser is rendering the pagelets. From the user’s perspective the page is rendered progressively. The initial page content becomes visible much earlier, which dramatically improves user perceived latency of the page.

Source : https://www.facebook.com/note.php?note_id=389414033919
open bigpipe implementation : https://github.com/garo/bigpipe

Database library to handle multiple masters and multiple slaves

In a large scale mysql deployment there could be multiple masters and multiple slaves. Masters are generally in circular replication. And are used for running all inserts, updates and deletes. Slaves are used to run selects.

When you are dealing with multiple mysql instances running in a large scale environment, it is important to take care of lags between masters and slaves. To handle such scenarios, the code should be capable of firing query on a server dynamically. Which means that for each query, I as a developer should have the flexibility to decide which server the query should go.

A list of existing scenarios :

1. All registrations / username generation process should happen on a single master. If you generate usernames at both masters, there may be scenarios where, due to lag between mysql masters, the user is not reflected. And in such a case, the user may register again and land on another master. Creating the same username again and breaking the circular replication. So all registrations and check for “username exists” should happen on a single master.

2. For all other Insert, Update and Delete operations, the user should be stuck to a single master. Why ? Assume there is a lag of around 30 minutes between the masters and slaves. The user inserts a record and immediately wants to see what record has been inserted. If we fetch the record from another master or slave, the record will not be available, because it has not yet been replicated. To take care of this scenario, whenever a record is inserted the immediate select has to be from the same server.

3. For all other selects, the query can be fired on any of the slaves. For example, the user logs into the site and sees his own profile. We show him his profile using one of the slave servers. This can be cached as well. The point here is that for data which has not been updated recently – the query can be fired on any of the slaves.

The following piece of code/library handles most of the scenarios. Please feel free to suggest modifications or improvements.


/**
* Created by : Jayant Kumar
* Description : php database library to handle multiple masters & multiple slaves
**/
class DatabaseList // jk : Base class
{
public $db = array();
public function setDb($db)
{
$this->db = $db;
}
public function getDb()
{
return $this->db;
}
}
class SDatabaseList extends DatabaseList // jk : Slave mysql servers
{
function __construct()
{
$this->db[0] = array(‘ip’=>’10.20.1.11’, ‘u’=>’user11’, ‘p’=>’pass11’, ‘db’=>’database1’);
$this->db[1] = array(‘ip’=>’10.20.1.12’, ‘u’=>’user12’, ‘p’=>’pass12’, ‘db’=>’database1’);
$this->db[2] = array(‘ip’=>’10.20.1.13’, ‘u’=>’user13’, ‘p’=>’pass13’, ‘db’=>’database1’);
//print_r($db);
}
}
class MDatabaseList extends DatabaseList // jk : Master mysql servers
{
function __construct()
{
$this->db[0] = array(‘ip’=>’10.20.1.1’, ‘u’=>’user1’, ‘p’=>’pass1’, ‘db’=>’database1’);
$this->db[1] = array(‘ip’=>’10.20.1.2’, ‘u’=>’user2’, ‘p’=>’pass2’, ‘db’=>’database2’);
//print_r($db);
}
}
class MemcacheList extends DatabaseList // jk : memcache servers
{
function __construct()
{
$this->db[0] = array(‘ip’=>’localhost’, ‘port’=>11211);
}
}
Interface DatabaseSelectionStrategy  // jk : Database interface
{
public function getCurrentDb();
}
class StickyDbSelectionStrategy implements DatabaseSelectionStrategy // jk : sticky db . For update / delete / insert
{
private $dblist;
private $uid;
private $sessionDb;
private $sessionTimeout = 3600;
function __construct(DatabaseList $dblist)
{
$this->dblist = $dblist;
}
public function setUserId($uid)
{
$this->uid = $uid;
}
public function setSessionDb($sessionDb)
{
$this->sessionDb = $sessionDb->db;
}
private function getDbForUser() // jk : get db for this user. If not found – assign him random master db.
{
$memc = new Memcache;
foreach ($this->sessionDb as $key => $value) {
$memc->addServer($value[‘ip’], $value[‘port’]);
}
$dbIp = $memc->get($this->uid);
if($dbIp == null)
{
$masterlist = new MDatabaseList();
$randomdb = new RandomDbSelectionStrategy($masterlist);
$mdb = $randomdb->getCurrentDb();
$dbIp = $mdb[‘ip’];
$memc->set($this->uid, $dbIp, false, $this->sessionTimeout);
}
return $dbIp;
}
public function getCurrentDb()
{
$dbIp = $this->getDbForUser();
foreach ($this->dblist->db as $key => $value) 
{
if($value[‘ip’] == $dbIp)
return $value;
}
}
}
class RandomDbSelectionStrategy implements DatabaseSelectionStrategy // jk : select random db from list
{
private $dblist;
function __construct(DatabaseList $dblist)
{
//print_r($dblist);
$this->dblist = $dblist;
}
public function getCurrentDb()
{
//print_r($this->dblist);
$cnt = sizeof($this->dblist->db);
$rnd = rand(0,$cnt-1);
$current = $this->dblist->db[$rnd];
return $current;
}
}
class SingleDbSelectionStrategy implements DatabaseSelectionStrategy // jk : select one master db – to generate unique keys
{
private $dblist;
function __construct(DatabaseList $dblist)
{
$this->dblist = $dblist;
}
public function getCurrentDb()
{
//print_r($this->dblist);
return $this->dblist->db[0];
}
}
Interface Database
{
public function getIp();
public function getDbConnection();
}
class DatabaseFactory implements Database // cmt : database factory
{
private $db;
public function getIp()
{
return $this->db[‘ip’];
}
public function getDbConnection($type = ‘slave’, $uid = 0)
{
$dbStrategy;
switch($type)
{
case ‘slave’:
$dblist = new SDatabaseList();
//print_r($dblist);
$dbStrategy = new RandomDbSelectionStrategy($dblist);
break;
case ‘master’:
$dblist = new MDatabaseList();
//print_r($dblist);
$dbStrategy = new StickyDbSelectionStrategy($dblist);
$dbStrategy->setSessionDb(new MemcacheList());
$dbStrategy->setUserId($uid);
break;
case ‘unique’:
$dblist = new MDatabaseList();
//print_r($dblist);
$dbStrategy = new SingleDbSelectionStrategy($dblist);
break;
}
$this->db = $dbStrategy->getCurrentDb();
print_r($this->db);
// return mysql_connect($this->db[‘ip’], $this->db[‘u’], $this->db[‘p’], $this->db[‘db’]);
}
}
// tst :  test this out…
$factory = new DatabaseFactory();
echo ‘Slave : ‘; $factory->getDbConnection(‘slave’);
echo ‘Slave2 : ‘; $factory->getDbConnection(‘slave’);
echo ‘Unique : ‘; $factory->getDbConnection(‘unique’);
echo ‘New Master 100: ‘; $factory->getDbConnection(‘master’,100);
echo ‘New Master 101: ‘; $factory->getDbConnection(‘master’,101);
echo ‘New Master 102: ‘; $factory->getDbConnection(‘master’,102);
echo ‘old Master 100: ‘; $factory->getDbConnection(‘master’,100);
echo ‘old Master 102: ‘; $factory->getDbConnection(‘master’,102);
?>

Deployments

Deployments are a critical phase of any project. In the “stone age”, developers used to simply scp the files required into production. And there used to be issues when you are dealing with multiple http servers. Keeping all the servers in sync was always the issue.

Then capistrano came into picture. It made deployment of ruby/rails apps very easy. Me and a few other people went ahead and modified it to deploy code into production for php apps as well. But it was a tricky job since capistrano was originally designed to work for ruby/rails apps. Here is a sample capistrano code that sucks out code from svn and pushes it to multiple web servers – and then runs some specific post deployment tasks on them

deploy.rb

set :application, “app”
set :repository,  “http:///tags/TAG102”
set :imgpath, “/var/images”

# svn settings
set :deploy_via, :copy
set :scm, :subversion
set :scm_username, “svn_username”
set :scm_password, “svn_password”
set :scm_checkout, “export”
set :copy_cache, true
set :copy_exclude, [“.svn”, “**/.svn”]

# ssh settings
set :user, “server_username”
set :use_sudo, true
default_run_options[:pty] = true

#deployment settings
set :current_dir, “html”
set :deploy_to, “”
set :site_root, “/var/www/#{current_dir}”
set :keep_releases, 3

#web servers
role :web, “192.168.1.1”,”192.168.1.2″,”192.168.1.3″

#the actual script
namespace :deploy do
    desc <<-DESC
  deploy the app
    DESC
    task :update do
      transaction do
        update_code
            symlink
        end
      end

    task :finalize_update do
      transaction do
        sudo “chown -R apache.apache #{release_path}”
        sudo “ln -nfs #{imgpath}/images #{release_path}/images”     
      end
    end

    task :symlink do
      transaction do
            puts “Symlinking #{current_path} to #{site_root}.”
            sudo “ln -nfs #{release_path} #{site_root}”
      end
    end

    task :migrate do
      #do nothing
    end

    task :restart do
      #do nothing
    end   
end

This sucks out the code from the svn repository. creates a tar on local. Scps it to production web servers. Untars it to the specified location. Runs all the tasks specified in finalize_update and finally changes the symlink of “html” directory to the new deployed path. The good point about capistrano is that you are almost blind as to what happens in the backend. The bad point is that since you are blind, you do not know how to do what you want to do. It would need a bit of digging and a bit of tweaking to get your requirements fulfilled by this script.

Now lets check fabric.

Installation is quite easy using pip.

sudo pip install fabric

In case you like the old fashioned way, you can go ahead and download the source code and do a

sudo python setup.py install

To create a fabric script, you need to create a simple fab file with whatever you require. For example, if you need to run a simple command like ‘uname -a’ on all your servers, just create a simple script fabfile.py with the following code

from fabric.api import run

def host_type():
        run(‘uname -a’)

And run the script using the following command

$ fab -f fabfile.py -H localhost,127.0.0.1 host_type

[localhost] Executing task ‘host_type’
[localhost] run: uname -a
[localhost] Login password:
[localhost] out: Linux gamegeek 3.2.0-24-generic #37-Ubuntu SMP Wed Apr 25 08:43:22 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

[127.0.0.1] Executing task ‘host_type’
[127.0.0.1] run: uname -a
[127.0.0.1] out: Linux gamegeek 3.2.0-24-generic #37-Ubuntu SMP Wed Apr 25 08:43:22 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Done.
Disconnecting from localhost… done.
Disconnecting from 127.0.0.1… done.

A simple fabric script which can do whatever the earlier capistrano script was doing is here.

fabfile.py

from __future__ import with_statement
from fabric.api import *
from fabric.operations import local,put

def production():
  env.user = ‘server_username’
  env.hosts = [‘192.168.1.1′,’192.168.1.2′,’192.168.1.3’]
  env.deploy_to = ”
  env.site_root = ‘/var/www/html’
  env.tag_name = ‘tag101’
  env.repository = {  ‘url’:’http:///tags/TAG101′,
            ‘username’: ‘svn_username’,
            ‘password’: ‘svn_password’,
            ‘command’: ‘svn export –force’,
          }
  env.image_path = ‘/var/images’

def deploy():
  checkout()
  pack()
  unpack()
  symlinks()
  makelive()

def checkout():
  local(‘%s –username %s –password %s –no-auth-cache %s /tmp/%s’ %
    (env.repository[‘command’], env.repository[‘username’], env.repository[‘password’], env.repository[‘url’], env.tag_name));

def pack():
  local(‘tar -czf /tmp/%s.tar.gz /tmp/%s’ % (env.tag_name, env.tag_name))

def unpack():
  put(‘/tmp/%s.tar.gz’ % (env.tag_name), ‘/tmp/’)
  with cd(‘%s’ % (env.deploy_to)):
    run(‘tar -xzf /tmp/%s.tar.gz’ % (env.tag_name))

def symlinks():
  run(‘ln -nfs %s/images %s/%s/images’ % (env.image_path, env.deploy_to, env.tag_name))

def makelive():
  run(‘ln -nfs %s/%s %s’ % (env.deploy_to, env.tag_name, env.site_root))

The good point is that i have more control on what i want to do using fabric as compared to capistrano. And it took me a lot less time to cook the fabric recipe as compared to capistrano.

To run this script simply do

fab production deploy

This will execute the tasks production and deploy in that order. You can have separate settings for staging and local in the same script. You can even go ahead and create your own deployment infrastructure and process to do whatever you want without running into any restrictions.

cfengine – a beginners guide A tool to automate infra…

opcode cache comparison – xcache versus apc versus no-cache

Opcode caching tools like xcache, apc are also known as php accelerators. They work by storing the bytecode of interpreted php in shared memory.

To explain in common english, php is an interpreted language. Every time a http request comes, it picks up all php files required for its processing and interprets them – compiles them into machine readable code – known as opcode. The opcode is then run on the machine. Php accelerators speed up php execution by lazily compiling php (on demand compiling) and storing the generated opcode in shared memory. Thus preventing file io and overhead of interpreting the code again and again.

There are multiple solutions available for opcode caching in php. Here is the list available at wikipedia http://en.wikipedia.org/wiki/List_of_PHP_accelerators. APC and Xcache are two out of these which are somewhat famous.

I did a benchmark of both and tried to figure out which one would be better. Here is the benchmark result, and an analysis of the benchmark.

I ran the benchmark on my system

Configuration
Intel(R) Core(TM)2 Duo CPU T6570 @ 2.10GHz
L2 cache : 2048 Kb
RAM : 2 GB
php version 5.2.17
apache version 2.2.19
apc version 3.1.9
xcache version 1.3.2
Ubuntu 11.04 – 32 bit
siege was used to run the benchmarks

Installation
APC : sudo pecl install apc
xcache : download xcache, untar, ./configure; make; make install
Load the respective caching module in php.ini
extension = apc.so OR extension = xcache.so

Cache Configuration
For both xcache and apc, i accepted the default settings and changed only these variables
gc_interval = 3600;
ttl = 3600;
size = 32M;

I benchmarked a page deployed on my local system with images, tons of php code and lots of mysql queries.

Run 1 : 5 minutes with concurrency = 10

without cache

Transactions: 422 hits
Availability: 100.00 %
Elapsed time: 299.20 secs
Data transferred: 39.65 MB
Response time: 6.50 secs
Transaction rate: 1.41 trans/sec
Throughput: 0.13 MB/sec
Concurrency: 9.16
Successful transactions: 422
Failed transactions: 0
Longest transaction: 8.17
Shortest transaction: 3.44

APC

Transactions: 465 hits
Availability: 100.00 %
Elapsed time: 299.74 secs
Data transferred: 43.66 MB
Response time: 5.86 secs
Transaction rate: 1.55 trans/sec
Throughput: 0.15 MB/sec
Concurrency: 9.09
Successful transactions: 465
Failed transactions: 0
Longest transaction: 9.20
Shortest transaction: 3.91
Total Hits in cache: 85,773
Total Misses in cache: 223

Xcache

Transactions: 479 hits
Availability: 100.00 %
Elapsed time: 299.11 secs
Data transferred: 44.99 MB
Response time: 5.67 secs
Transaction rate: 1.60 trans/sec
Throughput: 0.15 MB/sec
Concurrency: 9.07
Successful transactions: 479
Failed transactions: 0
Longest transaction: 7.39
Shortest transaction: 3.80
Total Hits on cache: 87,884
Total Misses in cache: 158

As you can see with a concurrency of 10, xcache gives a transaction rate of 1.6, apc gives 1.55 and no-cache gives 1.41 transactions per second. There is a 10% improvement with apc and 14% improvement with xcache.

Shortest transaction with nocache was 3.44 where as that with xcache was 3.8 and that with apc was 3.91. This shows that xcache took less time as compared to apc for caching a page miss. The longest transaction of apc was higher than that of no-cache. But xcache’s longest transaction was better than the longest transaction of no-cache. Ideally the longest transaction of cached page should be less than that of no-cache page. Why APC had a higher longest transaction – i was unable to figure out.

Run 2 : 5 minutes with concurrency = 20

No cache

Transactions: 373 hits
Availability: 100.00 %
Elapsed time: 299.70 secs
Data transferred: 35.11 MB
Response time: 15.10 secs
Transaction rate: 1.24 trans/sec
Throughput: 0.12 MB/sec
Concurrency: 18.79
Successful transactions: 373
Failed transactions: 0
Longest transaction: 20.58
Shortest transaction: 5.41

APC

Transactions: 458 hits
Availability: 100.00 %
Elapsed time: 299.93 secs
Data transferred: 43.09 MB
Response time: 12.28 secs
Transaction rate: 1.53 trans/sec
Throughput: 0.14 MB/sec
Concurrency: 18.75
Successful transactions: 458
Failed transactions: 0
Longest transaction: 19.19
Shortest transaction: 9.73

Xcache

Transactions: 459 hits
Availability: 100.00 %
Elapsed time: 299.85 secs
Data transferred: 43.18 MB
Response time: 12.30 secs
Transaction rate: 1.53 trans/sec
Throughput: 0.14 MB/sec
Concurrency: 18.82
Successful transactions: 459
Failed transactions: 0
Longest transaction: 15.12
Shortest transaction: 6.60

In the second run though both xcache and apc have the same transaction rate of 1.53 which is 23% higher than that of no cache 1.24, but the longest transaction and shortest transaction of xcache was better than that of apc. It shows that xcache handles caching better than apc.

Eventually, there were some bugs in apc caused crashes when i tried to implement it. And xcache ran fine without any issues. This scenario shows that xcache is better.

Database speed tests (mysql and postgresql) – part 3 – code

Here is the code structure

dbfuncs.php : is the file which contains classes and functions for firing queries on mysql and pgsql
mysqlinsert.php : creates and fires inserts on mysql
mysqlselect.php : creates and fires selects on mysql
pgsqlinsert.php : creates and fires inserts on pgsql
pgsqlselect.php : creates and fires selects on pgsql
benchmark.php : script used to control concurrency and number of queries per script

Please refer to http://www.jayantkumar.in/index.php/2010/09/29/database-speed-tests-mysql-and_29/ and http://www.jayantkumar.in/index.php/2010/09/27/database-speed-tests-mysql-and-2/ for benchmarks of selects and inserts respectively.

And the code….

dbfuncs.php

abstract class dbfuncs
{
  abstract function insertqry($qry);
  abstract function selectqry($qry);
  abstract function connectdb();

  public function log($str)
  {
    $file = "error.log";
    $fp = fopen($file, "a");
    fwrite($fp, "$strn");
    fclose($fp);
  }
}

class mysqldb extends dbfuncs
{
  private $user = "root";
  private $pass = "jayant";
  private $host = "localhost";
  //private $port = 3307;
  private $socket = "/tmp/mysql.sock";
  private $database = "benchmark";

  public $db;

  function __construct()
  {
    $this->connectdb();
  }

  public function connectdb()
  {
    $this->db = mysql_connect($this->host.':'.$this->socket, $this->user, $this->pass) or die(mysql_error())
    mysql_select_db($this->database, $this->db);
  }

  public function insertqry($qry)
  {
    if(!mysql_ping($this->db))
      $this->connectdb();

    mysql_query($qry, $this->db) or $this->log(mysql_error());
  }

  public function selectqry($qry)
  {
    if(!mysql_ping($this->db))
      $this->connectdb();

    $rs = mysql_query($qry, $this->db) or $this->log(mysql_error());
    return $rs;
  }
}

class pgsqldb extends dbfuncs
{
  private $dns = "host=localhost port=5432 user=jayant password=12qwaszx dbname=benchmark";

  public $db;

  function __construct()
  {
    $this->connectdb();
  }

  public function connectdb()
  {
    $this->db = pg_connect($this->dns) or die(pg_last_error());
  }

  public function insertqry($qry)
  {
    if(!pg_ping($this->db))
      connectdb();

    pg_query($this->db, $qry) or $this->log(pg_last_error($this->db));
  }

  public function selectqry($qry)
  {
    if(!pg_ping($this->db))
      $this->connectdb();

    $rs = pg_query($this->db, $qry) or $this->log(pg_last_error($this->db));
    return $rs;
  }
}

function logtime($str)
{
    $file = "benchmark.log";
    $fp = fopen($file, "a");
    fputs($fp, $str);
    fclose($fp);
}

mysqlinsert.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new mysqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "insert into data (val, txt) values ('$xx','$x in $scriptno')";
  $mysql->insertqry($qry);
}
$end = microtime(true);
$log = "nMysql innodb Time to insert $count in run $scriptno = ".($end-$start);
logtime($log);

pgsqlinsert.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new pgsqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "insert into data (val, txt) values ('$xx','$x in $scriptno')";
  $mysql->insertqry($qry);
}
$end = microtime(true);
$log = "nAvg Pgsql Time to insert $count in run $scriptno = ".($end-$start);
logtime($log);

mysqlselect.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new mysqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "select * from `data` where val ='$xx'";
  $mysql->selectqry($qry);
}
$end = microtime(true);
$log = "nMysql innodb Time to select $count in run $scriptno = ".($end-$start);
logtime($log);

pgsqlselect.php

include "dbfuncs.php";

$scriptno = $argv[1]+1;
$count = $argv[2];

$mysql = new pgsqldb();
$start = microtime(true);
for($x=0; $x<$count; $x++)
{
  $xx = $x*$scriptno;
  $qry = "select * from data where val ='$xx'";
  $mysql->selectqry($qry);
}
$end = microtime(true);
$log = "nPgsql Time to select $count in run $scriptno = ".($end-$start);
logtime($log);

benchmark.php

$count = 100000;
$concurrency = 40;

for($i=0; $i<$concurrency; $i++)
{
   exec("php -q mysqlselect.php $i $count > /dev/null &");
//   exec("php -q pgsqlselect.php $i $count > /dev/null &");
//   exec("php -q mysqlinsert.php $i $count > /dev/null &");
//   exec("php -q pgsqlinsert.php $i $count > /dev/null &");
}

All test runs were individual – meaning that only one script was run at a time with different count and concurrency. For inserts $count was set to 10,000 and for selects $count was set to 100,000 while $concurrency kept on varying.

I am using ubuntu 10.04 with 32 bit kernel 2.6.32-24 and ext4 file system. And my system has around 3GB of RAM and Intel Core 2 DUO T5870 @ 2.0 GHz.

Hashing algos : Consistent Hashing

Hashing is a way of mapping keys to locations. Normally you would hash by using a simple Key%n algorithm – which ensures that keys are mapped evenly across n splits. The problem with this algo is that adding or removing a node (or a split) would require a complete rehash of all the keys. And if you have a huge data set, it is ideally not feasable to rehash and re-distribute the keys.

Consistent hashing is a way of hashing that ensures that adding or removing a slot or node does not change the mapping of keys to slots significantly. When using consistent hashing, only K/n keys need to be remapped on average – where K is the number of keys and n is the number of slots.

The way this works is that both keys and slots are mapped to edges of a circle. Meaning that all slots are mapped on to a series of angles around a circle. And the bucket where each item should be stored is chosen by selecting the next highest angle which an available bucket maps to. So, each bucket contains resources mapping to an angle between it and the next smallest angle. If a bucket becomes unavailable, the keys being mapped to that bucket get mapped to the next highest bucket (or the next bucket in the circle). So, only keys which were in the bucket which became unavailable is lost. Similarly when a bucket is added, the keys between the new bucket and the next smallest bucket is mapped to the new bucket. Keys which should be associated with the new bucket and were stored previously will become unavailable.

figure 2
figure 1

Here is an example. Objects 1,2,3 and 4 map to slots A,B and C. To find which slot an object goes in, we move around the circle until we find a slot. So here objects 1 and 4 go into slot A, 2 goes into slot B and 3 goes into slot C. If C is removed, object 3 would belong to slot A. If another slot D is added as shown in figure 2, it will take objects 3 and 4 and only leave object 1 belonging to A.

Lets look at a php example which does consistent hasing us.

<?php
class ConsistentHasher
{
  private $nodeDistribution;
  private $virtualNodes;

  // nodeDistribution is the replica count per node.
  public function __construct($nodenames, $nodedistribution)
  {
    $this->nodeDistribution = $nodedistribution;
    $this->virtualNodes = array();

    for($i=0; $i<count($nodenames); $i++)
    {
      $this->addNode($nodenames[$i]);
    }
  }

  // The addNode() function takes a name and creates virtual nodes (or replicas) by 
  // appending the index of the local loop to the node name.
  // The hash value of a virtual node is an MD5 hash, base converted into an integer. 
  // The virtual node is added to a list and sorted by its value so that we ensure 
  // a lowest to highest traversal order for looking up previous and next virtual nodes
  public function addNode($name)
  {
    for($i=0; $ivirtualNodes[$name.$i] = $virtualNodeHashCode;
    }
    asort($this->virtualNodes, SORT_NUMERIC);
  }

  public function dump()
  {
    print_r($this->virtualNodes);
    echo "nn";
  }

  public function sortCompare($a, $b)
  {
    if($a == $b)
    {
      return 0;
    }
    return ($a < $b) ? -1 : 1;
  }

  // The removeNode() function takes a name and removes its corresponding virtual nodes 
  // from the virtualNode list.
  // We then resort the list to ensure a lowest to highest traversal order.
  public function removeNode($name)
  {
    for($i=0; $i<$this->nodeDistribution; $i++)
    {
      unset($this->virtualNodes[$name.$i]);
    }
    asort($this->virtualNodes, SORT_NUMERIC);
  }

  // The hashToNode() function takes a key and locates the node where its value resides.
  // We loop through our virtual nodes, stopping before the first one that has a
  // hash greater than that of the key’s hash.
  // If we come to the end of the virtual node list, we loop back to the beginning to 
  // form the conceptual circle.

  public function hashToNode($key)
  {
    $keyHashCode = base_convert(substr(md5($key, false),0,8),16,10);
    $virtualNodeNames = array_keys($this->virtualNodes);
    $firstNodeName = $virtualNodeNames[0];
    $lastNodeName = $virtualNodeNames[count($virtualNodeNames)-1];
    $prevNodeName = $lastNodeName;

    foreach($this->virtualNodes as $name => $hashCode)
    {
      if($keyHashCode < $hashCode)
        return $prevNodeName;

      if($name == $lastNodeName)
        return $firstNodeName;

      $prevNodeName = $name;
    }
    return $prevNodeName;
  }
}

// demo

$hash = new ConsistentHasher(array("node1","node2","node3","node4","node5"),10);
$hash->dump();

$hash->removeNode("node2");
$hash->dump();

$hash->addNode("node6");
$hash->dump();

echo $hash->hashToNode("testing123")."n";
echo $hash->hashToNode("key1111")."n";
echo $hash->hashToNode("data4444")."n";
?>

Here is a library which provides consistent hasing for php
http://code.google.com/p/flexihash/

Memcache is a widely used distributed cache which uses consistent hashing very efficiently to map keys to caching nodes.

References:
http://en.wikipedia.org/wiki/Consistent_hashing
http://www.osconvo.com/post/view/2010/9/1/distributed-hashing-algorithms-by-example-consistent-hashing

An intro to Thrift

What is thrift – a software framework for scalable cross language service development. It combines software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, Ruby, Erlang, Perl, Smalltalk etc.

Thrift allows you to define data types and service interfaces in a simple definition file. Taking the definition file as input the thrift compiler generates code that can be used to easily build RPC clients and servers that communicate seamlessly across programming languages.

You can find all this stuff at http://incubator.apache.org/thrift/.

My intention here is to provide a step by step guide to installing and building simple services in java/php using thrift. Since till date there is no proper tutorial, this should be helpful.

To download thrift, you can either get it from http://incubator.apache.org/thrift/download/ or do

$ svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk thrift
$ cd thrift

Installation steps

$ ./bootstrap.sh
$ ./configure
$ make
$ make install

thrift depends on c++ boost libraries. To install boost libraries on ubuntu do

$ sudo apt-get install libboost-dev libboost-doc libboost-thread-dev libboost-python-dev

or

$ sudo apt-get install libboost-*

One problem i ran into while doing “sudo make install” was

Error: JAVA_HOME is not defined correctly.
We cannot execute java

The solution to this is

$ su
$ <enter password>
$ export JAVA_HOME=/path/to/jdk
$ export ANT_HOME=/path/to/ant
$ export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$PATH
$ make install

It would install the required libraries for all supported languages.

Now lets try writing a very simple service using thrift and use php/java clients to communicate with the service.

First we need to create a thrift definition file.

$ mkdir test # create a separate working directory
$ cd test
$ vim time.thrift # paste the following code in time.thrift

namespace java tserver.gen  // define namespace for java code
namespace php tserver  // define namespace for php code

typedef i64 Timestamp

service TimeServer {  // define the service you want to implement
  Timestamp time(),   // define the funcionalities the service would provide.
            string date(),
            i64 multiply(1:i32 num1, 2:i32 num2),
}

Namespaces are the places where code would be generated – what we call package in java.

Create a separate directory to store source files

$ mkdir src
$ cd src
$ vim TimeServerImpl.java

Create a java source file for implementing the functions that we had defined in the time.thrift file. The name of the file is the Service name (as defined in the thrift file) followed by “Impl”. Provide the body of the functions in this file.

package server;

import java.util.*;
import org.apache.thrift.*;
import tserver.gen.*;

class TimeServerImpl implements TimeServer.Iface
{
  public long time() throws TException
  {
    long time = System.currentTimeMillis();
    System.out.println("Current time : "+time);
    return time;
  }

  public String date() throws TException
  {
    Calendar now = Calendar.getInstance();
    String dt = now.get(Calendar.YEAR)+"-"+(now.get(Calendar.MONTH)+1)+"-"+now.get(Calendar.DAY_OF_MONTH)+" "+now.get(Calendar.HOUR_OF_DAY)+":"+now.get(Calendar.MINUTE)+":"+now.get(Calendar.SECOND);
    System.out.println(" date : "+dt);
    return dt;
  }

  public long multiply(int num1, int num2) throws TException
  {
    System.out.println("Got : "+num1+", "+num2);
    return num1*num2;
  }
}

$ vim Server.java

Create a file for providing the service. This program simply has a main function which binds the service to a particular port and makes the server ready to accept connections and provide response. This code will generally remain constant unless you want to provide additional functionality at server level.

package server;

import java.io.*;
import org.apache.thrift.protocol.*;
import org.apache.thrift.protocol.TBinaryProtocol.*;
import org.apache.thrift.server.*;
import org.apache.thrift.transport.*;
import tserver.gen.*;

public class Server
{
  private void start()
  {
    try
    {
      TServerSocket serverTransport = new TServerSocket(7911);
      TimeServer.Processor processor = new TimeServer.Processor(new TimeServerImpl());
      Factory protFactory = new TBinaryProtocol.Factory(true, true);
      TServer server = new TThreadPoolServer(processor, serverTransport, protFactory);
      System.out.println("Starting server on port 7911 ...");
      server.serve();
    }catch(TTransportException e)
    {
      e.printStackTrace();
    }
  }

  public static void main(String[] args)
  {
    Server srv = new Server();
    srv.start();
  }
}

$ vim TimeClient.java

And create a client which will connect to the server on the specified port and call the provided functions.

package client;

import java.net.*;
import org.apache.thrift.*;
import org.apache.thrift.protocol.*;
import org.apache.thrift.transport.*;

import tserver.gen.TimeServer.Client;

public class TimeClient {
  private void start(){
    TTransport transport;
    try {
      transport = new TSocket("localhost", 7911);
      TProtocol protocol = new TBinaryProtocol(transport);
      // this client depends on the client class in the gen-java/tserver/gen/TimeServer.java file
      // public static class Client implements TServiceClient, Iface
      Client client = new Client(protocol); 
      transport.open();
      long time = client.time();
      System.out.println("Time from server:" + time);
      String dt = client.date();
      System.out.println("Got date from server: "+dt);
      long res = client.multiply(30,100);
      System.out.println("Multiply from server: "+res);
      transport.close();
    } catch (TTransportException e) {
      e.printStackTrace();
    } catch (TException e) {
      e.printStackTrace();
    }
  }

  public static void main(String[] args) {
    TimeClient c = new TimeClient();
    c.start();
  }
}

Now lets move ahead and generate the thrift code. And compile everything.

$ thrift –gen java time.thrift
$ vim build.xml

Write a simple build.xml to build these files

<project name="tserver" default="tserver" basedir=".">
<description>Time Server Tutorial</description>

<property name="src" location="src" />
<property name="gen" location="gen-java" />
<property name="build" location="build" />
<property name="cpath" location="/usr/local/lib/libthrift.jar:/usr/local/lib/slf4j-api-1.5.11.jar" />

<target name="init">
<tstamp />
<mkdir dir="${build}"/>
</target>

<target name="compile" depends="init">
<javac srcdir="${gen}" destdir="${build}" classpath="${cpath}" >  
<compilerarg value="-Xlint:deprecation"/> 
</javac>
<javac srcdir="${src}" destdir="${build}" classpath="${cpath}:${gen}" >   
<compilerarg value="-Xlint:deprecation"/> 
</javac>
</target>

<target name="tserver" depends="compile">
<jar jarfile="tserver.jar" basedir="${build}"/>
</target>

<target name="clean">
<delete dir="${build}" />
<delete file="tserver.jar" />
</target>

</project>

If you will check out the build file, you will see that the classpath contains slf4j-api-x.y.z.jar file. Download these files from http://www.slf4j.org/download.html and put two files (slf4j-api-x.y.z.jar and slf4j-simple-x.y.z.jar) from the archive in the /usr/local/lib directory to satisfy the required dependencies.

To complile the file run ant and create two files runClient and runServer to run the server and clients.

$ ant
$ cat runClient
java -cp tserver.jar:/usr/local/lib/libthrift.jar:/usr/local/lib/slf4j-api-1.5.11.jar:/usr/local/lib/slf4j-simple-1.5.11.jar client.TimeClient
$ cat runServer
java -cp tserver.jar:/usr/local/lib/libthrift.jar:/usr/local/lib/slf4j-api-1.5.11.jar:/usr/local/lib/slf4j-simple-1.5.11.jar server.Server

On compiling you will get the tserver.jar file created.

Now run the server and the client

$ ./runServer
Starting server on port 7911 …

$ ./runClient
Time from server:1270726146716
Got date from server: 2010-4-8 16:59:6
Multiply from server: 3000

Now lets try building a client in php. First we will generate the thrift-code for php and then write the client to interact with the java server.

$ thrift –gen php time.thrift
$ mkdir php # create a directory to store your php client source code.
$ cd php
$ vim PhpClient.php

<?php

//location of thrift php libraries
$GLOBALS['THRIFT_ROOT'] = './lib';

require_once $GLOBALS['THRIFT_ROOT'].'/Thrift.php';
require_once $GLOBALS['THRIFT_ROOT'].'/protocol/TBinaryProtocol.php';
require_once $GLOBALS['THRIFT_ROOT'].'/transport/TSocket.php';
require_once $GLOBALS['THRIFT_ROOT'].'/transport/THttpClient.php';
require_once $GLOBALS['THRIFT_ROOT'].'/transport/TBufferedTransport.php';

error_reporting(E_ALL);
$GEN_DIR = '../gen-php';
require_once $GEN_DIR.'/time/TimeServer.php';
require_once $GEN_DIR.'/time/time_types.php';
error_reporting(E_ALL);

try {
  $socket = new TSocket('localhost', 7911);
  $transport = new TBufferedTransport($socket, 1024, 1024);
  $protocol = new TBinaryProtocol($transport);
  // this client depends on the client class in the gen-php/time/TimeServer.php file
  // class TimeServerClient implements TimeServerIf
  $client = new TimeServerClient($protocol);

  $transport->open();

  $time = $client->time();
  print "time frm server = $timen";
  $res = $client->multiply(100,5000);
  print "multiply frm server = $resn";
  $transport->close();

} catch (TException $tx) {
  print 'TException: '.$tx->getMessage()."n";
}
?>

As you can see, in the code, the thrift library for php is included in the lib directory inside the php directory.
Create the lib directory and copy the required library files from thrift installation directory to the lib directory.

$ mkdir lib
$ cp -r /path/to/thrift/lib/php/src/* lib/

Now try running your php client. Remember to fire up your java server.

$ php -q PhpClient.php

You may get tons of errors “failed opening ‘./lib/packages/time/time_types.php’ for inclusion”. For this you will need to edit the generated php files. The time_types.php file is in the generated directory “gen-php”.

$ vim ../gen-php/time/TimeServer.php

Add the following lines

include_once '../gen-php/time/time_types.php';
//include_once $GLOBALS['THRIFT_ROOT'].'/packages/time/time_types.php';

now “php -q PhpClient.php” works…

Design patterns : Decorator pattern

A decorator is a class which wraps the original class. This allows the user to extend the functionality of certain objects at runtime. The pattern is designed in a way that multiple decorators can be stacked on top of each other. Decorators are used to avoid the rigidity that subclassing provides.

php example

<?php                                                                                                                                                                   
/**                                                                                                                                                                     
 * The smallest cohesive interface we can think of for this type        
 * of Decorator. This is the Component interface.
*/                                                                                                                                                                     
interface HtmlElement                                                   
{ /**                                                                                                * @return string    html code
*/                                                                                                                                                                   
  public function __toString();                                         

  /**
   * @return string    the name of the POST request key for this element,
   *                   aka the "name" attribute.                         
   */                                                                    
  public function getName();                                             
}                                                                        

/**
 * Represents a <input type="text" /> html element.
 * It can be created programmatically and then printed.  
 * This is the only ConcreteComponent.                   
 */                                                      
class InputText implements HtmlElement                   
{                                                        
  protected $_name;                                      

  public function __construct($name)
  {                                 
    $this->_name = $name;           
  }                                 

  public function getName()
  {                        
    return $this->_name;   
  }                        

  public function __toString()
  {                           
    return "<input type="text" id="{$this->_name}" name="{$this->_name}" />n";
  }                                                                                        
}                                                                                          

/**
 * Very simple base class to share the wrapping code between Decorators.
 * This is the Decorator participant.                                   
 */                                                                     
abstract class HtmlDecorator implements HtmlElement                     
{                                                                       
  protected $_element;                                                  

  public function __construct(HtmlElement $input)
  {                                              
    $this->_element = $input;                    
  }                                              

  /**
   * All operations are delegated by default, not changing anything
   * of the original behavior.                                     
   * ConcreteDecorators will override the methods they are interested in.
   */                                                                    
  public function getName()                                              
  {                                                                      
    return $this->_element->getName();                                   
  }                                                                      

  public function __toString()
  {                           
    return $this->_element->__toString();
  }                                      
}                                        

/**
 * Adds a custom <label> element alongside the <input> one.
 * Example of ConcreteDecorator.                           
 */                                                        
class LabelDecorator extends HtmlDecorator                 
{
  protected $_label;                                       

  public function setLabel($label)
  {                               
    $this->_label = $label;       
  }                               

  public function __toString()
  {                           
    $name = $this->getName(); 
    return "<label for="{$name}">{$this->_label}</label>n"
      . $this->_element->__toString();                                      
  }                                                                         
}                                                                           

/**
 * Adds a <span> containing an error message after the <input> element.
 * Example of ConcreteDecorator.                                       
 */                                                                    
class ErrorDecorator extends HtmlDecorator                             
{                                                                      
  protected $_error;                                                   

  public function setError($message)
  {                                 
    $this->_error = $message;       
  }                                 

  public function __toString()
  {                           
    return $this->_element->__toString()
      . "<span>{$this->_error}</span>n";
  }                                                     
}                                                       

$input = new InputText('nickname');
$labelled = new LabelDecorator($input);
$labelled->setLabel('Nick:');          
echo "Using labelDecorator => n",$labelled, "n";

$input = new InputText('nickname');
$error = new ErrorDecorator($input);
$error->setError('You must enter a unique nickname');
echo "Using errorDecorator => n",$error, "n";   

// how can we obtain a LabelledErrorInputText, which has both the <label>
// and <span> elements?                                                  
$input = new InputText('nickname');                                      
$labelled = new LabelDecorator($input);                                  
$labelled->setLabel('Nick:');                                            
$error = new ErrorDecorator($labelled); // a Decorator wrapping another one
$error->setError('You must enter a unique nickname');                      
echo "Using both labelDecorator & errorDecorator => n".$error;         
?>                                                                         

Output : 

Using labelDecorator =>
<label for="nickname">Nick:</label>
<input type="text" id="nickname" name="nickname" />

Using errorDecorator =>
<input type="text" id="nickname" name="nickname" />
<span>You must enter a unique nickname</span>

Using both labelDecorator & errorDecorator =>
<label for="nickname">Nick:</label> 
<input type="text" id="nickname" name="nickname" />
<span<You must enter a unique nickname>/span>

Some code in java

interface iComponent                                             
{                                                                
  public void doStuff();                                         
}                                                                

class component implements iComponent
{                                    
  public void doStuff()              
  {                                  
    System.out.println("component does stuff");
  }                                            
}                                              

interface decorator extends iComponent
{                                     
  public void addedBehaviour();       
}                                     

class concreteDecorator implements decorator
{                                           
  iComponent comp;                          
  public concreteDecorator(iComponent comp) 
  {                                         
    super();                                
    this.comp = comp;                       
  }                                         

  public void addedBehaviour()
  {                           
    System.out.println("Added behaviour in decorator");
  }                                                    

  public void doStuff()
  {                    
    comp.doStuff();    
    addedBehaviour();  
  }                    
}                      

class concreteDeco2 implements decorator
{                                       
  iComponent comp;                      
  public concreteDeco2(iComponent comp) 
  {                                     
    super();                            
    this.comp = comp;                   
  }                                     

  public void addedBehaviour()
  {
    System.out.println("Added behaviour in deco no 2");
  }

  public void doStuff()
  {
    comp.doStuff();
    addedBehaviour();
  }

}

public class decoClient
{
  public static void main(String[] args)
  {
    iComponent comp = new component();
    decorator deco = new concreteDecorator(comp);
    deco.doStuff();
    decorator deco2 = new concreteDeco2(comp);
    deco2.doStuff();
  }
}

Output :

component does stuff
Added behaviour in decorator
component does stuff
Added behaviour in deco no 2

Design Patterns : Composite Design Pattern

A composite design pattern builds complex objects out of simple objects and itself like a tree structure. Its goal is managing a hierarchy of objects where both leaf objects and composition of other objects conform to a common interface.

The client sends a message to the head component. The component declares the interface that various parts of the graph should respect. A leaf is a concrete class that has no children. A composite is a concrete class that composes other components.

PHP Code :

/**
 * Component interface.
 * The Client depends only on this abstraction, whatever graph is built using
 * the specializations.
 */
interface HtmlElement
{
  /**
   * @return string   representation
   */
  public function __toString();
}

/**
 * Leaf sample implementation.
 * Represents an <h1> element.
 */
class H1 implements HtmlElement
{
  private $_text;

  public function __construct($text)
  {
    $this->_text = $text;
  }

  public function __toString()
  {
    return "<h1>{$this->_text}</h1>";
  }
}

/**
 * Leaf sample implementation.
 * Represents a <p> element.
 */
class P implements HtmlElement
{
  private $_text;

  public function __construct($text)
  {
    $this->_text = $text;
  }

  public function __toString()
  {
    return "<p>{$this->_text}</p>";
  }
}

/**
 * A Composite implementation, which accepts as children generic Components.
 * These children may be H1, P or even other Divs.
 */
class Div implements HtmlElement
{
  private $_children = array();

  public function addChild(HtmlElement $element)
  {
    $this->_children[] = $element;
  }

  public function __toString()
  {
    $html = "<div>n";
    foreach ($this->_children as $child) {
      $childRepresentation = (string) $child;
      $childRepresentation = str_replace("n", "n    ", $childRepresentation);
      $html .= '    ' . $childRepresentation . "n";
    }
    $html .= "</div>";
    return $html;
  }
}

// Client code
$div = new Div();
$div->addChild(new H1('Title'));
$div->addChild(new P('Lorem ipsum...'));
$sub = new Div();
$sub->addChild(new P('Dolor sit amet...'));
$div->addChild($sub);
echo $div, "n";


Output

<div>
  <h1>Title</h1>
  <p>Lorem ipsum...</p>
  <div>
    <p>Dolor sit amet...</p>
  </div>
</div>

Java Code : 

import java.util.List;
import java.util.ArrayList;

/** "Component" */
interface Graphic {

  //Prints the graphic.
  public void print();
}

/** "Composite" */
class CompositeGraphic implements Graphic {

  //Collection of child graphics.
  private List mChildGraphics = new ArrayList();

  //Prints the graphic.
  public void print() {
    for (Graphic graphic : mChildGraphics) {
      graphic.print();
    }
  }

  //Adds the graphic to the composition.
  public void add(Graphic graphic) {
    mChildGraphics.add(graphic);
  }

  //Removes the graphic from the composition.
  public void remove(Graphic graphic) {
    mChildGraphics.remove(graphic);
  }
}

/** "Leaf" */
class Ellipse implements Graphic {

  String myname = "Ellipse";

  public Ellipse(int i){
    this.myname = myname + ":"+i;
  }

  //Prints the graphic.
  public void print() {
    System.out.println(this.myname);
  }
}

/** Client */
public class Program {

  public static void main(String[] args) {
    //Initialize four ellipses
    Ellipse ellipse1 = new Ellipse();
    Ellipse ellipse2 = new Ellipse();
    Ellipse ellipse3 = new Ellipse();
    Ellipse ellipse4 = new Ellipse();

    //Initialize three composite graphics
    CompositeGraphic graphic = new CompositeGraphic();
    CompositeGraphic graphic1 = new CompositeGraphic();
    CompositeGraphic graphic2 = new CompositeGraphic();

    //Composes the graphics
    graphic1.add(ellipse1);
    graphic1.add(ellipse2);
    graphic1.add(ellipse3);

    graphic2.add(ellipse4);

    graphic.add(graphic1);
    graphic.add(graphic2);

    //Prints the complete graphic (four times the string "Ellipse").
    graphic.print();
  }
}

OUTPUT:

Ellipse:1
Ellipse:2
Ellipse:3
Ellipse:4