Monday, March 22, 2010

Design patterns : Decorator pattern

A decorator is a class which wraps the original class. This allows the user to extend the functionality of certain objects at runtime. The pattern is designed in a way that multiple decorators can be stacked on top of each other. Decorators are used to avoid the rigidity that subclassing provides.

php example
<?php                                                                                                                                                                   
/**                                                                                                                                                                     
 * The smallest cohesive interface we can think of for this type        
 * of Decorator. This is the Component interface.
*/                                                                                                                                                                     
interface HtmlElement                                                   
{ /**                                                                                                * @return string    html code
*/                                                                                                                                                                   
  public function __toString();                                         

  /**
   * @return string    the name of the POST request key for this element,
   *                   aka the "name" attribute.                         
   */                                                                    
  public function getName();                                             
}                                                                        

/**
 * Represents a <input type="text" /> html element.
 * It can be created programmatically and then printed.  
 * This is the only ConcreteComponent.                   
 */                                                      
class InputText implements HtmlElement                   
{                                                        
  protected $_name;                                      

  public function __construct($name)
  {                                 
    $this->_name = $name;           
  }                                 

  public function getName()
  {                        
    return $this->_name;   
  }                        

  public function __toString()
  {                           
    return "<input type=\"text\" id=\"{$this->_name}\" name=\"{$this->_name}\" />\n";
  }                                                                                        
}                                                                                          

/**
 * Very simple base class to share the wrapping code between Decorators.
 * This is the Decorator participant.                                   
 */                                                                     
abstract class HtmlDecorator implements HtmlElement                     
{                                                                       
  protected $_element;                                                  

  public function __construct(HtmlElement $input)
  {                                              
    $this->_element = $input;                    
  }                                              

  /**
   * All operations are delegated by default, not changing anything
   * of the original behavior.                                     
   * ConcreteDecorators will override the methods they are interested in.
   */                                                                    
  public function getName()                                              
  {                                                                      
    return $this->_element->getName();                                   
  }                                                                      

  public function __toString()
  {                           
    return $this->_element->__toString();
  }                                      
}                                        

/**
 * Adds a custom <label> element alongside the <input> one.
 * Example of ConcreteDecorator.                           
 */                                                        
class LabelDecorator extends HtmlDecorator                 
{
  protected $_label;                                       

  public function setLabel($label)
  {                               
    $this->_label = $label;       
  }                               

  public function __toString()
  {                           
    $name = $this->getName(); 
    return "<label for=\"{$name}\">{$this->_label}</label>\n"
      . $this->_element->__toString();                                      
  }                                                                         
}                                                                           

/**
 * Adds a <span> containing an error message after the <input> element.
 * Example of ConcreteDecorator.                                       
 */                                                                    
class ErrorDecorator extends HtmlDecorator                             
{                                                                      
  protected $_error;                                                   

  public function setError($message)
  {                                 
    $this->_error = $message;       
  }                                 

  public function __toString()
  {                           
    return $this->_element->__toString()
      . "<span>{$this->_error}</span>\n";
  }                                                     
}                                                       

$input = new InputText('nickname');
$labelled = new LabelDecorator($input);
$labelled->setLabel('Nick:');          
echo "Using labelDecorator => \n",$labelled, "\n";

$input = new InputText('nickname');
$error = new ErrorDecorator($input);
$error->setError('You must enter a unique nickname');
echo "Using errorDecorator => \n",$error, "\n";   

// how can we obtain a LabelledErrorInputText, which has both the <label>
// and <span> elements?                                                  
$input = new InputText('nickname');                                      
$labelled = new LabelDecorator($input);                                  
$labelled->setLabel('Nick:');                                            
$error = new ErrorDecorator($labelled); // a Decorator wrapping another one
$error->setError('You must enter a unique nickname');                      
echo "Using both labelDecorator & errorDecorator => \n".$error;         
?>                                                                         

Output : 

Using labelDecorator =>
<label for="nickname">Nick:</label>
<input type="text" id="nickname" name="nickname" />

Using errorDecorator =>
<input type="text" id="nickname" name="nickname" />
<span>You must enter a unique nickname</span>

Using both labelDecorator & errorDecorator =>
<label for="nickname">Nick:</label> 
<input type="text" id="nickname" name="nickname" />
<span<You must enter a unique nickname>/span>


Some code in java

interface iComponent                                             
{                                                                
  public void doStuff();                                         
}                                                                

class component implements iComponent
{                                    
  public void doStuff()              
  {                                  
    System.out.println("component does stuff");
  }                                            
}                                              

interface decorator extends iComponent
{                                     
  public void addedBehaviour();       
}                                     

class concreteDecorator implements decorator
{                                           
  iComponent comp;                          
  public concreteDecorator(iComponent comp) 
  {                                         
    super();                                
    this.comp = comp;                       
  }                                         

  public void addedBehaviour()
  {                           
    System.out.println("Added behaviour in decorator");
  }                                                    

  public void doStuff()
  {                    
    comp.doStuff();    
    addedBehaviour();  
  }                    
}                      

class concreteDeco2 implements decorator
{                                       
  iComponent comp;                      
  public concreteDeco2(iComponent comp) 
  {                                     
    super();                            
    this.comp = comp;                   
  }                                     

  public void addedBehaviour()
  {
    System.out.println("Added behaviour in deco no 2");
  }

  public void doStuff()
  {
    comp.doStuff();
    addedBehaviour();
  }

}

public class decoClient
{
  public static void main(String[] args)
  {
    iComponent comp = new component();
    decorator deco = new concreteDecorator(comp);
    deco.doStuff();
    decorator deco2 = new concreteDeco2(comp);
    deco2.doStuff();
  }
}

Output :

component does stuff
Added behaviour in decorator
component does stuff
Added behaviour in deco no 2

Friday, March 05, 2010

Storing trees in databases

A tree is a graph which is connected, uni-directed and acyclic. Lets look at different options of storing such trees in databases.

Parent - Child Model

The most common model for storing hierarchical information is storing the reference of the parent node along with the child node.

so for a tree

EmployeeBosssalary
ANULL1000
BA900
CA950
DC800
EC700
FC600

For this model, the table to be created would be:

create table employee_tree(employee varchar(10) not null primary key, boss varchar(10), salary decimal(10,2) not null);

Some sample operations :
  • Finding the root node : select * from employee_tree where boss is NULL
  • Finding all leaf nodes : select e1.* from employee_tree as e1 left join employee_tree as e2 on e1.employee=e2.boss where e2.employee is null
  • Retrieving a single path : select e1.employee as lev1, e2.employee as lev2, e3.employee as lev3 from employee_tree as e1 left join employee_tree as e2 on e2.boss = e1.employee left join employee_tree as e3 on e3.boss = e2.employee where e1.employee = 'A' and e3.employee='F';

Some problems with this model:
  • Update : if you run "update employee_tree set employee = 'C1' where employee='C'", then all nodes below "C" are left with boss who does notexist. To handle this, you should run a transaction with two queries, one being the above query and another being the query "update employee_tree set boss = 'C1' where employee = 'C'"
  • Insert : if you run a query "Insert into employee_tree (employee, boss) values ('G','H'),('H','G')" you get two employees 'G' & 'H' who are both employees and bosses of each other.
  • Insert : Another problem with insert is "Insert into employee_tree (employee, boss) values ('M','M')". Now 'M' is the boss of 'M'.
  • Delete : Run "Delete from employee_tree where emp = 'C'". Again we create a situation where 'D','E','F' are left with a boss who does not exist. So in effect, deletion would require the tree to be traversed.
  • Descendants : It is difficult to get the number of descendents for a given node in a single query. You will have to write some script and un some recursive function to get that number

Fixing this model: Following checks should be in place to fix this model.
  • An employee cannot be his own boss : Check if ( boss != employee ) or (boss = 0 and employee = 0)
  • On deletion use cascade to remove the employees to that boss
  • To prevent cyclic relations : add unique key (employee, boss)
  • Check for validity of tree (edges = nodes - 1) : count(*) from employee_tree - 1 = count(boss) from employee_tree where boss is not null
  • Check for only single root : select count(*) from employee_tree where boss_id is null = 1

Path Enumeration model

create table employee(emp_id char(1) not null primary key, emp_name varchar(10), emp_path varchar(255) not null);
emp_idemp_nameemp_path
AA NameA
BB NameAB
CC NameAC
DD NameACD
EE NameACE
FF NameACF
The sample operations are easy here
  • Finding root node : select * from employee where length(emp_path)=1;
  • Finding all leaf nodes : This is going to be difficult, You will have to write a script to extract this info
  • Prevent cyclic relations : check whether after removing the node from the path string, the path is reduced by only 1 node. Query : CHECK(NOT EXISTS(select * from employee as e1, employee as e2 where length(replace(e1.emp_path,e2.emp_id,'')) < (length(e1.emp_path)-length(e2.emp_id)) ));
  • No path can be longer than the total no of nodes in the tree : CHECK( (select max(length(e1.emp_path)) from employee as e1) <= (select coun t(e2.emp_id) * length(e2.emp_id) from employee as e2) )
  • Depth of the tree is the path divided by the length of emp_id string : CHAR_LENGTH(emp_path) / CHAR_LENGTH(emp_id); If emp_ids are not fixed length then the depth is : CHAR_LENGTH(emp_path) - CHAR_LENGTH(REPLACE(emp_path,'/','')) + 1 [here / is the separator that is path is written as 'A/B/C'. If you use any other separator, you need to change it in the function as well]
  • To find all subtrees for a parent : the immediate solution is "Select * from employee where emp_path like concat('%',<Parent_emp_id>,'%')". But this query does not use indexes. Another way of doing this is "select * from employee where emp_path like concat((select emp_path from employee where emp_id='<Parent_emp_id>'),'%');"
  • To find the immediate subchildren of a parent : "select * from employee where emp_path like concat((select emp_path from employee where emp_id='<Parent_emp_id>'),'_');"
  • To search for superiors : "create table sequence(seq int not null); insert into sequence values (1),(2),(3),(4),(5); select substring(e1.emp_path from (seq * char_length(e1.emp_id)) for char_length(e1.emp_id)) as emp_id from employee e1, sequence s1 where e1.emp_id = '<emp_id_to_search_for>' and s1.seq <= char_length(e1.emp_path)/char_length(e1.emp_id);"
  • To find the relationships among superiors : "select e2.* from employee e1, employee e2 where e1.emp_id = '<emp_id_to_search_for>' and POSITION(e2.emp_path in e1.emp_path) = 1;"
  • To delete a subtree : "DELETE from employee where emp_path like concat(select emp_path from employee where emp_id='<Dead_node_emp_id>','%')" We may need to write the two queries separately;
  • To delete a node : "Delete from employee where emp_id='<emp_id_of_dead_node>'" And "Update employee set emp_path = replace(emp_path,'<emp_id_of_dead_node>','')" Both queries should be in a single transaction.
  • Inserting a node : "select emp_path from employee where emp_id = '<emp_id_of_boss>'; insert into employee values ('<new_emp_id>','<new_emp_name>',concat('<selected_emp_boss_path>','<new_emp_id>'))"

A derivative of the path enumeration model is the edge enumeration model. In Edge enumeration model the path consists of a set of integers that make the path from the root to the node - from left to right.

For example :
Employee nameEdge path
A1.
B1.1.
C1.2.
D1.2.1
E1.2.2
F1.2.3
The benefit of the edge enumeration model over the path enumeration model is that you do not have to worry about long strings. The numbers give an implied ordering to siblings that have meaning.

Nested set model


Here each node is assigned a beginning and an ending node hierarchy number based on the total amount of data in the hierarchical tree.

An example table could be :

create table emp(emp_id int not null auto_increment primary key, emp_name varchar(10), lft int not null, rgt int not null);
Emp idEmp NameLftRgt
1A112
2B23
3C411
4D56
5E78
6F910
Operations in this model:
  • Finding the root node : "select * from emp where lft = 1"
  • Finding leaf nodes : "select * from emp where lft = rgt - 1"
  • Finding subtree : select node.* from emp as node, emp as parent where node.lft > parent.lft and node.lft < parent.rgt and parent.emp_id='&l t;parent_employee_id>';"
  • Retrieving a single path : "select parent.* from emp as node, emp as parent where node.lft > parent.lft and node.lft < parent.rgt and node. emp_id='<leaf_node_id>' order by parent.lft;"
  • Finding depth of nodes : "select node.emp_name, (COUNT(parent.emp_name)-1) as depth from emp as node, emp as parent where node.lft BETWEEN parent.lft and parent.rgt group by node.emp_name order by node.lft;"
  • Add a single node as child of node with no existing children : If you want to insert a node 'G' below 'B', then the new node 'G' would have left and right values '3' and '4' and all nodes after it (in sequence) would have to increase their left and right values by 2. The procedure
    for it would be
    lock table emp write;
    select @myleft := lft from emp where emp_name='<add_below_this_node>';
    update emp set rgt = rgt+2 where rgt > @myleft;
    update emp set lft = lft+2 where lft > @myleft;
    insert into emp values ('','G',&myleft+1,@myleft+2);
    unlock tables;
    
  • As a single node after a node(not as a new child) : modify the earlier procedure a little.
    lock table emp write;
    select @myright := rgt from emp where emp_name='<add_after_this_node>';
    update emp set rgt = rgt+2 where rgt > @myright;
    update emp set lft = lft+2 where lft > @myright;
    insert into emp values ('','G',&myright+1,@myright+2);
    unlock tables;
    
  • Delete a node and its children : Deletion is just opposite to a adding. After deletion, its width has to be deleted from all nodes to the right.
    lock table emp write;
    select @myleft := lft, @myright := rgt, @mywidth := rgt - lft + 1 from emp where emp_name='<node_to_be_deleted>';
    delete from emp where lft between @myleft and @myright;
    update emp set rgt = rgt - @mywidth where rgt > @myright;
    update emp set lft = lft - @mywidth where lft > @myright;
    unlock tables;
    
  • Delete only the parent node: In this case the child nodes will have to be moved up to the level of the parent.
    lock table emp write;
    select @myleft := lft, @myright := rgt, @mywidth := rgt - lft + 1 from emp where emp_name='<parent_to_be_deleted>';
    delete from emp where lft = @myleft;
    update emp set rgt = rgt - 1, lft = lft - 1 where lft between @myleft and @myright;
    update emp set rgt = rgt - 2 where rgt > @myright;
    update emp set lft = lft - 2 where lft > @myleft;
    unlock tables;
    
Hybrid model

A parent child with nested set model would server the purpose much better. (node, parent, left, right). But with hybrid models, inserts would be more expensive as compared to selects. So it is important to evaluate the pros and cons and figure out what would be best for the purpose tha t your application serves.

Nested intervals model.

This model is similar to the nested set model with the use of rational numbers (a/b) instead of numbers. So in our above tree the left and right values would change.
Emp idEmp NameLftRgt
1A0/1111/11
2B1/112/11
3C3/1110/11
4D4/115/11
5E6/117/11
6F8/119/11

This is a bit complex to implement, but plays well with all the queries.

Proprietary solutions

Oracle, DB2 and Microsoft SQL Server support heirarchial data. In opensource Postgresql 8.4 supports the addition and querying of heirarchial data basically by using a "CONNECT BY" clause. It would be better if their official documentation is referred to for info regarding the syntax for implementing heirarchial data in relational databases.