Saturday, June 15, 2013

Step by Step setting up solr cloud

Here is a quick step by step instruction for setting up solr cloud on few machines. We will be setting up solr cloud to match production environment - that is - there will be separate setup for zookeeper & solr will be sitting inside tomcat instead of the default jetty.

Apache zookeeper is a centralized service for maintaining configuration information. In case of solr cloud, the solr configuration information is maintained in zookeeper. Lets set up the zookeeper first.

We will be setting up solr cloud on 3 machines. Please make the following entries in /etc/host file on all machines.

172.22.67.101 solr1
172.22.67.102 solr2
172.22.67.103 solr3

Lets setup the zookeeper cloud on 2 machines

download and untar zookeeper in /opt/zookeeper directory on both servers solr1 & solr2. On both the servers do the following

root@solr1$ mkdir /opt/zookeeper/data
root@solr1$ cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
root@solr1$ vim /opt/zookeeper/zoo.cfg

Make the following changes in the zoo.cfg file

dataDir=/opt/zookeeper/data
server.1=solr1:2888:3888
server.2=solr2:2888:3888

Save the zoo.cfg file.

server.x=[hostname]:nnnnn[:nnnnn] : here x should match with the server id - which is there in the myid file in zookeeper data directory.

Assign different ids to the zookeeper servers

on solr1

root@solr1$ cat 1 > /opt/zookeeper/data/myid

on solr2

root@solr2$ cat 2 > /opt/zookeeper/data/myid

Start zookeeper on both the servers

root@solr1$ cd /opt/zookeeper
root@solr1$ ./bin/zkServer.sh start

Note : in future when you need to reset the cluster/shards information do the following

root@solr1$ ./bin/zkCli.sh -server solr1:2181
[zk: solr1:2181(CONNECTED) 0] rmr /clusterState.json

Now lets setup solr and start solr with the external zookeeper.

install solr on all 3 machines. I installed them in /opt/solr folder.
Start the first solr and upload solr configuration into the zookeeper cluster.

root@solr1$ cd /opt/solr/example
root@solr1$ java -DzkHost=solr1:2181,solr2:2181 -Dbootstrap_confdir=solr/collection1/conf/ -DnumShards=2 -jar start.jar

Here number of shards is specified as 2. So our cluster will have 2 shards and multiple replicas per shard.

Now start solr on remaining servers.

root@solr2$ cd /opt/solr/example
root@solr2$ java -DzkHost=solr1:2181,solr2:2181 -DnumShards=2 -jar start.jar

root@solr3$ cd /opt/solr/example
root@solr3$ java -DzkHost=solr1:2181,solr2:2181 -DnumShards=2 -jar start.jar

Note : It is important to put the "numShards" parameter, else numShards gets reset to 1.

Point your browser to http://172.22.67.101:8983/solr/
Click on "cloud"->"graph" on your left section and you can see that there are 2 nodes in shard 1 and 1 node in shard 2.

Lets feed some data and see how they are distributed across multiple shards.

root@solr3$ cd /opt/solr/example/exampledocs
root@solr3$ java -jar post.jar *.xml

Lets check the data as it was distributed between the two shards. Head back to to Solr admin cloud graph page at http://172.22.67.101:8983/solr/.

click on the first shard collection1, you may have 14 documents in this shard
click on the second shard collection1, you may have 18 documents in this shard
check the replicas for each shard, they should have the same counts
At this point, we can start issues some queries against the collection:

Get all documents in the collection:
http://172.22.67.101:8983/solr/collection1/select?q=*:*

Get all documents in the collection belonging to shard1:
http://172.22.67.101:8983/solr/collection1/select?q=*:*&shards=shard1

Get all documents in the collection belonging to shard2:
http://172.22.67.101:8983/solr/collection1/select?q=*:*&shards=shard2

Lets check what zookeeper has in its cluster.

root@solr3$ cd /opt/zookeeper/
root@solr3$ ./bin/zkCli.sh -server solr1:2181
[zk: 172.22.67.101:2181(CONNECTED) 1] get /clusterstate.json
{"collection1":{
    "shards":{
      "shard1":{
        "range":"80000000-ffffffff",
        "state":"active",
        "replicas":{
          "172.22.67.101:8983_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.101:8983_solr",
            "base_url":"http://172.22.67.101:8983/solr",
            "leader":"true"},
          "172.22.67.102:8983_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.102:8983_solr",
            "base_url":"http://172.22.67.102:8983/solr"}}},
      "shard2":{
        "range":"0-7fffffff",
        "state":"active",
        "replicas":{"172.22.67.103:8983_solr_collection1":{
            "shard":"shard2",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.103:8983_solr",
            "base_url":"http://172.22.67.103:8983/solr",
            "leader":"true"}}}},
    "router":"compositeId"}}

It can be seen that there are 2 shards. Shard 1 has only 1 replica and shard 2 has 2 replicas. As you keep on adding more nodes, the number of replicas per shard will keep on increasing.

Now lets configure solr to use tomcat and add zookeeper related and numShards configuration into tomcat for solr.

Install tomcat on all machines in /opt folder. And create the following files on all machines.

cat /opt/apache-tomcat-7.0.40/conf/Catalina/localhost/solr.xml
<?xml version="1.0" encoding="UTF-8"?>
<Context path="/solr/home"
     docBase="/opt/solr/example/webapps/solr.war"
     allowlinking="true"
     crosscontext="true"
     debug="0"
     antiResourceLocking="false"
     privileged="true">

     <Environment name="solr/home" override="true" type="java.lang.String" value="/opt/solr/example/solr" />
</Context>

on solr1 & solr2 create the solr.xml file for shard1

root@solr1$ cat /opt/solr/example/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="solr1:2181,solr2:2181"> 
  <cores defaultCoreName="collection1" adminPath="/admin/cores" zkClientTimeout="${zkClientTimeout:15000}" hostPort="8080" hostContext="solr">
    <core loadOnStartup="true" shard="shard1" instanceDir="collection1/" transient="false" name="collection1"/>
  </cores>
</solr>
 

on solr3 create the solr.xml file for shard2

root@solr3$ cat /opt/solr/example/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="solr1:2181,solr2:2181"> 
  <cores defaultCoreName="collection1" adminPath="/admin/cores" zkClientTimeout="${zkClientTimeout:15000}" hostPort="8080" hostContext="solr">
    <core loadOnStartup="true" shard="shard2" instanceDir="collection1/" transient="false" name="collection1"/>
  </cores>
</solr>
 
 

set the numShards variable as a part of solr starup environment variable on all machines.

root@solr1$ cat /opt/apache-tomcat-7.0.40/bin/setenv.sh
export JAVA_OPTS=' -Xms4096M -Xmx8192M -DnumShards=2 '

To be on the safer side, cleanup the clusterstate.json in zookeeper. Now start tomcat on all machines and check the catalina.out file for errors if any. Once all nodes are up, you should be able to point your browser to http://172.22.67.101:8080/solr -> cloud -> graph and see the 3 nodes which form the cloud.

Lets add a new node to shard 2. It will be added as a replica of current node on shard 2.

http://172.22.67.101:8983/solr/admin/cores?action=CREATE&name=collection1_shard2_replica2&collection=collection1&shard=shard2
And lets check the clusterstate.json file now

root@solr3$ cd /opt/zookeeper/
root@solr3$ ./bin/zkCli.sh -server solr1:2181
[zk: 172.22.67.101:2181(CONNECTED) 2] get /clusterstate.json
{"collection1":{
    "shards":{
      "shard1":{
        "range":"80000000-ffffffff",
        "state":"active",
        "replicas":{
          "172.22.67.101:8080_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.101:8080_solr",
            "base_url":"http://172.22.67.101:8080/solr",
            "leader":"true"},
          "172.22.67.102:8080_solr_collection1":{
            "shard":"shard1",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.102:8080_solr",
            "base_url":"http://172.22.67.102:8080/solr"}}},
      "shard2":{
        "range":"0-7fffffff",
        "state":"active",
        "replicas":{
          "172.22.67.103:8080_solr_collection1":{
            "shard":"shard2",
            "state":"active",
            "core":"collection1",
            "collection":"collection1",
            "node_name":"172.22.67.103:8080_solr",
            "base_url":"http://172.22.67.103:8080/solr",
            "leader":"true"},
          "172.22.67.103:8080_solr_collection1_shard2_replica2":{
            "shard":"shard2",
            "state":"active",
            "core":"collection1_shard2_replica2",
            "collection":"collection1",
            "node_name":"172.22.67.103:8080_solr",
            "base_url":"http://172.22.67.103:8080/solr"}}}},
    "router":"compositeId"}}

Similar to adding more nodes, you can unload and delete a node in solr.

http://172.22.67.101:8080/solr/admin/cores?action=UNLOAD&core=collection_shard2_replica2&deleteIndex=true
More details can be obtained from

http://wiki.apache.org/solr/SolrCloud

1 comment:

J. said...

Two Zookeeper servers is not enough if you want to have fail-over.
You should have 3.

Check the documentation for more information:
http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html#sc_CrossMachineRequirements