Handling mongodb connections

MongoDB is fast, and for medium to large setups it just works out of the box. But for setups which are bigger than large, you may run into a situation where the number of connections maxes out. So for extra-large setups, let us look at how to increase the number of connections in a MongoDB server.

MongoDB uses file descriptors to manage connections. In most unix-like operating systems, the default number of file descriptors available is set to 1024. This can be verified using the command ulimit -n, which should print 1024. ulimit is the per-user limit on various resources, and the open-files limit can be temporarily changed by issuing ulimit -n <new-limit>.
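
For example (the values here are illustrative; a non-root user can only raise the soft limit up to the hard limit, so the second step may require root):

$ ulimit -n
1024
$ ulimit -n 4096
$ ulimit -n
4096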

To change file descriptors system wide, change or add the following line to your /etc/sysctl.conf

fs.file-max = 64000
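
To apply the new setting without a reboot, reload the sysctl configuration:

$ sudo sysctl -p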

Note that fs.file-max is the system-wide cap on open file descriptors across all processes, so the server as a whole can now use 64000 file descriptors; the per-process default of 1024 still applies until you raise it. For a per-user configuration, you will have to tweak the hard and soft limits in /etc/security/limits.conf.
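
For example, a sketch of what the limits.conf entries might look like (the user name mongodb here is hypothetical; use whichever user runs mongod):

# /etc/security/limits.conf
mongodb    soft    nofile    64000
mongodb    hard    nofile    64000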

By default the MongoDB configuration file (usually mongodb.conf) does not specify the maximum number of connections; it depends directly on the number of available file descriptors. But you can control it using the maxConns option. Suppose you want to set the maximum number of connections to 8000; you will have to put the following line in your mongodb.conf file.

maxConns = 8000

Remember: MongoDB cannot use more than 20,000 connections on a server.
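
To check how close you are to the limit, you can ask the server itself from the mongo shell; it reports the current and still-available connection counts (the numbers below are illustrative):

db.serverStatus().connections
{ "current" : 42, "available" : 7958 }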

I recently came across a scenario where my MongoDB server, which was using 7000 connections, maxed out. I have a replica set configured, with a single master and multiple replicas, and all reads and writes were happening on the master. With replica sets, the catch is that if the master mongod is restarted, any one of the replica servers may become the master or primary. The TMC problem was caused by a missing index, which caused multiple update queries to queue up. To solve the problem, the missing index was applied first. Next, all queries which were locking the collection had to be killed.
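
For reference, applying an index in the mongo shell of that era looked like the line below (the collection and field names are hypothetical):

db.mycollection.ensureIndex({ somefield : 1 })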

Here is a quick snippet that we put together to first list all pending write operations and then kill them (run it from the mongo shell):

db.currentOp().inprog.forEach(
   function(d){
     if(d.waitingForLock && d.lockType != "read"){
        printjson(d.opid);
        db.killOp(d.opid);
     }
   })

Restore grub 2 after windows installation

Here is the step-by-step guide to recovering Grub 2 (with Ubuntu 9.10) after a Windows install. The steps are different from those for recovering Grub 1.

You will need a live CD if you are going to recover an Ubuntu box. Boot the system with the live CD (I assume you are using an Ubuntu live CD), press Alt+F2 and enter the gnome-terminal command. Then continue by typing:

$ sudo fdisk -l

This will show your partition table:

Device Boot      Start       End      Blocks  Id  System
/dev/sda1   *        1     12748  102398278+   7  HPFS/NTFS
/dev/sda2        12749     60800   385977690   f  W95 Ext'd (LBA)
/dev/sda5        12749     13003     2048256  82  Linux swap / Solaris
/dev/sda6        13004     16827   30716248+  83  Linux
/dev/sda7        16828     25272    67834431  83  Linux
/dev/sda8        25273     32125   55046302+   7  HPFS/NTFS
/dev/sda9        32126     60800   230331906   7  HPFS/NTFS

Now we need to mount the Linux partitions (sda6 and sda7 here). sda6 is the / partition and sda7 is the /home partition, so mount them accordingly. If you have any other partitions (especially a separate /boot partition), don't forget to mount them too.

$ sudo mount /dev/sda6 /mnt
$ sudo mount /dev/sda7 /mnt/home
$ sudo mount --bind /dev /mnt/dev
$ sudo mount --bind /proc /mnt/proc
$ sudo mount --bind /sys /mnt/sys

Now chroot into the environment we made:

$ sudo chroot /mnt

You may want to edit the /etc/default/grub file to fit your system (timeout options etc.)

$ vi /etc/default/grub

Once you are through, install grub2 using

$ grub-install /dev/sda

If you get errors with that command, use:

$ grub-install --recheck /dev/sda

Now you can exit the chroot, unmount the filesystems and reboot your box:

$ exit
$ sudo umount /mnt/home
$ sudo umount /mnt/sys
$ sudo umount /mnt/dev
$ sudo umount /mnt/proc
$ sudo umount /mnt
$ sudo reboot

Forking Vs. Threading

What is Fork/Forking?

Fork is nothing but a new process that looks exactly like the old, or parent, process, but it is still a different process, with a different process ID and its own memory. The parent process creates a separate address space for the child. Both parent and child processes possess the same code segment, but execute independently of each other.

The simplest example of forking is running a command in the shell on unix/linux. Each time a user issues a command, the shell forks a child process to get the task done.

When a fork system call is issued, a copy of all the pages corresponding to the parent process is created and loaded into a separate memory location by the OS for the child process. But in certain cases this is not needed: with the 'exec' family of system calls there is no need to copy the parent process's pages, since execv replaces the address space of the calling process itself.
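
To make this concrete, here is a minimal fork-and-exec sketch in C (my illustration, not from the original article):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();               /* duplicate the current process */
    if (pid < 0) {                    /* fork failed */
        perror("fork");
        exit(1);
    }
    if (pid == 0) {
        /* child: replace our copied image with a new program */
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        perror("execl");              /* reached only if exec fails */
        _exit(127);
    }
    /* parent: wait for the child and report its exit status */
    int status;
    waitpid(pid, &status, 0);
    printf("child %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
    return 0;
}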

Few things to note about forking are:

  • The child process has its own unique process ID.
  • The child process has its own copy of the parent's file descriptors.
  • File locks set by the parent process are not inherited by the child process.
  • Any semaphores that are open in the parent process are also open in the child process.
  • The child process has its own copy of the parent's message queue descriptors.
  • The child has its own address space and memory.

Fork is more widely accepted than threads for the following reasons:

  • Development is much easier on fork-based implementations.
  • Fork-based code is more maintainable.
  • Forking is much safer and more secure because each forked process runs in its own virtual address space. If one process crashes or has a buffer overrun, it does not affect any other process at all.
  • Threaded code is much harder to debug than forked code.
  • Forks are more portable than threads.
  • Forking is faster than threading on a single CPU as there are no locking overheads or thread context switches.

Some of the applications in which forking is used are: telnetd (FreeBSD), vsftpd, proftpd, Apache 1.3, Apache2, thttpd, PostgreSQL.

Pitfalls in Fork:

  • In fork, every new process has its own memory/address space, hence longer startup and shutdown times.
  • If you fork, you have two independent processes which need to talk to each other in some way. This inter-process communication is really costly (see the pipe sketch after this list).
  • When the parent exits before the forked child, you get a ghost (zombie) process. That is all much easier with a thread: you can end, suspend and resume threads from the parent easily, and if the parent exits suddenly its threads are ended automatically.
  • Insufficient storage space can cause the fork system call to fail.
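
As an example of that IPC overhead (a minimal sketch of mine, not from the original article), even passing a single message from child to parent requires setting up an explicit channel such as a pipe:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }
    pid_t pid = fork();
    if (pid == 0) {                       /* child: write one message */
        close(fds[0]);
        const char *msg = "hello from child";
        write(fds[1], msg, strlen(msg) + 1);
        close(fds[1]);
        _exit(0);
    }
    close(fds[1]);                        /* parent: read the message */
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof(buf));
    if (n > 0)
        printf("parent received: %s\n", buf);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}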

What are Threads/Threading?

Threads are Light Weight Processes (LWPs). Traditionally, a thread is just a CPU context (registers and some other minimal state), with the process containing the rest (data, stack, I/O, signals). Threads require less overhead than "forking" or spawning a new process because the system does not initialize a new virtual memory space and environment for them. While threads are most effective on a multiprocessor system, where the flow of work can be scheduled to run on other processors and thus gain speed through parallel or distributed processing, gains are also found on uniprocessor systems that exploit latency in I/O and other system functions which may halt process execution.
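
To make this concrete, here is a minimal pthreads sketch in C (my illustration, not from the original article) where several threads share a counter protected by a mutex:

#include <pthread.h>
#include <stdio.h>

/* data shared by all threads in the process; this sharing is both
   the main benefit and the main danger of threads */
static int counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    long id = (long)arg;
    pthread_mutex_lock(&lock);        /* protect the shared counter */
    counter++;
    printf("thread %ld sees counter = %d\n", id, counter);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)                        /* build with: cc -pthread file.c */
{
    pthread_t tids[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);
    for (int i = 0; i < 4; i++)
        pthread_join(tids[i], NULL);  /* wait for all threads to finish */
    printf("final counter = %d\n", counter);
    return 0;
}

Without the mutex, the increments could interleave unpredictably, which is exactly the race condition discussed under the pitfalls below.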

Threads in the same process share:
== Process instructions
== Most data
== open files (descriptors)
== signals and signal handlers
== current working directory
== User and group id

Each thread has a unique:
== Thread ID
== set of registers, stack pointer
== stack for local variables, return addresses
== signal mask
== priority
== Return value: errno

Few things to note about threading are:

  • Threads are most effective on multi-processor or multi-core systems.
  • For threads, only one process/thread table and one scheduler are needed.
  • All threads within a process share the same address space.
  • A thread does not maintain a list of created threads, nor does it know the thread that created it.
  • Threads reduce overhead by sharing fundamental parts of the process.
  • Threads are more efficient in memory management because they use the parent's existing memory instead of creating a new address space.

Pitfalls in threads:

  • Race conditions: The big loss with threads is that there is no natural protection against multiple threads working on the same data at the same time without knowing that others are messing with it. This is called a race condition. While the code may appear on the screen in the order you wish it to execute, threads are scheduled by the operating system and may run in any order. It cannot be assumed that threads execute in the order they are created, and they may run at different speeds. When threads race to complete, they may give unexpected results. Mutexes and joins must be utilized to achieve a predictable execution order and outcome.
  • Thread safe code: Threaded routines must call functions which are "thread safe". This means that there are no static or global variables which other threads may clobber or read assuming single-threaded operation. If static or global variables are used, then mutexes must be applied or the functions must be rewritten to avoid the use of those variables. In C, local variables are allocated on the stack, so any function that does not use static data or other shared resources is thread-safe. Thread-unsafe functions may be used by only one thread at a time in a program, and the uniqueness of the thread must be ensured. Many non-reentrant functions return a pointer to static data. This can be avoided by returning dynamically allocated data or using caller-provided storage. An example of a non-thread-safe (and non-reentrant) function is strtok; the thread-safe version is its reentrant counterpart strtok_r, sketched after this list.
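
As an illustration (mine, not from the original article), strtok_r keeps its parsing state in a caller-supplied pointer instead of strtok's hidden static variable:

#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[] = "a,b,c";
    char *saveptr;                    /* caller-owned parsing state */
    for (char *tok = strtok_r(buf, ",", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, ",", &saveptr))
        printf("token: %s\n", tok);
    return 0;
}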

Advantages in threads:

  • Threads share the same memory space, so sharing data between them is very fast; in other words, inter-thread communication is very cheap compared to IPC.
  • If properly designed and implemented, threads give you more speed because there is no process-level context switching in a multi-threaded application.
  • Threads are really fast to start and terminate.

Some of the applications in which threading is used are: MySQL (including 3.23), Firebird, Apache2.

FAQs:

1. Which should I use in my application?

Ans: That depends on a lot of factors. Forking is more heavyweight than threading and has higher startup and shutdown costs. Inter-process communication (IPC) is also harder and slower than inter-thread communication; threads really win the race when it comes to communication, since they share the same address space with their process and need only a reduced context switch, which makes switching between them more efficient. Conversely, if a thread crashes, it takes down all of the other threads in the process, and if a thread has a buffer overrun, it opens up a security hole in all of the threads.

2. Which one is better, threading or forking?

Ans: That totally depends on what you are looking for. Still, to answer: on a contemporary Linux (2.6.x) there is not much difference in performance between a context switch of a forked process and that of a thread (only the MMU work is additional for the process switch). There is the issue of the shared address space, though, which means that a faulty pointer in one thread can corrupt the memory of other threads within the same address space.

3. What kinds of things should be threaded or multitasked?

Ans: If you are a programmer and would like to take advantage of multithreading, the natural question is which parts of the program should and should not be threaded. Here are a few rules of thumb (if you say "yes" to these, have fun!):

  • Are there groups of lengthy operations that don't necessarily depend on other processing (like painting a window, printing a document, responding to a mouse click, calculating a spreadsheet column, signal handling, etc.)?
  • Will there be few locks on data (i.e., the amount of shared data is identifiable and "small")?
  • Are you prepared to worry about locking (mutually excluding data regions from other threads), deadlocks (a condition where two COEs, contexts of execution, have each locked data that the other is trying to get) and race conditions (a nasty, intractable problem where data is not locked properly and gets corrupted through threaded reads and writes)?
  • Could the task be broken into various "responsibilities"? E.g., could one thread handle the signals and another handle the GUI, etc.?

Conclusions:

1. Whether to use threading or forking depends entirely on the requirements of your application.
2. Threads are more powerful than events, but power is not something which is always needed.
3. Threads are much harder to program with than fork, so they are best left to experienced programmers.
4. Use threads mostly for performance-critical applications.

source : http://www.geekride.com/index.php/2010/01/fork-forking-vs-thread-threading-linux-kernel/

Using gearman to distribute your work…

Gearman is a system to farm out work to other machines, dispatching function calls to machines that are better suited to do work, to do work in parallel, to load balance lots of function calls, or to call functions between languages.

How does gearman work? Well, a gearman-powered app consists of a client, a worker and a job server. The client creates a job and sends it to the job server. The job server finds a suitable worker and sends the job to the worker. Once the job is done, the worker sends the response back to the client via the job server. There are client and worker APIs available for gearman in different languages, which allow the app to communicate with the job server.

Sounds too heavy, does it… Let's check how this thing actually runs.

Download gearman daemon from http://gearman.org/index.php?id=download

You would need the "Gearman server and library" (the one written in C) and the PHP extension, which we will use to create our workers and communicate with them.

I am using gearmand version 0.11, the gearman extension for PHP version 0.60 and PHP version 5.3 here.

Extract the gearmand server and install it using
./configure
make
sudo make install

Extract the PHP extension for gearman and install it using
phpize
./configure
make
sudo make install

Enable the extension by adding the following line to php.ini

extension=gearman.so

To check if the extension has been enabled run

php -i | grep -i gearman

And you will see something like

gearman
gearman support => enabled
libgearman version => 0.11

Now let's write some scripts and check how this works.

Create a PHP client:

<?php
# Create our client object.
$client = new GearmanClient();

# Add the default server (localhost).
$client->addServer();

echo "Sending job\n";

# Send a reverse job.
$result = $client->do("reverse", "Hello World");
if ($result)
    echo "Success: $result\n";
?>

Create a PHP worker:


<?php
# Create our worker object.
$worker = new GearmanWorker();

# Add the default server (localhost).
$worker->addServer();

# Register function "reverse" with the server.
$worker->addFunction("reverse", "reverse_fn");

while (1)
{
    print "Waiting for job...\n";

    $ret = $worker->work();
    if ($worker->returnCode() != GEARMAN_SUCCESS)
        break;
}

# A very simple reverse function.
function reverse_fn($job)
{
    $workload = $job->workload();
    echo "Received job: " . $job->handle() . "\n";
    echo "Workload: $workload\n";
    $result = strrev($workload);
    echo "Result: $result\n";
    return $result;
}
?>

To test the process

start the gearmand server
gearmand

start the worker
php -q gearmanWorker.php

And send jobs to the worker
php -q gearmanClient.php

When you run the client, the worker prints the job handle and the workload ("Hello World") and returns the reversed string, and the client prints "Success: dlroW olleH".

GPG Error: … : NO_PUBKEY D739676F7613768D

You run an apt-get update and it gives some errors which are difficult to comprehend. The output looks something like this:

Fetched 924B in 2s (352B/s)
W: GPG error: http://ppa.launchpad.net karmic Release: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY D739676F7613768D
W: GPG error: http://ppa.launchpad.net karmic Release: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY 2836CB0A8AC93F7A
W: GPG error: http://ppa.launchpad.net karmic Release: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY 2836CB0A8AC93F7A

What to do… How do we remove these errors? Well, these errors happen because some public keys are unavailable. OK, but how do I fix it?

The easiest way is to use the script below. The script does not enable PPAs that are disabled, but for all enabled PPAs it fetches their keys and installs them.

jayant@gamegeek:~/bin$ cat launchpad-update 
#! /bin/sh

# Simple script to check for all PPAs referenced in your apt sources and
# to grab any signing keys you are missing from keyserver.ubuntu.com.
# Additionally copes with users on launchpad with multiple PPAs
# (e.g., ~asac)
#
# Author: Dominic Evans https://launchpad.net/~oldman
# License: LGPL v2

for APT in `find /etc/apt/ -name "*.list"`; do
    grep -Eo "^deb http://ppa.launchpad.net/[a-z0-9-]+/[a-z0-9-]+" $APT | while read ENTRY ; do
        # work out the referenced user and their ppa
        USER=`echo $ENTRY | cut -d/ -f4`
        PPA=`echo $ENTRY | cut -d/ -f5`
        # some legacy PPAs say 'ubuntu' when they really mean 'ppa', fix that up
        if [ "ubuntu" = "$PPA" ]
        then
            PPA=ppa
        fi
        # scrape the ppa page to get the keyid
        KEYID=`wget -q --no-check-certificate https://launchpad.net/~$USER/+archive/$PPA -O- | grep -Eo "1024R/[A-Z0-9]+" | cut -d/ -f2`
        sudo apt-key adv --list-keys $KEYID >/dev/null 2>&1
        if [ $? != 0 ]
        then
            echo Grabbing key $KEYID for archive $PPA by ~$USER
            sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com $KEYID
        else
            echo Already have key $KEYID for archive $PPA by ~$USER
        fi
    done
done

echo DONE

Make the script executable

$ chmod a+x launchpad-update

And run the script

jayant@gamegeek:~/bin$ sudo ./launchpad-update 
Grabbing key 7613768D for archive vlc by ~c-korn
Executing: gpg --ignore-time-conflict --no-options --no-default-keyring --secret-keyring /etc/apt/secring.gpg --trustdb-name /etc/apt/trustdb.gpg --keyring /etc/apt/trusted.gpg --recv-keys --keyserver keyserver.ubuntu.com 7613768D
gpg: requesting key 7613768D from hkp server keyserver.ubuntu.com
gpg: key 7613768D: public key "Launchpad PPA named vlc for Christoph Korn" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
Already have key 4E5E17B5 for archive ppa by ~chromium-daily
Grabbing key 8AC93F7A for archive backports by ~kubuntu-ppa
Executing: gpg --ignore-time-conflict --no-options --no-default-keyring --secret-keyring /etc/apt/secring.gpg --trustdb-name /etc/apt/trustdb.gpg --keyring /etc/apt/trusted.gpg --recv-keys --keyserver keyserver.ubuntu.com 8AC93F7A
gpg: requesting key 8AC93F7A from hkp server keyserver.ubuntu.com
gpg: key 8AC93F7A: public key "Launchpad Kubuntu Updates" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
Already have key 8AC93F7A for archive staging by ~kubuntu-ppa
DONE

That's it… Done…
Now if you run sudo apt-get update, you should not get any errors about PUBKEYs…

ext4 filesystem

ext4 is the next "version" of the filesystem after ext3. It was released with Linux kernel 2.6.28, which ships with Ubuntu 9.04 (Jaunty).

Benefits of ext4 over ext3

  • Bigger filesystem and file sizes : ext3 supports a 16TB filesystem size and a max file size of 2TB, while ext4 supports a 1EB (10^18 bytes = 1024*1024 TB) filesystem size and a max file size of 16TB. Though you would never come across such huge storage systems on desktop computers.
  • Subdirectory limitations : ext3 allows "only" 32000 subdirectories/files in a directory. ext4 allows an unlimited number of subdirectories.
  • Multiblock allocation : ext3 allocates 1 block (4KB) at a time, so writing a 100MB file means calling the ext3 block allocator 25600 times. This also prevents the block allocator from optimizing the allocation policy, because it does not know the total amount of data being allocated. Ext4, on the other hand, uses a multiblock allocator to allocate multiple blocks in a single call. This improves performance to a great extent.
  • Extents : Ext3 uses an indirect block mapping scheme to keep track of each individual block of a file's data. Ext4 uses extents (ranges of contiguous physical blocks), which is to say the data lies in the next n blocks. This improves performance and reduces file fragmentation.
  • Delayed allocation : Ext3 allocates blocks as soon as possible. Ext4 delays the allocation of physical blocks as much as possible; until then the blocks are kept in cache. This gives the block allocator the opportunity to optimize the allocation of blocks.
  • Fast fsck : The total fsck time improves by a factor of 2 to 20.
  • Journal checksumming : Ext4 checksums the journal data to know if the journal blocks are corrupted. Journal checksumming allows one to convert the two-phase commit system of Ext3's journaling into a single phase, speeding filesystem operations up to 20% in some cases, so reliability and performance are improved at the same time.
  • "No journaling" mode : In Ext4 the journaling feature can be disabled, depending on your requirements.
  • Online defragmentation : Ext4 supports online defragmentation. There is an e4defrag tool which can defragment individual files or even complete filesystems.
  • Inode-related features : larger inodes, nanosecond timestamps, fast extended attributes, inode reservation.
  • Persistent preallocation : Applications can tell the filesystem to preallocate space; the filesystem preallocates the necessary blocks and data structures, but there is no data in them until the application really needs to write the data in the future (see the sketch after this list).
  • Barriers on by default : This option improves the integrity of the filesystem at the cost of some performance. A barrier forbids the writing of any blocks after the barrier until all blocks written before the barrier are committed to the media. By using barriers, filesystems can make sure that their on-disk structures remain consistent at all times.
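
To illustrate persistent preallocation (my sketch, not from the original article; the file name and size are arbitrary example values), an application can reserve space up front with posix_fallocate, which ext4 satisfies with a fast extent-based preallocation instead of writing zeroes:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* reserve 100MB up front; the blocks are allocated on disk
       but contain no data until the application writes to them */
    int err = posix_fallocate(fd, 0, 100L * 1024 * 1024);
    if (err != 0)
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
    close(fd);
    return 0;
}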

Now let's see the steps needed to convert your desktop filesystem from ext3 to ext4.

  • check the version of linux kernel. It should be > 2.6.28-11.
    jayant@gamegeek:~$ uname -a
    Linux gamegeek 2.6.28-15-generic #49-Ubuntu SMP Tue Aug 18 19:25:34 UTC 2009 x86_64 GNU/Linux
  • Just in case – take a backup of your important data
  • boot from live cd and run the following commands for converting the partition /dev/sda6 to ext4 from ext3.
    $ sudo bash
    $ tune2fs -O extents,uninit_bg,dir_index /dev/sda6
    $ e2fsck -pf /dev/sda6
  • Mount the partition and change its type entry in /etc/fstab
    $ mount -t ext4 /dev/sda6 /mnt
    $ vim /mnt/etc/fstab
    Change
    # /dev/sda6
    UUID=XXXXXX / ext3 relatime,errors=remount-ro 0 1
    To
    # /dev/sda6
    UUID=XXXXXX / ext4 relatime,errors=remount-ro 0 1

    Save the changes

  • Reinstall grub (this is optional). If you skip this and get a fatal "error 13" while booting the machine, just boot using the live CD and run these commands.
    $ sudo bash
    $ mkdir /mnt/boot
    $ mount /dev/sda6 /mnt/boot
    $ grub-install /dev/sda --root-directory=/mnt --recheck

After the reboot you would be using the ext4 filesystem.

Important note : Your old files have not been converted to the ext4 disk format; only new files written to disk will use ext4 extents. But since ext3 and ext4 are compatible, you won't face any problems accessing older ext3-style files on disk. As files are rewritten over time, the ext3-style files will gradually disappear and be stored in the ext4 format.

mount linux drive on windows

Earlier there used to be a piece of software known as explore2fs which allowed users to scan all filesystems in read-only mode and copy files from the linux drive to windows. I had used it a long time ago…

Nowadays things have changed quite a lot. When I googled for "mount ext3 file system on windows", I got tons of links and tons of tools. For some time I was confused about how to go about all of it.

But then, after going through all the tools, I came across something known as ext2fsd. I downloaded the 0.46 version (only 973KB) and installed it.

Once you start the ext2fsd volume manager, you can see all the filesystems on your disk. All you have to do is:

1. right click on the drive you want to mount and assign it a drive letter.
2. go to the ext2 management and choose the drive letter as mount point for fixed boot
3. ask the program to automatically mount the drive on boot
4. enable ext2fsd to auto-start during boot.

A simple reboot and I could see my linux partition as a drive in windows explorer. No need to copy files just to read them (like I used to do when I used explore2fs). It runs like a charm…

I still have to look at how to go about mounting an ext4 filesystem on windows. ext4 is not that common yet, but it should become common soon…

memcached replication

Wow… finally a solution that provides replication in memcached: repcached.

You can have a look at it at repcached.lab.klab.org.

They provide two types of packages:

1. a patched memcached source, which can be compiled directly.
2. a patch which can be applied to the memcached source and then compiled.

So, I downloaded the memcached-(version)-repcached-(version).tar.gz source and simply compiled it.

./configure --enable-replication
make
sudo make install

Note : When you enable replication, you cannot use --enable-threads.

I started two instances of memcached on ports 11211 & 11222

jayant@gamegeek:~/php$ memcached -p 11211 -m 64 -x 127.0.0.1 -v
replication: connect (peer=127.0.0.1:11212)
replication: marugoto copying
replication: close
replication: listen

jayant@gamegeek:~/php$ memcached -p 11222 -m 64 -x 127.0.0.1 -v
replication: connect (peer=127.0.0.1:11212)
replication: marugoto copying
replication: start

Now set and get a value on the instance on port 11211

jayant@gamegeek:~$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
set hello 0 0 5
world
STORED
get hello
VALUE hello 0 5
world
END

Connect to port 11222 and try getting this value

jayant@gamegeek:~$ telnet localhost 11222
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
get hello
VALUE hello 0 5
world
END

Try the reverse as well

On 11222
<—snip–>
set key 0 0 5
myval
STORED
get key
VALUE key 0 5
myval
END
<—snip–>

On 11211
<—snip–>
get key
VALUE key 0 5
myval
END
<—snip–>

Suppose the master goes down (in this case let's assume that the memcached on port 11211 goes down), so we redirect all traffic to port 11222. When the memcached on port 11211 comes back up later, the data should be automatically replicated to the new instance. Let's kill the memcached on port 11211 and restart it.

On port 11211

Killed
jayant@gamegeek:~/php$ memcached -p 11211 -m 64 -x 127.0.0.1 -v
replication: connect (peer=127.0.0.1:11212)
replication: marugoto copying
replication: start

On port 11222

<—snip–>
replication: close
replication: listen
replication: accept
replication: marugoto start
replication: marugoto 2
replication: marugoto owari
<—snip–>

Lets see if the data has been replicated on port 11211

jayant@gamegeek:~$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
get hello
VALUE hello 0 5
world
END
get key
VALUE key 0 5
myval
END

Bingo…
Please share your experience if you have tried it in a live scenario with a large number of sets and gets.

VIM improvements

If you have vim 7.x, you should know that vim supports autocompletion for certain languages like C, PHP, Python, HTML, CSS, XML and JavaScript.

To enable autocompletion in vim, you need to create a file in your home directory

$ vim ~/.vimrc

Copy the following code into this file

autocmd FileType python set omnifunc=pythoncomplete#Complete
autocmd FileType javascript set omnifunc=javascriptcomplete#CompleteJS
autocmd FileType html set omnifunc=htmlcomplete#CompleteTags
autocmd FileType css set omnifunc=csscomplete#CompleteCSS
autocmd FileType xml set omnifunc=xmlcomplete#CompleteTags
autocmd FileType php set omnifunc=phpcomplete#CompletePHP
autocmd FileType c set omnifunc=ccomplete#Complete

And save the file.

Now open your file in vim and check out autocompletion

$ vim mycode.php

For autocompletion, press Ctrl-X Ctrl-O in insert mode. You should get a dropdown of available functions/variables and, at the top of the screen, a definition of the selected function.

Another thing vim does well is tabs. Yup, check out the following commands:

:tabe oldfile.php – open oldfile.php in new tab for editing
:tabnew – open a new blank tab
:tabn – go to next tab
:tabp – go to previous tab
:tabr – go to first tab
:tabc – close current tab
:tabo – close other tabs

Coding is fun!!!