This is a review of the programs I used during my independent study on homology modeling at the Georgia Institute of Technology, under Dr. Jung Choi, Associate Professor and Faculty Coordinator of the Bioinformatics Master's Degree Program. I would like to thank Dr. Jung Choi for giving me this opportunity and showing great confidence in me. The aim of this study was to help map the distance between membrane-associated progesterone receptors (MAPRs) and cytochrome b5 (Cb5). This webpage covers Molecular Modeling, Homology Modeling and the tools available for Homology Modeling. Most of the software accepts the input sequence in FASTA format, and the output is usually a PDB (Protein Data Bank) file. The output file can be viewed in any molecular visualization software for proteins, DNA and other macromolecules. A number of programs are available for this, including RasMol, Swiss-PdbViewer, ModView, PREPI, Surfnet, Raster3D, WHAT_CHECK, PROCHECK, WebMol, VMD and WinMGM. A number of tools for Homology Modeling are listed below, but not all of them have a detailed explanation, as some did not return a result for my sequences. Some of the programs listed below are freeware.
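Since most of the tools reviewed here return a PDB file, it may help to see what such a file holds. The following is a minimal Python sketch, not taken from any of the reviewed programs, that reads the fixed-column ATOM records and measures the distance between two C-alpha atoms. The coordinates are made up for illustration, the helper names (`make_atom_line`, `parse_ca_atoms`, `ca_distance`) are hypothetical, and the column positions follow the PDB format as I understand it.

```python
import math


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build one ATOM record in the fixed-column PDB layout (coordinates only)."""
    return (
        f"ATOM  {serial:>5}  {name:<3} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


# Made-up records for illustration; a real file would come from a modeling server.
SAMPLE_PDB = "\n".join([
    make_atom_line(1, "N", "ALA", "A", 1, 1.458, 0.0, 0.0),
    make_atom_line(2, "CA", "ALA", "A", 1, 0.0, 0.0, 0.0),
    make_atom_line(3, "CA", "GLY", "A", 2, 3.0, 4.0, 0.0),
])


def parse_ca_atoms(pdb_text):
    """Return {(chain, residue number): (x, y, z)} for all C-alpha atoms."""
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":   # atom name, columns 13-16
            continue
        chain = line[21]                  # chain identifier, column 22
        res_seq = int(line[22:26])        # residue number, columns 23-26
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        coords[(chain, res_seq)] = xyz
    return coords


def ca_distance(coords, key_a, key_b):
    """Euclidean distance in angstroms between two parsed C-alpha atoms."""
    return math.dist(coords[key_a], coords[key_b])


if __name__ == "__main__":
    atoms = parse_ca_atoms(SAMPLE_PDB)
    print(ca_distance(atoms, ("A", 1), ("A", 2)))
```

A dedicated library would handle alternate locations, insertion codes and multiple models, which this sketch ignores.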
Online Tutorials:
There are a number of websites that discuss Homology Modeling in depth. The CMBI's online Homology Modeling course is probably one of the best online courses I have read. There is also a course lecture website, Homology Modeling: Protocols and Concepts, that covers the various servers and software used for Homology Modeling. The Abagyan Lab Server also covers Molecular and Homology Modeling. The link here will take you to a good website that I used to understand Homology Modeling and the steps involved.
What is Molecular Modeling?
Molecular Modeling is a very diverse subject. It ranges from the acquisition and subsequent display of molecular coordinates to highly accurate numerical simulation using theoretically derived functions. Molecular Modeling is an essential method used to help predict the main structural features of a protein. It may be referred to as "molecular graphics", "molecular visualization", "computational chemistry", "computational quantum chemistry" or "theoretical chemistry", depending on where it is being used. A related area known as "molecular simulation" applies molecular modeling techniques to describing and understanding the statistical behavior and properties of collections of molecules on a "macroscopic" scale. "Molecular dynamics" deals with the time-dependent properties of collections of molecules, and uses many of the techniques of molecular modeling and statistical mechanics. Molecular Modeling helps us better understand Homology Modeling. It also aids us in a number of other areas, such as secondary structure prediction, threading, tertiary structure prediction, protein simulation, docking (protein-protein, protein-ligand) and drug design. There are a number of tools for Molecular Modeling.
Homology Modeling involves taking a sequence of unknown structure and mapping it against the known structure of one or several similar (homologous) proteins. Two proteins of similar origin and function are expected to have reasonable structural similarity, so the known structure can be used as a template for modeling the unknown structure. All Homology Modeling approaches consist of three steps:
1) Finding homologous PDB files.
2) Creating the alignment, using single or multiple alignments: analysis of alignments; gap deletions and additions; secondary structure weighting.
3) Structure calculation and model refinement.
Given a correct alignment to a related template, several methods can produce an accurate model, while without a correct alignment no method can produce a good model. The crucial step in the Homology Modeling procedure comes after the alignment, since aligned fragments can deviate strongly from the template at various distances from an alignment gap, or even in the ungapped parts of the alignment. Homology is not a measure of similarity, but rather an absolute statement that sequences have a divergent rather than a convergent relationship. Among homologous sequences we can distinguish orthologs (proteins having the same function in different species) and paralogs (proteins performing different but related functions within one organism). Building a model of a target structure by comparison with data extracted from homologous sequences of known structure (parents or templates) is called Comparative Modeling.
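Because everything downstream depends on the alignment, a quick way to judge a target-template pair is its percent identity. The sketch below is a minimal Python illustration under one stated assumption: identity is counted only over columns where neither sequence has a gap. Other conventions divide by the alignment length or by the length of the shorter sequence, so the number is only comparable within one convention. The two aligned strings are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    target = "MKT-AYIA"
    template = "MRTLAY-A"
    print(round(percent_identity(target, template), 1))
```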
Membrane-associated progesterone receptors (MAPRs) & Cytochrome b5 (Cb5)
Figure: 1CYO.pdb viewed in RasMol (a 3D model viewer)
MAPRs (1) are thought to mediate a number of rapid cellular effects that do not involve changes in gene expression. They have no significant sequence similarity to any steroid-binding receptors. They were first identified in porcine liver membranes and sequenced from porcine vascular smooth muscle cells. They can be localized to both plasma and intracellular membranes. It is thought that MAPRs bind progesterone intracellularly. MAPRs contain a cytochrome b5-like ligand-binding domain and have been identified as distant homologs of cytochrome b5. The heme-binding cytochrome b5 domain may have served as a template for the evolution of membrane-associated binding pockets for non-heme ligands, such as the steroid-binding site in the MAPR proteins. Conserved regions in MAPRs (plants and metazoans) and some fungal chitin synthases have been found to have some similarity to the cytochrome b5 domain, which has a mixed β structure with two pairs of α-helices forming a binding pocket to one side of a β sheet. The potential presence of MAPRs in plants suggests that they may use steroids for rapid cell signaling.
Program | Website address | Program description |
Geno3d | http://pbil.ibcp.fr/ | Automatic modeling of protein three-dimensional structure |
Swiss Model | http://www.expasy.org/swissmod/SWISS-MODEL.html | An automated knowledge-based protein modeling server; first approach and optimise |
CPHmodels | http://www.cbs.dtu.dk/services/CPHmodels/ | Automated neural-network based protein modeling server |
Modeller | http://salilab.org/ | A program for automated protein Homology Modeling |
Amber | http://amber.scripps.edu/ | A package similar to CHARMm, developed by Kollman's group at UCSF |
Homology | http://www.accelrys.com/ | Automatic Homology Modeling module. The software suite also has Modeller, SeqFold modules, Quanta |
Wloop | http://psb00.snv.jussieu.fr/wloop/ | The Loop Homology Modeling Server |
What-If Server | http://www.cmbi.kun.nl/gv/servers/WIWWWI/ | G. Vriend's WHAT IF Homology Modeling Server |
SPORulate | http://cgat.ukm.my/spores/Predictory/sporulate/s_predict_metaserver.html | Sends jobs by 'SPORulation' (meta server) to selected servers available above, using the respective server's default values |
SDSC1 | http://cl.sdsc.edu/hm.html | SDSC Protein Structure Homology Modeling Server |
Geno3D performs comparative protein structure modeling by satisfaction of spatial restraints (distances and dihedrals). Geno3D is most frequently used for Homology or Comparative protein structure Modeling. Geno3D accepts input similar to FASTA format, but only the one-letter code may be used. The result is returned in PDB format and can be viewed in any Molecular Modeling software. Geno3D offers many other features: it allows the user to select PDB entries as templates for Molecular Modeling after a three-step iterative PSI-BLAST. It presents the output for each template along with the secondary structure prediction, and displays the percent agreement in secondary structure and the distribution of information from the template on the query sequence. The output link is sent to the user's email address. It also notifies the user when its server begins the Homology Modeling. The user can decide how many models to generate; the main idea behind generating more than one model is to give the user more flexibility and a better understanding. It also returns a PDB file in which the models are superimposed on each other. This is one of the good points of Geno3D, as it allows us to compare the various models in one window. All the results can be downloaded as an archive.tar.Z file that can be opened with WinZip on Windows, or on UNIX or Linux platforms, so the user does not have to save the results as a web page or in a document file. It also displays the Ramachandran plot in the result. For more details, please visit the Geno3D help file link, which also has diagrammatic representations.
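Since the server expects one-letter amino acid codes, it can save a failed submission to check the sequence locally first. The following minimal Python sketch reads FASTA text and reports any character outside the 20 standard one-letter codes. Whether Geno3D tolerates ambiguity codes such as X or B is not something this review establishes, so treating them as invalid is an assumption; the sequence shown is made up.

```python
ONE_LETTER_CODES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence):
    """Return the sorted characters that are not standard one-letter codes."""
    return sorted(set(sequence) - ONE_LETTER_CODES)


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    demo = ">demo made-up fragment\nMKTAYIAK\nQRQISXB\n"
    for name, seq in read_fasta(demo):
        print(name, len(seq), invalid_residues(seq))
```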
Swiss Model is a fully automated protein structure Homology Modeling server. It has a first approach mode that performs the Homology Modeling. The user has to enter an email address and input the protein sequence in FASTA format. It allows the user to choose the BLAST limit for template selection. The template can be fetched from the PDB database by name, or the user can upload their own PDB file. The output is a PDB file that is returned to the user's email address. The result can be forwarded by Swiss Model to PHD secondary structure prediction at Columbia University and to the Fold Recognition Server (3D-PSSM) of the ICRF. Swiss Model, however, does not accept sequences for Homology Modeling when the similarity is less than 25%.
What-If Server requires three files to be generated by the user to perform Homology Modeling:
1) A template PDB file: the structure we are comparing our sequence against.
2) A template PIR file: the file that we need to compare. It must be in the PIR file format. Clustal X can convert a sequence to this format, as it has an option to save the file as a PIR file.
3) An aligned sequence PIR file: the multiple alignment of the above two files. The multiple alignment (MA) can be done in Clustal X, and the result must be saved as a PIR file.
The output is a PDB file, which can be downloaded from the website as soon as the result is displayed.
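For readers who have not seen the PIR format, the layout is simple: a header line starting with `>P1;` followed by a code, a free-text description line, and then the sequence terminated by an asterisk. The Python sketch below writes one such entry. It is only an illustration with a made-up sequence and a hypothetical helper name (`to_pir`); a given server may expect specific content in the description line, which this sketch does not attempt to reproduce.

```python
def to_pir(code, description, sequence, width=60):
    """Format one protein sequence as a PIR entry (header, description, body)."""
    lines = [f">P1;{code}", description]
    body = sequence + "*"  # PIR sequences end with an asterisk
    lines.extend(body[i:i + width] for i in range(0, len(body), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up code, description and sequence, for illustration only.
    print(to_pir("demo", "made-up example sequence", "MKTAYIAKQRQISFVK"), end="")
```

Aligned entries use the same layout, with `-` characters marking the gaps in each sequence.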
MODELLER models the 3D structure of proteins. It is written in FORTRAN. Modeller is most frequently used for homology or comparative protein structure modeling. It derives spatial restraints from the templates and generates a number of 3D models of the submitted sequence that satisfy those restraints. MODELLER automatically calculates a full-atom model. In general, MODELLER models protein 3D structure by satisfaction of spatial restraints, and the restraints can be derived from a number of different sources. These include NMR experiments (NMR refinement), cross-linking experiments, fluorescence spectroscopy, rules of secondary structure packing (combinatorial modeling), image reconstruction in electron microscopy, homologous structures (comparative modeling), site-directed mutagenesis, residue-residue and atom-atom potentials of mean force, etc. Modeller is not a fully automated Homology Modeling tool; the user has to prepare its input files.
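To make the idea of "satisfying spatial restraints" concrete, here is a deliberately simplified Python toy. It takes C-alpha coordinates from a template, turns the pairwise distances into restraints, and scores a model by the sum of squared deviations from them. This is not MODELLER's algorithm: MODELLER expresses restraints as probability density functions and optimizes a full-atom model, whereas this toy only evaluates a score. All coordinates and function names are made up for illustration.

```python
import itertools
import math


def derive_distance_restraints(template_ca, min_separation=2):
    """Map residue-number pairs to their C-alpha distance in the template."""
    restraints = {}
    for i, j in itertools.combinations(sorted(template_ca), 2):
        if j - i >= min_separation:  # skip sequence neighbours
            restraints[(i, j)] = math.dist(template_ca[i], template_ca[j])
    return restraints


def restraint_violation(model_ca, restraints):
    """Sum of squared differences between model distances and restraints."""
    total = 0.0
    for (i, j), target in restraints.items():
        if i in model_ca and j in model_ca:
            total += (math.dist(model_ca[i], model_ca[j]) - target) ** 2
    return total


if __name__ == "__main__":
    # Made-up coordinates: three residues on a line, 3.8 angstroms apart.
    template = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
    model = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (6.6, 0.0, 0.0)}
    restraints = derive_distance_restraints(template)
    print(restraint_violation(model, restraints))
```

A model that reproduces the template distances exactly scores zero; the further the model drifts, the larger the score, which is the quantity an optimizer would try to reduce.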
It is a very exacting program. Any error in the format of the sequence alignment prevents Modeller from performing Homology Modeling. The program is also very particular about the extensions of the files used for Homology Modeling. The user must read the online manual provided on the Modeller website to get a good grip on Modeller. It is a very reliable program, and it allows the user to specify what is wanted in the end result. Modeller runs on platforms such as Windows XP, Linux, Sun Solaris and Macintosh. I tried running Modeller on Linux, but was not completely successful, and I was unable to run the program on Windows XP. For a user who is new to Linux, running the program is quite a challenge. The current version is Modeller 6. The Modeller website is very detailed. The older version of Modeller has a good tutorial that goes into greater depth.
For users who do not have a Linux machine, an ideal option is Knoppix/Bioknoppix. The CD can be customized to recognize an external drive that is formatted for Linux. The system then runs from the CD-ROM drive, which can be on a machine that has Windows, while the external drive is used just to store and retrieve the data one works on. Knoppix does not have to be installed on any local hard drive or machine. Knoppix allows the user to format the drive, but the user must log in as the superuser. This avoids the trouble of a dual-boot system, which can be very troublesome at times.
Another way is to use the online version of Modeller: ModWeb, a server for comparative modeling. It is very useful, as the user need not go through the installation of Modeller. It runs the modeling online and presents the results, which can either be sent to one's email or stored in the Modeller database. The user is sent a username and password to log in and view the results when needed.
Modeller accepts the PAP, PIR, QUANTA and INSIGHTII formats. The user must give the files the respective extensions: .pap, .ali and .aln (Quanta and InsightII). The .top file contains the TOP script with instructions for the Modeller job. The script tells Modeller where the files involved are, what they are named, and what output is expected. Modeller is run with the command mod scriptfilename. If this command fails, the most likely cause is that the alias of Modeller is not set to mod; this can be changed in the Modeller setup file. The output of MODELLER is a 3D structure of a protein. The optimization is carried out by the variable target function procedure, employing methods of conjugate gradients and molecular dynamics with simulated annealing.
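As an illustration of what such a TOP script contains, the Python sketch below assembles one as text. The TOP commands shown (INCLUDE, the SET lines and the call to the 'model' routine) follow the Modeller 6 tutorial as I recall it, so please check them against the manual for your version. The file name alignment.ali and the codes 1cyo and mapr are hypothetical placeholders, and `build_top_script` is a made-up helper, not part of Modeller.

```python
def build_top_script(alnfile, knowns, sequence, n_models=1):
    """Return the text of a TOP script that calls the 'model' routine."""
    lines = [
        "# Homology Modeling with the TOP routine 'model'",
        "INCLUDE",                              # load the predefined TOP routines
        f"SET ALNFILE = '{alnfile}'",           # alignment file (.ali, PIR format)
        f"SET KNOWNS = '{knowns}'",             # code of the template structure
        f"SET SEQUENCE = '{sequence}'",         # code of the target sequence
        "SET STARTING_MODEL = 1",               # index of the first model
        f"SET ENDING_MODEL = {n_models}",       # index of the last model
        "CALL ROUTINE = 'model'",               # run the modeling routine
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical names: the codes must match the entries in the alignment file.
    print(build_top_script("alignment.ali", "1cyo", "mapr", n_models=2), end="")
```

Saving the printed text with a .top extension gives a script that can be passed to the mod command; the codes given for KNOWNS and SEQUENCE have to match the entries in the alignment file.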
MODELLER can also do several other tasks, including multiple comparison of protein sequences and/or structures, clustering, and searching of sequence databases. Modeller is to date the best program I have used for Homology Modeling.
CPHmodels is a collection of databases and methods developed to predict protein structure. It performs prediction of protein structure using Comparative Modeling. It does not accept more than 900 amino acids in the input sequence. The sequences are kept confidential and are deleted after processing. This program did not give me appropriate results. The error it displayed was similar to the one displayed by Swiss Model.
Amber is a package of programs for Molecular Modeling. Amber is not freeware. The name refers to two things: a set of molecular mechanical force fields for the simulation of biomolecules, and a package of molecular simulation programs that includes source code and demos. The current version of the code is Amber 7. For more details, please refer to the help manual.
Homology uses structure and sequence similarities to predict unknown protein structures. It is part of the Insight II product family. Homology is not freeware. A detailed explanation of how it works and what it does can be found on the Accelrys website.
In SDSC1 the user has to input the sequence in FASTA format or free format (only characters representing the sequence and line breaks are allowed). I submitted my sequences to this program three times, but received no reply. It does, however, have this disclaimer on its website.
Disclaimer : This server is provided mainly for testing and benchmarking purposes and isn't tuned for high selectivity of predictions hence results should be used with caution.
SPORulate lets the user submit a sequence for Homology Modeling to a number of programs, such as SWISS-MODEL, CPHmodels and SDSC1. It also offers a number of programs for secondary structure prediction and fold recognition. The sequence I submitted here produced the same errors as when it was submitted to the individual programs.
The Loop Homology Modeling Server is a tool to predict protein loop backbone structures from their sequences and their flank backbone structures. There is a help file available online to understand how to use it. I have not been able to figure out how to use this program.
NOTE: I have also been trying to understand in detail how AMMP works. I will put up more details on it once I have gained a complete understanding of it. The AMMP website gives a good description of the tool. It is written in C and has been ported to many different computers. In short, it is a free program suite for molecular mechanics, dynamics and modeling, with some special features such as docking and ab initio DFT calculations. It runs under Linux/Unix and Win9x.
References
Some content on this website has been taken from various online tutorials available on the internet.
1) Mifsud D, Bateman A. Membrane-bound progesterone receptors contain a cytochrome b5-like ligand-binding domain. Genome Biol 2002, 3(12): 0068.1-0068.5
A few websites I found helpful while reading up on Homology Modeling are listed below. I will keep updating this list as I come across more.
[1] garlic.mefos.hr      [2] www.science-search.org     [3] vanderbilt
Website last updated in APRIL 2004