Welcome to MAP2.03D Server
Mutagenesis Assistant Program:

Mutagenesis Assistant Program (MAP) is a statistical tool for benchmarking random mutagenesis methods at the protein level. Random mutagenesis methods differ in their mutational spectra and mutational frequencies, and they are affected differently by the redundancy of the genetic code. A mutational spectrum is a data set that records the frequency of mutations in a target nucleotide sequence under defined conditions.

In general, the transition/transversion bias indicator is used to assess the mutational spectrum of a random mutagenesis method. This indicator provides information at the gene level only, which is insufficient for a protein engineer developing a directed evolution strategy: it is important to know which amino acid substitutions can be generated at the protein level, for example at functionally important positions. MAP provides two ways to analyze random mutagenesis methods comprehensively at the level of amino acid substitutions: 1) Sequence based analysis and 2) Structure based analysis.

Examples of MAP analysis
MAP analysis for three model proteins can be accessed by using the following links:

a) D-amino acid oxidase (EMBL-Bank AAB93974.1; PDB Id 1C0I)
    i) Sequence based analysis
    ii) Structure based analysis (for error-prone PCR: Taq(MnCl2; G=A=C=T))
    iii) Structure based analysis (for error-prone PCR: Taq(MnCl2; G=A,C=T))

b) Phytase (EMBL-Bank AY496073.1; PDB Id 1DKP)
    i) Sequence based analysis
    ii) Structure based analysis (for error-prone PCR: Taq(MnCl2; G=A,C=T))
    iii) Target based analysis (for charged residues)

c) N-acetylneuraminic acid aldolase (EMBL-Bank X03345.1; PDB Id 1NAL)
    i) Sequence based analysis
    ii) Structure based analysis (for error-prone PCR: Taq(MnCl2; G=A=C=T))
    iii) Target based analysis (for active site residues)

Sequence based analysis
It takes a nucleotide sequence as input and is performed at the level of amino acid substitutions. The result is reported as the following novel indicators, each based on the subset of amino acid substitutions generated at the protein level:

a) Protein structure indicator
It measures the fraction of substitutions that disrupt protein structure or function (stop codons) and the fraction that are likely destabilizing (Gly and Pro residues).
 i) Fraction of variants with stop codons: Fraction of single nucleotide substitutions resulting in a stop codon (TAA, TGA or TAG)
 ii) Fraction of variants with Gly or Pro: Fraction of single nucleotide substitutions resulting in a glycine or proline codon (GGA, GGT, GGG, GGC, CCA, CCT, CCG or CCC)
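The two fractions above can be illustrated with a minimal sketch. It assumes an unbiased mutational spectrum (every single nucleotide exchange equally likely), whereas MAP weights exchanges by the spectrum of each mutagenesis method. The function names are hypothetical and are not part of the MAP server. Following the wording above literally, any variant codon that encodes Gly or Pro is counted, including synonymous changes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_nt_variants(codon):
    """Yield the 9 codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def structure_indicator(seq):
    """Return (fraction of stop variants, fraction of Gly/Pro variants).

    Assumption: all single nucleotide exchanges are weighted equally.
    """
    total = stop = gly_pro = 0
    # Walk the sequence codon by codon, ignoring an incomplete trailing codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        for variant in single_nt_variants(seq[i:i + 3]):
            aa = CODON_TABLE[variant]
            total += 1
            if aa == "*":
                stop += 1
            elif aa in "GP":
                gly_pro += 1
    return stop / total, gly_pro / total
```

For example, the Trp codon TGG has nine single nucleotide variants, of which two are stop codons (TAG, TGA) and one encodes Gly (GGG).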

b) Amino acid diversity indicator
It is measured by the fraction of variants with preserved amino acids and the average number of amino acid substitutions per residue.
 i) Fraction of variants with preserved amino acids: Fraction of single nucleotide substitutions that do not change the encoded amino acid
 ii) Average amino acid substitution per residue: Average number of amino acid substitutions after single nucleotide exchange of a codon
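A minimal sketch of these two values follows, again assuming an unbiased mutational spectrum rather than the method-specific spectra that MAP uses. It interprets the "average amino acid substitution per residue" as the number of distinct new amino acids reachable from a codon by one nucleotide exchange, with stop codons excluded. Both the interpretation and the function name are assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def diversity_indicator(seq):
    """Return (fraction of preserved variants, mean distinct substitutions per codon)."""
    total = preserved = 0
    distinct_per_codon = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        original = CODON_TABLE[codon]
        new_amino_acids = set()
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a mutation
                aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                total += 1
                if aa == original:
                    preserved += 1  # synonymous exchange
                elif aa != "*":
                    new_amino_acids.add(aa)  # assumption: stops are not counted
        distinct_per_codon.append(len(new_amino_acids))
    return preserved / total, sum(distinct_per_codon) / len(distinct_per_codon)
```

For the Gly codon GGA, three of the nine exchanges are synonymous (GGT, GGC, GGG) and four distinct amino acids (Arg, Val, Ala, Glu) can be reached.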

c) Codon diversity coefficient
It is a coefficient that measures how random mutations are distributed among the codons of a gene. For a method with a non-biased mutational spectrum (equal occurrence of A-N, T-N, G-N and C-N), the codon diversity coefficient has a value of 0. Biased methods show preferences toward certain nucleotide exchanges and mutate certain nucleotides in codons preferentially. In other words, biased mutagenesis methods generate "hot spots" of mutagenesis that compromise genetic diversity.

d) Chemical diversity indicator
It analyzes how chemically different the substituted amino acids are. For the chemical diversity indicator, amino acids are grouped into four categories according to their chemical properties. An ideal mutagenesis method would substitute each residue equally with the 19 other amino acids at each amino acid position.
 i) Chemically categorized amino acid substitution graph: Shows the percentages of aliphatic, aromatic, neutral and charged amino acid substitutions generated by all 19 random mutagenesis methods
 ii) Chemically categorized amino acid substitution values: Data are reported as deviation of each random mutagenesis method from the ideal chemical distribution described above
 iii) Amino acid substitution patterns (matrix): Show the extent to which each amino acid species is generated. These figures show the substitution pattern for each of the 20 amino acids in the protein of interest for all 19 random mutagenesis methods. The Y-axis represents the 20 amino acid species in the protein of interest and the X-axis represents the amino acid substitution pattern generated.
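The percentage breakdown by chemical category can be sketched as follows. The membership of the four categories is not listed on this page, so the grouping in CLASSES is an assumption made for illustration. The sketch also assumes an unbiased mutational spectrum and excludes synonymous exchanges and stop codons from the count.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed grouping of the 20 amino acids; MAP's actual categories may differ.
CLASSES = {
    "aliphatic": "GAVLI",
    "aromatic": "FYW",
    "neutral": "CMPSTNQ",
    "charged": "DEHKR",
}
AA_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def chemical_distribution(seq):
    """Return the percentage of amino acid substitutions falling in each category."""
    counts = dict.fromkeys(CLASSES, 0)
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        original = CODON_TABLE[codon]
        for pos in range(3):
            for base in BASES:
                aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                # Skip unchanged codons, synonymous exchanges and stop codons.
                if aa not in ("*", original):
                    counts[AA_CLASS[aa]] += 1
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}
```

Comparing such a distribution with the one obtained when every residue is replaced equally by the 19 other amino acids gives a deviation of the kind reported in ii).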
Structure based analysis
It is performed for one selected random mutagenesis method at a time and requires a crystallographic structure or a reliable homology model of the protein along with the nucleotide sequence as input. In addition to the basic MAP indicators, it relates the mutational spectrum to factors that affect protein stability, flexibility and activity by correlating it with the local structural environment of the protein and the molecular interactions of its residues:

a) Local structure environment
The local structural environment of the protein comprises secondary structure elements, residue flexibility and solvent accessibility.

 i) Secondary structure assessment: Secondary structure assignments are important to assure the optimal yield of experimental structures and to select mutagenesis targets sensibly. The server provides secondary structure information by using the DSSP program. Each residue is assigned one of four states: H: alpha helix, B: beta bridge and extended strand, T: hydrogen bonded turn and bend, *: loop or irregular structure. Ref
 ii) Residue flexibility: Proteins are dynamic molecules in constant motion, which enables the structural flexibility associated with biological processes such as molecular recognition and catalytic activity. Crystallographic B-factors (obtained from the crystallographic structure of the protein) are used as a measure of residue flexibility. The relative B-factor value of the backbone atoms is used to differentiate flexible regions of the protein from rigid ones. Ref
 iii) Relative solvent accessibility: Solvent-inaccessible residues have a lower rate of acceptance of mutations than residues on the surface, and solvent accessibility has been used together with residue flexibility to estimate protein stability. Relative solvent accessibility (RSA) is used to differentiate between exposed and buried residues. RSA is calculated as the ratio of the number of water molecules in contact with the residue to the total surface area of the residue. A threshold on RSA separates exposed residues (RSA>=0.16) from buried residues (RSA<0.16). Ref
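The four-state secondary structure assignment and the RSA threshold can be sketched as two small helpers. The mapping from DSSP one-letter codes follows the four states listed in i). Placing every other DSSP code (for example G and I helices) in the "*" state is an assumption, and the function names are hypothetical.

```python
# DSSP codes folded into the four states described above.
# Assumption: any code not listed here (G, I, blank, ...) is treated as "*".
DSSP_TO_MAP = {
    "H": "H",  # alpha helix
    "B": "B",  # beta bridge
    "E": "B",  # extended strand
    "T": "T",  # hydrogen bonded turn
    "S": "T",  # bend
}


def reduce_dssp(code):
    """Map a DSSP one-letter code to one of the four states H, B, T or *."""
    return DSSP_TO_MAP.get(code, "*")


def burial_state(rsa, threshold=0.16):
    """Classify a residue as exposed (RSA >= threshold) or buried (RSA < threshold)."""
    return "exposed" if rsa >= threshold else "buried"
```

A residue with DSSP code E and an RSA of 0.05 would therefore be reported as state B and buried.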

b) Molecular interactions
Inter-residue interactions play an important part in protein folding, stability and function. Knowledge of the molecular interactions helps to evaluate the effect of amino acid substitutions on the stability or activity of the protein.

 i) Salt bridges: They are relatively weak ionic bonds between oppositely charged residues in protein structures. The server uses a script that defines a salt bridge when the charged-group atoms of the residues lie within a distance of 2.0 to 4.0 Å in the structure. Ref
 ii) Hydrophobic interactions: If the distance between the hydrophobic chains of non-polar amino acids is within 5 Å, the residues are considered to be involved in a hydrophobic interaction. The server applies the same criterion with a Perl script at the backend. Ref
 iii) Aromatic interactions: Aromatic residues separated by a preferential distance of 4.5 to 7.0 Å are considered to be involved in an aromatic interaction. Ref
 iv) Side chain hydrogen bonds: A hydrogen bond is defined by a donor-acceptor distance within 3.5 Å (oxygen and nitrogen) or 4.0 Å (sulphur). Angular criteria are not considered when calculating side chain to side chain and side chain to main chain hydrogen bonds in MAP analysis. Ref
 v) Disulphide bonds: These are covalent bonds formed by the coupling of the thiol groups of cysteines; they are calculated by the DSSP program. Ref
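The distance criteria above can be expressed as a simple window test. This is a sketch and not the server's Perl script. Distances are assumed to be in ångström, and the caller is assumed to supply the relevant coordinates for each residue (charged-group atoms for salt bridges, side chain atoms for hydrophobic contacts, a ring centroid for aromatic residues). All names are hypothetical.

```python
from math import dist

# Distance windows (lower, upper) taken from the criteria above; units assumed to be ångström.
CRITERIA = {
    "salt_bridge": (2.0, 4.0),
    "hydrophobic": (0.0, 5.0),
    "aromatic": (4.5, 7.0),
}


def interacts(kind, atoms_a, atoms_b):
    """Return True if any atom pair from the two residues lies inside the window.

    `atoms_a` and `atoms_b` are lists of (x, y, z) tuples chosen by the caller.
    """
    low, high = CRITERIA[kind]
    return any(low <= dist(a, b) <= high for a in atoms_a for b in atoms_b)


# Usage with made-up coordinates: two charged groups 3 units apart.
lys_nz = [(0.0, 0.0, 0.0)]
asp_od = [(3.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(interacts("salt_bridge", lys_nz, asp_od))
```

Hydrogen bonds are left out of the table because the cutoff depends on the element involved (3.5 for oxygen and nitrogen, 4.0 for sulphur), which would need atom types as additional input.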
The 19 random mutagenesis methods that are incorporated in MAP analysis:
Methods | Designator used in MAP | References
Taq-Pol (Unbalanced dNTP) | Taq (-, G>A=C=T) | 1
Taq-Pol (Mn2+/Balanced dNTP) | Taq (+, G=A=C=T) | 2
Taq-Pol (Mn2+/Unbalanced dNTP) | Taq (+, G>A=C=T) | 1
Taq-Pol (Mn2+/Unbalanced dNTP) | Taq (+, G>>A=C=T) | 1
Taq-Pol (Mn2+/Unbalanced dNTP) | Taq (+, G=A, C=T) | 3
Taq-Pol (Mn2+/Unbalanced dNTP) | Taq (+, G=T, A=C) | 4
Taq-Pol I164K | Taq (I164K) | 5
Mutazyme I | Mutazyme I | 6
Mutazyme II | Mutazyme II | 7
Pfu-Pol (exo-) D473G | Pfu (exo-, D473G) | 8
Enzymatic method employing reverse transcriptase | Transcriptase | 9
Taq-Pol (Nucleotide analogues dPTP and 8-oxodGTP) | Taq (dPTP/8-oxodGTP) | 10
Error-prone rolling circle amplification | epRCA | 11
Pol I method | Pol I | 12
E. coli expressing mutA allele of glyV gene | E. coli (mutA) | 13
Chemical mutagen (Nitrous acid) | Nitrous acid | 14
Chemical mutagen (Formic acid) | Formic acid | 14
Chemical mutagen (Hydrazine) | Hydrazine | 14
Chemical mutagen (Ethyl methane sulfonate) | EMS | 15

References for methods:
  1. Gurskaya, N. G., Fradkov, A. F., Pounkova, N. I., Staroverov, D. B., Bulina, M. E., Yanushevich, Y. G., Labas, Y. A., Lukyanov, S. & Lukyanov, K. A. (2003). A colourless GFP homologue from the non-fluorescent hydromedusa Aequorea coerulescens and its fluorescent mutants. Biochem. J. 373(Pt 2), 403-408.

  2. Lin-Goerke, J. L., Robbins, D. J. & Burczak, J. D. (1997). PCR-based random mutagenesis using manganese and reduced dNTP concentration. Biotechniques 23, 409-412.

  3. Shafikhani, S., Siegel, R. A., Ferrari, E. & Schellenberger, V. (1997). Generation of large libraries of random mutants in Bacillus subtilis by PCR-based plasmid multimerization. Biotechniques 23, 304-310.

  4. Vartanian, J. P., Henry, M. & Wain-Hobson, S. (1996). Hypermutagenic PCR involving all four transitions and a sizeable proportion of transversions. Nucleic Acids Res. 24, 2627-2631.

  5. Patel, P. H., Kawate, H., Adman, E., Ashbach, M. & Loeb, L. A. (2001). A single highly mutable catalytic site amino acid is critical for DNA polymerase fidelity. J. Biol. Chem. 276, 5044-5051.

  6. Cline, J. & Hogrefe, H. (2000). Randomize gene sequences with new PCR mutagenesis kit. Strategies 13, 157-162.

  7. Stratagene. (2004). Overcome mutational bias. Strategies 17, 20-21.

  8. Biles, B. D. & Connolly, B. A. (2004). Low-fidelity Pyrococcus furiosus DNA polymerase mutants useful in error-prone PCR. Nucleic Acids Res. 32, e176.

  9. Lehtovaara, P. M., Koivula, A. K., Bamford, J. & Knowles, J. K. C. (1988). A new method for random mutagenesis of complete genes: enzymatic generation of mutant libraries in vitro. Protein Eng. Des. Sel. 2, 63-68.

  10. Zaccolo, M., Williams, D. M., Brown, D. M. & Gherardi, E. (1996). An approach to random mutagenesis of DNA using mixtures of triphosphate derivatives of nucleoside analogues. J. Mol. Biol. 255, 589-603.

  11. Fujii, R., Kitaoka, M. & Hayashi, K. (2004). One-step random mutagenesis by error-prone rolling circle amplification. Nucleic Acids Res. 32, e145.

  12. Camps, M., Naukkarinen, J., Johnson, B. P. & Loeb, L. A. (2003). Targeted gene evolution in Escherichia coli using a highly error-prone DNA polymerase I. P. Natl. Acad. Sci. USA 100, 9727-9732.

  13. Balashov, S. & Humayun, M. Z. (2004). Specificity of spontaneous mutations induced in mutA mutator cells. Mutat. Res-Fund. Mol. M. 548, 9-18.

  14. Myers, R. M., Lerman, L. S. & Maniatis, T. (1985). A general-method for saturation mutagenesis of cloned DNA fragments. Science 229, 242-247.

  15. Lai, Y. P., Huang, J., Wang, L. F., Li, J. & Wu, Z. R. (2004). A new approach to random mutagenesis in vitro. Biotechnol. Bioeng. 86, 622-627.

Reference: Verma R, Schwaneberg U, Roccatano D 2012. MAP2.03D: A sequence/structure based server for protein engineering  
Website copyright 2005 Mutagenesis Assistant Program. All rights reserved