About
MAFFT is a multiple sequence alignment program for Unix-like operating systems.
It offers a range of multiple alignment methods, including L-INS-i (accurate; for alignments of up to ∼200 sequences) and FFT-NS-2 (fast; for alignments of up to ∼10,000 sequences).
Download and Installation
The latest version is 7.045 (2013/Jun/6).
Bug information:
In versions 7.036 (2013/Apr/24) and 7.035 (2013/Apr/23; only the source and Linux packages were available):
- The E-INS-i mode did not work.
- The --clustalout option did not work.
These bugs were fixed in version 7.037 (2013/Apr/25).
Input Format
Input must be in FASTA format.
Examples: example1 (LSU rRNA), example2 (protein).
The type of input sequences (amino acid or nucleotide) is recognized automatically.
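As a minimal sketch of what a FASTA input looks like, the snippet below writes a small file and counts its records. The file name `example.fa` and the three short protein-like sequences are made up for illustration; they are not the example files mentioned above.

```shell
# Write a tiny FASTA file: each record is a '>' header line followed by
# one or more sequence lines. File name and sequences are hypothetical.
cat > example.fa <<'EOF'
>seq1
MKTAYIAKQRQISFVKSHFSRQ
>seq2
MKTAYIAKQRISFVKSHFSRQLE
>seq3
MKSAYIAKQRQISFVKHFSRQ
EOF

# Count records by counting header lines.
n_seqs=$(grep -c '^>' example.fa)
echo "$n_seqs sequences"
```

A file like this can be passed to any of the commands in the Usage section as `input`.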
Usage
% mafft [arguments] input > output
An alias for the accurate option (L-INS-i), suited to alignments of up to ∼200 sequences × ∼2,000 sites:
% mafft-linsi input > output
A fast option (FFT-NS-2) for a larger sequence alignment:
% mafft input > output
If you are not sure which option to use:
% mafft --auto input > output
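The size guidelines above can be sketched as a small shell helper. The function name `suggest_mafft_cmd` is hypothetical, and the cutoffs of 200 sequences and 2,000 sites are simply the approximate figures quoted above, not limits enforced by MAFFT. The helper only prints a command name; it does not run MAFFT and is not how `--auto` is implemented.

```shell
# Hypothetical helper: print the command suggested by the rough size
# guidelines above for a FASTA file. It only prints a name.
suggest_mafft_cmd() {
    fasta=$1
    # Number of sequences = number of header lines.
    n_seqs=$(grep -c '^>' "$fasta" || true)
    # Longest sequence, summing residues over wrapped lines per record.
    max_len=$(awk '
        /^>/ { if (len > max) max = len; len = 0; next }
             { len += length($0) }
        END  { if (len > max) max = len; print max + 0 }
    ' "$fasta")
    # Assumed cutoffs: ~200 sequences and ~2,000 sites, as quoted above.
    if [ "$n_seqs" -le 200 ] && [ "$max_len" -le 2000 ]; then
        echo mafft-linsi
    else
        echo mafft
    fi
}
```

With MAFFT installed, the suggestion could be used as `"$(suggest_mafft_cmd input)" input > output`. When in doubt, `mafft --auto` remains the simpler route.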
References
- Katoh, Standley 2013 (Molecular Biology and Evolution 30:772-780)
  MAFFT multiple sequence alignment software version 7: improvements in performance and usability.
  (outlines version 7)
- Katoh, Frith 2012 (Bioinformatics 28:3144-3146)
  Adding unaligned sequences into an existing alignment using MAFFT and LAST.
  (describes the --add and --addfragments options)
- Katoh, Toh 2010 (Bioinformatics 26:1899-1900)
  Parallelization of the MAFFT multiple sequence alignment program.
  (describes the multithread version)
- Katoh, Asimenos, Toh 2009 (Methods in Molecular Biology 537:39-64)
  Multiple Alignment of DNA Sequences with MAFFT. In Bioinformatics for DNA Sequence Analysis, edited by D. Posada.
  (outlines DNA alignment methods and several tips including group-to-group alignment and rough clustering of a large number of sequences)
- Katoh, Toh 2008 (BMC Bioinformatics 9:212)
  Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework.
  (describes RNA structural alignment methods)
- Katoh, Toh 2008 (Briefings in Bioinformatics 9:286-298)
  Recent developments in the MAFFT multiple sequence alignment program.
  (outlines version 6; Fast Breaking Paper in Thomson Reuters' ScienceWatch)
- Katoh, Toh 2007 (Bioinformatics 23:372-374) Errata
  PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences.
  (describes the PartTree algorithm)
- Katoh, Kuma, Toh, Miyata 2005 (Nucleic Acids Res. 33:511-518)
  MAFFT version 5: improvement in accuracy of multiple sequence alignment.
  (describes [ancestral versions of] the G-INS-i, L-INS-i and E-INS-i strategies)
- Katoh, Misawa, Kuma, Miyata 2002 (Nucleic Acids Res. 30:3059-3066)
  MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.
  (describes the FFT-NS-1, FFT-NS-2 and FFT-NS-i strategies)
Contact
kazutaka.katoh@aist.go.jp
License
Copyright © 2013 Kazutaka Katoh