MAFFT version 6

Multiple alignment program for amino acid or nucleotide sequences

Why do recent versions generate LONG alignments?

A default parameter was changed in version 6.626 (Mar. 16, 2009). 

--ep 0.123 → --ep 0.0

In comparison with the old default (--ep 0.123), the new default (--ep 0.0) allows longer indels and is thus more robust to unusual evolutionary events, such as domain-level indels. When little is known about the features of the input sequences, the new default is safer because it is less likely to make serious errors. Alignments produced with the new default are generally longer than those produced with the old default.
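To see the difference in length on your own data, the alignment length (number of columns) of two outputs can be compared. The following is a minimal sketch, assuming the output is in FASTA format; the function name aln_length and the file names old.fasta and new.fasta are hypothetical and not part of MAFFT.

```shell
# aln_length FILE: print the number of columns of an aligned FASTA file.
# Only the first sequence is measured, because every row of an alignment
# has the same length (residues plus gap characters).
aln_length() {
    awk '
        /^>/ { if (seen) exit; seen = 1; next }            # stop at the second header
        seen { gsub(/[ \t\r]/, ""); len += length($0) }    # add up wrapped sequence lines
        END  { print len + 0 }
    ' "$1"
}

# Possible usage (hypothetical file names):
#   mafft --ep 0.123 input > old.fasta
#   mafft --ep 0.0   input > new.fasta
#   aln_length old.fasta
#   aln_length new.fasta
```

If the second number is larger, the extra columns come from the longer gaps that --ep 0.0 permits.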

How to emulate old versions

The old default (--ep 0.123) gives better benchmark scores than the new default (--ep 0.0) (ref:). When it can be assumed that there are no large indels in the sequences, the use of a large --ep value (0.123 or larger) is recommended.

For small-scale data,

% mafft-ginsi --ep 0.123 input > output

For large-scale data,

% mafft-fftns --ep 0.123 input > output

For a large number (>∼20,000) of sequences,

% mafft --parttree --ep 0.123 input > output
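The choice among the three commands above can be sketched as a small shell helper. This is an illustration only: the function names choose_mafft_cmd and count_seqs are hypothetical, and the boundary of 200 sequences between "small-scale" and "large-scale" is an assumption made for the example, since the text does not define it. The 20,000 boundary follows the guidance above.

```shell
# choose_mafft_cmd NSEQ: print the command that matches the number of sequences.
# SMALL_MAX (default 200) is an ASSUMED boundary for "small-scale" data.
# LARGE_MAX (default 20000) follows the ">~20,000 sequences" guidance.
choose_mafft_cmd() {
    nseq=$1
    small_max=${SMALL_MAX:-200}
    large_max=${LARGE_MAX:-20000}
    if [ "$nseq" -le "$small_max" ]; then
        echo "mafft-ginsi --ep 0.123"
    elif [ "$nseq" -le "$large_max" ]; then
        echo "mafft-fftns --ep 0.123"
    else
        echo "mafft --parttree --ep 0.123"
    fi
}

# count_seqs FILE: count FASTA records by counting header lines.
count_seqs() {
    grep -c '^>' "$1"
}

# Possible usage:
#   $(choose_mafft_cmd "$(count_seqs input)") input > output
```

Set SMALL_MAX to a value that suits your data and hardware before relying on the first branch.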

Old versions (--ep 0.123) are faster than recent versions (--ep 0.0)

This is because CPU time increases with alignment length, and alignments produced with --ep 0.0 are generally longer than those produced with --ep 0.123.