Made small changes in output of mafft-homologs.rb and --scoreout.
Fixed a bug in the combination of --adjustdirection and --anysymbol.
Due to this bug, unnecessary text was added to title lines.
Fixed a problem in the --addfragments option, when a sequence to be added is longer than its closest homolog(s) in the reference alignment.
Due to this change, the assumed tree became different between --retree 0 and other cases.
Slightly changed the handling of internal gaps in --add and --addfragments options.
Fixed a bug in the --addfragments option.
In the multithread mode, when the sequences to be added include outlier(s) to the reference alignment, a memory error sometimes occurred due to this bug.
Fixed a compilation problem that occurred when multithreading is disabled.
Fixed a bug in the --merge option.
This option did not work in versions 7.182 - 7.187.
Fixed a bug in versions 7.182-7.186 in handling null or empty sequences in the multithread mode.
Fixed an environment-specific bug.
On Mac OS X 10.9 (Mavericks), the --progress option did not work.
Windows version now ignores the --thread -1 option.
Removed an unnecessary warning.
Fixed a bug in the --add option;
there was a possibility that repetitive sequences were truncated, when --add was applied without --localpair, --globalpair or --genafpair.
This bug affected the --add option in all the previous versions.
This bug did not affect the --addfragments or any options other than --add.
Updated the --fmodel option for nucleotide alignment with biased base composition.
v7.046 2013/06/12 (source only)
Changed the default compile option of MXSCARNA, such that RNA alignment methods (X-INS-i and Q-INS-i) work more stably in various environments.
"-funroll-loops" and "-finline-limit=" have been removed from mafft-*.*-with-extensions/extensions/mxscarna_src/Makefile.
Re-enabled the combination of --merge and --seed.
Fixed a memory leak.
Fixed a bug in the --merge option in the multithread mode in versions 7.036 - 7.040.
Changed the behavior of --merge such that each of the given groups is forced to form a monophyletic cluster.
Enabled iterative refinement in the --merge option.
Disabled the combination of --merge and --seed, because some problems were found.
Changed the order of sequences to reflect the similarity better, when the --reorder --addfragments options are given.
Fixed a bug in the E-INS-i mode in version 7.036.
Fixed a bug in the --clustalout option in version 7.036.
New option: --merge creates a single MSA from multiple sub-MSAs.
Changed the setting of X-INS-i back to that of version 6.864.
In versions 6.884 - 7.032, the accuracy of X-INS-i was slightly lower than that of the previous versions.
Ambiguous nucleotides (r, y, w, s, k, m, d, v, h, b; IUPAC-IUB codes) are scored as:
In previous versions, they were scored equivalently to n.
Fixed a bug in handling X in the seed alignment in the --seed option.
Improved the efficiency for all-to-all pairwise alignment.
Fixed a memory leak.
Fixed a memory allocation bug in the --treeout option.
Fixed a memory allocation bug in the multithread mode.
Fixed a bug in the f2cl program.
Support for titles of >10 characters in the phylip format
(--phylipout --namelength n).
n = 10 by default.
Fixed a bug in the score program.
Slightly changed the format of tree by --dpparttree --treeout
and --parttree --treeout.
Improved the efficiency of the --addfragments option for large data.
The effect of this change is small in most cases.
Fixed a Windows-specific bug; incorrect option name was displayed at the end of calculation in versions 7.012-7.015.
Changed some features only used in the web service.
Modified the behavior of --auto.
The --dpparttree --alga option is selected for large data.
There may be further changes in the future.
Fixed a problem in the FFT-NS-i option: the order of sequences (with the --reorder option) was slightly different from the order of sequences in the guide tree (--treeout).
Changed an output format that is only used internally in the web service.
This version uses local alignment to estimate the direction of nucleotide sequences, in the --adjustdirectionaccurately option.
Modified the behavior of --auto --addfragments.
The thresholds may be changed in the future.
The number of threads for the iterative refinement stage can be specified by --threadit n, independently from --thread m.
By default, n = min( 6, m ).
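The default rule can be written as a short function. This is only a sketch: `default_threadit` is a hypothetical helper name, not part of MAFFT, and it assumes m has already been resolved to a positive thread count.

```python
def default_threadit(m: int) -> int:
    """Default thread count for the iterative refinement stage, given --thread m.

    Hypothetical helper illustrating n = min(6, m); assumes m is a positive
    integer (i.e., not the special value -1).
    """
    if m < 1:
        raise ValueError("m must be a positive thread count")
    return min(6, m)


if __name__ == "__main__":
    for m in (1, 4, 6, 16):
        print(m, default_threadit(m))
```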
The --auto --addfragments option checks the size of problem and automatically determines if an approximate method, --6merpair, is applied.
The threshold may be changed in the future.
The --treeout --addfragments option outputs the estimated phylogenetic positions of the sequences to be added.
The --retree 0 --treeout --addfragments option outputs the estimated phylogenetic positions of the sequences to be added.
Alignment calculation is skipped.
Fixed problems in stderr messages.
Modified Makefile such that it strips binaries.
Improved the efficiency of memory usage in the --6merpair --addfragments option.
Modified the --thread -1 option, such that it correctly counts the number of cores on Linux on VMware.
Corrected an example, test/sample.linsi, in the source package.
Fixed a bug in --addfragments in version 6.951.
If no close relative of a new sequence was found, the program ran unstably.
Fixed a problem in the indicator of similarity level for nucleotide alignment, in the clustal format.
Extended the length of sequence title, shown in a tree with --treeout, to ∼250 letters.
Improved the efficiency of memory usage in the --6merpair --addfragments option.
Fixed a bug in the combination of --addfragments and --reorder.
The order of sequences in the output was incorrect, in versions 6.923 - 6.950.
This bug did not affect the alignment.
Improved the efficiency of memory usage in the --addfragments option.
Improved the efficiency of the --addfragments option for a large number of unaligned sequences.
Improved the parallelization efficiency of the --addfragments option for large data.
However, for small data, the efficiency has been slightly reduced.
The effect of this change is large when applying fast options, --6merpair and --10merpair, to large data,
but the effect is small in most cases.
Fixed a memory leak in --addfragments.
Enabled the --10merpair option for nucleotide alignment.
Distance matrix is computed based on the number of shared 10mers.
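As an illustration of counting shared 10-mers between two sequences, here is a minimal sketch. The function names are hypothetical, and two details are assumptions rather than MAFFT's documented behaviour: repeated k-mers are counted with multiplicity, and no conversion from the count to a distance is attempted.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 10) -> Counter:
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.lower()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shared_kmers(a: str, b: str, k: int = 10) -> int:
    """Number of k-mers shared by a and b, counted with multiplicity.

    The multiset intersection keeps, for each k-mer, the smaller of the
    two counts. Sequences shorter than k share nothing.
    """
    return sum((kmer_counts(a, k) & kmer_counts(b, k)).values())


if __name__ == "__main__":
    print(shared_kmers("acgtacgtacgtac", "acgtacgtacgtac"))
    print(shared_kmers("acgtacgtacgtac", "tttttttttttttt"))
```

A larger shared count indicates a closer pair; how the count is normalized into a distance is left open here.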
Slightly improved the speed of the --addfragments option.
Fixed a bug in --addfragments when the reference alignment has just one sequence.
Improved the speed of the --add and --addfragments options when the number of sequences is large.
Changed the default parameter when calling LAST.
Fixed several bugs.
Fixed several bugs.
Added new options, --adjustdirection and --adjustdirectionaccurately,
which adjust the direction of nucleotide sequences, according to the first sequence.
--adjustdirection is based on 6-mer counting and is faster.
--adjustdirectionaccurately is based on DP and slower.
The former works well in most cases, unless the sequences are highly diverged.
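The idea behind 6-mer-based direction adjustment can be sketched as follows: compare the 6-mers of a sequence, and of its reverse complement, against those of the first sequence, and keep whichever orientation shares more. This is an assumed, simplified illustration with hypothetical function names; it is not MAFFT's actual implementation, and ambiguity codes other than n are left uncomplemented.

```python
# Complement table for lowercase nucleotides; u is treated like t.
_COMPLEMENT = str.maketrans("acgtun", "tgcaan")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a nucleotide sequence, returned in lowercase."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def kmer_set(seq: str, k: int = 6) -> set:
    """Set of distinct overlapping k-mers in seq."""
    seq = seq.lower()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def adjust_direction(reference: str, query: str, k: int = 6) -> str:
    """Return query in the orientation sharing more k-mers with reference."""
    ref = kmer_set(reference, k)
    forward = query.lower()
    reverse = reverse_complement(query)
    n_forward = len(ref & kmer_set(forward, k))
    n_reverse = len(ref & kmer_set(reverse, k))
    # Ties keep the original orientation.
    return reverse if n_reverse > n_forward else forward


if __name__ == "__main__":
    ref = "acgtgcatgcaagtcc"
    flipped = reverse_complement(ref)
    print(adjust_direction(ref, flipped) == ref)
```

When sequences are highly diverged, few 6-mers are shared in either orientation, which is consistent with the note above that the counting approach can fail in that case.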
Changed the behavior of --thread -1:
# threads := # of physical cores + 1, if hyperthreading is on.
# threads := # of physical cores, if hyperthreading is off.
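The rule above, expressed as a small function. This is a sketch only: `auto_thread_count` is a hypothetical name, and detecting the number of physical cores and the hyperthreading state is platform-specific and not shown.

```python
def auto_thread_count(physical_cores: int, hyperthreading: bool) -> int:
    """Thread count chosen for --thread -1 under the rule stated above.

    Hypothetical helper; the caller supplies the core count and whether
    hyperthreading is enabled.
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return physical_cores + 1 if hyperthreading else physical_cores


if __name__ == "__main__":
    print(auto_thread_count(4, hyperthreading=True))
    print(auto_thread_count(4, hyperthreading=False))
```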
A new option, --thread -1, automatically uses an appropriate number of threads (i.e., # of threads := # of physical cores).
Linux and Mac only.
A new option, --addfragments, to add short sequences to an existing alignment.
The usage and details will be available later.
Experimental support for multithreading on Intel Mac, in addition to Linux.
Changed the error message for the case where the MAFFT_BINARIES environment variable is incorrectly set (Mac only).
Modified some error messages the main script returns.
Changed the behavior of the --auto option.
When the number of sequences is > 10,000, FFT-NS-1 is selected.
FFT-NS-1 is faster than the default (FFT-NS-2).
Two different group-to-group algorithms, --alga and --algq,
are selectable for progressive alignment options, including parttree.
The --alga algorithm is a conventional one.
The --algq algorithm counts existing gaps differently and
the resulting alignment has more gaps.
In this version, by default,
--algq is used in the parttree options,
--alga is used in the other options.
Changed the group-to-group alignment algorithm in the --parttree and --dpparttree options,
which are for large alignments consisting of 50,000 or more sequences.
The new version tends to generate shorter alignment than previous versions.
According to a benchmark, the previous version is more accurate than the new version.
However, the alignment by the previous version sometimes becomes too long.
To emulate previous versions, add --algq
mafft --algq --parttree input > output
mafft --algq --dpparttree input > output
Extended the upper limit of the number of sequences for FFT-NS-1 and FFT-NS-2: 20,000 → 100,000
Extended the upper limit of the number of sequences for iterative refinement options: 4,000 → 6,000
Fixed a bug in handling very short sequences.
Fixed a memory allocation bug that causes a crash when null sequences are given.
Corrected the default installation directory of mxscarna_mod.
Fixed a bug in the --add option.
Changed the default location of subprograms from /usr/local/lib/mafft/ to /usr/local/libexec/mafft/.
Fixed incorrect descriptions on the CHECK step in readme.
Modified core/Makefile to be compatible with MacOSX.
(There is no change in binary packages.)
Corrected a formatting error in the --phylipout option.
Added the --out option to specify an output file, instead of stdout.
Fixed an incorrect target directory of manpages in Makefile.
Fixed several uninitialized variables and deleted unused variables.
Beta support for the PHYLIP interleaved format, --phylipout.
Name length n in a CLUSTAL format output can be controlled by --clustalout --namelength n.
Fixed a problem in a newick tree when the --anysymbol and --treeout options are simultaneously set.
Fixed the installation directory of mafft-profile and mafft-distance in Makefile.
Name length in a tree (generated by --treeout) has been extended from 20 to 60.
Several modifications just for experimental features.
Fixed a problem with directory name containing space.
Fixed a bug in Makefile of v6.715.
If you have the source of v6.715, please replace it with v6.716.
Fixed a platform-specific bug in the mafft script.
Modified the readme file on how to install without root.
Approximate distance matrix in the phylip format.
mafft-distance -p -i input > output
Some updates only for the online version.
Changed a stderr message.
Enabled the combination of --treein and --seed.
Fixed a non-standard usage of fprintf in pairlocalalign.c.
Fixed an OS-specific bug in --treein.
This bug affected the Windows version.
Fixed non-standard usage of make in extensions/.
Support for Mac ppc64 and x86_64 binaries.
Modified extensions/Makefile so that it passes down CXX and CXXFLAGS to mxscarna_src.
Compilation options can be specified as command-line arguments of make.
make CXX="g++" CXXFLAGS="-m32 -fast"
Fixed a potential overflow problem at the second progressive step of FFT-NS-*.
Slightly improved performance for alignment of long and highly conserved sequences.
Re-support for long and highly similar sequences.
Versions 6.619-6.704 required a huge RAM space when long (>∼1,000,000) and highly similar sequences were given.
To process such sequences with small RAM,
the corresponding code was reverted to that of version 6.611.
Corrected a typo in stderr message.
Support ambiguous amino acid codes, 'Z', 'B', and 'J'.
'U' is not supported.
Fixed a bug in --memsave.
Fixed a bug in --globalpair --retree 0 --treeout.
Changed the default setting: --ep 0.123 → --ep 0.0.
Adjusted parameters of the FFT alignment algorithm,
to suppress misalignments such as
in an alignment of long genomic DNAs.
Made minor modifications to input and output formats.
Fixed a bug (v6.500-v6.620) of mafft-xinsi and mafft-qinsi in the --quiet mode.
v6.620, 2008/12/10 6:00 PM JST
Fixed a bug (v6.619) by which L-INS-i, E-INS-i and G-INS-i always abort.
Enabled L-INS-i, E-INS-i and G-INS-i to handle long sequences (<30,000aa/nt). They may require a huge RAM space.
Fixed a bug (v5-v6.611) in the implementation of the combination of FFT and the memsave mode. This fix affects the alignments of closely-related and long genomic sequences.
Fixed a bug (v6.605-v6.611) in mafft-homologs.rb.
Fixed some bugs and memory leaks that may be related to a problem that mafft-xinsi --scarnapair sometimes aborts on Windows. This problem is not yet completely solved.
Changed the distance measures in the *-INS-* strategies. The accuracies of L-INS-i and E-INS-i have been slightly improved, while they have become slightly slower.
Fixed a bug (v6.605) in mafft-homologs.rb.
Applied security fixes to the
mafft and mafft-homologs.rb scripts, according to the Debian team's suggestion.
Changed compile options of the binary package for Mac.
Included MXSCARNA (Tabei et al. 2008) for computing pairwise RNA alignment used in X-INS-i.
Modified the LaRA part in the X-INS-i.
It depends on a specially adjusted version of LaRA (courtesy of M.Bauer).
Modified PREFIX in Makefile to make it easy to change the default installation directory.
Fixed a bug in the --treein option.
Disabled the --topin option, because it had a bug.
Added the --averagelinkage and --minimumlinkage options.
The mccaskill routine has become compatible with gcc4.3.
Fixed some problems in Makefile.
Fixed some problems in interactive mode.
Added an experimental batch script, mafft.bat for Windows.
Accepts short (<6 residues) sequences.
Adopts a length-dependent correction of 6-mer distance (unpublished).
Fixed a bug at --kappa and --fmodel.
Added --kimura x and --kappa y.
When DNA sequences are aligned,
the K80 model (Kimura 1980) with κ = y is
used to construct the scoring matrix.
Evolutionary distance among the sequences is assumed to be x PAM.
Default: --kimura 200 --kappa 2
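To make the roles of x and y concrete, the sketch below computes K80 substitution probabilities and a log-odds score table. Several things here are assumptions and not taken from MAFFT: x PAM is interpreted as x/100 expected substitutions per site, κ is the ratio of the transition rate to the rate of each transversion type, scores are natural-log odds against uniform base frequencies, and all function names are hypothetical. MAFFT's actual scaling of the scoring matrix may differ.

```python
import math

# Transitions are purine<->purine (a, g) and pyrimidine<->pyrimidine (c, t).
TRANSITIONS = {("a", "g"), ("g", "a"), ("c", "t"), ("t", "c")}


def k80_probabilities(pam: float = 200.0, kappa: float = 2.0):
    """Return (P_same, P_transition, P_each_transversion) under K80.

    Assumes pam / 100 expected substitutions per site (an assumption about
    how the PAM value maps to distance).
    """
    d = pam / 100.0
    beta_t = d / (kappa + 2.0)   # per-transversion-type rate x time
    alpha_t = kappa * beta_t     # transition rate x time
    e1 = math.exp(-4.0 * beta_t)
    e2 = math.exp(-2.0 * (alpha_t + beta_t))
    p_same = 0.25 + 0.25 * e1 + 0.5 * e2
    p_transition = 0.25 + 0.25 * e1 - 0.5 * e2
    p_transversion = 0.25 - 0.25 * e1
    return p_same, p_transition, p_transversion


def k80_log_odds(pam: float = 200.0, kappa: float = 2.0) -> dict:
    """Log-odds scores log(P(x -> y) / 0.25) for every base pair (x, y)."""
    p_same, p_ts, p_tv = k80_probabilities(pam, kappa)
    scores = {}
    for x in "acgt":
        for y in "acgt":
            if x == y:
                p = p_same
            elif (x, y) in TRANSITIONS:
                p = p_ts
            else:
                p = p_tv
            scores[(x, y)] = math.log(p / 0.25)
    return scores


if __name__ == "__main__":
    for pair, score in sorted(k80_log_odds().items()):
        print(pair, round(score, 3))
```

With κ = 1 the transition and transversion probabilities coincide, which reduces the model to equal rates for all substitutions; larger κ scores transitions more favourably than transversions.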
Accepts short (<6 residues) sequences.
Modified the mafft script so that it works with mawk and other awk compatible languages.
The McCaskill-MEA part has become g++4.x compatible.
Reordered the source codes for RNA alignment.
A new option, X-INS-i, for RNA alignment was added.
X-INS-i is a framework based on the Four-way Consistency objective function
to build a multiple structural alignment
by combining pairwise structural alignments
given by an external program.
At present, the external program can be selected from
MXSCARNA, LaRA and FOLDALIGN (the local and global options).
Although MXSCARNA and LaRA are multiple alignment programs themselves,
only their pairwise structural alignment functions are used.
CONTRAfold (Do et al. 2006) is selectable for calculating RNA base pairing probability.
It has to be installed into /usr/local/lib/mafft/.
Fixed bugs in Makefile.
A new option, Q-INS-i, for RNA alignment was added.
It uses a new objective function,
Four-way consistency (Katoh and Toh, submitted) calculated from
predicted secondary structure.
Algorithm Q perhaps improves the accuracy of FFT-NS-2, -1 and L-INS-1.
We have to do more tests. --algq
A rigorous and fast UPGMA algorithm proposed by
An approximate but faster O(N log N) tree-building algorithm (Katoh and Toh; in press),
applicable to huge datasets with ~50,000 sequences --parttree or --dpparttree
A length-dependent correction of 6-mer distance has been introduced (unpublished).
The accuracy of FFT-NS-1 was greatly enhanced as a result.
Pairwise alignment score,
instead of the number of substitutions
with the Poisson correction,
is used in the second phase of FFT-NS-2 and FFT-NS-i.
The effect of this is small in our tests.
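For reference, the Poisson correction mentioned above is the standard formula d = -ln(1 - p), where p is the observed fraction of differing sites. The sketch below shows only that formula, under a hypothetical function name; it does not reproduce the pairwise-alignment-score measure that replaced it, whose definition is not given here.

```python
import math


def poisson_corrected_distance(p: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p).

    p is the observed proportion of differing sites and must satisfy
    0 <= p < 1; the correction diverges as p approaches 1.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.5, 0.9):
        print(p, poisson_corrected_distance(p))
```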
User-defined aa scoring matrix
User-defined aa frequency
Tree output (parttree only)
The --auto option selects a method that is as accurate as possible. Not yet tested.
Support for Mac Universal Binary.
Default of mafft-profile → FFT on
mafft-profile supports memsave.
Fixed an incorrect description of fftnsi on the homepage.
Fixed a bug in mafft-homologs.rb to correctly recognize the version of mafft script.
v5.830 crashes saying 'hairetsu ga kowareta!' in
the memsave mode when
inserting a long (>32767) gap. Fixed in v5.850.
Options for handling a large dataset are automatically
chosen in v5.850.
Improved the speed of the FFT part.
if( tmpint==0 ) break;
Version 5.8 can handle larger data than the previous versions.
The previous versions aborted with
the 'LENGTH OVER' error when
the alignment length (incl. gaps) exceeded
5 × the length of the longest input sequence (excl. gaps).
This limitation has been removed in ver.5.8.
Thus a large dataset
(2,000 sequences × 5,000 residues (incl. gaps))
can be aligned by the FFT-NS-2 option
even when many gaps are needed.
See tips for details.
Problems in the memory saving mode have been fixed.
Version 5.7 has memory saving mode (--memsave)
that enables the FFT-NS-x strategies to align
long genomic DNA sequences (20kb or more).