Multiple ncRNA alignment
Version 6.5 has two new options, Q-INS-i and X-INS-i,
in which secondary structure information of RNA is considered.
These methods are suitable for a global alignment of
highly diverged ncRNA sequences.
For relatively conserved RNAs, such as SSU and LSU rRNA,
the advantage of these methods is small.
Benchmark results can be seen here.
- Q-INS-i:
- Applicable to up to ∼200 sequences × ∼1,000 nt.
- Uses the Four-way Consistency objective function (Katoh and Toh, submitted) for incorporating structural information.
- X-INS-i:
- Applicable to up to ∼50 sequences × ∼1,000 nt.
- X-INS-i is a framework based on the Four-way Consistency objective function
that builds a multiple structural alignment by combining pairwise structural
alignments given by an external program.
At present, the external program can be selected from
MXSCARNA, LaRA and FOLDALIGN (the local and global options).
- We are ready to support other external programs
as the source of pairwise structural alignments.
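The size limits above can be checked before choosing a method. The following is a minimal sketch, not part of MAFFT: the function name `applicable_rna_methods` is hypothetical, the input is assumed to be a FASTA file (one `>` header line per sequence), and only the approximate sequence-count limits (∼50 and ∼200) are tested. Sequence length (∼1,000 nt) is not checked.

```shell
# Hypothetical helper (not part of MAFFT): list the RNA methods whose
# approximate sequence-count limit, as quoted above, covers the input.
# Assumes FASTA input; sequence length is NOT checked.
applicable_rna_methods() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    # Count FASTA header lines (lines starting with '>').
    n=$(grep -c '^>' "$1" || true)
    if [ "$n" -le 50 ]; then
        echo "mafft-qinsi mafft-xinsi"
    elif [ "$n" -le 200 ]; then
        echo "mafft-qinsi"
    else
        echo "none"
    fi
}
```

Because the limits are stated only approximately, the output is best read as a rough guide rather than a hard rule.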
RNA structural alignment is incompatible with gcc 4.8.x.
Even if compilation succeeds, the result can be incorrect.
If gcc 4.8.x is the only compiler available, use a pre-compiled package (2013/Jan).
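To see whether the local compiler falls in the affected series, a check along the following lines may help. This is a sketch under assumptions: the helper name `is_gcc_48` is hypothetical, and it relies on `gcc -dumpversion` printing a version string such as `4.8.5`.

```shell
# Hypothetical helper: succeed (exit status 0) if the given version
# string belongs to the gcc 4.8.x series.
is_gcc_48() {
    case "$1" in
        4.8|4.8.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the gcc found on PATH, if there is one.
if command -v gcc >/dev/null 2>&1; then
    ver=$(gcc -dumpversion)
    if is_gcc_48 "$ver"; then
        echo "gcc $ver: use a pre-compiled package for RNA structural alignment"
    else
        echo "gcc $ver: not in the 4.8.x series"
    fi
fi
```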
Download
- MAFFT
Download the mafft-*-with-extensions-src.tgz package.
It includes code from the Vienna RNA package, MXSCARNA and ProbConsRNA.
License notice
- MXSCARNA (Included in the mafft-*-with-extensions-src.tgz package)
The X-INS-i option uses, by default, MXSCARNA as the source of pairwise structural alignments.
Although MXSCARNA is a multiple alignment program itself,
here we use its pairwise structural alignment function, SCARNA.
- FOLDALIGN (Optional)
Rename the foldalign executable to
foldalign210 and copy it into the /usr/local/lib/mafft/ directory
or the directory that the MAFFT_BINARIES environment variable points to.
- CONTRAfold (Optional)
Copy the contrafold executable into the /usr/local/lib/mafft/ directory
or the directory that the MAFFT_BINARIES environment variable points to.
CONTRAfold can be used for computing the base-pairing probability
(not for structural alignment), replacing the McCaskill algorithm.
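Whether the optional executables are in place can be checked with a short script. This is a sketch, not an official tool: the function name `mafft_bindir` is hypothetical, and it assumes that MAFFT_BINARIES, when set, takes precedence over /usr/local/lib/mafft.

```shell
# Hypothetical helper: print the directory where the optional executables
# are expected. Assumes $MAFFT_BINARIES, if set and non-empty, takes
# precedence over the default /usr/local/lib/mafft.
mafft_bindir() {
    echo "${MAFFT_BINARIES:-/usr/local/lib/mafft}"
}

# Report whether each optional program (named as described above)
# is present and executable in that directory.
for prog in foldalign210 contrafold; do
    if [ -x "$(mafft_bindir)/$prog" ]; then
        echo "$prog: found in $(mafft_bindir)"
    else
        echo "$prog: not found in $(mafft_bindir)"
    fi
done
```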
Installation
(Installation of MAFFT)
% gunzip -cd mafft-x.x-with-extensions-src.tgz | tar xfv -
% cd mafft-x.x-with-extensions/core/
% make clean
% make
% su
# make install
# exit
% cd ../../
(Installation of MXSCARNA included in this package)
% cd mafft-x.x-with-extensions/extensions/
% make clean
% make
% su
# make install
# exit
% cd ../
(Installation of FOLDALIGN)
# cp /somewhere/foldalign /usr/local/lib/mafft/foldalign210
# chmod guo+rx /usr/local/lib/mafft/foldalign210
(Installation of CONTRAfold)
# cp /somewhere/contrafold /usr/local/lib/mafft/
# chmod guo+rx /usr/local/lib/mafft/contrafold
To install the programs into a directory other than
/usr/local/lib/mafft/, see the readme file.
Usage
Q-INS-i
% mafft-qinsi input > output
To use the CONTRAfold algorithm, instead of the McCaskill algorithm,
% mafft-qinsi --contrafold input > output
X-INS-i
By default, MXSCARNA is selected as the source of pairwise structural alignments
(X-INS-i-scarnapair):
% mafft-xinsi input > output
which is equivalent to
% mafft-xinsi --scarnapair input > output
To use LaRA (X-INS-i-larapair),
% mafft-xinsi --larapair --laraparams lara.params input > output
To use the local alignment option of FOLDALIGN (X-INS-i-foldalignlocalpair),
% mafft-xinsi --foldalignlocalpair input > output
To use the global alignment option of FOLDALIGN (X-INS-i-foldalignglobalpair),
% mafft-xinsi --foldalignglobalpair input > output
To use CONTRAfold, instead of the McCaskill algorithm,
to compute the base-pairing probability,
% mafft-xinsi --contrafold --scarnapair input > output
The --contrafold option can be combined with
any of the --*pair options.
In total, there are 4 types of structural alignment algorithms ×
2 types of base-pairing probabilities = 8 possible variants of X-INS-i.
As the difference in accuracy among them is small in a benchmark test,
the fastest combination (SCARNA and McCaskill) is selected by default, at present.
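The eight variants can be enumerated with a small loop, for example to compare them on one dataset. The sketch below only prints the command lines and does not run them. The function name `xinsi_variants` is hypothetical, and `input` and `lara.params` are placeholders as in the examples above.

```shell
# Hypothetical helper: print the eight X-INS-i command lines
# (4 pairwise structural aligners x 2 base-pairing probability methods).
# Only the options shown in the usage examples above are used.
xinsi_variants() {
    for pair in --scarnapair "--larapair --laraparams lara.params" \
                --foldalignlocalpair --foldalignglobalpair; do
        # Empty string = McCaskill (default); otherwise CONTRAfold.
        for prob in "" "--contrafold "; do
            echo "mafft-xinsi ${prob}${pair} $1"
        done
    done
}

xinsi_variants input
```

Each printed line could be redirected to its own output file and run in turn; the FOLDALIGN, LaRA and CONTRAfold variants require those external programs to be installed.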