MAFFT version 7

Multiple alignment program for amino acid or nucleotide sequences

Sep. 25–29: Maintenance is planned. The service will remain mostly available but may occasionally be unstable during this period.

This service is experimental (as of August 2017). The upper limit on data size and other settings may change after testing on actual cases.

Multiple alignment of a large number of short and highly similar sequences

Typical data size is up to ∼200,000 sequences × ∼5,000 sites (including gaps), but the practical limit depends on sequence similarity.
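Before uploading, it can help to check roughly how large an input file is. The following is a minimal sketch, not part of this service: the function names and constants are hypothetical, the two figures are the approximate values quoted above rather than hard limits, and the length of the longest input sequence is used only as a lower bound on the number of alignment sites.

```python
from typing import Iterable, Tuple

# Approximate figures quoted on this page; assumed here as soft thresholds.
TYPICAL_MAX_SEQUENCES = 200_000
TYPICAL_MAX_SITES = 5_000


def fasta_dimensions(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of sequences, length of the longest sequence)."""
    count = 0
    longest = 0
    current = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header closes the previous record and opens a new one.
            longest = max(longest, current)
            count += 1
            current = 0
        elif count:
            # Sequence lines are only counted once a header has been seen.
            current += len(line)
    longest = max(longest, current)
    return count, longest


def within_typical_size(lines: Iterable[str]) -> bool:
    """True if the input stays within the approximate typical data size."""
    n_seqs, longest = fasta_dimensions(lines)
    return n_seqs <= TYPICAL_MAX_SEQUENCES and longest <= TYPICAL_MAX_SITES


example = """>seq1
ACGTACGT
ACGT
>seq2
ACG-TACGT
""".splitlines()

print(fasta_dimensions(example))     # (2, 12)
print(within_typical_size(example))  # True
```

Because the real limit depends on similarity, a file that passes this check may still be too large, and one that narrowly fails may still be accepted.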
Input
Upload DNA or protein sequences (FASTA format) in a plain text file: Example

or paste sequences (FASTA format) here: 


UPPERCASE / lowercase:

Direction of nucleotide sequences:

Output order:

Notify when finished (optional; recommended when submitting large data):
Email address:

Advanced settings

Strategy:
Progressive methods with chained guide trees:

Tree-based progressive methods:

Partially iterative refinement methods (for fewer than 100,000 sequences):


Memory usage (effective for FFT-NS-1, FFT-NS-2 and mafft-sparsecore):

Parameters:
Scoring matrix for amino acid sequences:
Scoring matrix for nucleotide sequences:
↑ Switch it to '1PAM / κ=2' when aligning closely related DNA sequences.
Gap opening penalty: (1.0 - 5.0)
Offset value: (0.0 - 1.0)

Score of N in nucleotide data: Example
↓ Long stretches of Ns tend to be gapped (excluded from the alignment).

Experimental option (2016/Apr/26)
↑ Try this if Ns should be aligned with usual letters.
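For readers who run MAFFT locally, the parameter ranges above can be checked before a command line is built. This is a hedged sketch under stated assumptions: `build_mafft_args` and its helper are hypothetical names, and mapping this form's fields to the command-line options `--op` (gap opening penalty), `--ep` (offset value) and `--kimura 1` (the '1PAM / κ=2' matrix) is an assumption about how the form relates to the command-line program, not something this page states.

```python
from typing import List, Tuple

# Ranges as shown on this form.
GAP_OPEN_RANGE = (1.0, 5.0)
OFFSET_RANGE = (0.0, 1.0)


def _check_range(name: str, value: float, bounds: Tuple[float, float]) -> float:
    """Return value unchanged if it lies within bounds, else raise ValueError."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


def build_mafft_args(
    input_path: str,
    gap_open: float,
    offset: float,
    closely_related_dna: bool = False,
) -> List[str]:
    """Build an argument list for a local mafft run (assumed option mapping)."""
    args = [
        "mafft",
        "--op", str(_check_range("gap opening penalty", gap_open, GAP_OPEN_RANGE)),
        "--ep", str(_check_range("offset value", offset, OFFSET_RANGE)),
    ]
    if closely_related_dna:
        # Assumed equivalent of selecting '1PAM / κ=2' on the form.
        args += ["--kimura", "1"]
    args.append(input_path)
    return args


print(build_mafft_args("input.fasta", 1.53, 0.0, closely_related_dna=True))
```

The list can be passed to `subprocess.run` with standard output redirected to a file; values outside the ranges raise `ValueError` before anything is run.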

Plot LAST hits (DNA only):


Threshold:

Reference

Katoh, Rozewicki, Yamada 2017 (Briefings in Bioinformatics, in press)
MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization
Access the recommendation on F1000Prime