MAFFT version 7

Multiple alignment program for amino acid or nucleotide sequences

Options specifically for SARS-CoV-2  2022/Mar

  • Uses the reference data corresponding to the MSA selected above.
  • Sets the same flags (--compactmapout, --maxambiguous and --addtotop) as the calculation in GISAID. 
The resulting alignment can be concatenated to all or part of GISAID's MSA to incorporate your new sequences into the MSA.  The GISAID MSA must be downloaded separately from the original site.
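
The concatenation step can be sketched in Python as follows. This is only an illustration, not part of MAFFT or GISAID tooling: the function names `parse_fasta` and `concatenate_alignments` are hypothetical, and the sketch assumes both inputs are FASTA-formatted text whose rows already have the same number of alignment columns.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def concatenate_alignments(existing_msa, new_rows):
    """Append newly aligned rows to an existing MSA (hypothetical helper).

    Raises ValueError if the rows do not all have the same number of columns,
    since rows of different lengths cannot belong to one alignment.
    """
    records = parse_fasta(existing_msa) + parse_fasta(new_rows)
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError(f"alignment lengths differ: {sorted(lengths)}")
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

The length check is the important part: if it fails, the new rows were probably aligned against a different reference or MSA version than the one being extended.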

Ambiguous letters:
and replace successive ns (nucleotide) or Xs (protein) in new sequences with a single n or X.
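
The replacement of successive ambiguous letters can be illustrated with a short Python sketch. It is not MAFFT's own code: the function name `collapse_ambiguous_runs` is hypothetical, and the choice to match both upper and lower case while keeping the case of the first letter in each run is an assumption.

```python
import re


def collapse_ambiguous_runs(sequence, is_protein=False):
    """Collapse each run of n (nucleotide) or X (protein) into one letter."""
    letter = "X" if is_protein else "n"
    # Match two or more consecutive ambiguous letters, ignoring case.
    pattern = re.compile(f"{letter}{{2,}}", flags=re.IGNORECASE)
    # Keep the first character of the run so the original case is preserved.
    return pattern.sub(lambda match: match.group(0)[0], sequence)
```

For example, `collapse_ambiguous_runs("acgnnnntgnna")` returns `"acgntgna"`.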

Keep alignment length:

With this option, insertions in the new sequences are deleted, so that the alignment length stays the same as that of the input alignment.

↑ Failed when the "allow unusual symbols" option was on, from Jan/18 until the fix on Jan/20, 2022.
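
The effect of keeping the alignment length can be sketched in Python as below. This illustrates the outcome only and is not how MAFFT implements it: the function name `keep_alignment_length` is hypothetical, and the sketch assumes the input alignment had no all-gap columns of its own, so that any column that is a gap in every existing row was opened by an insertion in a new sequence.

```python
def keep_alignment_length(existing_rows, new_rows):
    """Delete columns opened by insertions in the new sequences.

    existing_rows: rows of the input alignment after the new sequences were added
    new_rows:      rows of the newly added sequences, same length as existing_rows
    Returns (trimmed_existing_rows, trimmed_new_rows).
    """
    # Keep a column only if at least one existing row has a residue there.
    keep = [
        index
        for index, column in enumerate(zip(*existing_rows))
        if any(char != "-" for char in column)
    ]

    def trim(row):
        return "".join(row[index] for index in keep)

    return [trim(row) for row in existing_rows], [trim(row) for row in new_rows]
```

For example, with existing rows `["AC--GT", "AC--GT"]` and a new row `"ACTTGT"`, the two inserted columns are removed, giving `["ACGT", "ACGT"]` and `["ACGT"]`. The inserted residues are lost, which is the trade-off of this option.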


Scoring matrix for amino acid sequences:
Scoring matrix for nucleotide sequences:
↑ Switch it to '1PAM / κ=2' when aligning closely related DNA sequences.
Gap opening penalty: (1.0 – 5.0)
Offset value: (0.0 – 1.0)
↑ If long gaps are not expected, set it to 0.1 or a larger value.

Score of N in nucleotide data: Example
↓ Long stretches of Ns tend to be gapped (excluded from the alignment).

Experimental option (2016/Apr/26)

↑ Try this if Ns should be aligned with usual letters.