Add new sequences to an existing alignment using MAFFT

Due to a configuration change on May 22, 2025, the performance of this service should have improved slightly during periods of high load. Please let us know if you notice any side effects.

To extract a short region (<5000 bases/amino acids) from a set of long unaligned sequences, try another function (2022/Oct).

For SARS-CoV-2, use this version (2022/May).

Add fragmentary sequence(s) to existing alignment or sequence

Existing alignment: Example1 (protein)
Gaps (-) will be preserved.

or upload a plain text file:
A zipped file is acceptable.

Fragmentary sequence(s) to be added to the above alignment: Example1 (protein)
Gaps (if any) will be removed.

or upload a plain text file:
A zipped file is acceptable.

Allow unusual symbols (Selenocysteine "U", Inosine "i", non-alphabetical characters, etc.)

UPPERCASE / lowercase:
Same as input
Amino acid → UPPERCASE / Nucleotide → lowercase

Direction of nucleotide sequences:
Same as input
Adjust direction according to the first sequence (accurate enough for most cases) Beta
Adjust direction according to the first sequence (only for highly divergent data; very slow) Beta
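
Adjusting direction means that a sequence given on the opposite strand is reverse-complemented before it is aligned. The sketch below shows only that single operation in Python; how the service decides which sequences to flip is not described on this page and is not reproduced here. In command-line MAFFT the two options are understood to correspond to --adjustdirection and --adjustdirectionaccurately; treat that mapping as an assumption.

```python
# Complement table for unambiguous bases; any other symbol (n, gaps, IUPAC
# codes) is left unchanged, which is a simplification made for this sketch.
_COMPLEMENT = str.maketrans("acgtuACGTU", "tgcaaTGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("aacg"))
```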

Output order:
Same as input
Aligned

Sequence title:
Same as input
Insert "New|" at the head of title of each new sequence

Title length in Clustal format (only first word is used as title):
(10 – 100)

Job name (optional):
(basic Latin alphabet, number and space only)

Notify when finished (optional; recommended when submitting large data):
Email address:

Advanced settings

Ambiguous letters:
Remove sequences that have ambiguous letters more than:

and replace successive ns (nucleotide) or Xs (protein) in new sequences with a single n or X.
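
The two steps of this option can be sketched in Python as follows. This is an illustration of the described behavior, not the service's own code. The page does not state the unit of the threshold, so the sketch assumes a fraction of the sequence's letters, and it assumes that any letter other than a, c, g, t, or u counts as ambiguous in nucleotide data. All function names are hypothetical.

```python
import re


def collapse_runs(seq, letter):
    """Collapse each run of `letter` (either case) to its first character."""
    lower, upper = letter.lower(), letter.upper()
    pattern = re.compile(f"([{lower}{upper}])[{lower}{upper}]+")
    return pattern.sub(r"\1", seq)


def ambiguous_fraction(seq, unambiguous="acgtu"):
    """Fraction of letters in `seq` that are not in the unambiguous alphabet."""
    letters = [c for c in seq if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.lower() not in unambiguous for c in letters) / len(letters)


def keep_sequence(seq, max_fraction, unambiguous="acgtu"):
    """True if the sequence does not exceed the ambiguity threshold."""
    return ambiguous_fraction(seq, unambiguous) <= max_fraction


fragment = "acgtnnnnacgNNt"
if keep_sequence(fragment, max_fraction=0.5):
    print(collapse_runs(fragment, "n"))
```

For protein data, the same helper would be called as collapse_runs(seq, "X"), with an amino acid alphabet passed as `unambiguous`.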

Keep alignment length:
Yes
With this option, insertions in the fragmentary sequences are deleted so that the alignment length stays the same as that of the input alignment.

--compactmapout: Output the positions of insertions in added sequences and in the reference alignment. Updated (2021/Dec)

↑ This failed when the "Allow unusual symbols" option was on, between Jan/18 and Jan/20, 2022 (now fixed).
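
The effect of the "Keep alignment length" option can be illustrated with a small Python sketch. It is not MAFFT's implementation: it takes an already computed alignment and drops every column in which all existing rows have a gap, that is, the columns that exist only because a new sequence has an insertion. It assumes the input alignment itself contained no all-gap columns, since those would be dropped as well. The function name is hypothetical.

```python
def drop_insertion_columns(existing_rows, new_rows, gap="-"):
    """Remove columns where every existing row has a gap.

    Returns the trimmed (existing_rows, new_rows) as two lists of strings.
    """
    length = len(existing_rows[0])
    if any(len(row) != length for row in existing_rows + new_rows):
        raise ValueError("all rows must have the same aligned length")
    # Keep a column if at least one existing row has a residue there.
    keep = [i for i in range(length)
            if any(row[i] != gap for row in existing_rows)]

    def trim(row):
        return "".join(row[i] for i in keep)

    return [trim(r) for r in existing_rows], [trim(r) for r in new_rows]


old, new = drop_insertion_columns(["AC-GT", "A--GT"], ["ACTGT"])
print(old, new)
```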

Strategy:
Auto (--multipair or --6merpair; depends on data size)

--6merpair (Fast)
--multipair --weighti 0 (Intermediate)
--multipair (Accurate)
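
For readers running MAFFT locally rather than through this form, the three named strategies can be sketched as command lines. This is a minimal sketch, not the service's own code: it assumes the form corresponds to MAFFT's --addfragments mode, the file names are hypothetical, and the Auto choice is left out because the data-size rule it uses is not stated on this page. The command is only built and printed, not run.

```python
import shlex

# Strategy labels from the form mapped to the flags shown next to them.
STRATEGY_FLAGS = {
    "fast": ["--6merpair"],
    "intermediate": ["--multipair", "--weighti", "0"],
    "accurate": ["--multipair"],
}


def build_add_fragments_command(existing, fragments, strategy="fast"):
    """Return the argument list for a `mafft --addfragments` run (not executed)."""
    if strategy not in STRATEGY_FLAGS:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return ["mafft", *STRATEGY_FLAGS[strategy],
            "--addfragments", fragments, existing]


# Hypothetical file names; MAFFT writes the alignment to standard output.
cmd = build_add_fragments_command(
    "existing_alignment.fasta", "fragments.fasta", "intermediate")
print(shlex.join(cmd))
```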

Parameters:
Scoring matrix for amino acid sequences:
Scoring matrix for nucleotide sequences:

↑ Switch it to '1PAM / κ=2' when aligning closely related DNA sequences.

Gap opening penalty: (1.0 – 5.0)
Offset value: (0.0 – 1.0)

↑ If long gaps are not expected, set it to 0.1 or a larger value.

Score of N in nucleotide data: Example

↓ Long stretches of Ns tend to be gapped (excluded from the alignment).

(nzero) N has no effect on the alignment score.

(nwildcard) N is treated like a wildcard. Experimental option (2016/Apr/26)

↑ Try this if Ns should be aligned with usual letters.
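
The parameter settings above can be checked and turned into command-line flags with a short Python sketch. Several points are assumptions rather than facts taken from this page: that the gap opening penalty and offset value correspond to MAFFT's --op and --ep flags, that '1PAM / κ=2' corresponds to --kimura 1, and that the labels (nzero) and (nwildcard) correspond to flags of the same name. The ranges are the ones shown on the form; the numbers in the usage line are example values only, not defaults.

```python
def parameter_flags(op, ep, closely_related_dna=False, n_score=None):
    """Validate form-style parameters and return them as a list of flags."""
    if not 1.0 <= op <= 5.0:
        raise ValueError("gap opening penalty must be within 1.0 to 5.0")
    if not 0.0 <= ep <= 1.0:
        raise ValueError("offset value must be within 0.0 to 1.0")
    flags = ["--op", str(op), "--ep", str(ep)]
    if closely_related_dna:
        # Assumed flag for the '1PAM / κ=2' nucleotide scoring matrix.
        flags += ["--kimura", "1"]
    if n_score is not None:
        if n_score not in ("nzero", "nwildcard"):
            raise ValueError("n_score must be 'nzero' or 'nwildcard'")
        # Assumed flag name, taken from the label shown on the form.
        flags.append("--" + n_score)
    return flags


# Example values only; 0.1 follows the note about data without long gaps.
print(parameter_flags(1.53, 0.1, closely_related_dna=True, n_score="nwildcard"))
```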

Multiple alignment program for amino acid or nucleotide sequences
