MAFFT version 7

Multiple alignment program for amino acid or nucleotide sequences

Supported in versions 6.840 and higher

Unusual characters

If the input contains unusual characters (e.g., U for selenocysteine in a protein sequence), use the --anysymbol option:
% mafft --anysymbol input > output

It accepts any printable character (U, O, #, $, %, etc.; 0x21-0x7e in ASCII), except for > (0x3e) and ( (0x28).  Unusual characters are scored as unknown (not considered in the calculation), unlike in the --text mode.
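
The accepted range above can be checked before running the alignment. The following is a minimal sketch in Python, not part of MAFFT itself: the function name rejected_characters is hypothetical, and it assumes it is given sequence residues only (no ">" header lines, no trailing newlines).

```python
# Sketch only: mirrors the character range stated above for --anysymbol.
# Accepted: printable ASCII 0x21-0x7e, minus '>' (0x3e) and '(' (0x28).
# The name `rejected_characters` is hypothetical, not a MAFFT function.
EXCLUDED_CODES = {0x3E, 0x28}


def rejected_characters(sequence: str) -> list:
    """Return the distinct characters outside the accepted set,
    in order of first appearance."""
    rejected = []
    for ch in sequence:
        code = ord(ch)
        accepted = 0x21 <= code <= 0x7E and code not in EXCLUDED_CODES
        if not accepted and ch not in rejected:
            rejected.append(ch)
    return rejected


# Every symbol in this sequence falls inside the accepted range.
print(rejected_characters("Sample#Sequence_With%Various^Unusual*Characters"))

# '(' and '>' are excluded; the space (0x20) is below the range.
print(rejected_characters("AC(GT) >x"))
```

An empty list means every character falls inside the stated range; any character returned would need to be replaced or removed before using --anysymbol.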

When the input data is:

>
SampleSequenceWithUnusualCharacter
>
Sample#Sequence_With%Various^Unusual*Characters
>
SAMPLESEQUENCE

The result will be:

>
Sample-Sequence-With---------Unusual-Character-
>
Sample#Sequence_With%Various^Unusual*Characters
>
SAMPLE-SEQUENCE--------------------------------
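
One way to confirm that an alignment like the one above leaves the input untouched is to remove the gaps from each aligned row and compare it with the original sequence. This is a sketch under two assumptions: "-" is used only as the gap character and does not occur in the input, and rows are compared in input order. The helper names strip_gaps and preserves_input are hypothetical.

```python
# Sketch only: checks that aligned rows differ from the inputs by gaps alone.
# Assumes '-' is the gap character and is absent from the input sequences.


def strip_gaps(row: str) -> str:
    """Remove every gap character from an aligned row."""
    return row.replace("-", "")


def preserves_input(inputs: list, aligned: list) -> bool:
    """True if each aligned row equals its input once gaps are removed.
    The comparison is case-sensitive."""
    if len(inputs) != len(aligned):
        return False
    return all(strip_gaps(row) == seq for seq, row in zip(inputs, aligned))


inputs = [
    "SampleSequenceWithUnusualCharacter",
    "Sample#Sequence_With%Various^Unusual*Characters",
    "SAMPLESEQUENCE",
]
aligned = [
    "Sample-Sequence-With---------Unusual-Character-",
    "Sample#Sequence_With%Various^Unusual*Characters",
    "SAMPLE-SEQUENCE--------------------------------",
]

print(preserves_input(inputs, aligned))
```

Because the comparison is case-sensitive, the same check would also flag a row whose upper/lower case had been altered.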

Upper/lower case is preserved.  The --anysymbol option is internally equivalent to the --preservecase option.

For aligning non-biological sequences, use the --text mode, in which unusual characters are also considered in the alignment calculation.