Detecting Signature Characters in Gene Sequence Alignments for Taxon Diagnoses

DeSignate is an innovative tool for detecting diagnostic molecular characters (= signature characters) and their positions for complementing taxon diagnoses. The analysis is based on a novel representation of the gene sequence data, which enables a ranking of all alignment positions (= molecular characters) according to their classification and diagnostic relevance.

DeSignate can also detect diagnostic character combinations (= combined alignment positions). The tool guides the user step by step through the analysis and presents the results without the need to post-process the output data.

Which molecular characters are suitable for taxon diagnoses?

In taxon diagnoses, only signature characters that unambiguously distinguish the query from the reference group are of interest, i.e., their character states (e.g., nucleotides and deletions) are uniform at homologous alignment positions in the query group.

Two types of signature characters are distinguished: (1) at binary positions, the character states of the reference group are uniform but differ from the character state of the query group; (2) at asymmetric positions, the character states of the reference group are not uniform, but all of them differ from the character state of the query group.
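
The two definitions above can be illustrated with a minimal Python sketch. It is not DeSignate's actual implementation: the function names classify_position and signature_positions are hypothetical, sequences are assumed to be pre-aligned strings of equal length, gaps ("-") are treated as ordinary character states, and positions are reported 1-based by assumption.

```python
from typing import Dict, List, Optional


def classify_position(query_states: List[str],
                      reference_states: List[str]) -> Optional[str]:
    """Classify one alignment column (hypothetical helper, not DeSignate's API).

    Returns "binary", "asymmetric", or None if the column is not a
    signature character under the definitions given in the text.
    """
    query_set = set(query_states)
    reference_set = set(reference_states)
    # Both groups must contain at least one sequence.
    if not query_set or not reference_set:
        return None
    # The query group must be uniform at this position.
    if len(query_set) != 1:
        return None
    # The query state must not occur anywhere in the reference group.
    if query_set & reference_set:
        return None
    # Uniform reference group -> binary; mixed reference group -> asymmetric.
    return "binary" if len(reference_set) == 1 else "asymmetric"


def signature_positions(query: List[str],
                        reference: List[str]) -> Dict[int, str]:
    """Map 1-based alignment positions to their signature type."""
    length = len(query[0])
    result: Dict[int, str] = {}
    for i in range(length):
        kind = classify_position([seq[i] for seq in query],
                                 [seq[i] for seq in reference])
        if kind is not None:
            result[i + 1] = kind
    return result


# Toy alignment (made-up sequences, for illustration only).
query_group = ["ACGT-", "ACGT-"]
reference_group = ["ATGTA", "ATCTC", "ATGTA"]
print(signature_positions(query_group, reference_group))
# {2: 'binary', 5: 'asymmetric'}
```

In this toy alignment, position 2 is binary (C in the query group, T throughout the reference group) and position 5 is asymmetric (a deletion in the query group, A or C in the reference group). Position 3 is rejected because the query state G also occurs in the reference group.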

Did you use DeSignate in your scientific work?

The scientific manuscript describing this tool was published in BMC Bioinformatics. The paper is open access and can be found here. Please cite this article as described at the bottom of the BMC Bioinformatics website mentioned above.

DeSignate is an open source tool that can be downloaded and modified. The source code is available on GitHub.