1. In linguistics and text encoding

A character is an independent unit of a writing system. The Unicode Standard states that characters are “the abstract representations of the smallest components of written language that have semantic value” (2015, ch. 2, p. 15). In this usage, a character is more or less synonymous with a letter (German Buchstabe) or grapheme.
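The "more or less" matters in practice: one letter as perceived by a reader may be encoded as one Unicode character or as a sequence of several. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `character_names` is chosen here for illustration only.

```python
import unicodedata

# Two encodings of the same user-perceived letter "é".
precomposed = "\u00e9"   # one character: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two characters: "e" followed by a combining accent


def character_names(text):
    """Return the Unicode name of each character (code point) in text."""
    return [unicodedata.name(ch) for ch in text]


# len() counts characters (code points), not perceived letters.
print(len(precomposed), character_names(precomposed))
print(len(decomposed), character_names(decomposed))

# Normalisation form NFC composes the two-character sequence into one character.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Both strings would normally be displayed identically, although they consist of different character sequences.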

The Unicode Standard contrasts characters with glyphs, which “represent the shapes that characters can have when they [i.e. the characters] are rendered or displayed” (2015, ch. 2, p. 15).


Ill. 1. The character “a” represented by various glyphs.


2. In phylogenetics

In phylogenetics, a character is an individual feature used to characterise the taxa (see taxon). For instance, if the taxa are different species of butterflies, the wing span could be one feature and the colour of the wings another, and each of them would be encoded as a character. A character can take on a number of different states. In the case of the wing colour, the states could be different colours: light blue, dark blue, green, and so on. If the taxa are represented as DNA sequences, the characters correspond to positions, or loci (singular locus), in the DNA sequences, and the character states are A, T, G, and C. A character matrix is a table of character sequences in which each row corresponds to a taxon and each column corresponds to a character. The state of a character may also be missing for a given taxon, which is often encoded as '?'.
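A character matrix of this kind can be sketched in a few lines of Python. The taxon names and sequences below are invented for illustration, and the helper functions `character_states` and `variable_characters` are hypothetical names, not part of any phylogenetics library.

```python
# Hypothetical character matrix: one row per taxon, one column per character.
# The states are DNA bases; '?' marks a missing state.
matrix = {
    "species_A": "ACGT",
    "species_B": "ACGA",
    "species_C": "A?GT",
}


def character_states(matrix, index):
    """Return the set of states observed for the character in column `index`,
    ignoring missing entries ('?')."""
    return {row[index] for row in matrix.values() if row[index] != "?"}


def variable_characters(matrix):
    """Return the column indices of characters with more than one observed state."""
    n_chars = len(next(iter(matrix.values())))
    return [i for i in range(n_chars) if len(character_states(matrix, i)) > 1]


for i in range(4):
    print(i, sorted(character_states(matrix, i)))
print(variable_characters(matrix))
```

In this toy matrix only the last character differs between taxa, so it is the only one that could carry information about how they are related.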

There are various data formats for representing character data, a popular one being the NEXUS format.
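As a rough illustration, the sketch below writes a small character matrix as a minimal NEXUS-style DATA block. It covers only the basic DIMENSIONS, FORMAT, and MATRIX commands; the function name `to_nexus` and the example taxa are assumptions made here, and real NEXUS files often contain further blocks and options that individual programs may require.

```python
def to_nexus(matrix, datatype="DNA", missing="?"):
    """Serialise a {taxon: sequence} mapping as a minimal NEXUS DATA block.

    Illustrative sketch only: taxon names are assumed to contain no
    whitespace or punctuation that would need quoting.
    """
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    # Every taxon must have a state (or a missing symbol) for every character.
    if any(len(seq) != n_chars for seq in matrix.values()):
        raise ValueError("all rows must have the same number of characters")

    width = max(len(taxon) for taxon in taxa)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(taxa)} NCHAR={n_chars};",
        f"  FORMAT DATATYPE={datatype} MISSING={missing};",
        "  MATRIX",
    ]
    # One row per taxon: padded name, then its character states.
    lines += [f"    {taxon.ljust(width)}  {matrix[taxon]}" for taxon in taxa]
    lines += ["  ;", "END;"]
    return "\n".join(lines)


# Hypothetical example data.
example = {"species_A": "ACGT", "species_B": "A?GA"}
print(to_nexus(example))
```

The NTAX and NCHAR values are derived from the matrix itself, so the header cannot disagree with the number of rows and columns actually written.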

In other languages

DE: Zeichen, Kennzeichen
FR: caractère, signe graphique
IT: carattere, segno grafico


