Primary Phylogenetic Data Objects
Phylogenetic data in DendroPy is represented by one or more objects of the following classes:
- Taxon
- A representation of an operational taxonomic unit, with an attribute, label, corresponding to the taxon label.
- TaxonNamespace
- A collection of Taxon objects representing a distinct definition of taxa (for example, as specified explicitly in a NEXUS “TAXA” block, or implicitly in the set of all taxon labels used across a Newick tree file).
- Tree
- A collection of Node and Edge objects representing a phylogenetic tree. Each Tree object maintains a reference to a TaxonNamespace object in its taxon_namespace attribute, which specifies the set of taxa referenced by the tree and its nodes. Each Node object has a taxon attribute (which points to a particular Taxon object if an operational taxonomic unit is associated with the node, or is None if not), a parent_node attribute (which is None if the Node has no parent, e.g., a root node), an edge attribute referencing its subtending Edge object, and a list of references to child nodes, a copy of which can be obtained by calling child_nodes(). In addition, advanced operations with tree data often make use of a Bipartition object associated with each Edge on a Tree (see “Bipartitions” for more information).
- TreeList
- A list of Tree objects. A TreeList object has an attribute, taxon_namespace, which specifies the set of taxa that are referenced by all member Tree elements. This is enforced when a Tree object is added to a TreeList, with the TaxonNamespace of the Tree object and all Taxon references of the Node objects in the Tree mapped to the TaxonNamespace of the TreeList.
- CharacterMatrix
- Representation of character data, with specializations for different data types: DnaCharacterMatrix, RnaCharacterMatrix, ProteinCharacterMatrix, StandardCharacterMatrix, ContinuousCharacterMatrix, etc. A CharacterMatrix can be treated very much like a dict object, with Taxon objects as keys and character data as values associated with those keys.
- DataSet
- A meta-collection of phylogenetic data, consisting of lists of multiple TaxonNamespace objects (DataSet.taxon_namespaces), TreeList objects (DataSet.tree_lists), and CharacterMatrix objects (DataSet.char_matrices).
- TreeArray
- A high-performance container designed to efficiently store and manage potentially large collections of potentially large tree structures for processing.


