One common strategy to understand a complex thing is to
Separate it into smaller parts
In other words, decomposition, usually called
Analysis
Living and dead things
Friends, enemies, people we don’t know
Continents
Countries
Species
According to some authors, animals are classified as
Jorge Luis Borges (1942) “The Analytical Language of John Wilkins”, under “Celestial Emporium of Benevolent Knowledge”
Let’s say we have a large set of things
Maybe “all living organisms”
Typically we separate this set into two or more subsets
Each organism must be in one and only one subset
The set \(U\) of all things is separated into several subsets \(K_1,…, K_n\) called equivalence classes \[U=K_1 ∪ K_2 ∪ … ∪ K_n\]
This means that every organism must belong to some equivalence class
Everything is classified
All equivalence classes are disjoint \[K_i ∩ K_j = ∅\quad\text{if }i≠j\]
This means that every organism belongs only to one equivalence class
There is only one classification for each thing
\(x\) is either in \(K_i\) or \(K_j\) but not in both
We have a set \(U\) that we want to analyze
We have \(n\) subsets of \(U,\) each one called \(K_i\)
If \(K_1∪K_2∪…∪K_n=U,\) we say that \(\{K_i\}\) covers \(U\)
If \(K_i∩K_j=∅\) whenever \(i≠j,\) we say that \(\{K_i\}\) is disjoint
If \(\{K_i\}\) satisfies both conditions, we say that it is a partition of \(U\)
Continents
Countries
Species
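The two conditions (covering and disjointness) can be checked mechanically. The following is a minimal sketch, assuming the classes are given as a list of Python sets; the function names and the toy organisms are invented for illustration.

```python
from itertools import combinations

def covers(universe, classes):
    """True if the union of all classes equals the universe."""
    return set().union(*classes) == set(universe)

def is_disjoint(classes):
    """True if no two different classes share an element."""
    return all(a.isdisjoint(b) for a, b in combinations(classes, 2))

def is_partition(universe, classes):
    """A partition must both cover the universe and be disjoint."""
    return covers(universe, classes) and is_disjoint(classes)

# Toy data, invented for illustration
U = {"oak", "fern", "cat", "dog", "trout"}
plants_animals = [{"oak", "fern"}, {"cat", "dog", "trout"}]
pets_fish = [{"cat", "dog"}, {"trout"}]                          # misses oak and fern
overlapping = [{"oak", "fern", "cat"}, {"cat", "dog", "trout"}]  # cat appears twice

print(is_partition(U, plants_animals))
print(is_partition(U, pets_fish))
print(is_partition(U, overlapping))
```

Only the first list is a partition: the second fails to cover \(U\), and the third is not disjoint.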
We can repeat the process again and again
Each class \(K_i\) can be split into \(m_i\) subsets called \(P_j\)
\[K_i=P_1 ∪ P_2 ∪ … ∪ P_{m_i}\] \[P_j ∩ P_k = ∅\quad\text{if }j≠k\]
and so on
This recursive partitioning is called Taxonomy

## Hierarchical classification

In a taxonomy each equivalence class is divided into smaller equivalence classes. For example
There is a hierarchy of classes, with different levels
Classes of the same level are disjoint
Classes of different levels can be subsets
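The two properties of the hierarchy can be illustrated with a nested structure. This is a minimal sketch, assuming a taxonomy is stored as nested dictionaries whose innermost values are sets of items; the class names and helper functions are hypothetical.

```python
def members(node):
    """Return all items under a class (a dict of subclasses or a set of items)."""
    if isinstance(node, set):
        return set(node)
    result = set()
    for child in node.values():
        result |= members(child)
    return result

def siblings_disjoint(node):
    """Check, at every level, that classes with the same parent share no items.

    If siblings are disjoint everywhere, all classes of one level are disjoint.
    """
    if isinstance(node, set):
        return True
    seen = set()
    for child in node.values():
        m = members(child)
        if seen & m:
            return False
        seen |= m
    return all(siblings_disjoint(child) for child in node.values())

# Toy taxonomy, invented for illustration
taxonomy = {
    "animals": {"mammals": {"cat", "dog"}, "fish": {"trout"}},
    "plants": {"trees": {"oak"}, "ferns": {"fern"}},
}

print(siblings_disjoint(taxonomy))
# A lower-level class is a subset of its higher-level class
print(members(taxonomy["animals"]["mammals"]) <= members(taxonomy["animals"]))
```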
Bloom’s taxonomy
a set of three hierarchical models used to classify educational learning objectives into levels of complexity and specificity
cognitive
affective
psychomotor (sometimes called sensory)
Dewey Decimal Classification for libraries
000 – Computer science, information & general works
100 – Philosophy & psychology
200 – Religion
300 – Social sciences
400 – Language
500 – Pure Science
600 – Technology
700 – Arts & recreation
800 – Literature
900 – History & geography
Originally each hierarchy level (a.k.a. rank) was named
Today there are more intermediate ranks
Binomial nomenclature (literally “system of two names”)
Each organism is labeled with two words: genus and species
This is a good approach for any definition
“X is like Y but with Z difference”
Hierarchical classifications are often represented by trees
Trees have root, branches, internal nodes and leaves
Edges (branches) connect nodes
Each node (except the root) has one unique parent node
A node can have several descendants. If a node has no descendants, we call it a leaf
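Because every node except the root has exactly one parent, a tree can be stored as a simple child-to-parent table. The sketch below assumes that representation; the node names and helper functions are invented for illustration.

```python
# child -> parent; the root is the only node that never appears as a key
parent = {
    "animals": "organisms",
    "plants": "organisms",
    "mammals": "animals",
    "fish": "animals",
    "cat": "mammals",
    "dog": "mammals",
}

def nodes(parent):
    """All nodes: every child plus every parent."""
    return set(parent) | set(parent.values())

def root(parent):
    """The single node that has no parent."""
    roots = nodes(parent) - set(parent)
    if len(roots) != 1:
        raise ValueError("a tree must have exactly one root")
    return roots.pop()

def leaves(parent):
    """Nodes that are nobody's parent, i.e. have no descendants."""
    return nodes(parent) - set(parent.values())

def path_to_root(node, parent):
    """Follow parent links from a node up to the root."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

print(root(parent))
print(sorted(leaves(parent)))
print(path_to_root("cat", parent))
```

The path from a leaf to the root lists the classes that contain it, from the most specific to the most general.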
Taxonomic trees are similar to phylogenetic trees
But “genus” is not “common ancestor”
Each node in a phylogenetic tree is a species
Moreover, an organism has more than one ancestor
There is no “official” taxonomy
People are still figuring out many cases
NCBI has a taxonomy tree that is often used in practice
This tree changes over time
Each node has
Using NCBI taxid prevents many errors
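One way to see why numeric identifiers help: the same organism can be written in many ways, but it has a single taxid. The sketch below uses a small hand-written table, not the NCBI database or any NCBI API; the table and both helper functions are hypothetical, and the three taxids shown are, to the best of our knowledge, the ones NCBI assigns to these species.

```python
# Hand-written table for illustration only; a real project would
# look names up in the NCBI Taxonomy database instead.
NAME_TO_TAXID = {
    "homo sapiens": 9606,
    "human": 9606,
    "escherichia coli": 562,
    "e. coli": 562,
    "mus musculus": 10090,
    "house mouse": 10090,
}

def to_taxid(label):
    """Normalize case and spacing, then look the label up (KeyError if unknown)."""
    key = " ".join(label.lower().split())
    return NAME_TO_TAXID[key]

def same_organism(a, b):
    """Compare two labels by taxid instead of by spelling."""
    return to_taxid(a) == to_taxid(b)

print(same_organism("Human", "Homo  sapiens"))
print(same_organism("E. coli", "house mouse"))
```

Comparing strings directly would treat “Human” and “Homo sapiens” as different organisms; comparing taxids does not.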