Corresponding author: Nguyen Van Sinh

Academic editor: P. Stoev

Dichotomous keys are the most popular type of identification key. Studies have evaluated dichotomous keys from many aspects. In this paper we propose an index for the quantitative evaluation of dichotomous keys (E_{Dicho}). The index is based on evenness and allows identification keys of different sizes to be compared.

Van Sinh N, Wiemers M, Settele J (2017) Proposal for an index to evaluate dichotomous keys. ZooKeys 685: 83–89.

A taxonomic key is a tool used to identify organisms. Dichotomous keys are the most popular type of identification key. They are single-entry keys consisting of nested questions, or couplets, each of which offers two choices, or leads (thesis and antithesis). The leads describe key characteristics of an organism, and the paired statements bring out the differences between items. After choosing the statement that best matches the object, the user proceeds to another pair of statements until the name of the taxon is reached. There may be several keys for a group of taxa. This prompts the question of which key performs better, provided that all the characters used are good ones that allow unambiguous identification. How can we evaluate the performance of the keys quantitatively? As a key is intended for the identification of each of the taxa in the group, it achieves its highest performance when the mean number of steps to identification is minimal. As the numbers of steps needed to identify the individual taxa become more even, the mean number of steps decreases, and the mean is minimal when the numbers of steps are most even (Fig.

Schematic presentation of 5 dichotomous keys for a group of 8 taxa.

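We use Pielou’s evenness index as a prototype for our index. Pielou’s evenness index (J) can be calculated using the following formula:

$$J = \frac{H'}{H_{max}}$$

where:

- H’ is the Shannon diversity index. This measure was originally proposed by Claude Shannon:

$$H' = -\sum_{i=1}^{S} p_i \ln p_i$$

In which p_{i} is the proportion of individuals belonging to the i-th species and S is the number of species.

- H_{max} is the maximum value of H’ and equal to:

$$H_{max} = -\sum_{i=1}^{S} \frac{1}{S} \ln \frac{1}{S} = \ln S$$

As a result, Pielou’s evenness index can be calculated according to the following formula:

$$J = \frac{-\sum_{i=1}^{S} p_i \ln p_i}{\ln S}$$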

If the number of steps we have to pass to reach a decision (a taxon) is N_{i} and the total number of steps when we identify all the taxa is N, the proportion of the steps to identify the i-th taxon is:

$$p_i = \frac{N_i}{N}$$

As can be inferred from the scheme of a dichotomous key (Fig.

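We call the index for dichotomous keys E_{Dicho} (because it originates from the evenness index). As a result, E_{Dicho} is equal to:

$$E_{Dicho} = \frac{H'_{Dicho}}{\ln S} = \frac{-\sum_{i=1}^{S} p_i \ln p_i}{\ln S}$$

Where: S is the number of taxa of the key, and p_{i} is the proportion of steps to identify the i-th taxon.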
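The definition can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: it assumes that p_{i} = N_{i}/N and that E_{Dicho} is the Shannon-type sum over these proportions divided by ln(S), and the function name `e_dicho` is hypothetical.

```python
import math


def e_dicho(steps):
    """Evenness-based index for a dichotomous key (illustrative sketch).

    steps: the number of identification steps N_i for each taxon.
    Assumes p_i = N_i / N and E = -sum(p_i * ln p_i) / ln(S).
    """
    if len(steps) < 2:
        raise ValueError("a key needs at least two taxa")
    if any(n <= 0 for n in steps):
        raise ValueError("every taxon needs at least one step")
    total = sum(steps)                      # N: total steps over all taxa
    props = [n / total for n in steps]      # p_i = N_i / N
    h = -sum(p * math.log(p) for p in props)
    return h / math.log(len(steps))         # divide by ln(S)


# Two hypothetical four-taxon keys: a fully balanced one and a comb-shaped one.
balanced = e_dicho([2, 2, 2, 2])
comb = e_dicho([1, 2, 3, 3])
print(round(balanced, 3), round(comb, 3))
```

The balanced key reaches the maximum of the index, while the comb-shaped key, whose paths differ in length, scores lower.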

Many attempts have been undertaken in order to evaluate identification keys (e.g.

Several studies have been conducted to evaluate dichotomous keys in the practice of key use. The index proposed here (E_{Dicho}) can evaluate both the speed and the quality of determination with a dichotomous key, provided that all else (e.g. choice of characters) is equal.

The E_{Dicho} index is by nature an evenness index; it therefore has all the properties of a normal evenness index and is constrained between 0 and 1. The higher the variation in the number of steps needed to determine the taxa, the lower the E_{Dicho} index, with an asymptotic lowest value of 0. The highest value of 1 is achieved when all the taxa have the same number of identification steps (version ‘1.V’ in the figure of five keys). The E_{Dicho} of version ‘1.I’ is smaller than that of version ‘1.V’ because the variation in the length of the identification paths in version ‘1.I’ is higher. Thus, the higher the E_{Dicho} index, the “better” the dichotomous key in terms of both identification speed and correct determination.
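The upper bound can be verified directly. As a short worked derivation added for illustration (assuming, as above, that the proportion of steps for taxon i is p_i = N_i/N), suppose every one of the S taxa is reached in the same number of steps k, so that N = Sk:

$$p_i = \frac{k}{Sk} = \frac{1}{S}, \qquad H'_{Dicho} = -\sum_{i=1}^{S} \frac{1}{S} \ln \frac{1}{S} = \ln S, \qquad E_{Dicho} = \frac{\ln S}{\ln S} = 1$$

The result does not depend on k, so only the evenness of the path lengths matters for reaching the maximum, not their absolute length.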

Let us consider five dichotomous keys as shown in the Figure

Here, the number of taxa (S) equals 8. The number of steps or paired statements (Thesis + Antithesis) for identification of each taxon, the total number of steps for identification of all the taxa, and the proportion of steps to identify each taxon are the data for calculation of H’_{Dicho} of the dichotomous key and are presented in Table

The calculation of H’_{Dicho} and E_{Dicho} for the five versions of the dichotomous key is presented in Table

Schematic presentation of a dichotomous key.

The data for calculation of H’_{Dicho} for the keys in Figure

| Key version | Number of steps for identification of each taxon (N_{i}) | Total number of steps for identification of all the taxa (N) | Proportion of steps to identify each taxon (N_{i}/N) |
|---|---|---|---|
| 1.I | 1,2,3,4,5,6,7,7 | 35 | 1/35,2/35,3/35,4/35,5/35,6/35,7/35,7/35 |
| 1.II | 1,2,3,4,6,6,6,6 | 34 | 1/34,2/34,3/34,4/34,6/34,6/34,6/34,6/34 |
| 1.III | 1,2,4,4,5,5,5,5 | 31 | 1/31,2/31,4/31,4/31,5/31,5/31,5/31,5/31 |
| 1.IV | 2,2,3,3,4,4,4,4 | 26 | 2/26,2/26,3/26,3/26,4/26,4/26,4/26,4/26 |
| 1.V | 3,3,3,3,3,3,3,3 | 24 | 3/24,3/24,3/24,3/24,3/24,3/24,3/24,3/24 |
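The totals and proportions in the table can be re-derived mechanically. The following Python sketch is an illustration added here, not part of the original method; the names `KEYS` and `summarize` are hypothetical. It recomputes N and the proportions N_{i}/N for each key version and also reports the mean number of steps per taxon (N/S), the quantity that the introduction argues should be minimal.

```python
from fractions import Fraction

# Steps per taxon for the five key versions, copied from the table above.
KEYS = {
    "1.I": [1, 2, 3, 4, 5, 6, 7, 7],
    "1.II": [1, 2, 3, 4, 6, 6, 6, 6],
    "1.III": [1, 2, 4, 4, 5, 5, 5, 5],
    "1.IV": [2, 2, 3, 3, 4, 4, 4, 4],
    "1.V": [3, 3, 3, 3, 3, 3, 3, 3],
}


def summarize(steps):
    """Return (total steps N, proportions N_i/N, mean steps per taxon N/S)."""
    total = sum(steps)
    # Fraction keeps the proportions exact; note that it reduces them,
    # so 2/26 is stored as 1/13.
    proportions = [Fraction(n, total) for n in steps]
    mean_steps = Fraction(total, len(steps))
    return total, proportions, mean_steps


summary = {name: summarize(steps) for name, steps in KEYS.items()}
for name, (total, proportions, mean_steps) in summary.items():
    print(name, total, float(mean_steps))
```

Under these data, the mean number of steps per taxon falls from version 1.I to version 1.V as the path lengths become more even.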

Calculation of H’_{Dicho} and E_{Dicho}.

| Key version | H’_{Dicho} | E_{Dicho} = H’_{Dicho}/ln(8) |
|---|---|---|
| 1.I | -{(1/35).ln(1/35)+(2/35).ln(2/35)+(3/35).ln(3/35)+(4/35).ln(4/35)+(5/35).ln(5/35)+(6/35).ln(6/35)+(7/35).ln(7/35)+(7/35).ln(7/35)} | 0.937 |
| 1.II | -{(1/34).ln(1/34)+(2/34).ln(2/34)+(3/34).ln(3/34)+(4/34).ln(4/34)+(6/34).ln(6/34)+(6/34).ln(6/34)+(6/34).ln(6/34)+(6/34).ln(6/34)} | 0.943 |
| 1.III | -{(1/31).ln(1/31)+(2/31).ln(2/31)+(4/31).ln(4/31)+(4/31).ln(4/31)+(5/31).ln(5/31)+(5/31).ln(5/31)+(5/31).ln(5/31)+(5/31).ln(5/31)} | 0.959 |
| 1.IV | -{(2/26).ln(2/26)+(2/26).ln(2/26)+(3/26).ln(3/26)+(3/26).ln(3/26)+(4/26).ln(4/26)+(4/26).ln(4/26)+(4/26).ln(4/26)+(4/26).ln(4/26)} | 0.983 |
| 1.V | -{(3/24).ln(3/24)+(3/24).ln(3/24)+(3/24).ln(3/24)+(3/24).ln(3/24)+(3/24).ln(3/24)+(3/24).ln(3/24)+(3/24).ln(3/24)+(3/24).ln(3/24)} | 1.000 |
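The index values in the table can be checked numerically. The sketch below is an added illustration under the table's own assumption that E_{Dicho} = H’_{Dicho}/ln(S) with proportions N_{i}/N; the names `KEYS`, `h_dicho` and `e_dicho` are hypothetical. It recomputes the index for the five key versions and picks the version with the highest value.

```python
import math

# Steps per taxon for the five key versions, as listed in the data table.
KEYS = {
    "1.I": [1, 2, 3, 4, 5, 6, 7, 7],
    "1.II": [1, 2, 3, 4, 6, 6, 6, 6],
    "1.III": [1, 2, 4, 4, 5, 5, 5, 5],
    "1.IV": [2, 2, 3, 3, 4, 4, 4, 4],
    "1.V": [3, 3, 3, 3, 3, 3, 3, 3],
}


def h_dicho(steps):
    """Shannon-type sum over the proportions of steps, p_i = N_i / N."""
    total = sum(steps)
    return -sum((n / total) * math.log(n / total) for n in steps)


def e_dicho(steps):
    """H'_Dicho divided by its maximum, ln(S), for S taxa."""
    return h_dicho(steps) / math.log(len(steps))


# Round to three decimals, the precision used in the table.
results = {name: round(e_dicho(steps), 3) for name, steps in KEYS.items()}
best = max(KEYS, key=lambda name: e_dicho(KEYS[name]))

for name, value in results.items():
    print(f"{name}: {value:.3f}")
print("highest index:", best)
```

Because the index is normalised by ln(S), the same functions can be applied unchanged to keys with different numbers of taxa.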

Computer software makes it possible to create many dichotomous keys for a group of taxa from the same set of pairs of dichotomous characters. It would be desirable to have a sound basis for choosing one key version over another. The E_{Dicho} index developed here is suitable for the quantitative evaluation of dichotomous keys and can serve as the mathematical basis for choosing the dichotomous key with the best performance. Because the index is based on evenness, it can be used to compare identification keys of different sizes.

This work has been supported by the VAST04.06/16-17 project and the IEBR-UFZ joint research LEGATO project.