The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance

This review clarifies the concepts of bias, precision and accuracy as they are commonly defined in the biostatistical literature, focusing on their use in quantitatively testing the performance of point estimators (specifically species richness estimators). We first describe the general concepts underlying bias, precision and accuracy, and then describe a number of commonly used unscaled and scaled performance measures of bias, precision and accuracy (e.g. mean error, variance, standard deviation, mean square error, root mean square error, mean absolute error, and their scaled counterparts) that may be used to evaluate estimator performance. We also provide mathematical formulas and a worked example for most performance measures. Since every measure of estimator performance should be viewed as suggestive, not prescriptive, we also mention several other performance measures that have been used by biostatisticians or ecologists. We then outline several guidelines for testing the performance of species richness estimators: describing data simulation models and resampling schemes in detail, testing as many different estimators as possible on both real and simulated data sets, providing mathematical expressions for all estimators and performance measures, and presenting the results for each scaled performance measure in numerical tables with increasing levels of sampling effort. We finish with a literature review of promising new research related to species richness estimation and summarize the results of 14 studies that compared estimator performance. These studies confirm that, with most data sets, non-parametric estimators (mostly the Chao and jackknife estimators) perform better than other estimators, e.g. curve models or estimators based on fitting species-abundance distributions.
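
As a rough illustration of the unscaled and scaled measures named in the abstract, the following Python sketch computes them for a set of repeated richness estimates against a known true richness. The function name `performance_measures`, the dictionary keys, the use of the population variance (dividing by n rather than n - 1), and the scaling by the true richness (or its square) are assumptions made for this sketch; they are not necessarily the exact definitions given in the review.

```python
import math


def performance_measures(estimates, true_richness):
    """Summarize bias, precision and accuracy of repeated richness estimates.

    Hypothetical helper for illustration only. `estimates` holds one
    estimate per simulated or resampled data set; `true_richness` is the
    known species richness used as the reference value.
    """
    n = len(estimates)
    if n == 0 or true_richness <= 0:
        raise ValueError("need at least one estimate and a positive true richness")

    mean_est = sum(estimates) / n
    errors = [e - true_richness for e in estimates]

    # Bias: mean error (signed deviation from the true value).
    me = sum(errors) / n
    # Precision: spread of the estimates around their own mean
    # (population variance, an assumption of this sketch).
    var = sum((e - mean_est) ** 2 for e in estimates) / n
    sd = math.sqrt(var)
    # Accuracy: deviation from the true value, combining bias and precision.
    mse = sum(err ** 2 for err in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(err) for err in errors) / n

    a = true_richness
    return {
        "mean_error": me,
        "variance": var,
        "standard_deviation": sd,
        "mean_square_error": mse,
        "root_mean_square_error": rmse,
        "mean_absolute_error": mae,
        # Scaled counterparts, so data sets with different true richness
        # can be compared (scaling choices are assumptions of this sketch).
        "scaled_mean_error": me / a,
        "scaled_mean_square_error": mse / a ** 2,
        "scaled_root_mean_square_error": rmse / a,
        "scaled_mean_absolute_error": mae / a,
    }


if __name__ == "__main__":
    # Four made-up estimates for a community with 10 species.
    for name, value in performance_measures([8, 10, 12, 14], 10).items():
        print(f"{name}: {value:.4f}")
```

With the made-up estimates 8, 10, 12 and 14 and a true richness of 10, the mean error is 1, the variance is 5 and the mean square error is 6, which shows the usual decomposition of mean square error into variance plus squared bias when the population variance is used.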