Limitations of Biodiversity Databases: Case Study on Seed-Plant Diversity in Tenerife, Canary Islands

Databases on species distributions can be used to describe geographic patterns of biodiversity, but they have limitations. We studied three of them: (1) the inadequacy of raw data for describing richness patterns because of sampling bias, (2) the lack of survey-effort assessment (and the lack of exhaustiveness in compiling data about survey effort), and (3) incomplete coverage of the geographic and environmental variation that affects the distribution of organisms. We used a biodiversity database (BIOTA-Canarias) to analyze richness data for a well-known group (seed plants) in an intensively surveyed area (Tenerife Island). Observed richness and survey effort were highly correlated. Species accumulation curves could not be used to assess survey effort because data digitization was not exhaustive, so we identified well-sampled sites from the ratio of observed richness to sampling effort. We then developed a predictive model from the data for well-sampled sites and used a geographically constrained cross-validation to analyze the origin of the geographic errors in the resulting extrapolation. The spatial patterns of seed-plant species richness obtained from BIOTA-Canarias data were incomplete and biased. Improvements are therefore needed before this database (and many others) can be used in biodiversity studies. We propose a protocol that includes controls on data quality, improvements in data digitization and survey design, and alternative data-analysis strategies that will provide a reliable picture of biodiversity patterns.
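
As an illustration only, the two analysis steps named above (selecting well-sampled sites from a richness-to-effort ratio, and a geographically constrained cross-validation) might be sketched as follows. This is a minimal sketch, not the procedure used in the study: the cell fields (`x`, `y`, `records`, `richness`), the use of database records as a proxy for survey effort, the thresholds `max_species_per_record` and `min_records`, and the square-block holdout scheme are all assumptions made for the example.

```python
from collections import defaultdict


def well_sampled(cells, max_species_per_record=0.5, min_records=20):
    """Select cells with a low observed-richness-to-effort ratio.

    Assumption: effort is approximated by the number of database records,
    and a cell with few species per record is treated as well sampled.
    Both thresholds are hypothetical and would need calibration.
    """
    selected = []
    for cell in cells:
        if cell["records"] < min_records:
            continue  # too little effort to judge completeness
        ratio = cell["richness"] / cell["records"]
        if ratio <= max_species_per_record:
            selected.append(cell)
    return selected


def spatial_block_folds(cells, block_size):
    """Yield (block_key, train, test), holding out one square block at a time.

    Holding out whole blocks, rather than random cells, keeps test cells
    geographically separated from the training cells.
    """
    blocks = defaultdict(list)
    for cell in cells:
        key = (int(cell["x"] // block_size), int(cell["y"] // block_size))
        blocks[key].append(cell)
    for key in sorted(blocks):
        test = blocks[key]
        train = [c for k, members in blocks.items() if k != key for c in members]
        yield key, train, test
```

In such a scheme, a richness model would be fitted on each `train` set and its errors mapped over the corresponding `test` block, so that regions where the extrapolation fails can be located.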