The small subunit ribosomal RNA (SSU rRNA) gene is a widely used molecular marker to study the diversity of life. Sequencing of SSU rRNA gene amplicons has become a standard approach for the investigation of the ecology and diversity of microbes.
However, a well-curated reference database is necessary for correct classification of these data. While such databases are available for many groups of Bacteria and Archaea, they are absent for most eukaryotes.
The primary goal of the EukRef project (eukref.org) is to close this gap and generate well-curated reference databases for major groups of eukaryotes, especially protists. Here we present a set of EukRef-curated databases for the excavate protists, a large assemblage that includes numerous taxa with divergent SSU rRNA gene sequences, which are prone to misclassification.
We identified 6121 sequences, 625 of which were obtained from cultures, 3053 from cell isolations or enrichments, and 2419 from environmental samples. We have corrected the classification for the majority of these curated sequences.
The resulting publicly available databases will provide phylogenetically based standards for the improved identification of excavates in ecological and microbiome studies, as well as resources to classify new discoveries in excavate diversity.