In-Silico Detection of Oral Prokaryotic Species With Highly Similar 16S rRNA Sequence Segments Using Different Primer Pairs

Although clustering by operational taxonomic units (OTUs) is widely used in the oral microbial literature, no research has specifically evaluated the extent of the limitations of this sequence clustering-based method in the oral microbiome. Consequently, our objectives were to: 1) evaluate in-silico the coverage of a set of previously selected primer pairs to detect oral species having 16S rRNA sequence segments with ≥97% similarity; and 2) describe oral species with highly similar sequence segments and determine whether they belong to distinct genera or other higher taxonomic ranks. Thirty-nine primer pairs were employed to obtain the in-silico amplicons from the complete genomes of 186 bacterial and 135 archaeal species. For each primer pair, the FASTA file of amplicons was used as both subject and query in BLASTN to obtain the similarity percentage between amplicons belonging to different oral species. Amplicon pairs with 100% alignment coverage of the query sequence and an amplicon similarity value ≥97% (ASI97) were selected. For each primer pair, the species coverage with no ASI97 (SC-NASI97) was calculated. Based on the SC-NASI97 parameter, the best primer pairs were OP_F053-KP_R020 for bacteria (region V1-V3; primer pair position for Escherichia coli J01859.1: 9-356); KP_F018-KP_R002 for archaea (V4; undefined-532); and OP_F114-KP_R031 for both (V3-V5; 340-801). Around 80% of the oral-bacteria and oral-archaea species analyzed had an ASI97 with at least one other species. These very similar species play different roles in the oral microbiota and belong to bacterial genera such as Campylobacter, Rothia, Streptococcus and Tannerella, and archaeal genera such as Halovivax, Methanosarcina and Methanosalsum. Moreover, ~20% and ~30% of these pairwise similarity relationships were established between species from different bacterial and archaeal genera, respectively. Even taxa from distinct families, orders, and classes could be grouped in the same possible OTU.
Consequently, regardless of the primer pair used, sequence clustering at a 97% similarity threshold provides an inaccurate description of oral-bacterial and oral-archaeal species, which can greatly affect microbial diversity parameters. As a result, OTU clustering compromises the credibility of associations between some oral species and certain health and disease conditions. This significantly limits the comparability of the microbial diversity findings reported in the oral microbiome literature.
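
The ASI97 selection and the SC-NASI97 parameter described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the BLASTN hits for one primer pair have already been parsed into records carrying the query species, subject species, percent identity, and percent query coverage, and that SC-NASI97 is the percentage of all analyzed species that are amplified and have no ASI97 with a different species. The names Hit, species_with_asi97, and sc_nasi97 are hypothetical.

```python
from typing import Iterable, NamedTuple, Set


class Hit(NamedTuple):
    """One pairwise BLASTN alignment between two in-silico amplicons (assumed layout)."""
    query_species: str
    subject_species: str
    pident: float  # percent identity of the alignment
    qcov: float    # percent of the query amplicon covered by the alignment


def species_with_asi97(hits: Iterable[Hit],
                       min_identity: float = 97.0,
                       min_qcov: float = 100.0) -> Set[str]:
    """Return the query species that have an ASI97 with a different species."""
    flagged: Set[str] = set()
    for hit in hits:
        if hit.query_species == hit.subject_species:
            continue  # only relationships between different species count
        if hit.pident >= min_identity and hit.qcov >= min_qcov:
            # Only the query is flagged here; in an all-vs-all search the
            # reciprocal alignment appears as its own record.
            flagged.add(hit.query_species)
    return flagged


def sc_nasi97(hits: Iterable[Hit],
              amplified_species: Set[str],
              total_species: int) -> float:
    """Percentage of analyzed species that are amplified and have no ASI97.

    Assumption: the denominator is the total number of species analyzed.
    """
    flagged = species_with_asi97(hits) & amplified_species
    return 100.0 * (len(amplified_species) - len(flagged)) / total_species


if __name__ == "__main__":
    # Toy records with placeholder species labels, not real data.
    example_hits = [
        Hit("sp_A", "sp_B", 98.5, 100.0),
        Hit("sp_B", "sp_A", 98.5, 100.0),
        Hit("sp_A", "sp_C", 99.0, 90.0),   # rejected: query coverage below 100%
        Hit("sp_C", "sp_D", 96.0, 100.0),  # rejected: identity below 97%
        Hit("sp_A", "sp_A", 100.0, 100.0), # ignored: same species
    ]
    amplified = {"sp_A", "sp_B", "sp_C", "sp_D"}
    print(sc_nasi97(example_hits, amplified, total_species=5))
```

In this toy input, two of the four amplified species share an ASI97, so two of the five analyzed species count toward SC-NASI97. Other definitions of the denominator (for example, amplified species only) would need a one-line change.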

keywords: computational biology, DNA primers, genes, high-throughput nucleotide sequencing