In silico evaluation and selection of the best 16S rRNA gene primers for use in next-generation sequencing to detect oral bacteria and archaea

Background: Sequencing has been widely used to study the composition of the oral microbiome in various health conditions. However, the coverage of the 16S rRNA gene primers employed for this purpose has not been evaluated in silico using oral-specific databases. This paper analyses these primers using two databases containing 16S rRNA sequences from bacteria and archaea found in the human mouth, and describes some of the best primers for each domain.

Results: A total of 369 distinct individual primers were identified from sequencing studies of the oral microbiome and other ecosystems. These were evaluated against two databases: a database of 16S rRNA sequences from oral bacteria reported in the literature and modified by our group, and a self-created oral archaea database. Both databases contained the genomic variants detected for each included species. Primers were evaluated at the variant and species levels, and those with a species coverage (SC) ≥75.00% were selected for the pair analyses. All possible combinations of the selected forward and reverse primers were identified, and the resulting 4638 primer pairs were also evaluated using the two databases. The best bacteria-specific pairs targeted the 3-4, 4-7, and 3-7 16S rRNA gene regions, with SC levels of 98.83–97.14%; the optimum archaea-specific primer pairs amplified regions 5-6, 3-6, and 3-6, with SC estimates of 95.88%. Finally, the best pairs for detecting both domains targeted regions 4-5, 3-5, and 5-9, and produced SC values of 95.71–94.54% and 99.48–96.91% for bacteria and archaea, respectively.

Conclusions: Across the three amplicon length categories (100-300, 301-600, and >600 base pairs), the primer pairs with the best coverage values for detecting oral bacteria were as follows: KP_F048-OP_R043 (region 3-4; primer pair position for Escherichia coli J01859.1: 342-529), KP_F051-OP_R030 (4-7; 514-1079), and KP_F048-OP_R030 (3-7; 342-1079). For detecting oral archaea, these were as follows: OP_F066-KP_R013 (5-6; 784-undefined), KP_F020-KP_R013 (3-6; 518-undefined), and OP_F114-KP_R013 (3-6; 340-undefined). Lastly, for detecting both domains jointly, they were as follows: KP_F020-KP_R032 (4-5; 518-801), OP_F114-KP_R031 (3-5; 340-801), and OP_F066-OP_R121 (5-9; 784-1405). The primer pairs with the best coverage identified herein are not among those most widely described in the oral microbiome literature.
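
The selection procedure summarised in the Results (single-primer coverage, an SC threshold of 75.00%, then all forward-reverse combinations) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The function names, the toy data structure (species name mapped to a list of variant sequences), exact matching with IUPAC degeneracy, and the rule that a species counts as covered when at least one of its variants is matched are all assumptions made for illustration. Reverse primers are also assumed to be supplied already reverse-complemented, so that they read on the same strand as the database sequences.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, sequence):
    """Return True if the (possibly degenerate) primer matches the
    sequence at some position with no mismatches (assumed criterion)."""
    n = len(primer)
    for start in range(len(sequence) - n + 1):
        if all(sequence[start + i] in IUPAC[p] for i, p in enumerate(primer)):
            return True
    return False


def species_coverage(primer, database):
    """Percentage of species with at least one variant matched by the primer.
    The 'at least one variant' rule is an assumption of this sketch."""
    covered = sum(
        any(primer_matches(primer, variant) for variant in variants)
        for variants in database.values()
    )
    return 100.0 * covered / len(database)


def pair_coverage(forward, reverse, database):
    """Percentage of species with at least one variant matched by both primers.
    Relative primer positions are not checked in this simplified version."""
    covered = sum(
        any(
            primer_matches(forward, variant) and primer_matches(reverse, variant)
            for variant in variants
        )
        for variants in database.values()
    )
    return 100.0 * covered / len(database)


def select_pairs(forward_primers, reverse_primers, database, threshold=75.0):
    """Keep primers whose species coverage reaches the threshold, then
    return every forward-reverse combination of the retained primers."""
    keep_f = [name for name, seq in forward_primers.items()
              if species_coverage(seq, database) >= threshold]
    keep_r = [name for name, seq in reverse_primers.items()
              if species_coverage(seq, database) >= threshold]
    return list(product(keep_f, keep_r))
```

In this sketch, `select_pairs` takes dictionaries that map hypothetical primer names to sequences and returns the candidate name pairs, each of which could then be scored with `pair_coverage`. A realistic evaluation would also need to handle mismatch tolerance, primer orientation, and amplicon length, none of which are modelled here.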

Keywords: 16S rRNA gene, primer, coverage, bacteria, archaea, database