Redundancy of 16S rRNA Genes in Genomes from Oral Bacteria and Archaea

Objectives: This investigation evaluated the number of 16S rRNA genes present in the complete genomes of bacteria and archaea inhabiting the human mouth. Methods: We analysed 710 complete genomes (518 bacteria, 192 archaea) taken from the NCBI nucleotide database and extracted their 16S rRNA genes using Edgar’s algorithm. The number of 16S rRNA genes/genome and the number of repeated sequences/genome (variants) were calculated at the strain level. The same data were generated for the higher taxonomic levels, and the mean number of 16S rRNA genes was obtained using the NumPy and pandas libraries in Python. Results: The oral bacterial genomes had a mean of 4.5 intragenomic 16S rRNA genes, an average of 2.6 of which were variants. Of the 507 bacterial strains analysed, 126 had four genes/genome, 91 had six, and 83 had five; conversely, 30 strains had one and 73 had ≥7. The maximum was 11 genes/genome, found in five Bacillus anthracis strains. More than half of the bacterial strains had either one or two gene variants/genome. The oral archaeal genomes had a mean of 1.96 intragenomic 16S rRNA genes, with an average of 1.43 variants. Of the 177 archaeal strains analysed, 80 had one gene/genome, 58 had three, 33 had two and seven had ≥4. The maximum was five genes/genome, found in Methanococcus maripaludis and Sulfolobus acidocaldarius. Most archaeal strains had either one or two gene variants/genome. Conclusions: The number of intragenomic 16S rRNA genes in the oral bacteria ranged from one to 11, with four or more being most common. The number of genes/genome in the oral archaea ranged from one to five, with more than half of the strains having at least two. Intragenomic 16S rRNA gene redundancy must be considered to ensure the accurate interpretation of microbial diversity data.
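
The counting step described in the Methods can be illustrated with a minimal pandas sketch. It is not the authors' pipeline: the toy table, the column names and the helper functions `per_strain_counts` and `mean_by_level` are hypothetical, and a "variant" is assumed here to mean a distinct 16S rRNA sequence within one genome. The gene extraction step itself is not shown.

```python
import pandas as pd

# Hypothetical toy input: one row per 16S rRNA gene copy found in a genome.
# Taxon names, strain labels and sequences are placeholders, not study data.
copies = pd.DataFrame(
    {
        "domain": ["Bacteria"] * 5 + ["Archaea"] * 2,
        "genus": ["GenusA"] * 3 + ["GenusB"] * 2 + ["GenusC"] * 2,
        "strain": ["A1", "A1", "A1", "B1", "B1", "C1", "C1"],
        "sequence": ["ACGT", "ACGT", "ACGA", "TTGA", "TTGA", "GGCA", "GGCC"],
    }
)


def per_strain_counts(df):
    """Count gene copies and distinct sequences (assumed variants) per strain."""
    return (
        df.groupby(["domain", "genus", "strain"])
        .agg(genes=("sequence", "count"), variants=("sequence", "nunique"))
        .reset_index()
    )


def mean_by_level(strain_table, level):
    """Average the strain-level counts at a higher taxonomic level."""
    return strain_table.groupby(level)[["genes", "variants"]].mean()


strain_table = per_strain_counts(copies)
print(strain_table)
print(mean_by_level(strain_table, "genus"))
print(mean_by_level(strain_table, "domain"))
```

Averaging over the strain-level table, rather than over the raw gene copies, keeps each genome weighted equally regardless of how many 16S rRNA copies it carries.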

keywords: bioinformatics