Training atomic neural networks using fragment-based data generated in virtual reality

The ability to understand and engineer molecular structures relies on accurate descriptions of the energy as a function of atomic coordinates. Here, we outline a new paradigm for deriving energy functions of hyperdimensional molecular systems, which involves generating data for low-dimensional systems in virtual reality (VR) and using those data to efficiently train atomic neural networks (ANNs). This approach generates high-quality data for specific areas of interest within the hyperdimensional space that characterizes a molecule’s potential energy surface (PES). We demonstrate its utility by gathering data within VR to train ANNs on chemical reactions involving fewer than eight heavy atoms. This strategy enables us to predict the energies of much higher-dimensional systems, for example, systems containing nearly 100 atoms. With training datasets containing only 15k geometries, this approach yields mean absolute errors of around 2 kcal mol⁻¹. This represents one of the first times that an ANN-PES for a large reactive radical has been generated using such a small dataset. Our results suggest that VR enables the intelligent curation of high-quality data, which accelerates the learning process.

keywords: virtual reality