A new ultra-fast, low-power machine learning algorithm for processing large volumes of data has been created at CiTIUS

The new solution improves by orders of magnitude on one of the benchmark methods for data classification using *Machine Learning* techniques. The work has just been published in the journal “IEEE Transactions on Pattern Analysis and Machine Intelligence”, the leading publication among the 140 most cited scientific journals in the field of Computer Science–Artificial Intelligence.

Hardly anyone is unaware by now that Artificial Intelligence (and more specifically “Machine Learning”) is experiencing its moment of maximum splendor. The research carried out in this field in recent years, together with the enormous computing power of modern computers and the vast amount of data available to train algorithms, has revolutionized our lives and every sector of the economy.

None of this would have been possible without the curiosity and effort of the scientific community, which over the past decades has progressively refined machine learning techniques: an increasingly sophisticated research area that makes it possible to train machines to solve very diverse problems without programming them specifically for each situation.

Many strategies have been proposed to achieve this; one of the best known and most powerful today is the Support Vector Machine (SVM), created by scientists Isabelle Guyon, Bernhard Schölkopf and Vladimir Vapnik in the 1990s. It is a contribution of paramount importance, recognized, among other distinctions, with the BBVA Foundation Frontiers of Knowledge Award in 2020.

Essentially, Support Vector Machines (SVM) are a method for classifying data sets, with an accuracy practically identical to that of humans and, in some cases, even higher. SVMs are one of the best-performing classifiers today, and they have amply demonstrated their effectiveness in recognizing data of very different types: from texts, voices and human faces to cancer cells or fraudulent uses of a credit card.
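As a purely illustrative sketch (this is not the FSVC method described in the paper), the core idea of an SVM classifier can be shown in a few lines: find a separating boundary that maximizes the margin around the data, here via a simple Pegasos-style sub-gradient scheme on a toy linear problem. All data points and hyperparameters below are invented for the example.

```python
import random

# Toy 2-D data: class +1 clusters near (2, 2), class -1 near (-2, -2).
data = [((2.0, 2.2), 1), ((2.5, 1.8), 1), ((1.8, 2.4), 1),
        ((-2.0, -2.1), -1), ((-2.4, -1.9), -1), ((-1.7, -2.3), -1)]

def train_linear_svm(samples, lam=0.1, epochs=300, seed=0):
    """Minimise the regularised hinge loss with sub-gradient steps (Pegasos-style)."""
    random.seed(seed)
    w, b, t = [0.0, 0.0], 0.0, 0
    for _ in range(epochs):
        random.shuffle(samples)
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:
                # Point inside the margin: take a hinge-loss gradient step.
                w = [(1 - eta * lam) * wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
            else:
                # Correctly classified with margin: only shrink w (regularisation).
                w = [(1 - eta * lam) * wi for wi in w]
    return w, b

w, b = train_linear_svm(list(data))

def predict(x):
    """Classify a point by the sign of the learned decision function."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1
```

On this well-separated toy set the learned line classifies every training point correctly; the real strength of SVMs (and the kernel trick) lies in handling far less trivial data.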

However, there is no infallible method, nor one that is best in every case or for all circumstances. SVMs have proven to be considerably slow when tackling problems in which the number of data points is very large, which is particularly problematic when working in Big Data environments; on the other hand, their memory consumption is sometimes untenable, which can in practice rule out this kind of solution.

Now, research carried out at CiTIUS (Centro Singular de Investigación en Tecnoloxías Intelixentes da Universidade de Santiago de Compostela – USC) has made it possible to overcome these limitations with the development of a new “Fast Support Vector Classifier” (FSVC), which offers numerous advantages over the standard method. First, it is much faster (between 10 and 100 times) than traditional approaches. In addition, the new classifier operates with much less memory, «thanks to which it is able to deliver optimal solutions with far less powerful and less expensive computers», explains Manuel Fernández Delgado, who led the research.

The CiTIUS researchers highlight this aspect as one of the essential contributions of the work: «memory savings are very important», says Ziad Akram, a predoctoral researcher at CiTIUS and first author of the article, «since by improving efficiency we can solve, with much more modest equipment, problems for which we would normally need a supercomputer». «All this translates into a huge reduction in cost and energy consumption», stresses his colleague Eva Cernadas, a principal researcher at the centre and co‑author of the paper.

Another of the architects of the work, Senén Barro, points out that «one of the keys was to develop an analytical solution for the design of classifiers, which avoids using iterative learning methods on data sets, the main cause of the computational inefficiency and resource consumption of machine learning». The CiTIUS scientific director explains that «with this new approach, it is as if we could memorize a huge set of cases (faces, for example) all at once, without having to see them over and over again until they are imprinted on our memory». «The speed and the savings in memory and computing capacity are enormous, which means saving money and, even more importantly, reducing the carbon footprint», Barro concludes.
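The exact analytical formulation of FSVC is given in the paper; purely to illustrate the contrast Barro describes, the sketch below fits a least-squares linear classifier in closed form: one pass over the data to accumulate the normal equations, then a single solve, instead of iterating over the data set for many epochs. Every name and number here is invented for the illustration.

```python
def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 linear system."""
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

# Toy data: two features plus a constant 1 appended for the bias term.
X = [[2.0, 2.2, 1.0], [2.5, 1.8, 1.0], [1.8, 2.4, 1.0],
     [-2.0, -2.1, 1.0], [-2.4, -1.9, 1.0], [-1.7, -2.3, 1.0]]
y = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

# Normal equations: a single pass builds X^T X and X^T y; one solve gives w.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(3)]
w = solve3(XtX, Xty)

def predict(x):
    """Classify by the sign of the analytically fitted linear function."""
    return 1 if w[0] * x[0] + w[1] * x[1] + w[2] >= 0 else -1
```

The training cost here is fixed by the number of features, not by how many passes an optimizer needs to converge, which is the kind of efficiency argument the researchers make for avoiding iterative learning.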

References
Z. A. Ali Hammouri, M. F. Delgado, E. Cernadas and S. Barro, “Fast SVC for large-scale classification problems,” IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2021.3085969.