A Framework for the Automatic Description of Healthcare Processes in Natural Language: Application in an Aortic Stenosis Integrated Care Process

In this paper, we propose a framework for the automatic generation of natural language descriptions of healthcare processes using quantitative and qualitative data and medical expert knowledge. Motivated by the demand for novel ways of conveying the results of process mining analyses of healthcare processes [1], our framework builds on process mining techniques and on the most widely used pipeline [2] in Data-To-Text (D2T), a sub-discipline of Natural Language Generation aimed at generating texts from numerical input data. Backed by a general model that handles process data, the framework is able to quantify attributes over time during the process life-span, recall temporal relations and waiting times between events together with their possible causes, and compare case (patient) attributes between groups, among other features. By integrating fuzzy quantification techniques, our framework is able to represent relevant quantitative process information that carries some degree of uncertainty and to describe it in natural language using uncertain terms. A real application to the Aortic Stenosis Integrated Care Process of the University Hospital of Santiago de Compostela is presented, showcasing the potential of our framework for providing natural language descriptions of healthcare processes addressed to medical experts. Following the standards of D2T systems, the generated natural language descriptions were manually validated by fifteen medical experts in Cardiology. Validation results are very positive: a global average of 4.07/5.00 was achieved for questions related to the understandability, usefulness and impact of the natural language descriptions on the medical experts' work.
More precisely, the results indicate that i) natural language was the modality that conveyed the information most efficiently, ii) experts clearly preferred texts over the usual graphical representation of process information as the way of conveying information (4.28/5.00), and iii) the natural language descriptions provide relevant and useful information about the process, allowing for its improvement.

Keywords: Healthcare processes, Process mining, Process understanding, Natural language generation, Data-to-text systems, Fuzzy linguistic terms