Machine learning-based prediction of suicide risk: A long-term OCD cohort follow-up

Background: Obsessive-compulsive disorder (OCD) is associated with a higher suicide risk than that of the general population. Known risk factors include psychiatric comorbidities, childhood trauma, obsessive-compulsive symptom severity, and factors related to the social and family environment. This study aims to develop and validate a reliable machine learning algorithm to predict suicide risk in OCD patients from a combination of clinical and sociodemographic variables.

Method: A cohort of 199 OCD patients was followed for an average of 17.8 years (range 2–28 years) in an OCD-specialized unit. Suicide-related behaviors were documented in each participant's medical record at the time of occurrence. For the present study, a specialized psychiatrist systematically reviewed all clinical records to extract detailed information on the type of suicide-related behaviors experienced by each participant. Clinical data, including comorbidities and scores on the Yale-Brown Obsessive Compulsive Scale (Y-BOCS) and the Childhood Trauma Questionnaire (CTQ), were collected and used to train supervised machine learning models.

Results: The best-performing model included three predictive variables: family history of suicide, affective comorbidities, and substance use. This model achieved a sensitivity of 71.4%, a specificity of 74.4%, an F1-score of 62.7%, and an area under the ROC curve of 0.80, demonstrating moderate predictive capability. OCD severity and childhood trauma did not significantly enhance prediction performance.

Conclusion: This study highlights the potential of supervised machine learning for identifying suicide risk in OCD patients from commonly collected clinical variables. The presence of any one of the three predictors that make up the best-performing model (family history of suicide, affective comorbidities, and substance use) is sufficient to indicate a risk of suicide-related behavior. These predictors are routinely assessed in clinical practice, making the model a feasible instrument for early risk detection. Further research with larger and more diverse cohorts is needed to refine predictive accuracy and to integrate additional biomarkers for improved suicide risk stratification.
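As an illustration only, the "any of the three predictors" rule and the reported screening metrics (sensitivity, specificity, F1-score) can be sketched as follows. This is a minimal sketch, not the study's trained model: the `Patient` record, the field names, the function names, and the seven-row `toy_cohort` are all hypothetical and do not come from the study data.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names for the three predictors named in the abstract.
    family_history_of_suicide: bool
    affective_comorbidity: bool
    substance_use: bool
    # Observed outcome during follow-up (the label being predicted).
    suicide_related_behavior: bool


def flag_at_risk(p: Patient) -> bool:
    """Flag a patient when at least one of the three predictors is present."""
    return p.family_history_of_suicide or p.affective_comorbidity or p.substance_use


def screening_metrics(patients: list[Patient]) -> dict[str, float]:
    """Compute sensitivity, specificity and F1 from the confusion-matrix counts."""
    tp = sum(flag_at_risk(p) and p.suicide_related_behavior for p in patients)
    fn = sum(not flag_at_risk(p) and p.suicide_related_behavior for p in patients)
    fp = sum(flag_at_risk(p) and not p.suicide_related_behavior for p in patients)
    tn = sum(not flag_at_risk(p) and not p.suicide_related_behavior for p in patients)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Invented toy records, for demonstration only (not study data).
toy_cohort = [
    Patient(True, False, False, True),
    Patient(False, True, False, True),
    Patient(False, False, False, True),
    Patient(False, False, True, False),
    Patient(False, False, False, False),
    Patient(False, False, False, False),
    Patient(False, False, False, False),
]

metrics = screening_metrics(toy_cohort)
print(metrics)
```

A binary rule of this kind yields a single sensitivity/specificity pair; an area under the ROC curve, as reported above, requires a model that outputs a continuous risk score.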

Keywords: childhood trauma, OCD, supervised machine learning, suicide risk prediction