Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/9577
Title: Detection of suicidal ideation in clinical interviews for depression using natural language processing and machine learning: Cross-sectional study
Authors: Li, Tim M. H. 
Chen, Jie 
Law, Framenia O. C. 
Li, Chun Tung 
Chan, Ngan Yin 
Chan, Joey W. Y. 
Chau, Steven W. H. 
Liu, Yaping 
Li, Shirley Xin 
Zhang, Jihui 
Leung, Kwong Sak 
Wing, Yun-Kwok 
Issue Date: 2023
Source: JMIR Medical Informatics, 2023, vol. 11, article no. e50221.
Journal: JMIR Medical Informatics 
Abstract: Background: Assessing patients’ suicide risk is challenging, especially among those who deny suicidal ideation. Primary care providers have poor agreement in screening suicide risk. Patients’ speech may provide more objective, language-based clues about their underlying suicidal ideation. Text analysis to detect suicide risk in depression is lacking in the literature. Objective: This study aimed to determine whether suicidal ideation can be detected via language features in clinical interviews for depression using natural language processing (NLP) and machine learning (ML). Methods: This cross-sectional study recruited 305 participants between October 2020 and May 2022 (mean age 53.0, SD 11.77 years; female: n=176, 57%), of whom 197 had lifetime depression and 108 were healthy. This study was part of ongoing research on characterizing depression with a case-control design. In this study, 236 participants were nonsuicidal, while 56 and 13 had low and high suicide risks, respectively. The structured interview guide for the Hamilton Depression Rating Scale (HAMD) was adopted to assess suicide risk and depression severity. Suicide risk was clinician-rated based on a suicide-related question (H11). The interviews were transcribed, and the words in participants’ verbal responses were translated into psychologically meaningful categories using Linguistic Inquiry and Word Count (LIWC). Results: Ordinal logistic regression revealed significant suicide-related language features in participants’ responses to the HAMD questions. Increased use of anger words when talking about work and activities posed the highest suicide risk (odds ratio [OR] 2.91, 95% CI 1.22-8.55; P=.02). Random forest models demonstrated that text analysis of the direct responses to H11 was effective in identifying individuals with high suicide risk (AUC 0.76-0.89; P<.001) and detecting suicide risk in general, including both low and high suicide risk (AUC 0.83-0.92; P<.001). More importantly, suicide risk can be detected with satisfactory performance even without patients’ disclosure of suicidal ideation. Based on the response to the question on hypochondriasis, ML models were trained to identify individuals with high suicide risk (AUC 0.76; P<.001). Conclusions: This study examined the use of NLP and ML to analyze the texts from clinical interviews for suicidality detection, which has the potential to provide more accurate and specific markers for suicidal ideation detection. The findings may pave the way for high-performance automated assessment of suicide risk, including online chatbot-based interviews for universal screening.
Type: Peer Reviewed Journal Article
URI: http://hdl.handle.net/20.500.11861/9577
ISSN: 2291-9694
DOI: 10.2196/50221
Appears in Collections: Applied Data Science - Publication

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.