Leveraging Natural language Processing and Machine Learning Techniques to find Frailty Deficits from Clinical Dataset

Institution

http://id.loc.gov/authorities/names/n79058482

Degree Level

Master's

Degree

Master of Science

Department

Department of Mathematical and Statistical Sciences

Specialization

Statistical Machine Learning

Abstract

Introduction: Frailty is a syndrome that is often associated with aging. It can be identified through specific frailty scales or through a comprehensive assessment by a healthcare provider. Alberta appears to have no specific billing or diagnostic codes for frailty, so healthcare providers may instead rely on assessments or codes for related conditions, such as muscle weakness or decreased physical activity, to identify it.

Purpose: This project aims to use Natural Language Processing (NLP) algorithms to extract frailty keywords from structured and unstructured clinical datasets, to identify frailty deficits, and to classify patients as frail or non-frail using machine learning algorithms.

Methods: The dataset included 450 patients over the age of 60, with medical information related to diseases and clinical frailty scales. We first clean the medical notes using NLP techniques and remove negation terms. We then extract keywords from the clinical notes and structured datasets. Finally, we apply resampling techniques to address class imbalance in the clinical datasets and feed the extracted keywords into machine learning classifiers to classify patients as frail or non-frail.

Results: Several types of machine learning classifiers were applied to this task. Random Forest and Decision Tree, with a score of 0.95, performed better than the logistic regression (LR), k-nearest neighbors (KNN), naive Bayes (NB), support vector machine (SVM), and neural network models.

Conclusion: NLP algorithms can effectively extract frailty keywords from Electronic Medical Record (EMR) notes. Moreover, a comparison of the results shows that using both structured and unstructured data gives better results than using structured data alone.
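The Methods steps in the abstract (cleaning notes, dropping negated mentions, extracting keywords, and resampling an imbalanced dataset) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's keyword list, negation method, or resampling technique, so everything here is an assumption made for illustration: `FRAILTY_KEYWORDS`, `NEGATION_CUES`, the clause-level negation rule, and the choice of random oversampling are all hypothetical.

```python
import random
import re

# Hypothetical keyword list and negation cues, chosen only for illustration.
FRAILTY_KEYWORDS = ("weakness", "fall", "fatigue", "weight loss")
NEGATION_CUES = ("no", "denies", "without", "not")


def extract_keywords(note):
    """Return the frailty keywords mentioned in a note, skipping negated clauses."""
    found = set()
    # Split into rough clauses so a negation cue only affects its own clause.
    for clause in re.split(r"[.;,]", note.lower()):
        tokens = re.findall(r"[a-z]+", clause)
        if any(cue in tokens for cue in NEGATION_CUES):
            continue  # drop the whole negated clause
        text = " ".join(tokens)
        for keyword in FRAILTY_KEYWORDS:
            # Allow a simple plural form, e.g. "falls".
            if re.search(r"\b" + re.escape(keyword) + r"s?\b", text):
                found.add(keyword)
    return sorted(found)


def to_features(keywords):
    """Encode extracted keywords as a binary vector in FRAILTY_KEYWORDS order."""
    return [1 if k in keywords else 0 for k in FRAILTY_KEYWORDS]


def random_oversample(rows, labels, seed=0):
    """Duplicate rows at random until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy notes with made-up labels (1 = frail, 0 = non-frail).
notes = [
    "Reports fatigue and two falls this year.",
    "Denies weakness, no falls.",
    "No weight loss; walking well.",
]
labels = [1, 0, 0]

features = [to_features(extract_keywords(n)) for n in notes]
balanced_rows, balanced_labels = random_oversample(features, labels)
```

The balanced feature vectors could then be passed to a classifier such as the Random Forest or Decision Tree models named in the Results. A real pipeline would likely need a clinically validated keyword list and a more careful negation detector than this clause-level rule.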

Item Type

http://purl.org/coar/resource_type/c_46ec

License

Other License Text / Link

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

en
