Methodology research comparing statistical and ML methods of competing risks analysis

james113001/Competing-Risks-Thesis

Competing-Risks-MSc-Thesis

Abstract

Survival analysis is one of the main ways of studying heart disease and cancer. However, disease progression is complex and often interacts with the progression of other diseases. For this reason, competing risks analyses are becoming more prevalent in medicine. Until recently, only two methods have been used, each with its own flaws. The first is the cause-specific hazard function, which censors competing events to estimate the cumulative incidence of the event of interest. This is not ideal, because it assumes that competing events are independent and is likely to bias predictions upward. The second is the subdistribution hazard, which accounts for the competing event and aims to reduce the bias inherent in the cause-specific hazard model. Both models assume proportional hazards, an assumption that does not always hold. Nonparametric machine learning models have therefore been proposed to overcome these rigid model assumptions and provide quality predictions.
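The upward bias that comes from censoring competing events can be illustrated with a small sketch. It is not the thesis code and uses no regression model: it compares the complement of a Kaplan-Meier estimate (competing events treated as censored) with the Aalen-Johansen cumulative incidence estimate on made-up toy data. The function name `naive_and_aj_cif` and the data are assumptions chosen for illustration only.

```python
def naive_and_aj_cif(times, events, cause=1):
    """Compare 1 - KM (competing events censored) with the Aalen-Johansen CIF.

    events: 0 = censored, 1, 2, ... = event type (hypothetical coding).
    Returns (event_times, naive, aj), evaluated at each distinct event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e > 0})
    km_cause = 1.0   # KM survival that treats other causes as censoring
    surv_all = 1.0   # survival free of ANY event, just before the current time
    cif = 0.0        # Aalen-Johansen cumulative incidence for `cause`
    naive, aj = [], []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        d_any = sum(1 for u, e in zip(times, events) if u == t and e > 0)
        # The CIF increment is weighted by overall survival before t
        cif += surv_all * d_cause / at_risk
        surv_all *= 1.0 - d_any / at_risk
        km_cause *= 1.0 - d_cause / at_risk
        naive.append(1.0 - km_cause)
        aj.append(cif)
    return event_times, naive, aj


# Toy data (invented): six subjects, two event types, one censored at t = 4
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 2, 1, 0, 2, 1]
grid, naive, aj = naive_and_aj_cif(toy_times, toy_events, cause=1)
for t, a, b in zip(grid, naive, aj):
    print(f"t={t}: 1-KM={a:.3f}  Aalen-Johansen={b:.3f}")
```

On this toy sample the naive estimate is never below the Aalen-Johansen estimate at any event time, which is the direction of bias described above.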

The aim of this methodology research is to compare the predictive performance of three of these models, Random Survival Forest (RSF), Bayesian Additive Regression Trees (BART), and DeepHit, against the classical methods of Cox proportional hazards (Cox PH) and the subdistribution hazard model on data sets of differing sizes. Five data sets were used: four from R libraries and one simulated. The models were assessed with the concordance index and the Brier score. Results indicate that although most models perform similarly on both metrics, RSF stands out as a well-rounded model. BART needs to be re-examined because of the limitations of this study. Further work could test on larger data sets and tune model parameters. Finally, models capable of calculating variable importance would also be worth comparing using an additional metric.
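For readers unfamiliar with the two metrics, the following is a minimal sketch of simplified versions of each. It is an assumption-laden illustration, not the variants used in the thesis: the concordance index here is Harrell's pairwise form for a single event of interest, and the Brier score omits inverse-probability-of-censoring weights, so it is only valid when no subject is censored before the chosen horizon. The function names and toy values are hypothetical.

```python
def harrell_c_index(times, events, risk):
    """Fraction of comparable pairs ordered correctly by the risk score.

    A pair (i, j) is comparable when subject i had the event of interest
    (events[i] == 1) and did so before subject j's observed time.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # pairs are anchored on an observed event of interest
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable


def brier_score(times, events, pred_cif, horizon, cause=1):
    """Mean squared error between event status at `horizon` and predicted CIF.

    Assumption: nobody is censored before `horizon` (no IPCW correction).
    """
    total = 0.0
    for t, e, p in zip(times, events, pred_cif):
        observed = 1.0 if (t <= horizon and e == cause) else 0.0
        total += (observed - p) ** 2
    return total / len(times)


# Toy values (invented): four subjects, one censored at t = 6
toy_t = [2, 4, 6, 8]
toy_e = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]       # higher score = earlier expected event
toy_cif_at_5 = [0.8, 0.5, 0.3, 0.1]   # predicted incidence by t = 5

c_index = harrell_c_index(toy_t, toy_e, toy_risk)
brier = brier_score(toy_t, toy_e, toy_cif_at_5, horizon=5)
print(f"C-index: {c_index:.3f}, Brier score at t=5: {brier:.4f}")
```

A concordance index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, while a lower Brier score indicates better-calibrated, more accurate probability predictions.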

Descriptive table of Melanoma data
