On Fair Performance Comparison between Random Survival Forest and Cox Regression: An Example of Colorectal Cancer Study

Random Forest (RF), a mostly model-free and robust machine learning method, has been successfully applied to right-censored survival data, under the name of Random Survival Forest (RSF). However, RF/RSF has its distinct strategies in classification and prediction. First, it is an ensemble classifier...

Bibliographic Details
Main Authors: Sirin Cetin (Author), Ayse Ulgen (Author), Isa Dede (Author), Wentian Li (Author)
Format: Book
Published: Ital Publication, 2021-03-01T00:00:00Z.
Subjects:
Links: Connect to this object online.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_32c881c82fd84d3eb1a3dd9292afe1fe
042 |a dc 
100 1 0 |a Sirin Cetin  |e author 
700 1 0 |a Ayse Ulgen  |e author 
700 1 0 |a Isa Dede  |e author 
700 1 0 |a Wentian Li  |e author 
245 0 0 |a On Fair Performance Comparison between Random Survival Forest and Cox Regression: An Example of Colorectal Cancer Study 
260 |b Ital Publication,   |c 2021-03-01T00:00:00Z. 
500 |a 2704-9833 
500 |a 10.28991/SciMedJ-2021-0301-9 
520 |a Random Forest (RF), a mostly model-free and robust machine learning method, has been successfully applied to right-censored survival data, under the name of Random Survival Forest (RSF). However, RF/RSF has its distinct strategies in classification and prediction. First, it is an ensemble classifier and its performance is an average of multiple rounds of data fitting. Second, the training set is a bootstrap (sampling with replacement) generated set with repeated use of roughly 2/3 of all samples, and the testing set consists of those not used (out-of-bag samples). Neither feature is intrinsic to Cox regression or other single classifiers. Not considering these two features could potentially lead to a partial comparison between the performance of the two methods. By using a colorectal survival dataset, we illustrate the problems of using k-fold cross-validation, using only one resampling without an ensemble average, and using the whole dataset for both fitting and testing, in Cox regression, when comparing with RSF. We provide a more accessible R code for simple calculation of discordance index (D-index) and unweighted integrated Brier score (IBS) for Cox regression, and unweighted IBS for RSF.   DOI: 10.28991/SciMedJ-2021-0301-9 
546 |a EN 
690 |a random survival forest 
690 |a cox regression 
690 |a machine learning 
690 |a brier's score 
690 |a discordance index 
690 |a colorectal cancer. 
690 |a Neoplasms. Tumors. Oncology. Including cancer and carcinogens 
690 |a RC254-282 
690 |a Public aspects of medicine 
690 |a RA1-1270 
655 7 |a article  |2 local 
786 0 |n SciMedicine Journal, Vol 3, Iss 1, Pp 66-76 (2021) 
787 0 |n https://www.scimedjournal.org/index.php/SMJ/article/view/343 
787 0 |n https://doaj.org/toc/2704-9833 
856 4 1 |u https://doaj.org/article/32c881c82fd84d3eb1a3dd9292afe1fe  |z Connect to this object online.
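
Note on the abstract (field 520): the statement that a bootstrap training set reuses "roughly 2/3 of all samples", with the rest held out as out-of-bag test samples, can be checked with a small simulation. The sketch below is illustrative only and is not the article's R code; the function names bootstrap_split and mean_in_bag_fraction are hypothetical, and Python is used here merely as an assumption of convenience. For n samples drawn with replacement, the expected fraction of distinct in-bag samples is 1 - (1 - 1/n)^n, which approaches 1 - 1/e (about 0.632) as n grows.

```python
import math
import random


def bootstrap_split(n, rng):
    """Draw n indices with replacement from range(n).

    Returns (in_bag, out_of_bag) as sets of distinct indices.
    Hypothetical helper for illustration, not taken from the article.
    """
    in_bag = {rng.randrange(n) for _ in range(n)}
    out_of_bag = set(range(n)) - in_bag
    return in_bag, out_of_bag


def mean_in_bag_fraction(n, rounds, seed=0):
    """Average the fraction of distinct in-bag samples over several resamplings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        in_bag, _oob = bootstrap_split(n, rng)
        total += len(in_bag) / n
    return total / rounds


if __name__ == "__main__":
    # Simulated average versus the large-n limit 1 - 1/e.
    print(mean_in_bag_fraction(500, 200))
    print(1 - math.exp(-1))
```

Averaging over many resamplings, rather than relying on a single split, mirrors the ensemble averaging that the abstract says should also be applied to Cox regression for a like-for-like comparison.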