There are several methods for building prediction models. The wealth of currently available modeling techniques usually forces the researcher to judge, a priori, what will likely be the best method. Super learning (SL) is a methodology that facilitates this decision by combining all identified prediction algorithms pertinent for a particular prediction problem. SL generates a final model that is at least as good as any of the other models considered for predicting the outcome. The overarching aim of this work is to introduce SL to analysts and practitioners. This work compares the performance of logistic regression, penalized regression, random forests, deep learning neural networks, and SL to predict successful substance use disorders (SUD) treatment. A nationwide database including 99,013 SUD treatment patients was used. All algorithms were evaluated using the area under the receiver operating characteristic curve (AUC) in a test sample that was not included in the training sample used to fit the prediction models. AUC for the models ranged between 0.793 and 0.820. SL was superior to all but one of the algorithms compared. An explanation of SL steps is provided. SL is the first step in targeted learning, an analytic framework that yields double robust effect estimation and inference with fewer assumptions than the usual parametric methods. Different aspects of SL depending on the context, its function within the targeted learning framework, and the benefits of this methodology in the addiction field are discussed. © 2017 Acion et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
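The steps summarized in the abstract (fit several candidate algorithms, combine them, and compare AUC on a test sample that was not used for training) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the synthetic data, the two toy learners, and every function name (`auc`, `fit_logistic`, `fit_knn`, `super_learner`) are assumptions made for this example. It uses the common formulation of super learning as a convex combination of candidate learners whose weights minimize a cross-validated loss.

```python
import math
import random

def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_logistic(X, y, steps=200, lr=0.5):
    """Toy learner 1: logistic regression fitted by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return lambda x: sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], x)))

def fit_knn(X, y, k=15):
    """Toy learner 2: k-nearest neighbours (mean outcome of the k closest rows)."""
    def predict(x):
        dist = lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
        nearest = sorted(zip(X, y), key=dist)[:k]
        return sum(label for _, label in nearest) / k
    return predict

def super_learner(X, y, learners, folds=5, grid=21):
    """Weight two learners by minimizing cross-validated squared error."""
    n = len(X)
    cv_pred = [[0.0] * n for _ in learners]
    for v in range(folds):
        train = [i for i in range(n) if i % folds != v]
        valid = [i for i in range(n) if i % folds == v]
        for m, fit in enumerate(learners):
            model = fit([X[i] for i in train], [y[i] for i in train])
            for i in valid:
                cv_pred[m][i] = model(X[i])  # prediction for a row the model never saw

    def cv_loss(a):
        return sum((a * p0 + (1 - a) * p1 - yi) ** 2
                   for p0, p1, yi in zip(cv_pred[0], cv_pred[1], y))

    # Grid search over the convex weight; only valid for exactly two learners.
    alpha = min((g / (grid - 1) for g in range(grid)), key=cv_loss)
    full = [fit(X, y) for fit in learners]  # refit each learner on all training rows
    combined = lambda x: alpha * full[0](x) + (1 - alpha) * full[1](x)
    return alpha, combined, full

# Synthetic data (an assumption for illustration): outcome depends on x0 and x0*x1.
random.seed(0)
def make_row():
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    return x, int(random.random() < sigmoid(1.5 * x[0] + 1.5 * x[0] * x[1]))

data = [make_row() for _ in range(600)]
X_tr, y_tr = [r[0] for r in data[:400]], [r[1] for r in data[:400]]
X_te, y_te = [r[0] for r in data[400:]], [r[1] for r in data[400:]]  # held-out test sample

alpha, sl_predict, (logit_model, knn_model) = super_learner(X_tr, y_tr, [fit_logistic, fit_knn])
models = {"logistic": logit_model, "k-NN": knn_model, "super learner": sl_predict}
test_auc = {name: auc(y_te, [m(x) for x in X_te]) for name, m in models.items()}
for name, value in test_auc.items():
    print(f"{name:>13}: test AUC = {value:.3f}")
print(f"weight on logistic regression: {alpha:.2f}")
```

The weights are chosen from cross-validated predictions rather than in-sample fits, so a learner that overfits the training data is not rewarded for it. With more than two candidates, the grid search would need to be replaced by a constrained optimizer over all weights.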
|Title:||Use of a machine learning framework to predict substance use disorder treatment success|
|Author:||Acion, L.; Kelmansky, D.; van der Laan, M.; Sahker, E.; Jones, D.; Arndt, S.|
|Affiliation:||Instituto de Cálculo, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, CONICET, Buenos Aires, Argentina|
Iowa Consortium for Substance Abuse Research and Evaluation, University of Iowa, Iowa City, IA, United States
Division of Biostatistics, University of California, Berkeley, CA, United States
Counseling Psychology Program, Department of Psychological and Quantitative Foundations, College of Education, University of Iowa, Iowa City, IA, United States
Department of Psychiatry, Roy J and Lucille A Carver College of Medicine, University of Iowa, Iowa City, IA, United States
Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA, United States
|Keywords:||adult; algorithm; area under the curve; Article; artificial neural network; controlled study; decision making; drug dependence; education; employment status; female; Hispanic; human; length of stay; machine learning; major clinical study; male; methodology; prediction; receiver operating characteristic; sensitivity analysis; substance abuse; super learning; treatment outcome; adolescent; computer assisted diagnosis; drug dependence; factual database; middle aged; prognosis; regression analysis; socioeconomics; young adult; Adolescent; Adult; Area Under Curve; Databases, Factual; Diagnosis, Computer-Assisted; Female; Humans; Length of Stay; Machine Learning; Male; Middle Aged; Neural Networks (Computer); Prognosis; Regression Analysis; ROC Curve; Socioeconomic Factors; Substance-Related Disorders; Treatment Outcome; Young Adult|
|Journal title:||PLoS ONE|
|Abbreviated journal title:||PLoS ONE|
---------- APA ----------
Acion, L., Kelmansky, D., van der Laan, M., Sahker, E., Jones, D. & Arndt, S. (2017). Use of a machine learning framework to predict substance use disorder treatment success. PLoS ONE, 12(4).
---------- CHICAGO ----------
Acion, L., Kelmansky, D., van der Laan, M., Sahker, E., Jones, D., Arndt, S. "Use of a machine learning framework to predict substance use disorder treatment success." PLoS ONE 12, no. 4 (2017).
---------- MLA ----------
Acion, L., Kelmansky, D., van der Laan, M., Sahker, E., Jones, D., Arndt, S. "Use of a machine learning framework to predict substance use disorder treatment success." PLoS ONE, vol. 12, no. 4, 2017.
---------- VANCOUVER ----------
Acion, L., Kelmansky, D., van der Laan, M., Sahker, E., Jones, D., Arndt, S. Use of a machine learning framework to predict substance use disorder treatment success. PLoS ONE. 2017;12(4).