In the era of precision medicine, a central challenge is stratifying cancer patients into subgroups for precise treatment. Because clinical data on patients' responses to drugs are scarce or absent, large-scale drug sensitivity screens in cell lines have been used to identify clinically meaningful gene-drug interactions. However, the prediction of drug response in cancer patients still needs improvement. Iorio and colleagues published valuable results and data (Genomics of Drug Sensitivity in Cancer, GDSC), which include drug sensitivity scores and omics data for more than 1,000 cell lines treated with 265 drugs. They fitted comprehensive logic models but limited the number of variables to at most four.
To predict the clinical response of cancer patients, we first performed variable selection and transfer learning, and then trained the classifiers (models) on omics data from GDSC. Next, we compared the trained classifiers on external data sets, which included gene expression data and survival outcomes of cancer patients treated with nine drugs. The validations show that k-nearest neighbor outperforms several penalized regression models and support vector machines.
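The select-train-validate workflow described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the variance-based variable selection, the toy expression values, and the function names `select_top_variance` and `knn_predict` are all assumptions made for illustration, and the transfer learning step is omitted.

```python
import math
from collections import Counter

def select_top_variance(rows, n_keep):
    """Return sorted indices of the n_keep features with the largest variance.

    Assumption: variance ranking stands in for the (unspecified)
    variable selection method used in the study.
    """
    n_features = len(rows[0])
    variances = []
    for j in range(n_features):
        col = [r[j] for r in rows]
        mean = sum(col) / len(col)
        variances.append(sum((v - mean) ** 2 for v in col) / len(col))
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n_keep])

def project(rows, keep):
    """Keep only the selected feature columns."""
    return [[r[j] for j in keep] for r in rows]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    neighbors = sorted(zip(train_x, train_y),
                       key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

# Hypothetical "cell line" training data: 3 features, label 1 = sensitive.
train_x = [[0.1, 5.0, 1.0], [0.2, 5.1, 1.0], [0.9, 1.0, 1.0], [1.0, 0.9, 1.0]]
train_y = [1, 1, 0, 0]

# Hypothetical "external patient" samples measured on the same features.
external_x = [[0.15, 4.8, 1.0], [0.95, 1.2, 1.0]]

keep = select_top_variance(train_x, n_keep=2)   # drops the constant feature
train_sel = project(train_x, keep)
external_sel = project(external_x, keep)
predictions = [knn_predict(train_sel, train_y, q, k=3) for q in external_sel]
```

In practice the cell-line and patient expression data would need to be put on a comparable scale before distances between them are meaningful; that harmonization is not shown here.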
Finally, we show that machine learning-based methods perform well in predicting response to immunotherapy. For example, for atezolizumab (a PD-L1 inhibitor), a classifier trained on urothelial cancer data achieved an AUC of approximately 0.70-0.75 on external test sets in urothelial and renal cancers.
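For readers unfamiliar with the AUC metric quoted above, the following sketch computes it as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for responder, 0 for non-responder.
    scores: predicted score per patient (higher = more likely to respond).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # responder ranked above non-responder
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical external test set: 3 responders, 4 non-responders.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values in the 0.70-0.75 range indicate moderate discrimination.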
Keywords: cancer, drug, immunotherapy, machine learning, omics data, regression, variable selection.
Prof. Shieh’s team has worked on computational approaches to reveal prognostic and predictive biomarkers for various cancer types. In the past three years, her team has focused on predicting the response of cancer patients to targeted therapies, chemotherapies, and immunotherapies, developing machine-learning methods that use multi-omics data.