Can Machine Learning Algorithms using General Labs Predict Clinical Remission and Mucosal Healing for Patients on Vedolizumab?

Proposal
1136

Title of Proposed Research

Lead Researcher

Peter Higgins

Affiliation

University of Michigan

Funding Source

Grant Support: AKW's research is funded by a VA HSR&D CDA-2 Career Development Award 1IK2HX000775. PDRH's research is supported by NIH R01 GM097117.

Potential Conflicts of Interest

Higgins, Waljee, and Zhu currently are inventors of the following patent held by the Regents of the University of Michigan:
Algorithms to predict clinical response, adherence, and shunting with thiopurines
Assignee: The Regents of the University of Michigan (Ann Arbor, MI)
Family ID: 40028417
Appl. No.: 11/804,366
Filed: May 18, 2007

Data Sharing Agreement Date

13 August 2015

Lay Summary

Background:
Vedolizumab has been proven to be an effective therapy for patients with inflammatory bowel disease, but it has been observed to be slow-acting. Machine learning applied to common lab results, including the CBC with differential and the comprehensive chemistry panel (plus age in days at time of blood draw), can predict clinical remission in patients with inflammatory bowel disease (1). The ability to predict early in treatment which patients will respond to vedolizumab would save patients both time and money.

Objective:
Our primary objective is to develop a machine learning model to predict at week 6 which Crohn's patients will reach clinical remission as determined by CDAI at week 52 and which ulcerative colitis patients will reach mucosal healing at week 52.

Study Design:

For C13007:
We will build a machine learning model using all lab results from weeks 0 and 6 as predictors and clinical remission at week 52 as the outcome. This model will estimate the probability that a patient will reach clinical remission at week 52 based on the change in their blood work from week 0 to week 6.

We will also build a parallel model using all lab results from weeks 0 and 6 as predictors and corticosteroid-free remission at week 52 as the outcome.

For C13006:
We will build a machine learning model using all lab results from weeks 0 and 6 as predictors and mucosal healing at week 52 as the outcome. This model will estimate the probability that a patient will achieve mucosal healing at week 52 based on the change in their blood work from week 0 to week 6.
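As an illustration of the predictor set described in this study design, the sketch below assembles one row of features per patient from week 0 and week 6 labs. It is a minimal sketch, not the study's actual pipeline: the function name, the lab names, and the values are hypothetical, and including an explicit week 0 to week 6 difference as a feature is an assumption based on the phrase "change in their blood work".

```python
from typing import Dict


def build_features(week0: Dict[str, float], week6: Dict[str, float]) -> Dict[str, float]:
    """Combine week 0 and week 6 lab values, plus their change, into one feature row.

    Only labs measured at both visits are used (an assumption for this sketch).
    """
    features: Dict[str, float] = {}
    for lab in sorted(week0.keys() & week6.keys()):
        features[f"{lab}_wk0"] = week0[lab]
        features[f"{lab}_wk6"] = week6[lab]
        features[f"{lab}_delta"] = week6[lab] - week0[lab]  # change from week 0 to week 6
    return features


# Made-up values for one hypothetical patient (not trial data).
week0 = {"hemoglobin": 11.0, "albumin": 3.5, "platelets": 410.0}
week6 = {"hemoglobin": 12.0, "albumin": 4.0, "platelets": 350.0}
row = build_features(week0, week6)
```

Each lab contributes three columns here (baseline, week 6, and difference), so the same row layout could serve either trial's model.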

Participants:
Subjects randomized to active vedolizumab (both Q4w and Q8w) in the C13006 trial and the C13007 trial.

Main Outcome Measure(s):
1) The ability of machine learning to predict at week 6 which patients will achieve mucosal healing at week 52 using data from C13006.
2) The ability of machine learning to predict at week 6 which patients will reach clinical remission at week 52 using data from C13007.

Statistical Analysis:

We want to measure the predictive value of the models. We will determine the cut point that produces the largest sum of the positive predictive value (PPV) and the negative predictive value (NPV).

We will use a Student's t test to compare the following:
1) The sum of the PPV and NPV for the UC model, estimated by 100 bootstrap replicates in the 30% test set, vs. the null hypothesis that the sum of the PPV and NPV is less than or equal to 1.5 in the C13006 trial.
2) The sum of the PPV and NPV for the Crohn's model, estimated by 100 bootstrap replicates in the 30% test set, vs. the null hypothesis that the sum of the PPV and NPV is less than or equal to 1.5 in the C13007 trial.
We will also report the AuROC, sensitivity, specificity, NPV, and PPV for each of the models above.

Study Data Provided

TAKEDA-C13006 (Posting ID 2347): A Phase 3, Randomized, Placebo-Controlled, Blinded, Multicenter Study of the Induction and Maintenance of Clinical Response and Remission by MLN0002 in Patients With Moderate to Severe Ulcerative Colitis

TAKEDA-C13007 (Posting ID 2348): A Phase 3, Randomized, Placebo-Controlled, Blinded, Multicenter Study of the Induction and Maintenance of Clinical Response and Remission by Vedolizumab (MLN0002) in Patients With Moderate to Severe Crohn's Disease

Statistical Analysis Plan

We want to measure the predictive value of the models. We will determine the cut point that produces the largest sum of the positive predictive value (PPV) and the negative predictive value (NPV).
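One way to picture the cut-point search is the sketch below, which scans every predicted probability as a candidate cut and keeps the one with the largest PPV + NPV. The function names and the toy probabilities are hypothetical. Treating an undefined PPV or NPV (no patients on one side of the cut) as 0, and keeping the lowest cut on ties, are assumptions of this sketch rather than choices stated in the plan.

```python
from typing import List, Tuple


def ppv_npv(probs: List[float], outcomes: List[int], cut: float) -> Tuple[float, float]:
    """PPV and NPV when patients with predicted probability >= cut are called positive."""
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= cut and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= cut and y == 0)
    tn = sum(1 for p, y in zip(probs, outcomes) if p < cut and y == 0)
    fn = sum(1 for p, y in zip(probs, outcomes) if p < cut and y == 1)
    ppv = tp / (tp + fp) if tp + fp else 0.0  # assumption: undefined PPV counts as 0
    npv = tn / (tn + fn) if tn + fn else 0.0  # assumption: undefined NPV counts as 0
    return ppv, npv


def best_cut_point(probs: List[float], outcomes: List[int]) -> Tuple[float, float]:
    """Return (cut, PPV + NPV) for the candidate cut with the largest sum."""
    best_cut, best_sum = None, -1.0
    for cut in sorted(set(probs)):
        ppv, npv = ppv_npv(probs, outcomes, cut)
        if ppv + npv > best_sum:  # strict comparison keeps the lowest cut on ties
            best_cut, best_sum = cut, ppv + npv
    return best_cut, best_sum


# Toy predicted probabilities and week 52 outcomes (1 = remission), not trial data.
probs = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
cut, total = best_cut_point(probs, outcomes)
```

With these toy values the search settles on a cut of 0.40, where PPV is 0.8 and NPV is 1.0.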

We will develop the two models on a randomly selected training sample of 70% of the study population and test them on the remaining 30%. We will measure the sum of the PPV and NPV, then generate a new bootstrap replicate by refitting the models on a new randomly selected 70% of the study population. We will repeat this process 100 times for each model to arrive at a mean and standard deviation for the sum of the PPV and NPV.
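The resampling loop can be sketched as follows. This is a minimal illustration under stated assumptions: the plan does not name the learning algorithm, so a single-marker threshold rule stands in for the model, and the cohort is synthetic. All function names are hypothetical. Each replicate draws a fresh random 70% for training and scores the held-out 30%, as the paragraph above describes.

```python
import random
import statistics


def ppv_plus_npv(preds, outcomes):
    """Sum of PPV and NPV for boolean predictions (undefined terms count as 0)."""
    tp = sum(1 for p, y in zip(preds, outcomes) if p and y)
    fp = sum(1 for p, y in zip(preds, outcomes) if p and not y)
    tn = sum(1 for p, y in zip(preds, outcomes) if not p and not y)
    fn = sum(1 for p, y in zip(preds, outcomes) if not p and y)
    ppv = tp / (tp + fp) if tp + fp else 0.0
    npv = tn / (tn + fn) if tn + fn else 0.0
    return ppv + npv


def fit_threshold(train):
    """Stand-in 'model': the marker threshold with the best PPV + NPV on training data."""
    best_t, best_s = None, -1.0
    for t in sorted({x for x, _ in train}):
        s = ppv_plus_npv([x >= t for x, _ in train], [y for _, y in train])
        if s > best_s:
            best_t, best_s = t, s
    return best_t


def repeated_split_scores(data, n_reps=100, train_frac=0.7, seed=0):
    """PPV + NPV on the held-out 30% for each of n_reps random 70/30 splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_reps):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_train = int(round(train_frac * len(shuffled)))
        train, test = shuffled[:n_train], shuffled[n_train:]
        t = fit_threshold(train)
        scores.append(ppv_plus_npv([x >= t for x, _ in test], [y for _, y in test]))
    return scores


# Synthetic cohort of 100 (marker, outcome) pairs; responders have a higher marker on average.
gen = random.Random(42)
data = []
for _ in range(100):
    y = 1 if gen.random() < 0.4 else 0
    data.append((gen.gauss(1.0 if y else 0.0, 1.0), y))

scores = repeated_split_scores(data)
mean_sum = statistics.mean(scores)
sd_sum = statistics.stdev(scores)
```

The mean and standard deviation of `scores` play the role of the summary statistics that feed the later comparison against 1.5.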

We will test the null hypothesis that the sum of the PPV and NPV is equal to 1.5. We will consider the alternative hypothesis that the sum of the PPV and NPV is greater than 1.5. We will consider a 2-sided alpha of less than 0.05 as statistically significant. We have no preliminary data with vedolizumab to calculate the power of this study to find a significant result. However, based on our previous studies with thiopurines, we expect a mean sum of NPV and PPV greater than 1.7, and a standard deviation of 0.2 or less from the bootstrap replicates.

In the C13007 trial, dropout and missing data could reduce the number of patients on vedolizumab for 52 weeks from 155 to 100. Under the above assumptions, this would still produce a power estimate of 0.99.

Similarly, in the C13006 study, dropout and missing data could reduce the number of patients on vedolizumab for 52 weeks from 161 to 100. Under the above assumptions, this would also produce a power estimate of 0.99.
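The plan does not state how the power estimate was computed. As a rough sanity check, the sketch below computes a one-sample t statistic and a normal-approximation power for the stated assumptions (expected mean 1.7, standard deviation 0.2, null value 1.5, alpha 0.05). Using n = 100 as the sample size entering the test is an assumption, and the normal approximation is a swapped-in simplification of an exact noncentral t calculation. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_t(mean: float, sd: float, n: int, null_value: float = 1.5) -> float:
    """t statistic for the null hypothesis that the true mean equals null_value."""
    return (mean - null_value) / (sd / math.sqrt(n))


def approx_power(mean: float, sd: float, n: int, null_value: float = 1.5,
                 alpha: float = 0.05, two_sided: bool = True) -> float:
    """Normal approximation to the power to detect a mean above null_value.

    For the two-sided case the negligible chance of rejecting in the
    wrong direction is ignored.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    shift = (mean - null_value) / (sd / math.sqrt(n))
    return z.cdf(shift - crit)


t_stat = one_sample_t(1.7, 0.2, 100)  # about 10 standard errors above 1.5
power = approx_power(1.7, 0.2, 100)   # well above 0.99 under these assumptions
```

Under these inputs the expected difference is about ten standard errors, which is consistent with a power estimate of at least 0.99.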

We will use a Student's t test to compare the following:
1) The sum of the PPV and NPV for the UC model, estimated by 100 bootstrap replicates in the 30% test set, vs. the null hypothesis that the sum of the PPV and NPV is less than or equal to 1.5 in the C13006 trial.
2) The sum of the PPV and NPV for the Crohn's model, estimated by 100 bootstrap replicates in the 30% test set, vs. the null hypothesis that the sum of the PPV and NPV is less than or equal to 1.5 in the C13007 trial.
We will also report the AuROC, sensitivity, specificity, NPV, and PPV for each of the models above.
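For readers less familiar with the additional reported metrics, the sketch below shows one way to compute AuROC, sensitivity, and specificity from predicted probabilities. It is illustrative only: the function names and toy data are hypothetical, the AuROC uses the pairwise (rank-based) definition with ties counted as one half, and both outcome classes are assumed to be present.

```python
from typing import List, Tuple


def auroc(probs: List[float], outcomes: List[int]) -> float:
    """Chance that a random positive case scores higher than a random negative case."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))  # assumes both classes are present


def sensitivity_specificity(probs: List[float], outcomes: List[int],
                            cut: float) -> Tuple[float, float]:
    """Sensitivity and specificity when probability >= cut is called positive."""
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= cut and y == 1)
    fn = sum(1 for p, y in zip(probs, outcomes) if p < cut and y == 1)
    tn = sum(1 for p, y in zip(probs, outcomes) if p < cut and y == 0)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= cut and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Toy predicted probabilities and week 52 outcomes (1 = remission), not trial data.
probs = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
area = auroc(probs, outcomes)
sens, spec = sensitivity_specificity(probs, outcomes, cut=0.40)
```

In this toy example 15 of the 16 positive-negative pairs are ranked correctly, giving an AuROC of 0.9375, and a cut of 0.40 gives a sensitivity of 1.0 with a specificity of 0.75.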

Publication Citation

https://onlinelibrary.wiley.com/doi/epdf/10.1111/apt.14510