Peter E. Lipsky, MD
Outcome measures for SLE are complex, not relevant to the practicing clinician, and poorly responsive to trial interventions. One way to develop more effective and appropriate outcome measures is to analyze the available clinical trial data, deconstruct the data captured in the current measures, and reconstruct them in a manner that more effectively separates responders from non-responders. The approach will involve an analysis of data from the BLISS trials. The data collected in these trials will be dissected to determine whether novel combinations of the data can be employed as outcome measures to be assessed in future clinical trials. The statistical analysis will be informed by an iterative process with a committee of experienced clinicians and trialists.
[{ "PostingID": 1416, "Title": "GSK-HGS1006-C1056", "Description": "A Phase 3, Multi-Center, Randomized, Double-Blind, Placebo-Controlled, 76-Week Study to Evaluate the Efficacy and Safety of Belimumab (HGS1006, LymphoStat-B™), a Fully Human Monoclonal Anti-BLyS Antibody, in Subjects with Systemic Lupus Erythematosus (SLE)" },{ "PostingID": 1417, "Title": "GSK-HGS1006-C1057", "Description": "A Phase 3, Multi-Center, Randomized, Double-Blind, Placebo-Controlled, 52-Wk Study to Evaluate the Efficacy and Safety of Belimumab (HGS1006, LymphoStat-B™), a Fully Human Monoclonal Anti-BLyS Antibody, in Subjects With Systemic Lupus Erythematosus (SLE)" }]
Clinical trials for systemic lupus erythematosus (SLE) and lupus nephritis (LN) present many challenges, including the extreme heterogeneity in the severity of the disease itself and the heterogeneity of its multi-organ system manifestations. Currently, the SLEDAI, the BILAG, a combination of the two within an SLE responder index (SRI), or the BILAG-based Composite Lupus Assessment (BICLA) are used to assess disease activity in SLE randomized controlled trials (RCTs). However, for the most part, none of these assessment tools is used in everyday clinical practice, so very few rheumatologists know how to apply the resultant data effectively. Both the SLEDAI and the BILAG also have limitations that make their application in clinical trials problematic. Part of the problem results from the fact that both were developed as disease activity measures from evidence accumulated in observational data sets and were not prospectively validated in RCTs. Specifically, the SLEDAI is not designed to detect partial improvement, which may be clinically important, and the weighting of its components is not optimal: certain descriptors are weighted heavily, whereas other potentially clinically important items are not. Similarly, the BILAG suffers from the fact that discrimination of active disease, especially BILAG B grade organ system involvement, is somewhat subjective. There is also ambiguity concerning certain SLEDAI features, including a lack of agreement on the best assessment window (e.g., 10 days, 28 days, or 30 days). Another issue is unclear nomenclature or descriptors (e.g., lupus headache vs. CNS lupus), as well as the lack of accurate definitions of worsening or flare. Additionally, patient global assessments have often been excluded from RCTs.
While the BILAG may theoretically be more sensitive, it is more complicated and may have reproducibility issues, often requiring adjudication to maintain accuracy. Intra-observer and inter-observer variability is quite problematic. It is not clear that even regular users are consistent in their scoring assignments. Reduction in disease activity is usually the most important endpoint in RCTs. However, in SLE, reduction in concomitant glucocorticoid use is also an important endpoint. Unfortunately, changes in background therapy, as well as reductions in daily steroid doses, may dilute the measurement of disease activity by SLEDAI and/or BILAG. One consideration would be a responder index that incorporates steroid dose changes, with points gained for dose reductions or lost for dose increases. While the BILAG does take steroid use into account, the issue is not fully addressed. Furthermore, increases in doses of background steroid and/or immunosuppressive therapy may serve to “rescue” the placebo patients, thereby minimizing differences between active and control treatment. Based on the complexity, subjectivity, and idiosyncrasies of the current outcome measures, a sensitive, validated endpoint that mirrors a physician's and patient's expectation of improvement, and one developed with prospective evidence from RCTs, would be of value in new clinical development programs for potential SLE therapeutics.
Current LN outcome measures focus on either complete remission or response defined by low levels of proteinuria. However, this may not be attainable for certain patients, given the severity of the inflammation and chronic changes within the glomeruli as well as the interstitium. Even though a marked decrease in proteinuria (e.g., from 10 grams to 1 gram per 24 hours) may be a great improvement for an LN patient, it is not considered a complete response; therefore, a partial response might be an appropriate outcome to consider.
An additional consideration is the timing of the response. Would eGFR be an appropriate early response? Should early vs. late responders be defined separately? Moving forward, there is a need for clinical and histological identifiers that are followed long term (>1.5 years) in LN patients, that are predictive and prognostic, and that would help to guide treatment and measure outcomes. It is important to evaluate the entire patient (e.g., all extra-renal disease activity) and to include patient reported outcomes. Renal disease may be otherwise asymptomatic but may still cause fatigue and malaise. A tool that provides a continuous scale for renal manifestations, as opposed to dichotomous cut-offs, should be considered, as this would provide information on how the patient's disease status is changing over time. Future disease assessment tools for SLE should be easy to understand and implement. An assessment tool focused on a continuous measure may be able to show “levels of improvement or response” rather than only a binary outcome. The current outcome measures require a substantial amount of cross-checking with laboratory data and individual study sites. Subjective domains increase variability between evaluators; therefore, it is preferable to have instruments that rely on specific objective data. Another issue is the length of time needed to capture proof-of-concept information. A measure that can provide an “early read” in less time (e.g., 3 or 6 months) and that correlates well with 1-year outcomes would be beneficial. Validated biomarkers for long-term prognosis may help to shorten the required duration of studies. Unfortunately, at present, there are no good biomarkers or pharmacodynamic measures available in either SLE or LN.
Proposal
The Lupus Research Institute (LRI) and OMERACT have formed a partnership to undertake a study to review data available from recent RCTs in SLE to develop new outcome measurement tools for testing in future trials.
The clinical data will be deconstructed and used to identify opportunities for new tools, which could help improve the potential success rate of future SLE trials.
Phase I
Phase one of the initiative will focus on a review of data from completed SLE RCTs, with the specific intent of dissecting the data collected, assessing how efficacy was measured, and reviewing the reported outcomes (e.g., physician reported outcomes, patient reported outcomes) as well as the collected laboratory values, to determine whether aspects of the data or novel combinations of data may be used to improve the currently available clinical trial outcome measures. The focus of Phase 1 will be to:
• Review data from completed SLE RCTs to determine whether, by deconstructing the outcome measures, a unique combination of outcome measures distinguishes responders to active therapy plus standard of care from those receiving placebo plus standard of care
• Review data both from completed trials reporting significant benefit and from those that failed to achieve their primary outcome, beginning with data from trials that were successful
• Analyze changes over time within individuals, assessing whether there were changes in background therapy that confounded response
• Determine whether organ specific measures versus global measures may be more helpful to define response (dependent upon study design)
• Determine whether individual components of response correlate with outcomes
• Review change in patient reported outcomes (e.g., are patient reported outcomes useful in assessing responder status; what do patient reported outcomes add to the responder assessment; when should patient reported outcomes be included?); if measurement is included within the clinical trial:
  o Assess the patient global assessment of disease activity
  o Assess patient reported HRQOL including specific domains of SF-36 and transition question
  o Assess change in patient reported fatigue
• Assess whether the physician reported global measures of disease activity and flare discriminate between the effects of study drug and comparative arms
• Assess the measure of corticosteroid doses and whether attainment of a “clinically meaningful” definition of reduction, such as a dose of 7.5 mg or less of prednisone per day or a 50% decrease in daily dose, corresponds with response or is an adequate response measure in its own right.
The statistical analysis plan will be accomplished through an iterative process. The investigators will seek the input of an advisory committee to generate specific questions based on results from the initial data analyses that appear to divide responders vs. non-responders. Interim results of the initial statistical analyses will determine whether additional analyses need to be undertaken.
Statistical Analysis Plan - Phase 1
Overall Objectives:
1. To develop new 'candidate' outcome measures for SLE RCTs that will account for:
  a. Heterogeneity of the disease patterns and organ system involvement
  b. Possibility of combining information on reduction in
    i. Disease activity; and,
    ii. Steroid doses
2. To validate the performance of the alternative new 'candidate outcomes' and compare them:
  a. With each other, as well as;
  b. With the existing outcome measures currently used in RCTs (SRI, BICLA, SLEDAI, BILAG etc.)
3. To explore whether different outcome measures may be most responsive to change in different sub-populations of SLE patients.
4. To repeat steps 1-3 focusing on lupus nephritis (LN).
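The corticosteroid criterion listed among the Phase 1 aims could be operationalized along the following lines. This is only a minimal sketch: it assumes the criterion is read as "current daily prednisone dose of 7.5 mg or less, or a reduction of at least 50% from baseline", and the function names, the tuple layout, and the example doses are hypothetical, not trial data.

```python
def meets_steroid_reduction(baseline_mg, current_mg,
                            dose_ceiling_mg=7.5, min_fraction_reduced=0.5):
    """Return True if the current daily dose is at or below the ceiling, or has
    fallen by at least the given fraction from baseline (assumed reading)."""
    if baseline_mg < 0 or current_mg < 0:
        raise ValueError("doses must be non-negative")
    at_or_below_ceiling = current_mg <= dose_ceiling_mg
    # A fractional reduction is only meaningful when the baseline dose is positive.
    reduced_enough = (baseline_mg > 0
                      and current_mg <= baseline_mg * (1.0 - min_fraction_reduced))
    return at_or_below_ceiling or reduced_enough


def cross_tabulate(patients):
    """Count patients in a 2x2 table keyed by (steroid criterion met, responder)."""
    table = {(s, r): 0 for s in (True, False) for r in (True, False)}
    for baseline_mg, current_mg, responder in patients:
        met = meets_steroid_reduction(baseline_mg, current_mg)
        table[(met, bool(responder))] += 1
    return table


# Hypothetical patients: (baseline mg/day, current mg/day, responder on another measure)
example = [(20.0, 7.5, True), (40.0, 20.0, True), (15.0, 12.5, False), (10.0, 15.0, False)]
table = cross_tabulate(example)
```

Under this reading, a patient whose baseline dose is already 7.5 mg or less meets the criterion trivially, so an actual analysis would probably need to restrict the definition to patients above the ceiling at baseline.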
Statistical Analysis Methods
Objective 1
This will involve comprehensive, multivariable re-analyses of the individual patient data from the completed relevant RCTs. The primary focus will be on the 'positive' trials that found a statistically significant and clinically important effect of the 'experimental treatment' compared to placebo + standard of care (SOC). For objective (1a), available data on individual components of the established instruments (SLEDAI, BILAG etc.) will be modeled as separate independent variables in a multivariable model, to assess their independent ability to discriminate between the active treatment group and the controls. This will also permit testing whether a revised (regression-based) re-weighting of individual items and/or combining items from different existing scales may improve responsiveness to change (i.e., disease activity reduction due to treatment with established effectiveness). If necessary, additional adjustments will be made for patient characteristics that show imbalance between the trial arms. Additional weighting, proportional to the reliability of individual items (based on the literature), will be considered. The 'optimal' re-weighting formula will be derived by generalized cross-validation, to avoid biases and account for shrinkage of regression coefficients in an independent sample. Stability of the optimal re-weighting formula will be assessed through bootstrap analyses, based on 1,000 independent re-samples of the original data. All multivariable analyses will use state-of-the-art statistical methods to deal with such problems as (i) missing data and attrition (multiple imputation and inverse probability of censoring weights), (ii) possibly non-linear relationships between some quantitative variables and the response, including potential thresholds (flexible fractional polynomials and regression spline models), or (iii) effect modification (interaction testing).
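The regression-based re-weighting idea in objective (1a) can be illustrated with a small sketch. It assumes each item is coded as 1 if it improved from baseline and 0 otherwise, and fits an unpenalized logistic regression of treatment arm on the items by plain gradient ascent. The data, the item names, and the fitting routine are hypothetical, and the sketch omits the cross-validation, shrinkage, bootstrap, and missing-data handling that the plan calls for.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_item_weights(rows, arms, item_names, steps=2000, learning_rate=0.5):
    """Logistic regression of arm (1 = active, 0 = control) on item-level changes,
    fitted by gradient ascent on the mean log-likelihood.
    Returns (intercept, {item name: coefficient})."""
    n, k = len(rows), len(item_names)
    intercept, coefs = 0.0, [0.0] * k
    for _ in range(steps):
        grad0, grads = 0.0, [0.0] * k
        for x, y in zip(rows, arms):
            p = sigmoid(intercept + sum(c * v for c, v in zip(coefs, x)))
            err = y - p  # score contribution of this patient
            grad0 += err
            for j in range(k):
                grads[j] += err * x[j]
        intercept += learning_rate * grad0 / n
        coefs = [c + learning_rate * g / n for c, g in zip(coefs, grads)]
    return intercept, dict(zip(item_names, coefs))


# Hypothetical data: item_a improves mostly in the active arm, item_b equally in both.
rows = [(1, 1), (1, 1), (1, 0), (1, 0), (0, 0), (0, 0), (0, 1), (0, 1)]
arms = [1, 1, 1, 0, 1, 0, 0, 0]
intercept, weights = fit_item_weights(rows, arms, ["item_a", "item_b"])
```

In this toy data set the fitted coefficient for item_a is large and the one for item_b is close to zero, which is the kind of contrast a data-driven re-weighting would exploit. With real trial data, a penalized or cross-validated fit from an established statistics package would be the more defensible choice.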
For objective (1b), the flexible methods described above will first be used to model a possibly non-linear relationship between steroid dose and SLE disease activity level (including measures developed for objective 1(a)), using data either from the relevant RCTs or from other studies (including observational studies), if available from the collaborators (details to be discussed later). The results will allow us to develop a scoring system that will 'map' a given reduction in steroid dose into an expected change in disease activity, so that the expected 'balance' of the resulting changes can be assessed using the comparable metric of changes in disease activity. Because of the uncertainty regarding the temporal relationship between changes in steroid dosage and (subsequent) changes in SLE disease activity, alternative metrics of recent steroid use will be considered, including lagged effects (with different clinically plausible latencies) and possibly cumulative effects. The methods outlined above will be used to compare the performance of the alternative metrics and assess the robustness of the resulting estimates and conclusions.
Objective 2
Alternative outcome measures developed within objective 1, as well as existing, currently used measures of SLE disease activity, will be used in the analyses of an independent RCT (other than the RCT whose data were used to develop the new outcome measures). The focus will be on the ability of alternative outcome measures to discriminate between patients treated with the drug with established effectiveness and controls (e.g., the 'usual care' group). Furthermore, the robustness of the conclusions and main results will be assessed in customized simulations, where additional variables, not recorded in the RCT but potentially relevant (based on clinical experts' opinions and the literature), will be simulated and their impact on the results will be explored.
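The alternative metrics of recent steroid use considered for objective (1b), namely lagged and cumulative exposure, could be derived from a dose record roughly as follows. This is a sketch under assumptions: the record is taken to be a list with one prednisone dose per day, and the lags, window length, and dose values are hypothetical choices for illustration only.

```python
def lagged_dose(daily_doses, day, lag_days):
    """Dose received lag_days before the given day, or None if that day
    falls outside the recorded period."""
    index = day - lag_days
    if 0 <= index < len(daily_doses):
        return daily_doses[index]
    return None


def cumulative_dose(daily_doses, day, window_days):
    """Total dose over the window_days days ending on the given day (inclusive)."""
    start = max(0, day - window_days + 1)
    return sum(daily_doses[start:day + 1])


# Hypothetical tapering record, one dose (mg) per day, day 0 first.
doses = [20.0, 20.0, 15.0, 15.0, 10.0, 10.0, 7.5, 7.5]
assessment_day = 7

# Candidate exposure metrics to compare as predictors of disease activity.
candidate_metrics = {
    "current": lagged_dose(doses, assessment_day, 0),
    "lag_2": lagged_dose(doses, assessment_day, 2),
    "lag_4": lagged_dose(doses, assessment_day, 4),
    "cumulative_4": cumulative_dose(doses, assessment_day, 4),
}
```

Each candidate metric would then enter the disease activity model in turn, so that the latency or window giving the best fit can be identified from the data.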
(A similar general simulation-based approach may be used, if judged relevant, to assess the potential impact of some limitations of the available data, e.g., measurement errors or missing data.) As far as possible, given the available data, further validations will involve comparing how alternative outcome measures correlate with patient- and/or physician-reported overall ratings of disease activity or its changes over time. If available, validations in some of the 'negative' trials (those that did not find a difference between the experimental treatment and usual care) will be attempted, to explore the 'specificity' of the alternative outcome measures.
Objective 3
This will involve assessing potential additive or multiplicative interactions with patient characteristics judged to act as clinically plausible effect modifiers. These analyses will first be conducted separately for each of the outcome measures found to be most 'discriminatory' (i.e., most responsive to change) in the analyses addressing objectives 1 and 2. Based on the results, we may identify sub-populations of patients for whom different outcome measures may be most responsive. Then, methods similar to those outlined for objective 2 will be used to validate the performance of such subpopulation-specific outcomes in independent trial(s). To ensure adequate control of the type I error rate, both customized bootstrap procedures that account for the joint complexity of the proposed multi-stage analyses and simulation-corrected criteria for 'statistical significance' will be employed.
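One simple way to see why a corrected significance criterion is needed when the most discriminatory of several candidate measures is selected is a permutation test on the maximum statistic. The sketch below is a stand-in illustration, not the customized bootstrap procedure the plan refers to. The standardized difference used as the test statistic, the function names, and the data are all assumptions.

```python
import random
import statistics


def standardized_difference(values, arms):
    """Difference in means (active minus control) divided by the overall SD."""
    active = [v for v, a in zip(values, arms) if a == 1]
    control = [v for v, a in zip(values, arms) if a == 0]
    return (statistics.mean(active) - statistics.mean(control)) / statistics.pstdev(values)


def max_statistic_p_value(measures, arms, n_permutations=2000, seed=0):
    """Permutation p-value for the largest absolute standardized difference across
    all candidate measures, which accounts for having picked the best one."""
    rng = random.Random(seed)
    observed = max(abs(standardized_difference(v, arms)) for v in measures.values())
    shuffled = list(arms)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between arm and outcome
        permuted = max(abs(standardized_difference(v, shuffled))
                       for v in measures.values())
        if permuted >= observed - 1e-12:
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)


# Hypothetical improvement scores for 10 active and 10 control patients.
arms = [1] * 10 + [0] * 10
measures = {
    "candidate_a": [6, 7, 8, 7, 6, 8, 7, 6, 8, 7, 2, 3, 1, 2, 3, 1, 2, 3, 2, 1],
    "candidate_b": [4, 5, 3, 6, 4, 5, 3, 6, 4, 5, 5, 4, 6, 3, 5, 4, 6, 3, 5, 4],
}
observed, p_value = max_statistic_p_value(measures, arms)
```

Because the reference distribution is built from the maximum over all candidates, adding more candidate measures makes the criterion stricter, which is the behavior wanted when the analysis selects the best-performing measure after looking at the data.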