We will replicate, as closely as possible, the original analytical plan of the TORCH trial (NEJM 2007;356:775-89). The principal modification is reclassification of the study cohort (GOLD-defined moderate-to-severe COPD) by spirometric Z-scores, calculated from reference equations published by the Global Lung Initiative [GLI] (Eur Respir J 2012;40(6):1324-43). We otherwise define the primary and secondary end points, as well as adverse events, as in the original TORCH study design. The TORCH study population included 6,112 participants with a mean age of 65 years.

The primary end point in the TORCH trial was the difference in time to death from any cause between the combination therapy group and the placebo group, using a log-rank test and expressed as a hazard ratio. The Cox proportional hazards model was a supportive secondary analysis. All reported data analyses were prespecified. Assuming a 17% mortality rate in the placebo group at 3 years, the TORCH trial estimated that 1510 patients would be needed for each study group to detect a reduction in mortality of 4.3 percentage points in the combination therapy group, as compared with the placebo group (hazard ratio for death, 0.728), at a two-sided alpha level of 0.05 with 90% power.
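
As a rough check on these design assumptions, the stated hazard ratio can be recovered from the two 3-year mortality proportions, assuming proportional hazards (so that the survival proportion in the treated group equals the placebo survival proportion raised to the power of the hazard ratio). This is an illustrative sketch only. The function name is hypothetical, and the sketch does not attempt to reproduce the figure of 1510 patients per group, which depends on design details not restated here.

```python
import math


def hazard_ratio_from_mortality(placebo_mortality, treated_mortality):
    """Hazard ratio implied by two cumulative mortality proportions,
    assuming proportional hazards: S_treated = S_placebo ** HR."""
    return math.log(1.0 - treated_mortality) / math.log(1.0 - placebo_mortality)


placebo = 0.17             # assumed 3-year mortality in the placebo group
treated = placebo - 0.043  # 4.3 percentage-point absolute reduction
hr = hazard_ratio_from_mortality(placebo, treated)
print(round(hr, 3))
```

Under this assumption the implied hazard ratio agrees with the reported value of 0.728 to within rounding.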

The secondary end points included frequency of exacerbations, health status (total score on the St. George's Respiratory Questionnaire), and postbronchodilator spirometry (FEV1, which will be expressed in our proposal as a Z-score [J Gerontol Med Sci 2016;71:929-934]). The frequency of exacerbations in the TORCH trial was analyzed using a generalized linear model (assuming a negative binomial distribution, which accounted for variability among participants in the number and frequency of exacerbations), with the number of exacerbations as the outcome and the logarithm of time during which treatment was received as an offset variable. Health status based on total scores on the St. George's Respiratory Questionnaire and the FEV1 were analyzed as changes from baseline values with the use of repeated measures analysis of covariance (ANCOVA). Estimated differences between treatment groups at each visit were averaged with equal weights to determine the overall treatment effect during the 3-year study period. All efficacy analyses were performed according to the intention-to-treat principle. Comparisons other than those between the combination regimen and placebo and between the combination regimen and salmeterol alone were exploratory.
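
The role of the offset in the exacerbation model can be illustrated with a small sketch of the mean structure only. With a log link and the logarithm of treatment time as the offset, the expected count is proportional to time on treatment, and the exponentiated treatment coefficient is a rate ratio. The coefficients below are hypothetical values chosen for illustration, not TORCH estimates, and the sketch does not fit a negative binomial model (the negative binomial assumption affects the variance, not this mean structure).

```python
import math


def expected_exacerbations(years_on_treatment, intercept, treatment_effect, treated):
    """Expected count under a log link with log(exposure time) as the offset:
    log(mu) = log(time) + intercept + treatment_effect * treated."""
    linear_predictor = (
        math.log(years_on_treatment) + intercept + treatment_effect * treated
    )
    return math.exp(linear_predictor)


# Hypothetical coefficients, for illustration only.
INTERCEPT = math.log(1.1)  # placebo rate of 1.1 exacerbations per year
EFFECT = math.log(0.75)    # rate ratio of 0.75 for the treated group

mu_placebo = expected_exacerbations(2.0, INTERCEPT, EFFECT, treated=0)
mu_treated = expected_exacerbations(2.0, INTERCEPT, EFFECT, treated=1)

# The ratio of expected counts equals exp(EFFECT) for any shared exposure time.
rate_ratio = mu_treated / mu_placebo
```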

The adverse-event analysis evaluated the time to first fracture, eye disorder, and pneumonia, which were compared among the study groups with the use of Kaplan-Meier estimates and the log-rank test. In the safety substudy group, bone mineral density for the total hip and lumbar spine was analyzed by repeated measures ANCOVA, and the development of cataracts was analyzed with the use of logistic regression.
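
For readers less familiar with time-to-first-event summaries, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit estimate. The function name and the toy data are hypothetical. The sketch does not implement the log-rank test and is not the TORCH analysis code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each participant
    events : 1 if the event (e.g. first pneumonia) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all participants who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy data (not from TORCH): four participants, one censored at time 2.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored participants leave the risk set without reducing the survival estimate, which is why the curve changes only at observed event times.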

The major difference in our analytical plan is the reclassification of GOLD-defined moderate-to-severe COPD by spirometric Z-scores. Originally, the TORCH study recruited patients with a pre-bronchodilator ratio of FEV1 to FVC equal to or less than 0.70 and an FEV1 of less than 60% of predicted. We plan to reclassify these participants based on spirometric Z-scores, as described below.

Using spirometric reference equations from the Global Lung Initiative (GLI) (Eur Respir J 2012;40(6):1324-43), which include variables for age, height, sex, and ethnicity, Z-scores are first calculated for FEV1, FVC, and FEV1/FVC. The diagnostic algorithm is then based on a single Z-score threshold of -1.64, which defines the lower limit of normal (LLN) as the 5th percentile of the distribution, and is applied as follows: normal spirometry is defined by FEV1/FVC and FVC both greater than or equal to LLN, COPD by FEV1/FVC less than LLN, and restrictive-pattern by FEV1/FVC greater than or equal to LLN but FVC less than LLN. The severity of COPD is further evaluated by the FEV1, as follows: FEV1 Z-scores greater than or equal to -1.64 as mild COPD; FEV1 Z-scores less than -1.64 as moderate-to-severe COPD. This diagnostic algorithm has been validated on the basis of associations with multiple clinical outcomes and with CT-measured emphysema and gas trapping (Am J Respir Crit Care Med 2015;192(7):817-25 and Am J Respir Crit Care Med 2016;193(7):727-35). Protocols regarding the GLI calculation of spirometric Z-scores can be found at:

http://www.lungfunction.org/ and www.ers-education.org/lungfunction
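
The diagnostic algorithm described above can be sketched as a short function. This is a minimal illustration that assumes the three Z-scores have already been computed from the GLI equations. The function and constant names are hypothetical, and the example Z-scores are invented.

```python
LLN_Z = -1.64  # lower limit of normal (5th percentile), as stated in the text


def classify_spirometry(fev1_z, fvc_z, ratio_z):
    """Classify one participant from GLI Z-scores for FEV1, FVC and FEV1/FVC."""
    if ratio_z < LLN_Z:
        # Airflow obstruction: COPD, graded by the FEV1 Z-score.
        if fev1_z >= LLN_Z:
            return "mild COPD"
        return "moderate-to-severe COPD"
    if fvc_z < LLN_Z:
        # FEV1/FVC at or above LLN but reduced FVC.
        return "restrictive-pattern"
    return "normal spirometry"


# Invented Z-scores for illustration (not participant data).
print(classify_spirometry(fev1_z=-2.5, fvc_z=-1.0, ratio_z=-2.2))
print(classify_spirometry(fev1_z=-2.0, fvc_z=-2.1, ratio_z=-0.5))
```

Values exactly equal to -1.64 are treated as at or above the LLN, matching the "greater than or equal to" wording of the algorithm.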

Based on the above methodology, we propose to first determine the frequency distribution of Z-score-defined spirometric subgroups, including normal spirometry, restrictive-pattern, and COPD (mild and moderate-to-severe). Thereafter, we plan to maintain the randomization scheme of the original trial so as to preserve the benefits of randomization. However, we also plan to create a covariate categorizing participants according to whether their status for GOLD-defined moderate-to-severe COPD had been misidentified (i.e., those with Z-score-defined normal spirometry, restrictive-pattern, or mild COPD may have been misidentified as GOLD-defined moderate-to-severe COPD). This additional control variable would be added to a multivariable Cox regression analysis. Because of randomization, this control variable is unlikely to be associated with the treatment variable, but it will probably be associated with the survival outcome. Adjusting for it should therefore increase the power of the analysis to detect a treatment effect on survival.
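
The first two steps, tabulating the Z-score-defined subgroups and deriving the misidentification covariate, can be sketched as follows. The category labels, function name, and toy data are hypothetical, and the 0/1 coding is one possible choice rather than a prespecified one. The Cox regression itself is not shown.

```python
from collections import Counter

MODERATE_TO_SEVERE = "moderate-to-severe COPD"


def misidentified_flag(z_score_category):
    """1 if a GOLD-defined moderate-to-severe participant is NOT
    moderate-to-severe by Z-score criteria, otherwise 0."""
    return 0 if z_score_category == MODERATE_TO_SEVERE else 1


# Invented Z-score categories for five participants (not TORCH data).
categories = [
    "moderate-to-severe COPD",
    "restrictive-pattern",
    "moderate-to-severe COPD",
    "mild COPD",
    "moderate-to-severe COPD",
]

frequency = Counter(categories)                       # subgroup distribution
flags = [misidentified_flag(c) for c in categories]   # covariate for the model
proportion_misidentified = sum(flags) / len(flags)
```

The resulting flag would enter the multivariable model alongside the randomized treatment indicator, leaving the original allocation untouched.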

Based on our prior work in aging populations (Am J Respir Crit Care Med 2015;192(7):817-25; J Gerontol Med Sci 2012;67:264-275), we postulate that at least 20% of the TORCH study cohort (enrolled as GOLD-defined moderate-to-severe COPD) is likely to be reclassified as Z-score-defined normal spirometry, restrictive-pattern, or mild COPD. In these participants, the response to combined salmeterol and fluticasone may include no treatment benefit, or even harm. We note, in particular, that the use of combined salmeterol and fluticasone in Z-score-defined spirometric restrictive-pattern may have no benefit and may substantially increase the risk of adverse effects. Our prior work in COPDGene has shown, for example, that Z-score-defined spirometric restrictive-pattern is not associated with CT-measured gas trapping or emphysema (i.e., these participants do not have COPD [Am J Respir Crit Care Med 2016;193(7):727-35]), and other work has additionally shown that spirometric restrictive-pattern is associated with adverse cardiovascular outcomes (NEJM 1976;294:1071-1075; Eur Respir J 2010;36:1002-1006; and Chest 2008;134:712-718).

A supplemental analysis will additionally redefine the TORCH cohort to consist of only those study participants who had moderate-to-severe COPD according to the spirometric Z-score criteria. This redefinition will lose the benefits of randomization. To attempt to remedy this loss, descriptive characteristics of the three treatment groups will be compared, and if differences are detected (i.e., with p-values less than 0.10), then these variables will be added as additional covariates to the Cox regression model. Steps will be taken to avoid overfitting the model. We recognize that, even after these steps, unmeasured variables may introduce residual confounding. A similar analytical approach will be used to explore the all-cause mortality outcome among the smaller group identified as having moderate-to-severe COPD by GOLD criteria but not by the spirometric Z-score criteria. While acknowledging its limitations, we expect that this stratified analytical approach will provide further insight as to whether the salmeterol plus fluticasone intervention would have proved more effective if the study cohort of moderate-to-severe COPD participants had been more accurately identified.
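
The covariate screening step can be sketched as a simple filter on between-group p-values. The p-values and variable names below are invented. The cap of one covariate per ten events is a common rule of thumb used here only as an assumed example of a guard against overfitting; the plan does not specify which steps will be taken.

```python
def select_covariates(p_values, n_events, threshold=0.10, events_per_variable=10):
    """Pick baseline variables that differ between groups (p < threshold),
    most imbalanced first, capped by an events-per-variable rule (assumed)."""
    candidates = sorted((p, name) for name, p in p_values.items() if p < threshold)
    max_covariates = n_events // events_per_variable
    return [name for _, name in candidates[:max_covariates]]


# Invented p-values from baseline comparisons (not TORCH results).
baseline_p_values = {"age": 0.03, "bmi": 0.20, "smoking": 0.08, "fev1_z": 0.001}

selected_small = select_covariates(baseline_p_values, n_events=25)
selected_large = select_covariates(baseline_p_values, n_events=500)
```

With few events the cap limits the adjustment set, while with many events every variable below the 0.10 threshold is retained.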