Risk of bias and reporting practices in studies comparing VO2max responses to sprint interval vs. continuous training: A systematic review and meta-analysis


Jacob T. Bonafiglia, Hashim Islam, Nicholas Preobrazenski, Brendon J. Gurd

School of Kinesiology and Health Studies, Queen’s University, Kingston, Ontario, K7L 3N6, Canada

Abstract

Background: It remains unclear whether studies comparing maximal oxygen uptake (VO2max) response to sprint interval training (SIT) vs. moderate-intensity continuous training (MICT) are associated with a high risk of bias and poor reporting quality. The purpose of this study was to evaluate the risk of bias and quality of reporting in studies comparing changes in VO2max between SIT and MICT.

Methods: We conducted a comprehensive literature search of 4 major databases: AMED, CINAHL, EMBASE, and MEDLINE. Studies were excluded if participants were not healthy adult humans or if training protocols were unsupervised, lasted less than 2 weeks, or utilized mixed exercise modalities. We used the Cochrane Collaboration tool and the CONSORT checklist for non-pharmacological trials to evaluate the risk of bias and reporting quality, respectively.

Results: Twenty-seven studies with 30 comparisons (3 studies included 2 SIT groups) were included in our meta-analysis (n = 360 SIT participants: body mass index (BMI) = 25.9 ± 3.7 kg/m2, baseline VO2max = 37.9 ± 8.0 mL/kg/min; n = 359 MICT participants: BMI = 25.5 ± 3.8 kg/m2, baseline VO2max = 38.3 ± 8.0 mL/kg/min; all mean ± SD). All studies had an unclear risk of bias and poor reporting quality.

Conclusion: Although we observed a lack of superiority between SIT and MICT for improving VO2max (weighted Hedges’ g = -0.004, 95% confidence interval (95%CI): -0.08 to 0.07), the overall unclear risk of bias calls the validity of this conclusion into question. Future studies using robust study designs are needed to interrogate the possibility that SIT and MICT result in similar changes in VO2max.

Keywords:Bias;Cardiorespiratory fitness;CONSORT;Moderate-intensity continuous training;Sprint interval training

There is a growing awareness of a“reproducibility crisis”in preclinical and clinical research1,2that has widespread societal and financial ramifications.1,3Many research groups1,4,5attribute poor reproducibility to shortcomings in key aspects of study design(e.g.,randomization,blinding,and outcome reporting),as inadequacies in these methodological areas compromise internal validity and produce biased results.6-12Importantly,quality of reporting is intimately linked with bias:clinical trials that do not report information on bias-mitigating methodologies(e.g.,allocation concealment)produce inflated effect sizes compared with trials with adequate reporting.6Therefore,interpreting the internal validity of original research requires the assessment of both methodological rigor and quality of reporting.13,14

Systematic reviews provide an opportunity to evaluate the overall methodological rigor and quality of reporting in studies investigating a given research question.The Cochrane Collaboration bias assessment tool13and the Consolidated Standards of Reporting Trials(CONSORT)checklist15are robust tools for assessing the risk of bias16and reporting quality,respectively.Recent reports found that many systematic reviews in sports and exercise medicine research either do not evaluate the risk of bias17or use inferior assessment tools.18Furthermore,although several reviews have highlighted poor reporting quality in sports medicine research,19-21we are unaware of a study that has systematically evaluated the quality of reporting in exercise medicine research.

A current hot topic in exercise medicine research is determining which mode of exercise training best improves maximal oxygen uptake(VO2max)22,23—a research question with important clinical implications considering the association between VO2maxand all-cause morbidity and mortality.24Given that a perceived lack of time is a commonly cited barrier to participating in regular,structured physical activity,25a large body of work has developed investigating the potency of sprint interval training(SIT)—time-efficient exercise involving repeated supramaximal bouts of exercise interspersed with brief periods of rest—to improve VO2max.A recent meta-analysis demonstrated that SIT elicits similar improvements in VO2maxcompared with traditional endurance training(herein referred to as moderate-intensity continuous training(MICT)).26However,neither this meta-analysis nor any recent meta-analysis examining the effects of SIT on VO2max26-28has evaluated the risk of bias or quality of reporting within studies,which limits our confidence in the conclusion that SIT and MICT lead to similar improvements in VO2max.Consistent with the sports medicine literature,19-21we speculate that there is a high or unclear risk of bias and poor reporting quality among studies comparing changes in VO2maxfollowing SIT and MICT.

The purpose of this systematic review was to test the hypothesis that studies comparing changes in VO2maxbetween SIT and MICT have a high or unclear risk of bias and poor quality of reporting.A secondary purpose was to determine whether bias impacts the overall treatment effect for VO2maxresponses to SIT vs.MICT.Specifically,we planned to perform 2 meta-analyses:one with all studies that met our inclusion criteria,and a second only including studies judged to have a low risk of bias.18The expectation was that this 2 meta-analysis approach would provide greater insight about our confidence in the conclusions derived from current and past meta-analyses.26For instance,differences in overall treatment effects between these 2 meta-analyses would indicate that a high or unclear risk of bias impacted the comparison of VO2maxresponses following SIT and MICT.18Systematic evaluations of methodological rigor and reporting quality are likely required for many topics in sports and exercise science,as these issues appear to be widespread.19-22We chose to evaluate studies comparing VO2maxresponses to SIT and MICT,as this topic is clinically relevant,24addresses potential barriers to completing regular physical activity,25and has a large number of studies that can be included in our analysis—as demonstrated by past systematic reviews.26-28Although our systematic review focuses on this specific topic,our discussion provides simple and feasible recommendations applicable to all areas of exercise medicine research.

The present systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses(PRISMA)checklist;29and a completed checklist can be found in the Supplementary material(Sheet 4).The study selection process was conducted using Covidence systematic review software(Veritas Health Innovation,Melbourne,Australia).

2.1.Eligibility criteria

Studies were included in the present systematic review if they met all of the following inclusion criteria:(1)used human adult participants between the ages of 18 and 65,(2)directly measured VO2max(or peak)using indirect calorimetry(i.e.,a metabolic cart),(3)reported VO2maxin relative units(mL/kg/min)or in absolute units(mL/min or L/min)with body mass(kg)so that relative VO2maxcould be manually calculated,(4)reported mean and standard deviation(SD)for changes in VO2max(post-training minus baseline)or VO2maxat baseline and post-training,or presented data in a manner that could be extracted using WebPlotDigitizer(WebPlotDigitizer,Pacifica,CA,USA),30(5)employed a SIT protocol that was“all-out”or supramaximal(e.g.,>100%VO2maxor maximal work rate)and interspersed with periods of rest or active recovery,(6)employed a MICT protocol that was continuous and submaximal(e.g.,<80% VO2maxor maximal work rate),and(7)conducted supervised training for a minimum of 2 weeks.Studies were excluded if they:(1)did not meet all of the inclusion criteria,(2)included non-healthy participants with a specific disease(e.g.,cancer,hypertension,type 2 diabetes,etc.);note that we did not consider obesity a disease,and we included studies with obese or overweight participants that were otherwise healthy,(3)included endurance-trained athletes;however,studies in strength-trained athletes were not excluded,(4)employed mixed training protocols(e.g.,MICT plus resistance training),or(5)were not in English,not an original research article,or presented previously published VO2maxdata.

2.2.Literature search and study selection

We conducted a comprehensive literature search in AMED, CINAHL, EMBASE, and MEDLINE on August 14, 2019, and an updated search on August 18, 2020. The searches included 3 main terms: SIT, MICT, and VO2max. Synonyms and related terms for each main term were combined with “OR” (see Supplementary material, Sheet 2, for a full list of search terms), and a final single search combined the 3 resulting lists with “AND”. Titles and abstracts were extracted from the database searches, and duplicates were automatically removed in Covidence (Covidence, Melbourne, VIC, Australia).
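The OR-within, AND-between structure of the search can be sketched as follows. This is a minimal illustration only: the synonym lists are hypothetical placeholders (the actual terms are in Supplementary material, Sheet 2), and `build_query` is a helper name introduced here, not part of any database interface.

```python
def build_query(term_lists):
    """Join synonyms with OR inside each list, then join the lists with AND."""
    groups = []
    for terms in term_lists:
        quoted = [f'"{term}"' for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical synonym lists for the 3 main terms (not the authors' actual terms).
sit_terms = ["sprint interval training", "SIT"]
mict_terms = ["moderate-intensity continuous training", "MICT"]
vo2max_terms = ["VO2max", "maximal oxygen uptake"]

query = build_query([sit_terms, mict_terms, vo2max_terms])
print(query)
```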

Study selection followed a 2-step process and was independently completed by 2 reviewers (JTB and NP). Both reviewers met in person to justify their decisions during the study selection process and to resolve any initial disagreements. Although a third reviewer (BJG) was available to settle any lasting disagreements, all initial disagreements were resolved during the in-person meetings. First, titles and abstracts were screened to identify studies that appeared to meet eligibility criteria. The 2 reviewers (JTB and NP) also screened relevant previously published systematic reviews23,26-28,31 in an attempt to identify eligible articles that were not retrieved from the initial literature search. Second, full texts were downloaded for articles that passed the title and abstract screening to determine their eligibility. During full-text screening, the 2 reviewers assigned a reason for each excluded study. The final analysis included studies that passed both levels of study selection.

2.3.Assessment of risk of bias and quality of reporting

We assessed the risk of bias using the 7 sources of bias and related information outlined in the Cochrane Collaboration Tool.13The risk of each source of bias was judged as“high”,“low”,or“unclear”.In brief,studies that reported an adequate methodology for protecting against a given source of bias(e.g.,blinding outcome assessors to protect against detection bias)were judged as having a“low”risk of bias.Conversely,studies that reported an inadequate methodology(e.g.,randomized participants based on birth month32)were judged as having a“high”risk of bias.Studies that did not report information regarding a given methodology were judged as having an“unclear”risk of bias except in cases of reporting bias where studies were judged as having a“high”risk of bias if they did not report publicly registering their trial or if they did not report their methods in a public database/registry.
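The judgement rule described above can be sketched as a small function. This is an illustrative reading of the paragraph rather than the reviewers' actual procedure: the function name, the domain labels, and the three-valued `adequate` argument are assumptions introduced here.

```python
from typing import Optional


def judge_risk(domain: str, adequate: Optional[bool]) -> str:
    """Return "low", "high", or "unclear" for one source of bias.

    adequate=True  -> an adequate method was reported
    adequate=False -> an inadequate method was reported
    adequate=None  -> nothing was reported for this domain
    """
    if adequate is True:
        return "low"
    if adequate is False:
        return "high"
    # Nothing reported: "unclear", except for reporting bias, where an
    # unreported registration/protocol was judged as "high".
    return "high" if domain == "reporting" else "unclear"


print(judge_risk("detection", True))    # blinded outcome assessors reported
print(judge_risk("selection", False))   # e.g., randomized by birth month
print(judge_risk("performance", None))  # no information reported
print(judge_risk("reporting", None))    # no public registration reported
```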

We assessed quality of reporting by completing the CONSORT checklist for non-pharmacological trials.33Each CONSORT item was rated as“yes”(reported)or“no”(not reported),and the elaboration and explanation document14was used to help determine the rating for each item.Two reviewers(JTB and NP)independently completed the risk of bias and quality of reporting assessments.Both reviewers met in person to justify their decisions during the assessment of risk of bias and reporting quality process to resolve any initial disagreements.Although a third reviewer(BJG)was available to settle any lasting disagreements,all initial disagreements were successfully resolved by JTB and NP.

2.4.Data extraction

Means and SDs for relative VO2max(mL/kg/min)were extracted either by recording values directly from tables/text or by using WebPlotDigitizer(Version 4.4;WebPlotDigitizer,Pacifica,CA,USA)—a data extraction approach with high inter-rater reliability and validity34—when VO2maxdata only appeared in figures.Mean changes in VO2maxwere either directly extracted from articles or calculated by subtracting the mean baseline value from the mean post-training value.We extracted relative VO2maxbecause many studies did not report VO2maxin absolute units35-44and because increasing relative VO2maxby~3.50 mL/kg/min confers an~8%-14% reduction in all-cause morbidity and mortality.45We also extracted summaries of training protocols and additional participant characteristic data,including self-reported physical activity classification(as reported in papers),age,height,body mass,and body mass index calculated using height and body mass data.For physical activity classification,we recorded the terminology(e.g.,“recreationally active”,“inactive”,etc.)that was reported in each study and any details about eligibility cut-offs for physical activity levels(when applicable).Two reviewers(JTB and NP)independently extracted data using a standardized sheet and compared results to verify that correct data were extracted.

2.5.Data synthesis

We calculated an effect size(Cohen’s d)for each study to compare changes in VO2maxbetween SIT and MICT using Eqs.(1)and(2):46,47

where delta (Δ) refers to mean changes in VO2max following SIT or MICT. SDpooled values were calculated using the SD of change scores (SDΔ) where possible (only 3 included studies reported SDΔ values48-50). For the remaining studies, we calculated SDΔ according to Chapter 16.1.3.2 of the Cochrane Handbook,51 using the reported SDs for baseline and post-training values and a correlation between repeated measures of r = 0.89 (calculated using VO2max data from 10 of our previously published training studies;52-61 n = 274 participants). Three studies49,62,75 included 2 SIT groups, and separate effect sizes were calculated for each group. Because effect sizes were calculated by subtracting the mean change in VO2max following MICT from the mean change in VO2max following SIT (Eq. (1)), positive effect sizes indicated a larger increase in VO2max after SIT whereas negative effect sizes indicated a larger increase in VO2max after MICT. As Cohen’s d is biased upward for sample sizes less than 20,63,64 the effect sizes for each study were corrected by converting Cohen’s d values to Hedges’ g using Eq. (3):65
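For reference, the conventional forms of Eqs. (1)-(3) that agree with the definitions in the text are sketched below. The subscripted sample sizes n_SIT and n_MICT are notation assumed here, and the pooled-SD and small-sample correction are the standard textbook expressions, not necessarily the exact typeset equations.

```latex
\begin{align}
d &= \frac{\Delta_{\mathrm{SIT}} - \Delta_{\mathrm{MICT}}}{SD_{\mathrm{pooled}}} \tag{1} \\
SD_{\mathrm{pooled}} &= \sqrt{\frac{(n_{\mathrm{SIT}}-1)\,SD_{\Delta,\mathrm{SIT}}^{2} + (n_{\mathrm{MICT}}-1)\,SD_{\Delta,\mathrm{MICT}}^{2}}{n_{\mathrm{SIT}}+n_{\mathrm{MICT}}-2}} \tag{2} \\
g &= d\left(1 - \frac{3}{4\,(n_{\mathrm{SIT}}+n_{\mathrm{MICT}}) - 9}\right) \tag{3}
\end{align}
```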

The precision of each Hedges’ g estimate was determined by calculating its standard error (SEg) using Eq. (4), such that 95% confidence intervals (95%CIs) could be constructed around each estimate (95%CI = g ± (1.96 × SEg)):63,64
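The per-study calculation can be sketched in a few lines. This is a minimal sketch under stated assumptions: the SDΔ imputation uses the Cochrane Handbook expression SDΔ = sqrt(SDpre² + SDpost² - 2·r·SDpre·SDpost), the standard error uses the common form SEg = sqrt((n1 + n2)/(n1·n2) + g²/(2(n1 + n2))), and all function names and input values are hypothetical.

```python
import math

R_REPEATED = 0.89  # correlation between repeated measures reported in the text


def sd_change(sd_pre, sd_post, r=R_REPEATED):
    """Impute the SD of change scores from baseline and post-training SDs."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)


def cohens_d(delta_sit, delta_mict, sd_sit, sd_mict, n_sit, n_mict):
    """Cohen's d as (SIT change minus MICT change) divided by the pooled SD."""
    pooled_var = ((n_sit - 1) * sd_sit**2 + (n_mict - 1) * sd_mict**2) / (
        n_sit + n_mict - 2
    )
    return (delta_sit - delta_mict) / math.sqrt(pooled_var)


def hedges_g(d, n_sit, n_mict):
    """Small-sample correction of Cohen's d."""
    return d * (1 - 3 / (4 * (n_sit + n_mict) - 9))


def se_g(g, n_sit, n_mict):
    """Standard error of Hedges' g (assumed conventional form)."""
    n_total = n_sit + n_mict
    return math.sqrt(n_total / (n_sit * n_mict) + g**2 / (2 * n_total))


def ci95(g, se):
    """95% confidence interval: g +/- 1.96 * SE."""
    return (g - 1.96 * se, g + 1.96 * se)


# Hypothetical study: 10 participants per group, values in mL/kg/min.
sd_sit = sd_change(sd_pre=7.5, sd_post=8.0)
sd_mict = sd_change(sd_pre=8.0, sd_post=8.2)
d = cohens_d(4.6, 4.4, sd_sit, sd_mict, 10, 10)
g = hedges_g(d, 10, 10)
print(g, ci95(g, se_g(g, 10, 10)))
```

A positive g in this sketch indicates a larger increase after SIT, matching the sign convention described in the text.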

To determine whether baseline fitness impacts the comparison of VO2max responses to SIT vs. MICT, we dichotomously grouped studies using an arbitrary threshold of 35 mL/kg/min for baseline VO2max (calculated as the average of the SIT and MICT groups). Effect sizes were pooled across all studies within each of these 2 groups, and across all studies combined to determine an overall effect, by calculating a weighted average Hedges’ g and its corresponding SEg and 95%CI using Eqs. (5-7),64 where IVW refers to the inverse variance weight and SEg* refers to the standard error of the weighted average effect size:
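Inverse-variance pooling can be sketched as follows, assuming the conventional fixed-effect forms IVW = 1/SEg², weighted g = Σ(IVW × g)/ΣIVW, and SEg* = sqrt(1/ΣIVW). The function name and the example inputs are hypothetical.

```python
import math


def pool_effects(gs, ses):
    """Inverse-variance weighted average of Hedges' g values.

    Returns the weighted mean, its standard error, and the 95%CI.
    """
    weights = [1 / se**2 for se in ses]          # IVW for each comparison
    total = sum(weights)
    g_bar = sum(w * g for w, g in zip(weights, gs)) / total
    se_bar = math.sqrt(1 / total)                # SE of the weighted average
    ci = (g_bar - 1.96 * se_bar, g_bar + 1.96 * se_bar)
    return g_bar, se_bar, ci


# Hypothetical comparisons: (Hedges' g, SEg)
gs = [0.30, -0.10, 0.05]
ses = [0.40, 0.35, 0.45]
g_bar, se_bar, ci = pool_effects(gs, ses)
print(g_bar, se_bar, ci)
```

More precise comparisons (smaller SEg) receive larger weights, which is what the %weight column in a forest plot reflects.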

We also performed a linear regression to determine whether baseline VO2maxpredicted the Hedges’g values.To further investigate the impact of sex,we completed 2 additional meta-analyses using male or female participants only.Hedges’g effect sizes were classified as small(0.2),medium(0.5),and large(0.8),as per Cohen’s conventions.46A publicly available spreadsheet66was used to calculate an I2statistic in order to quantify the degree of inconsistency(i.e.,heterogeneity)of the overall meta-analysis.67The degree of inconsistency was considered“low”,“moderate”,or“high”if the I2statistic was 25%,50%,or 75%,respectively.67Egger’s tests are commonly used to detect possible publication bias:the suppression of null or adverse findings in meta-analyses of controlled trials(e.g.,equivalency or superiority of placebo).68We did not investigate the presence of publication bias because we compared 2 experimental conditions(MICT vs.SIT)rather than comparing the efficacy of experimental conditions against a control.Additionally,we believe that heterogeneity in MICT/SIT intensities,frequencies,and durations69,70confounds the ability to interpret Egger’s test results as evidence of publication bias.
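For orientation, the I2 statistic can be computed from the same inputs as the pooled effect. The sketch below assumes the standard definition based on Cochran's Q, I2 = max(0, (Q - df)/Q) × 100; it does not reproduce the spreadsheet used in the analysis, and the function name and inputs are hypothetical.

```python
def i_squared(gs, ses):
    """I2 (%) from effect sizes and their standard errors via Cochran's Q."""
    weights = [1 / se**2 for se in ses]
    g_bar = sum(w * g for w, g in zip(weights, gs)) / sum(weights)
    q = sum(w * (g - g_bar) ** 2 for w, g in zip(weights, gs))
    df = len(gs) - 1
    if q <= 0:
        return 0.0  # identical effects: no observed inconsistency
    return max(0.0, (q - df) / q) * 100


# Hypothetical comparisons with visibly different effects.
print(i_squared([0.6, -0.4, 0.1], [0.2, 0.2, 0.25]))
```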

2.6.Sensitivity analysis:Does bias impact our meta-analysis?

As recommended by Büttner and colleagues,18 we planned to perform a sensitivity analysis to determine whether bias impacted our meta-analysis. In brief, a second meta-analysis including only studies identified as having a low risk of bias would be compared to the primary meta-analysis. A difference in the overall estimated effects between these 2 meta-analyses could suggest that biased results impacted the primary meta-analysis. However, as described below, this sensitivity analysis could not be performed because every study included in our meta-analysis was judged to have an unclear risk of bias. We therefore discuss each source of bias included in the Cochrane Collaboration tool and outline recommendations for future work instead.
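The planned comparison can be sketched as a filter followed by a second pooling step. The study records, field names, and functions below are hypothetical; the sketch only illustrates why the analysis is undefined when no study is judged to have a low risk of bias.

```python
def pooled_g(studies):
    """Inverse-variance weighted average of Hedges' g."""
    weights = [1 / s["se"] ** 2 for s in studies]
    return sum(w * s["g"] for w, s in zip(weights, studies)) / sum(weights)


def sensitivity_analysis(studies):
    """Compare the primary pooled effect with the low-risk-only pooled effect.

    Returns None when no study has an overall low risk of bias.
    """
    low_risk = [s for s in studies if s["risk"] == "low"]
    if not low_risk:
        return None
    return {"primary": pooled_g(studies), "low_risk_only": pooled_g(low_risk)}


# Hypothetical records in which every study has an unclear overall risk.
studies = [
    {"g": 0.30, "se": 0.40, "risk": "unclear"},
    {"g": -0.10, "se": 0.35, "risk": "unclear"},
]
print(sensitivity_analysis(studies))  # None: the comparison cannot be made
```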

3.1.Study selection

Fig.1 presents a flow diagram of the study selection process.The literature search retrieved 3859 articles,and Covidence removed 1198 duplicates.Of the 2661 articles that entered title and abstract screening,2505 of these articles were deemed irrelevant and so were subsequently excluded.Full texts were then downloaded for 156 articles;132 articles were excluded as they did not meet eligibility criteria.Included were 24 articles from the literature search and 3 additional articles35,71,72identified from previously published systematic reviews.Therefore,27 articles were part of the final analysis.Participant characteristics and physical activity classifications are presented in Table 1.Several studies did not report information about physical activity eligibility cut-offs,and no study objectively measured physical activity levels(Table 1).
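The counts in the selection flow can be checked with simple arithmetic. The sketch below uses only the numbers reported in this paragraph; the function and key names are introduced here for illustration.

```python
def selection_flow():
    """Recompute each stage of the study selection from the reported counts."""
    retrieved = 3859
    duplicates = 1198
    screened = retrieved - duplicates                  # title/abstract screening
    excluded_at_screening = 2505
    full_text = screened - excluded_at_screening       # full texts downloaded
    excluded_at_full_text = 132
    from_search = full_text - excluded_at_full_text
    from_previous_reviews = 3
    return {
        "screened": screened,
        "full_text": full_text,
        "from_search": from_search,
        "included": from_search + from_previous_reviews,
    }


flow = selection_flow()
print(flow)
```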

Fig.1.Flow diagram of the study selection process and number of comparisons included in the meta-analysis.MICT=moderate-intensity continuous training;SIT=sprint interval training;VO2max=maximal oxygen uptake.

3.2.Risk of bias assessment

Table 2 presents the risk of bias in the 27 included studies.In general,we observed an unclear risk of bias among studies comparing changes in VO2maxbetween SIT and MICT.All 27 studies did not report methods related to adequate allocation concealment,participant blinding,or a priori identification of a primary outcome(s),and therefore had an unclear risk of selection and performance bias and a high risk of outcome reporting bias.(Note that the inability to blind participants is an inherent limitation associated with exercise training studies.73)Two studies had a high risk of“other bias”as 1 study38imputed group means for missing data—an approach that risks reducing variability and artificially increasing the probability of detecting significance74—and the other43did not randomize group allocation.Overall,every study included in our meta-analysis was judged to have an unclear risk of bias,and we therefore could not complete a sensitivity analysis to determine whether bias impacted our overall treatment effect.

3.3.Quality of reporting assessment

Table 3 contains the evaluation for select CONSORT items related to key aspects of study design(e.g.,randomization procedures,blinding,outcome reporting,and sample size calculations),and the Supplementary material(Sheet 3)presents the evaluation,including all CONSORT items.The high number of“✘”symbols(indicating no reporting or inadequate reporting)in Table 3 highlights an overall poor quality of reporting:209“✘”symbols vs.51“✓”symbols.Additionally,no study adequately reported CONSORT Items 9 and 10,which relate to allocation concealment and randomization implementation procedures,respectively.

Table 1. Characteristics of studies comparing changes in VO2max following SIT and MICT included in the meta-analysis.


3.4.Data synthesis

Three49,62,75 of the 27 studies contained 2 SIT interventions that were both compared to MICT. Therefore, 30 comparisons were included in the meta-analysis with a total sample size of 360 SIT and 359 MICT participants. Fig. 2 presents changes in VO2max, effect sizes (Hedges’ g), sample sizes, and percent contribution toward weighted effect size (%weight) for each comparison. Fig. 2 is sorted by mean baseline VO2max: less (Fig. 2A) or greater (Fig. 2B) than 35 mL/kg/min (see Table 1 for mean baseline values). In both baseline fitness groups, most 95%CIs crossed zero (11/14 in Fig. 2A; 10/16 in Fig. 2B). Baseline VO2max did not appear to influence Hedges’ g values, as the 95%CI for both weighted effect sizes crossed zero (Fig. 2), and the linear regression was not significant (r2 = 0.07, r = -0.28, p = 0.12). Fig. 2 also presents the overall weighted effect size when we pooled across all studies regardless of baseline VO2max (blue diamond). The mean overall changes in VO2max were +4.6 mL/kg/min following SIT and +4.4 mL/kg/min following MICT, and the weighted average Hedges’ g was 0.06 (SE = 0.08) with a 95%CI crossing zero (-0.08 to 0.07). Collectively, these results highlight a lack of superiority between SIT and MICT for improving VO2max regardless of baseline VO2max. Additionally, the I2 statistic (72%) indicated substantial inconsistency of effect sizes across comparisons.

Fig. 3 presents forest plots separated by sex, which include a smaller subset of studies because few studies included only males35,36,42-44,50,59,62,71,76-79 or only females,39,72,75,80 and only one study reported sex-specific VO2max data.52 The 95%CIs for the overall Hedges’ g values did not cross zero for either sex (Fig. 3). These meta-analyses reveal a possible sex-specific response: females appear to respond more favorably to SIT while males respond more favorably to MICT. However, this interpretation should be made with caution as these meta-analyses were completed with a smaller subset of studies and because both weighted effect sizes were small. Overall Hedges’ g values were 0.55 (favoring SIT) for females and -0.32 (favoring MICT) for males.

The novel finding of our systematic review was that studies comparing changes in relative VO2max between SIT and MICT, including several of our own,52,59,62 had an overall unclear risk of bias and poor quality of reporting. Our meta-analysis revealed that SIT and MICT similarly improve VO2max (Hedges’ g = 0.05, 95%CI: -0.10 to 0.20), and this finding is consistent with the previous meta-analysis by Gist et al.26 (Cohen’s d = 0.04, 95%CI: -0.17 to 0.24). However, the overall unclear risk of bias warrants cautious interpretation of these meta-analyses. Specifically, if the presence of bias produced inaccurate effect sizes for each study included in these meta-analyses, then the overall effect size and associated interpretation may also be inaccurate. It is also likely that the substantial inconsistency in these meta-analyses (Fig. 2 and the meta-analysis by Gist and colleagues26) is attributable to differences in the frequency, intensity, and duration of MICT and SIT protocols across studies. We refer the reader to recent reviews that discuss this issue in greater detail.69,70 Our uncertainty about whether bias-protecting methodologies were implemented emphasizes the importance of transparent and full reporting as outlined in the CONSORT guidelines.15 Although many journals endorse the CONSORT guidelines in an attempt to improve quality of reporting,81 the success of this approach depends on editors, peer reviewers, and authors ensuring that submitted manuscripts adhere to these guidelines. Collectively, our findings highlight several major concerns with studies comparing VO2max responses between SIT and MICT and support the need for rigorous risk of bias and reporting quality assessments in future systematic reviews of exercise medicine research.16,18

Table 2. Risk of bias in studies comparing changes in VO2max between SIT and MICT.

The poor quality of reporting(Table 3)meant we had to assign an“unclear”risk of bias for most studies(Table 2)as it is possible that studies protected against sources of bias but failed to report doing so.Devereaux et al.82contacted authors of large clinical randomized controlled trials(RCTs)and found that many authors claimed to have performed bias-mitigating methodologies despite not reporting this information in their publications.Although this finding supports the idea that a lack of reporting does not necessarily reflect a lack of methodological rigor in large clinical RCTs,we are unaware of similar evidence in applied exercise science research.In contrast,strong meta-epidemiological evidence demonstrates that studies failing to report measures taken to mitigate bias(e.g.,allocation concealment)produce inflated/biased effect sizes.6This meta-epidemiological evidence supports a“guilty until proven innocent”approach whereby one should not assume a study protected against biased results unless the bias-mitigating methodologies were explicitly reported.83However,additional empirical data supporting this approach is lacking as the majority of meta-epidemiological analyses do not separate studies with poor reporting from those that report performing an inadequate methodology.7-9,84To determine whether a lack of reporting per se is associated with biased results,future work should compare effect sizes from studies with poor reporting vs studies that report performing an inadequate methodology.

Table 3. Checklist of select CONSORT items to assess quality of reporting in studies comparing changes in VO2max between SIT and MICT.

Unfortunately, we could not perform a sensitivity analysis to determine the influence of bias on our meta-analysis,18 as no study was judged to have a low risk of bias. If we could have performed this sensitivity analysis, a different result (e.g., an overall 95%CI lying fully on one side of 0, indicating superiority of either MICT or SIT) might have suggested that biased results impacted our meta-analysis (Fig. 2). The impact of bias was demonstrated by Pildal et al.,9 who found that approximately two-thirds of meta-analyses reporting an overall treatment effect lost this effect when only including studies that reported an adequate allocation concealment method. This finding supports the recent recommendations by Büttner and colleagues18 that, when applicable, future meta-analyses should conduct a sensitivity analysis to determine whether overall effects change when only including studies with a low risk of bias.

We provide a discussion below that describes each source of bias covered in the Cochrane Collaboration tool and makes recommendations to improve the overall methodological rigor of future studies comparing clinical outcomes between SIT and MICT.We also discuss sample size calculations as this aspect of study design was largely overlooked in the studies included in our meta-analysis(CONSORT Item 7A;Table 3).

4.1.Selection bias

Selection bias can occur when investigators assign participants to a given intervention group non-randomly.85Protecting against selection bias requires generating an unpredictable random allocation sequence and concealing this sequence from all investigators involved in enrolling participants:a process referred to as“allocation concealment”.86-88It is unclear whether or not studies included in our meta-analysis overlooked protecting against selection bias(Table 2)because most studies failed to report methodologies related to random sequence generation(CONSORT Item 8A),allocation concealment(CONSORT Item 9),and the implementation of these randomization procedures(CONSORT Item 10)(Table 3).It is imperative that future studies clearly report these procedures to allow researchers to easily assess the adequacy of these methods.Because there are many methods that fail to conceal allocation(e.g.,sealed envelopes or randomizing via birth month),32,89,90clear and transparent reporting is the only definitive way to demonstrate that a given study adequately protected against selection bias.For an example of clear reporting of adequate methods for reducing the risk of selection bias,we refer the reader to our recent study,in which we utilized a third party to generate a random allocation sequence using Microsoft Excel and conceal this sequence until group assignment.91

Fig. 2. Forest plot of the meta-analysis comparing changes in relative maximal oxygen uptake (VO2max) following sprint interval training (SIT) and moderate-intensity continuous training (MICT) separated by baseline VO2max: (A) <35 mL/kg/min; (B) >35 mL/kg/min; (C) the overall effect with all studies included. Because effect sizes were calculated as (SIT minus MICT) divided by pooled standard deviation, negative values reflect larger changes in VO2max following MICT whereas positive values reflect larger changes following SIT. The red diamonds represent the overall weighted effect size (Hedges’ g) for each baseline fitness group, and the blue diamond represents the overall weighted effect size including all studies. The horizontal points of the diamonds represent the upper and lower bounds of the 95% confidence intervals (95%CIs). The meta-analysis revealed a low-moderate degree of inconsistency (I2 = 38%). Overall changes in VO2max are presented as averages, and overall numbers of participants are presented as total sums. Comparison of 1/2 SIT (a) or “all-out” SIT (b) vs. MICT (see Table 1 for details).

4.2.Performance bias

Performance bias can occur when participants and/or personnel administering the interventions are not blinded to participants’group assignments.13Protecting against performance bias requires blinding participants as well as personnel to participants’group assignments.13,14All 27 studies included in our meta-analysis had an unclear risk of performance bias(Table 2),and this finding likely reflects the difficulty of blinding participants and personnel in exercise training studies.73,92In situations where participants and/or personnel cannot be blinded,the CONSORT statement for non-pharmacological trials33recommends that researchers report any attempts to limit performance bias(CONSORT Item 11C).However,only 2 studies38,50reported an attempt to reduce the possible impact of performance bias(Supplementary material,Sheet 3).Future studies comparing changes in VO2maxbetween SIT and MICT should report attempts to reduce the risk of performance bias,such as concealing the study’s hypothesis to participants and/or personnel.14,93

Fig. 3. Forest plot of the subset of studies that reported sex-specific data or included only female or male participants. The forest plot depicts the meta-analysis comparing changes in relative maximal oxygen uptake (VO2max) following sprint interval training (SIT) and moderate-intensity continuous training (MICT) separated by sex: (A) females and (B) males. Because effect sizes were calculated as (SIT minus MICT) divided by pooled standard deviation, negative values reflect larger changes in VO2max following MICT whereas positive values reflect larger changes following SIT. The red diamonds represent the overall weighted effect size (Hedges’ g) for each sex, and the horizontal points of the diamonds represent the upper and lower bounds of the 95% confidence intervals (95%CIs). Overall changes in VO2max are presented as averages, and overall numbers of participants are presented as total sums. Comparison of 1/2 SIT (a) or “all-out” SIT (b) vs. MICT (see Table 1 for details).

4.3.Detection bias

Detection bias, also known as observer or ascertainment bias,14 can occur when investigators responsible for assessing outcomes (herein referred to as “outcome assessors”) are aware of participants’ group assignments.94 Protecting against detection bias requires that outcome assessors are blinded to participants’ group assignments. The majority of studies included in our meta-analysis (25/27) had an unclear risk of detection bias (Table 2) as few studies reported whether outcome assessors were blinded (CONSORT Item 11A) (Table 3). Given this lack of clarity, we highlight 2 possible methods for blinding VO2max outcome assessors that could be reported in original manuscripts (if applicable). First, given evidence that encouragement affects obtained VO2max values,95 the individual administering the VO2max test could be blinded to prevent them from providing unequal encouragement (e.g., more encouragement for SIT participants to align with the assessor’s belief that SIT is superior). Second, despite the objective nature of obtaining VO2max values (e.g., highest 30-s average), VO2max data files can be coded to prevent outcome assessors from manipulating or fabricating data. For an example of clear reporting of methods for blinding outcome assessors, we refer the reader to our recent study, in which we utilized a third party to code samples and data files.91

4.4. Attrition bias

Attrition bias can occur when participants are lost to follow-up in a non-random fashion between groups (i.e., more dropouts in one group compared to another). [96] When dropout rates are high and/or systematically different between groups, protecting against attrition bias can involve adopting an intention-to-treat (ITT) analysis, whereby imputation methods generate data for dropouts so that all randomized participants are included in statistical analyses. [14] The majority of studies included in our meta-analysis (20/27) had low rates of dropout and thus had a low risk of attrition bias (Table 2). Of the 6 studies with an unclear risk of attrition bias, 4 studies [43,77,97,98] did not report the number of dropouts (CONSORT Item 13A), reasons for dropout (CONSORT Item 13B), or the number of participants analyzed (CONSORT Item 16; Supplementary material); 2 studies [49,78] did not adopt an ITT analysis despite having dropout rates exceeding 20%, a rate that may introduce bias. [99] Future studies should protect against attrition bias by reporting the number of dropouts and the reasons for them, and by considering an ITT analysis if dropout rates are high.
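The dropout reporting recommended above can be sketched as a simple per-group calculation. The 20% threshold follows the rate mentioned in the text; the function names and the participant counts are hypothetical.

```python
def dropout_rate(randomized, completed):
    """Fraction of randomized participants lost to follow-up."""
    return (randomized - completed) / randomized

def flag_attrition(groups, threshold=0.20):
    """groups maps a group name to (number randomized, number completed)."""
    rates = {name: dropout_rate(r, c) for name, (r, c) in groups.items()}
    # Flag the trial if any group exceeds the threshold.
    high = any(rate > threshold for rate in rates.values())
    return {"rates": rates, "high": high}

# Hypothetical trial: 15 randomized per group; 11 SIT and 14 MICT completers.
report = flag_attrition({"SIT": (15, 11), "MICT": (15, 14)})
for name, rate in report["rates"].items():
    print(f"{name}: {rate:.1%} dropout")
print("Consider ITT analysis:", report["high"])
```

Reporting the rate for each group separately also makes it visible when dropout differs systematically between groups.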

4.5. Reporting bias

Reporting bias occurs when authors selectively report the results for certain outcomes and withhold the results for others. [11] Protecting against reporting bias requires that researchers report a study’s methods, including a list of primary and secondary outcomes, in a public registry or protocol publication before starting data collection. [100] All 27 studies included in our meta-analysis had a high risk of reporting bias, as the majority of studies (26/27) did not report their methodologies in a public registry. Although Kiviniemi et al. [76] registered their protocol (CONSORT Item 23; Supplementary material), it was unclear whether or not these authors selectively reported outcomes, as a list of primary and secondary outcomes was not included in their registration file (clinicaltrials.gov: NCT01344928). Collectively, these findings highlight an overall high risk of reporting bias, which emphasizes the need for future work to report the primary outcome(s) a priori in a public registry (see Refs. 101-103 for examples of public registries).

4.6. Sample size calculations

Small sample sizes risk generating a type II error (i.e., a false negative). [104] An a priori sample size calculation estimates the sample size needed to statistically detect an expected effect at a pre-determined level of statistical power (i.e., probability of not making a type II error; typically 80%). [105] Performing a priori sample size calculations and subsequently ensuring enrollment reaches the indicated sample size only helps reduce the risk of type II errors if calculations are completed accurately. Sample size calculations should utilize equations and assumptions that match the planned statistical analysis [106,107] and should be based on either a clinically meaningful change or an expected effect size and variance derived from previous studies using similar designs, populations, and methods for outcome assessment. [14] Assessing the accuracy of a given sample size calculation requires that researchers report and justify the associated statistical parameters, which include the desired statistical power/type II error risk, alpha level/type I error risk, and the expected effect size and variance. [14] Failing to perform an accurate a priori sample size calculation precludes researchers’ ability to determine whether a reported non-significant result reflects a true finding or a type II error. [104]
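To make the role of each statistical parameter concrete, the following sketch estimates the per-group sample size for a two-group comparison of means. It assumes a two-sided test, equal group sizes, and a normal approximation, which slightly underestimates the sample size given by t-based methods; the appropriate equation for any real trial depends on its planned analysis, and the effect sizes used here are illustrative only.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size: expected standardized mean difference between groups
    alpha:       two-sided type I error risk
    power:       1 minus the type II error risk
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

print(n_per_group(0.5))  # 63 per group for a moderate expected effect
print(n_per_group(0.2))  # 393 per group for a small expected effect
```

The contrast between the two calls shows why the expected effect size must be reported and justified: a smaller assumed difference between SIT and MICT increases the required enrollment several-fold.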

The majority of studies (21/27) included in our meta-analysis either failed to report or inadequately reported whether or not they performed an a priori sample size calculation (CONSORT Item 6A) (Table 3). Thus, it is unclear whether or not these studies may have made a type II error when concluding that SIT and MICT are equally effective at improving VO2max. In theory, future studies could perform an a priori calculation to determine the sample size needed to detect a significant difference between SIT and MICT. However, because our overall effect size indicated a lack of superiority between SIT and MICT (Fig. 2), future studies could consider conducting a non-inferiority trial to determine whether SIT and MICT are equally beneficial at improving VO2max (see Refs. 108,109 for information on non-inferiority trials).
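The logic of a non-inferiority comparison can be sketched as a check of a confidence interval against a pre-specified margin. The margin of 0.2 below is purely hypothetical; in practice it would need to be justified a priori. The interval is the pooled 95%CI reported in the abstract and is used only to illustrate the arithmetic, since a pooled estimate from studies with an unclear risk of bias cannot substitute for a properly designed trial.

```python
def noninferiority_check(ci_lower, ci_upper, margin):
    """Compare a CI for (SIT minus MICT) against a symmetric margin."""
    return {
        # SIT is non-inferior if the whole CI lies above -margin.
        "non_inferior": ci_lower > -margin,
        # Equivalence additionally requires the CI to lie below +margin.
        "equivalent": ci_lower > -margin and ci_upper < margin,
    }

# 95%CI for Hedges' g from the abstract; margin is a hypothetical choice.
result = noninferiority_check(-0.08, 0.07, margin=0.2)
print(result)
```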

4.7. Limitations

Our systematic review focused on VO2max responses to SIT and MICT, and it is therefore unclear whether our findings are generalizable to other areas of exercise medicine research. A popular topic in studies involving endurance athletes is determining the potency of supplementing habitual exercise training with SIT or high-intensity interval training. [110-114] Although García-Pinillos et al. [115] recently evaluated the methodological rigor of studies using high-intensity interval training to augment training load for endurance runners, these authors used inferior assessment tools (i.e., the PEDro Scale and Downs and Black Quality Index) [18] and did not assess reporting quality. This is one of many examples of topics in exercise science research that warrant a systematic evaluation of methodological rigor and reporting quality using the Cochrane Collaboration tool and CONSORT checklist, respectively.

Our systematic review and meta-analysis found an unclear risk of bias owing to poor reporting quality in studies comparing changes in VO2max between SIT and MICT. Given these apparent methodological issues, future studies are encouraged to implement bias-reducing methodologies, as outlined in the Cochrane Collaboration tool, and to follow the reporting recommendations outlined in the CONSORT checklist for non-pharmacological trials. Furthermore, future systematic reviews in exercise medicine research should evaluate and (if possible) account for the risk of bias and reporting quality when synthesizing results in a meta-analysis. Although we focused on studies examining changes in VO2max following SIT and MICT in humans, the methodological and reporting principles highlighted in this review are applicable to all disciplines within exercise and sports medicine research.

Acknowledgments

This project was supported by an operating grant from the Natural Sciences and Engineering Research Council of Canada (NSERC; grant number: 402635) to BJG. JTB was supported by an NSERC Vanier Canada Graduate Scholarship, HI was supported by an NSERC PGS-D, and NP was supported by an NSERC CGS-M.

Authors’ contributions

JTB and NP conducted the literature review. All authors contributed to the study conception and design and to the writing of the first draft, and commented on previous versions of the manuscript. All authors have read and approved the final version of the manuscript, and agree with the order of presentation of the authors.

Competing interests

The authors declare that they have no competing interests.

Supplementary materials

Supplementary materials associated with this article can be found in the online version at doi:10.1016/j.jshs.2021.03.005.
