## Introduction

Critical power (CP) delimits the boundary between the heavy- and severe-intensity domains and represents the highest exercise intensity at which some physiological and metabolic responses achieve a steady state (Black et al., 2017). During exercise in the severe domain, oxygen consumption (VO_{2}) kinetics present a slow component, which increases the O_{2} cost of exercise and drives VO_{2} to its maximum value (VO_{2max}) before exhaustion (Jones et al., 2011). Conversely, in the extreme intensity domain (i.e., supra-severe exercise), although the VO_{2} response seems to be faster, exhaustion precedes the attainment of VO_{2max} (Burnley and Jones, 2007). Therefore, the maximal intensity at which VO_{2max} can be achieved before exhaustion (I_{HIGH}) indicates the boundary between the severe- and extreme-intensity domains. However, the mechanisms that define exercise tolerance between these domains remain uncertain.

CP and the finite work capacity above CP (*W'*) can be derived from mathematical models based on the power-duration relationship within the severe intensity domain (Hill, 1993; Poole et al., 2016). While the hyperbolic model (CP_{hyp}) provides CP from the asymptote of the hyperbola, and the curvature constant denotes *W'*, these variables may also be obtained by a linear relationship between work × time (CP_{linear}) or power × inverse of time (CP_{1/time}) (Hill, 1993; Monod and Scherrer, 1965; Moritani et al., 1981). These models allow for the prediction of time-to-exhaustion (Tlim) at intensities above CP, which theoretically would correspond to the moment of *W'* depletion (Chidnok et al., 2013).
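For reference, the three two-parameter formulations mentioned above are commonly written as follows. The symbols P (power output), t (time to exhaustion), and W = P·t (total work done) are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
\text{Hyperbolic (CP}_{hyp}\text{):}\quad & t = \frac{W'}{P - \mathrm{CP}} \\
\text{Linear work-time (CP}_{linear}\text{):}\quad & W = \mathrm{CP}\cdot t + W' \\
\text{Linear power-1/time (CP}_{1/time}\text{):}\quad & P = W'\cdot\frac{1}{t} + \mathrm{CP}
\end{aligned}
```

The three expressions are algebraic rearrangements of one another, so they differ only in which variable carries the fitting error when more than two trials are used.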

Exercise tolerance in the severe intensity domain is compromised by the magnitude of the slow component of VO_{2} (VO_{2SC}) kinetics, which would be explained by the inefficiency of muscle fibers in maintaining exercise intensity and the resulting increase in O_{2} cost, leading VO_{2} to VO_{2max} (Burnley and Jones, 2007; Jones et al., 2011). Murgatroyd et al. (2011) found a positive correlation between *W'* and the magnitude of VO_{2SC} (r^{2} = 0.76). Therefore, in the severe intensity domain, *W'* depletion coincides with the attainment of VO_{2max}, leading to exercise interruption (Murgatroyd et al., 2011). However, in the extreme intensity domain, VO_{2SC} is not pronounced and VO_{2max} is not reached, because exhaustion precedes VO_{2max} attainment (Burnley and Jones, 2007). Thus, *W'* is not completely depleted in the extreme domain, and the exercise tolerance prediction could be impaired (Alexander et al., 2019). Nevertheless, the prediction of exercise tolerance at intensities spanning the transition from the severe to the extreme intensity domain has not been fully investigated. Therefore, it is not known whether there is an exact point at which the prediction of exercise tolerance begins to fail.

In the extreme intensity domain, Alexander et al. (2019) reported a higher slope of the linear relationship between power and the inverse of time (1/time), culminating in a lower *W'* in the extreme compared to the severe domain in knee extension exercise. In that study, exercise tolerance in the extreme domain was overestimated by the variables of the severe-domain power-time relationship (CP and *W'*), except at the work rate considered the “transition point” between the severe and the extreme intensity domains. However, those findings are restricted to knee extension exercise. In addition, the “transition point” between the severe and extreme domains was estimated as bouts with time to exhaustion shorter than 2 min (Alexander et al., 2019). Therefore, a valid measure to discriminate between the severe and extreme intensity domains (i.e., VO_{2max} attainment) could be insightful for assessing the accuracy of exercise tolerance prediction at the transition point between these domains and could support understanding of the physiological determinants of whole-body exercise tolerance around these domains.

The ability of CP and *W'* to predict exercise tolerance at intensities within the severe domain has been verified (Chidnok et al., 2013; Jones et al., 2008; Morgan et al., 2019; Nimmerichter et al., 2020). However, there is a gap in the literature about the prediction of exercise tolerance at intensities close to the “transition point” between severe and extreme intensity domains (Alexander et al., 2019; Caputo and Denadai, 2008; Charkhi Sahl Abad et al., 2021). Thus, the main aim of this study was to assess the prediction of cycling exercise tolerance in the boundary between severe and extreme intensity domains by different CP models. The main hypothesis was that CP models would accurately predict exercise tolerance in the severe intensity domain, but this predictability could be reduced in the extreme intensity domain.

## Methods

*Participants*

Nineteen male subjects (mean ± standard deviation [SD]; age: 23.0 ± 2.7 years; body mass: 77.8 ± 6.2 kg; body height: 175.3 ± 5.3 cm; and peak oxygen uptake [VO_{2peak}]: 49.4 ± 5.6 mL∙kg^{−1}∙min^{−1}) classified as recreationally trained cyclists (De Pauw et al., 2013) participated in the study and gave their written informed consent. The study was approved by the Institutional Ethics Committee for Research on Human Subjects from the State University of Santa Catarina and was performed according to the Declaration of Helsinki. Participants were instructed to continue normal daily activities during the study period.

*Procedures*

All participants visited the laboratory at least four times on different days for testing (Figure 1). On the first day, participants performed an incremental test to determine the peak power output (PPO) and VO_{2peak}. During their second, third, and fourth visits, in random order and on separate days, participants performed constant work rate Tlim tests (95%, 100%, and 110% of PPO) to determine CP and *W'*. Before the CP predictive trials, and separated from them by a 1-h passive rest interval, participants completed two to three Tlim tests on separate days to determine the maximal intensity at which VO_{2max} can be achieved (I_{HIGH}) as well as the work rate 5% above it (I_{HIGH+5%}), as previously published (Turnes et al., 2016). A 60-min passive rest has been shown to be sufficient to allow full recovery of *W'* and minimize any potential priming effect (Muniz-Pumares et al., 2019). All exercise tests were preceded by a standardized 10-min warm-up and a 5-min passive rest as described elsewhere (Turnes et al., 2016). All tests were separated by ≥24 h, completed within 14 days, and performed at the same time of day to minimize the effects of diurnal biological variation on results. Participants were also asked to refrain from consuming caffeine and to arrive at the laboratory at least 2 h after their last meal before each trial.

*Materials*

All exercise tests were conducted using an electronically braked cycle ergometer (Lode Excalibur Sport, Groningen, The Netherlands). During all tests, the pulmonary gas exchange was measured breath-by-breath using an automated open-circuit gas analysis system (Quark PFT, COSMED, Rome, Italy). Before each test, the gas analyzer was calibrated using ambient air and gases containing 16% oxygen and 5% carbon dioxide. The turbine flow meter used for the determination of minute ventilation was calibrated with a 3-L calibration syringe (COSMED, Rome, Italy). The heart rate was also monitored throughout the tests (Polar, Kempele, Finland).

*Incremental Test*

The initial power output for the incremental test was set at 0.5 W∙kg^{−1} for 3 min and then increased by 0.5 W∙kg^{−1} every 3 min until voluntary exhaustion (Caputo and Denadai, 2008; Moseley and Jeukendrup, 2001). Participants were instructed to maintain their preferred cadence between 70 and 90 rotations per minute (rpm) for as long as possible. PPO was defined as the power output attained at exhaustion if the test was terminated at the end of a 3-min stage. If the test was terminated before the last stage had finished, PPO was calculated as the power of the previous stage plus the power increment multiplied by the duration of exercise in the final stage divided by 180 s (Kuipers et al., 1985). VO_{2peak} of the incremental test was defined as the highest average VO_{2} over a 15-s period (Robergs et al., 2010).
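The interpolation rule for PPO described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the example rider (78 kg, exhaustion 90 s into the stage after six completed stages) are hypothetical.

```python
def peak_power_output(last_completed_stage_w, increment_w,
                      seconds_in_final_stage, stage_duration_s=180):
    """PPO = power of the last completed stage
             + increment * (time in the unfinished stage / stage duration)."""
    if not 0 <= seconds_in_final_stage <= stage_duration_s:
        raise ValueError("time in the final stage must lie within one stage")
    fraction = seconds_in_final_stage / stage_duration_s
    return last_completed_stage_w + increment_w * fraction


# Hypothetical rider: 78 kg, so each 0.5 W/kg step equals 39 W.
body_mass_kg = 78.0
increment_w = 0.5 * body_mass_kg
ppo = peak_power_output(
    last_completed_stage_w=6 * increment_w,  # six full 3-min stages
    increment_w=increment_w,
    seconds_in_final_stage=90,               # stopped halfway through stage 7
)
print(f"PPO = {ppo:.1f} W")
```

When the final stage is completed (180 s), the expression reduces to the power of that stage, which matches the first definition given in the text.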

*Critical Power Protocol*

For the determination of CP and *W'*, three constant work rate tests in random order were performed. Power outputs for these trials were equivalent to 95, 100, and 110% of PPO, estimated to produce a Tlim between 3 and 9 min (Caputo and Denadai, 2008). VO_{2peak} during the constant work rate trials was defined for each test as the highest average VO_{2} over a 15-s period (Robergs et al., 2010). Time-to-exhaustion was recorded to the nearest second. Using three distinct two-parameter models, four combinations of CP and *W'* were estimated as follows:

- CP_{linear}: from the linear time-work model using three predictive trials (95, 100, and 110% of PPO) (Hill, 1993).
- CP_{linear(95,110)}: from the linear time-work model using two predictive trials (95 and 110% of PPO) (Hill, 1993).
- CP_{1/time}: from the linear power × inverse of time model using three predictive trials (95, 100, and 110% of PPO) (Hill, 1993).
- CP_{hyp}: from the hyperbolic 2-parameter model with three predictive trials (95, 100, and 110% of PPO) (Hill, 1993).

The prediction of the CP_{1/time(95,110)} model was omitted from further analysis because it provided identical values to CP_{linear(95,110)}.
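The two linear fits can be sketched with ordinary least squares, as below. This is an illustrative sketch rather than the authors' analysis code: the function names and the example powers and times are hypothetical, and the hyperbolic fit is left out because it requires nonlinear regression. The example also shows why the two-trial versions of the linear and 1/time models coincide: with two points, each line passes exactly through both, and the two equations are rearrangements of the same relationship.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_linear_work_time(power_w, tlim_s):
    """Work = CP * time + W': the slope is CP (W), the intercept is W' (J)."""
    work_j = [p * t for p, t in zip(power_w, tlim_s)]
    cp, w_prime = least_squares_line(tlim_s, work_j)
    return cp, w_prime


def fit_power_inverse_time(power_w, tlim_s):
    """Power = W' * (1/time) + CP: the slope is W' (J), the intercept is CP (W)."""
    inverse_time = [1.0 / t for t in tlim_s]
    w_prime, cp = least_squares_line(inverse_time, power_w)
    return cp, w_prime


# Hypothetical trials at 95, 100, and 110% of PPO (illustrative values only).
power_w = [260.0, 274.0, 301.0]
tlim_s = [424.0, 310.0, 223.0]

# Three-trial fits: the two models generally give slightly different estimates.
print(fit_linear_work_time(power_w, tlim_s))
print(fit_power_inverse_time(power_w, tlim_s))

# Two-trial fits (95 and 110% only): both models give the same estimates.
two_p, two_t = [power_w[0], power_w[2]], [tlim_s[0], tlim_s[2]]
cp2_lin, wp2_lin = fit_linear_work_time(two_p, two_t)
cp2_inv, wp2_inv = fit_power_inverse_time(two_p, two_t)
print(cp2_lin, cp2_inv)
```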

Exercise tolerance of I_{HIGH} and I_{HIGH+5%} was predicted by all CP and *W'* models employing the CP_{hyp} equation.
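Given a CP and *W'* pair from any model, the predicted Tlim follows from the hyperbolic relation t = W' / (P − CP). The sketch below assumes hypothetical parameter values; only the 344 W work rate is taken from the group mean reported for I_{HIGH}, and the function name is illustrative.

```python
def predict_tlim(power_w, cp_w, w_prime_j):
    """Predicted time to exhaustion (s) from t = W' / (P - CP)."""
    if power_w <= cp_w:
        raise ValueError("the relation only applies to work rates above CP")
    return w_prime_j / (power_w - cp_w)


# Hypothetical CP and W' estimates for one participant.
cp_w = 215.0
w_prime_j = 19000.0

predicted_s = predict_tlim(344.0, cp_w, w_prime_j)
print(f"Predicted Tlim: {predicted_s:.0f} s")
```

Because the denominator is the margin above CP, small errors in CP translate into proportionally larger errors in predicted Tlim as the work rate approaches CP, and errors in *W'* scale the prediction directly.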

*The Boundary between Severe and Extreme Exercise Intensities*

All participants performed two to three Tlim tests to determine I_{HIGH} (severe domain) and I_{HIGH+5%} (extreme domain), beginning at 125% of PPO (Turnes et al., 2016). When VO_{2max} was reached or maintained during the first Tlim test, subsequent constant work rate Tlim tests at a 5% higher work rate were performed on separate days until VO_{2max} could not be reached. Conversely, when VO_{2max} could not be reached or maintained during the first Tlim test, further Tlim tests were conducted at a 5% lower work rate. I_{HIGH} was defined for each participant as the highest power output at which the highest 15-s VO_{2} average (determined from rolling averages of 5-s samples) was equal to or higher than VO_{2max} (the average of the highest VO_{2peak} values from the incremental and CP predictive trials) minus one intraindividual standard deviation (SD) (4.0% ± 1.4%), i.e., the SD of each participant's VO_{2peak} values from the incremental and CP predictive trials (Turnes et al., 2016). I_{HIGH} was individually determined and considered the last intensity of the severe domain, while I_{HIGH+5%} was the first intensity of the extreme domain.
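The VO_{2max} attainment criterion can be sketched as below. This is a hedged illustration: it assumes the 15-s average is a window of three consecutive 5-s samples and that the intraindividual SD is the sample SD of the participant's reference VO_{2peak} values. The function names and all data values are hypothetical.

```python
from statistics import mean, stdev


def highest_15s_vo2(vo2_5s):
    """Highest rolling 15-s average from consecutive 5-s VO2 samples."""
    if len(vo2_5s) < 3:
        raise ValueError("need at least three 5-s samples")
    return max(mean(vo2_5s[i:i + 3]) for i in range(len(vo2_5s) - 2))


def attained_vo2max(vo2_5s, reference_vo2peaks):
    """True when the highest 15-s VO2 >= VO2max minus one intraindividual SD."""
    vo2max = mean(reference_vo2peaks)   # average of the reference VO2peak values
    sd = stdev(reference_vo2peaks)      # assumption: sample SD
    return highest_15s_vo2(vo2_5s) >= vo2max - sd


# Hypothetical VO2peak values (L/min) from the incremental and three CP trials.
reference = [3.70, 3.80, 3.75, 3.65]

# Hypothetical 5-s VO2 samples (L/min) near the end of two Tlim tests.
trial_125 = [3.20, 3.50, 3.65, 3.70, 3.72]   # 125% of PPO
trial_130 = [3.10, 3.40, 3.50, 3.55, 3.50]   # 130% of PPO

reached_125 = attained_vo2max(trial_125, reference)
reached_130 = attained_vo2max(trial_130, reference)
print(reached_125, reached_130)
```

In this hypothetical case, the 125% trial would be taken as I_{HIGH} and the 130% trial as I_{HIGH+5%}, because the criterion is met at the lower work rate only.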

*Statistical Analysis*

The descriptive statistics are presented as means ± SD and statistical variables as mean point estimates with 95% confidence intervals (95% CI). Agreement between actual and predicted Tlim at I_{HIGH} and I_{HIGH+5%} was assessed by Bland-Altman analyses with bias and 95% limits of agreement (LoA) (Bland and Altman, 1999). Linear regression was performed to verify homoscedasticity (constant dispersion of differences across the range of averages) or heteroscedasticity (increase or decrease in dispersion as the averages increase) between the actual and predicted measures on the Bland-Altman plots (Ludbrook, 2010). Additionally, ANOVA for repeated measures was used to compare CP and *W'* among the models and actual and predicted Tlim at I_{HIGH} and I_{HIGH+5%}. When a significant ANOVA main effect was observed, Tukey's post hoc tests were applied between CP and *W'* estimates, while comparisons between actual and predicted Tlim at I_{HIGH} and I_{HIGH+5%} were made using Dunnett's test. We conducted a sensitivity analysis in G*Power (version 3.1.9.7, Düsseldorf, Germany) to determine the smallest effect that one could have detected with high probability given n = 19, *p* < 0.05, and statistical power = 95%. In the current study, we obtained ANOVA F-values of 2.8 for CP and *W'*, 2.5 for Tlim, and 3.3 for VO_{2max}. Statistical analyses were performed with the software GraphPad Prism 8.1.2 (GraphPad Software, La Jolla, CA, USA). The statistical significance level was established at *p* < 0.05.
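The agreement analysis can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: differences are taken as predicted minus actual (so a positive bias means overestimation), the LoA use the conventional bias ± 1.96 SD, and the regression of differences on pairwise means is one possible reading of the homoscedasticity check described above. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(actual, predicted):
    """Return (bias, lower LoA, upper LoA) for predicted minus actual."""
    diffs = [p - a for a, p in zip(actual, predicted)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def difference_trend_slope(actual, predicted):
    """Least-squares slope of the differences regressed on the pairwise means.

    A slope near zero suggests the error does not change with task duration.
    """
    means = [(a + p) / 2 for a, p in zip(actual, predicted)]
    diffs = [p - a for a, p in zip(actual, predicted)]
    mean_x, mean_y = mean(means), mean(diffs)
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, diffs))
    return sxy / sxx


# Hypothetical actual and predicted Tlim values (s) for four participants.
actual_s = [100.0, 110.0, 120.0, 130.0]
predicted_s = [102.0, 108.0, 125.0, 133.0]

bias, lower, upper = bland_altman(actual_s, predicted_s)
slope = difference_trend_slope(actual_s, predicted_s)
print(f"bias = {bias:.1f} s, LoA = {lower:.1f} to {upper:.1f} s, slope = {slope:.3f}")
```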

## Results

The PPO of the incremental test was 274 ± 35 W and the Tlim of CP predictive trials was 424 ± 48, 310 ± 37, and 223 ± 23 s for 95%, 100%, and 110% PPO, respectively. VO_{2max} (i.e., the average VO_{2peak} of incremental and CP predictive trials: 3.75 ± 0.41 L∙min^{−1}) was not significantly different from VO_{2peak} of I_{HIGH} (3.72 ± 0.46 L∙min^{−1}), but significantly higher than VO_{2peak} of I_{HIGH+5%} (3.50 ± 0.41 L∙min^{−1}; F_{(1.7, 31)} = 38.0; *p* < 0.0001). For I_{HIGH+5%} determination, no participants attained VO_{2max} during the test.

There were no significant differences among the models for CP (F_{(1.3, 24)} = 3.0; *p* = 0.088) or W' (F_{(1.3, 23)} = 3.8; *p* = 0.052, Table 1).

##### Table 1

The mean power output of I_{HIGH} and I_{HIGH+5%} was 344 ± 52 and 371 ± 53 W, respectively. Comparisons between actual and predicted Tlim at I_{HIGH} and I_{HIGH+5%} by CP and W' models showed no significant ANOVA main effects for I_{HIGH} (F_{(1.4, 26)} = 2.1; *p* = 0.157), but a significant main effect was found for I_{HIGH+5%} (F_{(1.7, 31)} = 4.6; *p* = 0.023). Pairwise comparisons demonstrated a significant difference between actual and predicted Tlim at I_{HIGH+5%} by the CP_{linear(95,110)} model (*p* = 0.030), with no significant differences for the CP_{linear} (*p* = 0.268), CP_{1/time} (*p* = 0.072), and CP_{hyp} (*p* = 0.512) models (Table 2).

##### Table 2

Bland-Altman plots for I_{HIGH} and I_{HIGH+5%} are presented in Figure 2. The bias ± 95% LoA in raw and percent units are presented in Table 2. Moderate heteroscedasticity was observed only for I_{HIGH+5%} in all models (Figure 2).

## Discussion

This study aimed to evaluate the prediction of cycling exercise tolerance at the boundary between the severe and extreme intensity domains by different power-duration models. In agreement with the main hypothesis, all models predicted Tlim within the severe domain (i.e., I_{HIGH}). Conversely, in partial disagreement with the hypothesis, exercise tolerance within the extreme intensity domain (i.e., I_{HIGH+5%}) was not statistically different from that predicted by CP and *W'* estimates derived from three predictive trials. However, the heteroscedasticity observed at I_{HIGH+5%} indicated that the prediction of exercise tolerance was impaired when increasing exercise intensity towards the extreme domain. Furthermore, since the model based on only two predictive trials yielded predictions that differed significantly from actual exercise tolerance in the extreme intensity domain, caution is required when utilizing this model, especially for short exercise durations.

Critical power estimated from three predictive trials was able to predict exercise tolerance at the beginning of the extreme domain (i.e., I_{HIGH+5%}), which partially refutes the main hypothesis that distinct physiological mechanisms could explain exhaustion in the severe and extreme intensity domains. This could theoretically be explained by the fact that the transitions between intensity domains are not exact points (i.e., thresholds), but a phase of modifications (Pethick et al., 2020). Alexander et al. (2019), during knee extension exercise, suggested that factors additional to those observed in the severe domain could explain exhaustion in the extreme domain. Those authors observed that the tolerance of exercise bouts in the extreme intensity domain lasting ~55, ~37, and ~27 s was overestimated by CP and *W'* derived from severe-domain work rates. In addition, they reported a *W'* specific to the extreme domain (1.7 ± 0.4 kJ), which was lower than the *W'* of the severe domain (5.9 ± 1.5 kJ). However, they did not report differences between predicted and actual Tlim at the work rate that would demarcate the transition from the severe to the extreme intensity domain (i.e., 60% 1 RM, average Tlim 85 s).

It is likely that exercise tolerance prediction at work rates near the boundary between the severe and extreme intensity domains is still sensitive to variables estimated in the severe domain. Unlike CP, which demarcates a threshold between the heavy and severe intensity domains and presents distinct metabolic and neuromuscular responses at work rates slightly below and above it, I_{HIGH} does not seem to indicate a threshold with substantial differences in these responses (Iannetta et al., 2022). Nevertheless, the increased heteroscedasticity observed herein suggests this predictability can be impacted at work rates in the upper zones of the extreme domain, as observed by Alexander et al. (2019).

The findings demonstrated relative precision of CP and *W'* models in predicting exercise tolerance in the severe intensity domain, with a bias between −0.7% and −4.8% for I_{HIGH}, which appears to be better than the 5.7% to 9.4% mean bias between models for predicting 5-km running time-trial performance (Nimmerichter et al., 2017). However, these values were similar to the mean bias of 2.9% and 1.3% of the best individual fit of CP for predicting 16.1-km (Morgan et al., 2018) and 20-min (Nimmerichter et al., 2020) cycling time trials, respectively. Interestingly, the CP_{linear(95,110)} model underestimated the actual exercise tolerance for I_{HIGH} by only −0.7%, with a lower bias than models with three predictive trials (Morgan et al., 2018; Nimmerichter et al., 2017, 2020). However, the prediction of exercise tolerance presented high individual variability (i.e., LoA), ranging from ±15.9% to ±22.9%, which is substantially wider than the LoA of ±4.6% to ±6.7% reported by Nimmerichter et al. (2020).

In contrast, in the extreme intensity domain, the concordance analysis showed that, on average, the models overestimated exercise tolerance at I_{HIGH+5%}, with an average bias between 2.4% and 6.6%. Furthermore, LoA values for I_{HIGH+5%} ranged from ±16.8% to ±21.7%. Therefore, despite a significant difference between actual and predicted exercise tolerance at I_{HIGH+5%} only for the CP_{linear(95,110)} model, the findings indicate that CP and *W'* models tended to underestimate exercise tolerance at the upper intensity of the severe domain and overestimate Tlim at the lower intensity of the extreme domain, with high interindividual variability in both.

Some factors may contribute to the high LoA values found herein in the severe intensity domain compared to previous reports (Morgan et al., 2018; Nimmerichter et al., 2020). First, the type of performance test used to determine CP may matter, since the time-trial test has a lower variation in test-retest reliability (Laursen et al., 2007) and a lower SEE for both CP and *W'* (Karsten et al., 2017) compared to the Tlim test, which seems to result in greater precision in estimating these variables and, consequently, better accuracy of performance prediction. Second, the number of predictive trials used to determine CP and *W'* should be considered. SEE tends to decrease with the use of more predictive trials, depending on their duration (Matunara et al., 2018), which also leads to better precision in estimating these variables due to a better mathematical modeling fit. Third, the training status of participants should be taken into account. Studies carried out with trained cyclists (Morgan et al., 2018; Nimmerichter et al., 2020) and runners (Nimmerichter et al., 2017) have described a low variability between predicted and actual performance values. This was not found in the present study, which suggests that familiarity with maximal effort tests influenced the performance prediction reported by others (Morgan et al., 2018; Nimmerichter et al., 2017, 2020). It is possible that these factors were also determinants of the high LoA values found in the present study.

The linear regression analyses in the Bland-Altman plots (i.e., homoscedasticity or heteroscedasticity verification) for I_{HIGH} and I_{HIGH+5%} (Figures 2a and 2b) provide interesting information on the factors that may be involved in exercise tolerance at intensities above CP. While the agreement between actual and predicted Tlim for I_{HIGH} demonstrated homoscedasticity, moderate heteroscedasticity was found in the agreement between actual and predicted Tlim for I_{HIGH+5%}. The homoscedasticity found at I_{HIGH} indicates that, even in participants for whom CP models overestimated the actual Tlim, the differences did not tend to increase as the task duration increased. On the other hand, at the intensity 5% above (i.e., I_{HIGH+5%}), which represents the first intensity of the extreme domain in this study, the linear regression indicated a significant influence of task duration on the prediction error of CP models: participants with higher Tlim also had the most overestimated predictions, which may denote an influence of the predominance of anaerobic metabolism on the prediction of exercise tolerance. As observed by Alexander et al. (2019) in single-joint exercise, there was a tendency for the CP and *W'* models to overestimate Tlim in the extreme domain of cycling. This corroborates the hypothesis that different physiological mechanisms would be involved in exercise exhaustion and, therefore, reinforces the existence of a supra-severe intensity domain. However, there seems to be a zone of intensities at which full depletion of *W'* can still be used to predict exercise tolerance in the extreme domain, raising the question of which factors are involved in determining this variable.

The present study is not free of limitations. The duration of predictive trials to determine CP and *W'* was relatively short (424 ± 48, 310 ± 37, and 223 ± 23 s for 95%, 100%, and 110% of PPO, respectively), which could affect the estimation of these variables. However, these times are included in the range recommended between 3 and 12 min by Muniz-Pumares et al. (2019) and utilized by others (Caputo and Denadai, 2008; Turnes et al., 2016). Furthermore, although the SEE of the power-duration relationship was acceptable for CP (~4.5%), it was relatively high for *W'* (14%), which is higher than the recommended 10% (Hill, 1993; Muniz-Pumares et al., 2019) and can influence the predictions made here. Although three predictive trials are sufficient to estimate CP and *W'* for 2-parameter models (Caputo and Denadai, 2008; Hill, 1993), the inclusion of more points in work-time modeling could reduce SEE. The small sample size is also a relevant limitation that should be recognized, especially for Bland-Altman and homoscedasticity/heteroscedasticity analyses.
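One common way to express the SEE of CP and *W'* as a percentage is to divide the standard error of each fitted parameter by its estimate. The sketch below does this for the linear work-time model; it is an assumption that this matches the calculation behind the values quoted above, and the function name and trial data are hypothetical. With three trials the fit has only one residual degree of freedom, which is part of why adding points can reduce SEE.

```python
from math import sqrt


def work_time_fit_with_see(power_w, tlim_s):
    """Fit Work = CP * time + W' and return estimates with percentage SEE."""
    n = len(tlim_s)
    if n < 3:
        raise ValueError("at least three trials are needed for a standard error")
    work_j = [p * t for p, t in zip(power_w, tlim_s)]
    mean_t = sum(tlim_s) / n
    mean_w = sum(work_j) / n
    stt = sum((t - mean_t) ** 2 for t in tlim_s)
    cp = sum((t - mean_t) * (w - mean_w) for t, w in zip(tlim_s, work_j)) / stt
    w_prime = mean_w - cp * mean_t
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    sse = sum((w - (cp * t + w_prime)) ** 2 for t, w in zip(tlim_s, work_j))
    residual_var = sse / (n - 2)
    se_cp = sqrt(residual_var / stt)
    se_w_prime = sqrt(residual_var * (1.0 / n + mean_t ** 2 / stt))
    return {
        "cp_w": cp,
        "w_prime_j": w_prime,
        "see_cp_pct": 100.0 * se_cp / cp,
        "see_w_prime_pct": 100.0 * se_w_prime / w_prime,
    }


# Hypothetical trials at 95, 100, and 110% of PPO (illustrative values only).
fit = work_time_fit_with_see([260.0, 274.0, 301.0], [424.0, 310.0, 223.0])
print(fit)
```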

## Conclusions

In conclusion, exercise tolerance at intensities near the boundary between severe and extreme intensity domains can be predicted by CP and *W'* derived from three predictive trials. However, heteroscedasticity analyses and the disagreement observed between the actual and the predicted exercise tolerance when increasing exercise intensity demonstrate that CP predictive ability is reduced at higher work rates of the extreme intensity domain.