Introduction

Muscle hypertrophy is defined as an increase in the cross-sectional area of a muscle due to an increase in muscle protein synthesis and contractile tissue [1, 2]. It is a multifaceted phenomenon founded on mechanical stimulation, as well as on metabolic and endocrine processes that have been shown to impact gene transcription via different signaling pathways [2]. Mechanical loading, in particular, leads to a number of intracellular actions that ultimately regulate gene expression and protein synthesis via mTORC1 pathway activation [3,4,5]. Long-term resistance training (RT) therefore represents an effective strategy to promote muscle hypertrophy in both men and women of different ages [6,7,8]. Variables such as exercise intensity, exercise frequency, rest periods and training volume can be manipulated in order to maximize the magnitude of the effect on muscle hypertrophy and strength [3]. Recent studies have shown that muscle hypertrophy is also associated with strength gains not only in young and middle-aged adults, but also in older men and women [9]. This is of particular importance since declines in muscle mass and strength are observed with aging [10, 11]. The age-related loss of muscle mass and strength can lead to physical disability and frailty [12, 13] and overall is associated with an increased risk of falls [14,15,16]. Multiple studies have found a link between low levels of muscle mass and low functional capacity [17, 18]. In addition, since muscle is a very metabolically active tissue, metabolic disorders associated with aging, such as diabetes, osteoporosis or a decrease in testosterone and growth hormone levels, may frequently occur [19, 20]. Therefore, muscle mass loss represents a significant problem for older adults.

Muscle mass loss is also associated with menopause, since the cessation of menstruation is accompanied by a physiological hormonal change [21]. In particular, a decline in estrogen concentration has detrimental effects on skeletal muscle mass and functionality, leading to reduced bone mass density, redistribution of fat to the visceral area and increased risk of cardiovascular events [21, 22]. Notably, post-menopausal women with reduced skeletal muscle mass have a 2.1 times higher risk of falling and a 2.7 times greater risk of sustaining a fracture compared to women with preserved muscle mass [23].

RT can be used as a potential method of offsetting decline in muscle mass and strength, as improvements in muscle mass have been detected in postmenopausal, middle-aged and older women after RT [24,25,26,27,28].

To our knowledge, despite abundant evidence with regards to variance in response to RT in men and women, there are not enough original investigations able to provide specific guidelines for post-menopausal and elderly women in order to optimize maximal muscle gains. Thus, the main objective of this study is to review the existing literature to identify and analyze current evidence with regards to RT protocols aiming to induce muscle hypertrophy in the post-menopausal and elderly population.

Methods

The manuscript followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [29].

Search strategy and study selection

The databases PubMed (NLM), Web of Science (TS) and Scopus were used to perform a comprehensive search for relevant articles published between January 1, 2000 and November 11, 2020. The search strategy included terms in the search field “title” and/or “topic” and “abstract” of each database. The final searches were then executed using the appropriate specifications of each database using the PICOS format (See supplementary file).

Eligibility criteria

To be included, studies had to: (1) enroll healthy women aged between 50 and 80 years, (2) enroll participants with no physical, mental or neurological disorders, (3) apply interventions based solely on RT programs conducted in postmenopausal and older adult women, (4) report pre- and post-intervention results and (5) be published in English. Publications were excluded if they were: (1) reviews, meta-analyses, abstracts, scientific conference works, posters, citations, letters to the editor, books or statements, (2) non-peer-reviewed journal articles, (3) commentaries or (4) studies reported in languages other than English.

The primary outcome was a change in muscle mass, i.e. hypertrophy, measured by dual-energy X-ray absorptiometry (DXA), magnetic resonance imaging (MRI), ultrasound imaging (USI), bioelectric impedance analysis (BIA) or another valid method able to detect changes in lean tissue.

Study record

Search results were uploaded to EndNote X 8.1 (Clarivate Analytics, Jersey, UK) and duplicates were removed. Two independent investigators (ET and AG) screened the titles and abstracts for relevance based on the inclusion criteria for this systematic review. Full texts of articles were also screened if the title and abstract were not sufficient to determine eligibility. Disagreement on article inclusion was resolved by discussion and consensus with a third investigator (AB). The screening process is summarized in a PRISMA flow diagram (Fig. 1). Three tables were created to extract relevant study data using a Microsoft Excel (Microsoft Corp, Redmond, Washington) spreadsheet. The first table reported the first author and year of publication, sample size, mean age and standard deviation, exercise intensity, duration of the intervention, and exercise frequency per week. The second table reported the test battery used, pre-intervention values, post-intervention values and the difference between them (possible incremental change) regarding lean body mass. The third table included data relevant to fat mass, when available. Authors were contacted via email if important data were missing from a particular study. If the contacted author did not respond to the questions asked about the specifics of a study, the study was excluded from the review. The WebPlotDigitizer software (version 4.2) was used to extract information from figures when information relevant to this review was not included in the tables or the main text of the manuscripts.

Fig. 1 PRISMA flow diagram describing the inclusion process of the retrieved articles

Risk of bias assessment

For risk of bias assessment, we used the Downs and Black checklist [30], which assesses the quality of original research articles in order to synthesize evidence from quantitative studies for public health purposes. This checklist contains 27 ‘yes’-or-‘no’ questions across five domains. It provides both an overall score for study quality and a numeric score out of a possible 32 points. The five domains comprise questions concerning study quality, external validity, study bias, confounding and selection bias, and power of the study.

Two independent researchers completed the Downs and Black checklist (ET and AG) for included articles to determine the quality of each study. The maximum score a study can receive is 32, with higher scores denoting greater quality. The studies were then separated into groups and labeled as ‘high quality’ (score 23–32), ‘moderate quality’ (score 19–22), ‘lower quality’ (score 15–18) or ‘poor quality’ (≤ 14). Interclass correlation statistical method was used to determine inter-rater reliability. Quality of evidence was obtained by the study design and by the Downs and Black score (Supplementary File). Levels of evidence and grades of recommendation have been also included for each study. The guidelines from the Centre for Evidence-Based Medicine (CEBM, http://www.cebm.net, Last accessed 12/02/2021) regarding the grading for evidence and the guidelines of the American Society of Plastic Surgeons (https://www.plasticsurgery.org/documents/medical-professionals/health-policy/evidence-practice/ASPS-Scale-for-Grading-Recommendations.pdf, last accessed 12/02/2021) regarding grading of recommendation were adopted. A supplementary table has been provided with the results of the quality assessment (Supplementary File).
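The score bands described above can be expressed as a simple lookup. The following is a minimal sketch, not part of the original analysis; the function name `downs_black_category` is hypothetical, and the thresholds are taken directly from the bands stated in the text.

```python
def downs_black_category(score: int) -> str:
    """Map a Downs and Black total score (0-32) to the quality label used in this review.

    Bands as stated in the text: high 23-32, moderate 19-22, lower 15-18, poor <= 14.
    """
    if not 0 <= score <= 32:
        raise ValueError("Downs and Black scores range from 0 to 32")
    if score >= 23:
        return "high quality"
    if score >= 19:
        return "moderate quality"
    if score >= 15:
        return "lower quality"
    return "poor quality"


# The lowest and highest scores reported in the Results (14 and 27)
for s in (14, 27):
    print(s, downs_black_category(s))
```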

Data synthesis

The included studies were first synthesized through a narrative description of the features of each study. Afterwards, the essential characteristics of the studies were presented in tables, where means, standard deviations and the percentage difference between the pre- and post-intervention conditions were reported. Descriptive statistics of the studies were computed with Jamovi (version 1.6.3.0, The jamovi project, 2020).

Concerning the meta-analytic synthesis, the considered outcomes were body lean mass and body fat mass. For each study, means and standard deviations were noted, together with the assessment method used to detect the outcomes (BIA, DXA, MRI, USI).

The meta-analysis was performed with the metafor package of the R software (version 3.5.3), using a random-effects model on the Standardized Mean Difference (SMD) between pre- and post-intervention measurements. The effects were then represented through a forest plot, and a funnel plot was produced to detect potential influences of publication bias. The heterogeneity of the studies was estimated through Cochran’s Q, and a moderator analysis with age (defined by two categories: below 65 and above 65 years of age), intervention length, number of weekly sessions and number of exercises proposed was planned. Finally, to assess the robustness of the results included in the meta-analysis, two sensitivity analyses considering measurement tool and study quality were performed. In the first one, only results derived from DXA were included in the meta-analysis, while in the second one, the effects from poor-quality studies were excluded.
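To illustrate the pooling step, the sketch below computes a standardized mean difference per study, a random-effects pooled estimate and Cochran's Q in plain Python. It is an illustration under stated assumptions and not the authors' analysis: the text does not specify which SMD variant or between-study variance estimator was used in metafor, so a Hedges-type SMD that ignores the pre-post correlation and the DerSimonian-Laird estimator are swapped in for simplicity. All function names and input values are hypothetical.

```python
import math


def hedges_g(mean_pre, sd_pre, mean_post, sd_post, n):
    """SMD (post minus pre) with small-sample correction, and its variance.

    Assumption: pre and post are treated like two groups of size n,
    i.e. the pre-post correlation is ignored.
    """
    sd_pooled = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2.0)
    d = (mean_post - mean_pre) / sd_pooled
    j = 1.0 - 3.0 / (4.0 * (2 * n - 2) - 1.0)   # small-sample correction factor
    var_d = 2.0 / n + d ** 2 / (4.0 * n)        # large-sample variance of d
    return j * d, j ** 2 * var_d


def random_effects_dl(effects, variances):
    """DerSimonian-Laird random-effects pooling with Cochran's Q."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "smd": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "q": q,
        "df": k - 1,
        "tau2": tau2,
    }


# Hypothetical lean mass data (kg): (mean_pre, sd_pre, mean_post, sd_post, n)
studies = [
    (40.0, 5.0, 42.0, 5.0, 20),
    (38.5, 4.0, 39.5, 4.2, 25),
    (41.0, 6.0, 44.5, 6.1, 15),
]
pairs = [hedges_g(*s) for s in studies]
result = random_effects_dl([g for g, _ in pairs], [v for _, v in pairs])
print(result)
```

A moderator analysis would extend this by regressing the study effects on covariates such as age category or weekly frequency, which is what a mixed-effects model does.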

Results

Search outcomes

The electronic database search yielded 7816 articles (PubMed = 3880, Scopus = 2159, Web of Science = 1777). Ten additional articles were identified from other sources as potentially relevant. Titles and abstracts were screened in detail to identify only relevant articles: a total of 7459 irrelevant articles were excluded on this basis and a further 269 duplicate records were removed, leaving a total of 88 articles. Subsequently, the full texts of these articles were screened for relevance and, during this process, 41 articles were removed since they were not eligible according to the inclusion criteria. A further twenty-one articles were excluded from the study, mainly due to inadequate exposure during the intervention, meaning that methods other than RT were incorporated in the study, or because ineligible outcomes were detected which did not fit the inclusion criteria of the review. Finally, 26 articles met the inclusion criteria and were included in the study [9, 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55]. Amongst the included studies, the average number of participants per study was 23, while the mean duration of the studies was 16 weeks. On average, participants underwent 3 RT sessions per week including 7.5 exercises, at an intensity of ~60% of their 1RM, performing between 9 and 16 repetitions per set. Articles differed greatly in terms of study design, intervention length, follow-up period, subjects’ age, observed outcomes and measurement of the main outcomes.

Table 1 provides a summary of the studies included in the review. All the included records were original research articles (n = 26). In total, data from 745 participants were pooled for this review. Studies ranged from 8 [31] to 78 participants [54]. The included articles were published over 20 years from 2000 to 2020.

Table 1 Table describes the training modalities of the retrieved studies

With regard to the measured outcomes, of the 26 articles included in the study, eighteen used DXA [33, 34, 36, 38,39,40,41,42,43,44,45,46, 48,49,50, 52, 54] to determine whether muscle hypertrophy was evident after the intervention, four used BIA [35, 51, 53, 55], two used MRI [9, 31] and two used USI [32, 37]. It is important to note that these measurement tools were used to assess different regions of the body which had been exposed to the RT.

Of the different body regions exposed to RT, 23 studies implemented full-body RT [32,33,34,35,36,37,38,39,40,41, 43,44,45,46,47,48,49,50,51,52,53,54,55], and 3 implemented lower body RT [9, 31, 42]. Thus, only the areas of the body which underwent the RT intervention were examined by the aforementioned measurement tools.

Table 2 provides a summary of the results of the primary outcomes of interest, while Table 3 provides measures of body fat mass when these were available. Pre- and post-intervention values have been identified for each study, and differences between the two have been outlined, when possible. As for the testing methods, there was high heterogeneity in how results were reported. Some studies reported absolute values, others the percentage of lean bone-free muscle tissue, while others reported a muscle mass index. Alternatively, some studies reported kilograms of muscle mass before and after the intervention. Therefore, in order to quantify and normalize the effect of each intervention, percentage differences were determined. A mean increase of 4.8% in lean body mass and a mean decrease of 2.1% in fat body mass were observed across the retrieved studies.
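The normalization by percentage difference can be illustrated as follows. This is a minimal sketch with hypothetical values, assuming the percentage difference is computed relative to the pre-intervention value; the helper name `percent_change` is not from the original analysis.

```python
def percent_change(pre: float, post: float) -> float:
    """Percentage difference of the post-intervention value relative to baseline."""
    if pre == 0:
        raise ValueError("baseline value must be non-zero")
    return (post - pre) / pre * 100.0


# Hypothetical outcomes reported in different units
examples = {
    "lean mass (kg)": (40.0, 42.0),
    "muscle mass index (kg/m^2)": (6.0, 6.3),
    "fat mass (kg)": (25.0, 24.5),
}
for label, (pre, post) in examples.items():
    print(f"{label}: {percent_change(pre, post):+.1f}%")
```

Because the result is unit-free, outcomes reported in kilograms, as a percentage of tissue or as an index can be placed on a common scale, although this does not remove differences between measurement techniques.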

Table 2 The table describes testing methods and differences compared to baseline values regarding lean body mass of each study
Table 3 The table describes testing methods and differences compared to baseline values regarding fat mass of each study

Risk of bias assessment

Risk of bias assessment was completed for all included articles. The mean Downs and Black checklist score was 19.8, with a range between 14 and 27. Studies were then divided into the quality categories suggested by Tremblay et al. [56]. Six studies were placed into the ‘high quality’ category, twelve into the ‘moderate quality’ category, six into the ‘lower quality’ category and two into the ‘poor quality’ category. The inter-rater reliability coefficient between the assessors was 0.87, which, as reported by Dawson and Trapp [57], corresponds to “very good agreement”. For detailed risk of bias reporting of each study, refer to the Supplementary File. The level of evidence of the included studies ranged between 1B and 4, with 2 studies reaching level 1B, 6 studies reaching level 2B, 2 studies reaching level 3B and 16 studies reaching level 4, suggesting an overall grade C of recommendation for the included manuscripts. A breakdown for each study is provided in the Supplementary File.

Meta-analytic synthesis of results

The meta-analysis was performed on lean body mass and fat body mass. First of all, funnel plots did not show any publication bias (Fig. 2 shows lean body mass). Concerning lean body mass, a significant small-to-medium increase at the post-intervention measurement was observed across k = 43 effects (SMD = 0.44; 95% CI 0.28–0.60; p < 0.0001) (Fig. 3). Heterogeneity across effects was significant, with Q(df = 42) = 109.95, p = 0.0001. Regarding fat body mass, no significant effect was detected across k = 17 effects (SMD = 0.27; 95% CI −0.02 to 0.55; p = 0.07) (Fig. 4). Since the retrieved effects were heterogeneous, we performed a moderator analysis through a mixed-effects model. The moderators were age of the participants (defining two categories: below 65 and above 65 years of age), intervention length, number of weekly sessions and number of exercises proposed. No significant difference in the retrieved effects was found for either lean or fat body mass.
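As a consistency check on the figures reported above, the standard error, z statistic and two-sided p value can be back-calculated from each SMD and its 95% CI, and the I² statistic can be derived from Q and its degrees of freedom. The sketch below assumes normal-theory (Wald) confidence intervals; I² is not reported in the text and is shown only as a derived illustration. Function names are hypothetical.

```python
import math


def z_and_p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Back-calculate SE, z and two-sided p from an estimate and its 95% CI."""
    se = (ci_high - ci_low) / (2.0 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return se, z, p


def i_squared(q, df):
    """Percentage of total variability attributed to heterogeneity (derived from Q)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


lean = z_and_p_from_ci(0.44, 0.28, 0.60)   # lean body mass, k = 43
fat = z_and_p_from_ci(0.27, -0.02, 0.55)   # fat body mass, k = 17
print("lean: SE=%.3f z=%.2f p=%.2g" % lean)
print("fat:  SE=%.3f z=%.2f p=%.2g" % fat)
print("I^2 for lean body mass: %.1f%%" % i_squared(109.95, 42))
```

Under these assumptions the lean body mass effect stays well below p = 0.0001 and the fat body mass effect stays above 0.05, in line with the reported values, while the derived I² is roughly 62%, which supports describing the lean body mass effects as heterogeneous.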

Fig. 2 Funnel plot for publication bias evaluation for lean body mass

Fig. 3 Forest plot of the meta-analytic results regarding lean body mass. 2 twice a week, 3 three times a week, BL bilateral, DT-ReT detraining-retraining, DUP undulating periodized, HS high supervised, HV high volume, LI-BFR low load blood flow restriction exercise, LV low volume, MH moderate to high intensity elastic band resistance exercise, MJ multi joint, MS multiple-set resistance training, MV muscle volume, NP non periodized, PR pyramid, RF rectus femoris, RT resistance training, SJ single joint, SS single-set resistance training, TD traditional, UL unilateral, VL vastus lateralis, VHS very high supervised; < 5% < 5% of trunk fat gain

Fig. 4 Forest plot of the meta-analytic results regarding fat body mass. 2 twice a week, 3 three times a week, ALT alternating upper and lower body, DUP periodized, MJ-SJ multi to single joint, NP non periodized, RT resistance training, SJ-MJ single to multi joint; < 5% < 5% of trunk fat gain

The sensitivity analyses showed that considering only results from DXA returned an SMD = 0.36 (95% CI 0.19–0.53, p < 0.0001), very similar to our main results. The second sensitivity analysis was performed excluding the poor-quality studies and returned an SMD = 0.45 (95% CI 0.27–0.63, p < 0.0001), indicating that poor-quality studies did not affect the average effect size.

Discussion

This review article aimed to identify and analyze manuscripts regarding RT and hypertrophic responses in a postmenopausal and elderly adult female population. Our main findings highlight that all the analyzed RT protocols were able to increase muscle mass in the sampled populations, despite differences in intervention length and assessment procedures. These effects, however, are small-to-moderate (SMD = 0.44). Interestingly, no difference in lean body mass increase was present regarding age, weekly frequency and intervention length. Therefore, we could not identify a minimum dose–response for lean muscle mass improvement in the retrieved studies regarding RT. A systematic review and meta-analysis by Schoenfeld et al. [58] aimed to identify the main training parameters needed to increase strength and hypertrophic adaptations in a general population. The study concluded that intensity is a determinant of strength increases, while volume can be modulated over different spectrums to promote muscle hypertrophy. These findings may provide an explanation for our main results, since the included protocols had similar intensities and volumes (the majority of studies prescribed between 8 and 12 repetitions at an intensity of around 60% of 1RM). Another meta-analysis [59], which evaluated the effect of the training frequency of RT programs on gains in muscular strength, concluded that increased frequency is linked to increases in strength. However, when age groups were analyzed, only young adults seemed to benefit from increased RT frequency, while older adults did not. Moreover, the results of this latter study only took into account measures of strength and not muscle mass. Nevertheless, evidence of a dose–response relationship in the elderly (taking into account both males and females) exists, suggesting that 2 sessions per week, performing 2 to 3 sets of 8 exercises, is effective in promoting strength and modifying muscle morphology [60]. The suggested protocol almost overlaps with the mean data reported in Table 1, which could explain the similarity of the lean body mass improvements observed across the retrieved studies.

It is important to note that increased muscle mass does not necessarily imply a causal relation with strength improvements [61], since the mechanisms responsible for strength development and muscle hypertrophy are different in nature [62, 63]. For example, strength improvements as a result of increased neural drive are observed well before muscle hypertrophy, as the result of increased motor unit firing rate or agonist–antagonist co-activation [62], while muscle hypertrophy is mainly stimulated by metabolic stress and mechanical tension, which then activate intracellular pathways inducing muscle growth [63]. Although not a primary outcome of this review, as reported in Table 2, increases in strength were also observed for bench press, chest press, leg press and knee extension exercises.

The small effects highlighted by the meta-analytic synthesis seem to be in line with the most recent scientific evidence, since increased anabolic resistance, diminished muscle regeneration, impaired muscle activation and a reduction in the number of motor units are frequently observed as a consequence of aging [64]. However, precisely for these reasons it is important to engage postmenopausal and elderly women in RT programs, in order to improve muscle mass and strength, to reduce the risk of injury, and to improve quality of life during the aging process [65].

Another aspect analyzed in this review was fat mass, which did not show any difference as a consequence of RT. Despite the general agreement regarding improved muscle mass in response to RT, there are still conflicting reports regarding fat mass [66], since some authors report decreases [67], while others do not [35] in this specific population.

It is important to outline that only the study by Nascimento et al. [34] considered dietary intake along with the RT protocol. Although nutritional aspects of training and recovery fall outside the scope of this study, we cannot neglect the importance of a proper diet in muscle hypertrophy [68], especially for post-menopausal women, for whom a correct nutritional regimen is recommended [21, 69] given the frequently observed dyslipidemic profiles [66]. In addition to nutritional status, another important consideration concerns hormonal replacement therapy, which may be prescribed as a form of prevention for estrogen deficiency. None of the participants in the studies that reported this information were undergoing hormonal therapy; however, six studies [32, 37, 40, 50, 52, 53] did not specify whether the participants were using hormonal replacement therapy. Therefore, the results of this study may primarily be attributable to the effects of the RT interventions.

Concerning clinical application, Nelson and colleagues [70] have shown that adults who do not take part in regular RT lose on average 0.46 kg of muscle per year from the age of 50. Additionally, sedentary subjects have shown a 50% reduction in fast-twitch muscle fibers by the age of 80 [27, 71]. The findings of this review are particularly important for post-menopausal and older women, who are more susceptible to sarcopenia than men [72, 73]. Thus, creating exercise prescriptions that induce hypertrophy, specifically in postmenopausal and older women, could represent an efficient strategy to counteract the effects of sarcopenia and would contribute to an overall better quality of life in adult women [74].

The strengths of this review include a comprehensive search strategy, stringent predetermined inclusion and exclusion criteria and thorough analyses of each article included. We included articles that examined direct and indirect measures of muscle hypertrophy and focused on incremental changes in the trained musculature of postmenopausal and older adult women. The results of our study provide evidence that RT may counteract the effects of sarcopenia.

Nevertheless, our study is not without limitations, including the various types of outcome measurements sampled with heterogeneous methods. Despite significant efforts to identify appropriate techniques for muscle mass quantification [75, 76], a consensus regarding a gold standard procedure still needs to be defined. A recent review article has proposed dual-energy X-ray absorptiometry as a reference technique [77], taking into account advantages and disadvantages across available muscle mass measures. Therefore, data interpretation should always consider the principles behind each measurement technique before comparisons are made. One of the main limitations is that neither dual-energy X-ray absorptiometry nor the other adopted methods, such as CT scans, bioimpedance analysis and ultrasound evaluations, are able to detect fatty infiltration within muscles, a common process caused by aging [77], which could lead to overestimation of muscle mass [78] in elderly or obese people.

Other aspects that need to be considered are the relatively short intervention duration (mean 16 weeks) in the majority of studies and the study designs, since only three studies were randomized controlled trials while the majority were case series, leading to a generally low level of evidence. In addition, although the results of the single studies do not seem to be homogeneous, no substantial differences in intervention length and assessment methods were present. Therefore, the moderator analysis performed could not identify a minimum dose–response to RT.

Conclusion

Based on the data acquired through our systematic literature search, RT protocols are able to moderately increase muscle mass in post-menopausal and elderly women but not to reduce fat mass. Exercise frequency, number of exercises per training session, protocol length and age of the participants do not seem to significantly moderate the evaluated effects. Future research needs to concentrate on defining training parameters such as volume and intensity in order to help sport professionals program RT protocols more effectively, which would in turn counteract the effects of sarcopenia.