Introduction

The general physical health of children in the Western world is excellent, but there is growing concern that an increasing number of children may be struggling with mental health problems. In response to this, a vast number of prevention programs have been developed and implemented in schools, municipal services, and health services.

Symptoms of mental ill-health in children may be either externalizing or internalizing in character. This distinction does not preclude that the same child may suffer from symptoms of both kinds, and that aggressive, acting-out behavior may indeed mask depressive feelings and anxiety. Even so, externalizing and internalizing problems are usually understood in different etiological terms, and met with different intervention strategies.

In general, prevention programs targeting externalizing problems in children build on behavioral and social learning principles. Major formats for delivery are parent training and school-based programs. Parent training programs aim to strengthen positive parenting and reduce coercion, which in turn will reinforce pro-social development in the child (e.g. DeGarmo et al. 2004). School-based prevention programs typically train children in self-regulation and social skills (e.g. Conduct Problems Prevention Research Group 1999), and/or train teachers in how to respond to acting-out children in ways that will promote positive development (e.g. Ialongo et al. 1999). School programs may be implemented in their own right, or as a complement to parent training, in a multimodal format (e.g. Eddy et al. 2003).

There are different strategies for delivery of prevention programs. Universal prevention targets entire populations. Selective prevention is offered to sub-populations with known risk factors, for instance children living in socio-economically disadvantaged neighborhoods, or children of parents with substance abuse. Notably, selective prevention is not based on the assessed risk of the individual child. This is however the case for indicated prevention, which may be offered to children with, for example, elevated symptom levels. Because of its focus on the individual, indicated prevention allows for tailoring the intervention to individual needs (Mrazek and Haggerty 1994).

In order to evaluate preventive effects, it is necessary to study what happens over time. To determine if the intervention decreases the likelihood for the unwanted future outcome, follow-up assessments, of both the intervention and the control group, are imperative. According to standards formulated by the Society for Prevention Research, the minimal post intervention interval before follow-up must be 6 months (Flay et al. 2005). Few prevention studies meet these standards. Typically, original studies as well as published systematic reviews of prevention programs have focused on the immediate effects on child behavior, measured directly post intervention.

In a systematic review of the effect of training programs for parents of children 0–7 years, Kaminski et al. (2008) included 48 controlled studies. They found a standardized mean difference (SMD) of 0.25, favoring intervention, from pre to post test. Lundahl et al. (2006) included 63 controlled studies on parent training programs, and found that the SMD was 0.42 post test, but decreased to 0.21 at follow-up, specified only as “months later”. Barlow and Parsons (2003) pooled five studies of training programs for parents of children age 0–3 years. The effect size was 0.44 for parental observations and 0.55 for independent observers. However, only two studies had follow-up data, according to which positive effects diminished and became insignificant.

Three systematic reviews have analyzed the effects of school-based interventions to prevent externalizing symptoms. Wilson and Lipsey (2007) conducted a broad meta-analysis and included 249 studies, with no explicit criteria for study quality, and found an effect size of 0.21 for universal programs, 0.29 for selective programs and 0.05 (n.s.) for multimodal programs, in a pre to post intervention test. Effects were largely the same for programs implementing behavioral, cognitive, and social skills components. Hahn et al. (2007) found a 15 % reduction in acting-out behavior, when pooling twelve studies that had externalizing problems as outcome measure. Effects at follow-up were not quantified but were reported to decrease with time. Mytton et al. (2006) included 34 randomized controlled trials (RCT) fulfilling Cochrane quality criteria and targeting aggressive and violent behavior. The post-test effect size was 0.41, with no tendency to decline in the seven studies that had a follow-up at 12 months.

It is striking that although a primary aim of most programs is to prevent serious externalizing problems in adolescence by offering interventions to children of preschool or early school age, none of the previous reviews have systematically investigated the lasting effects of these programs. Rather, the reviews, like the majority of the primary studies, focus on pre- to post intervention effects, with unsystematic reporting of follow-up assessments, at best. Likewise, previous reviews have often summarized intervention effects, without distinguishing between prevention strategies, or between prevention and clinical treatment trials.

Aiming to fill this gap, and in accordance with the Society for Prevention Research guidelines, our systematic review had a firm focus on studies with a follow-up period of at least 6 months post program termination. We also aimed to limit the review to prevention programs, and to exclude interventions offered to children seeking clinical treatment for manifest problems. With this focus on preventive effects, the following research questions were posed:

  • Which programs are effective in preventing mental ill-health of the externalizing type?

  • What is the relative effectiveness of universal, selective or indicated prevention programs?

  • Are there any risks involved in using the programs?

The present state of knowledge did not provide a basis for formulating testable hypotheses, and the review was therefore largely explorative. However, we did expect weaker evidence for effect when applying a 6-month follow-up criterion, as compared to post intervention tests. Also, in keeping with the prevention literature at large, we expected smaller effect sizes for universal as compared to selective and indicated prevention trials. For ethical reasons, we included a specific focus on possible negative intervention effects.

Methods

The systematic review presented in this article is primarily based on a health technology assessment conducted by the Swedish Council on Health Technology Assessment (SBU), an independent public authority. The total assessment also included a systematic review of programs to prevent internalizing symptoms (SBU 2010). The literature search included PubMed, PsycInfo, ERIC and IBSS databases and was supplemented with studies found in reference lists and web sites dedicated to some of the programs. The literature search for the initial review was tailored to identify controlled studies, published in English, German, French or any of the Scandinavian languages, in peer-reviewed journals between January 1, 1990 and October 30, 2009. The complete search strategy can be found at http://www.sbu.se/upload/Publikationer/Content0/1/barnpsykhalsa_bilagor/Bilaga%201.%20S%C3%B6kstrategier.pdf. Studies published prior to 1990 were included, to the extent that they were referred to in studies identified through the systematic literature search. For the purpose of the present article, a complementary literature search was performed in February 2013 in PubMed, now limited to studies on programs that had been identified in the original search. Four additional articles that fulfilled our criteria were found, reporting on trials of three different programs (Conduct Problems Prevention Research Group 2010, 2011; Hahlweg et al. 2010; Reedtz et al. 2011).

Inclusion and Exclusion Criteria

We included studies of programs aiming at preventing externalizing mental ill-health in children aged 2–19 years, i.e. from the early preschool years through adolescence. Since the focus was prevention, studies on clinical populations, and on children with impairments or medical conditions that substantially increase the risk for mental ill health, were excluded. The programs were required to be standardized and to have an explicit aim to prevent mental ill health. Interventions solely targeting antisocial behaviors, with substance abuse or delinquency as outcome measures and without assessment of mental health, were not included. The intervention could be directed at children and/or parents and be delivered on an individual basis or in a group setting. Care as usual (CAU) or alternative preventive interventions were accepted as control conditions. The studies had to investigate effects on mental health in children participating in the trial, and presumed mediators of effect were not accepted as primary outcome measures. Outcome measures included rating scales or clinical assessments of symptoms, structured behavioral observations, school adjustment measures with externalizing behavior assessment components (e.g. Teacher Observation of Classroom Adaptation; TOCA), clinical diagnoses of psychiatric illness and, finally, measures indicating antisocial behavior (e.g. self assessment). Outcome had to be measured no less than 6 months post intervention, and include both intervention and control groups.

With our focus on long-term effects, we also included studies that followed outcome for several years after program termination, even if all inclusion criteria were not met. Likewise, we included long-term trials reporting consecutive observations over several years, also in the absence of a follow-up 6 months post intervention, or later. Hence, the review of long-term outcome was less rigorous than our main protocol, and the results are reported under a special heading. Studies reporting negative effects, indicating that the program may involve risks, were included regardless of study design.

Study Selection and Data Extraction

Two members of the research group, independently of each other, screened abstract lists and selected studies to be reviewed in full text. All studies selected by at least one member were read in full text, again by two researchers, for evaluation of study relevance and quality, and extraction of study data. Studies had to meet all of the following standards to be of adequate quality for inclusion in the analysis of the scientific evidence for effect: (a) adequate control of confounders, (b) attrition rates under 30 %, attrition rates of 30–50 % being accepted if a satisfactory attrition analysis was reported, (c) intent-to-treat (ITT) analysis, reported or calculable, and (d) analysis considering relevant confounders in non-randomized studies. If the two researchers disagreed regarding study relevance and quality, the study was processed in the entire group of eight researchers, guided by the principle of consensus. Overall, the review process followed the PRISMA guidelines (www.prisma-statement.org). More detailed information about the evaluation of the quality of each study is available on request.
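The four quality standards can be read as a simple screening rule. The sketch below is one hypothetical encoding in Python. The `TrialReport` record and its field names are illustrative assumptions, not part of the review protocol, and the handling of attrition at exactly 30 % or 50 % is an assumed reading of the stated ranges.

```python
from dataclasses import dataclass


@dataclass
class TrialReport:
    """Hypothetical summary of one trial; all field names are illustrative."""
    confounders_controlled: bool        # (a) adequate control of confounders
    attrition: float                    # proportion lost to follow-up, 0.0 to 1.0
    attrition_analysis_ok: bool         # satisfactory attrition analysis reported
    itt_available: bool                 # (c) ITT analysis reported or calculable
    randomized: bool
    confounder_adjusted: bool = False   # (d) only checked for non-randomized trials


def meets_quality_standards(trial: TrialReport) -> bool:
    """Return True only if the trial satisfies all four standards (a) to (d)."""
    if not trial.confounders_controlled:
        return False
    # (b) under 30 % is accepted outright; 30-50 % needs an attrition analysis.
    if trial.attrition < 0.30:
        attrition_ok = True
    elif trial.attrition <= 0.50:
        attrition_ok = trial.attrition_analysis_ok
    else:
        attrition_ok = False
    if not attrition_ok:
        return False
    if not trial.itt_available:
        return False
    # (d) applies to non-randomized designs only.
    if not trial.randomized and not trial.confounder_adjusted:
        return False
    return True
```

Under this reading, a randomized trial with 40 % attrition passes only if it reports a satisfactory attrition analysis, and any trial with more than 50 % attrition is excluded.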

Data Analysis

When possible, meta-analyses were conducted by using the Cochrane Collaboration Review Manager software (http://ims.cochrane.org/revman). The pooled results for continuous outcomes were expressed as SMD, in accordance with Cochrane Collaboration recommendations. Effect sizes were classified as small, medium or large as proposed by Cohen (1992). A requisite for drawing conclusions regarding the scientific evidence for effect of a specific program was that it had been subject to at least two trials that met the inclusion criteria and had comparable outcome measures.
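To illustrate what pooling continuous outcomes as SMD involves, the following Python sketch computes a small-sample-corrected SMD (Hedges' g) for one trial and combines several trials with inverse-variance weights. It is not the Review Manager implementation: the variance formula is a common textbook approximation, the fixed-effect model is an assumption (the choice between fixed and random effects is not stated here), and all function names are hypothetical.

```python
import math


def hedges_g(mean_i, sd_i, n_i, mean_c, sd_c, n_c):
    """Return (g, variance) for intervention vs control on a continuous outcome.

    The sign follows the raw scale: on a symptom scale where lower is better,
    a negative g favors the intervention.
    """
    df = n_i + n_c - 2
    # Pooled standard deviation across the two groups.
    s_pooled = math.sqrt(((n_i - 1) * sd_i**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_i - mean_c) / s_pooled
    # Small-sample correction factor (approximation).
    j = 1 - 3 / (4 * df - 1)
    # Approximate variance of d; software packages may use slightly different forms.
    var_d = (n_i + n_c) / (n_i * n_c) + d**2 / (2 * (n_i + n_c))
    return j * d, j**2 * var_d


def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (effect, variance) pairs.

    Returns the pooled effect and its standard error.
    """
    weights = [1 / variance for _, variance in estimates]
    total = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1 / total)
```

A 95 % confidence interval for the pooled estimate is then approximately the pooled effect plus or minus 1.96 times its standard error.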

Research Ethics

Prior to the review, all research group members had signed a declaration assuring no conflict of interest. The study did not involve primary data, and ethical review and approval were therefore not applicable.

Results

A flow chart of the literature search and review is presented in Fig. 1. A substantial number of studies were excluded, either after reviewing the abstracts or the full text, due to an insufficient follow-up period. In the end, 38 controlled trials with adequate study quality were identified, evaluating in total 25 different prevention programs for externalizing problems. The vast majority of the included trials had been conducted in the USA, followed by Canada, Australia, and England. Only a few selective trials had been performed in Continental Europe. The programs included are summarized in Table 1.

Fig. 1 Flow chart of the literature review

Table 1 Programs included in the meta-analysis

Design of Trials

Thirty-six of the 38 studies were RCTs, and two were controlled without randomization. Four of the RCTs had used an optimal method for randomization. The number of participants in the respective trials varied from 100 to 998, with the largest samples recruited for universal trials.

The majority of the trials employed a no intervention or CAU control group. Two had what is best described as an attention control, whereas six employed a design with more than one treatment condition, to be compared with no intervention.

As primary outcome measure, the majority of trials employed various symptom rating scales. A few studies also included structured behavioral observations, as a complementary outcome measure. Long-term follow-up studies used (presence or absence of) psychiatric diagnoses as an index of outcome, as well as overall psychosocial adjustment including educational attainment and employment. Eleven studies had used some sort of blinded outcome assessment.

Program Content, Length and Intensity

All included programs contained cognitive-behavioral components. Many of the programs were modified versions of interventions that had first been developed as clinical treatments (e.g. Incredible Years/IY). Cognitive techniques were most visible in programs targeting older children, whereas purely behavioral techniques were more frequent for young children. Clear examples of the latter were the Good Behavior Game (GBG), which uses a token economy to encourage on-task and pro-social group performance, and the parent management techniques promoted by IY and the Positive Parenting Program (Triple P). Social learning theory had influenced program content visibly in both cognitive and modeling techniques. Several programs targeting parents included home assignments on the assumption that positive change requires active practice of new and more adaptive behaviors. One single program, Prime Time, subject to only one included trial, gave reference to attachment theory.

Program length ranged from three sessions given within a single month, to several years. The longer interventions tended to be less intensive. Most common were weekly sessions over a period of 3–9 months. The shortest programs were unimodal, targeting parents, whereas extended interventions tended to be multimodal. Program length varied with content and target populations, in a way that defied analysis regarding its unique impact on effect.

Competence of Staff

In general, program staff members were highly qualified, both with respect to general educational background and specific program competence. Many trials relied on health professionals, such as psychologists and counselors (31 %), quite a few used graduate students (17 %) or other members of the research team (17 %). Several programs were implemented in schools, and teachers served as program staff in 28 % of the trials. Notably, just a few trials (8 %) were conducted without involvement of the program developers.

Program Target Population and Prevention Level

According to our classification, five (14 %) of the 36 trials used a universal strategy of delivery, 16 (44 %) were selective, and 15 (42 %) were indicated. Note that our classification was not always in agreement with that of the authors, who might consider a program universal if it was offered to all families in a high-risk neighborhood. According to our definition, such interventions were classified as selective.

Basic information including findings from all of the included studies of universal, selective and indicated programs, respectively, is summarized in Tables 2, 3, 4. Length of follow-up(s) is stated, and the overall outcome is expressed as +/0/−; where + indicates a statistically significant positive effect of the intervention, 0 no effect and − a negative effect of the intervention, i.e. the control group had a better outcome than the intervention group. More detailed information, on all studies included, can be retrieved in tabulated form at http://www.sbu.se/upload/publikationer/Content1/1/Eng_tabeller_psykiskohalsa_web.pdf.

Table 2 Preventive effects of universal programs
Table 3 Preventive effects of selective programs
Table 4 Preventive effects of indicated programs

Effects of Universal Prevention Trials

In the following subsections, effect sizes are expressed in accordance with Cohen’s (1992) recommendations. A standardized mean difference between intervention and control groups of 0.20 is referred to as a “small” effect, 0.50 as a “medium” effect, and 0.80 and beyond as a “large” effect.
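As a minimal sketch, Cohen's benchmarks can be written as a small Python helper. Treating each benchmark as the lower bound of its category, using the absolute value of the SMD, and labeling values under 0.20 as "below small" are assumptions made for illustration; Cohen (1992) offers the benchmarks as reference points rather than strict cut-offs.

```python
def cohen_label(smd: float) -> str:
    """Map a standardized mean difference to Cohen's (1992) verbal labels.

    Assumption: each benchmark is read as the lower bound of its category,
    and the direction of the effect is ignored.
    """
    size = abs(smd)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"
```

For example, `cohen_label(0.25)` returns `"small"` and `cohen_label(0.55)` returns `"medium"`.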

Six different programs were studied in one universal trial each. Three of them were school-based and entirely implemented in the classroom by teachers under supervision, namely Rochester Social Problem Solving Training Program, Second Step, and the GBG. Another three programs were school-based but also involved parents; the Baltimore Classroom-Centered and Family School Project (including GBG as a school component), the Promoting Alternative THinking Strategies (PATHS) program, and the Adolescent Transition Program, which rests heavily on parental involvement. No universal programs targeting parents only met our inclusion criteria.

Results regarding effect of universal trials are summarized in Table 2. According to three studies, GBG reduced symptoms of externalizing behavior in schoolchildren for at least 12 months, although effect sizes were small. Other universal school programs had been subject to a maximum of one study of adequate quality, and the scientific evidence regarding their respective effect was therefore insufficient.

Effects of Selective Prevention Trials

Nine different prevention programs were tested in 17 selective trials that met our inclusion criteria, and their results are summarized in Table 3.

Trials of the parent training programs Triple P and Incredible Years allowed for meta-analyses, as presented in Figs. 2, 3. Both programs reduced symptoms of externalizing problems in preschool children, who had minor to moderate social problems, for at least 12 months. The effects were small to medium (Fig. 2). The Incredible Years had been tested only in socio-economically disadvantaged environments. In those contexts, the program had a small effect on symptoms of externalizing problems in pre-school children, rated by blind observers at least 8 months post intervention (Fig. 3). Symptom ratings by parents suggested that the program had little or no effect (Fig. 2).

Fig. 2 Selective prevention with the Incredible Years and Triple P parent training programs: Parental ratings of child behavior at follow-up: a 6–8 months, b 12–16 months post intervention

Fig. 3 Selective prevention with Incredible Years: Independent observer ratings of child behavior at 1-year follow-up

Selective trials targeting families affected by internal stress (Parent Management Training/PMT, New Beginnings, Family Bereavement Program, Adolescents and Their Parents with AIDS, considered together) reduced externalizing behavior in the children at least 11 months post intervention. The average effects were small.

The review did not allow for conclusions regarding the effects of any other program subject to a selective trial, since the remaining studies were too heterogeneous to be pooled in a meta-analysis.

Booster sessions were reported for a few of the selective trials, with variable results. One extra session of IY 1 year after program termination reported no effect, whereas a complete repeat trial of SAFE Children 3 years later reported a small but significant effect.

Effects of Indicated Prevention Trials

The effects of 11 programs were tested in a total of 16 indicated trials of adequate quality. Another 25 indicated trials met the inclusion criteria, but were of insufficient quality to contribute to the scientific evidence. Included trials represented family support programs, school programs, and multimodal programs. The results are summarized in Table 4 and Fig. 4.

Fig. 4 Indicated prevention: Parental ratings of child behavior at 1-year follow-up

The Family Check-Up (FCU), a family support program, was based on a structured three-session assessment and feedback intervention, but could also provide individually tailored continued support, and treatment. Three large trials of FCU were included in the review, showing reduced symptoms of externalizing behavior in children and adolescents for at least 12 months. The effects were of medium size.

Coping Power was subject to two trials, primarily implemented within the school curricula but with complementary supportive education targeting parents and teachers. It reduced the degree of externalizing behavior in schoolchildren for up to 12 months, with medium effects. However, sample sizes were small, and the attrition rates were 30–45 %.

Indicated trials of all other programs showed inconsistent results, 6 months or more post-intervention. See Fig. 4.

Long-Term Outcome

Eight selective or indicated trials, of which seven are presented in Tables 3 and 4, had been subject to long-term follow-up studies with at least one observation 5 years or longer after program termination. In the case of Fast Track, there had been consecutive observations during a 10-year-long intervention, complemented with a follow-up 3 years post intervention.

These studies reported a lower incidence of psychiatric diagnoses (Fast Track, GBG and New Beginnings), better school attendance (Montreal Prevention Project), lower incidence of delinquency (PMT) and overall problem behaviors (Family Check-Up), and a higher employment rate and self-support (Adolescents and Their Parents with AIDS). However, the long-term effects were small and typically found only on occasional outcome measures.

An eighth trial, the Seattle Development Project, presented a special case with one extremely long-term follow-up study. The program aimed at preventing antisocial adolescent behavior through an intervention delivered in different steps during grades 1–6. Initially, the study was randomized, but was converted into a quasi-experimental design when additional cohorts were recruited. Long-term observations were made when participants were 18, 21, 24 and 27 years (Hawkins et al. 1999, 2005, 2008). Despite an explicit program aim to prevent externalizing problems, positive long-term effects mostly concerned internalizing problems. At age 27, significantly fewer psychiatric diagnoses were reported for those who had participated throughout grades 1–6.

Negative Effects

The literature search on negative effects of prevention programs rendered 534 abstracts. In the end, ten studies constituted the scientific evidence for negative effects; a few of them were also part of the assessment of prevention effects. Typically, the reports on negative effects were based on incidental findings, which ran contrary to expectation. Early on, Dishion and colleagues reported an unexpected increase in externalizing symptoms and disruptive behavior in 11- to 14-year-old participants in a group intervention for youths at high risk, the Adolescent Transition Program (ATP) (Dishion and Andrews 1995; see Table 4). Program involvement of parents was reported to have a small but protective effect. Additional longitudinal studies of ATP, including a follow-up of the Cambridge-Somerville Youth Study, have confirmed these findings (Dishion et al. 1999, 2001). In the same vein, Warren and colleagues reported that parental involvement is intrinsic to and eliminates iatrogenic effects of Families and Schools Together (Warren et al. 2006).

Cavell and colleagues reported that the Prime Time group intervention made low-risk group participants more accepting of aggressive and disruptive behaviors (Cavell and Hughes 2000; see Table 4). Two studies of PALS, a social skills training program administered in a group format, reported that program participation increased the risk for negative peer interactions and use of drugs (Palinkas et al. 1996). Mager et al. (2005) found iatrogenic effects only in high-risk youths participating in group interventions together with well-adjusted peers, and suggested that the group composition fueled their negative self-image.

Three studies, all limited in size, reported that selective or indicated prevention programs may have negative effects on the family system, with increased stress, tension and conflicts between other family members (Mockford and Barlow 2004; Helfenbaum-Kun and Ortiz 2007; Szapocznik and Prado 2007).

Discussion

This systematic review of prevention programs targeting externalizing problems in children lends limited support to their effects. Among several hundred prevention programs investigated and reported in the international literature, only 24 programs met our inclusion criteria. In fact, only five of them had been subject to more than one trial of sufficient quality, which showed positive results, a requisite for drawing conclusions regarding specific program effect. These five programs include two parent training programs (Incredible Years and Triple P), a family support program (Family Check-Up), and two school programs (GBG and Coping Power). In addition, a small group of studies, considered together, indicate that family support programs (e.g. PMT) aimed at families undergoing a period of increased stress may prevent externalizing mental ill-health in children. Overall, effect sizes were small.

Our results may seem at odds with previous meta-analyses, which have tended to report larger and more unanimously positive program effects. What may account for these diverging results? First, our analysis was designed to evaluate preventive effects only, and excluded treatment studies, where effect sizes are usually more impressive. Second, only studies with outcome measures concerning the children’s externalizing problems were included; presumed mediators such as parenting skills, or parent or teacher satisfaction, were not accepted as primary outcome measures. Third, we excluded programs that were solely targeting antisocial behaviors, with substance abuse or delinquency as outcome measures, and that had no assessment of mental health. Fourth, we only included studies that met the specified quality criteria regarding control and analysis of confounders, attrition rates and ITT-analysis. Fifth, and most importantly, we used a follow-up period of at least 6 months as a critical inclusion criterion, to exclude merely transitory effects. Considering that many of the programs in the analysis intervene in preschool or early school years with an ultimate goal to prevent the development of externalizing problems in adolescence, this seems like a fairly modest criterion.

Limited evidence for effect must not be taken as a proof that prevention programs are useless. Rather, it demonstrates that our knowledge about the effects of the programs is disturbingly meager. Scientists and practitioners concerned with the wellbeing of children should be encouraged to conduct well-designed trials, which include follow-up assessments conducted at least 6 months after program termination.

The few long-term follow-up studies that have been conducted lend some, albeit unsystematic, support to the belief that prevention programs may indeed make a difference. The results are, however, inconclusive, due to the small number of studies and also to the fact that effects measured at one specific point in time tend to be difficult to replicate during consecutive follow-ups. A given outcome measure may be relevant at one developmental stage, and of subordinate interest at another, posing significant theoretical and methodological challenges.

Prevention programs are delivered at different levels of intervention. The prevention literature at large indicates that universal prevention produces smaller effect sizes per observation unit, since the great majority of the general population is unaffected by the problem targeted. Therefore, the effects of universal prevention can only be tested in very large-scale trials. Evaluations of programs for children at risk, in indicated or selective trials, are less demanding in terms of resources and are likely to produce higher effect sizes. However, our meta-analysis lends weak general support for indicated prevention, and there was no sign that brief, indicated trials of single-component programs had any effect at all. On the other hand, data from the Fast Track and Family Check-Up trials support the idea that sustainable indicated prevention may benefit children who are most at risk. In summary, our meta-analysis did not allow for any conclusions about preferable prevention level, primarily because of the small number of universal prevention trials of sufficient size and scientific quality.

The length of the parent support programs varied greatly from 1 month to several years, sometimes including “booster sessions”, but variations in effect may have more to do with the socio-cultural context of the studies than the length and intensity of the programs. Studies of Triple P, a program that has been evaluated primarily in middle class settings, have typically reported larger effects than studies of the Incredible Years program, which has almost exclusively been evaluated in disadvantaged families.

Externalizing symptoms have a strong male preponderance. Accordingly, most of the study populations in this systematic review had an uneven gender distribution, and five studies focused entirely on boys. No program in our analysis had developed gender specific approaches, and gender effect analyses were rare. Thus, the available evidence in support of the effect of preventive programs targeting externalizing problems relies heavily on effects in boys.

The possibility for negative or unwanted effects must always be taken into account. It is well documented that aggregating at-risk children and adolescents for group interventions may result in a negative outcome, through social contagion (Dishion et al. 2001). Although less well researched, there is also reason to be aware that interventions aimed at parents may disrupt the balance in a fragile family system. To date, very few intervention trials have included a systematic procedure for reporting of iatrogenic effects, and it is fair to assume that our knowledge of harmful consequences is quite limited. An obvious recommendation for future trials is to include protocols for observation and systematic reporting also of unwanted outcomes.

Methodological Shortcomings and Challenges

Evaluating preventive effects poses a number of significant methodological and practical challenges. Since lower effect sizes are to be expected, prevention trials generally demand larger study populations than do clinical treatment trials. Cluster randomization is one strategy to handle this problem, but interferes with the basic assumption of independence between observation units, if not handled properly in the statistical analysis. Quite a few of the included studies had unbalanced study groups, with higher initial symptom levels in the intervention group compared to controls, despite adequate randomization procedures. This suggests that a regression to the mean may be part of the calculated effects, e.g. in the trials of Triple P. Another problem is that some studies present only a few out of many potential outcome measures, which raises questions about selective reporting of variables.
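The cost of ignoring clustering can be illustrated with the standard design effect for equal-sized clusters, 1 + (m − 1) × ICC, where m is the number of children per cluster and ICC is the intraclass correlation. The Python sketch below is a generic illustration; the function names are hypothetical, and the numbers in the usage note are invented for the example and do not come from any trial in this review.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from cluster randomization with equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations that n_total clustered ones are worth."""
    return n_total / design_effect(cluster_size, icc)
```

With hypothetical classrooms of 25 children and an ICC of 0.05, the design effect is about 2.2, so 1,000 pupils carry roughly the information of 455 independently randomized children. An analysis that treats all 1,000 as independent would therefore overstate its precision.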

A major limitation in the literature is the shortage of studies reporting long-term outcome. Admittedly, there are a number of difficulties with longitudinal designs in prevention research. Maintaining study cohorts over time is a demanding undertaking, involving sustainable logistics, at considerable cost. In reality, research funding is rarely granted for more than a few years at a time, allowing only for brief follow-up periods, at best. Furthermore, longitudinal studies present some purely scientific challenges of their own, conceptual as well as methodological. A linear relationship between a specific intervention and long-term outcome is not to be expected. Inventories measuring psychiatric symptoms at early school age may not be valid measures of mental health later in childhood, whereas school attendance and performance, as well as psychiatric diagnoses and overall social adjustment, are of increasing importance during adolescence.

In most of the included trials, the program developers themselves had been actively involved, indicating a risk for allegiance effects. There is an obvious need for more effectiveness studies, carried out by independent researchers.

Conclusions and Future Directions

In spite of a vast research literature, the scientific evidence for lasting effects of prevention programs targeting externalizing problems in children and adolescents is limited. A mere handful of programs have been subject to more than one well-controlled trial with adequate follow-up. There is a need for well-designed studies that evaluate lasting effects in effectiveness studies, and address whether universal or selective/indicated approaches should be preferred, and whether there is a risk for negative consequences from program participation. Evaluation studies for prevention programs should include follow-up measures no less than 6 months post intervention, and preferably at several points in time, for both intervention and control groups, allowing for analysis of developmental trajectories and maintenance of the attained effects. Future meta-analyses in this field need to clearly differentiate between different levels of intervention, specify inclusion criteria accordingly, and limit conclusions to the level in focus.

Finally, funding agencies need to be made aware of the high costs involved in addressing the methodological problems mentioned above. Quality prevention research is dependent on sustainable funding. A lack of commitment on the part of funding sources is a major obstacle for the development and implementation of prevention programs based on sound scientific evidence.