National estimates from the Youth ’19 Rangatahi smart survey: A survey calibration approach

  • C. Rivera-Rodriguez,

    Roles Data curation, Formal analysis, Methodology, Software, Writing – original draft

    c.rodriguez@auckland.ac.nz, clriverarodriguez@gmail.com

    Affiliation Department of Statistics, The University of Auckland, Auckland, New Zealand

  • T. C. Clark,

    Roles Funding acquisition, Investigation, Project administration, Writing – review & editing

    Affiliation School of Nursing, University of Auckland, Auckland, New Zealand

  • T. Fleming,

    Roles Conceptualization, Funding acquisition, Investigation, Resources, Supervision, Writing – review & editing

    Affiliation School of Health, Victoria University of Wellington, Wellington, New Zealand

  • D. Archer,

    Roles Data curation, Project administration, Resources, Software, Validation, Writing – review & editing

    Affiliation School of Health, Victoria University of Wellington, Wellington, New Zealand

  • S. Crengle,

    Roles Investigation, Writing – review & editing

    Affiliation Department of Preventive and Social Medicine, University of Otago, Dunedin, New Zealand

  • R. Peiris-John,

    Roles Investigation, Writing – review & editing

    Affiliation Department of Epidemiology and Biostatistics, The University of Auckland, Auckland, New Zealand

  • S. Lewycka

    Roles Conceptualization, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam

Correction

17 Jun 2021: The PLOS ONE Staff (2021) Correction: National estimates from the Youth ’19 Rangatahi smart survey: A survey calibration approach. PLOS ONE 16(6): e0253512. https://doi.org/10.1371/journal.pone.0253512

Abstract

Background

Significant progress has been made addressing adolescent health needs in New Zealand, but monitoring and gathering high quality estimates of adolescent health and social issues remains challenging and resource intensive. Previous nationally representative secondary school surveys were conducted in New Zealand in 2001, 2007 and 2012, as part of the Youth2000 survey series. This paper focuses on a fourth survey conducted in 2019 (https://www.youth19.ac.nz/). The 2019 survey had a regional sampling strategy rather than a national sampling strategy as in previous years. The survey also included kura kaupapa Māori schools (Māori language immersion schools), as well as mainstream secondary schools. This paper presents the overall study methodology, and a weighting and calibration framework in order to provide estimates that reflect the national student population, and enable comparisons with the previous surveys to monitor trends.

Methods

Youth19 was a cross sectional, self-administered health and wellbeing survey of New Zealand high school students. The survey population was secondary school students of New Zealand aged 12 to 18 years (school years 9–13). The study population was drawn from three education regions: Auckland, Tai Tokerau (Northland) and Waikato. These are the most ethnically diverse regions in New Zealand and account for 46% of the adolescent population in New Zealand. The sampling design was two-stage clustered stratified, where schools were the clusters, and strata were defined by kura schools and educational regions. There were four strata, formed as follows: kura schools (Tai Tokerau, Auckland and Waikato regions combined), mainstream-Auckland, mainstream-Tai Tokerau and mainstream-Waikato. From each stratum, 50% of the schools were randomly sampled and then 30% of students from the selected schools were invited to participate. All students in the kura kaupapa schools were invited to participate. In order to make more precise estimates and adjust for differential non-response, as well as to make nationally relevant estimates and allow comparisons with the previous national surveys, we calibrated the sampling weights to reflect the national secondary school student population.

Results

There were 45 mainstream and 4 kura schools included in the final sample, and 7,374 mainstream and 347 kura students participated in the survey. There were differences between the sampled population and the national secondary school student population, particularly in terms of sex and ethnicity, with a higher proportion of females and Asian students in the study sample than in the national student population. We calculated estimates of the totals and proportions for key variables that describe risk and protective factors or health and wellbeing factors. Rates of risk-taking behaviours were lower in the sampled population than what would be expected nationally, based on the demographic profile of the national student population. For the regional estimates, calibrated weights yield standard errors lower than those obtained with the unadjusted sampling weights. This leads to significantly narrower confidence intervals for all the variables in the analysis. The calibrated estimates of national quantities provide similar results. Additionally, the national estimates for 2019 serve as a tool to compare to previous surveys, where the sampling population was national.

Conclusions

One of the main goals of this paper is to improve the estimates at the regional level using calibrated weights to adjust for oversampling of some groups, or non-response bias. Additionally, we also recommend the use of calibrated estimators as they provide nationally adjusted estimates, which allow inferences about the whole adolescent population of New Zealand. They also yield confidence intervals that are significantly narrower than those obtained using the original sampling weights.

1. Background

High-quality population-based data that provide estimates of adolescent behaviours are essential for the planning of services, programmes and policy, and for monitoring equitable outcomes. However, undertaking such surveys is expensive, complex and resource-intensive. Significant progress has been made in addressing adolescent health needs in New Zealand and globally since the turn of the century, with reductions in morbidity and mortality [1, 2] and increased surveillance of adolescent health trends. However, some areas, such as mental health, remain a concern [3], and important new areas that affect adolescent wellbeing have emerged, such as vaping and social media use [4]. Monitoring and tracking trends in adolescent health are vital, particularly for Indigenous, ethnic and sexual minority youth, those with disabilities and those from poor neighbourhoods [5].

To investigate the health and wellbeing of young New Zealanders, as part of the Youth2000 survey series, nationally representative secondary school surveys were conducted in New Zealand in 2001, 2007 and 2012 [1, 2]. These surveys provided an opportunity to assess the situation at each time point, and monitor trends in key indicators of health and wellbeing. These surveys randomly sampled secondary schools across New Zealand, and from each school that consented to take part, a random sample of around 8,500 year 9–13 students were selected to participate. More recent estimates are required in order to monitor progress and identify areas that need further attention. In 2019 (https://www.youth19.ac.nz/), schools were sampled from only three regions (Waikato, Auckland and Tai Tokerau/Northland) rather than from the whole country, owing to the loss of the Government contract. Alternative funding was sought, and because of logistical and budgetary constraints a pragmatic decision was made to survey a smaller proportion of students and regions.

In this paper we present the overall study methodology and show how a weighting and calibration framework can provide estimates that reflect the national student population, ensure that ethnic groups, particularly Māori, are adequately represented, and enable comparisons with the estimates from previous surveys.

2. Methods

2.1. Study design

Youth19 was a cross-sectional, self-administered health and wellbeing survey of New Zealand secondary school students. Full details of the methods have been published elsewhere [6]. The study had the following aims:

  1. To collect, analyse and disseminate accurate, comprehensive and timely information on the health and wellbeing of young people living in Tai Tokerau, Auckland and Waikato Regions, in order to inform and improve policies and practices;
  2. To evaluate how whanaungatanga influences health outcomes for rangatahi Māori;
  3. To test the potential benefits of incorporating opt-in access to links for support services within a survey.

2.2. Target and study populations

New Zealand secondary school students (aged 13–18 years, school years 9–13) were surveyed across three regions: Auckland, Tai Tokerau/Northland and Waikato. Almost half the New Zealand youth population (46%) resides in these areas. They are the most ethnically diverse regions in New Zealand and include a range of urban and rural settings as well as a breadth of socio-economic groupings. These three regions were chosen to represent the diversity of the New Zealand population, and to ensure that the number of participants from each of the main ethnic groups provided sufficient statistical power for sub-group analyses. Previous population-based studies have used these three regions and found them to be representative of national statistics [7].

2.3. Sampling design

We used the Education Counts 2017 national list of schools as our sampling frame [8], and excluded schools from regions other than Auckland, Tai Tokerau and Waikato. We used a two-stage cluster sampling design. We included single-sex, co-educational, public, private and fully integrated schools that had over 50 students in years 9–13. As in the previous three surveys, schools with under 50 students were excluded for logistical reasons; hence the conclusions presented here apply only to students attending schools with over 50 students. Special schools that only included students who had intellectual or physical disabilities that would have prevented them from participating in the survey were excluded. We stratified our sample by kura schools and educational regions. There were four strata, formed as follows: kura schools (Tai Tokerau, Auckland and Waikato regions), mainstream-Auckland, mainstream-Tai Tokerau and mainstream-Waikato.

There were 161 eligible mainstream schools (100 in Auckland, 23 in Tai Tokerau and 38 in Waikato). From each stratum, 50% of the schools were randomly sampled using a random number generator. All selected schools were invited to participate through email and follow-up phone calls. We piloted the survey in 2019 in two additional schools from the same sampling frame in Auckland. These two schools were purposively selected and were large, ethnically and socio-economically diverse schools. Minimal changes were made to the survey after piloting, and these schools have been included in the total. We visited the other schools that agreed to participate between May and September 2019. We randomly sampled 30% of students on the school roll to be invited to participate in the study. One mainstream school requested that 100% of its students be invited, and this was done.

There were 8 eligible kura kaupapa Māori schools in the three study regions, and two from each region (6 in total) were invited to participate. These schools are smaller than mainstream schools and include immersion in Māori language and culture. Four schools participated and all students in these kura kaupapa schools were invited to participate.

We calculated sample weights as inverse probability weights using the sampling design described above. This design is described in detail in Table 1 and Fig 1.
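As a rough illustration of an inverse probability weight, the sketch below computes the nominal design weight implied by the selection fractions quoted above (50% of schools, then 30% of students). The function name `design_weight` is hypothetical. The calculation ignores the pilot schools, the school that invited all students, stratum-specific details and any non-response adjustment, so it should not be read as a reproduction of the weights in Table 1.

```python
def design_weight(p_school: float, p_student: float) -> float:
    """Inverse of the overall selection probability in a two-stage design."""
    if not (0 < p_school <= 1 and 0 < p_student <= 1):
        raise ValueError("selection probabilities must lie in (0, 1]")
    return 1.0 / (p_school * p_student)


# Nominal mainstream design: 50% of schools, then 30% of students on the roll.
# These fractions come from the text; everything else here is an assumption.
w_mainstream = design_weight(0.5, 0.3)

print(f"nominal mainstream design weight: {w_mainstream:.2f}")
```

Under these assumptions each sampled mainstream student nominally represents between six and seven students in the three regions.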

2.4. Data collection

The survey was refined from previous Youth2000 series questions (https://www.fmhs.auckland.ac.nz/en/faculty/adolescent-health-research-group/publications-and-reports.html), validated measures and measures used in other surveys, as well as new questions developed from a rangatahi and Māori whanau photovoice and qualitative research process, and a digitally integrated survey process. Students completed the web-based survey on tablets in English or Te Reo Māori (the language of New Zealand’s indigenous people). Questions appeared in text on the screen and were available via voice-over through headphones [6].

2.5. Regional estimates

Most estimates for this study are based on totals, means or proportions. To help simplify the exposition, we present the methods in the context of estimating totals. This is applicable to means and proportions since they are functions of totals. Initially, we have a population of size N and we are interested in estimating the total of a variable of interest, called y, which can be written as $T_y = \sum_{i=1}^{N} y_i$. In the absence of complete data from all the population, $T_y$ cannot be calculated and must instead be estimated. Since the sampling design is a stratified multistage design, the estimator has to account for this design through weights [9–12]. The weighted estimator of $T_y$ is $\hat{T}_y = \sum_{i \in s} w_i y_i$, where s denotes the set of sampled individuals, $w_i = 1/\pi_i$, and $\pi_i$ is the sampling probability for individual i. The weight $w_i$ can be interpreted as the number of people that individual i represents in the population. This type of estimator and its variances are available from the survey package in R.
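The weighted estimator of a total, and a proportion expressed as a ratio of two such totals, can be sketched in a few lines. This is a minimal illustration in plain Python rather than the R survey package used in the study. The function names and the five-person sample are hypothetical, and variance estimation, which depends on the strata and clusters, is deliberately left out.

```python
def weighted_total(y, pi):
    """Estimate a population total as the sum of y_i / pi_i over sampled units."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    return sum(yi / p for yi, p in zip(y, pi))


def weighted_proportion(y, pi):
    """A proportion as a ratio of totals: total of y over estimated population size."""
    return weighted_total(y, pi) / weighted_total([1] * len(y), pi)


# Hypothetical sample: y = 1 if the student reports the outcome, 0 otherwise.
y = [1, 0, 0, 1, 0]
# Hypothetical sampling probabilities for the same five students.
pi = [0.15, 0.15, 0.15, 0.30, 0.30]

total_hat = weighted_total(y, pi)
prop_hat = weighted_proportion(y, pi)
print(f"estimated total: {total_hat:.2f}, estimated proportion: {prop_hat:.3f}")
```

Students with a smaller sampling probability carry a larger weight, so they contribute more to both the numerator and the denominator of the proportion.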

2.6. Missing observations and extrapolation to the national population

The weighted estimator presented above accounts for the sampling scheme, but it has several drawbacks. First, it is unbiased, but only when there is no missing information. Second, it is known to be inefficient because it yields wider confidence intervals than other estimators of totals [13]. An approach to attaining more efficient estimators is to use auxiliary information available for the entire population (e.g. information from the sampling frame). For instance, for the Youth 2019 survey, an option would be to use information on the ethnic distribution of students in the population. This information was not used to inform the sampling design, but we can use it after the design stage to improve the estimators [14, 15].

Calibration is one such method. It has been used in the literature when sampling weights are incorrect, to correct for non-response, or to extrapolate to wider populations where there is compelling evidence that the factors contributing to the estimators are very similar in the target population and in the wider population [10, 16, 17]. The primary idea of calibration is to adjust the sampling weights $w_i$ such that totals of known quantities are exactly estimated. To see this, let M denote the total number of Māori students in the population of interest. From the sampling frame, we know that this number is 24,983 for the three regions in the study, and 59,040 for the whole country. Although M is known, it is interesting to investigate what would be the estimator of M using only the survey data. That is $\hat{M} = \sum_{i \in s} w_i l_i$, where s is the set of sampled individuals and $l_i$ is a binary variable denoting whether individual i is Māori or not. Since M is actually known, one could always modify the sampling weights such that $\hat{M} = M$. The new weights ($w_i^{*}$) are found by minimizing a distance function between the original sampling weights and the modified weights subject to the constraint $\sum_{i \in s} w_i^{*} l_i = M$. The new weights are known as calibrated weights and the estimator is denoted $\hat{T}_y^{cal} = \sum_{i \in s} w_i^{*} y_i$. In theory, the variance of $\hat{T}_y^{cal}$ will never be larger than the variance of $\hat{T}_y$, which is based on the original weights. This calibration process can be done using several variables simultaneously. For example, the weights can be calibrated to demographic factors that are considered important in the analysis, and are available both for the sampling frame and the study population. Calibration can be implemented via the survey package in R with the function calibrate() [18, 19].
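To make the constraint concrete, the sketch below calibrates a set of weights to a single known total using the chi-square distance, for which the solution has a closed form. This is only one possible distance function, and the text does not say which one the authors used with calibrate(). The function name `calibrate_linear` and the six-student sample are hypothetical. Only the regional Māori total of 24,983 is taken from the text.

```python
def calibrate_linear(weights, x, known_total):
    """Linear calibration to one known total.

    Minimises sum((g_i - w_i)**2 / w_i) subject to sum(g_i * x_i) == known_total.
    The solution is g_i = w_i * (1 + lam * x_i) for a single scalar lam.
    """
    estimated = sum(w * xi for w, xi in zip(weights, x))
    denom = sum(w * xi * xi for w, xi in zip(weights, x))
    if denom == 0:
        raise ValueError("auxiliary variable is zero for every sampled unit")
    lam = (known_total - estimated) / denom
    return [w * (1 + lam * xi) for w, xi in zip(weights, x)]


# Hypothetical sample of six students with equal sampling weights.
w = [5000.0] * 6
# l = 1 marks a Maori student, 0 otherwise (hypothetical values).
l = [1, 1, 0, 0, 0, 0]
# Regional Maori student total quoted in the text.
M = 24983

w_cal = calibrate_linear(w, l, M)

before = sum(wi * li for wi, li in zip(w, l))
after = sum(wi * li for wi, li in zip(w_cal, l))
print(f"estimate before calibration: {before:.0f}, after: {after:.0f}")
```

With a single binary auxiliary variable only the weights of students with l = 1 change. In practice one would calibrate to several totals at once, including the population size, so that the remaining weights are adjusted as well.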

2.7. Calibrated estimates: Regional and national

We use calibrated weights at the regional level (Regions: Auckland, Tai Tokerau and Waikato) in order to improve the efficiency of our estimates, and adjust for differential non-response. In our case, we calibrate the regional weights to Regional totals of the demographic variables available from Education Counts: kura kaupapa Māori, School Deciles, Age, Gender and Ethnicity. The deciles are a measure of the socio-economic position of a school’s student community relative to other schools throughout the country. For example, decile 1 schools are the 10% of schools with the highest proportion of students from low socio-economic communities, whereas decile 10 schools are the 10% of schools with the lowest proportion of these students. A school’s decile does not indicate the overall socio-economic mix of the school or reflect the quality of education the school provides. Deciles are used to provide funding to state and state-integrated schools to enable them to overcome the barriers to learning faced by students from lower socio-economic communities. The lower the school’s decile, the more funding it receives [20]. The majority of our outcome variables show a significant relationship to at least one of these demographic variables, as can be seen in the descriptive plots in S1 Statistics. Calibration invokes no assumptions apart from the study sample being a sample selected using a probabilistic design from a population of interest [16]. In our case, this assumption holds for regional estimates. However, for national estimates, the population of interest (national) is different to the population from which the sample was selected.

Our main goal is to generate national statistics that enable us to compare the results to previous national surveys. In order to do this, we have to assume that the regional sample is selected from the national population. This means that the distributions of factors contributing to the estimators are very similar in the Regional population (Regions: Auckland, Tai Tokerau and Waikato) and in the national population. In order to account for the demographic distribution of the national population, we calibrate these weights to the National totals of the same demographic variables used for the regional weights (kura kaupapa Māori, School Deciles, Age, Gender and Ethnicity). This calibration was done using the calibrate() function from the R package survey. The totals used for calibration are education counts available from https://www.education.govt.nz/our-work/contact-us/. In order to understand how different weights affect the estimation of outcomes of interest, we compared results for key health and well-being indicators (Tables 4 and 5).
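Calibrating to several demographic totals at once can be illustrated with raking (iterative proportional fitting), which rescales the weights one margin at a time until every margin matches its known total. This is a sketch under stated assumptions. Raking is only one of the calibration options and the text does not say which one was used. The function name `rake`, the six-student sample and all target totals below are hypothetical and are not Education Counts figures.

```python
def rake(weights, groups, targets, n_iter=500, tol=1e-12):
    """Iterative proportional fitting of weights to several known margins.

    groups  : dict mapping margin name -> list of category labels, one per unit
    targets : dict mapping margin name -> dict of category -> known total
    """
    w = list(weights)
    for _ in range(n_iter):
        max_change = 0.0
        for name, labels in groups.items():
            # Current weighted total in each category of this margin.
            current = {}
            for wi, lab in zip(w, labels):
                current[lab] = current.get(lab, 0.0) + wi
            # Rescale each unit so that its category matches the known total.
            for i, lab in enumerate(labels):
                factor = targets[name][lab] / current[lab]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:
            break
    return w


# Hypothetical sample of six students with equal starting weights.
start = [100.0] * 6
sex = ["F", "F", "F", "F", "M", "M"]
eth = ["Maori", "Other", "Other", "Other", "Maori", "Other"]

# Hypothetical population totals; both margins sum to 600.
targets = {
    "sex": {"F": 300.0, "M": 300.0},
    "eth": {"Maori": 150.0, "Other": 450.0},
}

raked = rake(start, {"sex": sex, "eth": eth}, targets)

female_total = sum(wi for wi, s in zip(raked, sex) if s == "F")
maori_total = sum(wi for wi, e in zip(raked, eth) if e == "Maori")
print(f"female total: {female_total:.2f}, Maori total: {maori_total:.2f}")
```

In this toy sample females are over-represented relative to the targets, so their weights shrink while the weights of males grow, which mirrors the adjustment described for the study sample.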

2.8. Ethics

In each participating mainstream school, the principal or head of the board of trustees provided consent for the students to be invited to participate. Information for parents in English and Te Reo Māori was provided to the school (digitally and/or printed) and made available to parents and caregivers, who could opt to have their child excluded from the survey. Ethics approval was granted by the University of Auckland Human Subjects Ethics Committee (application #022244).

3. Results

3.1. Study participants

There were 2,531 schools nationally, and 624 in the Auckland, Tai Tokerau and Waikato regions. We excluded 819 schools from these regions because they had fewer than 50 year 9–13 students, and five because they were partnership schools. A further 2 were mistakenly excluded due to human error. This left 161 eligible mainstream schools in the three regions. Two large ethnically diverse schools were purposively selected for piloting, and 78 schools were randomly selected, making 80 (49.7%) in total. Of these, 45 (56.3%) agreed to participate. There were 41,828 students at participating mainstream schools, and 7,374 (59.7%) participated. The sampling design is shown in Fig 1.

There were 95 kura kaupapa Māori nationally, and 8 in the Auckland, Tai Tokerau and Waikato regions. Six were invited, and 4 (66.7%) agreed to participate. There were 470 students at participating kura, all of whom were invited to participate, and 347 (71%) participated. The rest of the results are presented for mainstream schools only.

The unweighted characteristics of participating mainstream schools and students are shown in Table 2, alongside comparable data for the previous national surveys, and all secondary school students in New Zealand. The participation rates were much lower than previous surveys, both for schools (56.3%) and for students (59.7%). In 16 schools, participation was under 50%, with a measles outbreak, teacher strikes, and high truancy rates indicated by school staff as likely to have affected response rates in their schools. Apart from this, illness, assessments and field-trips may have resulted in students being unable to participate. The majority of non-participating students did not arrive at the room in which the survey was taking place, and only 49 arrived at the room but declined to participate.

Table 2. Unweighted characteristics of participating mainstream schools and students from previous surveys and national school data prior to the studies.

https://doi.org/10.1371/journal.pone.0251177.t002

There are some important differences in the demographic characteristics of participating students compared to previous surveys, and to the national secondary school population distribution. There was a lower proportion of high decile schools included in the sample, but higher participation at high decile schools means that the 2019 sample population matches the national student population quite well in terms of school decile. There was a lower proportion of boys (45.1%) compared to the national student population, and to the surveys in 2001 and 2012, and a slightly higher proportion of students aged 17 and above than in previous surveys, though this is still lower than the national student population. There were ethnic differences too, with a much higher proportion of Asian students than in previous surveys and in the national student population, reflecting a higher proportion of the Asian population living in the Auckland region.

3.2. Estimates

Tables 3 and 4 display the actual regional and national totals and proportions for variables used to calibrate the sampling weights. This excludes kura kaupapa Māori schools because previous waves did not include such schools. We can observe in the confidence intervals that the variance yielded by calibrated weights is zero for these variables. This is due to the fact that we are calibrating to the actual totals, therefore the calibrated estimates should be exactly the same as the actual totals and in consequence there is no uncertainty (or variance). Fig 2 shows the distribution of calibrated and sampling weights. There is a significant shift (right skewed) in the distribution of calibrated weights. This happens because a large number of individuals are overrepresented by the original sampling weights. Thus, calibration decreases the magnitude of the weights of those individuals that are overrepresented, and increases the weights of individuals that are underrepresented.

Fig 2. Distribution of regional and national sampling weights and calibrated weights.

https://doi.org/10.1371/journal.pone.0251177.g002

Table 3. National and regional student population estimates- comparison of actual quantities, estimates using sampling weights and estimates using calibrated weights.

https://doi.org/10.1371/journal.pone.0251177.t003

Table 4. National and regional estimates- comparison of actual quantities, estimates using sampling weights and estimates using calibrated weights.

https://doi.org/10.1371/journal.pone.0251177.t004

Table 5 shows the estimates of the total student numbers for key health and wellbeing indicators, and Table 6 shows the estimated proportions. For the regional estimates, calibrated weights yield standard errors lower than those obtained with the unadjusted sampling weights. This leads to significantly narrower confidence intervals for all the variables in the analysis. We only present calibrated estimates of national quantities because the sampling design was a regional design and therefore, we do not have national level sampling weights. However, national calibration provides a tool to compare to previous surveys, where the sampling population was national.

Table 5. Estimates of total student numbers for health and wellbeing indicators.

https://doi.org/10.1371/journal.pone.0251177.t005

4. Discussion

We have conducted a multistage cluster sample survey of New Zealand secondary school students from three regions, to build on three previous national surveys. We used calibrated inverse probability weighting in order to correct for demographic differences between the regional and national student populations and for non-response, which enables extrapolation of the results from the Youth 2019 survey to the whole secondary school population of New Zealand. The original sampling design was only representative of the three main regions in New Zealand (Tai Tokerau, Auckland and Waikato). These three regions are believed to represent the diversity of the New Zealand population [7]; however, using these data to make inferences about the national situation is imprecise.

One of the main goals of this paper is to improve the estimates at the regional level using calibrated weights. Calibration aims to account for oversampling or non-response of some groups of individuals. An interesting example is the proportion of individuals suffering depressive symptoms. Using the original sampling weights, this proportion is estimated to be 0.25 (0.224,0.275), while national calibration yields a lower estimate of 0.227 (0.216,0.239). A reason for this is that the original sample could have oversampled individuals more prone to suffer such symptoms, including a higher proportion of girls and a higher proportion of Pacific and Asian students. Another example is the proportion of individuals who reported binge drinking in the last 4 weeks. This proportion is estimated as 0.177 (0.155,0.199) using sampling weights, while it is estimated to be 0.219 (0.204,0.234) using nationally calibrated weights. The original sample could have oversampled individuals that were less likely to engage in binge drinking, with the higher proportion of girls, Pacific and Asian students representing groups who engage in binge drinking less.
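The difference in precision between the two sets of intervals quoted above can be checked with simple arithmetic. The sketch below only computes interval widths from the figures reported in this paragraph. The dictionary layout and variable names are hypothetical, and the comparison is descriptive rather than a formal test.

```python
def ci_width(lower, upper):
    """Width of a confidence interval."""
    return upper - lower


# Confidence limits as reported in the text.
reported = {
    "depressive symptoms": {
        "sampling": (0.224, 0.275),
        "national calibrated": (0.216, 0.239),
    },
    "binge drinking": {
        "sampling": (0.155, 0.199),
        "national calibrated": (0.204, 0.234),
    },
}

# Interval widths, rounded to the precision of the reported limits.
widths = {
    outcome: {kind: round(ci_width(*limits), 3) for kind, limits in by_kind.items()}
    for outcome, by_kind in reported.items()
}

for outcome, w in widths.items():
    ratio = w["national calibrated"] / w["sampling"]
    print(f"{outcome}: calibrated interval is {ratio:.0%} as wide as the sampling-weight interval")
```

For both outcomes the calibrated interval is noticeably shorter than the one based on the original sampling weights, even though the point estimates move in opposite directions.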

A question that arises is which estimators are more reliable. We recommend the use of calibrated estimators, as they yield confidence intervals that are significantly narrower than those obtained using the original sampling weights. This is a well-known property of calibration, since it reduces the uncertainty in the sample by incorporating information known prior to the study [19, 23, 24].

An additional goal of this paper was to use the regional sample to make inferences about the whole adolescent population of New Zealand. This is particularly important because previous Youth2000 surveys were designed using a national sampling frame instead of a regional sampling frame. The 2019 survey was designed using a 3-region sampling frame for logistical and financial reasons. There is ongoing interest in comparing the results and trends with previous national surveys. To achieve this, we calibrated our regional sampling weights to represent the national population based on some of the demographic factors presented in Table 2. The calibrated estimates presented in Table 4 show some differences between the regional and national proportions.

There are few nationally representative data available for health and wellbeing indicators among New Zealand youth, apart from the Youth 2000 surveys. The ASH Year 10 Snapshot survey reported that 5.9% of Year 10 students are regular smokers. Our data show that 4.8% of all secondary school students are regular smokers, but this includes younger students who are less likely to smoke. The NZ Health Survey estimated that 78.9% of adults over 15 years visited their GP in the last 12 months, which compares with our estimate of 78.1%. Likewise, the NZ Health Survey estimated that 20.6% of adults over 15 years had an unmet need for healthcare, and our data estimate this to be 20.2% [25, 26]. These results highlight that calibration methods can improve the precision of national estimates when compared to similar surveys. However, it should be noted that calibration methods cannot account for factors outside of demographic features (i.e. unique regional differences) and therefore should be utilised with this limitation in mind.

Future research will involve calibration of the previous surveys using a similar approach to reduce bias in the estimates, as well as investigating how different designs can improve the results and methods for combining the periodic complex surveys done in the years 2001, 2007, 2012 and 2019.

References

  1. Clark T, Fleming T, Bullen P, Crengle S, Denny S, et al. (2013) Health and well-being of secondary school students in New Zealand: trends between 2001, 2007 and 2012. J Paediatr Child Health 49: 925–934. pmid:24251658
  2. Lewycka S, Clark T, Peiris-John R, Fenaughty J, Bullen P, et al. (2018) Downwards trends in adolescent risk-taking behaviours in New Zealand: Exploring driving forces for change. J Paediatr Child Health 54: 602–608. pmid:29779222
  3. Bor W, Dean AJ, Najman J, Hayatbakhsh R (2014) Are child and adolescent mental health problems increasing in the 21st century? A systematic review. Aust N Z J Psychiatry 48: 606–616. pmid:24829198
  4. Soneji S, Barrington-Trimis JL, Wills TA, Leventhal AM, Unger JB, et al. (2017) Association Between Initial Use of e-Cigarettes and Subsequent Cigarette Smoking Among Adolescents and Young Adults: A Systematic Review and Meta-analysis. JAMA Pediatr 171: 788–797. pmid:28654986
  5. Clark TC, Le Grice J, Moselen E, Fleming T, Crengle S, et al. (2018) Health and wellbeing of Māori secondary school students in New Zealand: Trends between 2001, 2007 and 2012. Aust N Z J Public Health 42: 553–561. pmid:30370961
  6. Fleming T, Peiris-John R, Crengle S, Archer D, Sutcliffe K, et al. (2020) Youth19 Rangatahi Smart Survey, Initial Findings: Introduction and Methods. The Youth19 Research Group. The University of Auckland and Victoria University of Wellington, New Zealand.
  7. Morton SMB, Ramke J, Kinloch J, Grant CC, Carr PA, et al. (2015) Growing Up in New Zealand cohort alignment with all New Zealand births. Aust N Z J Public Health 39: 82–87. pmid:25168836
  8. Education Counts: School Rolls (n.d.). Available: https://www.educationcounts.govt.nz/statistics/schooling/student-numbers/6028. Accessed 22 July 2020.
  9. Deville J-C, Särndal C-E (1992) Calibration estimators in survey sampling. J Am Stat Assoc 87: 376–382.
  10. Särndal C-E, Swensson B, Wretman J (1992) Model Assisted Survey Sampling. New York.
  11. Fuller WA (2009) Sampling Statistics. Hoboken, NJ, USA: John Wiley & Sons, Inc. https://doi.org/10.1002/9780470523551
  12. Särndal C-E, Lundström S (2005) Introduction to estimation in the presence of nonresponse. Estimation in Surveys with Nonresponse. Chichester, UK: John Wiley & Sons, Ltd. pp. 43–48. https://doi.org/10.1002/0470011351.ch5
  13. Breslow NE, Holubkov R (1997) Weighted likelihood, pseudo-likelihood and maximum likelihood methods for logistic regression analysis of two-stage data. Stat Med 16: 103–116. pmid:9004386
  14. Spiegelman D, Rivera-Rodriguez CL, Haneuse S (2016) Evaluating Public Health Interventions: 3. The Two-Stage Design for Confounding Bias Reduction-Having Your Cake and Eating It Two. Am J Public Health 106: 1223–1226. pmid:27285260
  15. Breslow NE, Amorim G, Pettinger MB, Rossouw J (2013) Using the Whole Cohort in the Analysis of Case-Control Data: Application to the Women’s Health Initiative. Stat Biosci 5. pmid:24363785
  16. Deville J-C, Särndal C-E (1992) Calibration estimators in survey sampling. J Am Stat Assoc 87: 376–382.
  17. Fuller W (2009) Sampling Statistics. John Wiley and Sons.
  18. Lumley T, Shaw PA, Dai JY (2011) Connections between survey calibration estimators and semiparametric models for incomplete data. Int Stat Rev 79: 200–220. pmid:23833390
  19. Breslow NE, Lumley T, Ballantyne CM, Chambless LE, Kulich M (2009) Improved Horvitz-Thompson Estimation of Model Parameters from Two-phase Stratified Samples: Applications in Epidemiology. Stat Biosci 1: 32. pmid:20174455
  20. School Deciles (2021). New Zealand Ministry of Education. Available: https://www.education.govt.nz/school/funding-and-financials/resourcing/operational-funding/school-decile-ratings/. Accessed 23 March 2021.
  21. Statistics New Zealand (2005) Statistical standard for ethnicity.
  22. Salmond C, Crampton P, Sutton F, Atkinson J (2006) NZDep2006 Census Area Unit Data. Available: http://www.wnmeds.ac.nz/academic/dph/research/socialindicators.html.
  23. Rivera-Rodriguez C, Toscano C, Resch S (2019) Improved calibration estimators for the total cost of health programs and application to immunization in Brazil. PLoS One 14: e0212401. pmid:30840645
  24. Breslow NE, Lumley T, Ballantyne CM, Chambless LE, Kulich M (2009) Using the whole cohort in the analysis of case-cohort data. Am J Epidemiol 169: 1398–1405. pmid:19357328
  25. NZHS (2018) Regional Data Explorer. New Zealand Health Survey: Regional Data Explorer. Available: https://minhealthnz.shinyapps.io/nz-health-survey-2014-17-regional-update/. Accessed 25 March 2021.
  26. ASH (2019) 2019 ASH Year 10 Snapshot: Topline Results–Smoking. Action on Smoking and Health.