WHO MONICA Project e-publications, No. 12

Quality Assessment of Weight and Height Measurements in the WHO MONICA Project

May 1998

Anu Molarius¹, Kari Kuulasmaa¹ and Susana Sans² for the WHO MONICA Project³

¹ MONICA Data Centre, National Public Health Institute, Helsinki, Finland
² Department of Health and Social Security, Institute of Health Studies, Barcelona, Spain
³ Annex: Sites and key personnel of the WHO MONICA Project

© Copyright World Health Organization (WHO) and the WHO MONICA Project investigators 1999. All rights reserved.


Thanks are due to Alun Evans who commented on the text, Anna-Maija Koivisto who was involved in the preparation of the initial survey quality assessment and Tuula Virman-Ojanen who collected the data from the Survey Procedures Questionnaires.

The MONICA Centres are funded predominantly by regional and national governments, research councils, and research charities. Coordination is the responsibility of the World Health Organization (WHO), assisted by local fund raising for congresses and workshops. WHO also supports the MONICA Data Centre (MDC) in Helsinki. Not covered by this general description is the ongoing generous support of the MDC by the National Public Health Institute of Finland, and a contribution to WHO from the National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA for support of the MDC. The completion of the MONICA Project is generously assisted through a Concerted Action Grant from the European Community. Likewise appreciated are grants from ASTRA Hässle AB, Sweden, Hoechst AG, Germany, Hoffmann-La Roche AG, Switzerland, the Institut de Recherches Internationales Servier (IRIS), France, and Merck & Co. Inc., New Jersey, USA, to support data analysis and preparation of publications.

MONICA data items considered in this document



1. Introduction

The main hypothesis of the WHO MONICA Project (1) concerns whether 10-year trends in the incidence of, and mortality from, cardiovascular disease are related to changes in known risk factors. Weight (relative weight, degree of overweight, obesity etc.) was not originally included as one of these risk factors, although data on weight and height have been collected from the beginning of the MONICA survey periods. Although the matter is still somewhat under debate, overweight is now considered one of the risk factors for the main hypothesis.

Compared with most other measurements in the MONICA population surveys, like serum cholesterol or blood pressure, weight and height can be measured more easily, provided that the standard measurement procedures are being followed and the measurers have been trained properly. However, if the standard procedures and/or training have been neglected, the measurements are vulnerable to various biases. Potential sources of bias are inadequate measurement devices, incorrect calibration of the measurement devices, inappropriate clothing or position of the subject during measurement, and wrong position of the measurer.

The measurements are mainly used to derive relative weight, most often expressed as the Body Mass Index (BMI), calculated as weight (kg) divided by the square of height (m²). Table A shows the bias (%) in BMI induced by different biases in weight and height.

Table A Bias (%) in BMI induced by different biases in weight and height

Bias in                               Bias in weight (%)
height (%)    -10.0   -5.0   -2.0   -1.0   -0.5    0.0    0.5    1.0    2.0    5.0   10.0
-10.0          11.1   17.3   21.0   22.2   22.8   23.5   24.1   24.7   25.9   29.6   35.8
-5.0           -0.3    5.3    8.6    9.7   10.2   10.8   11.4   11.9   13.0   16.3   21.9
-2.0           -6.3   -1.1    2.0    3.1    3.6    4.1    4.6    5.2    6.2    9.3   14.5
-1.0           -8.2   -3.1    0.0    1.0    1.5    2.0    2.5    3.1    4.1    7.1   12.2
-0.5           -9.1   -4.0   -1.0    0.0    0.5    1.0    1.5    2.0    3.0    6.1   11.1
0.0           -10.0   -5.0   -2.0   -1.0   -0.5    0.0    0.5    1.0    2.0    5.0   10.0
0.5           -10.9   -5.9   -3.0   -2.0   -1.5   -1.0   -0.5    0.0    1.0    4.0    8.9
1.0           -11.8   -6.9   -3.9   -3.0   -2.5   -2.0   -1.5   -1.0    0.0    2.9    7.8
2.0           -13.5   -8.7   -5.8   -4.8   -4.4   -3.9   -3.4   -2.9   -2.0    0.9    5.7
5.0           -18.4  -13.8  -11.1  -10.2   -9.8   -9.3   -8.8   -8.4   -7.5   -4.8   -0.2
10.0          -25.6  -21.5  -19.0  -18.2  -17.8  -17.4  -16.9  -16.5  -15.7  -13.2   -9.1

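The biases in Table A were derived as follows: if the bias in weight is 100a%, the bias in height is 100b% and the bias in BMI is 100c%, then

$$\frac{(1+a)\,\mathrm{weight}}{\bigl((1+b)\,\mathrm{height}\bigr)^{2}} = (1+c)\,\mathrm{BMI} = (1+c)\,\frac{\mathrm{weight}}{\mathrm{height}^{2}},$$

and hence

$$c = \frac{1+a}{(1+b)^{2}} - 1.$$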

The relative bias induced in BMI by weight equals the relative bias in weight, and the relative bias induced by height is approximately twice the relative bias in height, in the opposite direction. For example, if weight is biased downwards by two per cent (e.g. 49 kg instead of 50 kg), which can easily happen if the device is not properly calibrated, and height is biased upwards by 0.5% (i.e. 0.8 cm at a height of 165 cm), because the value is read from below the triangle of the height rule, there will be a systematic bias of about -3% in the BMI measurements. Three per cent of a BMI value in the range 25-30 kg/m² is nearly 1 kg/m²; this is very large considering that the population standard deviations of BMI are of the order of 4 kg/m², and that the standard error of the mean BMI within a 10-year age group and sex of a MONICA sample is around 0.2-0.4 kg/m².
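The relation between the three biases can be checked numerically. The following Python sketch evaluates c for given a and b; the function name `bmi_bias` is chosen here for illustration and does not come from the MONICA documentation.

```python
def bmi_bias(weight_bias_pct: float, height_bias_pct: float) -> float:
    """Relative bias in BMI (%) implied by relative biases in weight and height (%)."""
    a = weight_bias_pct / 100.0
    b = height_bias_pct / 100.0
    # c = (1 + a) / (1 + b)^2 - 1, expressed as a percentage
    return ((1.0 + a) / (1.0 + b) ** 2 - 1.0) * 100.0


# Worked example from the text: weight biased by -2%, height by +0.5%.
example = round(bmi_bias(-2.0, 0.5), 1)

# One corner cell of Table A: weight biased by -10%, height by -10%.
corner = round(bmi_bias(-10.0, -10.0), 1)

print(example, corner)
```

Looping the function over the row and column values of Table A should reproduce the whole table to one decimal place.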

The purpose of this report is to give an assessment of the weight and height measurements in the three surveys of the MONICA Project. It will, however, not be possible to assess the actual biases in the measurements. The focus of this report will be on the procedures used by the MONICA Collaborating Centres (MCC) to fulfil the MONICA standardization criteria and on the quality of the data available in the MONICA Data Centre (MDC). These will be used as indicators of the performance of the measurers. If standard procedures were used and the data look good, it is likely that the measurements are also unbiased. On the other hand, if standard procedures were not followed and the data look strange, there is a high risk that the data are biased.

2. Material and methods

2.1 Populations

The results of this document are reported by Reporting Unit Aggregates (RUA) which are potential units of analyses of the MONICA data. The RUAs, their abbreviations and reporting units are listed in Table 1. Some of the RUAs have several versions because different combinations of Reporting Units (RU) may be used for cross-sectional and trend analyses if all reporting units of the population were not included in all three or two surveys. Therefore, in AUS-PER, GER-BRE, GER-EGE, GER-KMS, GER-RDM, RUS-MOI and RUS-NOC there is an overlap of reporting units included in the RUAs in some surveys. The RUAs are identified by the abbreviation and a version letter. For UNK-GLA, which carried out four surveys, this report assesses the first (initial), third (middle) and fourth (final) survey. In this report, altogether 54 RUAs are considered for the initial MONICA survey, 43 for the middle survey and 41 for the final survey.

2.2 Age and sex

For the quality analyses all observations within the age group 25-64 were used, except in FRA-LILa in the final survey and in AUS-NEW, BEL-LUX, FRA-STR, FRA-TOU, LTU-KAU, NEZ-AUC, POL-TAR, POL-WAR, RUS-MOC, RUS-MOIa, RUS-MOIb and SWI-TIC where the age range studied was 35-64. Age was defined as age in full years on the date of examination (see DEF1 in Reference (2)). Tables 4 and 5, which give the availability of data and the distributions of weight and height, are restricted to age group 35-64 for all RUAs. No age standardization was used in the analyses for this report.

2.3 Sources of information

Information on the actual survey procedures applied by the MCCs was collected in 1991 using the Questionnaire on MONICA Population Survey Procedures (Form VI). The Survey Procedures Questionnaire was further checked regarding the final survey by the MCCs in 1995. The quality assessment of the data provided is based on the data currently available in the MDC.

3. Quality assessment of survey procedures

The Manual of Operations of 1983 (4) described the procedures for height and weight measurement. The same procedures were repeated in the different versions of the MONICA Manual, until a more detailed description concerning the recommendation for scales and the procedures for checking the scales was given in the version of March 1992 (3). Briefly, weight and height were to be measured with the participants in standing position without shoes and heavy outer garments. Weight was to be recorded to the nearest 200 g, and height to the nearest full cm. A more detailed description of the procedures used in the different RUAs, based on the information given on the Survey Procedures Questionnaire (or possible other communication with the MCCs), is shown in Table 2.

3.1 Removal of clothes

All MCCs from which there is a reply to the survey procedures questionnaire reported having removed shoes and heavy outer garments before the measurements of weight and height, as instructed by MONICA.

3.2 Type of scale

The Manual of Operations specifies that "Only accurate balance scales should be used". The March 1992 version of the Manual specifies further that "The use of balance scales is recommended. If the MCC uses digital scales, testing with standard weights is of particular importance".

In most RUAs, balance scales were used in all surveys. In FRA-LIL, FRA-STR, FRA-TOU, UNK-BEL and UNK-GLA digital scales were used in all surveys. SWE-GOT used digital scales in the middle and final surveys, and AUS-PERab, RUS-MOC and RUS-MOIa in the final survey. CAN-HAL has reported that some bathroom and some balance scales were used in their initial survey. In CHN-BEI and HUN-BUD bathroom scales were used. In three RUAs, where balance scales were used generally, bathroom scales were used on home visits in the initial and middle survey. (In AUS-PER this concerned 1-2% of the surveyed persons in the initial and middle survey and in GER-AUR and GER-AUU 8 subjects in the initial survey. The number in the middle survey in GER-AUR/AUU is not known.)

Both balance and digital scales are reliable provided that they are used properly and their calibration is monitored regularly. However, the quality requirements of an epidemiological study are not met if bathroom scales are used.

3.3 Accuracy of weight measurement

The Manual required body weight to be recorded to the nearest 200 g, although it was technically possible to report the data to the MDC with an accuracy of 100 g. Some MCCs recorded it to an accuracy of 100 g, some to 500 g, some to 1 kg and one MCC (USA-STA) to the nearest 0.5 pounds (about 0.227 kg).

Table 2 shows the accuracy of recording weight, as reported by the MCCs on the Questionnaire on MONICA Population Survey Procedures. Table 7 gives the distributions of the decimals of the kilograms in the actual survey data, which also reflect the accuracy to which weight was recorded. There are surprisingly many discrepancies between the accuracy reported by the MCC and that revealed by the data (Tables B, C and D). This particularly concerns the initial survey, where the two sources of information disagree in 21 out of 54 RUAs. In most cases, such discrepancies probably indicate inadequate training of the weight measurers.

Table B Number of RUAs with different accuracies for recording weight in the initial survey

From            From survey core data
questionnaire   100 g   200 g   227 g   500 g   1.0 kg   Total
100 g              20       2       -       1        5      28
200 g               9       9       -       -        1      19
227 g               -       -       1       -        -       1
500 g               -       -       -       -        3       3
1.0 kg              -       -       -       -        3       3
Total              29      11       1       1       12      54


Table C Number of RUAs with different accuracies for recording weight in the middle survey

From            From survey core data
questionnaire   100 g   200 g   227 g   500 g   1.0 kg   Total
100 g              26       1       -       -        -      27
200 g               2      11       -       -        -      13
227 g               -       -       1       -        -       1
500 g               -       -       -       -        2       2
1.0 kg              -       -       -       -        -       0
Total              28      12       1       0        2      43


Table D Number of RUAs with different accuracies for recording weight in the final survey

From            From survey core data
questionnaire   100 g   200 g   227 g   500 g   1.0 kg   Total
100 g              17       5       -       -        1      23
200 g               6      11       -       -        -      17
227 g               -       -       1       -        -       1
500 g               -       -       -       -        -       0
1.0 kg              -       -       -       -        -       0
Total              23      16       1       0        1      41
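The number of RUAs in which the two sources disagree is the total of a table minus its diagonal. The following Python sketch recomputes that count from Tables B, C and D; the variable and function names are illustrative only, and dashes in the tables are entered as zeros.

```python
# Rows: accuracy reported on the questionnaire.
# Columns: accuracy seen in the survey core data.
# Both axes are ordered 100 g, 200 g, 227 g, 500 g, 1.0 kg.
TABLE_B = [  # initial survey
    [20, 2, 0, 1, 5],
    [9, 9, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 3],
    [0, 0, 0, 0, 3],
]
TABLE_C = [  # middle survey
    [26, 1, 0, 0, 0],
    [2, 11, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]
TABLE_D = [  # final survey
    [17, 5, 0, 0, 1],
    [6, 11, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]


def count_discrepancies(table):
    """Return (number of RUAs off the diagonal, total number of RUAs)."""
    total = sum(sum(row) for row in table)
    agreeing = sum(table[i][i] for i in range(len(table)))
    return total - agreeing, total


for name, table in [("initial", TABLE_B), ("middle", TABLE_C), ("final", TABLE_D)]:
    disagree, total = count_discrepancies(table)
    print(f"{name}: {disagree} of {total} RUAs disagree")
```

For the initial survey this gives the 21 out of 54 RUAs quoted in the text.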

The effect of the different recording accuracies is not serious even if the accuracy is only to full kilogram, provided that the values are always recorded to the nearest kilogram. If the measurement value is not rounded but for example truncated to the recording accuracy, a systematic bias follows. The magnitude of such bias is on average 0.25 kg (i.e. 0.5-0.25% for weights of 50-100 kg) if the recording accuracy is 0.5 kg, and 0.5 kg (i.e. 1-0.5% for weights of 50-100 kg) if the recording accuracy is 1 kg. In particular in the latter case the bias introduced to BMI can be significant. Also, if the rounding is not controlled, or if the measurers are not trained, the data on weight are unreliable.
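The difference between rounding and truncation can be illustrated with a small calculation. The Python sketch below assumes, purely for illustration, that true weights are spread evenly over 50-100 kg on a 1 g grid; the function name and the grid are not part of the MONICA procedures.

```python
def mean_recording_error_kg(step_g: int, mode: str) -> float:
    """Mean of (recorded - true) weight in kg for true weights of 50-100 kg.

    step_g is the recording accuracy in grams; mode is "truncate" or "round".
    """
    total_error_g = 0
    count = 0
    for true_g in range(50_000, 100_000):  # 1 g grid, an illustrative assumption
        if mode == "truncate":
            recorded_g = true_g - true_g % step_g
        elif mode == "round":
            recorded_g = (true_g + step_g // 2) // step_g * step_g
        else:
            raise ValueError(f"unknown mode: {mode}")
        total_error_g += recorded_g - true_g
        count += 1
    return total_error_g / count / 1000.0


for step in (500, 1000):
    print(step,
          round(mean_recording_error_kg(step, "truncate"), 3),
          round(mean_recording_error_kg(step, "round"), 3))
```

Under these assumptions truncation shifts the mean by about half the recording step, while rounding to the nearest step leaves it essentially unchanged.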

The actual recording accuracy of weight in the initial survey was 1 kg in BEL-LUX, GER-BER, GER-COT, GER-EGEb, GER-ERF, GER-RDMc, HUN-BUD, HUN-PEC, ISR-TEL, ITA-LAT, MLT-MLT and UNK-GLA. In the middle survey it was 1 kg in HUN-BUD and HUN-PEC. In the final survey only NEZ-AUC recorded weight to the full 1 kg.

3.4 Accuracy of height measurement

The Manual stipulated recording the body height to the nearest centimetre. Twelve RUAs in the initial survey, 10 in the middle survey and 8 in the final survey recorded it with an accuracy of 0.5 cm. UNK-BEL recorded height with an accuracy of 0.1 cm in all surveys, and FIN-KUO, FIN-NKA and FIN-TUL did so in the final survey. USA-STA recorded height to 0.5 inch (about 1.27 cm) in all three surveys. In ISR-TEL the recording accuracy varied between 1 and 2 cm. Table 2 gives the accuracy of recording height, as reported by the MCCs on the Questionnaire on MONICA Population Survey Procedures.

When height is recorded by the MCC to the nearest 0.5 cm, the value has to be rounded to full centimetres when transferring to the MDC. If the rounding is always up or down, an average bias of 0.25 cm (i.e. about 0.15%) can be expected. Such a bias induces a bias of about 0.3% to the BMI, which is insignificant. The rounding used in the different RUAs is shown in Table 2.

3.5 Use of self-reported data

Four RUAs in the initial, five in the middle, and eight in the final survey reported that they accepted self-reported data instead of measuring height and weight. In each case this concerned only non-ambulatory subjects, which is acceptable according to the MONICA Manual.

4. Quality assurance of the measurements in MCCs

Table 3 summarizes the replies to the Questionnaire on MONICA Population Survey Procedures concerning the quality assurance control applied in the MCCs. All RUAs, for which the questionnaire has been completed, reported that they trained the measurers, except the Swiss RUAs for weight in the initial survey. However, only 19 RUAs in the initial survey, 12 in the middle survey and 17 in the final survey indicated that they had tested or re-certified the measurers during the survey.

Six RUAs reported that they never tested the scales using standard weights during the initial survey. In the middle survey there were only three such RUAs, and the same three in the final survey. 10 RUAs tested the scales only once during the initial survey, 12 during the middle survey and 6 during the final survey. The MONICA Manual advised the testing of the scales daily. The 1992 version of the Manual includes the same instruction, but in addition, elsewhere, instructs "Check the scales at least monthly using standard weights" and "Check the zero level every day before starting measurements and immediately afterwards". There is thus an apparent contradiction in the instructions given. The new instruction should have been sufficient provided that standard weights were also used whenever the scales were moved to a new place. There was no question about testing the zero level in the Survey Procedures Questionnaire. Also, there were no questions about the checking of the height rule.

The last columns of Table 3 show the responses to the question "If readjustment of the scale is necessary, what is done with the data already collected using that scale?" In about a third of the RUAs such measurements were rejected, in a quarter they were reported as if they were correct, and in a quarter there was no plan of action in this contingency. There were also several RUAs which indicated that they tried to re-schedule the examination. This question is relevant only if the scales and height rules are tested regularly, and in such cases the measurements should have been rejected if the re-adjustment was large.

The 1992 version of the MONICA Manual gave specific instructions for situations where the scales need re-adjustment. However, at the time the new version of the Manual was distributed, most MCCs had already started their final survey. Therefore it is not surprising that there was almost no change of action with incorrect data for the final survey.

5. Quality assessment of data provided

5.1 Routine data checking

The MDC checks all population survey core data received from the MCCs at the time they are included in the MONICA database. All possible inconsistencies in the data are reported to the MCC to enable correction of errors. The following constraints concern weight and height measurements:

HEIGHT=999 or 110<HEIGHT<209.
WEIGHT=9999 or 250<WEIGHT<998 or 1000<WEIGHT<1999.
If WEIGHT<9999 and HEIGHT<999 then 0.012<(WEIGHT/HEIGHT²)<0.080.

All violations of these constraints were reported to the MCC for their correction or confirmation. Data values outside the constraint limits were acceptable, but the MCC had to check that the values were not unusual due to data errors. The MCCs were only asked to correct values if they knew that they were incorrect. The currently unresolved constraint violations concerning data on weight and height are listed by survey in Appendix 1. There are only a few unresolved constraint violations.
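The three constraints can be written as a simple check on one record. The Python sketch below is an illustration, not the MDC's actual checking program. It assumes that HEIGHT is coded in full centimetres and WEIGHT in units of 100 g, and that 999 and 9999 are the codes for missing data; these are inferences from the constraint limits, not statements in this report. Under these assumptions WEIGHT/HEIGHT² equals BMI/1000, so the third constraint corresponds to a BMI between 12 and 80 kg/m².

```python
MISSING_HEIGHT = 999    # assumed code for missing height
MISSING_WEIGHT = 9999   # assumed code for missing weight


def constraint_violations(height: int, weight: int) -> list:
    """Return the names of the constraints violated by one record.

    height is assumed to be in cm and weight in units of 100 g.
    """
    violations = []
    if not (height == MISSING_HEIGHT or 110 < height < 209):
        violations.append("HEIGHT")
    if not (weight == MISSING_WEIGHT
            or 250 < weight < 998
            or 1000 < weight < 1999):
        violations.append("WEIGHT")
    if weight < MISSING_WEIGHT and height < MISSING_HEIGHT:
        if not (0.012 < weight / height ** 2 < 0.080):
            violations.append("WEIGHT/HEIGHT^2")
    return violations


print(constraint_violations(170, 700))   # hypothetical record: 170 cm, 70.0 kg
print(constraint_violations(170, 999))   # hypothetical record with WEIGHT=999
```

A non-empty result marks a record to be sent back to the MCC for correction or confirmation; it does not by itself mean that the value is wrong.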

5.2 Missing data, mean values and standard deviations

5.2.1 Weight

Table 4 gives the number of observations (respondents), proportion of missing data among respondents, and the mean value and standard deviation of the weight measurements.

Missing data: In most RUAs missing data are not a problem. The extreme cases are BEL-CHA and BEL-GHE, where one third and one fourth of the survey participants respectively did not come to the survey examination. Other relatively large proportions of missing data are in CAN-HAL (6% in the initial and 17% in the final survey), MLT-MLT (9% in the initial survey), RUS-NOCa (6% in the middle survey), SWE-GOT (8% in the final survey) and YUG-NOS (16% in the final survey). Also in SWE-GOT in the final survey the increased proportion of missing data is due to the fact that many subjects who answered the posted survey questionnaire did not come to the physical examination.

Mean values and standard deviations: Very exceptional mean values and standard errors, and large changes between the surveys, could reflect quality problems. The only big change occurred in FRA-TOU, where the mean weight increased from 69 kg in the initial survey to 78 kg in the middle survey and then decreased to 71 kg in the final survey. This, however, is easily explained by the fact that Toulouse did not examine women in their middle survey.

5.2.2 Height

Table 5 gives the number of observations, proportion of missing data among respondents, and the mean value and standard deviation of the height measurements.

Missing data: The results are similar to those for weight, except for NEZ-AUC, which had 5% of data missing in the final survey.

Mean values and standard deviations: There were prominent changes in the mean height in three RUAs between the initial and middle survey: CHN-BEI (decrease by 1.4 cm), HUN-BUD (decrease by 1.7 cm) and FRA-TOU (increase by 7.2 cm). Each of these changes, however, has an explanation: in CHN-BEI, the middle survey sample included more older people than the initial survey (2), so that the crude mean values are likely to reflect the age differences. A similar explanation applies to HUN-BUD, where the age distribution within the 10-year age groups was very skewed in the initial survey (see MNM 324A). FRA-TOU did not examine women in their middle survey. Between the middle and final survey there was an increase in mean height in CHN-BEI (by 1.5 cm) and a decrease in FRA-TOU (by 5.7 cm) for the reasons presented above, but no other outstanding changes.

5.3 Within cohort trends in height

In contrast to weight, height is a relatively stable body measure in an individual throughout adult life. Therefore if the measurement procedures did not change between the surveys, there should be no essential changes in height within a birth cohort. Large changes in mean height within a birth cohort indicate a probable change in the subpopulation which the respondents represent.

We investigated the stability of height by calculating the difference in mean height by 10-year birth cohorts between the surveys. 10-year birth cohorts were defined by the years of birth corresponding closest to the 10-year age groups 25-34, 35-44 and 45-54, in the middle of the initial survey in each RUA. Table 6 gives the differences in mean height between the three surveys within these birth cohorts by RUA. DEN-GLO was excluded from this analysis because it surveyed men and women of ages 30, 40, 50 and 60 years and so did not examine the same birth cohorts in the three surveys (2). Also, AUS-PERb and GER-BREb which did not carry out the initial survey and RUAs which only did the initial survey were excluded from this analysis.

To identify the RUAs where there is possibly a bias in the cohort trend, either through a measurement bias or a change in the population which the sample represents, a Cohort Trend Score (CTS) was defined. The score is based on the estimated changes and their standard errors for men and women in the common age groups 35-44 and 45-54, in three steps:

  1. The average change (A) was calculated for each sex/birth cohort. These are shown at the end of Table 6. As height is known to decrease slightly with age, this average change was used as the reference value around which the random variation was expected to occur.
  2. Upper and lower limits were set for a change as A ± 2.5 SE, where SE is the standard error of the estimated change. If the changes are independent and each is normally distributed with mean A and variance SE², then the probability that all four changes (two birth cohorts and two sexes) are within the limits is about 95%. The observed changes which are outside the limits have been marked with an asterisk (*) in Table 6.
  3. The Cohort Trend Score (CTS) is defined as:
     CTS = 2 if all four changes are within limits;
           1 if one of the four changes is out of limits;
           0 if at least two of the four changes are out of limits.

If the score is 2, there is no evidence of bias between the surveys. If the score is 1, there may be a bias, at least concerning the representativeness of the sample in some sex/birth cohort. A score 0 is a sign of concern about a more general bias.
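The three steps can be sketched in Python as follows. The function name, argument names and example figures are hypothetical; the sketch treats a change as out of limits when it differs from the reference value A by more than 2.5 SE, and checks the 95% figure under the assumption that the four changes are independent.

```python
from statistics import NormalDist


def cohort_trend_score(changes, standard_errors, reference_changes, k=2.5):
    """Score 2, 1 or 0 from the four sex/birth-cohort changes in mean height.

    changes:           observed changes in mean height (cm)
    standard_errors:   standard errors of those changes (cm)
    reference_changes: average change A for each sex/birth cohort (cm)
    """
    out_of_limits = sum(
        abs(change - reference) > k * se
        for change, se, reference in zip(changes, standard_errors, reference_changes)
    )
    if out_of_limits == 0:
        return 2
    if out_of_limits == 1:
        return 1
    return 0


# Probability that one normal change lies within A +/- 2.5 SE,
# and that four independent changes all do.
p_one = NormalDist().cdf(2.5) - NormalDist().cdf(-2.5)
p_all_four = p_one ** 4

# Hypothetical RUA: only the last change is far from its reference value.
score = cohort_trend_score(
    changes=[-0.3, -0.5, -0.2, -1.9],
    standard_errors=[0.3, 0.3, 0.3, 0.3],
    reference_changes=[-0.3, -0.4, -0.2, -0.4],
)
print(round(p_all_four, 3), score)
```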

The score was 0 in three RUAs (BEL-CHA, HUN-BUD and ITA-BRI) between the initial and middle survey, in two RUAs (ITA-FRI and RUS-NOI) between the middle and final survey, and in three RUAs (BEL-CHA, RUS-NOI and UNK-BEL) between the initial and final survey. In addition, in several RUAs there were isolated significant changes in the youngest birth cohort aged 25-34 in the initial survey. In CHN-BEI (Mid-Fin), FIN-TUL (Ini-Mid), GER-AUU (Ini-Mid) and SWI-VAF (Mid-Fin) there was a significant increase and in GER-AUU (Mid-Fin), SWE-GOT (Ini-Fin) and SWI-VAF (Ini-Mid) there was a significant decrease in mean height in this cohort. The youngest birth cohort was, however, not taken into account for the cohort trend score.

In HUN-BUD the mean height decreased in all birth cohorts between the initial and middle survey. These changes can be explained by the differences in age distribution between surveys within the cohorts, as discussed in Section 5.2.2. The MCC suggests that in SWE-GOT the decrease may be explained by increasing immigration. The reason in the other RUAs with significant changes is unknown. The MCCs where the score was 0 or which had significant changes in more than one birth cohort should check whether the observed changes are due to measurement bias or the representativeness of the respondents.

5.4 Distributions of the terminal digits

5.4.1 Weight

Table 7 gives the distributions of the terminal digits (i.e. the decimals of kg) for weight in the three surveys by RUA. This table was already mentioned in Section 3, where it was used to validate the questionnaire information from the MCCs concerning the accuracy of recording the weight measurements. Here the table is used to detect possible preference for particular terminal digits, regardless of the intended accuracy of recording the measurement.

An immediate observation from Table 7 is that there is a strong preference for terminal digits 0 and 5 in most RUAs. On the other hand, as could be expected, the distribution of the terminal digits is relatively uniform in the RUAs which used a digital scale (AUS-PERab (Fin), FRA-LIL, FRA-STR, FRA-TOU, RUS-MOC (Fin), RUS-MOIa (Fin), SWE-GOT (Mid&Fin), UNK-BEL and UNK-GLA), the exceptions being RUS-MOC, RUS-MOIa and UNK-BEL which have clear zero preference in the final survey, and UNK-GLA, which reported only full kilograms in the initial survey.

Because some RUAs seemed to measure weight to full kilograms, it was decided to look also at the distribution of the second last digits (i.e. the terminal digits for the full kilograms). They can be seen in Table 8. The proportion of zeros as the last digit of the full kilos is increased in many of the RUAs which measured weight to the full kilogram, but not outstandingly high in the other RUAs. The extremes in the initial survey were MLT-MLT (24.1%), ISR-TEL (21.9%), HUN-BUD (16.5%) and GER-RDMc (14.8%). In the middle survey the only RUA with high zero preference was GER-HAC (14.1%). In the final survey no RUA showed extremely high proportions of zero last digits of the full kilograms.
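Distributions of the kind shown in Tables 7 and 8 can be computed as in the following Python sketch. It assumes that weight is stored as an integer in units of 100 g, so that the last digit is the decimal of the kilogram and the second last digit is the last digit of the full kilograms; the function name and the sample values are hypothetical.

```python
from collections import Counter


def digit_distribution(values, position):
    """Percentage distribution of one decimal digit over a list of integers.

    position 0 is the terminal digit, position 1 the second last digit.
    """
    counts = Counter((value // 10 ** position) % 10 for value in values)
    n = len(values)
    return {digit: 100.0 * counts.get(digit, 0) / n for digit in range(10)}


# Hypothetical weights in units of 100 g (e.g. 702 means 70.2 kg).
weights = [702, 655, 800, 810, 735, 690, 900, 775]

terminal = digit_distribution(weights, 0)   # decimals of the kilogram
full_kg = digit_distribution(weights, 1)    # last digit of the full kilograms

print(terminal[0], terminal[5], full_kg[0])
```

In this made-up sample half of the readings end in zero, which is the kind of digit preference the tables are meant to reveal.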

Tables 7 and 8 show that many RUAs improved their weight measurement remarkably from the initial to the middle and final survey, but also that there was room for improvement in many others.

5.4.2 Height

Table 9 gives the distributions of the terminal digits for height in the three surveys by RUA. Overall, these distributions are much more uniform than those for weight (Table 7). In the initial survey, the extreme proportions of full 10 centimetres were found in ISR-TEL (24.0%), ITA-BRI (18.1%), LTU-KAU (17.0%), FRA-TOU (16.8%), RUS-MOIa (16.9%), RUS-MOC (15.5%), HUN-BUD (15.5%) and ROM-BUC (14.3%). In the middle survey, the extremes were RUS-MOIa (16.7%), ITA-BRI (16.3%), RUS-MOC (14.7%) and LTU-KAU (14.5%). In the final survey, there was only one RUA (ITA-BRI 16.4%) with more than 14% of zero values in the last digit of height.

6. Summary and recommendations

6.1 Quality scores

The proportion of terminal zeros was used as a summary indicator of the quality of the measurements. The definition of a summary score for weight is complicated by the fact that different MCCs intentionally used different accuracies of recording the measurement. If weight was recorded to 100 g the expected proportion of terminal zeros is 10%, whereas it is 20% if weight was recorded to 200 g. As there is no major practical difference between recording to the 100 g or 200 g, the cut point of good quality was taken at 30%, which indicates a concern over both accuracies. On the other hand, if weight is measured to full kilograms, there are such big concerns already involved in the rounding of the values that the indicator should show bad quality regardless of the intended accuracy.

If the measurements are not done properly, there can also be preference of full kilograms. If the terminal digits are uniformly distributed, about 10% of the readings have zero as the second last digit. Taking into account the standard error of this expected value (about 0.9% for sample size of 1000), the proportion of the terminal zeroes would by chance be very rarely more than 12%, and extremely rarely more than 13%. Therefore, the second last digit was also taken into account when defining a summary score.
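The standard error quoted above follows from the binomial formula for a proportion. The Python sketch below shows the arithmetic and expresses the 12% and 13% levels as distances from the expected 10% in standard errors; the names are illustrative.

```python
from math import sqrt


def se_of_proportion_pct(p: float, n: int) -> float:
    """Standard error, in percentage points, of a proportion p estimated from n readings."""
    return 100.0 * sqrt(p * (1.0 - p) / n)


se = se_of_proportion_pct(0.10, 1000)   # a little under one percentage point

# Distance of the 12% and 13% levels from the expected 10%, in standard errors.
z_12 = (12.0 - 10.0) / se
z_13 = (13.0 - 10.0) / se

print(round(se, 2), round(z_12, 1), round(z_13, 1))
```

With 1000 readings, 12% lies about two standard errors and 13% about three standard errors above the expected value, which is why the latter would occur by chance only extremely rarely.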

WEIGHT SCORE was defined as:

WEIGHT SCORE = 2 (no indication of a problem) if the proportion of terminal zeros is <= 30%
                 AND there are <= 13% zeros in the second last digit;
               1 (some concern) if not 2 or 0;
               0 (major concern) if the proportion of terminal zeros is > 60%
                 OR there are > 14% zeros in the second last digit.

The scores for each RUA in the three surveys are given in Table 10. Note that for Stanford the score is not meaningful because they used a different unit for the measurement.

For weight, also a SCALE SCORE was defined as:

SCALE SCORE = 2 if balance scales were used (this was the recommendation);
              1 if digital scales were used;
              0 if bathroom scales were used.

It is essentially easier to define a summary score for the terminal digit preference of height than weight, because all MCCs provided data on height to the MDC using the same accuracy, and the measurement device for height is so simple that very little if any terminal digit preference would be expected in an ideal situation. It is reasonable to expect that the terminal digits are distributed uniformly, and hence that about 10% of the readings have zero as the terminal digit. If the measurement was not done carefully, the proportion of zeros as the terminal digit can be expected to be larger than 10%. The HEIGHT SCORE was defined as:

HEIGHT SCORE = 2 if the proportion of zeros is <= 13%;
               1 if not 2 or 0;
               0 if the proportion of zeros is > 14%.

The scores for each RUA in the three surveys are given in Table 10. Note that for Stanford the score is not meaningful because it used a different unit for the measurement.

A SUMMARY SCORE (Table 10) for weight and height measurements was derived using the sum of WEIGHT SCORE, HEIGHT SCORE and SCALE SCORE:

SUMMARY SCORE = 2 if the sum is 5 or 6;
                1 if the sum is 3 or 4;
                0 if the sum is 0, 1 or 2 or the quality of some of the measurements is very bad even though the sum is more than two.

The summary score is based on the assumption that distributions of the terminal digits are good overall indicators of the training and performance of weight and height measurers. However, the summary score does not reflect problems in the calibration of the measurement devices. The latter seems to be difficult to check from the data and therefore thorough quality control of the devices during the survey is of particular importance. Also, the cohort trend score was not used in the definition of the summary score, because a low cohort score is likely to reflect changes in the representativeness of the sample. The cohort trend scores will be dealt with in more detail in the non-response quality assessment report.

The exception where the summary score is zero even though the sum of the three scores is more than two was applied to one RUA only: MLT-MLT (initial survey). In MLT-MLT not only the accuracy of recording the weight was a full kilogram, but 24% of the subjects also had their weights recorded in full tens of kilograms.
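The score definitions in this section can be written out as the following Python sketch. The function names are illustrative. The `very_bad` argument stands for the judgement-based exception described above, for which the report gives no numerical rule, so it is left as a manual flag.

```python
def weight_score(pct_terminal_zeros: float, pct_zeros_second_last: float) -> int:
    """WEIGHT SCORE from the percentages of zeros in the last two digits of weight."""
    if pct_terminal_zeros > 60 or pct_zeros_second_last > 14:
        return 0  # major concern
    if pct_terminal_zeros <= 30 and pct_zeros_second_last <= 13:
        return 2  # no indication of a problem
    return 1      # some concern


# SCALE SCORE by type of scale.
SCALE_SCORES = {"balance": 2, "digital": 1, "bathroom": 0}


def height_score(pct_terminal_zeros: float) -> int:
    """HEIGHT SCORE from the percentage of zeros in the last digit of height."""
    if pct_terminal_zeros > 14:
        return 0
    if pct_terminal_zeros <= 13:
        return 2
    return 1


def summary_score(weight: int, height: int, scale: int, very_bad: bool = False) -> int:
    """SUMMARY SCORE from the three component scores.

    very_bad is a manual override for the exception applied to one RUA.
    """
    total = weight + height + scale
    if very_bad or total <= 2:
        return 0
    if total <= 4:
        return 1
    return 2


# Hypothetical RUA: balance scales, 25% terminal zeros in weight,
# 11% zeros in the full kilograms, 13.5% terminal zeros in height.
w = weight_score(25.0, 11.0)
h = height_score(13.5)
s = SCALE_SCORES["balance"]
print(w, h, s, summary_score(w, h, s))
```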

6.2 Cross-sectional analyses

It is recommended that BMI (or other measures of relative weight) is not used in data analysis for RUAs where the summary score is zero. The summary score was zero in three RUAs in the initial survey, in one RUA in the middle survey and in one RUA in the final survey (Table 10). In addition, publications should indicate which RUAs used bathroom scales to measure weight.

6.3 Trend analyses

It is recommended that RUAs where the summary score is zero in one or more surveys are excluded from the trend analyses of BMI and other measures of relative weight.

7. Discussion

Relatively few problems were observed in the data on weight and height measurements in MONICA. In the initial survey, however, there were terminal digit preferences for full tens of kilograms or full 10 centimetres in several RUAs. Also, the large number of discrepancies between the reported and actual accuracies of weight measurement reflects possible inadequacies in the training of the measurers. The quality of the data improved in the middle and final survey.

The quality assessment based on the data is probably good in identifying the RUAs where the measurers were not trained properly. However, it is less good in detecting errors in the calibration of the scales and height rules. In many RUAs the calibration was not checked regularly. In such RUAs an undetected bias due to the calibration error is possible. For height measurement serious calibration problems could be reflected in large fluctuations in mean height within the birth cohorts.

The accuracy of height measurement is especially important for calculating BMI, because the relative bias induced by height in BMI is approximately twice the relative bias in height. The analysis of height within birth cohorts gives support to the view that relatively few problems occurred in the measurement of height.

8. Comments on individual RUAs

The following list includes only the RUAs with specific findings or exceptional background information relevant for the use of data.































9. References to publications

  1. Tunstall-Pedoe H for the WHO MONICA Project. The World Health Organization MONICA Project (Monitoring Trends and Determinants in Cardiovascular Disease): A major international collaboration. J Clin Epidemiol 1988;41:105-14.
  2. Kuulasmaa K, Tolonen H, Ferrario M, Ruokokoski E for the WHO MONICA Project. Age, date of examination and survey periods in the MONICA surveys. (May 1998). Available from: URL:http://www.thl.fi/publications/monica/age/ageqa.htm, URN:NBN:fi-fe19991075
  3. WHO MONICA Project. MONICA Manual. Part III: Population Survey. Section 1: Population survey data component. (December 1997). Available from: URL:http://www.thl.fi/publications/monica/manual/part3/iii-1.htm, URN:NBN:fi-fe19981151

10. References to internal MONICA documents

  4. WHO MONICA Project. Manual of Operations. WHO/MNC/82.2, DRAFT MOO, November 1983.