WHO MONICA Project e-publications, No. 5

Participation Rates, Quality of Sampling Frames and Sampling Fractions in the MONICA Surveys

September 1998

Hermann K. Wolf1, Kari Kuulasmaa2, Hanna Tolonen2 and Esa Ruokokoski2 for the WHO MONICA Project3

1 Department of Physiology and Biophysics, Dalhousie University, Halifax, Canada
2 MONICA Data Centre, National Public Health Institute, Helsinki, Finland
3 Annex: Sites and key personnel of the WHO MONICA Project


© Copyright World Health Organization (WHO) and the WHO MONICA Project investigators 1999. All rights reserved.

This document includes the main findings of an unpublished report:


Acknowledgements

The MONICA Centres are funded predominantly by regional and national governments, research councils, and research charities. Coordination is the responsibility of the World Health Organization (WHO), assisted by local fund raising for congresses and workshops. WHO also supports the MONICA Data Centre (MDC) in Helsinki. Not covered by this general description is the ongoing generous support of the MDC by the National Public Health Institute of Finland, and a contribution to WHO from the National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA for support of the MDC. The completion of the MONICA Project is generously assisted through a Concerted Action Grant from the European Community. Likewise appreciated are grants from ASTRA Hässle AB, Sweden, Hoechst AG, Germany, Hoffmann-La Roche AG, Switzerland, the Institut de Recherches Internationales Servier (IRIS), France, and Merck & Co. Inc., New Jersey, USA, to support data analysis and preparation of publications.



1. Introduction

The purpose of the MONICA risk factor surveys (1, 2) is to estimate the cardiovascular risk factor distribution in the study populations. When the main hypotheses are tested, the risk factor trends will be related to the trends of cardiovascular disease rates in the same populations. Therefore, it is necessary that the risk factor data and the event registration data refer to the same population. Apart from the quality of the measurements, the risk factor data may be biased for the following reasons:

According to the information provided by the MONICA Collaborating Centres (MCC), the population survey data and event registration data were collected in the same geographical populations in all Reporting Units. The exceptions are those MCCs where one of the data components was not collected at all, and Auckland where the area of the initial survey population covered only 80% of the area of the event registration population.

The sampling scheme has not been taken into account in the analyses so far, and therefore it may bias the results in some populations. However, preliminary investigation of the sampling schemes rules out major influences on the results of data analysis. An exception may arise when the population used for analysis is a combination of several sub-populations but the sampling was not planned for that combination. In such a case the sub-populations may need to be weighted. This situation is investigated in Section 12 of the current report.

The age range surveyed varies a little between the populations, depending on how the age was defined for the sample selection. This has been investigated in a separate report (3). The remaining aspects, which are related to the quality of the sampling frame and survey non-response, are investigated in the present report.

Since the quality of the sampling frame and non-response are interrelated, they are considered together: If the sampling frame is not accurate, it may include a substantial number of subjects who actually do not belong to the population. Often it cannot be confirmed that the subjects are not members of the population, and then they will show up as non-respondents. If it can be confirmed that such subjects do not belong to the population, they should be defined as ineligible and they should be ignored when calculating participation rates. The sampling fractions are considered together with the sampling frame and non-response because all these investigations are based on the same data sources.

The overall quality of data for the non-respondents is poor. A number of possible explanations exist for this quality problem: i) the requirements for non-respondent data collection were not properly defined and described in the MONICA Manual until 1985; ii) there was a lack of statistical and/or epidemiological input in the protocol design of many MCCs; iii) in many cases the non-respondents were not prepared to provide any information, and sometimes they were outright hostile.

The present quality report aims to assess objectively the results from each individual survey and determine the trend between the surveys.

2. Definitions of terms

Target population:
The target population comprises all the individuals included in a study. In MONICA, it is defined as the Reporting Units. It should be the population from which event data are gathered and to which the survey data should apply.
Sampling frame:
The sampling frame is the list of sampling units from which the sample is selected. In a single-stage sampling scheme, the sampling frame ideally represents the target population exactly. There are, however, reasons why actual sampling frames often deviate from the ideal. For example, there are always people dying or moving in and out of the target population, and therefore the sampling frame is never fully up-to-date.
In multi-stage sampling each sampling stage has a separate sampling frame. For example, the primary sampling frame may list the towns and villages of the population area, and the second-stage frame lists the people of the towns and villages. Usually there is no difficulty in getting a complete primary-stage frame. However, the frames that list the people have problems similar to those of the sampling frames in single-stage sampling.
Foreign elements:
These are elements in the sampling frame that are not valid members of the target population. In our context, foreign elements are, for example, persons still listed in the sampling frame but no longer living in the area under study (ineligibles).
Ineligibles:
Members of the sampling frame that are excluded from the survey by definition (moved away or died between time of sampling and survey). Ideally, the ineligibles are exactly the same as the foreign elements of the sampling frame. In practice, however, membership in the target population cannot always be established for everybody selected to the sample. According to the technical definition given in the MONICA Manual (Part III, Section 1, Subsection 3 of Reference 2), subjects selected to the sample are eligible by default. Only those for whom the ineligibility criteria can be confirmed are ineligible, i.e. if in doubt, the subject is eligible.
Non-respondents:
All members of the sample set, except the ineligibles, for which no survey data have been collected.

3. Material and methods

3.1. Populations

The report considers the Reporting Unit Aggregates (RUA) which are potential candidates for units of analysis of the MONICA data. The RUAs, their abbreviations and their Reporting Units (RU) are listed in Table 1.1. Some of the RUAs have several versions distinguished by suffix a and b. Different combinations of RUs may be used for cross-sectional and trend analysis. This is the case if not all RUs of the RUA were included in every survey. Therefore, in AUS-PER, GER-BER, GER-BRE, GER-EGE, GER-KMS, GER-RDM, RUS-MOI and RUS-NOC there is an overlap of reporting units included in the RUAs in some surveys. For UNK-GLA, which carried out four surveys, the first (Ini), third (Mid) and fourth (Fin) survey are considered. Altogether 54 RUAs are considered for the initial survey, 43 for the middle and 41 for the final survey.

3.2. Age and sex

All subjects for whom data are available were included in most analyses of the current document. For some analyses the subjects were selected according to their age and sex. In such cases the age and sex are specified in the respective tables. Age-standardization was not used for the analyses of this report. When data from the survey core data (Form 04) or survey non-respondent data (Form 08) have been used, age has been defined as age in full years on the date of examination (see DEF1 in Reference 3).

3.3. Sources of information

The data sources for this report are:

4. Quality of sampling frame

We will consider here indicators of the quality of the sampling frames used by the MCCs. For RUAs with multi-stage sampling, the frame used in the primary-stage sampling is usually simple. Therefore, we will focus attention on the critical sampling frames that list individual persons or households. The data for and results of the assessment are presented in Table 2.

We used the following indicators of the quality of sampling frame:

Source of the sampling frame:
Population registers are usually relatively good sampling frames. In many RUAs there exist no population registers. Therefore, population lists compiled for other purposes must be used, such as public health service registers or electoral rolls. There are usually different reasons why such lists may contain too few or too many names. The sampling frames used in the RUAs are listed in Table 1.2.
Age of sampling frame:
We define the age of the sampling frame as the difference between the last update of the sampling frame and the time when it was used to draw the sample. This information is based in part on replies to question 10 in the form Sample Selection Descriptions (MONICA Memo 50) and in part on communication with individual MCCs. An "old" frame is likely to be inaccurate because of migration and deaths in the target population after the frame was updated. Theoretically, the age of a sampling frame has two components, one for new elements to be entered, and one for foreign elements to be deleted. If sampling frames are created de novo by counting, such as at census, there is no difference between these two age components. However, with sampling frames that represent lists that are updated at regular intervals, entries may be made more promptly than deletions (or vice versa) and as a result the two age components can be quite different. In one case, it will result in incomplete coverage of the target population, in the other in excessive foreign elements. We ignore any differences in the two age components of sampling frames for the tabulations of this report. However, we will refer to them in comments on individual RUAs, as needed.
Proportion of ineligibles:
The proportion of persons who were ineligible in the original sample indicates how commonly the sampling frame includes identifiable foreign elements. Theoretically we should also be interested in missing elements, i.e. the proportion of the members of the target population who are not included in the sampling frame, but such data are not available in MONICA. The proportion of ineligibles is used as a general indicator of the accuracy of the sampling frame.
Proportion not possible to contact:
The survey non-respondent data, which the MCCs should have provided for every non-respondent, has an item labelled "reason of non-response". One of the options, "not possible to contact", refers to those with whom no contact could be made, but no information was available to indicate that they are ineligible for the sample. As some of the subjects in this category may actually be foreign elements of the sampling frame, their proportion in the original sample may also reflect the inaccuracy of the sampling frame. The proportions given in Table 2 are for those RUAs where the reason of non-response is available for at least half of the non-respondents. (For more information about data item "reason for non-response", see Table 10). For the proportion in Table 2 the numerator was calculated from the individual non-respondent data and the denominator from the aggregate data provided by the MCCs.
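The calculation of the proportion "not possible to contact" can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used at the MDC: the function name and the input layout are hypothetical, reason code 1 is taken to mean "not possible to contact" (as in the definition of X in Section 7), and the "at least half" availability rule is applied here to the non-respondent records supplied, which may differ from how the report applied it.

```python
from typing import Optional, Sequence


def proportion_not_contactable(
    reasons: Sequence[Optional[int]],
    original_sample_size: int,
) -> Optional[float]:
    """Proportion of the original sample that was not possible to contact.

    reasons: one reason-of-non-response code per individual non-respondent
             record (None when the item is missing). Hypothetical layout.
    original_sample_size: size of the original sample from the aggregate data.
    Returns None when the proportion should not be tabulated.
    """
    if not reasons or original_sample_size <= 0:
        return None
    known = [r for r in reasons if r is not None]
    # Assumption: tabulate only if the reason is available for at least half
    # of the non-respondent records.
    if len(known) < len(reasons) / 2:
        return None
    # Assumption: code 1 means "not possible to contact".
    not_contacted = sum(1 for r in known if r == 1)
    # Numerator from individual data, denominator from aggregate data.
    return not_contacted / original_sample_size
```

Because the numerator and denominator come from different data sources, any inconsistency between the individual and aggregate data (see Section 6) carries over into this proportion.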

Individually, none of these indicators, except the age of sampling frame, is very specific for quality. However, based on all of them combined, and on additional information available from the MCCs, a Sampling Frame Score (SFS) was assigned to the RUAs, to indicate our current understanding of the quality of the sampling frame. The score has the following values (the respective mean proportion is calculated over all the surveys of a RUA):

SFS = 2 if there is no major concern about the sampling frame:
SFS not equal to 0
AND no change in sampling frame between surveys
AND mean proportion "not eligible" < 5%
AND mean proportion "not possible to contact" < 5%.
SFS = 1 if SFS is neither 0 nor 2.
SFS = 0 if the sampling frame has major problems:
All proportions "not eligible" are missing
OR all proportions "not possible to contact" are missing
OR maximal value of proportion "not eligible" > 20%
OR maximal value of proportion "not possible to contact" > 20%.
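The numerical part of the scoring rule can be written down as a small function. The sketch below is illustrative only: the function and argument names are hypothetical, proportions are assumed to be given in percent with None for a missing survey value, and the mean is assumed to be taken over the surveys for which a value is available. The report also states that additional information from the MCCs was used when assigning the score, which this sketch cannot reproduce.

```python
from typing import Optional, Sequence


def sampling_frame_score(
    not_eligible: Sequence[Optional[float]],
    not_contactable: Sequence[Optional[float]],
    frame_changed: bool,
) -> int:
    """Return the Sampling Frame Score (0, 1 or 2) for one RUA.

    not_eligible / not_contactable: one percentage per survey, None if missing.
    frame_changed: True if the sampling frame changed between surveys.
    """
    ne = [p for p in not_eligible if p is not None]
    nc = [p for p in not_contactable if p is not None]

    # Score 0: all values of either proportion missing, or any value above 20%.
    if not ne or not nc or max(ne) > 20 or max(nc) > 20:
        return 0

    # Score 2: unchanged frame and both mean proportions below 5%.
    # Assumption: the mean is taken over the available (non-missing) surveys.
    mean_ne = sum(ne) / len(ne)
    mean_nc = sum(nc) / len(nc)
    if not frame_changed and mean_ne < 5 and mean_nc < 5:
        return 2

    # Score 1: everything else.
    return 1
```

Checking the conditions for score 0 first mirrors the definition, in which a score of 2 presupposes that the score is not 0.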

The score was "0" in 12 RUAs (CAN-HAL, GER-COT, GER-RDM, GER-RHN, HUN-PEC, ISR-TEL, ITA-LAT, LTU-KAU, MLT-MLT, RUS-MOIb, RUS-NOCb, UNK-GLA).

No RUA changed the sampling frame between the initial and middle survey, but two RUAs adopted new sampling frames for the final survey with some implications for frame compatibility (FRA-LIL, FRA-STR).

It is assumed that all MCCs used the best available sampling frame at their disposal. Therefore, a low sampling frame score does not usually indicate bad performance by the MCC, but reflects local constraints. However, it still means that the sample may be biased because of the poor quality of the sampling frame. On the other hand, the score is not only a reflection of the quality of the sampling frame but also depends to some extent on the efforts of the MCC. Therefore, some of the MCCs with a score of zero probably used a good sampling frame but failed to put enough effort into pursuing persons who were hard to locate (e.g. GER-COT).

5. Definition of eligibility and non-respondents

In the MONICA Project the eligibility of a subject selected to the sample was defined at the time of sample selection. The MONICA Manual states that "The individuals selected in the original sample who died or moved out of the reporting unit area before the survey examination are called non-eligibles". In addition, some RUAs have occasional subjects who are ineligible because of a clerical error in the sampling frame. There have been subjects whose gender was incorrect in the sampling frame, or whose age in the sampling frame was inaccurate and out of the range for the survey.

For non-respondents a technical definition was given in order to determine which data should be reported to the MDC for these subjects. The definition was:

"A non-respondent is a person selected and eligible to the original sample who could not be found or contacted or a person who did not provide questionnaire data ...."

In most cases this technical definition gives a sensible estimate of the availability of data. There are two situations where it is misleading:

  1. If there is a considerable number of subjects who provided the questionnaire data but never attended the clinical examination, then this definition does not provide useful information about the response rate for the cholesterol and blood pressure measurements. This was the case in six RUAs (see Section 8 and Table 7);
  2. When the sampling frame includes a large number of foreign elements and there is no way to identify such individuals, the response rate will be under-estimated, because subjects who are ineligible become classified as non-respondents. Such a bias may be significant in CAN-HAL (Ini, Fin), FRA-TOU, LTU-KAU (Fin), UNK-BEL, UNK-GLA (Ini).

The MCCs were first contacted in 1985 for the definition of eligibility, which was applied when data were submitted to the MDC (Table 1.2). At that time, no clear information was received from most MCCs. One possible explanation for this failure is that the MCCs did not define eligibility in any systematic way when collecting the data. This might indicate deficient organization during the initial survey, when more than half of the MCCs had data for fewer than 50% of non-respondents. In MONICA the definition of eligibility was introduced in the Manual in 1985, and the wording was clarified later because there was misunderstanding in some MCCs. The introduction of the definition improved the situation and, therefore, more information is available for the middle and final surveys. MCCs reported their compliance with the eligibility definition of the Manual with the submission of the aggregate data of the middle and final survey (see Appendix 1b and Appendix 1c). According to this information there was a high degree of compliance with the Manual definition. However, some of the information conflicted with other data available at the MDC and it cannot be ruled out that the improved compliance is partly due to erroneous data from the MCCs.

Among the RUAs for which the definition of eligibility is known, the following had a different definition from the one given in the Manual or provided conflicting information:

AUS-NEW:
In addition to the MONICA Manual's requirements, the following were ruled as ineligibles for the initial and middle survey: a) mentally handicapped, b) too old for study when interviewed. According to the MONICA definition, the mentally handicapped persons ought to be non-respondents due to medical reasons. The final survey apparently adhered to the Manual's definition, but this information is suspect since it was submitted together with a statement that the initial and middle survey also followed the Manual's definition of eligibility.
BEL-LUX:
The centre used an electoral list as sampling frame. Presumably, this is the reason why non-citizens were treated as ineligible. Also, persons found to be living elsewhere than the address of the electoral list were designated as ineligible. It is not clear whether the same exclusion criteria were applied to the event and demographic data.
CAN-HAL:
Persons in correctional institutions and persons unable to understand English and without an interpreter were excluded from the survey. (The language problem happened twice in the initial and 10 times in the final survey; no information is provided about prevalence of incarceration). No language exclusion was applied to the event and demographic data. However, prisoners usually receive their medical care within the prison system and are therefore automatically excluded from the event registration.
FRA-STR:
For the initial survey "collective households" (e.g. old-age homes, monasteries or convents) are defined as ineligible in addition to the persons specified in the Manual. According to the MCC, "collective households" represent a negligible fraction of the total sample. The MCC changed the sampling frame for the final survey. In a second reply to the Sample Selection Description the MCC stated that the Manual's definition of ineligibility has been broadened by including persons without French citizenship and those not registered on the electoral roll. These additional restrictions do not apply to the event and demographic data. From the results of the initial survey it was estimated that there are about 4.5% non-French citizens in the population.
FRA-TOU:
Persons in prison and foreigners were included in the category of ineligibles. In the Sample Selection Description the MCC describes a scheme of augmenting the sampling frame by lists of foreigners obtained from various consulates. It is uncertain whether this scheme was ever implemented and whether it was carried through to the middle and final survey. It is also not clear how big a problem the omission of foreigners is, i.e. how they impact on event and demographic data.
GER-AUR and GER-AUU:
The ineligibles included persons without German citizenship (estimated to be about 2% in the age group > 50). These exclusion criteria do not apply to the event and demographic data.
GER-BRE:
The ineligibles included persons without German citizenship. Such an exclusion criterion was not applied to the event and demographic data. 
HUN-PEC:
No sample size information is available for the initial survey. For the middle survey the MCC was not able to examine the reasons for non-attendance. As a result, all ineligibles were included in the group of non-respondents.
ISR-TEL:
In the second stage of the sampling, apartments were selected. The apartments were visited and the people living in them formed the sampling frame of the final stage. The apartments whose tenants were absent at the time of the visit were excluded from the sampling frame.
ITA-LAT:
Eligibility was restricted to Italian citizens. Persons for whom there was a problem with the address were also ineligible.
LTU-KAU:
The initial and middle survey defined officers and their wives, as well as persons who had changed addresses, as ineligible. Under the Manual's definition, persons who changed address qualify as ineligible only if it is known that they moved permanently outside the target area. It is not clear whether the same eligibility criteria were also used for the final survey and the event and demographic data.
NEZ-AUC:
The initial survey declared Maoris and Polynesians as ineligible. It is not known whether these individuals were also excluded from the final survey. As reported by the MCC, the same exclusion rules did not apply to the event and demographic data. However, local investigations have shown that this is not a major problem.
A related problem concerns the difference in target population between the various components of the MONICA Project (see comments in Section 7).
POL-TAR:
The "not eligible" included some subjects who were not possible to contact, but their ineligibility could not be confirmed. According to the MONICA definition, such subjects are non-respondents. It is not clear whether the same definition was used for all three surveys.
RUS-NOC and RUS-NOI:
For all three surveys, imprisoned persons (estimated to be < 1% of the population) and those away from home for more than one year for occupational reasons were included in the group of ineligibles. These individuals were not excluded from demographic data, but prisoners are excluded from event registration.
SWI-TIC and SWI-VAF:
Severely ill or handicapped persons unable to attend the examination room were classified as ineligibles. According to the MONICA definition, such persons are non-respondents due to medical reasons.
UNK-GLA:
Before the sample was selected, the general practitioners (GP) excluded from their list those whom they considered not suitable for the screening. Otherwise the definition is as specified in the MONICA Manual. The category of people excluded from the GPs' lists could be non-respondents (unable to attend for medical reasons). However, in this case they were not sampled because they were not included in the sampling frame. The calculation of the participation rate, therefore, may be biased by the absence of these individuals.
USA-STA:
Reasons for ineligibility included severe illness, language problems and "previously surveyed". The two latter reasons involved about 2% each of the ineligibles during the middle and final survey. This numerical information is not known for the initial survey. Also, it is not known how the "language ineligibility" has been considered in the event registration and demographic and mortality data.

6. Consistency of the participation rate information from different sources

One aspect that decreases the accuracy of the participation rate in many RUAs is the uncertainty about the definitions used. Another is the inconsistency of the participation rate information from different sources of data. We made two comparisons between the following data sources: the serial number inventory data (individual data), the sample size data reported by the MCC (aggregate data), and survey respondent and non-respondent data received by MDC (individual data). The results are summarised in Table 3 and Table 4.

6.1. Comparison between serial number inventory data, survey individual data, and aggregate data

The MCCs had to provide three survey data sets at the individual level:

  1. Serial number inventory data (Form 05) for everybody selected for the original sample. In the data each subject was assigned a status:
  2. Survey core data (Form 04) for every respondent, i.e. for those who had status 1 in the serial number inventory data.
  3. Non-respondent data (Form 08) for every non-respondent.

No individual data for the ineligibles had to be submitted other than the serial number inventory data. Therefore, we do not know the age and sex distribution of the ineligibles.

The MDC had not received the serial number inventory data for two RUAs for the initial survey (ITA-LAT, MLT-MLT) and one RUA for the final survey (RUS-NOCa).

Non-respondent data were missing for five RUAs of the initial survey (HUN-PEC, MLT-MLT, NEZ-AUC, RUS-MOC and RUS-MOI), one RUA of the middle survey (GER-COT), and three RUAs of the final survey (RUS-MOC, RUS-MOI, RUS-NOCa).

The following pairs of data sets should have equal numbers of records:

The number of ineligibles is available at individual level only in the serial number inventory data. Therefore, their consistency is compared with the aggregate number of ineligibles reported by the MCCs.

The numbers of the three pairs are given in Table 3 for all three surveys. In addition, the table lists a summary quality score for each individual survey and a quality trend for all surveys combined.

The Individual data agreement score (IDA) is defined as:

IDA = 2 if there are no discrepancies within pairs of sources of information.
IDA = 1 if there are small discrepancies within pairs; all three pairs differ by less than ±20.
IDA = 0 if there are major discrepancies within pairs or some of the data are unavailable; any pair differs by more than ±20.
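The IDA rule can be expressed as a short function over the record counts of the paired data sets. This is a sketch under assumptions: the function name and the representation of a pair as two counts (None when a data set was not received) are hypothetical, and a difference of exactly 20 records, which the definition leaves open, is treated here as a major discrepancy.

```python
from typing import Optional, Sequence, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def individual_data_agreement(pairs: Sequence[Pair]) -> int:
    """Return the IDA score (0, 1 or 2) for one survey of one RUA.

    pairs: record counts of the data sets that should agree, one tuple
           per pair; None marks a data set that is unavailable.
    """
    diffs = []
    for count_a, count_b in pairs:
        # Unavailable data always give score 0.
        if count_a is None or count_b is None:
            return 0
        diffs.append(abs(count_a - count_b))

    if all(d == 0 for d in diffs):
        return 2  # no discrepancies within any pair
    if all(d < 20 for d in diffs):
        return 1  # only small discrepancies
    # Assumption: a difference of exactly 20 counts as a major discrepancy.
    return 0
```

For example, three pairs with differences of 0, 5 and 0 records would give a score of 1, while a single missing data set gives 0 regardless of the other pairs.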

A Quality trend score (QTS) was assigned according to the following rules:

QTS = 2 if improved or maintained high quality:
IDA = 2 for final survey
AND
there has been no decrease in IDA between successive surveys.
QTS = 1 if no quality fluctuations or no clear trend:
QTS is neither 0 nor 2.
QTS = 0 if deterioration or no improvement in poor quality:
IDA = 0 in final survey
OR
there has been a decrease between successive IDAs, but no increase.
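The trend rules can be sketched as a function of the per-survey scores in chronological order. The function name is hypothetical, and the sketch assumes that surveys without a score are simply left out of the sequence, which the report does not state.

```python
from typing import Sequence


def quality_trend_score(scores: Sequence[int]) -> int:
    """Return the Quality trend score (0, 1 or 2) for one RUA.

    scores: per-survey quality scores (e.g. IDA) in chronological order,
            with at least one entry; the last entry is the final survey.
    """
    # Change in score between each pair of successive surveys.
    steps = [later - earlier for earlier, later in zip(scores, scores[1:])]
    decreased = any(step < 0 for step in steps)
    increased = any(step > 0 for step in steps)
    final = scores[-1]

    # Score 2: final survey at the top score and never a decrease.
    if final == 2 and not decreased:
        return 2
    # Score 0: final survey at the bottom score, or decrease without increase.
    if final == 0 or (decreased and not increased):
        return 0
    # Score 1: neither of the above.
    return 1
```

The two extreme categories cannot both apply to the same sequence, because score 2 requires that no decrease occurred and that the final score is 2.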

In the initial survey, 19/54 (35%) RUAs had score "2", 12/54 (22%) had score "1" and 23/54 (43%) RUAs had score "0". In the middle survey 23/43 (53%) RUAs had score "2", 10/43 (23%) had score "1" and 10/43 (23%) had score "0". In the final survey, the distribution of the scores was 21/41 (51%) for score "2", 11/41 (27%) for score "1", and 9/41 (22%) for score "0". The result indicates that the data management for initial surveys in most MCCs was not reliable, but it improved with the next two surveys. However, it is clearly not satisfactory that in the final survey 9 RUAs still had a score of zero.

The trend score confirms that there was an improvement between surveys in 20 RUAs but 13 RUAs got even worse. The overall trend is poor in more than half of the RUAs, an unsatisfactory situation.

6.2. Comparison between individual data and aggregate data supplied by the MCCs

We compared individual data and aggregate data that were reported by the MCCs using the forms Participation rate in the initial MONICA population survey (Appendix 1a), Sample size in the 2nd MONICA survey (Appendix 1b) and Sample size in the final MONICA survey (Appendix 1c). Table 4 gives ratios where the aggregate data are in the numerator and the individual data in denominator. For "ineligible" the individual data are the serial number inventory data and the denominator consists of the data in column "ineligible" of Table 3. For "respondents" the denominator is the survey core data (Form 04) and for "non-respondents" the non-respondent data (Form 08). All ratios should be equal to 1.

The Individual and aggregate data agreement score (IAA) was defined for the agreement between the two sources of data for respondents and non-respondents. The ratio for "ineligibles" was not considered for the score, because the same data were already evaluated in the score for serial number inventory. The criteria for the IAA score are as follows:

IAA = 2 if both ratios are equal to 1.00.
IAA = 1 if both ratios are between 0.95 and 1.05 but at least one of them is different from 1.00.
IAA = 0 if at least one of the ratios is less than 0.95 or more than 1.05, or some of the data are missing.
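The ratio and the IAA criteria can be illustrated with the following sketch. The function names are hypothetical, and rounding the ratios to two decimals before comparison is an assumption suggested by the way the thresholds are written (1.00, 0.95, 1.05); the report does not say how the comparison was carried out.

```python
from typing import Optional


def agreement_ratio(aggregate: Optional[int], individual: Optional[int]) -> Optional[float]:
    """Ratio of the aggregate count (numerator) to the individual-data count."""
    if aggregate is None or individual is None or individual == 0:
        return None  # ratio cannot be formed
    return aggregate / individual


def iaa_score(respondent_ratio: Optional[float],
              non_respondent_ratio: Optional[float]) -> int:
    """Return the IAA score (0, 1 or 2) from the two ratios."""
    ratios = (respondent_ratio, non_respondent_ratio)
    # Missing data always give score 0.
    if any(r is None for r in ratios):
        return 0
    # Assumption: compare ratios as tabulated, i.e. rounded to two decimals.
    rounded = [round(r, 2) for r in ratios]
    if all(r == 1.00 for r in rounded):
        return 2
    if all(0.95 <= r <= 1.05 for r in rounded):
        return 1
    return 0
```

With these rules, respondent and non-respondent ratios of 0.98 and 1.03 would give a score of 1, whereas a single ratio of 1.33 gives 0 whatever the other ratio is.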

The ratios for respondents are between 0.98 and 1.03 for most RUAs in the initial survey, the exception being RUS-NOCb (1.33). For the middle survey, the ratios extend from 0.94 to 1.02 with the exception of GER-BRE (0.92). For the final survey the ratios lie between 0.98 and 1.02 with the exception of AUS-PER (1.05), FRA-STR (1.13), GER-BREa (0.91) and GER-BREb (0.92). This indicates that, with a few exceptions, the management of core data is reasonable. However, for non-respondents there are major shortcomings in most RUAs, especially in the initial survey, where the ratios range from 0.66 to 13.5.

A quality trend score was defined similarly to the QTS in Section 6.1. An improvement or maintenance of quality (trend score 2 or 1) was observed for 37/47 RUAs (79%), which means that for the remaining 10 RUAs (21%) the accounting for the sample units deteriorated or remained poor in later surveys.

7. Participation rate

Table 5.1 shows the participation rates for the RUAs in age group 35-64. In accordance with the MONICA definition, these rates are the proportion of eligibles that provided at least questionnaire information. Item-response rates, especially for data items collected during clinic visits, may be lower and are discussed in Section 8.

Two definitions for participation rates are employed:

Definition A.
The rate is calculated with the size of the eligible sample in the denominator and the number of respondents (according to the technical definition of the MONICA Manual, see Reference 2) in the numerator.
Definition B.
The numerator is the same as for definition A. The denominator is defined as the number of eligibles minus the number of non-respondents that were not possible to be contacted. This definition is introduced to provide for those MCCs where the sampling frame may have contained a large number of foreign elements, whose status of eligibility could not be determined. In these situations, a large difference exists between the rates according to the two definitions. The relationship between participation rate according to definition A (PRA) and definition B (PRB) is given by:

PRB = PRA/(1 - X(1-PRA)),

where X is the proportion of those not possible to contact (i.e. REASON=1) among all non-respondent records. If X is missing, it is assumed to be 0 and PRB = PRA.
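The relationship between the two definitions can be checked numerically. The sketch below uses hypothetical function names and made-up counts; it shows that the closed-form expression agrees with the direct calculation of definition B, in which the subjects not possible to contact are removed from the denominator.

```python
from typing import Optional


def participation_rate_a(respondents: int, eligibles: int) -> float:
    """Definition A: respondents divided by the eligible sample."""
    return respondents / eligibles


def participation_rate_b(pra: float, x: Optional[float] = None) -> float:
    """Definition B from PRA and X.

    x: proportion "not possible to contact" among all non-respondent
       records; a missing value is treated as 0, so that PRB = PRA.
    """
    if x is None:
        x = 0.0
    return pra / (1 - x * (1 - pra))


# Illustrative (made-up) counts for one survey.
eligibles = 1000
respondents = 600
non_respondents = eligibles - respondents      # 400
not_contactable = 200                          # non-respondents with REASON=1

pra = participation_rate_a(respondents, eligibles)
x = not_contactable / non_respondents
prb = participation_rate_b(pra, x)

# Direct calculation of definition B for comparison.
prb_direct = respondents / (eligibles - not_contactable)
```

In this example PRA is 0.60 and both routes give a PRB of about 0.75. The identity holds because the number of non-respondents equals the eligible sample times (1 - PRA), so the denominator of definition B is the eligible sample times (1 - X(1 - PRA)).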

Based on different data sources three participation rates are determined.

By aggregate data:
The first participation rate is based on the aggregate data provided by the MCC.
By individual data:
The second participation rate is based on the individual level survey core data (Form 04) and non-respondent data (Form 08).
To be reported:
The third participation rate is the one to be used when participation rate is reported. It is normally identical to the rate calculated by aggregate data, unless it is known that the aggregate data are in error. In that case, the rates calculated by individual data will be reported.

In some MCCs a minor discrepancy between the participation rates "by aggregate data" and "by individual data" can be explained by a difference in the calculation of age. The following explanations concern RUAs where an exception was made to the general reporting rule, i.e. there was a major discrepancy between the participation rates calculated from different data sources, or some other comment is needed:

AUS-NEW
Ini: The MCC does not know the original sample size or number of ineligibles and thus, no aggregate data are available.
Mid: Note that a subgroup of the non-respondents was considered as ineligible (see comment in Section 4).
AUS-PER
Mid: The large difference between the response rates "by aggregate data" and "by individual data" in Table 5.1 is due to erroneous coding of age for non-respondents.
BEL-CHA and BEL-GHE
All three surveys: The rates refer to persons who were interviewed at home and had provided the questionnaire data. However, only a fraction of them attended the clinical examination. Therefore, the item-response rates for blood pressure and total cholesterol are significantly lower (see Table 7).
The date of examination was not available for the middle survey non-respondent data from Ghent. When calculating the age group of the non-respondents, the date of examination was estimated as the middle of the survey period.
The significantly lower participation rate "by individual data" for the middle survey is a consequence of the discrepancy between ineligibles "by aggregate data" and "by individual data" (see Table 3).
BEL-LUX
Ini: It looks like non-respondent data (Form 08) have been received by MDC also for those who are ineligible. Therefore, the rate "by individual data" is probably too low, although no confirmation for this suggestion has been received from the MCC.
CAN-HAL
Ini: The MCC has communicated that the non-respondents may include a large number of subjects who are ineligible for the sample. This is because the sampling frame is out of date. The proportion of non-respondents who could not be contacted was 53% in the original sample. A similar situation exists for the final survey, although to a lesser extent.
CHN-BEI
Mid: The aggregate data are incorrect, since the number of non-respondents includes only the fraction that was closely investigated.
FRA-LIL
Ini: There were an additional 45 non-respondents without known age. They were ignored for the rate calculations, since they influenced the results only minimally.
FRA-STR
Ini: The sample was selected from age group 25-64. More accurate age information for the subjects is available to the MONICA investigators only for those who could be contacted. Table A, reported by the MCC, gives the classification of the subjects in age group 25-64. Numbers in brackets are the values computed from the individual data; they are shown only where they differ from the values reported by the MCC.
Table A. Participation rate data for FRA-STR
                                       Men          Women        Total
Respondents                            741          803          1544
Non-respondents                        835          777          1612
Non-respondents with age group known   389 (390)    407 (413)    796 (803)
Ineligible                                                       258
Participation rate A                   47.0%        50.8%        48.9%
Participation rate B                   same as rate A
GER-BRE
Ini: The sample was selected from age group 25-69, and the accurate age is known only for those who were contacted. Therefore, the participation rate "by individual data" is incorrect. The participation rate "by aggregate data" and "to be reported" concerns age group 25-69.
GER-RHN
Ini: MDC has not received an explanation for the large discrepancy between participation rate "by aggregate data" and "by individual data". Therefore, the rate "by individual data" is to be reported, because the individual non-respondent data had been updated by the MCC, but not the aggregate data.
HUN-BUD
Ini: The MCC has confirmed that the number of non-respondents reported (400 for all ages) is correct. There are some reservations about the rates; non-respondent data are available for a surprisingly high 95% of the non-respondents (see Tables 9.1 and 9.2). The way the high item-response rate for non-respondents was achieved is unknown.
HUN-PEC
Ini: The MCC has no information about the original sample size or the eligible sample size. Therefore, it is not possible to give any estimate of the participation rate.
Mid: The MCC is unable to separate the non-respondents and the ineligibles. Therefore, all who did not attend the examination were classified as non-respondents.
ISR-TEL
Ini: The MDC has contradictory information. According to the Sample Selection Description the sample selection and interview took place at the same time. Only those who were at home at the time of the visit were selected to the sample. However, according to the non-respondent data 82% of the non-respondents were in the category "not possible to contact".
ITA-BRI
Mid: The high item-response rate of 90% (see Tables 9.1 and 9.2) was achieved through an extraordinary effort to reach non-respondents via telephone interviews.
LTU-KAU
Fin: 55% of the non-respondents could not be contacted. The sample was selected two years before the examination because the survey was postponed after the sample selection. Furthermore, before the final survey the country underwent major changes, and during that time no reliable information on migration was available.
MLT-MLT
Serial number inventory data and non-respondent data are missing.
NEZ-AUC
Ini: The MDC has no non-respondent data for the RUA.
From the sampling frame it is only known whether a person is older than 18 years. Therefore, the age becomes known only when contact is made with the person. The MCC estimated the sample size by multiplying the total number of letters sent by the proportion of the population 18 years and over who were aged 35-64 years.
Note also that the target population for the initial survey covers only about 80% of the population for other data components (see Section 1). The MCC might consider restricting all data components to the same common target area, or dividing the target population into two RUs, with one of them being the area of the initial survey.
POL-TAR
Ini: The group of ineligibles includes some potential non-respondents (see comments in Section 4). Therefore, values for the participation rate are slightly inflated. However, the effect is small (designating all ineligibles as non-respondents would lower the participation rate by about 5%).
ROM-BUC
Ini: It appears that non-respondent data (Form 08) have been received by MDC only for part of the non-respondents. Therefore, the rate "by individual data" is probably too high, although the MCC has not confirmed this suspicion.
RUS-MOC and RUS-MOI
Ini: The MCC is unable to provide the non-respondent data for RUS-MOI and RUS-MOC for organizational reasons.
Fin: The MCC could not provide non-response data for RUS-MOC and RUS-MOIa because of lack of resources.
RUS-NOC
Ini: Aggregate data are missing for RUS-NOCa (RUS-NOCa was created after the initial survey by sub-dividing RUS-NOCb).
SWI-TIC and SWI-VAF
Ini and Mid: A subgroup of the non-respondents was considered as ineligible (see comment in Section 4). The participation rates may therefore be slightly inflated.
UNK-BEL
Ini: The response rate "by individual data" will be reported, since the aggregate data have not been corrected for the status of some ineligibles that should be respondents.
Mid: The MCC has indicated that the non-respondents may include a large number of subjects who are ineligible for the sample. This is because the sampling frame is out of date.
UNK-GLA
A subgroup of potential non-respondents was excluded from the sampling frame (see comment in Section 4). The participation rates may therefore be slightly inflated.
USA-STA
Ini: The participation rates are slightly inflated (by about 2%), since households that could not be contacted have been included in the ineligibles.
Mid and Fin: The individual age was not known for 157 non-respondents in the middle survey and 83 non-respondents in the final survey. Therefore, as an exception, the participation rates were calculated for the age group 25-64.
All three surveys: Also, some subjects were considered as ineligible because of language problems (see comments in Section 4). As far as could be determined, their effect on participation rates was minor.
YUG-NOS
Ini: Some individuals who completed the survey questionnaire but did not attend the clinic were considered as non-respondents (should be respondents). This causes a minor depression of the participation rates.
Mid: Surprisingly, there were no ineligibles during the middle survey, whereas the other two surveys had ineligibles. No explanation for this difference is available.

Table 5.2 summarises the change in participation rates between the different surveys. The change is always calculated as the rate for the later survey minus the rate for the earlier survey. Thus, negative values indicate a decrease in participation with time. The average change between initial and middle survey was -2.15%, between middle and final -2.14%, and between initial and final -4.08%. These numbers confirm a trend of decreasing participation rates that has also been observed in other surveys.

Table B summarises the changes in the participation rates "to be reported/Definition A" between initial and middle, middle and final, as well as between initial and final.

Table B. Summary of the changes in the participation rate A "to be reported" between surveys
Number of RUAs in categories of change in response
           <-20   -20...-10   -10...0   0...10   10...20
Mid-Ini    0      3           21        16       0
Fin-Mid    0      2           18        17       0
Fin-Ini    1      5           20        12       1

Tables 6.1 and 6.2 give the age specific participation rates for men and women respectively. The rates are based on the "to be reported" selection and calculated for 10-year age groups for each of the three surveys. The data show that there are differences in participation rates between age groups and sexes, but the patterns differ between RUAs. Only the fact that the subjects of the youngest age group (25-34 years) are the most reluctant to participate seems to hold for most of the RUAs, but there are several exceptions to this rule as well. (The clearest examples are ISR-TEL, RUS-NOI and FRA-LIL in the initial survey, CZE-CZE and RUS-NOI in the middle survey, and RUS-NOI, BEL-CHA, GER-BRE and YUG-NOS in the final survey.)

8. Item-response rate for survey respondents

The participation rate, as discussed in Section 7, defines the rate with which contact with a consenting participant was established. Successful contact with a respondent does not imply that all data items stipulated by the MONICA protocol can be collected. Therefore, the response rate for specific data items may be considerably lower than the rates discussed in Section 7. A typical scenario that might lead to reduced item-response rates for cholesterol measurement would be where the respondent completes the questionnaire part of the data but refuses to attend a separate clinic session for physiological data collection.

In Table 7, we compare the participation rate with the item-response rates for blood pressure, cholesterol, BMI, and smoking for each RUA and its respective surveys. Tables 8.1a (systolic blood pressure, men), 8.1b (systolic blood pressure, women), 8.2a (total cholesterol, men), 8.2b (total cholesterol, women), 8.3a (BMI, men), 8.3b (BMI, women), 8.4a (smoking, men) and 8.4b (smoking, women) provide this information in a sex and age-group specific format. The item-response rates were calculated as the product of the overall participation rate and the proportion of respondent records for which data for the specific variable were available.
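
The item-response rate calculation described above can be sketched as follows. The function name and the numbers in the usage lines are hypothetical and serve only to illustrate the product rule.

```python
def item_response_rate(participation_rate, n_respondent_records, n_with_item):
    """Item-response rate = participation rate x proportion of respondent
    records in which the item (e.g. total cholesterol) is available."""
    proportion_available = n_with_item / n_respondent_records
    return participation_rate * proportion_available


# Hypothetical example: 70% participation, 1000 respondent records,
# cholesterol available for 900 of them.
cholesterol_rate = item_response_rate(0.70, 1000, 900)
```

In this illustration the item-response rate for cholesterol is lower than the participation rate because one respondent in ten has no cholesterol value.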

In general, the item-response rates are quite close to the participation rates. The exceptions are BEL-CHA (all surveys), BEL-GHE (all surveys), CAN-HAL (Fin), ISR-TEL (Ini), SWE-GOT (Fin), and USA-STA (Ini and Fin). As far as is known, the scenario described above applies to all of these RUAs. This assumption is also supported by the item-response rate for smoking. Since smoking data were collected at the time of the interview, item-response rate and participation rate are very close.

There is little difference between the item-response rates for the four variables. However, there are a few exceptions to this general observation. Local idiosyncrasies in data collection are probably the explanation for the fact that the item-response rates are low only for blood pressure and cholesterol in ISR-TEL.

9. Availability of non-respondent data

The MONICA Manual states:

"Even though it is not possible to get complete data required for the core study for the non-respondents, the MCC should try to collect information about their age, sex, marital status, education, smoking history and blood pressure. The objective is to estimate the selection bias which non-response inflicts on the core study. Age and sex are often known at the time of sample selection. For other data, a telephone interview or a postal questionnaire can be tried. It is recommended that all non-respondents are asked to provide information. However, it is acceptable that only a random sample of non-respondents is investigated in full. The reason for non-response should be recorded in all cases."

"The survey non-respondent data (Form 08) should be submitted to the MDC for every non-respondent, regardless of how much information was received from the non-respondent. In most cases the MCC should at least elicit the age group, sex and reason for non-response."

The instructions for collecting non-respondent data in MONICA were introduced in 1985. Therefore, it is understandable that some of the MCCs, which did the initial survey earlier, have not collected the data as required. However, it is difficult to accept a situation where the MCC is not even able to list the non-respondents, which points to poor survey management.

Tables 9.1 and 9.2 show for each RUA and the initial, middle and final survey the availability of the non-respondent data items as a percentage of submitted non-respondent records. In column 3 of Table 9.1, the correspondence between aggregate and individual non-respondent counts is categorized as follows:

  1. SAME: This applies for RUAs where the denominator of the proportions probably represents the true number of non-respondents.
  2. MORE: In this category the denominator probably includes also cases that are ineligible for the sample. Therefore, the availability of the data may actually be better than indicated in the table.
  3. LESS: For these RUAs we suspect that non-respondent data have not been submitted for cases where few data are available. Therefore, the true proportions may be smaller than indicated in the table.
  4. SAMPLE: In these RUAs only a sample of the non-respondents was investigated. Therefore, a small proportion may actually indicate a very large availability of data in the sample. The sampling procedures in these populations were:
BEL-LUX
Initial survey: An attempt was made to investigate thoroughly a non-random sample of 30% of the non-respondents. The sampling criterion is not known. MDC probably also has non-respondent records for the ineligibles.
CHN-BEI
Initial survey: An attempt was made to investigate thoroughly a non-random sample of 50% of the non-respondents.
Mid: An attempt was made to investigate thoroughly a non-random sample of 40% of the non-respondents.
HUN-BUD
Initial survey: A random sample of unknown size was attempted for thorough investigation. This, however, is in conflict with the fact that the MDC has a record for every non-respondent and detailed information is known for about 95% of them.
ITA-BRI
Initial survey: A non-random sample of about 20% of the non-respondents was attempted for thorough investigation. The sampling criteria are unclear, and Tables 9.1 and 9.2 reveal that detailed data are available for 26% of the non-respondents.
POL-WAR
Initial survey: A random sample of 20% of the non-respondents was attempted for thorough investigation.
ROM-BUC
Initial survey: There is no information about the sampling, but the MDC has a non-respondent record only for about 25% of the non-respondents. Therefore, the proportions shown in Tables 9.1 and 9.2 are over-estimates.
RUS-NOC and RUS-NOI
Final survey: A non-random sample of about 25% of non-respondents with equal number of individuals in each 10-year age group was investigated thoroughly by home visits.

One may assume that the complete absence of data for the items Marital Status to Weight in Table 9.2 is an indication of systematic neglect of the investigation of non-respondents. This then implies that in 21/54 (39%) RUAs in the initial survey, in 10/43 (23%) RUAs in the middle survey, and in 7/41 (17%) RUAs in the final survey, no attempt was made to investigate the non-respondents in more detail. In most of the cases where such an attempt was made, detailed information is available for fewer than 50% of the non-respondents. Therefore, the utility of the non-respondent data is very limited for many RUAs.

10. Reason for non-response

The reason for non-response may not be available from the RUAs where the initial survey was started before this data item was introduced in 1985. However, the reason for non-response should be available for every non-respondent of every RUA in the middle and final surveys. Table 9.1 indicates that there is only a slight improvement in the availability of this data item between surveys. In the initial survey the reason for non-response was available for more than 80% of the non-respondents in 21/54 (39%) RUAs. In the middle survey the number was 19/43 (44%), and in the final survey it was 21/41 (51%).

Table 10 gives the proportions of the different reasons for non-response for the initial, middle, and final survey respectively. Among the RUAs where the reason is known in more than 80% of the records, the main reasons for non-response were "Not possible to contact", "Not interested" and "Other refusal", but the proportions varied remarkably between the RUAs. It is possible that the difference between "Not interested" and "Other refusal" has been interpreted differently between the MCCs, and even between the two surveys within some RUAs. "Temporarily out of the area" and "Medical reasons" were rare reasons in nearly all RUAs. This pattern did not change much from survey to survey.

11. Consistency of respondent populations in repeated surveys

When repeated surveys are conducted in random samples of the same population, as happens in MONICA, the sample mean values of the measurements can change for many reasons:

  1. There is a change in the population mean values, either because of changes in the individual persons' values or because of a change in the composition of the population;
  2. There is a statistical error because we are measuring only a sample of the population and many of the persons' values (like blood pressure and cholesterol) have short term fluctuations;
  3. There is a change in the representation of the respondent sample, either because of a bias in the sampling frame or because of non-response bias; or
  4. There is a measurement bias.

The objective of the MONICA Project is to estimate the changes due to the first reason. The standard error of the estimates gives information about the statistical error. The third and fourth reasons indicate bias, which we want to avoid. The fourth reason is investigated in the quality assessment reports of the individual risk factors, whereas the third reason is a topic of the current report.

Estimates of population changes in situations where there should be very little or no change in the actual population mean values can be used as indicators of possible bias in the estimates. Under the assumption that the MONICA participants aged 35 or older have essentially concluded their formal education, a birth cohort (i.e. people born in certain years) should show no change in the number of years of schooling from survey to survey, except for randomness caused by sample selection. Similarly, birth cohorts in the age range that is of interest to MONICA should not increase in mean body height, but instead show a small decline as they age. Finally, birth cohorts are unlikely to show a change in the proportion of never-smokers, unless there is a significant selective mortality of smokers. These cohort trends have been investigated in detail in the quality assessment reports (QA) on education (4), weight and height (5) and smoking (6). In each of the three quality assessment reports, Cohort Trend Scores (CTS) were defined. The scores were based on the estimated changes and their standard errors for men and women in the common age groups 35-44 and 45-54, in three steps:

  1. The average change (A) was calculated for each sex/birth cohort. This average change was used as the reference value around which the random variation was expected to occur.
  2. Upper and lower limits were set for a change as A ± 2.5 SE, where SE is the standard error of the estimated change. If each change is normally distributed with mean A and variance SE², and the four changes (two birth cohorts and two sexes) are independent, then the probability that all four changes are within the limits is approximately 95%.
  3. The Cohort Trend Score (CTS) was defined as:
    CTS = 2 if all four changes are within limits;
    1 if one of the four changes is out of limits;
    0 if at least two of the four changes are out of limits.
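
The three steps can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the reference values A are passed in as given (the exact way they were derived is described in the individual QA reports), the probability check assumes the four changes are independent, and the numbers in the usage lines are invented.

```python
import math


def cohort_trend_score(changes, reference_changes, standard_errors, k=2.5):
    """Return the CTS (2, 1 or 0) for the four sex/birth-cohort changes.

    A change is out of limits if it lies outside A +/- k * SE.
    """
    out_of_limits = sum(
        1
        for change, a, se in zip(changes, reference_changes, standard_errors)
        if abs(change - a) > k * se
    )
    if out_of_limits == 0:
        return 2
    if out_of_limits == 1:
        return 1
    return 0


def prob_all_within_limits(k=2.5, n_changes=4):
    """Probability that n independent normal changes all fall within
    k standard errors of their mean (independence is an assumption)."""
    p_single = math.erf(k / math.sqrt(2.0))
    return p_single ** n_changes


# Invented example: one of four changes lies 3 SE from its reference value.
example_cts = cohort_trend_score(
    changes=[0.1, -0.2, 0.0, 3.0],
    reference_changes=[0.0, 0.0, 0.0, 0.0],
    standard_errors=[1.0, 1.0, 1.0, 1.0],
)
coverage = prob_all_within_limits()
```

Under the independence assumption, the coverage for four changes with limits at 2.5 SE comes out close to the 95% quoted in step 2.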

The CTS for body height is derived in Table 6 of the Weight and height QA (5). Similarly, CTS for years of schooling can be found in Table 8 of the Education QA (4) and the CTS for never-smokers in Table 18 of the Smoking QA (6). The three CTS for each RUA are collected in Table 11 of this report to provide a comprehensive picture of birth cohort trends.

It is often difficult to assess from the data whether the changes in the cohort trends are due to measurement bias, change in the target population or change in sample representation. Therefore, the MCCs with low CTS were asked to check possible reasons for the cohort changes in the quality assessment reports for the three individual variables. A change in the target population or in the survey representation is likely to induce changes in the cohorts for more than one variable. A measurement bias in several variables is also possible but less likely. Hence, we will confine our comments here to RUAs where more than one variable has a low CTS for a particular survey pair:

BEL-CHA
Between the initial and the final survey there was a large increase in body height and a decrease in years of schooling. Neither of these changes is plausible, suggesting a change either in the target population or in the survey participation. However, because height and years of schooling are usually positively correlated, changes in opposite directions strengthen the possibility of measurement bias as an explanation.
CHN-BEI
A major change appears to have occurred between the initial and the middle survey. All three variables have low CTS, which is due to a decrease in all three variables. The observed changes provide strong evidence of changes in either the target population or the survey participation.
FRA-STR
All three variables have low CTS for the Ini-Fin survey pair. The low scores for body height and never-smokers are due to an increase in these variables, which is incompatible with stable target population or survey participation. The low score for years of schooling is due to an increase in years of schooling. The MCC reported that the Census also found increased schooling between 1982 and 1990. Nevertheless, the simultaneous occurrence of three low CTS rather suggests a change in either target population or survey participation.
RUS-MOI
For the Ini-Fin survey pair, years-of-schooling has a low CTS because of a large increase in this variable. An increase in the proportion of never-smokers also produced a low CTS. The increase in the years of schooling is plausible, but the increase in the never-smokers is not. The combination of the two scores thus is more likely to reflect a change in target population or survey participation.
RUS-NOI
For the initial and final survey comparison a large decrease in body height and a large increase in years of schooling caused low CTS for both variables. These changes warrant a careful investigation of the stability of the target population or participant characteristics.
SPA-CAT
More than one low CTS occurred only between the middle and final surveys. This was the result of an increase in body height and a decrease in years of schooling. Furthermore, years-of-schooling showed a progressive decrease with every survey, producing a CTS of zero for all three comparisons. According to the MCC, the decrease in years of schooling is the consequence of a change in the target population as a result of immigration.
UNK-BEL
Body height and years-of-schooling produced low CTS for the Mid-Fin and Ini-Fin comparison. The low scores for both comparisons were the result of increases in the respective variables. While the increase in years of schooling is plausible, the increase in body height is not. The combination of the two observations, therefore, suggests that in the final survey a change has occurred in either the target population or the survey participation.

It should be pointed out that a high CTS is not proof of unbiased survey participation. However, it makes it less likely that target population or participant characteristics have changed between surveys. As mentioned above, a change in the characteristics of the target population is fully compatible with the objectives of the MONICA Project and not a source of bias. (Nevertheless, if caused by large migration, it is often associated with difficulties in obtaining a representative sampling frame and accurate population estimates.) A change in the characteristics of the survey participation, however, is a source of bias in the estimation of changes in the risk factors in the population.

12. Sampling fractions

A sample is said to be self-weighting if every member of the population has an equal probability of being selected for the sample. A simple mean calculated from a self-weighting sample gives an unbiased estimate of the population mean value. If the sample is not self-weighting, a weighted mean of the sample will be needed to get unbiased estimates of the population mean value. Therefore, the data analysis will be simpler for a self-weighting sample than for a sample that is not self-weighting.

Typical situations that may lead to unequal sampling probabilities of the subjects are stratified sampling and multi-stage sampling. Most of the MONICA samples were stratified by 10-year age group, but this is not a problem because the age or age-group will nearly always be taken into account in the data analysis anyway. Within the age groups, nearly all, if not all, of the MONICA samples were designed such that the subjects have approximately equal sampling probabilities. There is, however, one situation where this is not necessarily the case. The sample sizes in the different Reporting Units (RU) of the RUAs may have been chosen to be equal even if the population sizes are not equal, or some other criteria may have been used for the sample sizes. Therefore, it is of particular interest to check the sampling probabilities of the subjects of the RUs of the RUAs that consist of more than one RU. There are nine such RUAs: AUS-NEWa, CZE-CZEa, GER-EGEb, ICE-ICEa, ITA-FRIa, POL-TARa, POL-WARa, SWE-NSWa and USA-STAa.

A simple measure of the sampling probability is the sampling fraction, defined as the proportion of the sample in the population. The sampling fraction of the RUs of the nine RUAs by sex and 10-year age group is given in Table 12. The denominator for calculating sampling fractions was the population size of the age/sex group of the RU in the calendar year of the middle of the survey (or the nearest year for which the data were available). The eligible sample size from the aggregate data was the numerator.
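
As a minimal sketch of this definition, the sampling fraction of each RU can be computed and compared within one age/sex group. The function names and all numbers below are hypothetical; they only mimic a situation in which two RUs have the same sample size but very different population sizes.

```python
def sampling_fraction(eligible_sample_size, population_size):
    """Sampling fraction = eligible sample size / population size
    of the same age/sex group in the same RU."""
    return eligible_sample_size / population_size


def fraction_ratio(fractions):
    """Ratio of the largest to the smallest sampling fraction;
    values near 1 suggest that no weighting of the RUs is needed."""
    return max(fractions) / min(fractions)


# Hypothetical age/sex group: equal samples of 500 drawn from
# populations of 50,000 (RU1) and 5,000 (RU2).
fractions = [sampling_fraction(500, 50_000), sampling_fraction(500, 5_000)]
ratio = fraction_ratio(fractions)
```

Here the smaller RU is sampled about ten times as intensively as the larger one, so an unweighted pooled mean would over-represent it.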

No weighting of the subjects will be needed if the sampling fractions are approximately equal in the RUs within each age/sex group within each survey. According to Table 12 this seems to be the case for most of the RUAs. The exceptions are:

ICE-ICE:
The sampling fraction of RU2 (Arnes County) is consistently nearly 10 times the sampling fraction of RU1 (Reykjavik). The reason is that the sample size was the same in Arnes County as in Reykjavik although Reykjavik is nearly ten times as big as Arnes County. Arnes County was included in the survey as a representative of the rest of the country (except Reykjavik). A third of Iceland's population lives in Reykjavik. Unweighted sample mean values would give a very high relative weight to RU2 compared with RU1, but the relative weight would be similar for each of the three surveys. As a consequence, unweighted estimates of trends in the mean values will represent the trend of the RUA as a whole only if the trends in the different RUs are similar.
USA-STA:
The sampling fractions vary between about 1 and 3 between the populations but remain consistent between the three surveys. Unweighted sample means would give much higher relative weight to some reporting units compared with others, but the relative weights would be similar for each of the three surveys. As a consequence, unweighted estimates of trends in the mean values will represent the trend of the RUA as a whole only if the trends in the different RUs are similar.
GER-EGEb:
The sampling fractions of RU19 are twice the sampling fractions of RU17 in the initial survey but the sampling fractions are equal in the final survey. Unweighted sample mean values would give equal relative weights to the two RUs in the final survey but not in the initial or the middle survey. As a consequence, if there is a major difference in the risk factor levels between the two RUs, unweighted estimates of trends in the mean values will not represent the trend of the RUA as a whole even if the trends in the different RUs were similar.

Weighting the RUs in the analysis is a solution to these problems, but it will substantially complicate all analyses. If we can assume that each of these three RUAs is relatively homogeneous, no weighting will be needed.

It is not very likely that unweighted analysis will have a major impact in the cases of ICE-ICE and USA-STA. For GER-EGE the situation seems more complex. To see the potential effect of neglecting the weighting in GER-EGE, Table C gives an example with risk factor differences which are probably much larger than those actually observed for the reporting units.

Table C. Influence of neglecting weighting when estimating change in risk factor mean value in a hypothetical situation
Variable                  RU   Sampling fraction   Proportion or mean value    Estimate of change
                               Ini     Fin         Ini           Fin           Weighted     Unweighted
Smoking                   A    1       2           30 %          30 %          0 %          -1.7 %
                          B    2       2           40 %          40 %
Systolic blood pressure   A    1       2           130 mmHg      130 mmHg      0 mmHg       -1.7 mmHg
                          B    2       2           140 mmHg      140 mmHg
Total cholesterol         A    1       2           5.4 mmol/l    5.4 mmol/l    0 mmol/l     -0.1 mmol/l
                          B    2       2           6.0 mmol/l    6.0 mmol/l

The biases given in the rightmost column of Table C are relatively small compared with the achievable measurement accuracy of the risk factors. As the differences in the risk factor levels between the RUs in Table C are probably also much larger than the real differences between the RUs of GER-EGE, one should feel quite comfortable in estimating the trend without weighting even for GER-EGE.
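
The hypothetical figures of Table C can be reproduced with a short sketch. It assumes that the two RUs have equal population sizes, an assumption that is not stated with the table but is consistent with its numbers; the function name is hypothetical and the sampling fractions are treated as relative units.

```python
def pooled_means(values, sampling_fractions, population_sizes):
    """Return (weighted, unweighted) pooled means over the RUs.

    Unweighted: each RU counts in proportion to its sample size
    (sampling fraction x population size).
    Weighted: each RU counts in proportion to its population size.
    """
    sample_sizes = [f * n for f, n in zip(sampling_fractions, population_sizes)]
    unweighted = sum(v * s for v, s in zip(values, sample_sizes)) / sum(sample_sizes)
    weighted = sum(v * n for v, n in zip(values, population_sizes)) / sum(population_sizes)
    return weighted, unweighted


populations = [1.0, 1.0]   # assumption: RU A and RU B are equally large
smoking = [30.0, 40.0]     # per cent smokers in RU A and RU B, both surveys

w_ini, u_ini = pooled_means(smoking, [1, 2], populations)  # initial survey
w_fin, u_fin = pooled_means(smoking, [2, 2], populations)  # final survey

weighted_change = w_fin - w_ini      # no change in the population
unweighted_change = u_fin - u_ini    # spurious decrease from unequal fractions
```

Replacing the smoking values by the blood pressure or cholesterol values of Table C gives the other two rows under the same assumption.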

13. Discussion and recommendations

This report was an attempt to summarise the situation concerning the quality of sampling frames and survey non-respondents in the MONICA risk factor surveys. We can draw some general conclusions from the findings.

There were big differences in the availability of good sampling frames among the RUAs. Only about 40% of the sampling frames were found to be of good quality, and 25% were clearly of poor quality. There still exists a lot of uncertainty about the specific properties of the sampling frames in the various RUAs. For example, it is not clear whether all population registers can be considered equal in terms of being up-to-date. Also, there appear to be significant differences in the quality of electoral rolls. We therefore feel that our use of proportion of ineligibles and proportion of not located for quality assessment was probably the best choice under the circumstances.

Many MCCs did not report the exact definition of eligibility used in their initial survey. In the middle and final surveys, most MCCs reported defining eligibility for the survey according to the instructions given in the MONICA Manual. However, such assertions are often suspect since they conflict with other data available at the MDC. If the definition of eligibility changes, the comparison of response rates between different surveys becomes difficult. In RUAs where the proportion of ineligibles is high, there is a great risk of bias if many ineligibles were so classified by mistake and if they were exceptional with respect to the risk factors for cardiovascular diseases. Such biases could have a noticeable influence on the estimates of risk factor trends. We suggest that data weighting should be considered for RUAs where eligibility changed or where major discrepancies between individual and aggregate data (Table 4) exist.

If the response rate exceeds 80%, we can be quite confident in applying the results of the survey to the whole RUA, provided that the quality of the data is otherwise good. Only one third of the RUAs in the initial survey, one fifth of the RUAs in the middle survey, and one sixth of the RUAs in the final survey reached that level. The 70% limit, which we might still consider satisfactory, was exceeded by two thirds of the RUAs in the initial survey, three quarters in the middle survey, and two thirds again in the final survey. It is a concern that a noticeable number of RUAs remained below 70%, and the extreme ones even below 50%. It may be erroneous to assume that the risk factor changes observed in these surveys reflect the situation in the population.

The data available for this report do not provide hard evidence for the representativeness of the respondents. The crucial problem is the low availability of information about the non-respondents, a problem shared with most other surveys. Even if more data were available, they would have to be treated with caution, in comparison to the data about the respondents. The conclusions one could draw from the non-respondent data would hardly be more than qualitative. Nevertheless, such data would help in understanding the full risk factor profile and trends in the population. Perhaps the next step for investigating the non-respondents in more detail has to be taken locally by the MCCs. The MCCs have the best knowledge of the available sampling frames, the procedures used to achieve high response rates, the procedures used to investigate the non-respondents, and possibly other information which helps to characterise the non-respondents.

Suggestions for conducting such investigations are:

  1. If the level of education of the target population is available from census data, it can be compared with the level of education in the MONICA sample. Note, however, that the questions for establishing education levels have to be comparable between census and MONICA surveys.
  2. Evaluate the changes in "Reason of non-response" between surveys.
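The first suggestion could be approached, for example, with a goodness-of-fit comparison. The following is a minimal sketch, assuming that education has been coded into the same categories in the census and in the MONICA sample. The counts and proportions are purely hypothetical and do not come from any MONICA population, and the function name is illustrative.

```python
def chi_square_gof(sample_counts, census_proportions):
    """Pearson chi-square statistic comparing sample counts per education
    category with the counts expected from census proportions."""
    if len(sample_counts) != len(census_proportions):
        raise ValueError("sample and census must use the same categories")
    n = sum(sample_counts)
    statistic = 0.0
    for observed, proportion in zip(sample_counts, census_proportions):
        expected = n * proportion  # count expected if the sample matched the census
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical example with three education categories (low, middle, high).
sample_counts = [50, 30, 20]         # respondents per category (invented)
census_proportions = [0.4, 0.4, 0.2]  # census distribution (invented)
statistic = chi_square_gof(sample_counts, census_proportions)
degrees_of_freedom = len(sample_counts) - 1
print(statistic, degrees_of_freedom)
```

A large statistic relative to the chi-square distribution with the given degrees of freedom would indicate that the respondents differ from the target population in education. It would not by itself show that non-response is the cause, since differences in question wording between census and survey can produce the same effect.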

14. Comments on individual RUAs

The following list includes only the RUAs with specific findings or exceptional background information relevant for the use of the data.

AUS-NEW

AUS-PER

BEL-CHA

BEL-GHE

BEL-LUX

CAN-HAL

FRA-LIL

FRA-STR

FRA-TOU

GER-AUR and GER-AUU

GER-BER and GER-COT and GER-EGE and GER-HAC and GER-KMS and GER-RDM

GER-BRE

GER-RHN

HUN-BUD

ISR-TEL

ITA-BRI

ITA-FRI

ITA-LAT

LTU-KAU

MLT-MLT

NEZ-AUC

ROM-BUC

RUS-MOC and RUS-MOI

RUS-NOC and RUS-NOI

SPA-CAT

UNK-BEL

UNK-GLA

USA-STA

YUG-NOS

References

  1. Tunstall-Pedoe H for the WHO MONICA Project. The World Health Organization MONICA Project (Monitoring Trends and Determinants in Cardiovascular Disease): A major international collaboration. J Clin Epidemiol 1988;41:105-14.
  2. WHO MONICA Project. MONICA Manual. (1998-1999). Available from: URL:http://www.thl.fi/publications/monica/manual/index.htm, URN:NBN:fi-fe19981146.
  3. Kuulasmaa K, Tolonen H, Ferrario M, Ruokokoski E for the WHO MONICA Project. Age, date of examination and survey periods in the MONICA surveys. (May 1998). Available from: URL:http://www.thl.fi/publications/monica/age/ageqa.htm, URN:NBN:fi-fe19991075.
  4. Molarius A, Kuulasmaa K, Moltchanov V, Ferrario M for the WHO MONICA Project. Quality Assessment of Data on Marital Status and Educational Achievement in the WHO MONICA Project. (December 1998). Available from: URL:http://www.thl.fi/publications/monica/educ/educqa.htm, URN:NBN:fi-fe19991078.
  5. Molarius A, Kuulasmaa K, Sans S for the WHO MONICA Project. Quality assessment of weight and height measurements in the WHO MONICA Project. (May 1998). Available from: URL:http://www.thl.fi/publications/monica/bmi/bmiqa20.htm, URN:NBN:fi-fe19991079.
  6. Molarius A, Kuulasmaa K, Evans A, McCrum E, Tolonen H for the WHO MONICA Project. Quality assessment of data on smoking behaviour in the WHO MONICA Project. (February 1999). Available from: URL:http://www.thl.fi/publications/monica/smoking/qa30.htm, URN:NBN:fi-fe19991077.