Data transfer format: Case and subcohort selection data
© National Institute for Health and Welfare and the MORGAM Project investigators. Last updated: 11 November 2015. For more information, please contact Kari Kuulasmaa (firstname.lastname@thl.fi).
The purpose of this form is to provide a format for the transfer of data from the selection of cases and subcohort to the MORGAM database. Data in the format specified here are provided for every member of each MORGAM cohort for which the selection of cases and the subcohort has been done. The MORGAM case-cohort design is described in sections "Case-cohort sampling in MORGAM" and "Enlargement of subsample" of the Manual.
This format should not be used for transferring data from the MORGAM Participating Centres to the MORGAM Data Centre (MDC), because these data are generated by the MDC.
| ITEM NAME | SPECIFICATION AND CODES | FORMAT | VALUE |
|---|---|---|---|
| Form identification: | | | |
| FORM | Form identification | I2 | 65 |
| VERSN | Form version | I1 | 5 |
| General information on selection - these items have the same value for all records in this transfer set: | | | |
| SELECTION | Case and subcohort selection id number | C3 | |
| PHASE | Phase of selection of cases and subsample in the cohort: 01 = first; 02 = second; etc. | C2 | |
| DATE | Date when the selection of cases and cohort subsample was done (date ANSI) | C8 | |
| KEY1R: | | | |
| CENTRE | MORGAM Participating Centre | C2 | |
| RUNIT | MORGAM Reporting Unit | C2 | |
| COHORT | Cohort identification within the RUNIT: 01 = MONICA baseline survey; 02 = MONICA middle survey; 03 = MONICA final survey; 21, 22, ... = other cohorts | C2 | |
| SERIAL | Serial number | C6 | |
| ROUNDS | Measurement round of the cohort | C2 | |
| Case and subcohort selection data: | | | |
| ELIGSC | Eligibility of the person to the subcohort: 1 = eligible; 2 = ineligible | I1 | |
| PROB | Selection probability of the person to the random cohort subsample | R | |
| SUBCOH | Was the person selected to the random cohort subsample? 1 = yes; 2 = no | I1 | |
| PROBDTH | Selection probability of the person because of death | R | |
| CASEDTH | Was the person selected to the case-cohort set because of death? 1 = yes; 2 = no | I1 | |
| PROBCHD | Selection probability of the person because of CHD event during follow-up | R | |
| CASECHD | Was the person selected to the case-cohort set because of CHD event during follow-up? 1 = yes; 2 = no | I1 | |
| PROBSTR | Selection probability of the person because of stroke during follow-up | R | |
| CASESTR | Was the person selected to the case-cohort set because of stroke during follow-up? 1 = yes; 2 = no | I1 | |
| PROBTED | Selection probability of the person because of venous thromboembolism during follow-up | R | |
| CASETED | Was the person selected to the case-cohort set because of a thrombo-embolic event during follow-up? 1 = yes; 2 = no | I1 | |
| PROBAP | Selection probability of the person because of angina pectoris during follow-up | R | |
| CASEAP | Was the person selected to the case-cohort set because of angina pectoris during follow-up? 1 = yes; 2 = no | I1 | |
| PROBHF | Selection probability of the person because of heart failure during follow-up | R | |
| CASEHF | Was the person selected to the case-cohort set because of heart failure during follow-up? 1 = yes; 2 = no | I1 | |
| PROBAF | Selection probability of the person because of atrial fibrillation during follow-up | R | |
| CASEAF | Was the person selected to the case-cohort set because of atrial fibrillation during follow-up? 1 = yes; 2 = no | I1 | |
| PROBDIAB | Selection probability of the person because of type 2 diabetes during follow-up | R | |
| CASEDIAB | Was the person selected to the case-cohort set because of type 2 diabetes during follow-up? 1 = yes; 2 = no | I1 | |
| PROBBCHD | Selection probability of the person because of CHD at baseline | R | |
| CASEBCHD | Was the person selected to the case-cohort set because of CHD at baseline? 1 = yes; 2 = no | I1 | |
| PROBBSTR | Selection probability of the person because of stroke at baseline | R | |
| CASEBSTR | Was the person selected to the case-cohort set because of stroke at baseline? 1 = yes; 2 = no | I1 | |
| GENGROUP | Genotyping group: 1 = subcohort member or case during the follow-up who was healthy at baseline; 2 = baseline case not in the subcohort; 8 = not selected to the case-cohort set; 9 = not used | I1 | |

| FORMAT | Type | Format | Example | Comments |
|---|---|---|---|---|
| C | Character | C7 | SWE-NSWa | RUA abbreviation used in MORGAM. |
| | | C2 | 03 | Cohort identification |
| F | Float | F5.2 | 13.1 | Variable includes a decimal point (.). |
| R | Real number | R | 0.24927345 | Selection probability. |
| I | Integer | I5 | 221, 10323 | |

The data files shall be prepared as ASCII delimited text in the style of comma-delimited (CSV) files, but using the semicolon (;) as the delimiter, with the names of the variables in the first row.
Follow these instructions carefully when creating the computer file for data transfer. Any exceptions to these coding rules must be documented in the Case and subcohort selection report on the internal MORGAM web site.
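As an illustration of the file layout described above, the following minimal Python sketch writes and reads a semicolon-delimited file with the variable names in the first row. The item names are taken from this document; the column order, the helper names `write_records` and `read_records`, and the example record are assumptions made for illustration only and do not describe the MDC's own software.

```python
import csv
import io

# Item names in the order they are listed in this document (the required
# column order is an assumption; the document does not state it).
FIELDS = [
    "FORM", "VERSN", "SELECTION", "PHASE", "DATE",
    "CENTRE", "RUNIT", "COHORT", "SERIAL", "ROUNDS",
    "ELIGSC", "PROB", "SUBCOH",
    "PROBDTH", "CASEDTH", "PROBCHD", "CASECHD", "PROBSTR", "CASESTR",
    "PROBTED", "CASETED", "PROBAP", "CASEAP", "PROBHF", "CASEHF",
    "PROBAF", "CASEAF", "PROBDIAB", "CASEDIAB",
    "PROBBCHD", "CASEBCHD", "PROBBSTR", "CASEBSTR", "GENGROUP",
]


def write_records(records, stream):
    """Write dict records as semicolon-delimited text with a header row."""
    writer = csv.DictWriter(
        stream, fieldnames=FIELDS, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)


def read_records(stream):
    """Read the file back; all values stay strings, so leading zeros
    in C-format items such as SERIAL are preserved."""
    return list(csv.DictReader(stream, delimiter=";"))


# Hypothetical record for illustration only (not real MORGAM data).
record = {name: ("0" if name.startswith("PROB") else "2") for name in FIELDS}
record.update(
    FORM="65", VERSN="5", SELECTION="001", PHASE="01", DATE="20151111",
    CENTRE="01", RUNIT="01", COHORT="01", SERIAL="000123", ROUNDS="01",
    ELIGSC="1", PROB="0.24927345", SUBCOH="1", GENGROUP="1",
)

buffer = io.StringIO()
write_records([record], buffer)
buffer.seek(0)
rows = read_records(buffer)
```

When writing to a real file rather than an in-memory buffer, the file would typically be opened with `open(path, "w", newline="", encoding="ascii")` so that non-ASCII characters are rejected.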
| FORM | Form identification | I2 | 65 |
Number 65 indicates the "Data transfer format: Case and subcohort selection data".
| VERSN | Version of this form | I1 | 5 |
This indicates the version number of this data transfer format entitled "Data transfer format: Case and subcohort selection data".
| SELECTION | Case and subcohort selection id number | C3 |
This is a unique sequence number which identifies the selection of cases and subcohorts of this transfer set. The first ever selection has value 001, the second 002 etc. Gaps are not allowed in the numbering. For each selection, a report with the same SELECTION number has to be prepared in the internal MORGAM web site (see Case and subcohort selection report).
Item SELECTION must have the same value for every record in this data transfer set.
| PHASE | Phase of selection of cases and subsample in the cohort: 01 = first; 02 = second; etc. | C2 |
The selection of cases and subcohort can be supplemented later, for example due to extension of the follow-up period of a MORGAM cohort.
Each time there are changes to the case-cohort selection for a RUNIT/COHORT combination, the PHASE value is increased to the next unused value within the cohorts being considered. Item PHASE must have the same value for every record in this data transfer set. (If, for any reason, the highest earlier PHASE value varied between the cohorts considered, there will be gaps in the sequence of PHASE values for some cohorts. This does not matter.)
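The rule for choosing the next PHASE value can be sketched as follows. This is a minimal illustration, assuming that the highest PHASE used so far is known for each RUNIT/COHORT combination in the transfer set; the function name `next_phase` and the input mapping are hypothetical.

```python
def next_phase(highest_phase_by_cohort):
    """Return the next unused PHASE value as a two-character string.

    highest_phase_by_cohort maps a (RUNIT, COHORT) pair to the highest
    PHASE value used so far for that cohort, or None if it has none.
    """
    used = [int(p) for p in highest_phase_by_cohort.values() if p is not None]
    # One more than the highest value among all cohorts considered,
    # so that the same PHASE can be used for every record.
    return f"{max(used, default=0) + 1:02d}"


# Hypothetical example: the cohorts have reached different phases.
earlier = {("01", "01"): "02", ("01", "02"): "01", ("02", "21"): None}
phase = next_phase(earlier)
```

In this example the new PHASE is 03 for all three cohorts, which leaves a gap (no PHASE 02) for the second cohort, as the text above allows.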
| DATE | Date when the selection of cases and cohort subsample was done (date ANSI) | C8 |
The first four numbers indicate the year, the next two the month and the last two the day of month.
Item DATE must have the same value for every record in this data transfer set.
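The requirements on SELECTION, PHASE and DATE can be checked before transfer. The sketch below is one possible way to do this, assuming that the records are held as dictionaries of strings; the function names `parse_selection_date` and `check_constant_items` are hypothetical.

```python
from datetime import datetime


def parse_selection_date(value):
    """Parse a C8 DATE value (YYYYMMDD); raise ValueError if malformed."""
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"DATE must be 8 digits, got {value!r}")
    return datetime.strptime(value, "%Y%m%d").date()


def check_constant_items(rows):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    # SELECTION, PHASE and DATE must be identical on every record.
    for item in ("SELECTION", "PHASE", "DATE"):
        values = sorted({row[item] for row in rows})
        if len(values) > 1:
            problems.append(f"{item} varies within the transfer set: {values}")
    # Every distinct DATE value must be a valid calendar date.
    for value in sorted({row["DATE"] for row in rows}):
        try:
            parse_selection_date(value)
        except ValueError as exc:
            problems.append(str(exc))
    return problems
```

The explicit length and digit check is made before calling `strptime`, because `strptime` on its own also accepts months and days written with a single digit.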
| CENTRE | MORGAM Participating Centre | C2 |
| RUNIT | MORGAM Reporting Unit | C2 |
| COHORT | Cohort identification within the RUNIT: 01 = MONICA baseline survey; 02 = MONICA middle survey; 03 = MONICA final survey; 21, 22, ... = other cohorts | C2 |
| SERIAL | Serial number | C6 |
| ROUNDS | Measurement round of the cohort: 01 = baseline measurement; 02 = second measurement; etc. | C2 |
These are key items used for identifying the record and merging it with other records of the same individual.
CENTRE is the official MORGAM Participating Centre code number, RUNIT the official MORGAM Reporting Unit code number and COHORT the official MORGAM Cohort code number as they appear in section "MORGAM Participating Centres and cohorts" of the MORGAM Manual.
SERIAL is the identification for the individual. It is unique within the combination of CENTRE, RUNIT and COHORT.
ROUNDS identifies the measurement round of the cohort (repeat measurements).
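The use of the key items for identifying and merging records can be illustrated with a short sketch. It assumes that the five key items together identify a record uniquely and that records are held as dictionaries of strings; the function names and the item `EXTRA` in the usage note are hypothetical.

```python
from collections import Counter

# Key items as listed under KEY1R in this document.
KEY1R = ("CENTRE", "RUNIT", "COHORT", "SERIAL", "ROUNDS")


def record_key(row):
    """Return the key of a record as a tuple of strings."""
    return tuple(row[item] for item in KEY1R)


def duplicate_keys(rows):
    """Return the keys that occur on more than one record."""
    counts = Counter(record_key(row) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def merge_on_key(selection_rows, other_rows):
    """Join selection records with other records of the same individual.

    Only records whose key is present in both inputs are returned; items
    from the selection record take precedence if names collide.
    """
    other_by_key = {record_key(row): row for row in other_rows}
    merged = []
    for row in selection_rows:
        match = other_by_key.get(record_key(row))
        if match is not None:
            merged.append({**match, **row})
    return merged
```

For example, merging a selection record with a record from another data set that carries the same key and an additional item `EXTRA` yields one record containing both the selection items and `EXTRA`.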
| ELIGSC | Eligibility of the person to the subcohort: 1 = eligible; 2 = ineligible | I1 |
The two most common criteria for the eligibility of the person to the subcohort are:
Other criteria have been used in specific situations. These are described in the Case and subcohort selection report.
| PROB | Selection probability of the person to the random cohort subsample | R |
This is the sampling probability for the individual to the random subcohort, i.e. the marginal probability with which the person is in the cohort subsample (see section "Sampling within the strata" of "Case cohort sampling in MORGAM" for the biomarker substudy and "Selection of cases and cohort subsample" for the genetic substudy).
Code 0 if the person was ineligible for the subsample.
The sampling weights for data analysis in the case-cohort design are derived from items PROB, PROBDTH, PROBCHD, etc., depending on the definition of the end-point for each analysis.
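This document does not give the formula for the sampling weights. Purely as an illustration of how such a weight could be derived, the sketch below uses inverse-probability weighting under the assumption that selection to the subcohort and selection as a case for a single end-point are independent. This assumption, the function names and the choice of weighting scheme are not taken from this document; the MORGAM Manual should be consulted for the procedure actually used.

```python
def inclusion_probability(prob_subcohort, prob_case):
    """Probability of being in the case-cohort set through either route.

    Assumes (hypothetically) that the two selections are independent, so
    the person is excluded only if both selections miss.
    """
    return 1.0 - (1.0 - prob_subcohort) * (1.0 - prob_case)


def sampling_weight(row, case_item="PROBCHD"):
    """Inverse-probability weight for an analysis of one end-point.

    Returns None when the inclusion probability is zero, i.e. the person
    could not have been selected through either route.
    """
    p = inclusion_probability(float(row["PROB"]), float(row[case_item]))
    return None if p == 0.0 else 1.0 / p


# Hypothetical non-case with subcohort sampling probability 0.25.
weight = sampling_weight({"PROB": "0.25", "PROBCHD": "0"})
```

Under these assumptions, a person with PROBCHD equal to one has weight 1, and a non-case has a weight equal to the inverse of PROB.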
| SUBCOH | Was the person selected to the random cohort subsample? 1 = yes; 2 = no | I1 |
This item indicates whether the individual belongs to the random subcohort or not (for genetic study see Subcohort sampling).
Code 2 if the person was ineligible for the subsample.
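The coding rules for ELIGSC, PROB and SUBCOH can be checked record by record. The following sketch assumes records held as dictionaries of strings; the function name `check_subcohort_items` is hypothetical, and the range check on PROB follows only from PROB being a probability.

```python
def check_subcohort_items(row):
    """Return a list of problems found in ELIGSC, PROB and SUBCOH."""
    problems = []
    prob = float(row["PROB"])
    if not 0.0 <= prob <= 1.0:
        problems.append(f"PROB out of range: {prob}")
    if row["ELIGSC"] == "2":
        # Ineligible persons: PROB is coded 0 and SUBCOH is coded 2.
        if prob != 0.0:
            problems.append("PROB must be 0 for an ineligible person")
        if row["SUBCOH"] != "2":
            problems.append("SUBCOH must be 2 for an ineligible person")
    return problems
```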
| PROBDTH | Selection probability of the person because of death | R |
This item gives the person's probability of being sampled into the case-cohort set because of death. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not die during follow-up. For an eligible person who died during follow-up, it is usually one. Where only a random sample of the deaths is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEDTH | Was the person selected to the case-cohort set because of death? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of death.
| PROBCHD | Selection probability of the person because of CHD event during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of a CHD event during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have a CHD event during follow-up. For an eligible person who had a CHD event during follow-up, it is usually one. Where only a random sample of the CHD cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASECHD | Was the person selected to the case-cohort set because of CHD event during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of CHD event during follow-up.
| PROBSTR | Selection probability of the person because of stroke during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of a stroke during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have a stroke during follow-up. For an eligible person who had a stroke during follow-up, it is usually one. Where only a random sample of the stroke cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASESTR | Was the person selected to the case-cohort set because of stroke during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of stroke during follow-up.
| PROBTED | Selection probability of the person because of venous thromboembolism during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of venous thromboembolism during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have a thromboembolic event during follow-up. For an eligible person who had a thromboembolic event during follow-up, it is usually one. Where only a random sample of the cases of thromboembolism is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASETED | Was the person selected to the case-cohort set because of a thrombo-embolic event during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of a thrombo-embolic event during follow-up.
| PROBAP | Selection probability of the person because of angina pectoris during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of the diagnosis of angina pectoris during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have angina pectoris during follow-up. For an eligible person who had angina pectoris during follow-up, it is usually one. Where only a random sample of the angina pectoris cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEAP | Was the person selected to the case-cohort set because of angina pectoris during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of the diagnosis of angina pectoris during follow-up.
| PROBHF | Selection probability of the person because of heart failure during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of heart failure during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have heart failure during follow-up. For an eligible person who had heart failure during follow-up, it is usually one. Where only a random sample of the heart failure cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEHF | Was the person selected to the case-cohort set because of heart failure during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of heart failure during follow-up.
| PROBAF | Selection probability of the person because of atrial fibrillation during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of atrial fibrillation during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have atrial fibrillation during follow-up. For an eligible person who had atrial fibrillation during follow-up, it is usually one. Where only a random sample of the atrial fibrillation cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEAF | Was the person selected to the case-cohort set because of atrial fibrillation during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of atrial fibrillation during follow-up.
| PROBDIAB | Selection probability of the person because of type 2 diabetes during follow-up | R |
This item gives the person's probability of being sampled into the case-cohort set because of the diagnosis of type 2 diabetes during follow-up. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have type 2 diabetes during follow-up. For an eligible person who had type 2 diabetes during follow-up, it is usually one. Where only a random sample of the type 2 diabetes cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEDIAB | Was the person selected to the case-cohort set because of type 2 diabetes during follow-up? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of type 2 diabetes during follow-up.
| PROBBCHD | Selection probability of the person because of CHD at baseline | R |
This item gives the person's probability of being sampled into the case-cohort set because of a history of CHD at baseline. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have CHD at baseline. For an eligible person who had CHD at baseline, it is usually one. Where only a random sample of the baseline CHD cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEBCHD | Was the person selected to the case-cohort set because of CHD at baseline? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of CHD at baseline.
| PROBBSTR | Selection probability of the person because of stroke at baseline | R |
This item gives the person's probability of being sampled into the case-cohort set because of a history of stroke at baseline. It is always zero for persons who were ineligible for the case-cohort set and for persons who did not have stroke at baseline. For an eligible person who had stroke at baseline, it is usually one. Where only a random sample of the baseline stroke cases is included in the case-cohort set, it is less than one. The sampling procedure used must be described in the Case and subcohort selection report.
| CASEBSTR | Was the person selected to the case-cohort set because of stroke at baseline? 1 = yes; 2 = no | I1 |
This item indicates whether or not the person was selected to the case-cohort set because of stroke at baseline.
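The selection probability and selection indicator items come in pairs (PROBDTH and CASEDTH, PROBCHD and CASECHD, and so on), so their consistency can be checked in one loop. The sketch below assumes records held as dictionaries of strings; the function name `check_case_items` is hypothetical. The only cross-item rule it applies is that a person whose selection probability is zero cannot have been selected.

```python
# Suffixes of the PROBxxx / CASExxx item pairs defined in this document.
ENDPOINTS = ("DTH", "CHD", "STR", "TED", "AP", "HF", "AF", "DIAB", "BCHD", "BSTR")


def check_case_items(row):
    """Return a list of problems found in the PROBxxx / CASExxx pairs."""
    problems = []
    for suffix in ENDPOINTS:
        prob = float(row["PROB" + suffix])
        case = row["CASE" + suffix]
        if not 0.0 <= prob <= 1.0:
            problems.append(f"PROB{suffix} out of range: {prob}")
        if case not in ("1", "2"):
            problems.append(f"CASE{suffix} must be 1 or 2, got {case!r}")
        # A zero selection probability rules out selection for this reason.
        if case == "1" and prob == 0.0:
            problems.append(f"CASE{suffix} is 1 but PROB{suffix} is 0")
    return problems
```

A record with a positive PROBxxx and CASExxx coded 2 is not reported, because this combination occurs when only a random sample of the cases is included in the case-cohort set.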
| GENGROUP | Genotyping group: 1 = subcohort member or case during the follow-up who was healthy at baseline; 2 = baseline case who is not in the subcohort; 8 = not selected to the case-cohort set; 9 = not used | I1 |
This item is relevant only for genetic studies.
| Date | Update |
|---|---|
| 2007-05-21 | Item GENGROUP was added |
| 2012-01-12 | Item ROUNDS was added and form version changed to 3 |
| 2013-04-26 | Version 4: New case definitions, new criteria for sampling. |
| 2015-11-11 | Version 5: The possibility of random sampling of cases was accommodated |