Data transfer format - SNP genotypic data transfer to MDC
© National Institute for Health and Welfare and the MORGAM Project investigators

Last updated: 2011-01-14

For more information, please contact Päivi Laiho (firstname.lastname@thl.fi), Ari Haukijärvi (firstname.lastname@thl.fi) or Kari Kuulasmaa (firstname.lastname@thl.fi).
The purpose of this transfer format is to provide an exact and common format for the MORGAM Genotyping Laboratories (GLAB) to transfer genotypic data to the MORGAM Data Centre (MDC) at the National Institute for Health and Welfare (THL). The data should be sent by e-mail.
| NAME | SPECIFICATION AND CODES | FORMAT | VALUE |
|---|---|---|---|
| FORM | Form number | I2 | 46 |
| VERSION | Version of this form | I1 | 2 |
| GLAB | Genotyping Laboratory code (GLAB), selected from the list of genotyping laboratories | I3 | _ _ _ |
| SHIPMENT | Shipment number | I6 | _ _ _ _ _ _ |
| SHIPDATE | Date of this shipment (date ANSI) YYYYMMDD (year month day) | C8 | _ _ _ _ _ _ _ _ |
| KEY2 | Sample identification | I7 | _ _ _ _ _ _ _ |
| MARKER | SNP / polymorphism name (abbreviation) as used locally | C30 | free text |
| METHOD | Method of genotyping | I1 | _ |
| GENOTYPE | Genotype | C256 | free text |
| STATUS | Genotyping status | I1 | _ |
| GENODATE | Genotyping analysis run date or reading date (date ANSI) YYYYMMDD (year month day) | C8 | _ _ _ _ _ _ _ _ |
| ORIENTATION | Sequence orientation: 'F' = Forward, 'R' = Reverse | C1 | _ |
| STRAND | Strand: 'T' = Top, 'B' = Bottom | C1 | _ |
| COMMENTS | Comments on this genotypic data row (if any) | C100 | free text |
The NAME of each item in this document is the computer variable name that the MDC uses for that item.
| FORMAT | Type | Format | Example | Comments |
|---|---|---|---|---|
| C | Character | C8 | 20090205 | Used for dates, trailing zeroes are mandatory. |
| | | C20 | "B 17 E" | Location of sample in box; surround with double quotes (") if value contains spaces. Maximum of 20 characters allowed for this variable, including quotes. |
| | | C3 | -80 | Storage temperature, includes leading '-' sign. |
| F | Float | F5.2 | 13.1 2.18 | Variable must include a decimal point (.). Leading and trailing zeroes are not mandatory. |
| I | Integer | I3 | 1 15 180 | Leading zeroes are not mandatory. |
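The format codes above can be checked mechanically before a file is sent. The following is a minimal sketch, not part of the MORGAM tooling: the function name `matches_format` is hypothetical, and it assumes that the number after the letter is the maximum total length of the value, that the digit after the dot in an F code is the maximum number of decimals, and that integers are unsigned.

```python
import re


def matches_format(value: str, fmt: str) -> bool:
    """Return True if `value` fits a format code such as C8, I3 or F5.2.

    Assumptions (not stated in the form): the width is a maximum total
    length, and F<w>.<d> allows at most <d> digits after the decimal point.
    """
    spec = re.fullmatch(r"([CIF])(\d+)(?:\.(\d+))?", fmt)
    if spec is None:
        raise ValueError(f"unrecognised format code: {fmt!r}")
    kind, width = spec.group(1), int(spec.group(2))
    if len(value) > width:
        return False
    if kind == "C":
        # Character values are only limited by their length.
        return True
    if kind == "I":
        # Integer: digits only; leading zeroes are allowed but not required.
        return re.fullmatch(r"\d+", value) is not None
    # Float: a decimal point is mandatory.
    decimals = int(spec.group(3) or 0)
    number = re.fullmatch(r"\d*\.(\d*)", value)
    return number is not None and value != "." and len(number.group(1)) <= decimals
```

For example, `matches_format("180", "I3")` accepts the value, while `matches_format("13", "F5.2")` rejects it because the decimal point is missing.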
Instructions for making corrections to data that have already been sent to the MDC are
given in Section Data communication between the
Participating Centres and the MDC.
Please contact the MDC for instructions if you cannot provide information as specified in
this document or if you have any problems with the interpretation of the coding for any
specific items.
Follow these instructions carefully when creating a computer file for the data transfer from the GLAB to the MDC.
| FORM | Form number | C2 | |4|6| |
|---|
Number 46 indicates the "Data transfer format: SNP data transfer to MDC".
| VERSION | Version of this form | C1 | |2| |
|---|
Current version of this data transfer format. If the version number of the data transfer format which you are using is not "2", these instructions do not correspond to the format you are using. The version number changes when major changes are introduced in the form, for example when variables are added or deleted.
| GLAB | Genotyping Laboratory code | C3 | |_|_|_| |
|---|
Enter appropriate code of your MORGAM Genotyping Laboratory from the list of MORGAM Genotyping Laboratories:
If the genotypic data are sent from an MPC instead of a MORGAM laboratory, enter here the MPC code (e.g. 026 for Augsburg).
| SHIPMENT | Shipment number | I6 | |_|_|_|_|_|_| |
|---|
This is the shipment number from the genotyping laboratory to the MDC. Usually it is a sequential number of shipments to the MDC. If you do not use sequential numbering, enter the number of the shipment to the MDC on that particular day.
| SHIPDATE | Date of this shipment (date ANSI) YYYYMMDD (year month day) | C8 | |_|_|_|_||_|_||_|_| |
|---|
The date of the shipment (the date when you completed this data form). Format is YYYYMMDD (date ANSI, 8 characters, year month day).
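As an illustration of the YYYYMMDD date format, the sketch below formats and checks such dates with Python's standard library. The helper names `to_ansi_date` and `is_ansi_date` are hypothetical and are not part of any MORGAM software.

```python
from datetime import date, datetime


def to_ansi_date(d: date) -> str:
    """Format a date as YYYYMMDD, zero-padding month and day."""
    return d.strftime("%Y%m%d")


def is_ansi_date(text: str) -> bool:
    """Return True if `text` is exactly 8 digits forming a real calendar date."""
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        # e.g. month 13 or day 32
        return False
    return True
```

A value with day and month swapped, such as `20111309`, is rejected because 13 is not a valid month.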
| KEY2 | Sample identification | C7 | |_|_|_|_|_|_|_| |
|---|
KEY2 is a primary key identifying a DNA sample in MORGAM. KEY2 does not include any of the components of KEY1 (which identifies an individual in the MORGAM study). The KEY2 sample identification code is generated in the MDC, and its relation to KEY1 is kept in the MDC. The labels used for transferring the DNA samples from the DNA storage unit in KTL to other MORGAM Genotyping Laboratories include only KEY2 codes.
| MARKER | SNP / polymorphism name (abbreviation) as used locally | C30 | free text |
|---|
Enter here the SNP / polymorphism name, as it is recorded in your local database.
| METHOD | Method of genotyping: 1 = Chip; 2 = MassSpec; 3 = Amplifluor; 4 = Taqman; 5 = Agarose (fragment analysis on agarose); 6 = KASPar; 7 = Fluorescent fragment analysis on ABI Sequencer | C1 | |_| |
|---|
Enter here the code of genotyping method.
| GENOTYPE | Genotype | C256 | free text |
|---|
Enter here alleles in capital letters (A, C, T, G) in alphabetical order, from both chromosomes, separated by slash (/). Deletions are marked as D, unknown genotype as N, for example:
The reason for an unknown genotype is given in item STATUS.
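The coding rule for known alleles (capital letters, alphabetical order, slash separator) can be sketched as follows. The function name `format_genotype` is hypothetical. The sketch assumes single-letter allele codes and that the deletion code D is sorted alphabetically together with the bases; it deliberately does not handle the unknown genotype code N.

```python
VALID_ALLELES = {"A", "C", "G", "T", "D"}  # D marks a deletion


def format_genotype(allele1: str, allele2: str) -> str:
    """Return the two alleles in capitals, alphabetical order, joined by '/'."""
    alleles = [allele1.strip().upper(), allele2.strip().upper()]
    for allele in alleles:
        if allele not in VALID_ALLELES:
            raise ValueError(f"unexpected allele code: {allele!r}")
    return "/".join(sorted(alleles))
```

With these assumptions, `format_genotype("g", "a")` returns `A/G`.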
| STATUS | Genotyping status: 1 = Sample lost; 2 = Not genotyped because no DNA left; 3 = Sample available but not genotyped; 4 = Genotyping unsuccessful; 5 = Successfully genotyped | C1 | |_| |
|---|
Use codes 1...4 to specify the reason for unknown genotype if code N was used for GENOTYPE.
Use code 5 if both alleles were coded using A,C,T,G or D.
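The relation between GENOTYPE and STATUS can be checked row by row. The sketch below is one possible reading of the two rules above, with a hypothetical function name `status_is_consistent`; it assumes single-letter allele codes and treats any genotype value containing N as unknown.

```python
import re

# Both alleles coded with A, C, G, T or D, separated by a slash.
KNOWN_GENOTYPE = re.compile(r"[ACGTD]/[ACGTD]")


def status_is_consistent(genotype: str, status: int) -> bool:
    """Check that STATUS agrees with the way GENOTYPE was coded."""
    if KNOWN_GENOTYPE.fullmatch(genotype):
        # Successfully genotyped rows must carry status 5.
        return status == 5
    if "N" in genotype:
        # Unknown genotype: status must give the reason (codes 1 to 4).
        return status in (1, 2, 3, 4)
    # Anything else does not follow the coding rules assumed here.
    return False
```

For example, `status_is_consistent("A/G", 5)` holds, whereas `status_is_consistent("A/G", 3)` does not.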
| GENODATE | Genotyping analysis run date or reading date (date ANSI) YYYYMMDD (year month day) | C8 | |_|_|_|_||_|_||_|_| |
|---|
Enter here the analysis run date or reading date of this genotype.
| ORIENTATION | Sequence orientation: 'F' = Forward, 'R' = Reverse | C1 | |_| |
|---|
Sequence orientation according to which the items GENOTYPE, 5PRIME and 3PRIME were coded.
| STRAND | Strand: 'T' = Top, 'B' = Bottom | C1 | |_| |
|---|
Strand position.
| COMMENTS | Comments on this data row | C100 | free text |
|---|
Enter here comments (if any) for this data row. Surround the value of COMMENTS with double quotes (") if it contains text with spaces. If there are no comments, leave the value empty but still place the field delimiter (;) in the data row.
The data should be prepared in ASCII delimited format using a semicolon (;) as the delimiter, with the names of the variables in the first row. If the value of a variable contains the delimiter inside its text, the value should be surrounded by double quotes ("). A dot (.) should be used as the decimal point.
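A data file following these rules could be produced along the lines of the sketch below. It is an illustration only: the names `FIELDS`, `quote_if_needed` and `format_rows` are hypothetical, the column order is assumed to follow the item list at the top of this form, and values that themselves contain double quotes are not handled because this document does not say how to code them.

```python
# Column order assumed from the item list of Form 46.
FIELDS = [
    "FORM", "VERSION", "GLAB", "SHIPMENT", "SHIPDATE", "KEY2", "MARKER",
    "METHOD", "GENOTYPE", "STATUS", "GENODATE", "ORIENTATION", "STRAND",
    "COMMENTS",
]


def quote_if_needed(value) -> str:
    """Surround a value with double quotes if it contains ';' or a space."""
    text = str(value)
    if ";" in text or " " in text:
        return f'"{text}"'
    return text


def format_rows(rows) -> str:
    """Return the file content: a header row, then one line per data row."""
    lines = [";".join(FIELDS)]
    for row in rows:
        # Missing items (for example no COMMENTS) become empty fields.
        lines.append(";".join(quote_if_needed(row.get(name, "")) for name in FIELDS))
    return "\n".join(lines) + "\n"
```

A row without comments ends with a bare delimiter, because the empty COMMENTS field still follows the last semicolon.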
To simplify the tracing of data transfers between the MPC and the MDC, the files should be named as follows:
F46_GLAB_YYYYMMDD_N.CSV, where:
To avoid possible errors in variable naming, use the first row from the example below, which contains the variable names. Example of a Form 46 ASCII delimited data file:
FORM;VERSION;GLAB;SHIPMENT;SHIPDATE;KEY2;MARKER;METHOD;GENOTYPE;STATUS;GENODATE;ORIENTATION;STRAND;COMMENTS
46;2;911;3;20110913;1234567;ITGA2_123;2;A/G;5;20110215;F;B;
46;2;911;3;20110913;1234567;F13_18;2;C/C;5;20110215;R;T;
The data files should be archived in ZIP format before they are sent to the MDC. The archive names should be the same as the file names. The MORGAM data error checking program, which will be available later, will be used to prepare the data. The data should be sent to the MDC (ari.haukijarvi@thl.fi) as an attachment to an e-mail message. Please make sure to save a copy of the transferred files in the sending MPC or laboratory.
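The archiving step can be done with any ZIP tool. As one possibility, the sketch below uses Python's standard `zipfile` module to create an archive with the same base name as the data file. The function name `archive_data_file` and the upper-case `.ZIP` extension are assumptions made for this illustration.

```python
import zipfile
from pathlib import Path


def archive_data_file(csv_path) -> Path:
    """Create a ZIP archive next to the data file, using the same base name."""
    csv_path = Path(csv_path)
    zip_path = csv_path.with_suffix(".ZIP")  # extension case is an assumption
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file under its own name, without directory components.
        archive.write(csv_path, arcname=csv_path.name)
    return zip_path
```

The original data file is left in place, which also covers the request to keep a copy of the transferred file at the sending site.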
When the data are received in the MDC, an acknowledgement will be sent to the sender. The data will then be checked for consistency and quality, and after any problems have been resolved, they will be entered into the MORGAM database.