Data transfer format - SNP genotypic data transfer to MDC

  • Form: 46
  • Version: 2
  • Date: 2011-01-14

© National Institute for Health and Welfare and the MORGAM Project investigators
Last updated: 2011-01-14
For more information, please contact Päivi Laiho (firstname.lastname@thl.fi), Ari Haukijärvi (firstname.lastname@thl.fi) or Kari Kuulasmaa (firstname.lastname@thl.fi)

The purpose of this transfer format is to provide an exact, common format in which the MORGAM Genotyping Laboratories (GLAB) transfer genotypic data to the MORGAM Data Centre (MDC) at the National Institute for Health and Welfare (THL). The data should be sent by e-mail.

Format specification

NAME SPECIFICATION AND CODES FORMAT or Value
FORM Form number I2 |4|6|
VERSION Version of this form I1 |2|
GLAB Genotyping Laboratory code (GLAB), selected from the list of genotyping laboratories I3 |_|_|_|

SHIPMENT Shipment number I6 |_|_|_|_|_|_|
SHIPDATE Date of this shipment (date ANSI)
YYYYMMDD (year month day)
C8 |_|_|_|_||_|_||_|_|
KEY2 Sample identification I7 |_|_|_|_|_|_|_|
MARKER SNP / polymorphism name (abbreviation) as used locally C30 free text
METHOD Method of genotyping I1 |_|
GENOTYPE Genotype C256 free text
STATUS Genotyping status I1 |_|
GENODATE Genotyping analysis run date or reading date (date ANSI)
YYYYMMDD (year month day)
C8 |_|_|_|_||_|_||_|_|
ORIENTATION Sequence orientation
'F' = Forward
'R' = Reverse
C1 |_|
STRAND Strand
'T' = Top
'B' = Bottom
C1 |_|
COMMENTS Comments on this genotypic data row (if any) C100 free text


General Instructions

The item NAME on the document is a computer variable name used for the item by the MDC.

The item FORMAT specifies the format in which the value should be presented in the transfer data set:

FORMAT Type Format Example Comments
C Character C8 20090205 Used for dates; leading zeroes in the month and day are mandatory.
C20 "B 17 E" Location of sample in box; surround with double quotes (") if value contains spaces. Maximum of 20 characters allowed for this variable, including quotes.
C3 -80 Storage temperature, includes leading '-' sign.
F Float F5.2 13.1
2.18
Variable must include a decimal point (.). Leading and trailing zeroes are not mandatory.
I Integer I3 1
15
180
Leading zeroes are not mandatory.
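
The FORMAT codes above can be checked mechanically before a file is sent. The following is a minimal sketch only, not part of the MDC tooling. The helper name matches_format is hypothetical, and the sketch assumes that the number after the type letter is the maximum total width in characters, that the number after the dot in an F code is the maximum number of decimals, and that I and F values are unsigned.

```python
import re

# Hypothetical helper, not an MDC tool: parses FORMAT codes such as C8, I3 or F5.2.
_FORMAT_RE = re.compile(r"^([CFI])(\d+)(?:\.(\d+))?$")


def matches_format(value: str, fmt: str) -> bool:
    """Return True if value fits the FORMAT code fmt (C, I or F)."""
    m = _FORMAT_RE.match(fmt)
    if m is None:
        raise ValueError(f"unrecognised FORMAT code: {fmt!r}")
    kind, width = m.group(1), int(m.group(2))
    decimals = int(m.group(3)) if m.group(3) else 0
    # Assumption: the width is the maximum total number of characters.
    if len(value) > width:
        return False
    if kind == "C":
        return True
    if kind == "I":
        # Assumption: integers are unsigned; leading zeroes are optional.
        return re.fullmatch(r"\d+", value) is not None
    # kind == "F": the value must contain a decimal point (.).
    m_float = re.fullmatch(r"\d*\.(\d*)", value)
    if m_float is None or value == ".":
        return False
    return len(m_float.group(1)) <= decimals
```

For example, matches_format("13.1", "F5.2") accepts the value, while matches_format("13", "F5.2") rejects it because the decimal point is missing.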

Instructions for making corrections to data that have already been sent to the MDC are given in Section Data communication between the Participating Centres and the MDC.

Please contact the MDC for instructions if you cannot provide information as specified in this document or if you have any problems interpreting the coding of any specific item.


Specific instructions for each item

Follow these instructions carefully when creating a computer file for the data transfer from the GLAB to the MDC.

FORM Form number C2

|4|6|

Number 46 indicates the "Data transfer format: SNP data transfer to MDC".

VERSION Version of this form C1

|2|

This is the current version of this data transfer format. If the version number of the data transfer format you are using is not "2", these instructions do not correspond to the format you are using. The version number changes when major changes are introduced in the form, for example when variables are added or deleted.

GLAB Genotyping Laboratory code C3

|_|_|_|

Enter the code of your MORGAM Genotyping Laboratory from the list of MORGAM Genotyping Laboratories.

If the genotypic data are sent from an MPC instead of a MORGAM laboratory, enter here the MPC code (e.g. 026 for Augsburg).

SHIPMENT Shipment number I6 |_|_|_|_|_|_|

This is the number of the shipment from the genotyping laboratory to the MDC. Usually it is a sequential number of shipments to the MDC. If you do not use sequential numbering, enter the number of the shipment to the MDC on that particular day.

SHIPDATE Date of this shipment (date ANSI)
YYYYMMDD (year month day)
C8 |_|_|_|_||_|_||_|_|

The date of the shipment (the date when you completed this data form). The format is YYYYMMDD (date ANSI, 8 characters, year month day).
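
A date item such as SHIPDATE or GENODATE can be checked with a few lines of Python before the file is sent. This is a sketch under the assumption that the value must be exactly 8 digits and must name a real calendar date; the function name is_valid_ansi_date is hypothetical.

```python
from datetime import datetime


def is_valid_ansi_date(value: str) -> bool:
    """Return True if value is an 8-digit YYYYMMDD string naming a real date."""
    # Exactly 8 ASCII digits, so that months and days keep their leading zeroes.
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        # Raised for impossible dates, for example month 13 or 30 February.
        return False
    return True
```

A value written in year day month order, such as 20111309 for 13 September 2011, is rejected because 13 is not a valid month.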

KEY2 Sample identification C7

|_|_|_|_|_|_|_|

KEY2 is a primary key that identifies a DNA sample in MORGAM. KEY2 does not include any of the components of KEY1 (which identifies an individual in the MORGAM study). The KEY2 sample identification code is generated in the MDC, and its relation with KEY1 is kept in the MDC. The labels used for transferring the DNA samples from the DNA storage unit in KTL to other MORGAM Genotyping Laboratories include only KEY2 codes.

MARKER SNP / polymorphism name (abbreviation) as used locally C30 free text

Enter here the SNP / polymorphism name, as it is recorded in your local database.

METHOD Method of genotyping:
1 = Chip
2 = MassSpec
3 = Amplifuor
4 = Taqman
5 = Agarose (fragment analysis on agarose)
6 = KASPar
7 = Fluorescent fragment analysis on ABI Sequencer
C1 |_|

Enter here the code of genotyping method.

GENOTYPE Genotype C256 free text

Enter here the alleles from both chromosomes in capital letters (A, C, T, G), in alphabetical order and separated by a slash (/), for example A/G or C/C. Deletions are marked as D and an unknown genotype as N.

The reason for an unknown genotype is given in item STATUS.
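
The coding rule for called alleles can be sketched as follows. The function name format_genotype is hypothetical, and the sketch assumes single-letter alleles and plain alphabetical ordering, so that D sorts between C and G. It does not cover the unknown genotype N, because this document does not state whether N is entered once or once per chromosome.

```python
# Allele codes for a called genotype, as listed under GENOTYPE.
VALID_ALLELES = set("ACGTD")


def format_genotype(allele1: str, allele2: str) -> str:
    """Return both alleles in capitals, in alphabetical order, joined by '/'."""
    alleles = [allele1.strip().upper(), allele2.strip().upper()]
    for allele in alleles:
        if allele not in VALID_ALLELES:
            raise ValueError(f"unexpected allele code: {allele!r}")
    # Assumption: plain alphabetical order (A, C, D, G, T).
    return "/".join(sorted(alleles))
```

With this sketch, format_genotype("g", "a") gives A/G.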

STATUS Genotyping status:
1=Sample lost
2=Not genotyped because no DNA left
3=Sample available but not genotyped
4=Genotyping unsuccessful
5=Successfully genotyped
C1 |_|

Use codes 1...4 to specify the reason for unknown genotype if code N was used for GENOTYPE.

Use code 5 if both alleles were coded using A,C,T,G or D.
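
The two rules above lend themselves to a simple consistency check between GENOTYPE and STATUS. The sketch below is an illustration only: the function name status_matches_genotype is hypothetical, and it assumes that an unknown genotype appears as N, either alone or as one of the two slash-separated alleles.

```python
CALLED_ALLELES = set("ACGTD")
UNKNOWN_STATUS_CODES = {"1", "2", "3", "4"}


def status_matches_genotype(genotype: str, status: str) -> bool:
    """Return True if the STATUS code agrees with the GENOTYPE value."""
    alleles = genotype.split("/")
    if "N" in alleles:
        # Unknown genotype: STATUS must give the reason (codes 1 to 4).
        return status in UNKNOWN_STATUS_CODES
    # Called genotype: exactly two alleles, each coded as A, C, G, T or D.
    called = len(alleles) == 2 and all(a in CALLED_ALLELES for a in alleles)
    return called and status == "5"
```

A row with GENOTYPE A/G and STATUS 4 would be flagged as inconsistent, as would a row with GENOTYPE N and STATUS 5.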

GENODATE Genotyping analysis run date or reading date (date ANSI)
YYYYMMDD (year month day)
C8 |_|_|_|_||_|_||_|_|

Enter here the analysis run date or reading date of this genotype.

ORIENTATION Sequence orientation
'F' = Forward
'R' = Reverse
C1 |_|

Sequence orientation according to which the items GENOTYPE, 5PRIME and 3PRIME were coded.

STRAND Strand
'T' = Top
'B' = Bottom
C1 |_|

Strand position.

COMMENTS Comments on this data row C100

free text

Enter here comments (if any) for this data row. Surround the value of COMMENTS with double quotes (") if it contains text with spaces. If there are no comments, the field delimiter (;) must still be placed in the data row.


Data transfer Instructions

The data should be prepared in ASCII delimited format using the semicolon (;) as the delimiter, with the names of the variables in the first row. If the value of a variable contains the delimiter inside its text, the value should be surrounded by double quotes ("). A dot (.) should be used as the decimal point.

To simplify the tracing of data transfers between the MPC and the MDC, the files should be named as follows:

F46_GLAB_YYYYMMDD_N.CSV, where:

To avoid possible errors in variable naming use the first row from the example below containing variable names. Example of Form 46 ASCII comma-delimited data file:

FORM;VERSION;GLAB;SHIPMENT;SHIPDATE;KEY2;MARKER;METHOD;GENOTYPE;STATUS;GENODATE;ORIENTATION;STRAND;COMMENTS
46;2;911;3;20110913;1234567;ITGA2_123;2;A/G;5;20110215;F;B;
46;2;911;3;20110913;1234567;F13_18;2;C/C;5;20110215;R;T;
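
Data rows in this layout can be produced and read back with a short Python script. The sketch below is not the MORGAM error checking program. The names FIELDS, quote_field, format_row and parse_rows are hypothetical, the column order follows the Format specification (ORIENTATION before STRAND, as in the data rows above), and doubling a double quote that occurs inside a quoted value is an assumption borrowed from common CSV practice.

```python
import csv
import io

# Variable names in the order of the Format specification.
FIELDS = [
    "FORM", "VERSION", "GLAB", "SHIPMENT", "SHIPDATE", "KEY2", "MARKER",
    "METHOD", "GENOTYPE", "STATUS", "GENODATE", "ORIENTATION", "STRAND",
    "COMMENTS",
]


def quote_field(value: str) -> str:
    """Quote a value that contains a space, the delimiter or a double quote."""
    if any(ch in value for ch in ' ;"'):
        # Assumption: an embedded double quote is doubled inside the quotes.
        return '"' + value.replace('"', '""') + '"'
    return value


def format_row(record: dict) -> str:
    """Build one data row; a missing item becomes an empty field."""
    return ";".join(quote_field(str(record.get(name, ""))) for name in FIELDS)


def parse_rows(text: str) -> list:
    """Read file contents back into dicts keyed by the names in the first row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"'))
```

A complete file is the header row ";".join(FIELDS) followed by one format_row(...) line per genotype. An empty COMMENTS value leaves the row ending in the delimiter, as in the example rows above.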

The data files should be archived (ZIP format) before they are sent to the MDC. Each archive should have the same name as the data file it contains. The MORGAM data error checking program, which will be made available later, is to be used to prepare the data. The data should be sent to the MDC (ari.haukijarvi@thl.fi) as an attachment to an e-mail message. Please make sure to keep a copy of the transferred files in the sending MPC or laboratory.
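
The archiving step can be done with the Python standard library. The sketch below assumes that "the same name" means the same base name with a .zip extension in place of .CSV; the function name archive_data_file is hypothetical.

```python
import zipfile
from pathlib import Path


def archive_data_file(csv_path) -> Path:
    """Create a ZIP archive beside csv_path with the same base name."""
    csv_path = Path(csv_path)
    # Assumption: the archive keeps the base name and takes the .zip extension.
    zip_path = csv_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file under its bare name, without directory components.
        archive.write(csv_path, arcname=csv_path.name)
    return zip_path
```

The original data file is left in place, which also helps with keeping a copy of the transferred file at the sending site.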

When the data are received in the MDC, an acknowledgement will be sent to the sender. The data will then be checked for consistency and quality, and after any problems have been resolved, they will be entered into the MORGAM database.


Updates

Date Version Update
2009-05-20 1 The list of genotyping laboratory codes was updated.
2011-01-14 2 Added:
ORIENTATION - Sequence orientation ('F' = Forward, 'R' = Reverse)
STRAND - Strand position ('T'=Top, 'B'=Bottom)