
Data transfer format - MORGAM polymorphisms

  • Form: 51
  • Version: 6
  • Date: 2011-01-14

© National Institute for Health and Welfare and the MORGAM Project investigators
Last updated: 2011-01-14
For more information, please contact Päivi Laiho (firstname.lastname@thl.fi), Ari Haukijärvi (ari.haukijarvi@thl.fi) or Kari Kuulasmaa (kari.kuulasmaa@thl.fi)

The purpose of this transfer format is to provide an exact and common format for the MORGAM Genotyping Laboratories (GLAB) to transfer data on MORGAM polymorphisms to the MORGAM Data Centre (MDC) at the National Institute for Health and Welfare (THL). The data should be sent by e-mail.


Format specification

NAME              SPECIFICATION AND CODES                                       FORMAT or Value
FORM              Form number                                                   I2 |5|1|
VERSION           Version of this form                                          I1 |6|
GLAB              Genotyping Laboratory code (GLAB), selected from the list of  I3 |_|_|_|
                  genotyping laboratories
SHIPMENT          Shipment number                                               I5 |_|_|_|_|_|
SHIPDATE          Date of this shipment (date ANSI), YYYYMMDD (year month day)  C8 |_|_|_|_||_|_||_|_|
MARKER            SNP / polymorphism name (abbreviation) as used locally        C30 free text
GENE_NAME         Gene name as defined in NCBI Entrez Gene database             C20 free text
GENOTYPE          Genotype coding                                               C10 free text
DELINS            Deletion / insertion coding                                   C10 free text
5PRIME            5' flanking sequence                                          C30 free text
3PRIME            3' flanking sequence                                          C30 free text
RS                ID of this polymorphism in RS database                        C10 free text
ORIGINAL_PROJECT  The name of the original project of the SNP                   C50 free text
COMMENT           Comments by the laboratory regarding the SNP (if any)         C100 free text


General Instructions

The item NAME on the document is a computer variable name used for the item by the MDC.

The item FORMAT specifies the format in which the value should be presented in the transfer data set:

FORMAT  Type       Format  Example    Comments
C       Character  C8      20090205   Used for dates, trailing zeroes are mandatory.
                   C20     "B 17 E"   Location of sample in box; surround with double quotes (") if value contains spaces. Maximum of 20 characters allowed for this variable, including quotes.
                   C3      -80        Storage temperature, includes leading '-' sign.
F       Float      F5.2    13.1       Variable must include a decimal point (.). Leading and trailing zeroes are not mandatory.
                           2.18
I       Integer    I3      1          Leading zeroes are not mandatory.
                           15
                           180
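
As an illustration only, the FORMAT codes above could be checked with a small helper before a file is sent. This is a minimal sketch, not an MDC tool: the function name matches_format is hypothetical, and it assumes that the number after the letter is the maximum total length of the value, that the number after the dot in an F format is the maximum number of decimals, and that an optional leading minus sign is acceptable for I and F values.

```python
import re


def matches_format(code: str, value: str) -> bool:
    """Return True if value fits a format code such as C8, I3 or F5.2.

    Assumption: the width is a maximum total length, not a fixed length.
    """
    m = re.fullmatch(r"([CIF])(\d+)(?:\.(\d+))?", code)
    if m is None:
        raise ValueError(f"unknown format code: {code}")
    kind, width = m.group(1), int(m.group(2))
    if len(value) > width:
        return False
    if kind == "C":
        # Character values are free text up to the maximum length.
        return True
    if kind == "I":
        # Integer: digits only, leading zeroes optional (assumed optional sign).
        return re.fullmatch(r"-?\d+", value) is not None
    # Float: a decimal point is required; limit the number of decimals.
    decimals = int(m.group(3) or 0)
    fm = re.fullmatch(r"-?\d*\.(\d*)", value)
    if fm is None or value in (".", "-."):
        return False
    return len(fm.group(1)) <= decimals
```

With these assumptions, matches_format("F5.2", "13.1") is accepted, while matches_format("F5.2", "13") is rejected because the decimal point is missing.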

Instructions for making corrections to data that have already been sent to the MDC are given in Section Data communication between the Participating Centres and the MDC.

Please contact the MDC for instructions if you cannot provide information as specified in this document or if you have any problems with the interpretation of the coding for any specific items.


Specific instructions for each item

Follow these instructions carefully when creating a computer file for the data transfer from the GLAB to the MDC.

FORM Form number I2

|5|1|

Number 51 indicates the "Data transfer format: MORGAM polymorphisms".

VERSION Version of this form I1

|6|

Current version of this data transfer format. If the version number of the data transfer format which you are using is not "6", these instructions do not correspond to the format you are using. The version number changes when major changes are introduced in the form, for example when variables are added or deleted.

GLAB Genotyping Laboratory code C3

|_|_|_|

Enter appropriate code of your MORGAM Genotyping Laboratory from the list of MORGAM Genotyping Laboratories:

SHIPMENT Shipment number I5 |_|_|_|_|_|

This is the shipment number from the genotyping laboratory to the MDC. Usually it is a sequential number of shipments to the MDC. If you do not use it, enter the number of the shipment to the MDC on that particular day.

SHIPDATE Date of this shipment (date ANSI)
YYYYMMDD (year month day)
C8 |_|_|_|_||_|_||_|_|

The date of the shipment (the date when you completed this data form). The format is YYYYMMDD (date ANSI, 8 characters: year, month, day).
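
A SHIPDATE value can be checked mechanically before sending. The following is a minimal sketch, assuming the YYYYMMDD (year, month, day) layout given in the item heading; the function name is_valid_shipdate is hypothetical and is not part of any MORGAM software.

```python
from datetime import datetime


def is_valid_shipdate(value: str) -> bool:
    """Return True if value is an 8-digit YYYYMMDD string naming a real date."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        # strptime rejects impossible dates such as month 13 or 30 February.
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True
```

For example, "20090127" passes, while "20091301" fails because 13 is not a valid month.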

MARKER SNP / polymorphism name (abbreviation) as used locally C30 free text

Enter the SNP / polymorphism name (abbreviation) as used in local database.

GENE_NAME Gene name as defined in NCBI Entrez Gene database C20 free text

Enter the gene name (abbreviation) as defined in NCBI Entrez Gene database. The Form 50 "Data transfer format - MORGAM genes" should be sent to MDC containing data on this gene.

GENOTYPE Genotype coding C10 free text

Enter here alleles in capital letters: A, C, T, G, from both chromosomes separated by slash (/). Deletions are marked as D, not genotyped data as N; for example:

DELINS Deletion / insertion coding C10 free text

Enter here allele codes for deletion and insertion separated by slash (/), for example:

5PRIME 5' flanking sequence C30 free text

Enter here the 5' flanking sequence, 30 characters.

3PRIME 3' flanking sequence C30 free text

Enter here the 3' flanking sequence, up to 30 characters (format C30).

RS ID of this polymorphism in RS database C10 free text

Enter here the ID of this polymorphism in the RS database (if exists), for example: rs153311

ORIGINAL_PROJECT The name of the original project of the SNP C50 free text

The name of the original project in which this polymorphism was first used. See the available project codes at ???. If you have any uncertainty about the project, please contact the MCL for instructions.

COMMENT Comments by the laboratory regarding the SNP (if any) C100 free text

Enter here comments by the laboratory regarding the SNP (if any), for example if the SNP is a proxy for another SNP.


Data transfer Instructions

The data should be prepared in ASCII comma-delimited (CSV) format using a semicolon (;) as the delimiter, with the names of the variables in the first row. If the value of a variable contains the delimiter, the value should be surrounded by double quotes ("). A dot (.) should be used as the decimal point.
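
The rules in the previous paragraph map closely onto Python's standard csv module. The following is a minimal sketch under stated assumptions, not an official export tool: the function name write_form51, the example_row values (in particular the comment text) and the "\n" line terminator are illustrative choices, and the variable list is the version 6 list from the format specification.

```python
import csv
import io

# Variable names of Form 51, version 6, in specification order.
FIELDS = [
    "FORM", "VERSION", "GLAB", "SHIPMENT", "SHIPDATE", "MARKER",
    "GENE_NAME", "GENOTYPE", "DELINS", "5PRIME", "3PRIME", "RS",
    "ORIGINAL_PROJECT", "COMMENT",
]


def write_form51(rows, out):
    """Write rows (dicts keyed by variable name) as semicolon-delimited text."""
    writer = csv.DictWriter(
        out,
        fieldnames=FIELDS,
        delimiter=";",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only values containing ; or "
        lineterminator="\n",        # assumption: plain newline
    )
    writer.writeheader()            # variable names in the first row
    for row in rows:
        writer.writerow(row)        # missing variables are written as empty


# Hypothetical row; the COMMENT contains the delimiter and gets quoted.
example_row = {
    "FORM": 51, "VERSION": 6, "GLAB": 911, "SHIPMENT": 1,
    "SHIPDATE": "20090127", "MARKER": "ICAM1_36", "GENE_NAME": "ICAM1",
    "GENOTYPE": "A/G", "5PRIME": "gagcactcaaggggaggtcacccgc",
    "3PRIME": "aggtgaccgtgaatgtgctctgtga", "RS": "rs5030382",
    "ORIGINAL_PROJECT": "MORGAM", "COMMENT": "proxy; see notes",
}

buffer = io.StringIO()
write_form51([example_row], buffer)
```

QUOTE_MINIMAL is used here because it quotes a value only when the value contains the delimiter or a quote character, which matches the quoting rule stated above.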

To simplify the tracing of data transfers between the MPC and the MDC, the files should be named as follows:

F51_GLAB_YYYYMMDD_N.CSV, where:

To avoid possible errors in variable naming, use the first row of the example below, which contains the variable names. Example of a Form 51 ASCII comma-delimited data file:

FORM;VERSION;GLAB;SHIPMENT;SHIPDATE;MARKER;GENE_NAME;GENOTYPE;DELINS;5PRIME;3PRIME;RS;ORIGINAL_PROJECT;COMMENT
51;6;911;1;20090127;ICAM1_36;ICAM1;A/G;;gagcactcaaggggaggtcacccgc;aggtgaccgtgaatgtgctctgtga;rs5030382;MORGAM;
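
Before sending, a file can be read back and compared with the format specification. The sketch below is illustrative only and assumes the version 6 variable list (without the ORIENTATION and STRAND columns removed in that version); the function name check_form51 and the wording of the messages are hypothetical, and only the header and the maximum lengths of the character variables are checked.

```python
import csv
import io

# Variable names of Form 51, version 6, in specification order.
EXPECTED = [
    "FORM", "VERSION", "GLAB", "SHIPMENT", "SHIPDATE", "MARKER",
    "GENE_NAME", "GENOTYPE", "DELINS", "5PRIME", "3PRIME", "RS",
    "ORIGINAL_PROJECT", "COMMENT",
]

# Maximum lengths taken from the C formats in the format specification.
MAX_LEN = {
    "MARKER": 30, "GENE_NAME": 20, "GENOTYPE": 10, "DELINS": 10,
    "5PRIME": 30, "3PRIME": 30, "RS": 10, "ORIGINAL_PROJECT": 50,
    "COMMENT": 100,
}


def check_form51(text: str) -> list:
    """Return a list of problem descriptions; an empty list means none found."""
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar='"')
    header = next(reader, None)
    if header != EXPECTED:
        return ["header does not match the expected variable names"]
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(EXPECTED):
            problems.append(
                f"line {line_no}: expected {len(EXPECTED)} values, found {len(row)}"
            )
            continue
        record = dict(zip(EXPECTED, row))
        for name, limit in MAX_LEN.items():
            if len(record[name]) > limit:
                problems.append(f"line {line_no}: {name} exceeds {limit} characters")
    return problems
```

A file whose first row still contains a removed column such as ORIENTATION is reported as a header mismatch rather than being checked row by row.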

The data files should be archived (ZIP format) before they are sent to the MDC. The archive names should be the same as the file names. The MORGAM data error checking program will be used to prepare the data (it will be available later). The data should be sent to the MDC (ari.haukijarvi@thl.fi) as an attachment to an e-mail message. Please make sure to save a copy of the transferred files in the sending MPC or laboratory.
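
The archiving step can be done with any ZIP tool. As one possibility, the sketch below uses Python's standard zipfile module; the function name archive_data_file is hypothetical, and it assumes that "the same as the file names" means the same base name with a .ZIP extension in place of .CSV.

```python
import zipfile
from pathlib import Path


def archive_data_file(csv_path) -> Path:
    """Create a ZIP archive next to csv_path with the same base name."""
    csv_path = Path(csv_path)
    # Assumption: the archive keeps the base name and only changes the extension.
    zip_path = csv_path.with_suffix(".ZIP")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file without its directory path inside the archive.
        archive.write(csv_path, arcname=csv_path.name)
    return zip_path
```

The original data file is left in place, so a copy remains with the sender as requested above.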

When the data are received in the MDC, an acknowledgement will be sent to the sender. The data will then be checked for consistency and quality, and after any problems have been resolved, they will be entered into the MORGAM database.

 

Updates

Date        Version  Update
2009-12-11  4        Added ORIENTATION - Sequence orientation ('F' = Forward, 'R' = Reverse)
                     Removed CELERA - ID of this polymorphism in Celera database
2010-11-12  5        Added STRAND - Strand position ('T'=Top, 'B'=Bottom)
2011-01-14  6        Removed STRAND and ORIENTATION