database core

dbcore@genomeutwin.org

The database core unit works to standardize the data format for each contributing unit and to create a quality-directed data environment. Encryption and other data safety issues are of utmost importance for the GenomEUtwin project.
The database core has decided:
1. To have a standard format platform for our work.

2. To have a standard EU twin number (EUIDNUM).

3. To encrypt everything we send in communication.

4. To have a pilot data model in Stockholm.

5. To have a web tracking system.

EUIDNUM

EUIDNUM consists of four parts:
country code: 3 digits
randomized number: 7 digits
identification number: 1 digit
checksum: 1 digit

E.g.
246123456714 (country code 246, randomized number 1234567, identification number 1, checksum 4)
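
The four-part layout can be illustrated with a short Python sketch. It is not part of the GenomEUtwin Data Transfer Java application; the names EuIdNum and split_euidnum are hypothetical, and the sketch only checks the length and digit layout, not the checksum.

```python
import re
from typing import NamedTuple


class EuIdNum(NamedTuple):
    """The four parts of an EUIDNUM, kept as strings to preserve leading zeros."""
    country_code: str           # 3 digits
    randomized_number: str      # 7 digits
    identification_number: str  # 1 digit
    check_digit: str            # 1 digit (not verified in this sketch)


# 3 + 7 + 1 + 1 = 12 ASCII digits in total.
_EUIDNUM_PATTERN = re.compile(r"([0-9]{3})([0-9]{7})([0-9])([0-9])")


def split_euidnum(value: str) -> EuIdNum:
    """Split a 12-digit EUIDNUM string into its four parts."""
    match = _EUIDNUM_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a 12-digit EUIDNUM: {value!r}")
    return EuIdNum(*match.groups())
```

Calling split_euidnum("246123456714") splits the example above into 246, 1234567, 1 and 4. The parts are kept as strings so that a randomized number with leading zeros keeps its 7 digits.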

Country Code
The country codes, which indicate the origin of the data, follow the ISO 3166 standard.
Randomized number
The database will contain both twins and non-twins. The members of a twin pair share the same randomized number, while a non-twin receives a randomized number of their own. A non-twin's randomized number therefore occurs only once, whereas a shared randomized number occurs twice for twins, three times for triplets and four times for quadruplets. Because the country code is part of the EUIDNUM, each country can administer its own randomized numbers. How this is done is up to each country, as long as the method yields a unique number for each individual and the randomized number consists of 7 digits. The randomized number does not need to be generated at random; it can, for example, be derived from a locally used twin pair number.
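
As an illustration of the sharing rule, the following Python sketch groups EUIDNUMs by country code and randomized number, so that members of the same twin pair, triplet or quadruplet set end up together. The function name group_by_family is hypothetical, and the final digits of the made-up sample values used with it below are placeholders, not computed check digits.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_family(euidnums: Iterable[str]) -> Dict[Tuple[str, str], List[str]]:
    """Group 12-digit EUIDNUM strings by (country code, randomized number)."""
    families = defaultdict(list)
    for euidnum in euidnums:
        # Only the layout is checked here; the checksum is not verified.
        if len(euidnum) != 12 or not (euidnum.isascii() and euidnum.isdigit()):
            raise ValueError(f"not a 12-digit EUIDNUM: {euidnum!r}")
        country_code = euidnum[:3]          # digits 1-3
        randomized_number = euidnum[3:10]   # digits 4-10
        families[(country_code, randomized_number)].append(euidnum)
    return dict(families)
```

For the sample values "246123456714", "246123456720" and "246765432100", the first two fall into the group ("246", "1234567") and the third forms a group of its own. A group of size one corresponds to a non-twin, and sizes two, three and four to twins, triplets and quadruplets.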
Identification number
The EUIDNUM needs an indicator of whether or not the individual is a twin. To carry this information and still keep the numbers unique, the last digit of the EUIDNUM before the checksum should be one of the following:
1 - Twin 1
2 - Twin 2
3 - Triplet
4 - Quadruplet
0 - Non twins
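
The table above can be expressed as a small lookup. This is a minimal Python sketch with hypothetical names (ROLE_BY_DIGIT, describe_role); it reads the eleventh digit of a 12-digit EUIDNUM and does not verify the checksum.

```python
# Identification digit -> meaning, copied from the table above.
ROLE_BY_DIGIT = {
    "1": "Twin 1",
    "2": "Twin 2",
    "3": "Triplet",
    "4": "Quadruplet",
    "0": "Non twins",
}


def describe_role(euidnum: str) -> str:
    """Return the meaning of the identification digit of a 12-digit EUIDNUM."""
    if len(euidnum) != 12 or not (euidnum.isascii() and euidnum.isdigit()):
        raise ValueError(f"not a 12-digit EUIDNUM: {euidnum!r}")
    identification_digit = euidnum[10]  # eleventh digit, just before the checksum
    if identification_digit not in ROLE_BY_DIGIT:
        raise ValueError(f"unknown identification digit: {identification_digit!r}")
    return ROLE_BY_DIGIT[identification_digit]
```

For the example number 246123456714, describe_role returns "Twin 1". Digits 5 to 9 are rejected because the table does not define them.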
Checksum
The checksum is calculated with the Gumm algorithm (H. Peter Gumm: A new class of check digit methods for arbitrary number systems, IEEE Transactions on Information Theory, 31 (1985), 102-105). The GenomEUtwin Data Transfer Java application can be used to calculate Gumm check digits.

data processing

Data processing progress for genotype and phenotype data.

PHENOTYPE DATA

Data Format and Variable Standard for GenomEUtwin's Phenotype Database version 4.0 (updated 9.12.2004)

Appendix A - Migraine Phenotype version 2.0 (updated 13.5.2005)

GenomEUtwin Data Transfer Java application

GENOTYPE DATA

The genotype database (GtDB) is being developed as a collaborative open source project.
GtDB Project (SourceForge.net)

GtDB Data Formats version 1.6

gtdata@genomeutwin.org

WORKING GROUPS

WG1: Phenotype Data
WG2: Data Modeling
WG3: Data Integration
WG4: Output and Analysis

LINKS

ISO Country Codes
IBM Discovery Link
SWE Twin Forum username: "eu_guest" password: "forum815"
PGP (Pretty Good Privacy): download the latest freeware version

Last Updated on 13 May, 2005