database core
dbcore@genomeutwin.org
The database core unit works to standardize the data format for each contributing unit and to create a quality-directed data environment. Encryption and other data safety issues are of utmost importance for the GenomEUtwin project.
The database core has decided on:
1. A standard format platform for our work.
2. A standard EU twin number.
3. Encryption of everything we send in communication.
4. A pilot data model in Stockholm.
5. A web tracking system.
EUIDNUM
EUIDNUM consists of four parts:
| country code | 3 digits |
| randomized number | 7 digits |
| identification number | 1 digit |
| checksum | 1 digit |
E.g.
246123456714
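The four-part layout can be illustrated with a short Python sketch that splits an EUIDNUM string into its fields. The names `EuIdNum` and `parse_euidnum` are hypothetical and not part of any GenomEUtwin tool. The sketch only checks length and digit content; it does not verify the check digit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EuIdNum:
    country_code: str        # 3 digits
    randomized_number: str   # 7 digits
    identification: int      # 1 digit
    check_digit: int         # 1 digit


def parse_euidnum(value: str) -> EuIdNum:
    """Split a 12-digit EUIDNUM string into its four parts."""
    if len(value) != 12 or not (value.isascii() and value.isdigit()):
        raise ValueError("EUIDNUM must be exactly 12 digits")
    return EuIdNum(
        country_code=value[0:3],
        randomized_number=value[3:10],
        identification=int(value[10]),
        check_digit=int(value[11]),
    )


print(parse_euidnum("246123456714"))
```

Keeping the country code and randomized number as strings preserves any leading zeros.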
Country Code
The country codes, which indicate the origin of the data, follow the
ISO 3166 standard (three-digit numeric codes).
Randomized number
The database will contain twins and non-twins. Each twin pair will
share the same randomized number, while a non-twin will receive a
randomized number of their own. A non-twin's randomized number will
occur only once, but the randomized number for multiples will occur
twice for twins, three times for triplets and four times for
quadruplets. The country code is part of the EUidnumber, and this
allows each country to administer its own randomized numbers. How this
is done is up to each country, as long as the result is a unique number
for each individual and consists of 7 digits. The randomized number
does not need to be generated at random; it can, for example, be
derived from a locally used twin pair number.
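As a sketch of the sharing rule described above, the following Python snippet groups EUIDNUMs by country code and randomized number and counts how often each randomized number occurs. The function name `group_sizes` and the sample IDs are hypothetical; the last digit of each sample ID is a placeholder and was not computed with the GUMM algorithm.

```python
from collections import Counter


def group_sizes(euidnums):
    """Count occurrences of each (country code, randomized number) pair."""
    return Counter((value[0:3], value[3:10]) for value in euidnums)


# Made-up sample IDs; the last digit is a placeholder, not a real check digit.
sample = [
    "246123456710",  # twin 1 of a pair
    "246123456720",  # twin 2 of the same pair
    "246765432100",  # non-twin
]

for (country, number), count in group_sizes(sample).items():
    print(country, number, count)
```

A count of 1 is expected for a non-twin, 2 for a twin pair, 3 for triplets and 4 for quadruplets.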
Identification number
The EUidnumber needs an indicator of whether or not the individual is a
twin. To provide this information and still keep the numbers unique, the
EUidnumber should end (before the checksum) with one of the following digits:
1 - Twin 1
2 - Twin 2
3 - Triplet
4 - Quadruplet
0 - Non-twin
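This coding maps directly to a lookup table. A minimal Python sketch, assuming the identification digit is the second-to-last character of a 12-digit EUIDNUM; the names `ID_ROLES` and `role_of` are hypothetical.

```python
ID_ROLES = {
    1: "Twin 1",
    2: "Twin 2",
    3: "Triplet",
    4: "Quadruplet",
    0: "Non-twin",
}


def role_of(euidnum: str) -> str:
    """Return the role encoded by the identification digit."""
    digit = int(euidnum[-2])
    if digit not in ID_ROLES:
        raise ValueError(f"unknown identification digit: {digit}")
    return ID_ROLES[digit]


print(role_of("246123456714"))  # Twin 1
```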
Checksum
The checksum is calculated with the GUMM algorithm (H. Peter Gumm: A new
class of check digit methods for arbitrary number systems, IEEE
Transactions on Information Theory, 31 (1985), 102-105). The GenomEUtwin
Data Transfer Java application can be used to calculate GUMM check digits.
data processing
Data processing progress for genotype and phenotype data.
PHENOTYPE DATA
Data Format and Variable Standard for GenomEUtwin's Phenotype Database version 4.0 (updated 9.12.2004)
Appendix A - Migraine Phenotype version 2.0 (updated 13.5.2005)
GenomEUtwin Data Transfer Java application
GENOTYPE DATA
The genotype database (GtDB) is being developed as a collaborative open source project.
GtDB Project (SourceForge.net)
GtDB Data Formats version 1.6
gtdata@genomeutwin.org
WORKING GROUPS
WG1: Phenotype Data
WG2: Data Modeling
WG3: Data Integration
WG4: Output and Analysis
LINKS
ISO Country Codes
IBM Discovery Link
SWE Twin Forum (username: "eu_guest", password: "forum815")
PGP (Pretty Good Privacy): download the latest freeware version
