Locus file (data file)

Description

The locus file contains information about the disease allele frequency, marker allele frequencies, liability classes, penetrances, and so on.

Format

The locus file is a plain ASCII text file. The file format described here is the so-called LINKAGE format.

Simple locus files can be created with makedata and more complex ones with preplink, although in practice makedata is sufficient for most locus files. The locus file starts with three header lines:

    Line 1: Number of Loci, Risk Locus, Sexlinked (if 1), Program Code
    Line 2: Mutation Locus, Mutation Rate Male, Mutation Rate Female, Haplotype Frequencies (if 1)
    Line 3: Locus order

The disease and marker locus information follows (locus type, number of alleles and allele frequencies). The disease locus usually comes before the marker loci. This matches the established practice, also used in the pedigree file howto, of placing the disease phenotype before the marker phenotypes. Example of a fully penetrant dominant disease locus (with 2 alleles):

    1 2                     { locus type and number of alleles
    0.99 0.01               { gene frequencies (for normal and disease)
    1                       { number of liability classes
    0.0 1.0 1.0             { penetrances for liab. class 1, P(Aff|++), P(Aff|D+) and P(Aff|DD)

P(Aff|++) is the phenocopy rate, P(Aff|D+) is the penetrance for carriers of one disease allele and P(Aff|DD) is the penetrance for carriers of two disease alleles.
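
To see how the gene frequencies and penetrances of the disease locus fit together, the following minimal Python sketch computes the expected population frequency of affected individuals. It assumes Hardy-Weinberg genotype proportions, which is an assumption of this illustration and is not stated in the locus file itself. The function names are hypothetical.

```python
def genotype_frequencies(p_normal, p_disease):
    """Genotype frequencies for ++, D+ and DD, assuming Hardy-Weinberg proportions."""
    return (p_normal ** 2, 2 * p_normal * p_disease, p_disease ** 2)


def prevalence(gene_freqs, penetrances):
    """Expected frequency of affected individuals in the population.

    gene_freqs  -- (normal, disease) allele frequencies, as on the gene frequency line
    penetrances -- (P(Aff|++), P(Aff|D+), P(Aff|DD)), as on the penetrance line
    """
    geno = genotype_frequencies(*gene_freqs)
    return sum(g * f for g, f in zip(geno, penetrances))


# Values taken from the fully penetrant dominant example above.
k = prevalence((0.99, 0.01), (0.0, 1.0, 1.0))
print(f"expected prevalence: {k:.4f}")  # expected prevalence: 0.0199
```

With a phenocopy rate of 0.0, only carriers of the disease allele contribute, so the result is 2 * 0.99 * 0.01 + 0.01 * 0.01 = 0.0199.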

The marker locus information then follows the disease locus. Example of a marker locus with 4 alleles:

    3 4                     { locus type and number of alleles
    0.25 0.25 0.30 0.20     { gene frequencies
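
A marker block is easy to get wrong by hand, for example when the number of frequencies does not match the declared number of alleles. The sketch below is a hypothetical helper, not part of makedata or any LINKAGE program. It checks one numbered-allele block for the two most common slips, assuming that the frequencies of a locus should sum to 1 within a small tolerance.

```python
import math


def check_marker_locus(n_alleles, freqs, tol=1e-6):
    """Return a list of problems found in one marker locus block (empty if none)."""
    problems = []
    if len(freqs) != n_alleles:
        problems.append(f"expected {n_alleles} frequencies, found {len(freqs)}")
    if any(f < 0.0 or f > 1.0 for f in freqs):
        problems.append("frequency outside [0, 1]")
    if not math.isclose(sum(freqs), 1.0, abs_tol=tol):
        problems.append(f"frequencies sum to {sum(freqs):.6f}, not 1")
    return problems


# The 4-allele marker from the example above passes both checks.
ok = check_marker_locus(4, [0.25, 0.25, 0.30, 0.20])
# A block with a missing frequency fails both checks.
bad = check_marker_locus(4, [0.25, 0.25, 0.30])
print(ok)
print(bad)
```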

A locus file can contain any number of markers. The last three lines are:

    Third last  : Sex difference, Interference (if 1 or 2) 
    Second last : Recombination values between markers
    Last        : Recombination varied, Increment value, Finishing value

Example

Let's create a locus file with three markers (autosomal, fully penetrant dominant disease) with makedata:

    % makedata

    Makedata 1.4 (06/17/1999). This program creates data files for linkage
    analysis. Made by Tero Juntunen

    MAX liability classes       10
    MAX allele in one loci      50
    MAX loci in one chromosome  500

    How many marker loci?
    3
    Is this set of markers?
    (1) Autosomal
    (2) Sexlinked
    1
    What is the assumed frequecy of the disease allele?
    0.001
    How many liability classes?
    1
    Please enter penetrances for liability class 1.
    P(Aff|DD)   P(Aff|D+)   P(Aff|++)
    1.0 1.0 0.0
    What is the largest number of alleles at any markers?
    5
    Do you want to specify recombination fraction between markers?
    (0=No, 1=Yes, 2=AUTOSCAN option)
    0

    indata.dat created! Run downfreq to eliminate alleles
    and calculate correct allele frequences.

    %

The resulting file looks like this:

     4 0 0 5  << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
     0 0.0 0.0 0 << MUT LOCUS, MUT MALE, MUT FEM, HAP FREQ (IF 1)
     1 2 3 4
     1 2  << AFFECTION, NO. OF ALLELES
     0.99900  0.00100 << GENE FREQUENCIES
     1 << NO. OF LIABILITY CLASSES
     0.0000  1.0000  1.0000 << PENETRANCES
     3 5  << ALLELE NUMBERS, NO. OF ALLELES
     0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
     3 5  << ALLELE NUMBERS, NO. OF ALLELES
     0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
     3 5  << ALLELE NUMBERS, NO. OF ALLELES
     0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
     0 0  << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
     0.5000 0.10000 0.10000 << RECOMBINATION VALUES
     1 0.10000 0.45000 << REC VARIED, INCREMENT, FINISHING VALUE
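
The layout above can be read back with a short Python sketch. This is an illustration under stated assumptions, not the reader used by the LINKAGE programs: it handles only autosomal files without a risk locus, only locus types 1 (affection) and 3 (numbered alleles) as used in this document, and it treats everything after `<<` or `{` on a line as an annotation. The function names and dictionary keys are hypothetical.

```python
EXAMPLE = """\
 4 0 0 5  << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
 0 0.0 0.0 0 << MUT LOCUS, MUT MALE, MUT FEM, HAP FREQ (IF 1)
 1 2 3 4
 1 2  << AFFECTION, NO. OF ALLELES
 0.99900  0.00100 << GENE FREQUENCIES
 1 << NO. OF LIABILITY CLASSES
 0.0000  1.0000  1.0000 << PENETRANCES
 3 5  << ALLELE NUMBERS, NO. OF ALLELES
 0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
 3 5  << ALLELE NUMBERS, NO. OF ALLELES
 0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
 3 5  << ALLELE NUMBERS, NO. OF ALLELES
 0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
 0 0  << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
 0.5000 0.10000 0.10000 << RECOMBINATION VALUES
 1 0.10000 0.45000 << REC VARIED, INCREMENT, FINISHING VALUE
"""


def tokens(line):
    """Return the numeric fields of a line, dropping a '<<' or '{' annotation."""
    for marker in ("<<", "{"):
        line = line.split(marker, 1)[0]
    return line.split()


def parse_locus_file(text):
    """Parse a simple autosomal LINKAGE locus file into a dictionary."""
    rows = [tokens(line) for line in text.splitlines()]
    rows = iter([row for row in rows if row])  # skip blank lines

    # Line 1: number of loci, risk locus, sexlinked flag, program code.
    n_loci, risk_locus, sexlinked, program = (int(x) for x in next(rows))
    if risk_locus != 0 or sexlinked != 0:
        raise ValueError("sketch handles autosomal files without a risk locus only")
    mutation = next(rows)                      # line 2, kept as raw tokens
    order = [int(x) for x in next(rows)]       # line 3, locus order

    loci = []
    for _ in range(n_loci):
        locus_type, n_alleles = (int(x) for x in next(rows))
        locus = {"type": locus_type, "n_alleles": n_alleles,
                 "freqs": [float(x) for x in next(rows)]}
        if locus_type == 1:                    # affection status locus
            n_classes = int(next(rows)[0])
            locus["penetrances"] = [[float(x) for x in next(rows)]
                                    for _ in range(n_classes)]
        elif locus_type != 3:                  # 3 = numbered alleles
            raise ValueError(f"unsupported locus type {locus_type}")
        loci.append(locus)

    # The last three lines of the file.
    sex_difference, interference = (int(x) for x in next(rows))
    recombination = [float(x) for x in next(rows)]
    last = next(rows)
    return {
        "n_loci": n_loci, "program": program, "mutation": mutation,
        "order": order, "loci": loci,
        "sex_difference": sex_difference, "interference": interference,
        "recombination": recombination,
        "rec_varied": int(last[0]),
        "increment": float(last[1]),
        "finish": float(last[2]),
    }


parsed = parse_locus_file(EXAMPLE)
print(parsed["n_loci"], [locus["type"] for locus in parsed["loci"]])  # 4 [1, 3, 3, 3]
```

Note that the file lists 4 loci (one disease locus plus three markers) and therefore three recombination values, one for each adjacent pair of loci.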

The marker allele frequencies are initially all set equal. The correct allele frequencies are calculated afterwards with the downfreq program.
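
The equal placeholder frequencies are simply 1 divided by the number of alleles. The sketch below reproduces the two lines of a marker block in the same layout as the file above. It is a hypothetical illustration and not taken from the makedata source, so the exact spacing that makedata writes may differ.

```python
def marker_block(n_alleles):
    """Return the two lines of a numbered-allele locus block with equal frequencies."""
    freq = 1.0 / n_alleles
    freq_fields = " ".join(f"{freq:.6f}" for _ in range(n_alleles))
    return [
        f" 3 {n_alleles}  << ALLELE NUMBERS, NO. OF ALLELES",
        f" {freq_fields} << GENE FREQUENCIES",
    ]


# Five alleles, as in the example session above.
block = marker_block(5)
for line in block:
    print(line)
```

When the number of alleles does not divide 1 evenly (for example 3 alleles), the rounded values no longer sum to exactly 1, which is one more reason to replace the placeholders with estimated frequencies.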

Other formats

LINKAGE is the most widely used locus file format, but some programs, such as Mendel, Simwalk2 and Solar, use their own formats. For converting from LINKAGE to other formats, see the Mega2 howto.

Documentation and further information

Handbook of Human Genetic Linkage, Joseph D. Terwilliger and Jurg Ott. Johns Hopkins University Press, Baltimore (1994)