A locus file contains the disease allele frequency, marker allele frequencies, liability classes, penetrances and related parameters.
Simple locus files can be created with makedata and more complex ones with preplink, although in practice makedata is enough for nearly all locus files. The locus file structure is:
Line 1: Number of Loci, Risk Locus, Sexlinked (if 1), Program Code
Line 2: Mutation Locus, Mutation Rate Male, Mutation Rate Female, Haplotype Frequencies (if 1)
Line 3: Locus order
The disease and marker locus information follows: locus type, number of alleles and allele frequencies. By established practice the disease locus comes before the marker loci, just as the disease phenotype comes before the marker phenotypes in the pedigree file (see the pedigree file howto). Example of a fully penetrant dominant disease locus with 2 alleles:
1 2 { locus type and number of alleles
0.99 0.01 { gene frequencies (for normal and disease)
1 { number of liability classes
0.0 1.0 1.0 { penetrances for liab. class 1, P(Aff|++), P(Aff|D+) and P(Aff|DD)
Here P(Aff|++) is the phenocopy rate, P(Aff|D+) is the penetrance with one
disease allele and P(Aff|DD) is the penetrance with two disease alleles.
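To see what these numbers imply, the following minimal Python sketch computes the expected population prevalence of the disease from the gene frequencies and penetrances in the example. It assumes Hardy-Weinberg genotype proportions, which the locus file itself does not state, and the function names are hypothetical helpers, not part of any LINKAGE program.

```python
def genotype_frequencies(p_normal, q_disease):
    """Hardy-Weinberg genotype frequencies in the order (++, D+, DD).

    Assumption: random mating, so the genotype frequencies follow
    directly from the two gene frequencies.
    """
    return (p_normal ** 2, 2 * p_normal * q_disease, q_disease ** 2)


def prevalence(gene_freqs, penetrances):
    """Sum of P(genotype) * P(Aff | genotype) over the three genotypes."""
    p_normal, q_disease = gene_freqs
    genotypes = genotype_frequencies(p_normal, q_disease)
    return sum(g * f for g, f in zip(genotypes, penetrances))


# Values from the disease locus example: frequencies 0.99 / 0.01,
# penetrances P(Aff|++), P(Aff|D+), P(Aff|DD) = 0.0, 1.0, 1.0.
print(round(prevalence((0.99, 0.01), (0.0, 1.0, 1.0)), 4))  # prints 0.0199
```

With no phenocopies and full penetrance, about 2% of the population is expected to be affected, almost all of them carrying a single disease allele.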
The marker locus information follows the disease locus. Example of a marker locus with 4 alleles:
3 4 { locus type and number of alleles
0.25 0.25 0.30 0.20 { gene frequencies
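A marker block like this is easy to check by hand or by script: the number of frequencies must match the allele count and the frequencies should sum to 1. The sketch below is one way to do that in Python; `strip_comment` and `parse_marker_locus` are hypothetical names, and the tolerance of 1e-6 is an arbitrary choice.

```python
def strip_comment(line):
    """Drop a trailing '{ ...' or '<< ...' annotation from a locus-file line."""
    for marker in ("{", "<<"):
        line = line.split(marker, 1)[0]
    return line.strip()


def parse_marker_locus(lines):
    """Parse the two lines of a numbered-allele (type 3) marker block.

    Returns the list of allele frequencies, or raises ValueError if the
    block is inconsistent.
    """
    locus_type, n_alleles = (int(x) for x in strip_comment(lines[0]).split())
    freqs = [float(x) for x in strip_comment(lines[1]).split()]
    if locus_type != 3:
        raise ValueError("expected locus type 3 (numbered alleles)")
    if len(freqs) != n_alleles:
        raise ValueError("number of frequencies does not match allele count")
    if abs(sum(freqs) - 1.0) > 1e-6:  # tolerance is an assumption
        raise ValueError("allele frequencies do not sum to 1")
    return freqs


block = [
    "3 4 { locus type and number of alleles",
    "0.25 0.25 0.30 0.20 { gene frequencies",
]
print(parse_marker_locus(block))  # prints [0.25, 0.25, 0.3, 0.2]
```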
A locus file can contain many marker loci (makedata 1.4 allows up to 500 loci per chromosome). The last three lines are:
Third last : Sex difference, Interference (if 1 or 2)
Second last : Recombination values between markers
Last : Recombination varied, Increment value, Finishing value
Let's create a locus file with 3 markers (autosomal, fully penetrant dominant disease) with makedata:
% makedata
Makedata 1.4 (06/17/1999). This program creates data files for linkage
analysis. Made by Tero Juntunen
MAX liability classes 10
MAX allele in one loci 50
MAX loci in one chromosome 500
How many marker loci?
3
Is this set of markers?
(1) Autosomal
(2) Sexlinked
1
What is the assumed frequecy of the disease allele?
0.001
How many liability classes?
1
Please enter penetrances for liability class 1.
P(Aff|DD) P(Aff|D+) P(Aff|++)
1.0 1.0 0.0
What is the largest number of alleles at any markers?
5
Do you want to specify recombination fraction between markers?
(0=No, 1=Yes, 2=AUTOSCAN option)
0
indata.dat created! Run downfreq to eliminate alleles
and calculate correct allele frequences.
%
The resulting file looks like this:
4 0 0 5 << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
0 0.0 0.0 0 << MUT LOCUS, MUT MALE, MUT FEM, HAP FREQ (IF 1)
1 2 3 4
1 2 << AFFECTION, NO. OF ALLELES
0.99900 0.00100 << GENE FREQUENCIES
1 << NO. OF LIABILITY CLASSES
0.0000 1.0000 1.0000 << PENETRANCES
3 5 << ALLELE NUMBERS, NO. OF ALLELES
0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
3 5 << ALLELE NUMBERS, NO. OF ALLELES
0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
3 5 << ALLELE NUMBERS, NO. OF ALLELES
0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
0 0 << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
0.5000 0.10000 0.10000 << RECOMBINATION VALUES
1 0.10000 0.45000 << REC VARIED, INCREMENT, FINISHING VALUE
Marker allele frequencies are initially set equal; the correct allele frequencies are then calculated with the downfreq program.
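To make the layout concrete, here is a minimal Python sketch that reads the file generated above back into a dictionary. It is an illustration only, not a tool from the LINKAGE package: the names `fields` and `parse_locus_file` are hypothetical, and it assumes an autosomal file with no risk locus, only affection (type 1) and numbered-allele (type 3) loci, and one penetrance line per liability class.

```python
EXAMPLE = """\
4 0 0 5 << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
0 0.0 0.0 0 << MUT LOCUS, MUT MALE, MUT FEM, HAP FREQ (IF 1)
1 2 3 4
1 2 << AFFECTION, NO. OF ALLELES
0.99900 0.00100 << GENE FREQUENCIES
1 << NO. OF LIABILITY CLASSES
0.0000 1.0000 1.0000 << PENETRANCES
3 5 << ALLELE NUMBERS, NO. OF ALLELES
0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
3 5 << ALLELE NUMBERS, NO. OF ALLELES
0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
3 5 << ALLELE NUMBERS, NO. OF ALLELES
0.200000 0.200000 0.200000 0.200000 0.200000 << GENE FREQUENCIES
0 0 << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
0.5000 0.10000 0.10000 << RECOMBINATION VALUES
1 0.10000 0.45000 << REC VARIED, INCREMENT, FINISHING VALUE
"""


def fields(line):
    """Return the whitespace-separated values before a '<<' or '{' comment."""
    for marker in ("<<", "{"):
        line = line.split(marker, 1)[0]
    return line.split()


def parse_locus_file(text):
    """Parse a simple autosomal locus file (assumptions listed above)."""
    rows = iter(r for r in map(fields, text.splitlines()) if r)
    n_loci, risk_locus, sexlinked, program = (int(x) for x in next(rows))
    if risk_locus != 0 or sexlinked != 0:
        raise NotImplementedError("sketch handles autosomal, no-risk files only")
    mutation = next(rows)  # kept as raw strings, not interpreted here
    order = [int(x) for x in next(rows)]
    loci = []
    for _ in range(n_loci):
        locus_type, n_alleles = (int(x) for x in next(rows))
        locus = {"type": locus_type, "n_alleles": n_alleles,
                 "freqs": [float(x) for x in next(rows)]}
        if locus_type == 1:  # affection status: liability classes follow
            n_classes = int(next(rows)[0])
            locus["penetrances"] = [[float(x) for x in next(rows)]
                                    for _ in range(n_classes)]
        elif locus_type != 3:
            raise NotImplementedError("only locus types 1 and 3 are handled")
        loci.append(locus)
    sex_difference, interference = (int(x) for x in next(rows))
    recombination = [float(x) for x in next(rows)]
    varied, increment, finishing = next(rows)
    return {"n_loci": n_loci, "program": program, "mutation": mutation,
            "order": order, "loci": loci, "sex_difference": sex_difference,
            "interference": interference, "recombination": recombination,
            "varied": int(varied), "increment": float(increment),
            "finishing": float(finishing)}


parsed = parse_locus_file(EXAMPLE)
print(parsed["n_loci"], [locus["n_alleles"] for locus in parsed["loci"]])
# prints: 4 [2, 5, 5, 5]
```

Reading the file sequentially mirrors its structure: three header lines, one block per locus, then the three trailing lines.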
LINKAGE is the most widely used locus file format, but other programs such as Mendel, Simwalk2 and Solar use their own formats. For converting from LINKAGE to other formats, see the Mega2 howto.
Handbook of Human Genetic Linkage, Joseph D. Terwilliger and Jurg Ott. Johns Hopkins University Press, Baltimore (1994)