October 5, 2005

New ZAPLO code integrates PROFILER directly to eliminate the additional processing step.

The probability distributions in this version have been modified. Given that we have a single locus, no phase information is required for the founders in the pedigree. Thus we can use unordered genotypes rather than ordered genotypes. In practical terms, this requires the probabilities of heterozygous genotypes to be doubled. The new representation reduces the size of the distribution by approximately a factor of 2 for each founder, thereby leading to significant savings in computation and storage.
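The doubling of heterozygote probabilities can be illustrated with a minimal sketch. It assumes founder priors are products of allele frequencies (Hardy-Weinberg proportions); the function names are hypothetical and are not taken from the ZAPLO source. With n alleles there are n*n ordered genotypes but only n*(n+1)/2 unordered ones, a ratio of 2n/(n+1) that approaches 2 as n grows.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of the ZAPLO source. */

/* Ordered genotypes treat (a, b) and (b, a) as distinct: n * n in total. */
size_t ordered_genotype_count(size_t n) { return n * n; }

/* Unordered genotypes: n homozygotes plus n * (n - 1) / 2 heterozygotes. */
size_t unordered_genotype_count(size_t n) { return n * (n + 1) / 2; }

/* Prior of the unordered genotype {a, b}, assuming Hardy-Weinberg
 * proportions. A heterozygote absorbs both ordered forms (a, b) and
 * (b, a), so its probability is doubled. */
double unordered_genotype_prob(const double *freq, size_t a, size_t b) {
    double p = freq[a] * freq[b];
    return (a == b) ? p : 2.0 * p;
}

/* Sum over all unordered genotypes with a <= b. This equals 1 whenever
 * the allele frequencies themselves sum to 1. */
double unordered_total(const double *freq, size_t n) {
    double total = 0.0;
    for (size_t a = 0; a < n; a++)
        for (size_t b = a; b < n; b++)
            total += unordered_genotype_prob(freq, a, b);
    return total;
}
```

For three alleles with frequencies 0.5, 0.3 and 0.2, the ordered representation has 9 entries and the unordered one has 6, and the unordered probabilities still sum to 1.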

New features and options:

1. ZAPLO now accepts pre-makeped formatted files as valid input. The required option is -L p. Pre-makeped format has four fewer columns than post-makeped format. Since PROFILER uses the post-makeped proband column for users to specify the GROUP, the new version currently has no mechanism for defining GROUPS with pre-makeped input. A future release will have the option to read in an auxiliary file that contains group information. Currently there are 4 default GROUPS that are defined and available through command line options: each individual (default, no option required), all founders (-Z F), all non-founders (-Z N), and all individuals (-Z E). These options are also valid for post-makeped files.

If you want to define your own group, you must use a post-makeped file with the -Z G option.

2. ZAPLO now has several functions that generate different types of output for easy parsing. These functions are in the source file zaplo3.c. The function print_zaplo_profile_csv writes a comma-separated file with each joint vector and its probability on one line; after running zaplo, there will be an output file named zaplo_csv.txt. The function print_zaplo_profile_pedigree writes a tab-delimited file with each individual in the group printed on a separate line; after running zaplo, the file zaplo_pedigree.txt will be created. Comparing these output files with the C code in the functions will serve as a guide for writing other types of output.
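As a rough sketch of how a line of zaplo_csv.txt might be consumed downstream, the helper below splits a line into the joint vector text and its probability. It assumes the probability is the last comma-separated field on the line; that layout, the function name, and the sample genotype notation are assumptions, so check them against an actual output file and print_zaplo_profile_csv before relying on this.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper; assumes the probability is the LAST comma-separated
 * field and everything before the final comma is the joint vector.
 * On success returns 0, stores the length of the joint vector text in
 * *vector_len and the probability in *prob. Returns -1 if malformed. */
int split_profile_line(const char *line, size_t *vector_len, double *prob) {
    const char *comma = strrchr(line, ',');   /* last comma on the line */
    if (comma == NULL || comma == line)
        return -1;                            /* no vector or no comma */

    char *end;
    double value = strtod(comma + 1, &end);
    if (end == comma + 1)
        return -1;                            /* field is not a number */

    /* Only trailing whitespace or a line ending may follow the number. */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *vector_len = (size_t)(comma - line);
    *prob = value;
    return 0;
}
```

The caller can then copy the first `*vector_len` characters of the line to obtain the joint vector and split it further on commas if the individual entries are needed.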

There is also an option, -Z A, to read in a phenotype file. The format is Ped ID followed by any number of additional fields. This option reads in the information, finds the correct pedigree and individual, then stores the character string. This string is then appended to each line in zaplo_pedigree.txt. This mechanism allows affection status, QTLs, etc. to be added. Currently this file must be named phenotypefile.txt.
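The phenotype file layout described above can be sketched as a small line parser. It assumes whitespace-separated fields, with the pedigree and individual identifiers as the first two tokens and everything after them kept as one string; the struct, the buffer sizes, and the function names are hypothetical and do not come from the ZAPLO source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one line of phenotypefile.txt. */
struct phenotype_record {
    char pedigree[64];
    char individual[64];
    char extra[256];   /* remaining fields, stored as one string */
};

static const char *WS = " \t\r\n";

/* Copy the next whitespace-delimited token from p into out.
 * Returns a pointer just past the token, or NULL if there is no token
 * or it does not fit in out. */
static const char *copy_token(const char *p, char *out, size_t out_size) {
    p += strspn(p, WS);              /* skip leading whitespace */
    size_t len = strcspn(p, WS);     /* length of the token */
    if (len == 0 || len >= out_size)
        return NULL;
    memcpy(out, p, len);
    out[len] = '\0';
    return p + len;
}

/* Parse "Ped ID field field ..." into rec. Returns 0 on success, -1 if
 * either identifier is missing or a field is too long. */
int parse_phenotype_line(const char *line, struct phenotype_record *rec) {
    const char *p = copy_token(line, rec->pedigree, sizeof rec->pedigree);
    if (p == NULL)
        return -1;
    p = copy_token(p, rec->individual, sizeof rec->individual);
    if (p == NULL)
        return -1;

    p += strspn(p, WS);                  /* skip separator before the rest */
    size_t len = strcspn(p, "\r\n");     /* keep up to the line ending */
    if (len >= sizeof rec->extra)
        return -1;
    memcpy(rec->extra, p, len);
    rec->extra[len] = '\0';
    return 0;
}
```

A line such as `7 3 affected 12.5` would yield pedigree `7`, individual `3`, and the stored string `affected 12.5`, which corresponds to the text that is appended to the matching lines of zaplo_pedigree.txt.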

The gzipped tar files contain the latest ZAPLO program for the Linux and Solaris platforms. The files will untar into the directory zaplo_solaris_src or zaplo_linux_src. Source is provided so that the output routines can be modified as desired. Recompile after changes by typing make. The executable zaplo2.1 will appear in the current directory. The examples directory contains an example with a pre-makeped file, phenotypefile.txt, and output with the probability distribution containing all individuals.

If you have any questions on how to modify the code or run the program, or would like a specific type of output, contact me.

zaplo_linux_src.tar.gz

 

zaplo_solaris_src.tar.gz