INSTALL:

The CImg Library is used in this package. Its header file, CImg.h, is bundled with the package; the bundled version is 1.08. More information about the CImg Library is available from http://cimg.sourceforge.net/index.shtml

Two applications are included in this package:

1. fasta2asc:

fasta2asc reads the genome sequence and counts the frequencies of 9-mer words, then saves the counts in the asc file format so that the long genome sequence does not have to be read repeatedly.
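
The counting step can be illustrated with a minimal Python sketch. This is not the fasta2asc source code: the function names count_words and read_fasta are hypothetical, and the handling of ambiguous bases and of multi-record FASTA files is an assumption made only for this illustration.

```python
from collections import Counter


def read_fasta(lines):
    """Join the sequence lines of a FASTA file, dropping '>' header lines.

    Assumption: all records are concatenated into one sequence.
    """
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def count_words(sequence, n=9):
    """Count every overlapping n-letter word in a nucleotide string."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - n + 1):
        word = sequence[i:i + n]
        # Assumption: words containing anything other than A, C, G, T
        # (for example the ambiguity code N) are skipped.
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts
```

For example, count_words(read_fasta(open("e.coli.fna")), n=9) would return a table of 9-mer counts for the genome; the real tool writes its table to the asc file instead.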

2. genPortrait:

genPortrait reads the asc format file and maps the frequencies either to gray scale, by normalizing them to 0-255, or to color, using the HSV, JET, COOL, or SPRING method.
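
As a rough illustration of the gray scale mapping, the Python sketch below rescales a list of values linearly to 0-255. The function name to_gray is hypothetical, and mapping the smallest value to 0 and the largest to 255 is an assumption; genPortrait may normalize differently.

```python
def to_gray(values):
    """Linearly rescale a list of numbers to integers in 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no contrast to show.
        return [0 for _ in values]
    scale = 255.0 / (hi - lo)
    return [int(round((v - lo) * scale)) for v in values]
```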

To compile the project, copy the source code to any directory and run make in that directory.

Usage:

First run fasta2asc as follows (here the input genome file is named e.coli.fna):
fasta2asc e.coli.fna
This will create an output file named e.coli.fna.9.asc. An .asc file contains the frequencies of n-letter words in a genome. By default, n is set to 9 (which is why 9 is included in the name of the output file).
You can specify any number less than 9 for the parameter n, as follows:
fasta2asc -5 e.coli.fna (for the frequencies of 5-nucleotide words)
fasta2asc -3 e.coli.fna (for the frequencies of 3-nucleotide words)

Next, run genPortrait in the following way:
genPortrait -sH e.coli.fna.9.asc
Here -s specifies that the resulting picture will be saved to a bmp file. In this example, the output file will be named e.coli.fna.9.asc.HSV.bmp. The -H option specifies the color scheme to use; in this example, the HSV scheme is used.
The available options are:
	-s save to bmp file
	-d show the picture in a window
	-m use the mean method instead of the log method
	-B Black/White output
	-H HSV output
	-J JET output
	-C COOL output
	-S SPRING output
At least one of the -s and -d options must be specified (you can also specify both).
The -m option is optional. With -m, the upper bound of the frequencies is set to twice the average frequency. Any n-letter word whose frequency exceeds this upper bound is treated as having the maximum frequency, i.e. 2 * (the average frequency). With the default log method, the frequencies are transformed by taking the natural log. The resulting figure tends to be smoother.
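
The two methods can be sketched in Python as follows. This is only an illustration of the description above, not the genPortrait source: the function names are hypothetical, and using log(1 + f) so that a zero count stays defined is an assumption.

```python
import math


def mean_method(freqs):
    """Clip each frequency at twice the average frequency."""
    upper = 2.0 * sum(freqs) / len(freqs)
    return [min(f, upper) for f in freqs]


def log_method(freqs):
    """Apply the natural log to each frequency."""
    # Assumption: log(1 + f) is used so that a count of zero maps to 0.
    return [math.log1p(f) for f in freqs]
```

With the frequencies 1, 1, 1, 9 the average is 3, so mean_method caps the last value at 6 and leaves the others unchanged.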
At least one color scheme option (B, H, J, C, or S) must be specified. You can also specify any combination of them, for example:
genPortrait -sHBJ e.coli.fna.9.asc
This will create three separate bmp files, one each in the HSV, B/W, and JET schemes.

If you encounter any problems, please send email to lib@purdue.edu or dkihara@purdue.edu.