Please send your answers per email to gogarten@uconn.edu, or hand in a hardcopyPlease let me know, how far you got during the lab. If most students didn't finish, we will continue this next week!
You should answer the questions in red!
Preparation:
Create a directories (mkdir) named paml_branch and paml_sites.If you have your own data file to work on, use phyml and a reasonable model (HKY with ASRV and invariant sites) to find the maximum likelihood tree. Save this tree in both the paml_branch and paml_sites directories.Copy the file codeml.hv1.sites.ctl into paml_sites, and codeml.hv1.branches.ctl into paml_branches.
Create a directories (mkdir) named paml_branch and paml_sites.
If you have your own data file to work on, use phyml and a reasonable model (HKY with ASRV and invariant sites) to find the maximum likelihood tree. Save this tree in both the paml_branch and paml_sites directories.
Copy the file codeml.hv1.sites.ctl into paml_sites, and codeml.hv1.branches.ctl into paml_branches.
1) MrBayes and the 3 omega model
If you have your own dataset, add the MrBayes block described in class 12, slide 18 to the bottom of the file. Save the file on your local IMAC and upload it into web interface at http://bbcxsrv1.biotech.uconn.edu/bipod/index.html (see slide 20-21).
Else, inspect the input file for Hv1.nex .
Depending on the number of generations, the MrBayes run can take from several minutes to several hours. Therefore, you should try the following exercise using the *.p file from http://bbcxsrv1.biotech.uconn.edu/pise/tmp/A10700111431640/results.html (or here). Download the file onto your computer, and import it into MSExel (DATA - get external data ...). Follow instructions form the lecture12, slide 24.
Plot the LnL column to determine the "burnin time".How many generations did you decide to discard?
For each codon calculate the portability that it might have an omega larger than 1. Plot the results as a bar graph.
2) Sites models in PAML
Make paml_sites your working directory. Modify the command file (codeml.hv1.sites.ctl) so that it recognizes your data matrix (or Hv1.phy) and your ml-tree (or hv1.tree). If you use your own data, you might need to add an I (Interleaved option) to the first line.
IF you want to actually run the program (takes about an hour on hv1.phy) you can use the following command:/Users/jpgogarten/paml/codeml your_control_file
Inspect the output in the designated outfile and in the rst file.
What are the likelihoods for the three models?Which of these model is supported through an ml ratio test? How many degrees of freedom do you choose? Which value did you use to look up in the chi square table? What omega values were estimated?
2) Branch models in PAML
Move over to the paml_branches directory. Modify the command file (codeml.hv1.branches.ctl) so that it recognizes your data matrix (or Hv1.phy) and your ml-tree (or hv1.tree).
Inspect the output in the designated outfile. Draw both the dS and the dN tree in treeview or njplot.
Are there any branches in the phylogeny that are candidates for episodes of positive selection? What were the dN/dS ratios for these branches?