Your name:
Your email address:
Answer all the questions in red in the provided boxes.
1) Maximum likelihood tests using phyml as implemented in Seaview. We will test the following models:
Model            # of rates   # of frequencies   Gamma   Invariant sites   Degrees of freedom to previous model
JC               1            1                  N       N                 -
HKY85            2            4                  N       N                 4
GTR              6            4                  N       N                 2
GTR + Gamma      6            4                  Y       N                 1
GTR + GammaInv   6            4                  Y       Y                 1
Open this file in Seaview. Select all sequences. Select sites Extein. Under Trees select phyml.
One important condition that has to be fulfilled before one can use a Likelihood Ratio Test (LRT) to compare two models is that the models are "nested". This means that the simpler model must be a constrained version of the parameter-rich model. The likelihood ratio test is performed by doubling the difference in log-likelihood scores and comparing this test statistic with the critical value from a chi-squared distribution whose degrees of freedom equal the difference in the number of estimated parameters between the two models. The parameter-rich model will always fit at least as well as the simpler one because of its extra parameters, and will therefore have the higher log-likelihood, so the difference should be a positive number. The degrees of freedom between successive models are given in the table above: adding or removing the gamma shape parameter counts as one parameter (even though the distribution is approximated by 4 rate categories), and the proportion of invariant sites also counts as one parameter.
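The test described above can also be computed by hand. The following is a minimal sketch, not part of the lab tools: the function names `chi2_sf` and `likelihood_ratio_test` are made up for illustration, and the log-likelihood values in the usage lines are hypothetical placeholders, not results from this data set. It uses only the Python standard library and assumes integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-squared distribution
    with an integer number of degrees of freedom (df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, j = math.exp(-half), 2
    else:
        q, j = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x | j+2) = Q(x | j) + (x/2)^(j/2) * exp(-x/2) / Gamma(j/2 + 1)
    while j < df:
        q += half ** (j / 2.0) * math.exp(-half) / math.gamma(j / 2.0 + 1.0)
        j += 2
    return q

def likelihood_ratio_test(lnL_simple, lnL_complex, df):
    """Return (2 * deltaLogL, P-value) for two nested models."""
    stat = 2.0 * (lnL_complex - lnL_simple)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test(lnL_simple=-5000.0, lnL_complex=-4990.0, df=4)
print(f"2*deltaLogL = {stat:.2f}, P-value = {p:.3g}")
```

Replace the placeholder log-likelihoods with the values reported by phyml and take the degrees of freedom from the table; the result should agree with the online calculator.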
Use this online chi-square calculator to determine the significance of the test.
Are all the more complex models a significant improvement over the simpler ones?
JC
HKY85            2*deltaLogL:     P-value:
GTR              2*deltaLogL:     P-value:
GTR + Gamma      2*deltaLogL:     P-value:
GTR + GammaInv   2*deltaLogL:     P-value:
BIC: AICc: AIC:
Create a directory for lab9, and transfer the aligned sequences for exteins only (as a multiple fasta file created via save selection as in seaview - see above) into that directory.
When done, use filezilla to move the files created by iqtree to your desktop computer. You can open the treefile in seaview.
Open the log file in a text editor. At the end of the listing of lnL values for the individual models is the list of the best models under the different criteria.
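For orientation, the three criteria can be computed from a model's log-likelihood with the standard textbook formulas. This is a sketch under stated assumptions: `information_criteria` is a hypothetical helper, `k` is the number of free parameters (which, as an assumption about how the program counts, may include branch lengths), and `n` is the sample size, commonly taken as the number of alignment sites. The numbers in the usage line are placeholders, not values from this lab.

```python
import math

def information_criteria(lnL, k, n):
    """Return (AIC, AICc, BIC) for log-likelihood lnL,
    k free parameters and sample size n. Lower is better."""
    aic = 2 * k - 2 * lnL
    # Small-sample correction; requires n > k + 1.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * lnL
    return aic, aicc, bic

# Placeholder values, for illustration only.
aic, aicc, bic = information_criteria(lnL=-1000.0, k=10, n=500)
print(f"AIC = {aic:.2f}, AICc = {aicc:.2f}, BIC = {bic:.2f}")
```

Unlike the likelihood ratio test, these criteria do not require the compared models to be nested; BIC penalizes extra parameters more heavily than AIC once the alignment has more than a handful of sites.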
END ASIDE
Does the tree calculated under the best model correspond to the trees you obtained with seaview? What is the main difference between the models that consider Among Site Rate Variation and those that do not?
begin mrbayes;
  lset nst=6 rates=gamma;
  mcmcp filename=analysis_Extein;
  mcmcp samplefreq=50 printfreq=50 diagnfreq=500;
  mcmcp ngen=20000;
  mcmcp savebrlens=yes;
end;
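The settings above determine how many samples each run records. The arithmetic can be sketched as follows; `retained_samples` is a hypothetical helper, the burn-in fraction of 0.25 is only an example value, and the sketch assumes that the starting state is recorded in addition to one sample every `samplefreq` generations. The exact rounding MrBayes applies may differ.

```python
def retained_samples(ngen, samplefreq, burninfrac):
    """Return (total, discarded, kept) sample counts for one run."""
    # Assumption: the initial state is saved in addition to the periodic samples.
    total = ngen // samplefreq + 1
    discarded = int(total * burninfrac)
    return total, discarded, total - discarded

# ngen and samplefreq are taken from the block above; burninfrac is an example value.
total, discarded, kept = retained_samples(ngen=20000, samplefreq=50, burninfrac=0.25)
print(f"{total} samples per run, {discarded} discarded as burn-in, {kept} kept")
```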
execute Yeast_vma1_intein_aligned.nxs
showmodel
Read through the output after each command.
Type
mcmc
Average standard deviation of split frequencies
1) Which value did you use for the burnin/burninfraction?
2) Which branch in the tree is the longest? (Check the branch lengths table, then look up the branch in the bipartition table.)
3) How long is it? (Use the mean value.)
4) What is the measure? (Check the label at the scale bar for the last tree image.)
5) How big is the shape parameter for the intein? What is the 95% credibility interval?
6) What is the probability for a nucleotide to be an A? What is the probability for it to be a C?
7) Can you explain in a few words why it is important to exclude a 'burnin' from our analyses?

1) Which value did you use for the burnin/burninfraction?
2) Which branch in the tree is the longest? (Check the bipartition and branch lengths tables.)
3) How long is it?
4) What is the measure? (Check the scale bar at the tree image.)
5) What is the shape parameter for the intein? What is the 95% credibility interval?
6) What is the probability for a nucleotide to be an A? What is the probability for it to be a C?
7) Can you explain in a few words why it is important to exclude a 'burnin' from our analyses?
1) What are the means for the shape parameter?
2) Is the frequency of A and T different for inteins and exteins?
3) How long is it?
4) How many more substitutions occurred overall in the intein compared to the extein?
5) Is the difference significant?