Assignment 8 - Getting to know Seaview

Your name:
Your email address:

Answer all the questions in red in the provided boxes.

Assignments:

1) Download and install Seaview on your computer (you can download the program at http://doua.prabi.fr/software/seaview).

Seaview includes alignment programs (muscle and clustalo) and phylogenetic reconstruction programs (neighbor joining and parsimony analysis from PHYLIP, a collection of programs for phylogenetic analyses written by Joe Felsenstein, and PhyML, a maximum likelihood program).

Advantages of seaview are

2) Open Seaview, download this nucleotide sequence file, and load it into Seaview. The file contains a selection of nucleotide sequences that encode the vacuolar ATPase in different yeasts. Some of these have been invaded by an intein.

3) In the Seaview window, select Props and place a check mark next to "view as protein". If you downloaded the sequences as ORFs or from an alignment resulting from a tblastn search, you should not have any stop codons (shown as a small * in the view as protein display).

Do you see any stop codons in your sequences? 
Delete the sequence that has stop codons (click on the name of the sequence so that it turns white on black, then select Edit -> delete sequences). Uncheck view as protein and save the file in fst format. Go back to view as protein.
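
If you would like to double-check outside of Seaview what the view as protein display shows, the following is a minimal Python sketch. It assumes the standard genetic code and reading frame 1, and the sequence names and sequences in the example are made up for illustration (they are not from the assignment file).

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumption:
# the yeast nuclear genes here use the standard code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a gap-free nucleotide string in frame 1.

    Codons that are not in the table (e.g. containing N) become 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def sequences_with_stops(records):
    """Return the names of all sequences whose translation contains a '*'."""
    return [name for name, seq in records.items() if "*" in translate(seq)]


# Hypothetical toy sequences, only to show the idea.
records = {"seq_ok": "ATGGCTTGGAAA", "seq_stop": "ATGTAAGCTTGG"}
print(sequences_with_stops(records))
```

A sequence flagged here corresponds to a sequence that shows a * in Seaview.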

Select Align -> Alignment options -> muscle, then select Align. How many alignment columns are in your alignment? (Scroll to the right and click on the last column; the line at the top tells you the sequence and the position in the alignment | position in the sequence.)
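
The two numbers that Seaview reports (position in the alignment | position in the sequence) can also be worked out by hand from an aligned Fasta file. The sketch below assumes that gaps are written as "-" and uses a made-up two-sequence alignment; the function names are only for illustration.

```python
def read_fasta(text):
    """Parse Fasta-formatted text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def alignment_length(records):
    """Number of alignment columns; all sequences must have equal length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; is this file aligned?")
    return lengths.pop()


def sequence_position(aligned_seq, column):
    """Residues (non-gap characters) up to and including a 1-based column."""
    return sum(1 for ch in aligned_seq[:column] if ch != "-")


# Hypothetical toy alignment.
toy = read_fasta(">a\nMK-TA\n>b\nMKQTA\n")
print(alignment_length(toy), sequence_position(toy["a"], 5))
```

Note that the count depends on the view: with view as protein checked you count amino acid columns, otherwise nucleotide columns.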

4) The first four sequences have not been invaded by an intein.  Can you find the place where the intein begins and where it ends?  What are the first two and the last three amino acids of the intein?  
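
One way to think about this: in the intein region, the intein free sequences should show nothing but gaps. The sketch below looks for the longest run of columns in which all of the given sequences have a gap. It is only an illustration under that assumption; the toy sequences are made up, and in a real alignment the boundaries may be a column or two off, so check them by eye.

```python
def longest_gap_run(aligned_seqs):
    """Return (start, end) of the longest run of all-gap columns.

    Columns are 0-based and the range is half-open, so the run covers
    columns start .. end-1. Returns (0, 0) if there is no such column.
    """
    n = len(aligned_seqs[0])
    best = (0, 0)
    start = None
    for col in range(n + 1):
        all_gaps = col < n and all(s[col] == "-" for s in aligned_seqs)
        if all_gaps and start is None:
            start = col
        elif not all_gaps and start is not None:
            if col - start > best[1] - best[0]:
                best = (start, col)
            start = None
    return best


# Hypothetical toy data: two intein free sequences and one with an insert.
intein_free = ["MKT----GAS", "MKS----GAT"]
with_insert = "MKTDEFWGAS"

start, end = longest_gap_run(intein_free)
insert = with_insert[start:end]
print(start, end, insert[:2], insert[-3:])
```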

5) Create sets of sites that correspond to the extein and the intein. First go to Sites and create a set called "all sites", then duplicate this set and call it intein. Scroll to the right, and in the row of xxx below the alignment, click on the x below the last aa of the N-extein (the x disappears and the column is grayed out). Then right click on any of the xxx below the N-extein; all the x below the N-extein should disappear. Do the same at the end of the intein: remove the x under the first aa of the C-extein, then right click on any of the x to the right.
This might be a good point in time to save your file in mase format. 
Do the same for the sites corresponding to the extein:  Sites -> all sites, then Sites -> duplicate set, call it extein.
Move to the right, click on the x below the first and the last aa of the intein, then right click on an x under the intein. (The right click removes all the x between two non-x columns. If you right click before the last column of the intein is removed, you remove everything till the end :( ).
If you want information on how to modify an alignment by hand, check out the help pages.
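
What the two site sets do can be pictured as cutting every aligned sequence at the same two columns. The following is a small sketch of that idea, not of how Seaview stores its site sets. The sequence names, the toy alignment, and the column numbers are made up, and the conversion to nucleotide columns assumes that each amino acid column corresponds to exactly three nucleotide columns.

```python
def split_sites(alignment, start, end):
    """Split an alignment into extein and intein parts.

    alignment maps names to aligned sequences; start and end are 0-based,
    half-open column indices of the intein region.
    """
    extein = {name: seq[:start] + seq[end:] for name, seq in alignment.items()}
    intein = {name: seq[start:end] for name, seq in alignment.items()}
    return extein, intein


def protein_to_nucleotide_range(start, end):
    """Convert a protein column range to the matching nucleotide columns."""
    return start * 3, end * 3


# Hypothetical toy alignment with an insert in columns 3..6.
alignment = {"yeast1": "MKT----GAS", "yeast5": "MKTDEFWGAS"}
extein, intein = split_sites(alignment, 3, 7)
print(extein["yeast5"], intein["yeast5"], protein_to_nucleotide_range(3, 7))
```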

6) Uncheck view as protein. Save the alignment in mase format. Select Sites -> extein. Then highlight all the intein containing sequences. Select Trees -> PhyML -> model GTR (everything else as default) -> RUN. After about a minute the tree building is complete. If you do serious work, you want to copy all and place the results into your notebook BEFORE you click ok.
After you click ok, the window opens with the calculated maximum likelihood tree. 
Explore the Swap and Re-root buttons on top.  These operations do not change the tree (which is calculated as an unrooted tree). 
If you click on Br support, the estimated probability that the branch is real is displayed next to the branch. (Just in case, select File -> save unrooted tree and give it a name. Also, copy and paste the image of the tree into your notebook.)

7) Repeat this for the intein sequence: Sites -> intein; select the intein containing sequences; Trees -> PhyML (GTR) -> RUN (copy all and save the program output before clicking ok). Save the tree, and compare it to the extein tree.

Do you see any similarities between the trees calculated for the intein and the extein?

8) If you have time before the midterm, rename the intein free genes: select each sequence by clicking on its name and add a prefix (e.g. N_) at the beginning of the name. Then select Sites -> extein, select all sequences (click on the names), and run Trees -> PhyML -> RUN.

Do the intein free genes form a clan/clade?
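
Because the tree is unrooted, the question is whether a single branch separates the N_ sequences from all the others, no matter where the tree is drawn as rooted. The sketch below checks this for a tree written as nested Python tuples. The tuple representation and the leaf names are hypothetical stand-ins, not the format Seaview saves.

```python
def leaf_sets(tree):
    """Return (all leaves under tree, list of leaf sets of every node)."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    all_leaves = frozenset()
    collected = []
    for child in tree:
        child_leaves, child_sets = leaf_sets(child)
        all_leaves |= child_leaves
        collected.extend(child_sets)
    collected.append(all_leaves)
    return all_leaves, collected


def forms_clan(tree, group):
    """True if one branch separates group from all other leaves."""
    group = frozenset(group)
    all_leaves, subsets = leaf_sets(tree)
    # An empty group or the complete leaf set is not a meaningful split.
    if not group or not group < all_leaves:
        return False
    # The branch may be drawn with the group below it or above it,
    # depending on the rooting, so test the group and its complement.
    return group in subsets or (all_leaves - group) in subsets


# The same hypothetical unrooted tree, drawn with two different roots.
drawn_a = ((("N_a", "N_b"), "N_c"), ("I_d", ("I_e", "I_f")))
drawn_b = ("N_a", "N_b", ("N_c", ("I_d", ("I_e", "I_f"))))
intein_free = {"N_a", "N_b", "N_c"}
print(forms_clan(drawn_a, intein_free), forms_clan(drawn_b, intein_free))
```

Testing the complement as well is what makes the answer independent of the Re-root button.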

Finished?


Go to HuskyCT and start the midterm.