Assignment for Friday's class:
- Go through the BLAST slides (classes 9 and 10)
- Think about how you will transfer files back and forth from the cluster
Assignment for Monday:
- Consider the different processes that could force recombination events to occur mainly between points equidistant from the origin of replication: (A) recombination occurs at the same time as replication; (B) could architecture imparting sequences (AIMS) be responsible?
- Gene plots for comparison between different Aeromonas species are here, here and here. Which recombination events could have created these patterns?
Review
- How can one assess the number of false positives in a BLAST search?
- How can one assess the number of false negatives in a BLAST search?
- Is the E-value of a match independent of the size of the databank?
- If you select two sequences at random and test the significance of their similarity, what does the E-value signify? Is this the same as the P-value?
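As a starting point for the last two review questions, here is a minimal sketch in Python (not one of the course's Perl scripts). It assumes the standard Karlin-Altschul picture, in which the number of chance hits at or above a given score is Poisson distributed with mean E, so that P = 1 - exp(-E), and in which E grows in proportion to the size of the search space. The function names are made up for illustration, and real BLAST output can deviate from the simple linear rescaling because of effective-length and composition corrections.

```python
import math

def p_value_from_e(e_value):
    """P(at least one chance hit this good), assuming a Poisson count with mean E."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)

def rescale_e_value(e_value, old_db_size, new_db_size):
    """Approximate E-value of the same alignment in a databank of a different size.

    Assumption: E is proportional to the total databank length, which
    ignores BLAST's effective-length corrections.
    """
    return e_value * new_db_size / old_db_size

for e in (1e-5, 0.01, 1.0, 10.0):
    print(f"E = {e:g}  ->  P = {p_value_from_e(e):.4g}")

# The same hit looks ten times less significant in a tenfold larger databank.
print(rescale_e_value(0.001, 1_000_000, 10_000_000))
```

For small values E and P are nearly identical; they diverge as E approaches 1 and beyond, because P can never exceed 1 while E can.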
Brief review on HGT and the coral of life -- see additional slides here
Either:
What can give rise to recombination events occurring mainly between points that are equidistant from the origin of replication?
- See Collins Tillier
- Strand bias (slides here)
Or:
- Examples of simple Perl scripts:
- Perl script to extract top scoring hits is here (pdf). Extra credit: how would you modify the script to print out not only the first reported match for a query, but all hits whose E-values are as good as that of the first one? Note: in Perl the operator for "logical and" is &&, "not equal" for a number is !=, and "not equal" for a string is ne.
- Perl script to replace gi number with position on the genome is here (pdf) (faa_sample, feature_table sample).
- Perl script to make gnuplot plots is here (pdf); output for Thermotoga maritima vs Th. petrophila is here; gene plots for comparison between different Aeromonas species are here, here and here. Discuss: Which recombination events could have created these patterns?
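For orientation, the idea behind the first script can be sketched as follows. This is a Python illustration, not the course's Perl script. It assumes tab-separated BLAST output (as produced by -outfmt 6, with the query ID in column 1, the subject ID in column 2 and the E-value in column 11) and that the hits for each query are listed best first. The function name and the sample lines are invented for the example. It keeps only the first reported hit per query, so the extra-credit modification is left open.

```python
def top_hits(lines):
    """Return {query_id: fields of the first reported hit for that query}."""
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query = fields[0]
        if query not in best:  # the first hit listed for a query is kept
            best[query] = fields
    return best

# Made-up sample in 12-column tabular format:
# query, subject, %identity, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, E-value, bit score
sample = "\n".join([
    "geneA\thit1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370",
    "geneA\thit2\t80.0\t190\t38\t0\t1\t190\t9\t198\t1e-40\t160",
    "geneB\thit3\t75.2\t150\t37\t1\t4\t153\t2\t150\t2e-30\t125",
])

for query, fields in top_hits(sample.splitlines()).items():
    print(query, fields[1], fields[10])
```

A dictionary keyed on the query ID plays the same role a Perl hash would play here.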
If time: