Computer lab 14
Goals
- Be able to use figtree to produce good looking and informative trees.
- Have all your questions answered. (see slides)
Class 26 (12/8)
Goals
- Be prepared for the final
- Let the instructor know by email to gogarten@uconn.edu, if you will take the exam at the Center for Students with Disabilities!
- Do the SETs, please!
Links
Assignments for Friday (12/10)
- Complete Take-Home Exam #8 before Friday 10am
- Your notebooks are due before the final on Monday
- Let the instructor know by email to gogarten@uconn.edu, if you will take the exam at the Center for Students with Disabilities!
Class 25 (12/6)
Goals
- Appreciate the power of scripts and R
- Be able to consider the relationship between genetic drift, population size, and the accumulation of "junk DNA"
- Be aware of the discussion of "function" assigned to intergenic DNA by the ENCODE project
- Know a few examples for constructive neutral evolution
- Know about split inteins and what they are doing
- Appreciate the difficulties of using bootstrap support values as more sequences are added to an analysis
Links
Slides on Lab13, Bipartition paradox, Constructive Neutral Evolution.
Assignments for Wednesday 12/8
- Work on Take home exam #8
- Bring questions for Review session on Wednesday
- Complete SET survey
Computer lab 13
Goals
- Appreciate the power of scripts to run programs repeatedly using different input files
- Appreciate the power of scripts to extract information from the verbose output of a program
- Know how simple statistics on all protein sequences present in a genome can inform on the physiology and adaptations of an organism
- Know that analysis of the genome wide composition of proteins can provide information on optimal growth temperature and adaptation to growth at high salt concentrations.
Assignments for Monday 12/6
- Read through take home exam #8
- Read through slides on introns early versus late.
Class 24 (12/1)
Goals
- Recall the different types of introns
- Be aware that in animals and plants most genes contain more intron than exon sequences
- Know about the introns early versus introns late debate
- Know about the supporting evidence for both sides
- Know how Go plots are created, and how they are used to define protein structure building blocks
- Know why the finding of the intron in Triose Phosphate Isomerase encoding gene from Culex is not an argument for introns early
- Know about the roles introns play in exon shuffling and in the nonsense-mediated mRNA decay pathway
- Appreciate the capabilities of figtree to beautify trees
Links
Slides on Figtree, Introns, Introns early versus late, and isoelectric points.
Assignments for Friday 12/3
Assignments for Monday 12/6
- Read through take home exam #8
- Read through slides on introns early versus late.
Class 23 (11/29)
Goals
- Know about the progress in sequencing technology.
- Understand the different approaches in next-generation sequencing and their advantages and disadvantages.
- Know about the dilemma that GTAs pose for evolutionary and micro-biologists.
Links
Slides on DNA sequencing, Assembly, and the possible role of GTAs.
Assignments for Wednesday 12/1
Computer lab 12
Goals
- Know how to do a PSI Blast search using the web interface
- Know that the construction of a PSSM can be done with a different (preferably large and diverse) databank than the final search.
- Be able to save a PSSM from a web based or a command line blast search.
- Execute PSI-blast searches, and different blast searches using pre-calculated PSSMs
- Appreciate the differences between blastp, psiblast, and using tblastn with a PSSM
- Know that tblastn searches targeting proteins encoded by selfish genetic elements usually give many more results than searches of the annotated proteins.
Class 22 (11/17)
Goals
- Know that PSIblast lowers the frequency of false negatives as compared to a normal blast search
- Know the underlying principle of the iteration in PSI blast
- Know what PSI and PSSM stand for
- Understand what PSI blast and Hmmer searches might be used for
- Be aware that making a PSSM and using it for a search can use different databases.
- Understand why gene sharing via GTAs is not considered an evolutionarily stable strategy
- Appreciate that Y-chromosome Adam and Mitochondrial Eve were not the only contributors to the gene pool of modern humans
- Understand that the approximately 20,000 compatriots of mitochondrial Eve contributed their genetic information to today's humans (but because of recombination these contributions cannot be traced back in the genealogy).
- Understand the multiple roots of the modern human populations
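The idea behind a PSSM (position-specific scoring matrix) can be illustrated with a small sketch. This is a deliberate simplification and not how PSI-BLAST computes its matrix: it assumes a uniform background frequency and a fixed pseudocount, whereas PSI-BLAST uses sequence weighting and data-dependent pseudocounts. The function names `build_pssm` and `score` and the toy alignment are made up for illustration.

```python
import math
from collections import Counter

def build_pssm(aligned, alphabet, pseudocount=1.0):
    """Return one dict of log-odds scores per alignment column.

    Assumptions: gap-free sequences of equal length, a uniform background
    frequency, and a fixed pseudocount added to every residue count.
    """
    background = 1.0 / len(alphabet)
    pssm = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        pssm.append({
            residue: math.log2(((counts[residue] + pseudocount) / total) / background)
            for residue in alphabet
        })
    return pssm

def score(pssm, segment):
    """Sum the position-specific scores for a segment as long as the PSSM."""
    return sum(column[residue] for column, residue in zip(pssm, segment))

# Toy nucleotide example; a protein PSSM works the same way with 20 letters.
toy_pssm = build_pssm(["ACG", "ACG", "ACT"], "ACGT")
print(round(score(toy_pssm, "ACG"), 2), round(score(toy_pssm, "TTA"), 2))
```

A residue that is frequent in a column receives a positive score at that position only, which is what lets a profile search pick up distant homologs that a single-query search misses.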
Links
- Slides on PSI-blast and Human Evolution.
Assignments for Friday 11/19
Class 21 (11/15)
Goals
- Know about archaic admixtures to the genome of modern humans
- Appreciate the many contributions horizontal gene transfer makes to the evolution of organisms.
- Know about the relationship between bacteriophages and GTAs
- Know that many genes that have not made a functional contribution to the organisms nevertheless show a low level of purifying selection.
- Understand how recombination accelerates evolution and prevents the accumulation of deleterious mutations.
Links
- Slides on Introgression, HGT and GTAs.
- Additional slides on the use of leeches in medicine. These are optional; don't look at them if you are squeamish vis-à-vis blood and injuries
Assignments for Wednesday 11/17
Read through the excellent articles on Denisovans and on Interbreeding between archaic and modern humans on Wikipedia
Computer lab 11
Goals
- Understand and be able to apply and evaluate models that test for dN/dS ratios for individual codons.
- Be able to use Tracer to evaluate estimated parameters and their Highest Posterior Density (HPD) intervals from MrBayes runs.
- Highlight sites that are neutral and those that are probably under diversifying selection in protein structures displayed in chimera.
Assignment for Monday (11/15)
- Read the review by Lang and Beatty on GTAs. This link works from inside the University. Let me know if it does not work for you.
Optional additional reading: A more recent review on "The Distribution, Evolution, and Roles of Gene Transfer Agents in Prokaryotic Genetic Exchange" is in the Reading Materials
- Make a contribution on the discussion board regarding the question "Are some scientific facts better left uncovered?" Use the WSF on Bruce Lahn as a starting point (here or in Reading Materials.) This is a difficult topic, be courteous; there may be good arguments on both sides.
Class 20 (11/10)
Goals
- Know that because of the redundancy of the genetic code some mutations do not change the sequence of the encoded protein. These mutations are termed synonymous.
- Know that under conditions of strict neutral evolution synonymous and non-synonymous substitutions would occur at the same rate.
- Know that dN/dS <1 reflects that selection has worked to remove some non-synonymous substitutions.
- Know that diversifying selection dN/dS >1 acts on sites in virus genomes that are recognized by the immune system.
- Know that the dN/dS>1 approach that can be used to detect positive selection is often difficult to apply in case the alignment is unreliable (which results in overestimating non-synonymous substitutions).
- Understand and appreciate Walter Fitch's contribution to identifying strains that are likely the parents of next year's influenza outbreak
- Know that the absence of SNPs in an allele (or surrounding an allele) can be caused by a selective sweep that erases the diversity around the site being selected for.
- Be clear about what Bruce Lahn's studies of alleles for brain development genes show and do not show
- Think about what this means for science?
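The distinction between synonymous and non-synonymous changes can be made concrete with a small sketch. It assumes the standard genetic code and only classifies a single codon change. A real dN/dS estimate additionally normalizes by the number of synonymous and non-synonymous sites and corrects for multiple substitutions, which this sketch does not attempt. The function name `classify_substitution` is made up for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(codon_before, codon_after):
    """Label a codon change by its effect on the encoded amino acid."""
    if codon_before == codon_after:
        return "identical"
    if CODON_TABLE[codon_before] == CODON_TABLE[codon_after]:
        return "synonymous"
    return "non-synonymous"

print(classify_substitution("CTT", "CTC"))  # both codons encode leucine
print(classify_substitution("GAA", "GTA"))  # glutamate to valine
```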
Links
- Slides on Detecting positive and diversifying selection, possibly pan-genome definitions.
Assignment for Friday (11/12)
- Go through slides on dN/dS
- Check out take home exam #6
Class 19
Goals
- Know the different approaches to combine genes into a genome/species tree (supermatrix versus supertree)
- Know what the terms positive, negative, and neutral selection mean and the frequently used synonyms for these terms.
- Be able to discuss the terms positive and diversifying selection.
- Realize the limitations of natural selection with respect to mutations that increase the fitness of an individual.
- Know that genes under positive selection are fixed in a population in much shorter time intervals compared to selectively neutral mutations.
Links
- Slides on Intro population genetics and slides on discussion Super-Trees versus Super-Matrices.
Assignments
- Read this article on selective sweeps
- Complete the exploration of population genetic simulations at http://www.radford.edu/~rsheehy/Gen_flash/popgen/ (you might need to enable flash in your browser).
- Using the same fitness and frequency for the A1 and A2 alleles, explore the impact of population size on drift (w11 = 1; w12 = 1; w22 = 1). [Use a population size of 100, 500 generations, an initial frequency of 0.5, and 5 populations.]
- For the same population size (100), explore settings that reflect balancing selection (w11 = 0.9; w12 = 1; w22 = 0.9). Compare these results to the above.
- What happens if you decrease the population size? (Be drastic: choose 20 or 10.)
- Use a small initial frequency of allele 1 (a frequency of 0.01 in a population of 50, i.e., you have one allele that conveys a 5% fitness advantage; w11 = 1; w12 = 0.95; w22 = 0.9). Perform several simulations. (You need to look closely to see that in the majority of the populations the A1 allele, the one that conveys the 5% advantage, goes extinct quickly.) Note that this is very different from a population of infinite size.
- What does this suggest for the effectiveness of natural selection?
- Does natural selection acting on a single advantageous allele work better in a large population? To explore this, increase the population size and lower the frequency so that you still have one beneficial allele in the population (e.g., to have one beneficial allele in a population of 500 individuals, its frequency is 0.001).
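If the Flash applet does not run in your browser, the same kind of experiment can be approximated with a few lines of code. The following is a minimal sketch of a Wright-Fisher model with selection, not a reimplementation of the applet: it assumes diploid individuals (so 2N allele copies are sampled each generation), non-overlapping generations, and no mutation. The function name `wright_fisher` is made up; the fitness parameters mirror w11, w12, and w22 from the exercise.

```python
import random

def wright_fisher(pop_size, freq, w11, w12, w22, generations, rng):
    """Return the frequency of allele A1 after each generation.

    pop_size is the number of diploid individuals, so 2 * pop_size allele
    copies are drawn each generation.
    """
    trajectory = [freq]
    for _ in range(generations):
        q = 1.0 - freq
        # Selection: expected A1 frequency after weighting genotypes by fitness.
        mean_w = freq * freq * w11 + 2 * freq * q * w12 + q * q * w22
        expected = (freq * freq * w11 + freq * q * w12) / mean_w
        # Drift: binomial sampling of 2N allele copies.
        copies = sum(rng.random() < expected for _ in range(2 * pop_size))
        freq = copies / (2 * pop_size)
        trajectory.append(freq)
        if freq in (0.0, 1.0):  # allele lost or fixed, nothing more can change
            break
    return trajectory

# Neutral drift (all fitness values equal) in five populations per size.
rng = random.Random(1)
for n in (10, 100, 500):
    finals = [wright_fisher(n, 0.5, 1, 1, 1, 500, rng)[-1] for _ in range(5)]
    print(n, finals)
```

To mimic the single beneficial allele from the exercise, call `wright_fisher(50, 0.01, 1, 0.95, 0.9, 500, rng)` repeatedly and count how often the final frequency is 0.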
Computer lab 10
Goals
- Know about the sump and sumt commands in MrBayes
- Understand why the burnin should be excluded from the analysis
- Know how to read the bipartition tables and trees created by the sumt command
- Know how to evaluate the .p-files with respect to high probability intervals for parameters
- Understand why less selection pressure often leads to a higher frequency of A and T in a DNA sequence
- Appreciate how total Tree Length and the shape parameter describing Among Site Rate Variation can inform about selection pressures acting on a sequence.
Assignment for Monday (11/8)
- Take home exam is due
- Browse through the Wikipedia entry on genetic drift.
Class 18
Goals
- Appreciate the ease with which Bayesian approaches can estimate ranges for model parameters
- Know that mutation bias tends to increase the AT content of sequences
- Less purifying selection acting on a sequence is reflected in higher substitution rates, and in less extreme ASRV
- Be aware that the species TREES concept is problematic
Links
- Slides for an ebola virus interlude
- Slides on phylogenetic reconstruction using MrBayes and discussion of species gene tree conflicts
Assignment for Friday (11/5)
Read through take-home exam#5 - ask questions if things are not clear.
Assignment for Monday (11/8) and Wednesday (11/10)
- Take home exam is due
- Browse through the Wikipedia entry on genetic drift.
- Explore the population genetic simulations at http://www.radford.edu/~rsheehy/Gen_flash/popgen/ (you might need to enable flash in your browser).
- Using the same fitness and frequency for the A1 and A2 alleles, explore the impact of population size on drift (w11 = 1; w12 = 1; w22 = 1). [Use a population size of 100, 500 generations, an initial frequency of 0.5, and 5 populations.]
- For the same population size (100), explore settings that reflect balancing selection (w11 = 0.9; w12 = 1; w22 = 0.9). Compare these results to the above.
- What happens if you decrease the population size? (Be drastic: choose 20 or 10.)
- Use a small initial frequency of allele 1 (a frequency of 0.01 in a population of 50, i.e., you have one allele that conveys a 5% fitness advantage; w11 = 1; w12 = 0.95; w22 = 0.9). Perform several simulations. (You need to look closely to see that in the majority of the populations the A1 allele, the one that conveys the 5% advantage, goes extinct quickly.) Note that this is very different from a population of infinite size.
- What does this suggest for the effectiveness of natural selection?
- Does natural selection acting on a single advantageous allele work better in a large population? To explore this, increase the population size and lower the frequency so that you still have one beneficial allele in the population (e.g., to have one beneficial allele in a population of 500 individuals, its frequency is 0.001).
Class 17
Goals
- Know the many reasons as to why gene and species trees might differ.
- Know the similarity and differences between parsimony and maximum likelihood based phylogenetic reconstruction
- Understand the differences between Maximum Likelihood Estimation and Bayesian approaches to phylogenetics.
- Understand the principle of obtaining posterior probabilities through MCMCMCs
Links
- Slides on phylogenetic reconstruction - Continued from class 16
Assignment for Wednesday (11/3)
Play with Paul Lewis's MCRobot. Explore a differing number of heated chains, and different probability landscapes. https://plewis.github.io/applets/mcmc-robot/
Work through Olga's webpage giving an example of Bayesian thinking
Computer lab 9
Goals
- Know how to do a maximum likelihood ratio test
- Know that Maximum Likelihood approaches allow one to avoid overparameterization.
- Know how different substitution models are defined.
- Know how to run iqtree on xanadu (the latest version is invoked with iqtree2, after you loaded the module).
- Know what the term Long Branch Attraction Artifact refers to, and that more sophisticated ML approaches do much better than simple parsimony analysis at avoiding LBA (it remains a sad realization that sometimes nature apparently does not follow Ockham's razor).
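The arithmetic of a likelihood ratio test for two nested models can be sketched as follows. The test statistic is twice the difference in log-likelihood, compared against a chi-square distribution whose degrees of freedom equal the number of additional free parameters. To stay within the standard library, this sketch only handles 1 or 2 degrees of freedom, for which the chi-square tail has a closed form. The log-likelihood values are invented for illustration and do not come from any lab data set.

```python
import math

def lrt_statistic(lnl_simple, lnl_complex):
    """Twice the log-likelihood gain of the more parameter-rich nested model."""
    return 2.0 * (lnl_complex - lnl_simple)

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), for df = 1 or 2 only."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")

# Hypothetical log-likelihoods: the complex model has one extra parameter.
statistic = lrt_statistic(-1000.0, -995.0)
print(statistic, chi2_upper_tail(statistic, df=1))
```

A small p-value indicates that the extra parameter improves the fit more than expected by chance; a large one suggests the simpler model is adequate, which is how the test guards against overparameterization.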
Assignment for Monday (11/1)
- Take the quiz from the Tree Thinking Challenge
- For additional motivation on the importance of molecular phylogenies meditate about the current pandemic. A nice interactive tree is here. There is a play button, and you can select different classification schemes.
Class 16
Goals
- Know the principle behind parsimony analysis and Occam's razor (or Ockham's razor, aka lex parsimoniae)
- Know the similarity and differences between parsimony and maximum likelihood based phylogenetic reconstruction
- Understand the differences between Parsimony, and Maximum Likelihood Estimation
- Know what non parametric bootstrapping is, and how it can be applied to many different approaches of phylogenetic analysis
- Know that swapping branches around a node does not change the meaning of a phylogenetic tree
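The resampling step of non-parametric bootstrapping can be sketched in a few lines: alignment columns are drawn with replacement until the pseudo-replicate has as many columns as the original. The tree reconstruction that would follow for each replicate is not shown. The function name `bootstrap_alignment` and the toy alignment are made up for illustration.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate built by sampling columns with replacement.

    alignment maps sequence names to aligned sequences of equal length.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(sequence[c] for c in columns)
        for name, sequence in alignment.items()
    }

toy = {"taxon1": "ACGTAC", "taxon2": "ACGAAC", "taxon3": "TCGAAT"}
print(bootstrap_alignment(toy, random.Random(42)))
```

Because the same column indices are used for every sequence, each resampled column is still a column of homologous sites; only the weight given to each site changes between replicates.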
Links
- Slides Intro to phylogenetic reconstruction.
- Slides for an ebola virus interlude
Assignment for Friday
- If you have problems wrapping your head around non-parametric bootstrapping, watch this YouTube video (warning some may find the sound track a little strange)
Class 15
Goals
- Know the difference between local and global alignments
- Understand how dynamic programming can guarantee an alignment with an optimal alignment score in case of a pairwise alignment.
- Understand the principle of the progressive alignment approach and the potential downstream problems caused by this analysis.
- Appreciate that multiple sequence alignments can have different goals: pleasing to the human observer; matching sites that in the 3D structure occupy the corresponding location; be certain that alignment columns only contain homologous sites (else align them to gaps).
- Know about different alignment programs (clustalw, Muscle, SATè, PRANK, MAFFT)
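How dynamic programming guarantees an optimal score can be seen in a compact sketch of global (Needleman-Wunsch style) alignment. Each cell holds the best score for aligning two prefixes, and it is computed from only three neighboring cells. The scoring values (match +1, mismatch -1, linear gap penalty -2) are arbitrary assumptions chosen for illustration, and the sketch returns only the score, not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]

print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "AGT"))
```

A local alignment differs mainly in that cell values are not allowed to drop below zero and the best score is taken from anywhere in the matrix rather than from the last cell.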
Links
- Slides on sequence and multiple sequence alignments (MSAs)
- If time:
- Slides Intro to phylogenetic reconstruction.
Assignment for Wednesday (10/27)
- Read excerpts of Chapters 5 and 6 from Li's "Molecular Evolution" on HuskyCT
- Read through the Wikipedia entry on Occam's razor
Computer lab 8
Goals
- Be able to perform dot plot alignments in Gepard
- Know how to zoom in on to parts of the dotplot and inspect sequence windows corresponding to matches
- Know how gaps, insertions, introns/inteins, repeated domains, and regions of low complexity reveal themselves in a dotplot analysis.
- Be able to load sequences into SEAVIEW, define sets of sites, and build phylogenetic trees.
- Be able to compute a codon based nucleotide sequence alignment in seaview.
MIDTERM
Class 14
Goals
- understand gene plots and strand bias plots from comp. lab 7
- be ready for the midterm
Links
Slides on strand bias and gene plots // answers to additional study questions.
Assignment for Wednesday (10/20)
Assignment for Friday (10/22)
- Refresh your memory on dotplots
- read through these Slides on sequence and multiple sequence alignments (MSAs)
Computer lab 7
Goals
- Analyze Strand Bias and see its connection to bacterial genome architecture.
- Become familiar with the commandline, and moving files to and from the bioinformatics cluster
- Make searchable libraries from multiple sequence files
- understand and execute scripts that modify files and to parse and plot blast search results
- Understand and be able to produce gene plots.
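One common way to look at strand bias is GC skew, (G - C) / (G + C), computed in windows along the genome; the cumulative skew typically changes direction near the origin and terminus of replication. The sketch below assumes that measure, which may differ from the statistic used in the lab scripts. The function names and the window size are made up for illustration.

```python
def gc_skew(sequence, window):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows."""
    skews = []
    for start in range(0, len(sequence) - window + 1, window):
        chunk = sequence[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews

def cumulative(values):
    """Running sum, the quantity usually plotted against genome position."""
    total, running = 0.0, []
    for value in values:
        total += value
        running.append(total)
    return running

print(cumulative(gc_skew("GGGGCCCCGGGC", 4)))
```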
Assignment for Monday (10/18)
- Go through the list of goals on this page, and if you are not sure that you reached the goal, look through the slides from the class, and if this does not help, formulate a question you might want to ask in class on Wednesday.
Class 13
Goals
- Know the different pathways through which gene families can expand in a genome
- Know about the fate of duplicated genes
- Understand that gene duplication followed by gene loss may be important in erecting post mating hybridization barriers.
- Know what gaps, insertions, exons/introns, repeated domains, and regions of low complexity look like in a dotplot analysis.
Links
- Slides on gene duplications
- Postponed to after the midterm -- Slides on sequence and multiple sequence alignments (MSAs)
Assignment for Friday (10/15)
- Review the slides on cladistics (class 12)
- Work on take-home exam #4, which is due on Friday morning
- Complete the first part of computer lab 6. (See today's second lecture and Computer Lab 6. If the line by line execution gives you too much grief, try to complete at least one of the histograms in Excel.)
Class 12
Goals
- Understand that %identity is not a good choice to assess significant sequence similarity.
- Understand the terminology used in cladistics
- Understand the concerns about not considering paraphyletic groups as proper taxonomic units
- Know why fish do not exist
- Contemplate the utility of a natural taxonomic system in light of endosymbiosis and HGT
Links
Slides on Cladistics
Slides on Photosynthesis and Computer Lab 6
Assignment for Wednesday (10/13)
- Read through take-home exam #4. A lot of the questions are on terminology, thus it makes a good training set for today's lecture.
- Try to wrap your head around the discussion of shared derived vs shared primitive characters (see the Wikipedia entry on cladistics): Ashlock (Ernst Mayr, Lynn Margulis, and others) versus Hennig (Woese, Pace, and others). For discussion see here, here and here; Jan Sapp's review of the history is here.
The heat of this controversy is reflected in the following excerpt:
"Oddly, the school of ‘phylogenetic systematics’ founded by Hennig (1966) grossly downplayed the phylogenetic importance of progressive change compared with splitting, seen by them as so all-important that many Hennigian devotees dogmatically insist that ancestral groups like Bacteria, Protozoa and Reptilia be banned. Hennig called such basal groups with a monophyletic origin ‘paraphyletic’ and redefined monophyly to exclude them and embrace only clades, likewise redefined as including all descendants of their last common ancestor. This redefinition of ‘clade’ is universally accepted, but Hennig's extremely confusing and unwise redefinition of monophyly is not. Though accepted by many, sadly probably the majority (especially the most vociferous and over self-confident, and those fearful of bullying anonymous referees, of whom I have encountered dozens mistakenly insisting without reasoned arguments that paraphyletic taxa are never permissible), it is rightly firmly rejected by evolutionary systematists who consider the classical distinction between polyphyly and paraphyly much more important than distinguishing two forms of monophyly (paraphyly and holophyly, using the precise terminology of Ashlock (1971), where holophyletic equals monophyletic sensu Hennig)."
Assignment for Friday 10/15
Complete the first part of computer lab 6. (See today's second lecture and Computer Lab 6. If the line by line execution gives you too much grief, try to complete at least one of the histograms in Excel.)
Computer lab 6
Goals
- Understand how to run a blast search from the command-line.
- Use simple unix commands (cat xxx yyy > zzz)
- Be able to create a searchable database from a multiple fasta sequence file
- Know about different output formats for command-line blast, and be able to import blast search results into an Excel spreadsheet.
- Understand that %identity is a horrible choice to assess significant sequence similarity.
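As an alternative to a spreadsheet, tabular BLAST output can be read with a short script. The sketch assumes the search was run with `-outfmt 6` and the default twelve columns; if you requested custom columns, the list of names has to be adjusted. The two hits in the example are invented to illustrate the point about %identity and are not real search results.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")

def read_blast_table(handle):
    """Parse tab-separated BLAST hits into a list of dicts with typed values."""
    hits = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, fields))
        for name in FLOAT_COLUMNS:
            hit[name] = float(hit[name])
        for name in INT_COLUMNS:
            hit[name] = int(hit[name])
        hits.append(hit)
    return hits

# Invented hits: a short match with 100% identity and a long one with 32%.
example = (
    "query1\tshortHit\t100.0\t8\t0\t0\t10\t17\t40\t47\t9.5\t18.0\n"
    "query1\tlongHit\t32.0\t310\t200\t6\t1\t305\t4\t312\t3e-40\t150.0\n"
)
for hit in sorted(read_blast_table(io.StringIO(example)), key=lambda h: h["evalue"]):
    print(hit["sseqid"], hit["pident"], hit["length"], hit["evalue"])
```

Sorting by E-value rather than by %identity puts the long, moderately similar match first, which is the behavior you want when judging significance.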
Assignments for Monday (10/11)
Class 11 (10/6)
Goals
- Understand genome structure of prokaryotic genomes (Ori, leading/lagging strand, terminus of replication).
- Know two explanations for the preponderance of recombination events between points equidistant to the origin of replication:
- co-occurrence of recombination with replication;
- Architecture Imparting Sequences (AIMS) and strand bias are not disrupted/misplaced (i.e., these are the only recombination events that do not lead to a drop in fitness)
- Appreciate the power of hashes as on the fly counters and as flexible data structures to associate keys with values.
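The "hash as an on-the-fly counter" idea carries over directly to other languages. Here is a minimal sketch in Python, where the equivalent of a hash is a dictionary: keys are created the first time they are seen, so no list of possible keys is needed in advance. The function name `count_kmers` is made up for illustration.

```python
from collections import defaultdict

def count_kmers(sequence, k):
    """Count every substring of length k; keys appear as they are encountered."""
    counts = defaultdict(int)  # missing keys start at 0
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return dict(counts)

print(count_kmers("GATTACAGATTACA", 2))
```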
Links
Slides on Strand Bias, Recombination and AIMS
Assignment for Today (10/6)
- An electronic copy of your Notebook is due today!
Assignments for Friday (10/8)
- Refresh your memory on Filezilla, PuTTY, and Xanadu
Class 10 (10/4)
Goals
- Know about the debates concerning the hot origins of life.
- Understand the arguments for the domain ancestors being survivors of a catastrophe (impact?) that selected for thermophily
- appreciate that early Earth was a more violent place than today's (no warm little pond, but constant "Tsunamis")
- know about the arguments in favor of a vibrant biosphere being present already 3.8 Ga BP
- appreciate the difficulties of interpreting ancient microfossils.
- know that the temperature under which an organism lives is reflected in sequence composition of proteins and RNAs
- Understand the transitive property of homology and its limitation
Links
Slides on the transitive property of homology, the hot origin of life, and the early heavy bombardment.
Assignments for Wednesday 10/6
Comp Lab 5 (10/1)
Goals
- Learn how to log into the xanadu cluster
- Use simple unix commands (ls, cd, pwd, cat, more, less)
- learn how to redirect the output of a unix command to a file
- Use FileZilla to connect to the transfer node at Xanadu
- Perform pairwise sequence comparisons with PRSS and blast
- Appreciate the advantage of uniprot90 and uniprot50 for some questions you might try to answer through a databank search.
- Be aware that homology does not always extend to the complete protein sequence of the query
- Use pairwise blast and interpret the dot-plot and alignment graphics.
The following pertain to the optional exercise, but the message of the exercise should have gotten across from the lecture
- Perform Databank searches using FASTA
- Appreciate the advantage of uniprot90 and uniprot50 for some questions you might try to answer through a databank search.
Assignments for Monday (10/4)
- Optional: Watch video on reconstructing the Tree of Life
- TREE of LIFE overview here, (optional: compare here, here and here (especially the discussion is interesting)).
- Read through Olga's Time-line of the Universe here
Class 9 (9/29)
Goals
- Know that BLAST is performing local alignments only
- Know that fasta searches align complete sequences.
- Know what false positives and false negatives are in relation to a BLAST search
- Appreciate the advantages of a command line interface over a Graphics User Interface
- Know a few commands from the unix command line (cd ls pwd man cat more)
Links
- Intro to blast and unix Slides
Assignments for Friday 10/1
- Take-home exam #2 is due on Friday
Class 8 (9/27)
Goals
- Know about Margaret Dayhoff's contributions to bioinformatics.
- Know the difference between PAM and Blosum matrices.
- Know what Dayhoff groups of amino acids are.
- Know how to measure whether sequences are significantly similar.
- Understand the difference and similarities between P and E values.
- Know about "usual" cut-offs for Z-scores, P- and E-values.
- Be able to discuss the processes that may lead to the decay of significance
- Know what fishing expeditions are about
- Know what the Bonferroni correction is, and why it is not popular.
- Know what false positives and false negatives are in relation to a databank search
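The relation between E-values and P-values, and the effect of a Bonferroni correction, can be checked numerically. The sketch assumes the usual Poisson approximation for database searches, P = 1 - exp(-E), where E is the expected number of chance hits at least as good as the one observed. The function names are made up for illustration.

```python
import math

def p_from_e(e_value):
    """Probability of at least one chance hit, given the expected number E."""
    return -math.expm1(-e_value)  # numerically stable form of 1 - exp(-E)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall error rate at alpha."""
    return alpha / n_tests

for e in (1e-10, 0.01, 1.0, 10.0):
    print(e, p_from_e(e))
print(bonferroni_threshold(0.05, 1000))
```

For small values E and P are nearly identical, while for large E the P-value saturates at 1. The Bonferroni threshold shrinks in proportion to the number of tests, which illustrates why the correction is unpopular: many true positives no longer pass.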
Links
- 2nd part of the Slides on Entrez, the origin of GenBank and Margaret Dayhoff
- Intro to statistics Slides
- Intro to blast and unix Slides
Assignments for Wednesday 9/29
- Take-home exam #2 is due on Friday
- Read http://en.wikipedia.org/wiki/Standard_score
- Understand the difference between false positives and false negatives (see error types)
- read through (meaning you don't need to struggle to understand the formulas, just get the overall gist of the article) Using BLAST to Teach “E-value-tionary” Concepts (ppts here).
Comp Lab 4:
Goals
- Know about bibliography software
- Know about the advantage of the databanks accessible through NCBI's Entrez.
- Be able to perform literature databank searches at Google scholar and pubmed
- Know how to retrieve full length manuscripts ;-)
- Appreciate usefulness of # of publications, # of citations, and the H-index
- Know how to access manuscripts similar to one that you know is relevant to you.
- Appreciate that GenBank is highly redundant
- Know that searches at the protein level are more effective than searches at the nucleotide level.
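The H-index mentioned above has a simple definition: the largest number h such that h publications have at least h citations each. A minimal sketch, with invented citation counts:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))
```

Note that a single very highly cited paper barely moves the index, which is one reason it should be read together with the number of publications and the total number of citations.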
Assignments for Monday's Class 8 (9/27)
Class 7 (9/22)
Goals
-
Understand the power and the limitations of the tree of life image.
-
Understand the relationship between the 3 domains of life, and how the tree of life was rooted.
-
Appreciate that the organismal tree is embedded in the tangled tree or network depicting genome evolution.
-
Know about the Entrez system at the NCBI
-
Know about the advantages and disadvantages of databanks with or without a gatekeeper.
Links
- continued Slides from class 6, also see updated intro from class 6a: regarding comp lab #3
- Slides on Entrez, the origin of GenBank and Margaret Dayhoff
Note for those doing a project
Slides with instructions for the first 3 steps are at https://j.p.gogarten.uconn.edu/mcb3421_2021/Independent_project_Step1to3.pptx
Assignments
for Friday's (9/24) Computer Lab
- Read through the file on frequently used formats here
- Explore the Genbank Sample file here
- Read through http://en.wikipedia.org/wiki/FASTA_format
- Refresh your memory on Boolean operators (AND, OR, NOT) to use in advanced database searches. Here is an explanation of the Boolean operators
Class 6 (9/20)
Goals
- Know that inteins can be associated with a strong selective disadvantage
- Know that a heterogeneous environment may allow for the long term persistence of parasites (and thus provide an alternative to the homing cycle).
- Understand the relation between substitutions and sequence divergence
- Know a few reasons why protein sequences work better to assess similarity than nucleotides
- Understand that only slow evolving genes that are under strong selection for function are suitable to trace early events in evolution.
- Know about the tree and coral metaphors to depict evolution
- Appreciate Lamarck's contribution to understanding evolution.
- Understand the contributions that Woese and Fox made to the classification of life, which molecule they used, and the domains (aka Urkingdoms) they discovered
- Understand the power and the limitations of the tree of life image.
- Understand the relationship between the 3 domains, and how the tree of life was rooted.
- Appreciate that the organismal tree is embedded in the tangled tree or network depicting genome evolution.
Links
- Slides on Inteins continued slide 28ff
- Slides class 6a: Comp Lab discussion, Assignments for today, Jukes Cantor model
- Slides on the tree of life in light of gene transfer, images to depict evolutionary history.
Assignments for class 7 (9/22)
Computer Lab #3 (9/17)
Goals
Know how to
- identify domains in multi domain proteins in chimera;
- inspect protein DNA interactions;
- identify the major and minor groove in a DNA molecule;
- create a multiple sequence alignment based on aligned structures in chimera
Class 5 (9/15)
Goals
- Appreciate the problems and limitations faced by attempts to define life
- Know who Lynn Margulis was.
- Understand the (outline of) Gaia hypothesis, and the problems it faces, and how the ITSNTS approach might overcome these.
- Understand the ATP occupancy in the subunits that form the hexamers in the F1 ATPases
- Know what inteins are and which enzymatic activities they have
- Know the scientific definition of symbiosis
- Know about the possible symbiotic relationships between organisms, genes, or protein domains
- Know the different phases of the homing cycle.
Links
Assignments
Assignment for Monday (9/20)
- Draw a sketch of the relation between the number of substitutions that occurred in evolution and the percent identity of the two sequences. (I.e., how does the observed similarity change as more and more substitutions occur?)
- What are the endpoints (saturation levels) for a 4-letter alphabet and for a 20-letter alphabet?
- How does this relationship change if some parts of the sequence are so important that the protein becomes non-functional if a mutation occurs in these positions (i.e., these parts of the sequence are never observed to undergo any change)?
- If substitutions were to occur at a rate of 10^-8 per year and per site, how long would it take for two sequences to be less than 50% identical? (Do a rough estimate ignoring multiple substitutions and back mutations.)
- If you were to do a realistic calculation and you were to consider a nucleotide sequence, how long would it take to arrive at 20% identity? (Tip: how similar are two random sequences that have not been aligned?)
(Note: answering these questions should not require the use of a calculator or a formula, just common sense.)
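After you have drawn your sketch, you can compare it with a simple model. The sketch below assumes an equal-rates model of the Jukes-Cantor type, in which every letter changes to every other letter at the same rate and all sites are free to vary. Here d is the expected number of substitutions per site separating the two sequences, and the function name `expected_identity` is made up for illustration.

```python
import math

def expected_identity(d, alphabet_size):
    """Expected fraction of identical sites after d substitutions per site
    separating two sequences, under an equal-rates (Jukes-Cantor type) model."""
    k = alphabet_size
    return 1 / k + (1 - 1 / k) * math.exp(-k * d / (k - 1))

print("d", "4 letters", "20 letters")
for d in (0, 0.1, 0.5, 1, 2, 5, 10):
    print(d, round(expected_identity(d, 4), 3), round(expected_identity(d, 20), 3))
```

The curve starts at 100% identity and flattens out at 1/k, the identity expected between two unrelated random sequences of the same composition.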
Assignment for Friday (9/17)
- go through the slides on inteins,
- watch the YouTube presentation.
Some were surprised by reading assignments that did not come with a link. These assignments are available at the huskyCT site. The easiest way to get to them is to use the course content tab on left and then select "Reading Materials". Send an email if you are not sure how to get there.
Class 4 (9/13)
Goals
- Understand that RNA can be both genetic material and catalyst
- Know items that support the RNA world concept, and difficulties faced by the RNA world
- Know that ATP binding domains can be of very different types, and what this means for our understanding of homology.
- Understand the problems and limitations faced by attempts to define life
- Know who Lynn Margulis was.
Links
- Slides on lab#2 ATP binding sites, convergent evolution, the RNA world.
- Slides on Life, Natural Selection, and Gaia
Assignment for Wednesday (9/15)
Read through the slides on Life, Natural Selection and Gaia. Try to contemplate the following:
- find arguments for and against a virus being considered alive.
- if being part of a group that can be subject to natural selection is a criterion for being alive, why should this not apply to computer life and computer viruses?
- does the stipulation of being a "chemical"-system restrict this to "life as we know it"?
- what argues against Traube's cells not being alive?
Read through take-home exam #1 - hopefully being posted on Tuesday.
Computer Lab #2 (9/10)
Goals:
- Have an understanding of the content of a protein data bank file
- Be able to save individual subunits into distinct pdb files
- Align structures of divergent proteins
- Use the structure based alignment to align the linear sequences
- Align structures of a catalytic subunit during the catalytic cycle
- Appreciate that even 80% sequence divergence can leave the protein structure very, very similar.
- Appreciate that for important proteins substitutions occur so rarely that proteins remain recognizably similar in structure AND sequence.
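In the lab, subunits are saved from within the structure viewer. To illustrate what happens underneath, here is a minimal Python sketch that groups the coordinate records of a PDB-format text by chain. It relies on the fixed-column PDB layout, in which the chain identifier of ATOM and HETATM records sits in column 22. The function name `split_pdb_by_chain` and the three toy records are made up for illustration; this is not the procedure used in the lab.

```python
from collections import defaultdict

def split_pdb_by_chain(pdb_text):
    """Return a dict mapping chain ID -> PDB text holding only that chain's
    ATOM/HETATM records. Header and other record types are dropped."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        # Column 22 (index 21) holds the chain identifier.
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    return {cid: "\n".join(recs) + "\nEND\n" for cid, recs in chains.items()}

# Toy input with two chains (coordinates are invented).
toy = "\n".join([
    "ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  MET A   1      12.560   6.200  -6.400  1.00  0.00           C",
    "ATOM      3  N   GLY B   1      20.000   1.000   2.000  1.00  0.00           N",
])
per_chain = split_pdb_by_chain(toy)
print(sorted(per_chain))
```

Each value in the returned dictionary could then be written to its own file, for example one file per chain ID.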
Class 3 (9/8)
Goals:
- The ATP synthase as a rotary motor (Yoshida's experiment, proteolipids)
- The role of gene duplication and sequence divergence in the evolution of proteins
- Know about the three domains of life (archaea, bacteria, eukaryotes) and how they are related to one another
- Appreciate that molecular evolution can study events that occurred before the last universal common ancestor
- Understand the role of ancient gene duplications in rooting the tree of life.
Links:
- Slides on ATPsynthase, ancient gene duplications and the Tree of Life
- Slides on Comp. Lab #1 and Assignments
- Slides on Homology (continued)
Assignment for Friday (9/10)
- Read through the slides on ATP synthesis and ancient gene duplications (above)
- If you have problems understanding the concept of chemiosmotic coupling, follow the links in the slides (they become clickable in presentation mode)
Assignment for Monday (9/13)
Computer Lab #1 (9/3)
Goals:
- Be able to launch chimera
- Display a 3D coordinate file from the pdb (1HEW) in chimera
- Use different display settings
- Display amino acid side chains in the binding pocket of 1HEW and study the interactions between the substrate and the binding pocket.
- Calculate a Ramachandran plot, and determine where in this plot alpha helices, beta sheets, and glycine residues fall.
- Save your work as image and project.
Assignments for Wednesday (9/8) see below
Class 2 (9/1)
Goals:
- Understand the concept of homology
- Understand that significant similarity between two primary protein sequences (that are not of low complexity) is a strong indication that the two sequences evolved from the same ancestral sequence.
- Know how the field of Bioinformatics is commonly "defined"
- Know what the terms replication, transcription, and translation refer to
- Know about primary, secondary and tertiary structure of proteins
Links:
Assignments for Friday (9/3):
Assignments for Wednesday (9/8)
Contemplate the following questions (see the slides on homology for inspiration):
- Are most proteins with similar function homologous?
- Are all proteins with similar function homologous?
- Are most proteins with significant sequence similarity homologous?
- Do most homologous proteins have significant sequence similarity?
- Do most homologous proteins have similar structure?
Try to answer the following questions:
- In your opinion, would maintaining a database on beetles that contains data on where each beetle was collected, its morphology, and where it is stored in the collection fall under bioinformatics?
- In your opinion, would determining the 3D structure of a protein using X-ray crystallography fall under bioinformatics?
- How many different proteins with length of 100 aa are theoretically possible?
- At most, how many aa substitutions does one need to turn one of these sequences into another one?
- Formulate a question that you could ask on Wednesday (things you didn't understand, anything you want to hear more details about).
- Read through the slides selected from Mark Gerstein's Bioinformatics Course.
Are there any items where you do not agree with Mark Gerstein's delineation?
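For the question above on the number of possible proteins of length 100 aa, the size of the answer is hard to picture. The short Python sketch below counts the sequences, assuming the 20 standard amino acids and that every position can be chosen independently; the variable names are only illustrative.

```python
# Number of distinct sequences of a given length over a given alphabet,
# assuming each position is chosen independently.
alphabet_size = 20     # standard amino acids (assumption)
length = 100           # residues per protein

n_sequences = alphabet_size ** length   # Python integers do not overflow
n_digits = len(str(n_sequences))

print(f"{alphabet_size}^{length} is a number with {n_digits} digits")
```

Changing `alphabet_size` to 4 gives the corresponding count for nucleotide sequences of the same length.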
Read the excerpt from Thomas Mann's book Dr. Faustus, available on HuskyCT or at https://www.fadedpage.com/showbook.php?pid=20180329 (go to chapter III). This chapter can provide two insights:
- Scientific experiments in parlors, salons and living rooms were frequent and common entertainment in the early 1900s.
- The distinction between living systems and the mineral world was not established. Apparently life could be easily created from non-living constituents. My favorite example is Traube's cells. In the past I did the experiment in class, but now you have to watch the YouTube version instead: How to grow an artificial cell from water and salts ("Traube Cell" experiment).
- The membranes that form were the starting point to build the first osmometer and an important step in the development of cell theory. They clearly are not alive, but they grow and do look a lot like red algae.
Ask a question (not limited to Dr. Faustus) on the HuskyCT discussion board
Class 1 (8/30)
Goals:
- Know how to contact the instructor and TA.
- Know how your performance will be assessed and graded.
- Know that take-home exams and computer lab assignments are an important part of this course, and that they will be graded.
- Know that you need to maintain an electronic notebook
Links:
Assignments for Wednesday (9/1):
Read through the Slides on Homology (https://j.p.gogarten.uconn.edu/mcb3421_2021/class01_2021_homology.pptx). Note: "Read through" is short for read it, but don't overdo the studying.
Make a decision on which type of electronic notebook you will use (OneNote versus Joplin), set it up, and write an entry for "the most interesting thing I learned in class 1".
Assignments for Friday (9/3):