Computer lab 14
Computer-lab assignment 14.docx _ as_pdf
Slides on Phamerator.
Goals
- Know how to use phamerator
-- select genomes
-- manipulate genome maps
- Recognize inteins in Phamerator-generated genome alignments
Review Session and possibly Lecture 27
Goals
- Be ready for the final
- Know who Felix d'Hérelle was
- Appreciate the images of genome comparisons created in Phamerator.
Assignments
- For Friday, recall which intein-containing gene you worked on in lab 4
- Study for the Final
- If you are interested, check out the Wikipedia site on Felix d'Hérelle.
Links
pdf of Exams 5-9 Sorry, not the greatest print-outs.
Training questions with answers
Slides on Felix d'Hérelle and Phamerator.
Lecture 26
Goals
- Consider the relationship between genetic drift, population size, and the accumulation of "junk DNA"
- Be aware of the discussion of "function" assigned to intergenic DNA by the ENCODE project
- Know a few examples for constructive neutral evolution
- Know about split inteins and what they are doing
Assignments
- Complete Take-Home Exam #9 before Wednesday 11.15am
- Have questions ready for review session on Wednesday
- Do the SETs, please!
- Update your class notebooks. If you do not use the shared OneNote notebook, send your notebook to the instructors on Friday.
Links
Slides on Constructive Neutral evolution, genetic drift and the origins of complexity.
Other recommended preparation:
- read through the list of goals of class 14 - 27. If you think you have not reached a particular goal, look through the slides, and if that does not help, ask a question on the discussion board.
- Go through the take home exams.
Computer lab 13
Computer-lab assignment 13.docx _ as_pdf
Goals
- Appreciate the power of scripts to run programs repeatedly using different input files
- Appreciate the power of scripts to extract information from the verbose output of a program
- Know how simple statistics on all protein sequences present in a genome can inform on the physiology and adaptations of an organism
- Know that analysis of the genome wide composition of proteins can provide information on optimal growth temperature and adaptation to growth at high salt concentrations.
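To make the idea of "simple statistics on all protein sequences" concrete, here is a minimal Python sketch. It is not the script used in the lab; the function names and the toy sequences are made up for illustration, and the acidic/basic summary is just one of many statistics one could compute.

```python
from collections import Counter

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

def proteome_composition(records):
    """Return the relative frequency of each amino acid over all proteins."""
    counts = Counter()
    for _, seq in records:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

# Toy input standing in for a real proteome file (hypothetical sequences).
toy = [">p1", "MDEEK", ">p2", "DDEKR"]
freqs = proteome_composition(read_fasta(toy))
acidic = freqs.get("D", 0) + freqs.get("E", 0)
basic = freqs.get("K", 0) + freqs.get("R", 0)
print(f"acidic (D+E): {acidic:.2f}  basic (K+R): {basic:.2f}")
```

For a real genome you would pass an open file handle to read_fasta instead of the toy list, and loop over several proteome files to compare organisms.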
Assignments for Monday
- Read through take home exam #9
- refresh your memory on genetic drift and constructive neutral evolution
Lecture 25
Goals
- Know about the roles introns play in exon shuffling and in the non-sense mediated mRNA decay pathway
- Appreciate the power of scripts to perform repetitive tasks
- Appreciate the power of scripts to run programs repeatedly and to extract information from the verbose output of a program
- Know how simple statistics on all protein sequences present in a genome can inform on the physiology and adaptations of an organism
- Know that analysis of the genome wide composition of proteins can provide information on optimal growth temperature and adaptation to growth at high salt concentrations.
Links
Slides The benefits of spliceosomal introns. The power of simple scripts: genome wide theoretical IEP calculation.
Assignments for Friday
Assignments for Monday
Take home exam #9
Lecture 24
Goals
- Appreciate the capabilities of figtree to beautify trees and to create publication ready (nearly) images of phylogenies
- Appreciate the pan-genome as a genetic resource for the population (homeoalleles, weakly selected functions, black queen hypothesis)
- Know a few examples of new pathways created through gene transfer.
- Know what the terms metagenome, pan-genome and microbiome refer to.
- Recall the different types of introns
- Be aware that in animals and plants most genes contain more intron than exon sequences
- Know about the introns-early versus introns-late debate
- Know about the supporting evidence for both sides
- Know how Go plots are created, and how they are used to define protein structure building blocks
- Know why the finding of the intron in Triose Phosphate Isomerase (TPI) encoding gene from Culex is not an argument for introns early
Links
Slides on Figtree, pan-genome Terminology, Introns, Introns early versus late.
Assignment for Wednesday:
Assignments for Friday
Computer lab 12
Computer-lab assignment 12.docx _ as_pdf
Goals
- Know how to do a PSI Blast search using the web interface
- Know that the construction of a PSSM can be done with a different (preferably large and diverse) databank than the final search.
- Be able to save a PSSM from a web based or a command line blast search.
- Execute PSI-blast searches, and different blast searches using pre-calculated PSSMs
- Appreciate the differences between blastp, psiblast, and using tblastn with a PSSM
- Know that tblastn searches targeting proteins encoded by selfish genetic elements usually give many more results than searches of the annotated proteins.
- Appreciate that most eukaryotic genomes are full of remnants of retroviral fragments.
Lecture 23
Goals
- Know a few examples for the creative power of Gene transfer
- Recognize that the pan-genome of a species provides a reservoir of genetic information that goes beyond the genetic content of an individual genome
- Know about the relationship between bacteriophages and GTAs
- Know that many genes that have not made a functional contribution to the organism nevertheless show a low level of purifying selection.
- Understand how recombination accelerates evolution and prevents the accumulation of deleterious mutations.
- Be aware that rejection of the Null Hypothesis for Neutral evolution (dN/dS<1) does not prove that a gene makes a positive contribution to the fitness of an organism.
- Understand why gene sharing via GTAs is not considered an evolutionary stable strategy
Links
- Slides1 GTAs, Sex and Recombination, the pan genome as a shared genetic resource.
- Slides2 HGT as a creative force, Niche adapting genes.
Assignment for Monday 11/27
- Read the historical article by Wally Gilbert on "why genes in pieces"
- Assignments for Friday see below
Lecture 22
Goals
- Know that PSIblast lowers the frequency of false negatives as compared to a normal blast search
- Know the underlying principle of the iteration in PSI blast
- Know what PSI and PSSM stand for
- Understand what PSI blast and Hmmer searches might be used for
- Be aware that making a PSSM and using it for a search can use different databases.
- Appreciate the many contributions horizontal gene transfer makes to the evolution of organisms.
- Know a few examples for the creative power of Gene transfer
Links
- Slides1 PSI-Blast and related concepts.
Assignments for Friday
Computer lab 11
Computer-lab assignment 11.docx _ as_pdf
Goals
- Understand and be able to apply and evaluate models that test for dN/dS ratios for individual codons.
- Be able to use tracer to evaluate estimated parameters and their Highest Posterior Density (HPD) intervals from MrBayes runs.
- Import the sump data (either via tracer, or directly from the sump output) into excel and determine the sites with the highest probability to be under diversifying (positive) selection
- Highlight sites that are probably under diversifying selection in protein structures displayed in chimera.
Lecture 21
Goals
- Know that because of the redundancy some mutations do not change the sequence of the encoded protein. These mutations are termed synonymous.
- Know that under condition of strict neutral evolution synonymous and non-synonymous substitutions would occur with the same rate.
- Know that dN/dS <1 reflects that selection has worked to remove some non-synonymous substitutions.
- Know that diversifying selection dN/dS >1 acts on sites in virus genomes that are recognized by the immune system.
- Know that the dN/dS>1 approach that can be used to detect positive selection is often difficult to apply in case the alignment is unreliable (which results in overestimating non-synonymous substitutions).
- Understand and appreciate Walter Fitch's contribution to identifying strains that are likely the parents of next year's influenza outbreak
- Know that the absence of SNPs in an allele (or surrounding an allele) can be caused by a selective sweep that erases the diversity around the site being selected for.
- Be clear about what Bruce Lahn's studies of alleles for brain development genes show and do not show
- Think about what this means for science?
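The distinction between synonymous and non-synonymous changes in the goals above can be illustrated with a short Python sketch. This only counts codon differences between two aligned coding sequences; it is not a dN/dS estimate, which would also normalize by the number of synonymous and non-synonymous sites and correct for multiple substitutions. The function name and the example sequences are made up; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_differences(seq1, seq2):
    """Count codons that differ synonymously vs. non-synonymously.

    Assumes both sequences are in frame, gap-free, and aligned codon by codon.
    """
    syn = nonsyn = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 == c2:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1      # same amino acid, different codon
        else:
            nonsyn += 1   # amino acid replaced
    return syn, nonsyn

# Hypothetical example: GCT -> GCC keeps Ala, AAA -> AGA changes Lys to Arg.
print(classify_differences("ATGGCTAAA", "ATGGCCAGA"))
```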
Links
- Slides1 on dN/dS, selective sweeps and introgression of archaic humans.
Assignments
- Read through the excellent articles on Denisovans and on Interbreeding between archaic and modern humans on Wikipedia
- Make a contribution on the discussion board regarding the question "Are some scientific facts better left uncovered?" Use the WSF on Bruce Lahn as a starting point (here or in Reading Materials on huskyCT.) This is a difficult topic, be courteous; there may be good arguments on both sides.
Lecture 20
Goals
- Know about quartet and bipartition spectra.
- Appreciate Neighbor Net as an alternative to trees
- Know what the terms positive, negative, and neutral selection mean and the frequently used synonyms for these terms.
- Be able to discuss the terms positive and diversifying selection.
- Realize the limitations of natural selection with respect to mutations that increase the fitness of an individual.
- Know that genes under positive selection are fixed in a population in much shorter time intervals compared to selectively neutral mutations.
Links
- Slides1 on Bipartition Spectra and embedded quartets.
- Slides2 on Detecting positive and diversifying selection.
Computer lab 10
Computer-lab assignment 10.docx _ as_pdf
Goals
- Know about the sump and sumt commands in MrBayes
- Understand why the burnin should be excluded from the analysis
- Know how to read the bipartition tables and trees created by the sumt command
- Know how to evaluate the .p-files with respect to high probability intervals for parameters
- Understand why less selection pressure often leads to a higher frequency of A and T in a DNA sequence
- Appreciate how total Tree Length and the shape parameter describing Among Site Rate Variation can inform about selection pressures acting on a sequence.
Lecture 19
Goals
- Know the different approaches to combine genes into a genome/species tree (super-matrix versus super-tree)
- If you concatenate data, make sure that the individual gene phylogenies are compatible with the tree from the concatenated analysis
- Worry about what the consensus tree like signal might actually mean
- Know that sequencing Ebola virus genomes was pivotal in determining the transmission history of the virus
- Appreciate that different funeral rites in Congo and West Africa contributed to the severity of the outbreak
- Know that the outbreak in West Africa went back to a single transmission from bat to humans
Links
- Slides Super tree vs Super matrix approaches, Intro to Pop Gen
- Slides Ebola Virus interlude
Assignments
- Read this article on selective sweeps
- Complete the exploration of population genetic simulations at http://www.radford.edu/~rsheehy/Gen_flash/popgen/ (you might need to enable flash in your browser).
- Using the same fitness and frequency for the A1 and A2 allele, explore the impact of population size on drift (w11:1; w12:1, w22=1)? [use a population size of 100, 500 generations, an initial frequency of .5, 5 populations]
- For the same population size (100), explore settings that reflect balancing selection. (w11:.9; w12:1, w22=.9) Compare these results to the above.
- What happens, if you increase or decrease the population size? (Be drastic choose 20 or 1000)
- Use a small initial frequency of allele 1 (freq. of 0.01 in a population of 50, i.e., you have one allele that conveys a 5% fitness advantage) (w11:1; w12:.95, w22=.9). Perform several simulations. (You need to look closely to see that in the majority of the populations the A1 allele, the one that conveys the 5% advantage, goes extinct quickly.) Note that this is very different from a population of infinite size.
- What does this suggest for the effectiveness of natural selection?
- Does natural selection acting on a single advantageous allele work better in a large population? To explore this, increase the population size and lower the frequency, so that you still have one beneficial allele in the population. (E.g., to have one beneficial allele in a population of 500 individuals its freq. is 0.001.)
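If Flash does not run in your browser, the kind of experiment described above can be approximated with a few lines of Python. This is a minimal Wright-Fisher sketch with selection and drift, not the code behind the linked simulation; the function name, the random seed, and the number of replicates are assumptions chosen for illustration.

```python
import random

def simulate(p0, n_individuals, generations, w11, w12, w22, rng):
    """Wright-Fisher model with selection; returns the final frequency of A1."""
    p = p0
    n_alleles = 2 * n_individuals  # diploid population
    for _ in range(generations):
        if p in (0.0, 1.0):
            break  # A1 has been lost or fixed
        q = 1.0 - p
        mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
        # Expected frequency of A1 after selection
        p_sel = (p * p * w11 + p * q * w12) / mean_w
        # Drift: draw each allele of the next generation at random
        copies = sum(rng.random() < p_sel for _ in range(n_alleles))
        p = copies / n_alleles
    return p

# One copy of the advantageous allele in 50 diploid individuals (freq. 0.01).
rng = random.Random(42)
replicates = 100
finals = [simulate(0.01, 50, 300, 1.0, 0.95, 0.9, rng) for _ in range(replicates)]
lost = sum(f == 0.0 for f in finals)
print(f"A1 was lost in {lost} of {replicates} populations")
```

Change n_individuals and p0 together (e.g., 500 and 0.001) to keep a single beneficial allele in a larger population, and compare how often it is lost.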
Lecture 18
Goals
- Know the many reasons as to why gene and species trees might differ, and how one can decide if a difference is due to gene transfer or duplication and loss.
- Know the similarity and differences between parsimony and maximum likelihood based phylogenetic reconstruction
- Understand the differences between Maximum Likelihood Estimation and Bayesian approaches to phylogenetics.
- Understand the principle of obtaining posterior probabilities through MCMCMCs
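The principle behind sampling a posterior with a Markov chain can be sketched in Python as follows. This is a single "cold" chain using the Metropolis rule on a one-dimensional toy target; it leaves out the heated chains and the chain swapping that the additional "MC" in MCMCMC refers to. The function names, step size, and target density are assumptions for illustration only.

```python
import math
import random

def metropolis(log_density, start, n_steps, step_size, rng):
    """Sample from an unnormalized density with a symmetric uniform proposal."""
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        log_ratio = log_density(proposal) - log_density(x)
        # Always accept uphill moves; accept downhill moves with probability ratio
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples

def log_target(x):
    """Unnormalized log density of a standard normal (toy 'posterior')."""
    return -0.5 * x * x

rng = random.Random(1)
samples = metropolis(log_target, start=5.0, n_steps=20000, step_size=1.0, rng=rng)
kept = samples[2000:]  # discard the early part of the chain as burn-in
print(f"mean of retained samples: {sum(kept) / len(kept):.2f}")
```

The chain starts far from the bulk of the target on purpose, which is why the first samples are discarded before summarizing.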
Links
- Slides Intro to phylogenetic reconstruction continued, Bayesian analyses
Assignment for Wednesday (11/3)
Play with Paul Lewis's MCRobot. Explore a differing number of heated chains, and different probability landscapes. https://plewis.github.io/applets/mcmc-robot/
Work through Olga's webpage giving an example on Bayesian thinking
Computer lab 9
Computer-lab assignment 09
Goals
- Know how to do a maximum likelihood ratio test
- Know that Maximum Likelihood approaches allow one to avoid overparameterization.
- Know how different substitution models are defined.
- Know how to run iqtree on xanadu (the latest version is invoked with iqtree2, after you loaded the module).
- Know what the term Long Branch Attraction Artifact refers to, and that more sophisticated ML approaches do much better than simple parsimony analysis at avoiding LBA (it remains a sad realization that sometimes nature apparently does not follow Ockham's razor).
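The arithmetic of a likelihood ratio test for two nested models can be sketched in Python. The function name and the log-likelihood values below are made up; the sketch assumes that the test statistic follows a chi-square distribution and only handles 1 or 2 degrees of freedom, for which the tail probability has a simple closed form.

```python
import math

def lrt_p_value(lnl_simple, lnl_complex, df):
    """Likelihood ratio test for nested models; returns (statistic, p-value)."""
    stat = max(0.0, 2.0 * (lnl_complex - lnl_simple))
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 d.f.
    elif df == 2:
        p = math.exp(-stat / 2.0)             # chi-square tail, 2 d.f.
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p

# Hypothetical log-likelihoods: the complex model has one extra parameter.
stat, p = lrt_p_value(-1234.5, -1230.0, df=1)
print(f"2*delta lnL = {stat:.1f}, p = {p:.4f}")
```

If p is small, the extra parameter improves the fit more than expected by chance; otherwise the simpler model is preferred. For other degrees of freedom, scipy.stats.chi2.sf gives the same tail probability.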
Assignments for Monday see below.
Lecture 17
Goals
- Understand the differences between Parsimony, Maximum Likelihood and Bayesian approaches to phylogenetic reconstruction
- Know what non parametric bootstrapping is, and how it can be applied to many different approaches of phylogenetic analysis
- Appreciate the huge number of tree topologies for a given number of leaves, and the implications this has for heuristic searches of tree space.
- Appreciate the difference between likelihood of a tree or model and the probability of a tree or model
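To get a feeling for how fast the number of tree topologies grows, here is a small Python sketch. It uses the standard counts for strictly bifurcating trees, (2n-5)!! unrooted and (2n-3)!! rooted topologies for n leaves; the function names are made up.

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def n_unrooted(n_leaves):
    """Number of unrooted bifurcating topologies (n_leaves >= 3)."""
    return double_factorial(2 * n_leaves - 5)

def n_rooted(n_leaves):
    """Number of rooted bifurcating topologies (n_leaves >= 2)."""
    return double_factorial(2 * n_leaves - 3)

for n in (4, 5, 10, 20, 50):
    print(n, n_unrooted(n), n_rooted(n))
```

Already for a few dozen leaves an exhaustive evaluation of all topologies is out of the question, which is why programs rely on heuristic searches.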
Links
- Slides Intro to phylogenetic reconstruction continued.
Assignment for Friday
- If you have problems wrapping your head around non-parametric bootstrapping, watch this YouTube video (warning some may find the sound track a little strange)
Assignment for Monday (10/30)
- Take the quiz from the Tree Thinking Challenge
- For additional motivation on the importance of molecular phylogenies meditate about the COVID pandemic. A nice interactive tree is here. There is a play button top left, and you can select different classification schemes and time intervals. I am particularly impressed by the long stem branches that some clades have before they were sampled and diversified (e.g. emerging lineage 22F).
- If you work on a Mac, Java sometimes does not work, also java from ORACLE is only available for free under some conditions. This document on JAVA_and_macOS_X.pdf has instructions on how to install open java version 11.
- Updated information on the CURE project / honors conversion project is on the discussion board.
Lecture 16
Goals
- Know the principle behind parsimony analysis and Occam's razor (or Ockham's razor, aka lex parsimoniae)
- Know the similarity and differences between parsimony and maximum likelihood based phylogenetic reconstruction
- Understand the differences between Parsimony, and Maximum Likelihood Estimation
- Know what non parametric bootstrapping is, and how it can be applied to many different approaches of phylogenetic analysis
- Know that swapping branches around a node does not change the meaning of a phylogenetic tree
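The resampling step of non-parametric bootstrapping can be sketched in Python. The sketch only creates one pseudo-replicate alignment by drawing columns with replacement; building a tree from each replicate and counting how often a branch recurs is left to the phylogeny program. The function name and the toy alignment are made up.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Toy alignment (hypothetical sequences, all of equal length).
alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTTCGTAC",
    "taxonC": "ACCTACGTAA",
}
replicate = bootstrap_alignment(alignment, random.Random(7))
for name, seq in replicate.items():
    print(name, seq)
```

Because whole columns are drawn, every sequence in the replicate is sampled at the same positions, so the replicate has the same length as the original and homologous sites stay together.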
Links
- Slides Intro to phylogenetic reconstruction.
Assignment for Wednesday (10/27)
- Read excerpts of Chapters 5 and 6 from Li's "Molecular Evolution" on HuskyCT
- Read through the Wikipedia entry on Occam's razor
Computer lab 8
Computer-lab assignment 08
Goals
- be able to use nucmer and dotplot to illustrate the relations between multiple genomes in an all against all comparison.
- know how to perform blast searches from the command line.
- know how to process the tabular blast output files.
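Processing tabular blast output can be sketched in Python as follows. The sketch assumes the default 12 columns of the tabular format (-outfmt 6) and keeps the best-scoring hit per query below an E-value cut-off; the function name, the cut-off, and the toy lines are made up and are not taken from the lab assignment.

```python
import csv

# Default column order of BLAST+ tabular output (-outfmt 6)
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(lines, max_evalue=1e-5):
    """Return the highest-scoring hit per query with E-value <= max_evalue."""
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip empty lines and comment lines
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        query = hit["qseqid"]
        if query not in best or float(hit["bitscore"]) > float(best[query]["bitscore"]):
            best[query] = hit
    return best

# Toy lines standing in for a real output file (hypothetical hits).
toy = [
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t370",
    "q1\ts2\t80.0\t180\t36\t2\t5\t184\t10\t189\t1e-40\t150",
    "q2\ts3\t35.0\t90\t55\t3\t1\t88\t4\t92\t0.5\t30",
]
for query, hit in best_hits(toy).items():
    print(query, hit["sseqid"], hit["pident"], hit["evalue"])
```

For a real search you would pass an open file handle instead of the toy list.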
Lecture 15
Goals
- Know the different pathways through which gene families can expand in a genome
- Know about the fate of duplicated genes
- Understand that gene duplication followed by gene loss may be important in erecting post mating hybridization barriers.
- Know what gaps, insertions, exons/introns, repeated domains, and regions of low complexity look like in a dotplot analysis.
- Appreciate that dotplots (such as Gepard or Mummer) are useful to visualize the comparison between multiple genomes.
Links
slides on gene duplications and dotplots and discussion of lab 6
Assignments
For Friday
- Refresh your memory on blast searches and dotplots
For Monday
- Contemplate the different ways genes can be duplicated, and how they can persist over long periods of time
- Try to understand how pseudogenization can lead to a post mating barrier for diverging populations.
Lecture 14
Goals
- Know about some processes in evolution that go beyond natural selection acting on gradual changes
- Be aware that many scientific heroes were children of their time
- Be aware of the criticisms of the Modern Synthesis
Links
slides on Mutualism and Mutual Aid.
Assignments for Wednesday
Computer lab 7
Computer-lab assignment 07
Goals
- Know how cumulative strand bias helps to infer genome structure of prokaryotic genomes (Ori, leading/lagging strand, terminus of replication).
- Appreciate the power of hashes as on the fly counters and as flexible data structures to associate keys with values.
- Appreciate the power of scripts to perform repetitive tasks
- Know how to run mummer and create and read mummer plots.
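The combination of hashes as counters and cumulative strand bias can be sketched in Python, where a Counter plays the role of the hash. This is not the lab script; the function name, the window size, and the toy sequence are made up, and the skew is computed as (G-C)/(G+C) per window.

```python
from collections import Counter

def cumulative_gc_skew(genome, window):
    """Return the running sum of (G-C)/(G+C) over non-overlapping windows."""
    skews, total = [], 0.0
    for start in range(0, len(genome) - window + 1, window):
        counts = Counter(genome[start:start + window])  # on-the-fly counter
        g, c = counts["G"], counts["C"]
        if g + c:
            total += (g - c) / (g + c)
        skews.append(total)
    return skews

# Toy "genome": a G-rich half followed by a C-rich half (hypothetical).
toy = "GGGA" * 5 + "CCCT" * 5
curve = cumulative_gc_skew(toy, window=4)
print(curve)
print("window with maximum:", curve.index(max(curve)))
```

In a real prokaryotic genome the extremes of the cumulative curve are commonly read as the approximate positions of the origin and the terminus of replication.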
Assignments
- See below for suggested readings on cladistics controversies
- Take home exam #4 is due on Monday.
Midterm
Lecture 12
Goals
- Appreciate that many important characteristics (such as photosynthesis) of living organisms were transferred horizontally
- Understand the terminology used in cladistics
- Understand the concerns about not considering paraphyletic groups as proper taxonomic units
- Know why fish do not exist
- Contemplate the utility of a natural taxonomic system in light of endosymbiosis and HGT
Links
Slides on photosynthesis in the ToL and cladistics.
Assignment for Monday (10/16)
The heat of this controversy is reflected in the following excerpt from Tom Cavalier-Smith http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2842702/ :
"Oddly, the school of ‘phylogenetic systematics’ founded by Hennig (1966) grossly downplayed the phylogenetic importance of progressive change compared with splitting, seen by them as so all-important that many Hennigian devotees dogmatically insist that ancestral groups like Bacteria, Protozoa and Reptilia be banned. Hennig called such basal groups with a monophyletic origin ‘paraphyletic’ and redefined monophyly to exclude them and embrace only clades, likewise redefined as including all descendants of their last common ancestor. This redefinition of ‘clade’ is universally accepted, but Hennig's extremely confusing and unwise redefinition of monophyly is not. Though accepted by many, sadly probably the majority (especially the most vociferous and over self-confident, and those fearful of bullying anonymous referees, of whom I have encountered dozens mistakenly insisting without reasoned arguments that paraphyletic taxa are never permissible), it is rightly firmly rejected by evolutionary systematists who consider the classical distinction between polyphyly and paraphyly much more important than distinguishing two forms of monophyly (paraphyly and holophyly, using the precise terminology of Ashlock (1971), where holophyletic equals monophyletic sensu Hennig)."
Computer lab 6
Computer-lab assignment 06
Goals
- align sequences in mafft and muscle
- be able to define sites in Seaview
- save sections (intein and exteins) from a Multiple Sequence Alignment (MSA) into separate files.
- Use chimerax to predict the structure of putative inteins and their host protein with alphafold
- increase your familiarity with FileZilla and the Command Line
Assignment for Monday see below.
Lecture 11 (10/6)
Goals
- Understand genome structure of prokaryotic genomes (Ori, leading/lagging strand, terminus of replication).
- Know two explanations for the preponderance of recombination events between points equidistant from the origin of replication:
-- co-occurrence of recombination with replication;
-- Architecture Imparting Sequences (AIMS) and strand bias are not disrupted/misplaced (i.e., these are the only recombination events that do not lead to a drop in fitness)
- Appreciate the power of hashes as on the fly counters and as flexible data structures to associate keys with values.
Links
Slides on Strand Bias, Recombination and AIMS
Assignments
for Friday (10/6)
- An electronic copy of your Notebook is due on Friday!
- Take-home exam #3 is due on Friday!
- Refresh your memory on FileZilla, ssh, and Xanadu
- read through the alphafold entry on wikipedia to get an idea of why this is a big deal (even though to a large part it is still homology modeling).
for Monday (10/9)
- complete the reading assignments for Wednesday 10/4 and Friday 10/6
- if you have the resources, complete the computer-lab exercise #6.
Lecture 10, 10/2
Goals
- Understand the transitive property of homology and its limitation
- Know about the debates concerning the hot origins of life.
- Understand the arguments for the domain ancestors being survivors of a catastrophe (impact?) that selected for thermophily
- appreciate that early Earth was a more violent place than today's (no warm little pond, but constant "Tsunamis")
- know about the arguments in favor of a vibrant biosphere being present already 3.8 Ga BP
- appreciate the difficulties of interpreting ancient microfossils.
- know that the temperature under which an organism lives is reflected in sequence composition of proteins and RNAs
Links
- Discussion of the transitive property of homology, makeblastdb and blast on the command line, the early evolution of life and impact frustration of an earlier biosphere: Slides
Assignments for Wednesday 10/4
- Read through the article from Tillier and Collins on Genome rearrangement by replication-directed translocation (also available on HuskyCT). Try to understand Figures 1 and 2. Can you think of alternative explanations?
- Make sure that your electronic notebook is in good shape. In particular, you should have reflective statements on every lecture. If you do not use the OneNote class notebook (to which I have access), share the notebook with the instructors via email.
I thought the notebook sharing was due today, but I was told that somewhere it said 10/6. 10/6 before 10am is the final deadline.
- Optional: in case you are interested in the early evolution of life, here are a few articles that give more details:
Computer lab 5
Computer-lab assignment 05
Goals
- know when the transitive property of homology applies and when it does not
- be able to log into your student account on Xanadu
- become familiar with Filezilla and ssh
- Be able to align sequences using Mafft or Muscle
Assignment for Monday
read as much of this introduction to the unix shell as you can digest.
Lecture 9 (9/29)
Goals
- Know that for pairwise alignments, given a substitution matrix and gap penalties one can find alignments with an optimal alignment score. However, there might be several alignments with the same optimal score.
- Have a rough idea of the intricacies of creating multiple sequence alignments
- Appreciate the advantages of a command line interface over a Graphical User Interface
- Know a few commands from the unix command line (cd ls pwd man cat more)
- Know about compute and head nodes on computing clusters
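How an optimal pairwise alignment score is found can be sketched with a small dynamic programming example in Python. This is a simplified global alignment (Needleman-Wunsch style) with a match/mismatch score instead of a full substitution matrix and a linear gap penalty; the function name and the scoring values are assumptions chosen for illustration.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]

print(nw_score("GATTACA", "GATTACA"))
print(nw_score("GATTACA", "GCATGCT"))
```

The score is unique, but several different paths through the table can reach it, which is why there may be more than one alignment with the same optimal score.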
Links
- Intro to blast and unix Slides
Assignments for Friday 10/1
- Take-home exam #2 is due on Friday
Lecture 8
- Know about Margaret Dayhoff's contributions to bioinformatics.
- Know the difference between PAM and Blosum matrices.
- Know what Dayhoff groups of amino acids are.
- How to measure if sequences are significantly similar.
- Understand the difference and similarities between P and E values.
- Know about "usual" cut-offs for Z-scores, P- and E-values.
- Be able to discuss the processes that may lead to the decay of significance
- Know what fishing expeditions are about
- Know what the Bonferroni correction is, and why it is not popular.
- Know what false positives and false negatives are in relation to a databank search
Links
- Slides Statistics, sequence alignment, and blast searches -- Note: at best we will get through the first 40 slides
Assignments for Wednesday 9/27
Computer lab 4
Computer-lab assignment 04:
Goals
- Know about bibliography software
- Know about the advantage of the databanks accessible through NCBI's Entrez.
- Be able to perform literature databank searches at Google scholar, SCOPUS and pubmed
- Know how to retrieve full length manuscripts ;-)
- Know that the # of publications, # of citations, and the H-index are frequently used to measure productivity and impact of scientists.
- Know how to access manuscripts similar to one that you know is relevant to you.
- Appreciate that GenBank is highly redundant.
- Know that searches at the protein level are more effective than searches at the nucleotide level.
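The H-index mentioned above is easy to compute from a list of citation counts: it is the largest h such that h publications have at least h citations each. A minimal Python sketch (the function name and the citation counts are made up):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))
print(h_index([25, 8, 5, 3, 3]))
```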
Lecture 7 (9/20)
Goals
- Know about Margaret Dayhoff's contributions to bioinformatics.
- Know about the Entrez system at the NCBI
- Know about the advantages and disadvantages of databanks with or without a gatekeeper.
- Know the difference between PAM and Blosum matrices.
- Know what Dayhoff groups of amino acids are.
- How to measure if sequences are significantly similar.
- Understand the difference and similarities between P and E values.
- Know about "usual" cut-offs for Z-scores, P- and E-values.
- Know about the tree and coral metaphors to depict evolution
- Know what false positives and false negatives are in relation to a databank search
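The relation between P and E values can be made concrete with a short Python sketch. It assumes, as is commonly done for blast statistics, that the number of chance hits scoring at least as well follows a Poisson distribution with mean E, so that P = 1 - exp(-E); the function name is made up.

```python
import math

def e_to_p(e_value):
    """Probability of at least one chance hit, given the expected number E."""
    return -math.expm1(-e_value)  # numerically stable form of 1 - exp(-E)

for e in (1e-10, 0.01, 0.1, 1.0, 10.0):
    print(f"E = {e:g}  ->  P = {e_to_p(e):.3g}")
```

For small values E and P are practically identical; they only diverge once E approaches 1, because P can never exceed 1.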
Links
- Slides on Entrez, the origin of GenBank and Margaret Dayhoff; blast searches
Assignments
for Friday's (9/22) Computer Lab
- Read through the file on frequently used formats to depict sequences here
- Explore the Genbank Sample file here
- Read through http://en.wikipedia.org/wiki/FASTA_format
- Refresh your memory on Boolean operators (AND, OR, NOT) to use in advanced database searches. Here is an explanation of the Boolean operators
for Monday's Class 8 (9/25)
Lecture 6 (9/18)
Goals
- Understand the relation between substitutions and sequence divergence
- Know a few reasons why protein sequences work better to assess similarity than nucleotides
- Understand that only slow evolving genes that are under strong selection for function are suitable to trace early events in evolution.
- Appreciate Lamarck's contribution to understanding evolution.
- Understand the contributions that Woese and Fox made to the classification of life, which molecule they used, and the domains (aka Urkingdoms) they discovered
- Understand the power and the limitations of the tree of life image.
- Understand the relationship between the 3 domains, and how the tree of life was rooted.
- Appreciate that the organismal tree is embedded in the tangled tree or network depicting genome evolution.
Links
Assignments for Wednesday class 7 (9/20)
Computer lab 3 (9/15)
Computer-lab assignment 03
Inteins: splicing and homing endonuclease domains; protein-DNA interactions.
Goals
- Know how to identify domains in multi domain proteins in chimera;
- create a multiple sequence alignment based on aligned structures in chimera
- align structures of very divergent proteins;
- inspect protein DNA interactions;
- identify the major and minor groove in a DNA molecule;
For the assignments for Monday see class 5 below.
Lecture 5 (9/15)
Goals
- Appreciate the problems and limitations faced by attempts to define life
- Know who Lynn Margulis was.
- Understand the (outline of) Gaia hypothesis, and the problems it faces, and how the ITSNTS approach might overcome these.
- Understand the ATP occupancy in the subunits that form the hexamers in the F1 ATPases
- Know what inteins are and which enzymatic activities they have
- Know the scientific definition of symbiosis
- Know about the possible symbiotic relationships between organisms, genes, or protein domains
- Know the different phases of the homing cycle.
- Know that inteins can be associated with a strong selective disadvantage
- Know that a heterogeneous environment may allow for the long term persistence of parasites (and thus provide an alternative to the homing cycle).
Links
Assignments
for Monday (9/18)
- Draw a sketch for the relation between the number of substitutions that occurred in evolution and the percent identity of the two sequences. (I.e., how does the observed similarity change, as more and more substitutions occur?)
- What are the endpoints (saturation levels) for a 4 letter alphabet and for a 20 letter alphabet, assuming a perfect alignment that aligns homologous positions?
- How does this relationship change, if some parts of the sequence are so important that the protein becomes non-functional if a mutation occurs in these positions (i.e., these parts of the sequence are never observed to undergo any change)?
- If you were to do a realistic calculation and you were to consider a nucleotide sequence, how long would it take to arrive at 20% identity? (tip: how similar are two random sequences that have not been aligned?)
(Note: answering these questions should not require the use of a calculator or a formula, just common sense.)
for Friday (9/15)
- go through the slides on inteins,
- watch the YouTube presentation.
Lecture 4 (9/13)
Goals
- Understand that RNA can be both genetic material and catalyst
- Know items that support the RNA world concept, and difficulties faced by the RNA world
- Know that ATP binding domains can be of very different types, and what this means for our understanding of homology.
- Understand the problems and limitations faced by attempts to define life
Links
- Slides on lab#2 ATP binding sites, convergent evolution, the RNA world.
- Slides on Life, Natural Selection, and Gaia.pptx
Assignments:
Contemplate the following:
- find arguments for and against a virus being considered alive.
- if being part of a group that can be subject to natural selection is a criterion for being alive, why should this not apply to computer life and computer viruses?
- does the stipulation of being a "chemical"-system restrict this to "life as we know it"?
- what argues against Traube's cells not being alive?
Read through the slides on Life, Natural Selection and Gaia. You can follow the links, if you are in presentation mode.
Read through take-home exam #1 - Wednesday is the last chance to discuss this in class before the due date. Remember to work on the exam on your own. THIS IS NOT A TEAM BASED LEARNING EXERCISE!
Computer lab 2 (9/8)
Computer-lab assignment 02:
Aligning divergent sequences and structures in Chimera
Goals:
- Have a rough understanding of the content of a protein data bank file
- Be able to save individual subunits into distinct pdb files
- Align structures of divergent proteins
- Use the structure based alignment to align the linear sequences
- Align structures of a catalytic subunit during the catalytic cycle
- Appreciate that even 80% sequence divergence (or more) can leave the protein structure very, very similar.
- Appreciate that for important proteins substitutions occur so rarely that proteins remain recognizably similar in structure AND sequence.
For the assignments for Monday see class 3 below.
Lecture 3 (9/6)
Goals:
- The ATPsynthase as rotary motor (Yoshida's experiment, proteolipids)
- The role of gene duplication and sequence divergence in the evolution of proteins;
- Know about the three domains of life (archaea, bacteria, eukaryotes) and how they are related to one another
- Appreciate that molecular evolution can study events that occurred before the last universal common ancestor
- Understand the role of ancient gene duplications in rooting the tree of life.
Links:
- Slides on ATPsynthase, ancient gene duplications and the Tree of Life, and Inteins
- Slides on Comp. Lab #1 and Assignments
- Slides on Homology (continued)
Assignment for Friday (9/8)
- Read through the slides on ATPsynthesis and ancient gene duplications (above), watch the movie at https://www.youtube.com/watch?v=_GPDsQnnvrA
- I hope to provide you with a first view of inteins on Friday. It will help if you continue the above slides to the end
- If you have problems understanding the concept of chemiosmotic coupling, follow the links in the slides (they become clickable in presentation mode)
Assignment for Monday (9/11)
- Read the chapter on Evolution as algorithm from "Darwin's Dangerous Idea" by Daniel C. Dennett (HuskyCT)
- We hope to post the first take-home exam on HuskyCT over the weekend (or Friday). Have a look. If you have questions, post them to the discussion board.
- If you have time and are interested, listen to the RNA world discussion at the Library of Congress https://www.loc.gov/item/webcast-7353
Computer Lab #1 (9/1)
Computer-lab assignment 01:
Intro to Chimera - Binding Pocket Substrate Interactions
Goals:
- Be able to launch Chimera
- Display a 3D coordinate file from the PDB (1HEW) in Chimera
- Use different display settings
- Display amino acid side chains in the binding pocket of 1HEW and study the interactions between the substrate and the binding pocket.
- Calculate a Ramachandran plot, and determine where in this plot alpha helices, beta sheets, and glycine residues fall.
- Save your work as an image and as a project.
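For readers curious about what a Ramachandran plot is made of: each residue contributes one (phi, psi) point, where phi is the dihedral angle defined by the backbone atoms C(i-1), N(i), CA(i), C(i) and psi is the one defined by N(i), CA(i), C(i), N(i+1). The following is a minimal Python sketch of a dihedral angle calculation. It is not how Chimera implements it; the function name `dihedral` and the test coordinates are made up for illustration, and the sign convention is the usual one (clockwise rotation, viewed along the central bond, is positive).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (hypothetical helper).

    For phi pass C(i-1), N(i), CA(i), C(i);
    for psi pass N(i), CA(i), C(i), N(i+1).
    """
    b1 = _sub(p1, p0)  # first bond vector
    b2 = _sub(p2, p1)  # central bond vector (rotation axis)
    b3 = _sub(p3, p2)  # last bond vector
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates, not taken from 1HEW:
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
quarter = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
```

Applying such a function to the backbone atoms of every residue in a coordinate file, and plotting psi against phi, gives the kind of plot calculated in the lab.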
Assignments for Wednesday (9/6) see below
Lecture 2 (8/30)
Goals:
- Understand the concept of homology
- Understand that significant similarity between two primary protein sequences (that are not of low complexity) is a strong indication that the two sequences evolved from the same ancestral sequence.
- Know how the field of Bioinformatics is commonly "defined"
- Know what the terms replication, transcription and translation refer to
- Know about primary, secondary and tertiary structure of proteins
Links:
Assignments for Friday (9/3):
Assignments for Wednesday (9/6)
Contemplate the following questions (see the slides on homology for inspiration):
- Are most proteins with similar function homologous?
- Are all proteins with similar function homologous?
- Are most proteins with significant sequence similarity homologous?
- Do most homologous proteins have significant sequence similarity?
- Do most homologous proteins have similar structure?
Try to answer the following questions:
- In your opinion, would maintaining a database on beetles that records where each beetle was collected, its morphology, and where it is stored in the collection fall under bioinformatics?
- In your opinion, would determining the 3D structure of a protein using X-ray crystallography fall under bioinformatics?
- How many different proteins with a length of 100 aa are theoretically possible?
- At most how many aa substitutions does one need to turn one of these sequences into another one?
- Formulate a question that you could ask on Wednesday (things you didn't understand, anything you want to hear more details about).
- Read through the slides selected from Mark Gerstein's Bioinformatics Course in the Intro slides class 1
Are there any items where you do not agree with Mark Gerstein's delineation?
Read the excerpt from Thomas Mann's book Dr. Faustus, available on HuskyCT or at https://www.fadedpage.com/showbook.php?pid=20180329 (go to chapter III). This chapter can provide two insights:
- Scientific experiments in parlors, salons and living rooms were frequent and common entertainment in the early 1900s.
- The distinction between living systems and the mineral world was not established. Apparently life could be easily created from non-living constituents. My favorite example is Traube's cells. In the past I did the experiment in class, but now you have to watch the YouTube version instead: How to grow an artificial cell from water and salts ("Traube Cell" experiment).
- The membranes that form were the starting point to build the first osmometer and an important step in the development of cell theory. They clearly are not alive, but they grow and do look a lot like red algae.
Ask a question (not limited to Dr. Faustus) on the HuskyCT discussion board
Lecture 1 (8/28)
Goals:
- Know how to contact the instructor and TA.
- Know how your performance will be assessed and graded.
- Know that take-home exams and computer lab assignments are an important part of this course, and that they will be graded.
- Know that you need to maintain an electronic notebook
Links:
Assignments for Wednesday (8/30):
- Study the Syllabus! Ask questions, if expectations are not clear.
- Consider if you want to participate in the CURE (course based undergraduate research experience) project.
- Read through the Slides on Homology (https://j.p.gogarten.uconn.edu/mcb3421_2023/class01_2023_homology.pptx). Note: "Read through" is short for read it, but don't overdo the studying.
- Make yourself familiar with the OneNote electronic notebook and write an entry for "lecture 1" (or set up your notebook in Joplin).
Assignments for Friday (9/1):