Training questions to prepare for the final here
Here is the same with answers. It remains a good idea to go through the questions and try to figure out the answers on your own.
Computer lab 14
Computer-lab assignment 14 docx; pdf
Goals
- Appreciate that scripts allow you to run programs repeatedly using different input files
- Appreciate the power of scripts to extract information from the verbose output of a program
- Know how simple statistics on all protein sequences present in a genome can inform on the physiology and adaptations of an organism
- Know that analysis of the genome wide composition of proteins can provide information on optimal growth temperature and adaptation to growth at high salt concentrations.
- Appreciate the capabilities of figtree to beautify trees and to create publication ready (nearly) images of phylogenies
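As a rough illustration of the "simple statistics on all protein sequences" mentioned in the goals above, the following Python sketch tallies amino acid frequencies over every sequence in a FASTA file. It is not the script used in the lab; the function names, the acidic-to-basic ratio chosen as a summary statistic, and the file name proteome.faa are assumptions made for this example.

```python
from collections import Counter
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def composition(handle):
    """Return the fraction of each residue across all sequences in the stream."""
    counts = Counter()
    for _, seq in read_fasta(handle):
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def acidic_to_basic_ratio(freqs):
    """(D + E) / (K + R): one illustrative summary statistic (an assumption)."""
    basic = freqs.get("K", 0) + freqs.get("R", 0)
    acidic = freqs.get("D", 0) + freqs.get("E", 0)
    return acidic / basic if basic else float("inf")


# Tiny in-memory example; for a real genome use open("proteome.faa") instead.
demo = io.StringIO(">p1\nMDEK\n>p2\nDDER\n")
freqs = composition(demo)
ratio = acidic_to_basic_ratio(freqs)
```

Running the same function over the proteomes of several organisms, one file at a time, is the kind of repetitive task a script handles well; an excess of acidic residues is often discussed as a signature of salt-adapted proteomes.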
Assignments for Monday
Be ready for the final
Lecture 25
Goals
- Know about quartet and bipartition spectra.
- Appreciate Neighbor Net as an alternative to trees
- Appreciate that the assumption of general time reversibility can give rise to problems.
- Appreciate the power of scripts to run programs repeatedly and to extract information from the verbose output of a program
- Know how simple statistics on all protein sequences present in a genome can inform on the physiology and adaptations of an organism
- Know that analysis of the genome wide composition of proteins can provide information on optimal growth temperature and adaptation to growth at high salt concentrations.
Links
- Slides1 on Bipartition Spectra and embedded quartets.
- Slides2 on EMBOSS, pepstats, theoretical IEP.
Lecture 24
Goals
- Appreciate the capabilities of figtree to beautify trees and to create publication ready (nearly) images of phylogenies
- Know about the roles introns play in exon shuffling and in the non-sense mediated mRNA decay pathway
- Recall the different types of introns
- Be aware that in animals and plants most genes contain more intron than exon sequences
- Know about the introns early versus introns debate
- Know about the supporting evidence for both sides
- Know how Go plots are created, and how they are used to define protein structure building blocks
- Know why the finding of the intron in Triose Phosphate Isomerase was not considered a convincing argument for introns early
Links
- Slides1 Creating nice trees in Figtree.
- Slides2 The "Introns early versus Introns late" debate
Computer lab 13
Computer-lab assignment 13; pdf
Goals
- Know how to do a PSI Blast search using the web interface
- Know that the construction of a PSSM can be done with a different (preferably large and diverse) databank than the final search.
- Be able to save a PSSM from a command line blast search.
- Execute PSI-blast searches, and different blast searches using pre-calculated PSSMs
- Appreciate the differences between blastp, psiblast, and using tblastn with or without a PSSM
- Know that tblastn searches that target proteins encoded by selfish genetic elements usually give many more results than searches of the annotated proteins.
- Appreciate that most eukaryotic genomes are full of remnants of retroviral fragments.
Assignment for the Monday after Thanksgiving
Go over the lectures, labs, and take-home exams, especially those after the midterm. Ask questions if things are not clear.
Lecture 23
Goals
- Understand why gene sharing via GTAs is not considered an evolutionary stable strategy
- Know that many genes that have not made a functional contribution to the organism nevertheless show a low level of purifying selection.
- Understand how recombination accelerates evolution and prevents the accumulation of deleterious mutations.
- Be aware that rejection of the Null Hypothesis for Neutral evolution (dN/dS<1) does not prove that a gene makes a positive contribution to the fitness of an organism.
- Know a few examples for the creative power of Gene transfer
- Recognize that the pan-genome of a species provides a reservoir of genetic information that goes beyond the genetic content of an individual genome
Links
- Slides1 GTAs, Sex and Recombination, the pan genome as a shared genetic resource.
- Slides2 HGT as a creative force, Niche adapting genes.
Lecture 22
Goals
- Know that PSIblast lowers the frequency of false negatives as compared to a normal blast search
- Know the underlying principle of the iteration in PSI blast
- Know what PSI and PSSM stand for
- Understand what PSI blast and Hmmer searches might be used for
- Be aware that making a PSSM and using it for a search can use different databases.
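To make the idea of a position-specific scoring matrix more concrete, here is a deliberately simplified Python sketch that turns a gap-free alignment into per-column log-odds scores. It is an assumption-laden toy: it uses a uniform background frequency and a flat pseudocount, whereas PSI-BLAST itself uses sequence weighting and more refined pseudocounts. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0, background=1 / 20):
    """Return one {residue: log2-odds score} dict per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, peptide):
    """Sum the column scores for a peptide of the same length as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


alignment = ["ACD", "ACE", "ACD"]  # made-up toy alignment
pssm = build_pssm(alignment)
```

Residues seen in a column score above zero and unseen residues below zero. In the iterative search, the sequences found with one matrix are added to the alignment from which the next matrix is built.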
Possibly, depends how far we get in the lecture:
- Know about the relationship between bacteriophages and GTAs
- Understand why gene sharing via GTAs is not considered an evolutionary stable strategy
Links
Assignments for Wednesday
Assignments for Friday
Computer lab 12
Computer-lab assignment 12; pdf
Goals
- Understand and be able to apply and evaluate models that test for dN/dS ratios for individual codons.
- Be able to use tracer to evaluate estimated parameters and their Highest Posterior Density intervals from MrBayes runs.
- Import the sump data (either via tracer, or directly from the sump output) into excel and determine the sites with the highest probability to be under diversifying (positive) selection
- Highlight sites that are probably under diversifying selection in protein structures displayed in chimera.
Assignments
- review what you learned about the
- types of selection (purifying, neutral, positive, aka Darwinian or diversifying)
- speciation (incl. complex speciation and introgression)
- the impact of genetic drift on the frequency with which beneficial mutations are fixed in a population.
Lecture 21
Goals
- Think about how speciation works in real life
- Know about "complex speciation" and archaic admixtures.
- Be clear about what Bruce Lahn's studies of alleles for brain development genes show and do not show
- Think about what this means for science?
Links
- Slides1 on selective sweeps and introgression of archaic humans.
Assignments
- Read through the excellent articles on Denisovans and on Interbreeding between archaic and modern humans on Wikipedia
- Make a contribution on the discussion board regarding the question "Are some scientific facts better left uncovered?" Use the WSF on Bruce Lahn as a starting point (here or in Reading Materials on HuskyCT). This is a difficult topic; be courteous. There may be good arguments on both sides.
Lecture 20
Goals
- Know that because of the redundancy of the genetic code some mutations do not change the sequence of the encoded protein. These mutations are termed synonymous.
- Know that under conditions of strict neutral evolution synonymous and non-synonymous substitutions would occur at the same rate.
- Know that dN/dS <1 reflects that selection has worked to remove some non-synonymous substitutions.
- Know that diversifying selection dN/dS >1 acts on sites in virus genomes that are recognized by the immune system.
- Know that the dN/dS>1 approach for detecting positive selection is often difficult to apply when the alignment is unreliable (misalignment results in overestimating non-synonymous substitutions).
- Understand and appreciate Walter Fitch's contribution to identifying strains that are likely the parents of next year's influenza outbreak
- Know that the absence of SNPs in an allele (or surrounding an allele) can be caused by a selective sweep that erases the diversity around the site being selected for.
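The distinction between synonymous and non-synonymous changes can be sketched in a few lines of Python. The example below builds the standard genetic code and counts differing codons of each kind between two aligned, gap-free coding sequences. The sequences are made up, and these raw counts are not dN/dS: real estimates also count the synonymous and non-synonymous sites and correct for multiple substitutions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_differences(seq_a, seq_b):
    """Return (synonymous, non-synonymous) counts of differing codons."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, gap-free, and in frame")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        a, b = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        if a == b:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1      # same amino acid, e.g. CTT -> CTC (both Leu)
        else:
            nonsyn += 1   # different amino acid, e.g. GAT (Asp) -> GAA (Glu)
    return syn, nonsyn


syn, nonsyn = count_differences("CTTGATAAA", "CTCGAAAAA")
```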
Links
- Slides on types of selection, dN/dS, selective sweeps and introgression of archaic humans.
Assignments
Computer lab 11
Computer-lab assignment 11; pdf
Goals
- Know about the sump and sumt commands in MrBayes
- Understand why the burnin should be excluded from the analysis
- Know how to read the bipartition tables and trees created by the sumt command
- Know how to evaluate the .p-files with respect to high probability intervals for parameters
- Understand why less selection pressure often leads to a higher frequency of A and T in a DNA sequence
- Appreciate how total Tree Length and the shape parameter describing Among Site Rate Variation can inform about selection pressures acting on a sequence.
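The following Python sketch shows, under stated assumptions, what excluding the burn-in and summarizing a parameter from a .p file amounts to. It assumes a tab-delimited file whose first line is a bracketed ID line followed by a header row with column names such as Gen, LnL and TL; the numbers are invented. It reports an equal-tailed interval from sorted samples, which is a simplification and not the HPD interval that sump or tracer report.

```python
import io


def read_p_file(handle):
    """Parse a MrBayes-style parameter file into {column name: [floats]}."""
    lines = [ln.rstrip("\n") for ln in handle if ln.strip()]
    if lines and lines[0].startswith("["):  # skip the bracketed ID line
        lines = lines[1:]
    header = lines[0].split("\t")
    columns = {name: [] for name in header}
    for row in lines[1:]:
        for name, value in zip(header, row.split("\t")):
            columns[name].append(float(value))
    return columns


def summarize(values, burnin, mass=0.95):
    """Drop the first `burnin` samples; return (mean, lower, upper)."""
    kept = sorted(values[burnin:])
    lower = kept[int((1 - mass) / 2 * (len(kept) - 1))]
    upper = kept[int((1 + mass) / 2 * (len(kept) - 1))]
    return sum(kept) / len(kept), lower, upper


# Invented samples: the first two rows mimic a chain that has not yet converged.
demo = io.StringIO(
    "[ID: 123]\n"
    "Gen\tLnL\tTL\n"
    "0\t-500.0\t10.0\n"
    "100\t-450.0\t9.0\n"
    "200\t-400.0\t1.0\n"
    "300\t-401.0\t2.0\n"
    "400\t-399.0\t3.0\n"
    "500\t-400.5\t4.0\n"
)
columns = read_p_file(demo)
mean_tl, low_tl, high_tl = summarize(columns["TL"], burnin=2)
```

Including the first two samples would pull the mean tree length far away from the values the chain settles on, which is the reason the burn-in is discarded.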
Assignments for Monday see below.
Lecture 19
Goals
- Know the different approaches to combine genes into a genome/species tree (super-matrix versus super-tree)
- If you concatenate data, make sure that the individual gene phylogenies are compatible with the tree from the concatenated analysis
- Worry about what the consensus tree like signal might actually mean
Links
Assignments
- Read this article on selective sweeps
- Complete the exploration of population genetic simulations at http://www.radford.edu/~rsheehy/Gen_flash/popgen/ (you might need to enable flash in your browser).
- Using the same fitness and frequency for the A1 and A2 allele, explore the impact of population size on drift (w11:1; w12:1, w22=1)? [use a population size of 100, 500 generations, an initial frequency of .5, 5 populations]
- For the same population size (100), explore settings that reflect balancing selection. (w11:.9; w12:1, w22=.9) Compare these results to the above.
- What happens, if you increase or decrease the population size? (Be drastic choose 20 or 1000)
- Using a small initial frequency of allele 1 (freq. of 0.01 in a population of 50, i.e., you have one allele that conveys a 5% fitness advantage) (w11:1; w12:.95, w22=.9), perform several simulations. (You need to look closely to see that in the majority of the populations the A1 allele, the one that conveys the 5% advantage, goes extinct quickly.) Note that this is very different from a population of infinite size.
- What does this suggest for the effectiveness of natural selection?
- Does natural selection acting on a single advantageous allele work better in a large population? To explore this, increase the population size and lower the frequency, so that you still have one beneficial allele in the population (e.g., to have one beneficial allele in a population of 500 individuals its freq. is 0.001).
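If the Flash applet is not available, the explorations above can be approximated with a short simulation. The Python sketch below assumes a diploid Wright-Fisher model with viability selection followed by binomial sampling of 2N alleles; the model actually implemented in the applet is not documented here, so treat this as an approximation. All function names are hypothetical.

```python
import random


def after_selection(p, w11, w12, w22):
    """Deterministic frequency of allele A1 after one round of selection."""
    q = 1.0 - p
    mean_fitness = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (p * p * w11 + p * q * w12) / mean_fitness


def simulate(n_individuals, p0, w11, w12, w22, generations, rng):
    """Return the A1 frequency trajectory for one population."""
    n_alleles = 2 * n_individuals
    p, trajectory = p0, [p0]
    for _ in range(generations):
        if 0.0 < p < 1.0:  # loss and fixation are absorbing states
            expected = after_selection(p, w11, w12, w22)
            copies = sum(rng.random() < expected for _ in range(n_alleles))
            p = copies / n_alleles
        trajectory.append(p)
    return trajectory


def fraction_lost(n_individuals, p0, w11, w12, w22,
                  generations, replicates, seed=1):
    """Fraction of replicate populations in which A1 has been lost."""
    rng = random.Random(seed)
    lost = sum(
        simulate(n_individuals, p0, w11, w12, w22, generations, rng)[-1] == 0.0
        for _ in range(replicates)
    )
    return lost / replicates


# One beneficial allele in a population of 50 individuals (100 alleles).
lost = fraction_lost(50, 0.01, 1.0, 0.95, 0.9, generations=200, replicates=100)
```

Changing n_individuals and p0 together (for example 500 and 0.001) reproduces the last exploration in the list; setting all three fitness values to 1 isolates drift.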
Lecture 18
Goals
- Be aware that LBA is a systematic and reproducible artifact.
- Know the many reasons as to why gene and species trees might differ, and how one can decide if a difference is due to gene transfer or duplication and loss.
- Understand the differences between Maximum Likelihood Estimation and Bayesian approaches to phylogenetics.
- Understand the principle of obtaining posterior probabilities through MCMCMCs
Links
- Slides Bayesian thinking, supertree vs supermatrix approaches, intro to population genetics.
Assignment for Wednesday (11/5)
Play with Paul Lewis's MCRobot. Explore a differing number of heated chains, and different probability landscapes. https://plewis.github.io/applets/mcmc-robot/
Work through Olga's webpage giving an example on Bayesian thinking
Computer lab 10
Computer-lab assignment 10; pdf
Goals
- Know how to do a maximum likelihood ratio test
- Know that Maximum Likelihood approaches allow one to avoid over-parameterization.
- Know how different substitution models are defined
- Know how to run iqtree on xanadu (the latest version is invoked with iqtree2, after you loaded the module).
- Know what the term Long Branch Attraction Artifact refers to, and that more sophisticated ML approaches do much better than simple parsimony analysis to avoid LBA (it remains a sad realization that sometimes nature apparently does not follow Ockham's razor).
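The arithmetic behind a likelihood ratio test for two nested models is short enough to sketch in Python. The log-likelihood values below are made up, the function names are hypothetical, and the p-value helper covers only the case where the complex model has exactly one additional free parameter (one degree of freedom).

```python
import math


def lrt_statistic(lnl_simple, lnl_complex):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_complex - lnl_simple)


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square distribution with 1 d.f."""
    return math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))


stat = lrt_statistic(lnl_simple=-1000.0, lnl_complex=-995.0)  # invented values
p = p_value_df1(stat)
significant = p < 0.05  # the extra parameter is justified only if True
```

When the more complex model does not improve the likelihood significantly, the simpler model is retained, which is how the test guards against over-parameterization. For models that differ by more than one parameter, the statistic is compared with a chi-square distribution with the corresponding number of degrees of freedom.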
Assignments for Monday see below.
Lecture 17
Goals
- Understand the differences between Parsimony, Maximum Likelihood and Bayesian approaches to phylogenetic reconstruction
- Know what non parametric bootstrapping is, and how it can be applied to many different approaches of phylogenetic analysis
- Appreciate the huge number of tree topologies for a given number of leaves, and the implications this has for heuristic searches of tree space.
- Appreciate the difference between likelihood of a tree or model and the probability of a tree or model
- Know how the posterior probability is related to the Prior and the likelihood of the data.
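Two of the goals above lend themselves to a small Python sketch: counting bifurcating tree topologies, and drawing one non-parametric bootstrap replicate by resampling alignment columns with replacement. The function names and the toy alignment are assumptions for illustration; phylogenetics programs perform the resampling internally.

```python
import random


def unrooted_topologies(n_leaves):
    """Number of unrooted bifurcating trees: (2n - 5)!! for n >= 3."""
    if n_leaves < 3:
        raise ValueError("need at least three leaves")
    count = 1
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


def rooted_topologies(n_leaves):
    """Number of rooted bifurcating trees: (2n - 3)!! for n >= 2."""
    return unrooted_topologies(n_leaves + 1)


def bootstrap_alignment(alignment, rng):
    """Resample columns with replacement, keeping the alignment length."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


alignment = ["ACGTAC", "ACGTTC", "ACCTAC"]  # made-up toy alignment
replicate = bootstrap_alignment(alignment, random.Random(42))
```

Ten leaves already allow 2,027,025 unrooted topologies, which is why tree searches are heuristic. For a bootstrap analysis, each replicate alignment is analyzed with the same method as the original data, and the support for a branch is the fraction of replicate trees that contain it.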
Links
- Slides Intro to phylogenetic reconstruction continued.
Assignment for Friday
- If you have problems wrapping your head around non-parametric bootstrapping, watch this YouTube video (warning some may find the sound track a little strange)
Assignment for Monday (10/30)
- Take the quiz from the Tree Thinking Challenge
- For additional motivation on the importance of molecular phylogenies meditate about the COVID pandemic. A nice interactive tree is here. There is a play button top left, and you can select different classification schemes and time intervals. I am particularly impressed by the long stem branches that some clades have before they were sampled and diversified (e.g. emerging lineage 22F). Note that the pie charts on the geographic regions are updated in sync with the phylogeny.
Lecture 16
Goals
- Know that swapping branches around a node does not change the meaning of a phylogenetic tree
- Know the principle behind parsimony analysis and Occam's razor (or Ockham's razor, aka lex parsimoniae)
- Know the similarity and differences between parsimony and maximum likelihood based phylogenetic reconstruction
- Understand the differences between Parsimony and Maximum Likelihood Estimation
- Know that algorithmic approaches (neighbor joining) are fast, whereas parsimony and maximum likelihood need to do a heuristic exploration of tree space.
Links
- Slides Intro to phylogenetic reconstruction.
Assignment for Wednesday
Computer lab 9
Computer-lab assignment 09; pdf
Goals
- explore different alignment approaches
- edit and dissect sequence alignments in seaview
- Compare phylogenetic trees calculated for intein and extein sequences
- use intein sequences dissected from a multiple sequence alignment to predict the structure of the intein
- discuss the predicted structure in terms of the presence of self splicing domains and LAGLIDADG domains.
Assignment for Monday
- Contemplate the different ways genes can be duplicated, and how they can persist over long periods of time
- Try to understand how pseudogenization can lead to a post mating barrier for diverging populations.
- Read excerpts of Chapters 5 and 6 from Li's "Molecular Evolution" on HuskyCT
Lecture 15
Goals
- Know the different pathways through which gene families can expand in a genome
- Know about the fate of duplicated genes
- Understand that gene duplication followed by gene loss may be important in erecting post mating hybridization barriers.
- Appreciate that the concepts of mutual aid and natural selection are not mutually exclusive.
Links
SLIDES on lab 8, Gene and Genome Duplications, Mutual aid
Assignments for Friday
- Recall the intein / extein searches from lab 4. From this class you should have a phamily of sequences with homologs to the intein, and if the intein-free version was not in the same phamily, you should have these in a second phamily. We will provide sequences for you to work with, but it might be nice to work on your own set of sequences.
Computer lab 8
Computer-lab assignment 08; pdf
Goals
- know how to perform blast searches from the command line
- know how to create a blast searchable databank
- be able to retrieve matching target sequences from a databank
- know how to process the tabular blast output files.
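As one possible way to process tabular blast output, the Python sketch below reads the default 12-column tab-separated format (the one produced with -outfmt 6) and keeps the best-scoring hit per query. It assumes the default column order; if the lab uses a custom column list, the FIELDS list must be changed to match. The hits shown are invented, and the function names are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def read_hits(handle):
    """Yield one dict per hit, with numeric columns converted."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):  # skip blanks and comment lines
            continue
        hit = dict(zip(FIELDS, row))
        for name in INT_FIELDS:
            hit[name] = int(hit[name])
        for name in FLOAT_FIELDS:
            hit[name] = float(hit[name])
        yield hit


def best_hit_per_query(hits, max_evalue=1e-5):
    """Keep, for each query, the hit with the highest bit score."""
    best = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


demo = io.StringIO(
    "q1\tsA\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t380\n"
    "q1\tsB\t45.0\t180\t90\t4\t10\t189\t20\t199\t2e-20\t95.5\n"
    "q2\tsC\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t22.0\n"
)
best = best_hit_per_query(read_hits(demo))
```

The subject identifiers collected this way can then be used to retrieve the matching sequences from the databank.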
Lecture 14
Goals
- Know about some processes in evolution that go beyond natural selection acting on gradual changes
- Be aware that many scientific heroes were children of their time
- Know some short-comings of the Modern Synthesis
Links
slides on Images to depict evolution, Mutualism and Mutual Aid
Assignments for Monday
Lecture 13
Goals
- Appreciate that many important characteristics (such as photosynthesis) of living organisms were transferred horizontally
- Understand the terminology used in cladistics
- Understand the concerns about not considering paraphyletic groups as proper taxonomic units
- Know why fish do not exist
- Contemplate the utility of a natural taxonomic system in light of endosymbiosis and HGT
Links
Slides on lab 7, photosynthesis in the ToL and cladistics.
The heat of this controversy is reflected in the following excerpt from Tom Cavalier-Smith http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2842702/ :
"Oddly, the school of ‘phylogenetic systematics’ founded by Hennig (1966) grossly downplayed the phylogenetic importance of progressive change compared with splitting, seen by them as so all-important that many Hennigian devotees dogmatically insist that ancestral groups like Bacteria, Protozoa and Reptilia be banned. Hennig called such basal groups with a monophyletic origin ‘paraphyletic’ and redefined monophyly to exclude them and embrace only clades, likewise redefined as including all descendants of their last common ancestor. This redefinition of ‘clade’ is universally accepted, but Hennig's extremely confusing and unwise redefinition of monophyly is not. Though accepted by many, sadly probably the majority (especially the most vociferous and over self-confident, and those fearful of bullying anonymous referees, of whom I have encountered dozens mistakenly insisting without reasoned arguments that paraphyletic taxa are never permissible), it is rightly firmly rejected by evolutionary systematists who consider the classical distinction between polyphyly and paraphyly much more important than distinguishing two forms of monophyly (paraphyly and holophyly, using the precise terminology of Ashlock (1971), where holophyletic equals monophyletic sensu Hennig)."
Computer lab 7
Computer-lab assignment 07 docx
Computer-lab assignment 07 pdf
Goals
- Be able to submit scripts to the queue on Xanadu
- Interpret dotplots that compare genomes from closely related species, even if the two genome sequences are not given with homologous starting points.
- Be able to adjust dotplots interactively to highlight similarity between two sequences.
- Be aware that in dotplots noise appears in the form of short diagonals.
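What a dotplot computes can be shown with a minimal word-match version in Python. This is only a sketch with made-up sequences and a hypothetical function name; genome-scale tools use far more efficient matching and also consider the reverse strand.

```python
from collections import defaultdict


def dotplot(seq_a, seq_b, word=3):
    """Return sorted (i, j) pairs where words of length `word` match exactly."""
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)
    return sorted(
        (i, j)
        for i in range(len(seq_a) - word + 1)
        for j in index.get(seq_a[i:i + word], ())
    )


# Identical sequences give one unbroken main diagonal.
same = dotplot("ACGT", "ACGT", word=2)

# The same circular sequence written from a different starting point
# gives matches that are shifted away from the main diagonal.
rotated = dotplot("ACGTTGCA", "TGCAACGT", word=4)

# Short words produce many chance matches (noise); longer words remove them.
noisy = dotplot("ACGTTGCA", "TGCAACGT", word=1)
```

Increasing the word size plays the same role as adjusting the plot interactively: chance matches disappear while the long diagonals that reflect real similarity remain.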
Assignment for Monday
Read through the letter of Ernst Mayr criticizing Carl Woese's three domain classification. Woese's reply is here but it is rather lengthy. Instead, read the shorter argument to abolish the term prokaryotes by Norm Pace here.
Midterm
In TLS 301, bring a pencil and eraser.
Lecture 12 (10/7)
Goals
- Be prepared for the midterm
- Understand how to infer within genome recombination events in circular genomes from gene and nucmer plots
- Know that recent data suggest rapid intein spread in local phage populations.
Links
Slides gene drive in phage populations, inferring within genome recombination events from gene and nucmer plots
Assignment
for Friday (10/11)
Make sure that you know where you saved the sequences you retrieved and analyzed in lab 4 (the phage intein and extein sequences) and lab 6 (the genome sequences of selected bacteria)
Computer lab 6
Computer-lab assignment 06 docx
Computer-lab assignment 06 pdf
Goals
- Know how cumulative strand bias helps to infer genome structure of prokaryotic genomes (Ori, leading/lagging strand, terminus of replication).
- Appreciate the power of hashes as on the fly counters and as flexible data structures to associate keys with values.
- Appreciate scripts to perform repetitive tasks
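The two ideas in these goals, a hash used as an on-the-fly counter and a cumulative strand bias curve, can be sketched in Python, where a dict (here a Counter) plays the role of a hash. The skew measure used below, a running sum of +1 per G and -1 per C, is an assumption; the lab script may use windows or a different statistic. The sequences are invented.

```python
from collections import Counter


def cumulative_gc_skew(sequence):
    """Running sum of +1 for each G and -1 for each C along the sequence."""
    total, curve = 0, []
    for base in sequence.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        curve.append(total)
    return curve


def extremes(curve):
    """Return the positions of the minimum and maximum of the curve."""
    return curve.index(min(curve)), curve.index(max(curve))


# A Counter is a dict (hash) that creates keys the first time they are seen.
base_counts = Counter("GATTACA")

curve = cumulative_gc_skew("GGCC")
min_pos, max_pos = extremes(curve)
```

In many bacterial genomes the minimum and maximum of such a curve fall near the origin and the terminus of replication, which is what makes the plot useful for inferring genome structure.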
Assignment
for Monday
Think about questions to ask before the midterm.
Lecture 11 (10/2)
Goals
- Know about the debates concerning the hot origins of life.
- Understand the arguments for the domain ancestors being survivors of a catastrophe (impact?) that selected for thermophily
- appreciate that early Earth was a more violent place than today's Earth
- know that the temperature under which an organism lives is reflected in sequence composition of proteins and RNAs
- Understand genome structure of prokaryotic genomes (Ori, leading/lagging strand, terminus of replication).
- Know two explanations that can explain the preponderance of recombination events between points equidistant to the origin of replication
- co-occurrence of recombination with replication;
- Architecture Imparting Sequences (AIMS) and strand bias are not disrupted/misplaced (i.e., these are the only recombination events that do not lead to a drop in fitness)
Links
Slides on arguments against a hyperthermophilic LUCA, Strand Bias, Recombination and AIMS
Assignments
for Friday (10/4)
- Refresh your memory on FileZilla, ssh, and Xanadu
- if you have difficulties wrapping your head around the dot-plot comparison between sequences, a very thorough explanation is in the first 14 minutes of this YouTube video (45 seconds to 13.4 minutes)
for Monday (10/7)
- Think about questions you would like to have answered before the midterm.
Optional: See the articles listed in the notes to Lecture 10
Lecture 10, 9/30
Goals
- Understand the transitive property of homology and its limitation
- Appreciate the differences between different approaches to calculate MSAs
- know about the arguments in favor of a vibrant biosphere being present before 3.8 Ga BP
- appreciate the difficulties of interpreting ancient microfossils.
Links
- Discussion of the transitive property of homology, MSAs, the early evolution of life and impact frustration of an earlier biosphere: Slides
Assignments for Wednesday 10/2
- Read through the article by Tillier and Collins on Genome rearrangement by replication-directed translocation (also available on HuskyCT). Try to understand Figures 1 and 2. Can you think of alternative explanations?
- Make sure that your electronic notebook is in good shape. In particular, you should have reflective statements on every lecture. If you do not use the OneNote class notebook (to which I have access), share the notebook with the instructors via email. (Different dates for this are floating around; try to update things by 10/2/2024.)
- Optional, in case you are interested in the early evolution of life, here are a few articles that give more details:
Computer lab 5
Computer-lab assignment 05 docx
Computer-lab assignment 05 pdf
Goals
- know when and when not the transitive property of homology applies
- be able to log into your student account on Xanadu
- become familiar with Filezilla and ssh
- Be able to align sequences using Mafft or Muscle
Assignment for Monday
read as much of this [introduction to the unix shell]
Lecture 9 (9/25)
Goals
- Know about the command line and the UNIX operating system
- Have a rough idea of the intricacies of creating multiple sequence alignments
- Appreciate the advantages of a command line interface over a Graphical User Interface
- Know a few commands from the unix command line (cd ls pwd man cat more)
- Know about compute and head nodes on computing clusters
Links
- Intro to alignments, blast, and unix Slides
Assignments for Friday 10/1
- Go through today's Slides
- If you are motivated, check out the Software Carpentry webpage on the command line. If you have a Mac, the Terminal application is a Unix shell.
Lecture 8
Goals:
- Know the difference between PAM and Blosum matrices.
- Know what Dayhoff groups of amino acids are.
- How to measure if sequences are significantly similar.
- Understand the difference and similarities between P and E values.
- Know about "usual" cut-offs for Z-scores, P- and E-values.
- Be able to discuss the processes that may lead to the decay of significance
- Know what fishing expeditions are about
- Know what the Bonferroni correction is, and why it is not popular.
- Know what false positives and false negatives are in relation to a databank search
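The relationship between P and E values and the effect of a Bonferroni correction can be written down in a few lines of Python. The sketch assumes the usual Poisson approximation used for blast statistics, P = 1 - exp(-E); the function names and example numbers are hypothetical.

```python
import math


def p_from_e(e_value):
    """Probability of at least one chance hit, given the expected number E."""
    return -math.expm1(-e_value)  # numerically stable form of 1 - exp(-E)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall error rate at alpha."""
    return alpha / n_tests


small = p_from_e(0.001)   # for small E, P and E are nearly identical
large = p_from_e(5.0)     # for large E, P approaches 1 while E keeps growing
per_test = bonferroni_threshold(0.05, 1000)
```

With 1000 searches, each individual result would have to reach a significance level of 0.00005. This loss of sensitivity is one reason the correction is unpopular, and it is also what goes wrong in a fishing expedition where no correction is applied.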
Links
- Slides Statistics, sequence alignment, and blast searches
Assignments for Wednesday 9/25
Computer lab 4
Computer-lab assignment 04, pdf
Goals
- Know about bibliography software
- Know about the advantage of the databanks accessible through NCBI's Entrez.
- Be able to perform literature databank searches at Google scholar, SCOPUS and pubmed
- Know how to retrieve full length manuscripts ;-)
- Know that the # of publications, # of citations, and the H-index are frequently used to measure productivity and impact of scientists.
- Know how to access manuscripts similar to one that you know is relevant to you.
- Appreciate that GenBank is highly redundant.
- Know that searches at the protein level are more effective than searches at the nucleotide level.
- Be able to perform blast searches at the NCBI and phagesDB
- Recognize intein containing queries in the graphical overview of blast searches.
Lecture 7 (9/18)
Goals
- Know about Margaret Dayhoff's contributions to bioinformatics.
- Know about the Entrez system at the NCBI
- Know about the advantages and disadvantages of databanks with or without a gatekeeper.
- Know the difference between PAM and Blosum matrices.
- Know what Dayhoff groups of amino acids are.
- How to measure if sequences are significantly similar.
- Understand the difference and similarities between P and E values.
- Know about "usual" cut-offs for Z-scores, P- and E-values.
- Know what false positives and false negatives are in relation to a databank search
Links
- Slides on Entrez, the origin of GenBank and Margaret Dayhoff; blast searches
Assignments
for Friday's (9/20) Computer Lab
- Read through the file on frequently used formats to depict sequences here
- Explore the Genbank Sample file here
- Read through http://en.wikipedia.org/wiki/FASTA_format
- Refresh your memory on Boolean operators (AND, OR, NOT) to use in advanced database searches. Here is an explanation of the Boolean operators
for Monday's Class 8 (9/23)
Lecture 6 (9/16)
Goals
- Understand the relation between substitutions and sequence divergence
- Know a few reasons why protein sequences work better to assess similarity than nucleotides
- Understand that only slow evolving genes that are under strong selection for function are suitable to trace early events in evolution.
- Appreciate Lamarck's contribution to understanding evolution.
- Understand the contributions that Woese and Fox made to the classification of life, which molecule they used, and the domains (aka Urkingdoms) they discovered
- Understand the power and the limitations of the tree of life image.
- Know about the tree and coral metaphors to depict evolution
- Understand the relationship between the 3 domains, and how the tree of life was rooted.
- Appreciate that the organismal tree is embedded in the tangled tree or network depicting genome evolution.
Links
Assignments for Wednesday class 7 (9/20)
Computer lab 3 (9/13)
Assignment 03.docx , pdf:
Characterizing homing endonuclease and self-splicing domains in inteins using chimeraX / using alphafold to predict protein structures.
Lecture 5 (9/11)
Goals
- Know what inteins are and which enzymatic activities they have
- Know the scientific definition of symbiosis
- Know about the possible symbiotic relationships between organisms, genes, or protein domains
- Know the different phases of the homing cycle.
- Know that inteins can be associated with a strong selective disadvantage
- Know that a heterogeneous environment may allow for the long term persistence of parasites (and thus provide an alternative to the homing cycle).
Links
Assignments
for Friday (9/15)
- go through the slides on inteins,
- Bring your Google username and password.
for Monday (9/16)
- Draw a sketch for the relation between the number of substitutions that occurred in evolution and the percent identity of the two sequences. (I.e., how does the observed similarity change as more and more substitutions occur?)
- What are the endpoints (saturation levels) for a 4-letter alphabet and for a 20-letter alphabet, assuming a perfect alignment that aligns homologous positions?
- How does this relationship change if some parts of the sequence are so important that the protein becomes non-functional if a mutation occurs in these positions (i.e., these parts of the sequence are never observed to undergo any change)?
- If you were to do a realistic calculation and you were to consider a nucleotide sequence, how long would it take to arrive at less than 25% identity? (tip: how similar are two random sequences that have not been aligned?)
(Note: answering these questions should not require the use of a calculator or a formula, just common sense.)
Lecture 4 (9/9)
Goals
- Know that ATP binding domains can be of very different types, and what this means for our understanding of homology.
- Have a clear understanding of homology versus sequence similarity
- know about Levinthal's paradox
- Understand the problems and limitations faced by attempts to define life
- Know who Lynn Margulis was.
- Understand the (outline of) Gaia hypothesis, and the problems it faces, and how the ITSNTS approach might overcome these.
Links
- Slides on discussion of lab#2, ATP binding sites, convergent evolution.
- Slides on Life, Natural Selection, and Gaia.pptx
Note For those who joined the course recently
Look through the slides linked below, and if they do not make sense, check the recordings on HuskyCT. Ask the instructors if some things remain unclear.
Assignments:
- Contemplate the following:
-- find arguments for and against a virus being considered alive.
-- if being part of a group that can be subject to natural selection is a criterion for being alive, why should this not apply to computer life and computer viruses?
-- does the stipulation of being a "chemical"-system restrict this to "life as we know it"?
-- what argues against Traube's cells not being alive?
- Read through the slides on Life, Natural Selection and Gaia. You can follow the links, if you are in presentation mode.
- Read through take-home exam #1. Wednesday is the last chance to discuss this in class before the due date. Remember to work on the exam on your own.
THIS IS NOT A TEAM BASED LEARNING EXERCISE!
Computer lab 2 (9/8)
Assignment 02.docx , pdf:
Aligning divergent sequences and structures in Chimera
Goals:
- Have a rough understanding of the content of a protein data bank file
- Be able to save individual subunits into distinct pdb files
- Align structures of divergent proteins
- Use the structure based alignment to align the linear sequences
- Align structures of a catalytic subunit during the catalytic cycle
- Appreciate that even 80% sequence divergence (or more) can leave the protein structures very, very similar.
- Appreciate that for important proteins substitutions occur so rarely that proteins remain recognizably similar in structure AND sequence.
Assignments for Monday:
Read through the Slides on the ATP synthase (skip the intein slides), and try to understand how the evolution of ATP subunits (and other ancient duplicated genes) informs us on the early evolution of life.
Lecture 3 (9/4)
Goals:
- The ATPsynthase as rotary motor (Yoshida's experiment, proteolipids)
- The role of gene duplication and sequence divergence in the evolution of proteins;
- Know about the three domains of life (archaea, bacteria, eukaryotes) and how they are related to one another
- Appreciate that molecular evolution can study events that occurred before the last universal common ancestor
- Understand the role of ancient gene duplications in rooting the tree of life.
- Understand that RNA can be both genetic material and catalyst
- Know items that support the RNA world concept, and difficulties faced by the RNA world
Links:
- Slides on Comp. Lab #1 and Assignments
- Slides on ATPsynthase, ancient gene duplications and the Tree of Life
- Slides on Homology (continued)
Assignments for Friday
- Try to wrap your head around the homology concept and its relation to significant similarity.
- Read through the slides on the ATP synthase, and try to understand how the evolution of ATP subunits (and other ancient duplicated genes) informs us on the early evolution of life.
Computer Lab #1 (9/1)
Computer-lab assignment 01 docx; pdf
Intro to Chimera: Binding Pocket - Substrate Interactions
Goals:
- Be able to launch chimera
- Display a 3D coordinate file from the pdb (1HEW) in chimera
- Use different display settings
- Display amino acid side chains in the binding pocket of 1HEW and study the interactions between the substrate and the binding pocket.
- Calculate a Ramachandran plot, and determine where in this plot alpha helices, beta sheets, and glycine residues fall.
- Save your work as an image and as a project.
Assignments for Wednesday: see below
Lecture 2 (8/30)
Goals:
- Understand the concept of homology
- Understand that significant similarity between two primary protein sequences (that are not of low complexity) is a strong indication that the two sequences evolved from the same ancestral sequence.
- Know how the field of Bioinformatics is commonly "defined"
- Know what the terms replication, transcription and translation refer to
- Know about primary, secondary and tertiary structure of proteins
Links:
Assignments for Friday (8/30):
Assignments for Wednesday (9/4)
Contemplate the following questions (see the slides on homology for inspiration):
- Are most proteins with similar function homologous?
- Are all proteins with similar function homologous?
- Are most proteins with significant sequence similarity homologous?
- Do most homologous proteins have significant sequence similarity?
- Do most homologous proteins have similar structure?
Try to answer the following questions:
- In your opinion, would maintaining a database on beetles that records where each beetle was collected, its morphology, and where it is stored in the collection fall under bioinformatics?
Would your assessment change if partial sequences of the mitochondrial cytochrome c oxidase were included for each beetle?
- In your opinion, would determining the 3D structure of a protein using X-ray crystallography fall under bioinformatics?
- How many different proteins with a length of 100 aa are theoretically possible?
- At most, how many aa substitutions does one need to turn one of these sequences into another one?
- Formulate a question that you could ask on Wednesday (things you didn't understand, anything you want to hear more details about).
- Read through the slides selected from Mark Gerstein's Bioinformatics Course in the Intro slides class 1
Are there any items where you do not agree with Mark Gerstein's delineation?
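To check your reasoning on the two counting questions above, here is a minimal Python sketch. It assumes that only the 20 standard amino acids are used and that only substitutions (no insertions or deletions) are allowed. The function names are made up for illustration.

```python
# Assumption: proteins are built from the 20 standard amino acids only.
ALPHABET_SIZE = 20
LENGTH = 100


def count_sequences(length, alphabet_size=ALPHABET_SIZE):
    """Number of distinct sequences: one independent choice per position."""
    return alphabet_size ** length


def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


total = count_sequences(LENGTH)
# Python integers have arbitrary precision, so the exact value is available.
print("number of possible sequences:", total)
print("number of decimal digits:", len(str(total)))

# Two sequences that differ at every position (hypothetical example:
# poly-alanine versus poly-glycine) need one substitution per position.
print("substitutions needed:", hamming_distance("A" * LENGTH, "G" * LENGTH))
```

Two sequences of the same length can differ at every position but at no more positions than that, so the length of the sequence is the upper bound on the number of substitutions.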
Read the excerpt from Thomas Mann's book Dr. Faustus, available on HuskyCT or at https://www.fadedpage.com/showbook.php?pid=20180329 (go to chapter III). This chapter can provide two insights:
- Scientific experiments in parlors, salons and living rooms were frequent and common entertainment in the early 1900s.
- The distinction between living systems and the mineral world was not established. Apparently life could be easily created from non-living constituents. My favorite example is Traube's cells. In the past I did the experiment in class, but now you have to watch the YouTube version instead: How to grow an artificial cell from water and salts ("Traube Cell" experiment).
- The membranes that form were the starting point to build the first osmometer and an important step in the development of cell theory. They clearly are not alive, but they grow and do look a lot like red algae.
Ask a question (not limited to Dr. Faustus) on the HuskyCT discussion board
Lecture 1 (8/26)
Goals:
- Know how to contact the instructor and TA.
- Know how your performance will be assessed and graded.
- Know that take-home exams and computer lab assignments are an important part of this course, and that they will be graded.
- Know that you need to maintain an electronic notebook
- Be able to define/circumscribe the field of Bioinformatics
Links:
Assignments for Wednesday (8/30):
- Study the Syllabus! Ask questions, if expectations are not clear.
- Consider if you want to participate in the CURE (course based undergraduate research experience) project.
- Read through the Slides on Homology (https://j.p.gogarten.uconn.edu/mcb3421_2024/class01_2024_homology.pptx). Note: "Read through" is short for read it, but don't overdo the studying.
- Make yourself familiar with the OneNote electronic notebook and write an entry for "lecture 1" (or set up your notebook in Joplin).
Assignments for Friday (8/30)