Lecture 7 (9/18)
Goals
- Know about Margaret Dayhoff's contributions to bioinformatics.
- Know about the Entrez system at the NCBI
- Know about the advantages and disadvantages of databanks with or without a gatekeeper.
- Know the difference between PAM and Blosum matrices.
- Know what Dayhoff groups of amino acids are.
- Know how to measure whether sequences are significantly similar.
- Understand the differences and similarities between P and E values.
- Know about "usual" cut-offs for Z-scores, P- and E-values.
- Know what false positives and false negatives are in relation to a databank search
Links
- Slides on Entrez, the origin of GenBank and Margaret Dayhoff; blast searches
Assignments
for Friday's (9/20) Computer Lab
- Read through the file on frequently used formats to depict sequences here
- Explore the GenBank Sample file here
- Read through http://en.wikipedia.org/wiki/FASTA_format
- Refresh your memory on Boolean operators (AND, OR, NOT) to use in advanced database searches. Here is an explanation of the Boolean operators
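To make the FASTA reading above more concrete, here is a minimal sketch of a FASTA reader. It assumes the common convention that each record starts with a ">" header line followed by one or more sequence lines. The function name and the two example records are made up for illustration.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Hypothetical example records (not real databank entries).
example = """>seq1 hypothetical example protein
MKTAYIAKQR
QISFVKSHFS
>seq2 another made-up record
MADEEKLPPG
"""

records = parse_fasta(example)
for header, seq in records:
    print(header, len(seq))
```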
for Monday's Class 8 (9/23)
Lecture 6 (9/16)
Goals
- Understand the relation between substitutions and sequence divergence
- Know a few reasons why protein sequences work better to assess similarity than nucleotides
- Understand that only slowly evolving genes that are under strong selection for function are suitable to trace early events in evolution.
- Appreciate Lamarck's contribution to understanding evolution.
- Understand the contributions that Woese and Fox made to the classification of life, which molecule they used, and the domains (aka Urkingdoms) they discovered
- Understand the power and the limitations of the tree of life image.
- Know about the tree and coral metaphors to depict evolution
- Understand the relationship between the 3 domains, and how the tree of life was rooted.
- Appreciate that the organismal tree is embedded in the tangled tree or network depicting genome evolution.
Links
Assignments for Wednesday class 7 (9/18)
Computer lab 3 (9/13)
Assignment 03.docx , pdf:
Characterizing homing endonuclease and self-splicing domains in inteins using chimeraX / using alphafold to predict protein structures.
Lecture 5 (9/11)
Goals
- Know what inteins are and which enzymatic activities they have.
- Know the scientific definition of symbiosis
- Know about the possible symbiotic relationships between organisms, genes, or protein domains.
- Know the different phases of the homing cycle.
- Know that inteins can be associated with a strong selective disadvantage
- Know that a heterogeneous environment may allow for the long term persistence of parasites (and thus provide an alternative to the homing cycle).
Links
Assignments
for Friday (9/15)
- go through the slides on inteins,
- Bring your Google username and password.
for Monday (9/16)
- Draw a sketch for the relation between the number of substitutions that occurred in evolution and the percent identity of the two sequences. (I.e., how does the observed similarity change as more and more substitutions occur?)
- What are the endpoints (saturation levels) for a 4 letter alphabet and for a 20 letter alphabet, assuming a perfect alignment that aligns homologous positions?
- How does this relationship change if some parts of the sequence are so important that the protein becomes non-functional when a mutation occurs in these positions (i.e., these parts of the sequence are never observed to undergo any change)?
- If you were to do a realistic calculation and you were to consider a nucleotide sequence, how long would it take to arrive at less than 25% identity? (tip: how similar are two random sequences that have not been aligned?)
(Note: answering these questions should not require the use of a calculator or a formula, just common sense.)
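Once you have drawn your own sketch, a small simulation can be used to check the intuition. The code below is only a rough sketch under simplifying assumptions that are not from the course material: all positions are free to change, every substitution picks a random position and replaces the residue with a different letter chosen uniformly, and the evolved sequence is compared to its ancestor without any realignment. The function name is hypothetical.

```python
import random

def identity_after_substitutions(alphabet, length, n_substitutions, rng):
    """Fraction of identical positions between an ancestor and its descendant
    after n_substitutions random substitutions (multiple hits allowed)."""
    ancestor = [rng.choice(alphabet) for _ in range(length)]
    descendant = list(ancestor)
    for _ in range(n_substitutions):
        pos = rng.randrange(length)
        current = descendant[pos]
        # A substitution always changes the residue to a different letter.
        descendant[pos] = rng.choice([c for c in alphabet if c != current])
    matches = sum(a == d for a, d in zip(ancestor, descendant))
    return matches / length

rng = random.Random(42)
dna = "ACGT"
protein = "ACDEFGHIKLMNPQRSTVWY"
length = 2000
for per_site in (0, 0.25, 0.5, 1, 2, 5, 10):
    n = int(per_site * length)
    d = identity_after_substitutions(dna, length, n, rng)
    p = identity_after_substitutions(protein, length, n, rng)
    print(f"{per_site:>5} subst./site  DNA identity {d:.2f}  protein identity {p:.2f}")
```

Because later substitutions increasingly hit positions that have already changed, the identity levels off instead of dropping to zero. Fixing a fraction of the positions as invariant (not modeled here) would raise the level at which the curve flattens.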
Lecture 4 (9/9)
Goals
- Know that ATP binding domains can be of very different types, and what this means for our understanding of homology.
- Have a clear understanding of homology versus sequence similarity
- Know about Levinthal's paradox
- Understand the problems and limitations faced by attempts to define life
- Know who Lynn Margulis was.
- Understand the outline of the Gaia hypothesis, the problems it faces, and how the ITSNTS approach might overcome these.
Links
- Slides on discussion of lab#2, ATP binding sites, convergent evolution.
- Slides on Life, Natural Selection, and Gaia.pptx
Note for those who joined the course recently:
Look through the slides linked below, and if they do not make sense, check the recordings on HuskyCT. Ask the instructors if some things remain unclear.
Assignments:
- Contemplate the following:
-- Find arguments for and against a virus being considered alive.
-- If being part of a group that can be subject to natural selection is a criterion for being alive, why should this not apply to computer life and computer viruses?
-- Does the stipulation of being a "chemical" system restrict this to "life as we know it"?
-- What argues against Traube's cells being alive?
- Read through the slides on Life, Natural Selection and Gaia. You can follow the links if you are in presentation mode.
- Read through take-home exam #1. Wednesday is the last chance to discuss this in class before the due date. Remember to work on the exam on your own.
THIS IS NOT A TEAM BASED LEARNING EXERCISE!
Computer lab 2 (9/8)
Assignment 02.docx , pdf:
Aligning divergent sequences and structures in Chimera
Goals:
- Have a rough understanding of the content of a protein data bank file
- Be able to save individual subunits into distinct pdb files
- Align structures of divergent proteins
- Use the structure based alignment to align the linear sequences
- Align structures of a catalytic subunit during the catalytic cycle
- Appreciate that even 80% sequence divergence (or more) can leave the protein structures very, very similar.
- Appreciate that for important proteins substitutions occur so rarely that proteins remain recognizably similar in structure AND sequence.
Assignments for Monday:
Read through the Slides on the ATP synthase (skip the intein slides), and try to understand how the evolution of ATP subunits (and other ancient duplicated genes) informs us on the early evolution of life.
Lecture 3 (9/4)
Goals:
- The ATPsynthase as rotary motor (Yoshida's experiment, proteolipids)
- The role of gene duplication and sequence divergence in the evolution of proteins;
- Know about the three domains of life (archaea, bacteria, eukaryotes) and how they are related to one another
- Appreciate that molecular evolution can study events that occurred before the last universal common ancestor
- Understand the role of ancient gene duplications in rooting the tree of life.
- Understand that RNA can be both genetic material and catalyst
- Know items that support the RNA world concept, and the difficulties faced by the RNA world
Links:
- Slides on Comp. Lab #1 and Assignments
- Slides on ATPsynthase, ancient gene duplications and the Tree of Life
- Slides on Homology (continued)
Assignments for Friday
- Try to wrap your head around the homology concept and its relation to significant similarity.
- Read through the slides on the ATP synthase, and try to understand how the evolution of ATP subunits (and other ancient duplicated genes) informs us on the early evolution of life.
Computer Lab #1 (9/1)
Computer-lab assignment 01 docx; pdf
Intro to Chimera: Binding Pocket - Substrate Interactions
Goals:
- Be able to launch Chimera
- Display a 3D coordinate file from the pdb (1HEW) in Chimera
- Use different display settings
- Display amino acid side chains in the binding pocket of 1HEW and study the interactions between the substrate and the binding pocket.
- Calculate a Ramachandran plot, and determine where in this plot alpha helices, beta sheets, and glycine residues fall.
- Save your work as image and project.
Assignments for Wednesday see below
Lecture 2 (8/30)
Goals:
- Understand the concept of homology
- Understand that significant similarity between two primary protein sequences (that are not of low complexity) is a strong indication that the two sequences evolved from the same ancestral sequence.
- Know how the field of Bioinformatics is commonly "defined"
- Know what the terms replication, transcription and translation refer to
- Know about primary, secondary and tertiary structure of proteins
Links:
Assignments for Friday (8/30):
Assignments for Wednesday (9/4)
Contemplate the following questions (see the slides on homology for inspiration):
- Are most proteins with similar function homologous?
- Are all proteins with similar function homologous?
- Are most proteins with significant sequence similarity homologous?
- Do most homologous proteins have significant sequence similarity?
- Do most homologous proteins have similar structure?
Try to answer the following questions:
- In your opinion, would maintaining a database on beetles that contains data on where each beetle was collected, its morphology, and where it is stored in the collection fall under bioinformatics?
Would your assessment change if partial sequences of the mitochondrial cytochrome C oxidase were included for each beetle?
- In your opinion, would determining the 3D structure of a protein using X-ray crystallography fall under bioinformatics?
- How many different proteins with a length of 100 aa are theoretically possible?
- At most, how many aa substitutions does one need to turn one of these sequences into another one?
- Formulate a question that you could ask on Wednesday (things you didn't understand, anything you want to hear more details about).
- Read through the slides selected from Mark Gerstein's Bioinformatics Course in the Intro slides class 1
Are there any items where you do not agree with Mark Gerstein's delineation?
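For the two counting questions above, the arithmetic can be checked with a short sketch. It assumes the 20 standard amino acids and counts substitutions only (no insertions or deletions). The helper name is hypothetical.

```python
import math

ALPHABET_SIZE = 20   # standard amino acids (assumption)
LENGTH = 100         # residues per protein

# Each of the 100 positions can independently hold any of the 20 amino acids.
n_sequences = ALPHABET_SIZE ** LENGTH
print(f"20^100 is about 10^{math.log10(n_sequences):.1f}")
print(f"written out, the number has {len(str(n_sequences))} digits")

def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

# Two sequences of length 100 can differ in at most 100 positions, so at most
# 100 substitutions (one per position) turn any one sequence into any other.
print(hamming_distance("A" * LENGTH, "C" * LENGTH))
```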
Read the excerpt from Thomas Mann's book Dr. Faustus, available on HuskyCT or at https://www.fadedpage.com/showbook.php?pid=20180329 (go to chapter III). This chapter can provide two insights:
- Scientific experiments in parlors, salons and living rooms were frequent and common entertainment in the early 1900s.
- The distinction between living systems and the mineral world was not established. Apparently life could be easily created from non-living constituents. My favorite example is Traube's cells. In the past I did the experiment in class, but now you have to watch the YouTube version instead: How to grow an artificial cell from water and salts ("Traube Cell" experiment).
- The membranes that form were the starting point to build the first osmometer and an important step in the development of cell theory. They clearly are not alive, but they grow and do look a lot like red algae.
Ask a question (not limited to Dr. Faustus) on the HuskyCT discussion board
Lecture 1 (8/26)
Goals:
- Know how to contact the instructor and TA.
- Know how your performance will be assessed and graded.
- Know that take-home exams and computer lab assignments are an important part of this course, and that they will be graded.
- Know that you need to maintain an electronic notebook
- Be able to define/circumscribe the field of Bioinformatics
Links:
Assignments for Wednesday (8/30):
- Study the Syllabus! Ask questions, if expectations are not clear.
- Consider if you want to participate in the CURE (course based undergraduate research experience) project.
- Read through the Slides on Homology (https://j.p.gogarten.uconn.edu/mcb3421_2024/class01_2024_homology.pptx). Note: "Read through" is short for read it, but don't overdo the studying.
- Make yourself familiar with the OneNote electronic notebook and write an entry for "lecture 1" (or set up your notebook in Joplin).
Assignments for Friday (8/30)