Discussion of class expectations

Class 10 - Genome Dot Plots:

Comparison of ORFs (as encoded in amino acid sequences) in two completely sequenced genomes

  1. Download 2 genomes of your choice from NCBI genomes (in FASTA format, amino acid sequences)
  2. Install stand-alone BLAST on your computer. Free download for different platforms is at: ftp://ftp.ncbi.nih.gov/blast/executables/release/2.2.8/
  3. Format one genome (as a database) [execute "formatdb -i my_genome_file_name -o T -p T"]
  4. Do a BLAST search of one genome (the query genome) against the other genome (the database genome) [execute "blastall -i genome1_filename -d genome2_filename -e 0.0001 -p blastp -I T -m 8 -o output_filename"]
  5. Save the BLAST hits for each ORF (you will have to parse the results of the BLAST search obtained above)
  6. Determine the order in which the ORFs appear in each genome (hint: you can extract that information as in the exercise we did last class, by parsing the GenBank file of the genome)
  7. Plot the ORFs (in order of their appearance in the genome) of one genome on the X axis and those of the other genome on the Y axis.
  8. It is also interesting to plot a genome against itself (to find duplications and inversions in the genome).
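
Steps 5 to 7 can be sketched in Python roughly as follows. This is only a minimal illustration, not the required solution. It assumes the BLAST output is in the 12-column tab-separated format produced by "-m 8" (query id, subject id, ..., e-value, bit score), and it assumes you already have the ORF order of each genome as a list of identifiers (for example, from the GenBank parsing exercise). The function names, the sample identifiers, and the choice to keep only the highest-scoring hit per query ORF are all assumptions made for this example; plotting every hit is an equally reasonable choice.

```python
import csv


def best_hits(m8_lines):
    """Map each query ORF to its highest bit-score hit: {query: (subject, bit_score)}.

    m8_lines is any iterable of text lines in blastall "-m 8" tabular format.
    """
    best = {}
    for row in csv.reader(m8_lines, delimiter="\t"):
        # Skip blank lines and comment lines (comment lines appear with -m 9).
        if not row or row[0].startswith("#"):
            continue
        query, subject = row[0], row[1]
        bit_score = float(row[11])  # column 12 of the tabular output
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return best


def dot_plot_points(hits, order_x, order_y):
    """Convert hits into (x, y) positions using each ORF's rank in its genome.

    order_x and order_y list the ORF identifiers of the query and database
    genomes in the order they appear on the chromosome (hypothetical inputs).
    """
    index_x = {orf: i for i, orf in enumerate(order_x, start=1)}
    index_y = {orf: i for i, orf in enumerate(order_y, start=1)}
    points = []
    for query, (subject, _score) in hits.items():
        # Ignore identifiers that are missing from either ORF order list.
        if query in index_x and subject in index_y:
            points.append((index_x[query], index_y[subject]))
    return sorted(points)


# Made-up example data: three query ORFs, one of them with two hits.
SAMPLE_LINES = [
    "\t".join(fields) for fields in [
        ("orfA1", "orfB2", "85.0", "200", "30", "0", "1", "200", "1", "200", "1e-100", "350"),
        ("orfA1", "orfB3", "40.0", "180", "100", "2", "5", "190", "3", "185", "1e-20", "90.5"),
        ("orfA2", "orfB1", "70.0", "150", "45", "0", "1", "150", "1", "150", "1e-60", "210"),
        ("orfA3", "orfB3", "92.0", "300", "24", "0", "1", "300", "1", "300", "0.0", "600"),
    ]
]
ORDER_X = ["orfA1", "orfA2", "orfA3"]
ORDER_Y = ["orfB1", "orfB2", "orfB3"]

if __name__ == "__main__":
    hits = best_hits(SAMPLE_LINES)
    for x, y in dot_plot_points(hits, ORDER_X, ORDER_Y):
        print(x, y)
```

With a real search you would pass the opened output file instead of SAMPLE_LINES, e.g. best_hits(open("output_filename")). The resulting (x, y) pairs can then be drawn as a scatter plot with whatever plotting tool you prefer; for a genome plotted against itself you will probably want to skip each ORF's hit to itself.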

For an example of a genome dot plot, click here (scroll to the bottom)