Discussion of class expectations
Class 10 - Genome Dot Plots:
Comparison of ORFs (as encoded in amino acid sequences) in two completely sequenced genomes
- Download 2 genomes of your choice from NCBI genomes (in FASTA format, amino acid sequences)
- Install stand-alone BLAST on your computer. Free downloads for different platforms are available at:
ftp://ftp.ncbi.nih.gov/blast/executables/release/2.2.8/
- Format one genome (as a database) [execute "formatdb -i my_genome_file_name -o T -p T"]
- Do a BLAST search of one genome (a query genome) against the other genome (a database genome)
[blastall -i genome1_filename -d genome2_filename -e 0.0001 -p blastp -I T -m 8 -o output_filename]
- Save each BLAST hit per ORF (you will have to parse the results of the BLAST search obtained above)
- Determine the order in which the ORFs appear in each genome (hint: you can extract that information as in the exercise
we did last class, by parsing the GenBank file of the genome)
- Plot the ORFs of one genome (in order of their appearance in the genome) on the X axis and the ORFs of the other genome
on the Y axis.
- It is also interesting to plot a genome against itself (to find duplications and inversions in the genome).
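
The parsing step above could be sketched roughly as follows. This is only one possible approach, not the required solution. It assumes the tabular output produced by "-m 8", which has 12 tab-separated columns (query id, subject id, % identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score). The function name hits_per_orf is made up for illustration.

```python
def hits_per_orf(lines):
    """Group BLAST tabular (-m 8) hits by query ORF.

    Returns {query_id: [(subject_id, evalue, bit_score), ...]},
    with hits kept in the order they appear in the BLAST output.
    """
    hits = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (some output modes add "#" headers).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular record
        query, subject = fields[0], fields[1]
        evalue_text = fields[10]
        # Defensive assumption: tolerate e-values written as "e-100".
        if evalue_text.startswith("e"):
            evalue_text = "1" + evalue_text
        evalue = float(evalue_text)
        bit_score = float(fields[11])
        hits.setdefault(query, []).append((subject, evalue, bit_score))
    return hits


# Usage sketch: "output_filename" is the file written by blastall -o.
# with open("output_filename") as handle:
#     hits = hits_per_orf(handle)
```

If only one hit per ORF is wanted, taking the entry with the highest bit score from each list is one reasonable choice, though that selection rule is an assumption and not part of the assignment text.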
For an example of a genome dot plot, click here
(scroll to the bottom).
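
The plotting step can be sketched as below: a minimal illustration, assuming you already have the ORF order for each genome as a list of identifiers and the BLAST hits as a dictionary from query ORF to matching ORFs. It also assumes the identifiers in the BLAST output have already been mapped to the same names used in the ORF order lists. The function names and the use of matplotlib are illustrative choices, not requirements of the assignment.

```python
def dot_plot_points(order_x, order_y, hits):
    """Convert ORF-to-ORF hits into (x, y) positions for a dot plot.

    order_x: ORF ids of the query genome, in genome order (X axis)
    order_y: ORF ids of the database genome, in genome order (Y axis)
    hits:    {query_orf_id: [subject_orf_id, ...]}
    """
    # Position of each ORF along its genome, counting from 1.
    pos_x = {orf: i for i, orf in enumerate(order_x, start=1)}
    pos_y = {orf: i for i, orf in enumerate(order_y, start=1)}
    points = []
    for query, subjects in hits.items():
        if query not in pos_x:
            continue  # hit for an ORF missing from the order list
        for subject in subjects:
            if subject in pos_y:
                points.append((pos_x[query], pos_y[subject]))
    return points


def draw_dot_plot(points, filename="dotplot.png"):
    """Draw the points as a scatter plot (requires matplotlib)."""
    import matplotlib.pyplot as plt

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    plt.scatter(xs, ys, s=2)
    plt.xlabel("ORF position in genome 1")
    plt.ylabel("ORF position in genome 2")
    plt.savefig(filename)


# Tiny example with made-up ORF names:
example = dot_plot_points(["a", "b", "c"], ["p", "q", "r"],
                          {"a": ["r"], "c": ["p"]})
# example == [(1, 3), (3, 1)]
```

In a plot of this kind, conserved gene order tends to show up as a diagonal running in one direction and an inverted segment as a diagonal running in the other. When a genome is plotted against itself, off-diagonal dots are candidates for duplications.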