HOMEWORK ASSIGNMENT #6:

Read chapter 5
Write a script that reads a nucleotide sequence from a file in GenBank format and writes it to a file in FASTA format. Include an informative annotation line in the FASTA-formatted file.
Improve your program that counts the bases in a genome:
Add a counter of nucleotide excesses: A over T, G over C, or keto over amino bases, (G+T)-(A+C). Print the cumulative excess along the sequence to a table and plot the result with gnuplot.
What does the result mean? Which of the above measures (or any others you could try) shows the most bias?
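
As a starting point, the cumulative excess could be computed roughly as follows. This is only a sketch in Python, not a required solution: the function names cumulative_excess and write_table, the output column order, and the choice to skip symbols other than A, C, G and T are all assumptions made for illustration.

```python
def cumulative_excess(sequence):
    """Return (position, A-T, G-C, keto-amino) running totals for each base."""
    at = gc = keto = 0
    rows = []
    for position, base in enumerate(sequence.upper(), start=1):
        if base == "A":      # amino base
            at += 1
            keto -= 1
        elif base == "T":    # keto base
            at -= 1
            keto += 1
        elif base == "G":    # keto base
            gc += 1
            keto += 1
        elif base == "C":    # amino base
            gc -= 1
            keto -= 1
        # Assumption: any other symbol (N, gaps, ...) leaves the totals unchanged.
        rows.append((position, at, gc, keto))
    return rows


def write_table(rows, path, step=1):
    """Write every step-th row as whitespace-separated columns for gnuplot."""
    with open(path, "w") as handle:
        handle.write("# position A-T G-C keto-amino\n")
        for row in rows[::step]:
            handle.write("%d %d %d %d\n" % row)
```

If the table is written to a file such as excess.dat (a hypothetical name), a gnuplot command like plot "excess.dat" using 1:2 with lines would draw the A over T curve; columns 3 and 4 hold the other two measures. For a whole genome, a step larger than 1 keeps the table small.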

Extra challenge:
Does the same approach work for dinucleotide bias? How about larger oligonucleotides? Try to implement the former and, if you have energy to spare, write some "pseudocode" for the latter (oligonucleotides).
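
One possible first step for the challenge is to count overlapping words of a fixed length. The sketch below is an assumption about how that might look in Python: the name count_oligos and the decision to ignore words containing anything other than A, C, G or T are illustrative, and turning the counts into a bias measure is left open.

```python
from collections import Counter


def count_oligos(sequence, k=2):
    """Count overlapping words of length k made only of A, C, G and T."""
    sequence = sequence.upper()
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        word = sequence[start:start + k]
        # Assumption: words with ambiguous symbols (e.g. N) are skipped.
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts
```

With k=2 this counts dinucleotides; larger values of k cover longer oligonucleotides, although the number of possible words grows as 4 to the power k.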