HOMEWORK ASSIGNMENT #6:
- Read chapter 5
- Write a script that reads a nucleotide sequence from a file in GenBank
format and writes it to a file in FASTA format. Give the FASTA-formatted file
an informative annotation line.
- Improve your program that counts the bases in a genome
- Add a counter of nucleotide excesses (A over T, G over C, or keto over
amino bases, i.e. (G+T)-(A+C)). Print the cumulative excess into a table
and plot the result with gnuplot.
- What does the result mean? Which of the above measures (any others you
could try?) shows the most bias?
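
As a starting point for the excess counter, here is a minimal sketch in Python
(the assignment does not prescribe a language, so translate it as needed). The
function names, the tab-separated column layout, and the "step" parameter are
all assumptions made for illustration. The sequence is assumed to be already
loaded into a string.

```python
def cumulative_excess(seq, plus, minus):
    """Running total along seq: +1 for each base in `plus`, -1 for each in `minus`."""
    total = 0
    running = []
    for base in seq.upper():
        if base in plus:
            total += 1
        elif base in minus:
            total -= 1
        # Any other character (e.g. N) leaves the total unchanged.
        running.append(total)
    return running


def write_table(seq, path, step=1):
    """Write position and the three cumulative excesses as tab-separated columns."""
    at = cumulative_excess(seq, "A", "T")      # A over T
    gc = cumulative_excess(seq, "G", "C")      # G over C
    keto = cumulative_excess(seq, "GT", "AC")  # keto (G+T) over amino (A+C)
    with open(path, "w") as out:
        # gnuplot treats lines starting with '#' as comments.
        out.write("# position\tA-T\tG-C\tketo-amino\n")
        # `step` thins the output so that genome-sized tables stay manageable.
        for i in range(0, len(seq), step):
            out.write(f"{i + 1}\t{at[i]}\t{gc[i]}\t{keto[i]}\n")
```

If the table were written to a file called excess.tsv (a hypothetical name),
a gnuplot command such as
plot "excess.tsv" using 1:3 with lines title "G-C"
would draw the second measure against sequence position.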
Extra challenge:
Does the same approach work for dinucleotide bias? How about larger oligonucleotides?
Try to implement the former and, if you have energy to spare, write some pseudocode
for the latter (oligonucleotides).
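
One possible way to approach the extra challenge is sketched below in Python.
It counts overlapping words of length k, so the same function covers
dinucleotides (k=2) and longer oligonucleotides. Comparing each observed
frequency with the frequency expected from the single-base composition is
only one of several ways to define "bias" and is an assumption here, as are
the function names.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping word of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def observed_over_expected(seq, k=2):
    """Ratio of each k-mer's observed frequency to the product of its base frequencies."""
    seq = seq.upper()
    base_freq = {base: n / len(seq) for base, n in Counter(seq).items()}
    kmers = count_kmers(seq, k)
    total = sum(kmers.values())
    ratios = {}
    for kmer, n in kmers.items():
        expected = 1.0
        for base in kmer:
            expected *= base_freq[base]
        # A ratio near 1 means no bias; above 1 is over-represented, below 1 under-represented.
        ratios[kmer] = (n / total) / expected
    return ratios
```

A cumulative version, in the spirit of the excess counter from the main
assignment, could add +1 for a chosen word and -1 for its counterpart while
sliding the same window along the sequence.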