I think there was a problem with oligo2.pl.
The modified version is called oligo3.pl. It
removes all complements, including palindromes
(e.g., TTTAAA). This is OK in our case, because a palindrome is its own
reverse complement and so cannot show any strand bias anyhow.
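To illustrate the complement removal described above, here is a minimal Python sketch. It is not the oligo3.pl code: the function names are hypothetical, and it assumes that "removing complements" means keeping the first-seen member of each oligo / reverse-complement pair and dropping palindromes entirely.

```python
# Illustrative sketch only; NOT the actual oligo3.pl implementation.
# Assumption: oligos contain only the letters A, C, G and T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo: str) -> str:
    """Return the reverse complement of a DNA oligo."""
    return oligo.translate(COMPLEMENT)[::-1]


def is_palindrome(oligo: str) -> bool:
    """A palindromic oligo is identical to its own reverse complement."""
    return oligo == reverse_complement(oligo)


def drop_complements(oligos):
    """Keep the first-seen member of each complementary pair; drop palindromes."""
    kept = []
    seen = set()
    for oligo in oligos:
        rc = reverse_complement(oligo)
        # Skip palindromes and anything whose partner (or itself) was already kept.
        if oligo == rc or oligo in seen or rc in seen:
            continue
        seen.add(oligo)
        kept.append(oligo)
    return kept
```

For example, `drop_complements(["AAAGAA", "TTCTTT", "TTTAAA"])` keeps only `AAAGAA`, because `TTCTTT` is its reverse complement and `TTTAAA` is a palindrome.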
Using the new oligo6 and the master.pl program, I got the
following results:
AquifexE000657
252 AAAGAA
242 AAGAAA GAAGAA
Thermotoga_maritimaMS8
365 CTTTTC
324 GAAAAA
Ecoli_cft073AE014075 (gamma prot)
764 CAGCGC
740 CCACCA
EcoliO157BA000007 (gamma prot)
767 CAGCGC
743 CCACCA
Ecoli_K12U00096 (gamma prot)
704 CAGCGC
609 CGCCAC
587 CCACCA
Salmonella_typhiAE014613 (gamma prot)
824 CCGCCA
711 CGCCAG
Salmonella_typhimuriumAE006468 (gamma prot)
937 CCGCCA
918 CAGCGC
Shigella_flexneriAE005674 (gamma prot)
689 CAGCGC
598 CCACCA
vibrio_choleraeAE003852 (gamma prot)
505 CACCAC
491 CCACCA
399 CATCAA
Vibrio_parahaemBA000031 (gamma prot)
562 CACCAA
514 CCACCA
509 TCACCA
vibrio_vulnificusAE016795 (gamma prot)
673 CACCAA
669 CCACCA
611 CACCAC
Shewanella_oneidensisAE014299 (gamma prot)
692 CACCAC
628 AAATCA
Xanthomonas_citriAE008923 (gamma prot)
945 CACCGC
854 CAGCGC
804 CAGCAC
xylella_fastAE009442 (gamma prot)
1095 CAACAC
881 CAGCAA
877 CAACAA
Buchnera_spBA000003 (gamma prot)
1538 AAAAAA
722 AAAAAT
wigglesworthiaBA000021 (gamma prot)
727 AAAAAA
464 AAAAAT
WolbachiaAE017196 (gamma prot)
487 AAAAAA
277 AAGAAA
272 GAAAAA
Hemophilus1
281 AATAAA
231 AACCAA
P_gingivalis
171 AAGAAA
155 AAAAAA
153 CCGGCA
** Is there a limit to how large one could make the oligos? Why would it be meaningless to calculate the bias for a 30mer?
** Do different organisms have the same or similar oligo strand bias? How quickly does the bias change (with respect to the oligo sequence) when one moves to less closely related organisms? Possible questions to address: Do archaea have the same bias as bacteria? How about genomes of the same species or the same genus?
** To what extent can the oligo strand bias be explained by the nucleotide strand bias? Can one calculate how big an oligo bias should be expected from a given single-nucleotide bias?
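
One possible starting point for the last question, and for the question about oligo length, is a simple independence model. It is only a sketch under stated assumptions: each position on the strand is drawn independently from the single-nucleotide frequencies, and "bias" is taken as the ratio of an oligo's expected count to that of its reverse complement on the same strand. The notes do not define the bias measure, so that definition, the function names, and the example frequencies are all hypothetical.

```python
# Illustrative sketch only; assumes independent positions, which real genomes violate.
from math import prod

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo: str) -> str:
    """Return the reverse complement of a DNA oligo (A/C/G/T only)."""
    return oligo.translate(COMPLEMENT)[::-1]


def expected_count(oligo: str, freqs: dict, genome_length: int) -> float:
    """Expected occurrences of `oligo` on one strand under the independence model."""
    positions = genome_length - len(oligo) + 1
    return positions * prod(freqs[base] for base in oligo)


def expected_bias_ratio(oligo: str, freqs: dict) -> float:
    """Expected count of `oligo` divided by that of its reverse complement."""
    forward = prod(freqs[base] for base in oligo)
    reverse = prod(freqs[base] for base in reverse_complement(oligo))
    return forward / reverse


# Hypothetical single-strand nucleotide frequencies with a slight G>C and T>A skew.
skewed = {"A": 0.24, "C": 0.24, "G": 0.26, "T": 0.26}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

ratio_cagcgc = expected_bias_ratio("CAGCGC", skewed)
six_mer = expected_count("AAAGAA", uniform, 5_000_000)
thirty_mer = expected_count("A" * 30, uniform, 5_000_000)
```

Comparing `ratio_cagcgc` with the observed ratio would indicate how much of the oligo bias the nucleotide skew alone accounts for. The last two values bear on the length question: with uniform frequencies and an assumed 5 Mb genome, there are only 4^6 = 4096 possible 6-mers, so each is expected roughly 1,200 times, whereas there are 4^30 (about 1.15 x 10^18) possible 30-mers, so any particular one is expected far less than once and its counts would be dominated by chance.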