The output of PRSS for two related sequences (V-ATPase A-subunits from fly and Giardia) is given here:
s-w est
< 38 0 0:
40 0 0:
42 0 0:
44 1 1:*
46 5 3:==*==
48 7 5:====*==
50 12 8:=======*====
52 11 9:========*==
54 10 10:=========*
56 10 10:=========*
58 9 10:=========*
60 4 8:====   *
62 4 7:====  *
64 7 6:=====*=
66 3 5:=== *
68 4 4:===*
70 3 3:==*
72 2 2:=*
74 1 2:=*
76 0 1:*
78 1 1:*
80 2 1:*=
82 1 1:*
84 0 0:
86 0 0:
88 0 0:
90 0 0:
92 1 0:=
94 2 0:==
96 0 0:
98 0 0:
100 0 0:
102 0 0:
>104 0 0: O
giardiaA.txt, 654 aa vs DrosA.txt
61400 residues in 100 sequences,
BLOSUM50 matrix, gap penalties: -12,-2
unshuffled s-w score: 1861; shuffled score range: 45 - 96
Lambda: 0.1405 K: 0.0059872; P(1861)= 7.1435e-111
For 100 sequences, a score >=1861 is expected 7.14e-109 times
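The = signs show the observed number of shuffled sequences in each score bin (the s-w column); the stars (*) mark the distribution fitted to these randomized data (the est column).
The “probability” of obtaining the actual (unshuffled) alignment score or better by chance is calculated from this fitted distribution.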
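As a rough check of how the probability follows from the fitted parameters, the sketch below assumes the standard extreme-value form P(S >= x) = 1 - exp(-K m n exp(-Lambda x)), where m and n are the two sequence lengths. This form is an assumption about what PRSS computes internally; the output itself does not state it. The second length (614) is inferred from "61400 residues in 100 sequences", and the function name is illustrative. With the rounded Lambda and K printed above, the result lands within roughly 10% of the reported P(1861).

```python
import math

def evd_pvalue(score, lam, k, m, n):
    """P(S >= score) under an assumed extreme-value distribution.

    Assumed form: 1 - exp(-k * m * n * exp(-lam * score)),
    with m and n the lengths of the two sequences.
    """
    expected_hits = k * m * n * math.exp(-lam * score)
    # expm1 keeps precision when expected_hits is tiny;
    # a plain 1 - exp(-x) would round to 0 here.
    return -math.expm1(-expected_hits)

# Values from the first run; n = 61400 / 100 is inferred, not printed by PRSS.
p = evd_pvalue(score=1861, lam=0.1405, k=0.0059872, m=654, n=614)
print(f"P(1861) ~ {p:.2e}   (PRSS reports 7.1435e-111)")
```

The small remaining difference is plausibly due to the rounding of the printed Lambda, since the score enters through an exponent.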
The fit to the histogram improves when more shuffling rounds are included (here 1000 instead of 100), but the conclusion stays the same:
giardiaa.txt, 654 aa vs drosA.txt
// stuff deleted
96 0 1:*
98 0 0:
100 1 0:=
102 0 0:
104 0 0:
106 0 0:
108 0 0:
>110 0 0: O
614000 residues in 1000 sequences,
BLOSUM50 matrix, gap penalties: -12,-2
unshuffled s-w score: 1861; shuffled score range: 42 - 102
Lambda: 0.15491 K: 0.012682; P(1861)= 3.4764e-122
For 1000 sequences, a score >=1861 is expected 3.48e-119 times
Comparing two less closely related sequences (an ATPase involved in protein export versus the V-ATPase A-subunit), one obtains:
giardiaa.txt, 654 aa vs flii.txt
s-w est
< 36 0 0:
38 0 0:
40 0 0:
42 2 1:*=
44 4 3:==*=
46 12 5:====*=======
48 3 7:===   *
50 7 9:======= *
52 10 10:=========*
54 10 10:=========*
56 9 9:========*
58 7 9:======= *
60 7 7:======*
62 8 6:=====*==
64 7 5:====*==
66 3 4:===*
68 4 3:==*=
70 0 3:  *
72 1 2:=*
74 2 2:=*
76 0 1:*
78 0 1:*
80 1 1:*
82 1 1:*
84 1 0:=
86 0 0:
88 0 0:
90 0 0:
92 0 0:
94 1 0:=
96 0 0:
98 0 0:
100 0 0:
102 0 0:
>104 0 0: O
43400 residues in 100 sequences,
BLOSUM50 matrix, gap penalties: -12,-2
unshuffled s-w score: 269; shuffled score range: 43 - 96
Lambda: 0.13635 K: 0.0055017; P(269)= 1.9668e-13
For 100 sequences, a score >=269 is expected 1.97e-11 times
For sequences that are unrelated, or whose relationship cannot be detected, the output looks as follows:
test2.txt, 565 aa vs flii.txt
s-w est
< 36 0 0:
38 0 0:
40 0 0:
42 3 1:*==
44 8 4:===*====
46 6 7:======*
48 13 11:==========*==
50 11 12:===========*
52 9 13:=========   *
54 15 12:===========*=== O
56 8 10:======== *
58 6 8:====== *
60 6 6:=====*
62 7 5:====*==
64 2 3:==*
66 3 2:=*=
68 2 2:=*
70 0 1:*
72 0 1:*
74 0 1:*
76 0 0:
78 0 0:
80 0 0:
82 0 0:
84 0 0:
86 0 0:
88 1 0:=
90 0 0:
92 0 0:
94 0 0:
96 0 0:
> 98 0 0:
43400 residues in 100 sequences,
BLOSUM50 matrix, gap penalties: -12,-2
unshuffled s-w score: 54; shuffled score range: 43 - 90
Lambda: 0.17371 K: 0.032327; P(54)= 0.5179
For 100 sequences, a score >=54 is expected 52 times
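The last line of each summary appears to be simply the probability multiplied by the number of shuffled sequences. The short check below, a sketch with illustrative labels and names, recomputes that expectation for the four runs shown above and compares it with the value PRSS printed.

```python
# (label, P(score), number of shuffled sequences, expectation reported by PRSS)
runs = [
    ("giardiaA vs DrosA, 100 shuffles", 7.1435e-111, 100, 7.14e-109),
    ("giardiaA vs DrosA, 1000 shuffles", 3.4764e-122, 1000, 3.48e-119),
    ("giardiaA vs flii", 1.9668e-13, 100, 1.97e-11),
    ("test2 vs flii", 0.5179, 100, 52),
]

def expected_count(p, n_sequences):
    """Expected number of shuffled sequences scoring at least as well."""
    return p * n_sequences

for label, p, n, reported in runs:
    e = expected_count(p, n)
    print(f"{label}: computed {e:.3g}, reported {reported:.3g}")
```

An expectation far below 1, as in the first three runs, means that essentially no shuffled sequence is expected to reach the unshuffled score by chance. An expectation of about 52 out of 100, as in the last run, means that a score of 54 is typical of shuffled sequences.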