Bootstrapping and non-informative sites

For phylogenetic analyses it is a good idea to use only those positions for which one is sure that the sites are indeed homologous. However, to obtain a bootstrap value better than 90% for a branch, one needs only three sites that change along this branch, provided no sites conflict with it. The branch is then recovered whenever at least one of the three sites is drawn into a bootstrapped sample, which for an alignment of n sites happens with probability 1 - (1 - 3/n)^n, or about 95% for large n. These bootstrap values are not lowered by adding non-informative sites to the alignment. The following example of four sequences has 5 positions supporting the central branch in one orientation and 2 positions each supporting the two alternatives:

spec1 ACGTG AC CG

spec2 ACGTG GG AA

spec3 TGCAC GG CG

spec4 TGCAC AC AA

Regardless of how many non-informative sites are added, the grouping of spec1 with spec2 (and spec3 with spec4) is found in about 80% of the bootstrapped samples.
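The figure of about 80% can be approximated with a small simulation. The sketch below is not the procedure used to produce the table; it assumes maximum parsimony on four taxa, where the tree for a bootstrapped sample is the split supported by the most sampled sites, and it assumes that ties are divided equally among the tied splits. The function names site_class and bootstrap_support are hypothetical.

```python
import random

# The nine informative sites of the example, one string per species.
SEQS = {
    "spec1": "ACGTGACCG",
    "spec2": "ACGTGGGAA",
    "spec3": "TGCACGGCG",
    "spec4": "TGCACACAA",
}


def site_class(a, b, c, d):
    """Return the index of the split a site supports, or None if it supports none."""
    if a == b and c == d and a != c:
        return 0  # (spec1, spec2) | (spec3, spec4)
    if a == c and b == d and a != b:
        return 1  # (spec1, spec3) | (spec2, spec4)
    if a == d and b == c and a != b:
        return 2  # (spec1, spec4) | (spec2, spec3)
    return None


def bootstrap_support(seqs, n_pad=0, n_samples=1000, seed=0):
    """Percent support for each of the three splits under site resampling."""
    rng = random.Random(seed)
    classes = [site_class(*column) for column in zip(*seqs.values())]
    classes += [None] * n_pad  # added sites are non-informative by construction
    support = [0.0, 0.0, 0.0]
    for _ in range(n_samples):
        counts = [0, 0, 0]
        # Draw as many sites as the alignment has, with replacement.
        for cls in rng.choices(classes, k=len(classes)):
            if cls is not None:
                counts[cls] += 1
        best = max(counts)
        winners = [i for i, n in enumerate(counts) if n == best]
        for i in winners:  # assumption: ties are shared equally
            support[i] += 1.0 / len(winners)
    return [100.0 * s / n_samples for s in support]


if __name__ == "__main__":
    for n_pad in (0, 50, 200):
        print(n_pad, bootstrap_support(SEQS, n_pad=n_pad, n_samples=1000))
```

Under these assumptions the support for the spec1 with spec2 split should stay close to 80% for any value of n_pad, differing only by sampling noise.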

 

Example of a sequence file with 50 added non-informative sites:

spec1 AAGCAGCTGT AAGCAGCTGT AAGCAGCTGT AAGCAGCTGT AAGCAGCTGT ACGTG ACCG

spec2 AACCGCCTGT AACCGCCTGT AACCGCCTGT AACCGCCTGT AACCGCCTGT ACGTG GGAA

spec3 AACGACCTCT AACGACCTCT AACGACCTCT AACGACCTCT AACGACCTCT TGCAC GGCG

spec4 ATCCACCAGA ATCCACCAGA ATCCACCAGA ATCCACCAGA ATCCACCAGA TGCAC ACAA
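Under parsimony, a site is informative only if at least two character states each occur in at least two sequences. The sketch below, with hypothetical helper names, checks that the 10-site block repeated in the file above contains no such site and rebuilds the padded alignment from it.

```python
from collections import Counter

# The 10-site block that is repeated in the file above.
PAD_BLOCK = {
    "spec1": "AAGCAGCTGT",
    "spec2": "AACCGCCTGT",
    "spec3": "AACGACCTCT",
    "spec4": "ATCCACCAGA",
}

# The nine informative sites of the original example.
CORE = {
    "spec1": "ACGTGACCG",
    "spec2": "ACGTGGGAA",
    "spec3": "TGCACGGCG",
    "spec4": "TGCACACAA",
}


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def padded_alignment(n_blocks):
    """Prepend n_blocks copies of the non-informative block to the core sites."""
    return {name: PAD_BLOCK[name] * n_blocks + CORE[name] for name in CORE}


if __name__ == "__main__":
    alignment = padded_alignment(5)  # 5 blocks of 10 sites = 50 added sites
    for name, seq in alignment.items():
        print(name, seq)
    columns = list(zip(*alignment.values()))
    print(sum(is_parsimony_informative(c) for c in columns), "informative sites")
```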

 

Table of bootstrap support (percent of bootstrapped samples). Means and standard deviations are over 3 replicates with 100 bootstrapped samples each.

spec1 with spec2: 5 informative sites support this branch.
spec1 with spec3 or spec1 with spec4: 2 informative sites each support this branch.

Non-informative sites added   Grouping                               Individual values                         Mean    Standard deviation

+0                            spec1 with spec2                       82.5, 77.83, 84.17                        81.50   3.29
                              spec1 with spec3 or spec1 with spec4   12.17, 3.67, 11, 6.5, 11.83, 10.33        9.25    3.41

+50                           spec1 with spec2                       77.00, 77.00, 78.83                       77.61   1.06
                              spec1 with spec3 or spec1 with spec4   13.00, 10.00, 14.00, 9.00, 11.33, 9.83    11.19   1.96

+200                          spec1 with spec2                       78.83, 78.33, 76.50                       77.89   1.23
                              spec1 with spec3 or spec1 with spec4   14.33, 6.83, 12.83, 8.83, 14.50, 9.00     11.05   3.25

+2000                         spec1 with spec2                       79.67, 83.33, 81.33                       81.44   1.83
                              spec1 with spec3 or spec1 with spec4   10.67, 9.67, 10.33, 6.33, 10.83, 7.83     9.28    1.81
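The summary figures can be re-derived from the individual values. A minimal sketch for the +0 case follows. It assumes the reported standard deviation is the sample standard deviation (n - 1 denominator), which is the variant that reproduces the reported figures. The variable names are illustrative.

```python
import statistics

# Individual bootstrap support values for +0 non-informative sites.
spec1_with_spec2 = [82.5, 77.83, 84.17]
alternatives = [12.17, 3.67, 11, 6.5, 11.83, 10.33]

for label, values in (("spec1 with spec2", spec1_with_spec2),
                      ("alternatives", alternatives)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    print(f"{label}: mean {mean:.2f}, standard deviation {sd:.2f}")
```

This prints 81.50 and 3.29 for the spec1 with spec2 grouping and 9.25 and 3.41 for the alternatives, matching the reported values.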