Baron Karl Friedrich Hieronymus von Münchhausen |
Bootstrapping
is one of the most popular ways to assess the reliability of branches in a reconstructed phylogeny. The term goes back to Baron Münchhausen, who is said to have pulled himself out of a swamp by his own shoelaces. Briefly, positions of the aligned sequences are randomly sampled, with replacement, from the multiple sequence alignment. The sampled positions are assembled into new data sets of the same length as the original alignment, the so-called bootstrapped samples. Each position has a chance of about 63% of making it into a particular bootstrapped sample. If a grouping has a lot of support, it will be supported by at least some positions in each of the bootstrapped samples, and all the bootstrapped samples will yield this grouping.

The advantages of bootstrapping are that it can be applied to different methods of phylogenetic reconstruction, and that it assigns a probability-like number to every possible partition of the dataset (= branch in the resulting tree). Its disadvantages are that the support for individual groups decreases as you add more sequences to the dataset, and that it only measures how much support for a partition is in your data given a method of analysis. If the method of reconstruction falls victim to a bias or an artifact, the same error will be reproduced in every one of the bootstrapped samples, and the result will be high bootstrap support for a wrong grouping.
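The figure of about 63% can be checked with a short calculation. This is a sketch under the assumption that a bootstrapped sample contains as many positions as the original alignment; the symbol $n$ for that number of positions is introduced here for illustration only. A single random draw misses a given position with probability $1 - 1/n$, and the $n$ draws are independent, so:

$$
P(\text{position appears at least once}) = 1 - \left(1 - \frac{1}{n}\right)^{n} \;\longrightarrow\; 1 - e^{-1} \approx 0.632 \quad (n \to \infty)
$$

For $n = 100$ the exact value is already about 0.634, so the approximation holds well for alignments of realistic length.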
|
Creating a bootstrapped sample
Joe Felsenstein describes the bootstrap procedure in the manual for his seqboot program (part of the PHYLIP package) as follows:
The sample input and output of the seqboot program illustrate the generation of the bootstrapped samples:
TEST DATA SET
CONTENTS OF OUTPUT FILE (if replicates are set to 10 and the seed to 4333)
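The resampling step itself can be sketched in a few lines of Python. This is a minimal illustration of drawing alignment positions with replacement, not the seqboot implementation: the function name `bootstrap_alignment` and the toy alignment are hypothetical, and because Python's random number generator differs from the one in PHYLIP, the same seed will not reproduce seqboot's output.

```python
import random


def bootstrap_alignment(alignment, n_replicates, seed):
    """Return n_replicates bootstrapped copies of an alignment.

    alignment is a dict mapping sequence names to aligned sequences
    of equal length (a hypothetical, simplified input format).
    """
    names = list(alignment)
    if not names:
        raise ValueError("alignment is empty")
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")

    rng = random.Random(seed)  # seeded so that a run can be repeated
    replicates = []
    for _ in range(n_replicates):
        # Draw column indices with replacement; the same columns are
        # used for every sequence so that positions stay aligned.
        columns = [rng.randrange(length) for _ in range(length)]
        replicates.append(
            {name: "".join(alignment[name][c] for c in columns) for name in names}
        )
    return replicates


# Hypothetical toy alignment, for illustration only.
toy = {
    "Alpha": "AACGTGGCCA",
    "Beta": "AAGGTCGCCA",
    "Gamma": "CATTTCGTCA",
}

samples = bootstrap_alignment(toy, n_replicates=10, seed=4333)
for name, seq in samples[0].items():
    print(f"{name:<10}{seq}")
```

Each replicate has the same number of sequences and positions as the input, but some original columns appear several times and others not at all. A tree would then be reconstructed from every replicate, and the bootstrap support of a grouping is the fraction of those trees that contain it.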
|