
Minimal Values for Reliability of Bootstrap and Jackknife Proportions, Decay Index, and Bayesian Posterior Probability: Supplement to Poster Session at Molecular Systematics of Bryophytes: Progress, Problems and Perspectives, MBG, September 6, 2003

 

Richard H. Zander

Missouri Botanical Garden, PO Box 299, St. Louis, MO 63166--0299 USA; September 6, 2003

 


Nonparametric bootstrap and jackknife proportions (BP and JK), the Decay Index (DI), and Bayesian posterior probabilities (BPP) were obtained from artificial 4-taxon data sets predetermined to have .95 confidence limits through an exact binomial calculation. The binomial confidence interval (CI) is 1 minus the chance of the data occurring randomly; the null hypothesis is a star, and the alternative hypothesis is shared ancestry as an explanation of the optimal branch arrangement. AB is the support in steps for the tree ((AB)C,D), which is here always the optimal tree; AC is the support for ((AC)B,D); and BC is the support for ((BC)A,D). Once we have the optimal tree, the steps represented by AC and BC are assumed to be generated randomly by parallelism (only one arrangement can be supported by shared ancestry). The common measures of reliability all respond to variation in the AC:BC ratio, with higher ratios yielding lower reliability values, but the binomial CI does not vary with the AC:BC ratio. Therefore, only a nearly equal ratio of AC:BC will unambiguously signal the minimum BP, JK, DI and BPP values that correspond to a .95 binomial CI.
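The exact binomial calculation can be sketched as follows. The text above does not spell out the formula, so this sketch rests on an assumption: under the star null, each of the AB + AC + BC steps is taken to fall on any one of the three arrangements with probability 1/3, and the CI is 1 minus the one-tailed chance of AB or more steps landing on the optimal arrangement. The function name `binomial_ci` is illustrative only.

```python
from math import comb

def binomial_ci(ab: int, ac: int, bc: int, p: float = 1 / 3) -> float:
    """Return 1 minus the one-tailed binomial chance of >= ab hits.

    Assumption (not stated in the source): n = ab + ac + bc trials,
    each supporting the optimal arrangement with probability p = 1/3.
    """
    n = ab + ac + bc
    # Upper tail: probability of ab or more of the n steps favoring AB
    tail = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(ab, n + 1))
    return 1.0 - tail

for ab, ac, bc in [(3, 0, 0), (4, 1, 0), (4, 1, 1)]:
    print(f"{ab}:{ac}:{bc} -> binomial CI = {binomial_ci(ab, ac, bc):.3f}")
```

Under this assumption, 3:0:0 and 4:1:0 both reach .95, while 4:1:1 falls to about .90. Only the total AC + BC enters the calculation, which is why the binomial CI does not change with the AC:BC ratio.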



Table of optimal branch lengths and the minimum values of some common reliability measures needed to unambiguously attain a binomial confidence interval of .95. Interpolation may be needed.


Len. = length of optimal branch; Max. AB:AC:BC = maximum ratio needed for .95 CI; Min. = minimum value needed.

Len.  Max. AB:AC:BC  Min. BP  Min. JK  Min. DI  Min. BPP

03    03:00:00       1.00     1.00     3        .99
04    04:01:00       .95      .79      3        .99
05    05:01:01       .95      .90      4        1.00
10    10:04:04       .91      .92      6        1.00
15    15:08:07       .91      .91      7        .99
20    20:12:11       .89      .89      8        .98
25    25:16:15       .89      .89      9        .98
30    30:20:19       .89      .89      10       .97
35    35:24:23       .88      .89      11       .95
40    40:28:27       .89      .89      12       .96
45    45:32:32       .87      .88      13       .95
50    50:37:36       .87      .87      13       .93
55    55:41:40       .87      .87      14       .92
60    60:45:45       .88      .88      15       .91

 


 

Thus, if you know both the branch length and the reliability value for an internode, you can gauge the binomial CI. If any weighting of steps genuinely reflects the expected likelihood of individual evolutionary events, then this should work for molecularly based cladograms.
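As a rough illustration of gauging the CI from branch length and BP, the sketch below looks up the minimum BP for a given branch length using the BP column of the .95 table. Linear interpolation between tabulated lengths is an assumption made here (the table caption says only that interpolation may be needed), and the names `MIN_BP_95`, `min_bp_needed`, and `attains_95` are hypothetical.

```python
# (branch length, minimum BP) pairs copied from the .95 binomial CI table
MIN_BP_95 = [
    (3, 1.00), (4, 0.95), (5, 0.95), (10, 0.91), (15, 0.91),
    (20, 0.89), (25, 0.89), (30, 0.89), (35, 0.88), (40, 0.89),
    (45, 0.87), (50, 0.87), (55, 0.87), (60, 0.88),
]

def min_bp_needed(length: float) -> float:
    """Linearly interpolate the minimum BP for a branch length (assumption)."""
    if not MIN_BP_95[0][0] <= length <= MIN_BP_95[-1][0]:
        raise ValueError("branch length outside the tabulated range 3 to 60")
    # Find the pair of neighboring table rows that brackets the length
    for (x0, y0), (x1, y1) in zip(MIN_BP_95, MIN_BP_95[1:]):
        if x0 <= length <= x1:
            return y0 + (y1 - y0) * (length - x0) / (x1 - x0)
    raise AssertionError("unreachable: range was checked above")

def attains_95(length: float, bp: float) -> bool:
    """True if the observed BP meets the tabulated minimum for this length."""
    return bp >= min_bp_needed(length)

print(min_bp_needed(7.5))    # halfway between the rows for lengths 5 and 10
print(attains_95(12, 0.95))  # BP well above the minimum for a 12-step branch
```

The same lookup could be repeated for the JK, DI, and BPP columns; values that sit exactly on a tabulated minimum are best compared with a small tolerance because of floating-point rounding.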

 

Addendum: For those interested in binomial CIs of .90 and .99, tables for BP and BPP are given below. The local DIs are easy to calculate as the difference between AB and AC.
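A minimal sketch of that DI calculation, assuming the ratios are written as zero-padded AB:AC:BC strings as in the tables (the function name `local_di` is illustrative):

```python
def local_di(ratio: str) -> int:
    """Local Decay Index: AB minus AC, from an 'AB:AC:BC' string."""
    ab, ac, _bc = (int(part) for part in ratio.split(":"))
    return ab - ac

for ratio in ["05:02:01", "10:05:05", "60:41:40"]:
    print(ratio, "-> DI =", local_di(ratio))
```

Applied to the .95 table, this reproduces the tabulated DI column, for example 10:04:04 gives 6 and 60:45:45 gives 15.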

 

 

90% binomial CI

Len. AB:AC:BC BP BPP

05 05:02:01 .87 .98

10 10:05:05 .85 .98

15 15:09:09 .82 .96

20 20:13:13 .82 .95

25 25:17:17 .82 .94

30 30:22:21 .82 .94

35 35:26:25 .80 .91

40 40:30:30 .80 .90

45 45:35:34 .80 .88

50 50:39:39 .79 .86

55 55:44:43 .80 .84

60 60:48:48 .79 .83

 

 

99% binomial CI

Len. AB:AC:BC BP BPP

05 n.a. n.a. 1.00

10 10:03:02 .98 1.00

15 15:06:05 1.00 1.00

20 20:09:09 .97 1.00

25 25:13:12 .97 1.00

30 30:16:16 .97 1.00

35 35:20:20 .96 1.00

40 40:24:24 .97 .99

45 45:28:28 .95 .99

50 50:32:32 .96 .99

55 55:36:36 .96 .99

60 60:41:40 .95 .98

 


Some Relevant Literature

 

BAYESIAN POSTERIOR PROBABILITY

Huelsenbeck, J. P., B. Larget, R. E. Miller, and F. Ronquist. 2002. Potential applications and pitfalls of Bayesian inference of phylogeny. Syst. Biol. 51: 673--688.

Huelsenbeck, J. P., B. Rannala, and B. Larget. 2000. A Bayesian framework for the analysis of cospeciation. Evolution 54: 352--364.

Lewis, P. O. 2001. Phylogenetic systematics turns over a new leaf. Trends Ecol. Evol. 16:30--37.

 

 

BINOMIAL CONFIDENCE INTERVAL

Zander, R. H. 2001. A conditional probability of reconstruction measure for internal cladogram branches. Syst. Biol. 50:425--437.

Zander, R. H. 2003. Reliable phylogenetic resolution of morphological data can be better than that of molecular data. Taxon 52: 109--112.

 

 

BOOTSTRAP CORRECTION FORMULAE

Efron, B., E. Halloran, and S. Holmes. 1996. Bootstrap confidence intervals for phylogenetic trees. Proc. Natl. Acad. Sci. USA 93:7085--7090.

Rodrigo, A. G. 1993. Calibrating the bootstrap test of monophyly. Int. J. Parasitol. 23:507--514.

Salamin, N., T. R. Hodkinson, and V. Savolainen. 2002. Building supertrees: an empirical assessment using the grass family (Poaceae). Syst. Biol. 51:136--150.

Sanderson, M. J., and M. F. Wojciechowski. 2000. Improved bootstrap confidence limits in large-scale phylogenies, with an example from neo-Astragalus (Leguminosae). Syst. Biol. 49:671--685.

Zharkikh, A., and W.-H. Li. 1995. Estimation of confidence in phylogeny: Complete-and-partial bootstrap technique. Mol. Phylogen. Evol. 4:44--63.

 

 

DECAY INDEX

Bremer, K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42:795--803.

Bremer, K. 1994. Branch support and tree stability. Cladistics 10:295--304.

DeBry, R. W. 2001. Improving interpretation of the decay index for DNA sequence data. Syst. Biol. 50:742--752.

Goloboff, P. A., and J. S. Farris. 2001. Methods for quick consensus estimation. Cladistics 17: 526--534.

Morgan, D. R. 1997. Decay analysis of large sets of phylogenetic data. Taxon 46:509--517.

Oxelman, B., M. Backlund, and B. Bremer. 1999. Relationships of the Buddlejaceae s. l. investigated using parsimony, jackknife and branch support analysis of chloroplast ndhF and rbcL sequence data. Syst. Bot. 24: 164--182.

Rice, K. A., M. J. Donoghue, and R. G. Olmstead. 1997. Analyzing large data sets: rbcL 500 revisited. Syst. Biol. 46: 554--563.

Lee, M. S. Y. 2000. Tree robustness and clade significance. Syst. Biol. 49: 829--836.

 

 

EVALUATING CONTRARY SUPPORT

Wilkinson, M., F.-J. Lapointe, and D. J. Gower. 2003. Branch lengths and support. Syst. Biol. 52:127--130.

 

 

OVER AND UNDER CREDIBILITY OF BPP

Wilcox, T. P., D. J. Zwickl, T. Heath, and D. M. Hillis. 2002. Phylogenetic relationship of the dwarf boas and a comparison of Bayesian and bootstrap measures of phylogenetic support. Molecular Phylogenetics and Evolution 25:361--371.

Suzuki, Y., G. V. Glazko, and M. Nei. 2002. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl. Acad. Sci. USA 99: 16138--16143.

 

 

PROBLEMS WITH BOOTSTRAP

Hillis, D. M., and J. J. Bull. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182--192.

Douady, C. J., F. Delsuc, Y. Boucher, W. F. Doolittle, and E. J. Douzery. 2003. Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol. Biol. Evol. 20: 248--254.

Sanderson, M. J. 1989. Confidence limits on phylogenies: the bootstrap revisited. Cladistics 5:113--129.

Sanderson, M. J. 1995. Objections to bootstrapping phylogenies: a critique. Syst. Biol. 44:299--320.

 

 

SOFTWARE

Hammer, Ø., and D. A. T. Harper. 2003. PAST v. 1.12. http://folk.uio.no/hammer/past

Huelsenbeck, J. P., and F. Ronquist. 2001. MrBayes: v30B4. Bayesian Analysis of Phylogeny. University of California, San Diego, and Dept. of Systematic Zoology, Uppsala University.

Lowry, R. 2000. VassarStats: Web site for statistical computation. Department of Psychology, Vassar College, Poughkeepsie, New York. http://faculty.vassar.edu/~lowry/VassarStats.html, Jan. 25, 2000.

Swofford, D. L. 1998. PAUP*. Phylogenetic Analysis Using Parsimony (* and Other Methods). Ver. 4. Sinauer Associates, Sunderland, Massachusetts.