Decision Theory, Relative Plausibility and the Criminal Standard of Proof

  • Original Paper
Criminal Law and Philosophy

Our approach does not eliminate the need for you to make judgments and to express preferences; anyone who so claims is a charlatan.

(Irving H. LaValle, Fundamentals of Decision Analysis, 1978, at 13, italics as in original).

Abstract

The evolution of the understanding of evidence-based proof and decision processes in the law, especially criminal law, and of standards of proof in this area, has a long-standing and controversial history. Competing accounts have led legal scholars to engage in critical and thoughtful exchanges. Some of the divergent views reflect different methodological perspectives similarly recognized in other fields, such as applied psychology and economics, and the broader interdisciplinary research fields of judgment and decision-making, system analysis and decision science. One such methodological perspective asserts that accounts of juridical proof should provide a description and explanation of how the legal system actually works as a whole. Other, more mathematical and analytical accounts concentrate on how, ideally, legal decisions under uncertainty ought to be made in order to be considered sensible. This paper focuses on the relative plausibility (RP) account advocated by Professors Allen and Pardo as an example of the former perspective. Its logical structure and argumentative implications are analysed using elements of decision theory, which is the prime representative of the latter, more mathematical approach to legal proof. Using formal diagrammatic schemes to depict the structural relationships between the core elements of the two accounts, it is demonstrated in what sense they can be considered logically related and congruent. The demonstration shows that the principal disagreements among the proponents of the two examined theories derive from differences in (1) the criteria used to judge the adequacy of competing accounts of legal decision-making, and (2) the level of formalization of the bases of decisions in each candidate account. This structural analysis supports the view that adherence to one or the other of the examined perspectives does not imply a contradiction, but reflects the coverage of different aspects of the same overall decision architecture. Using decision-theoretic notions, our analyses also provide a way to explain RP decisions through an explicit criterion, thus providing a reply to the recurrent critique that RP theory lacks specific means to justify its decisional framework.


Notes

  1. E.g., Ronald J. Allen & Michael S. Pardo, Relative Plausibility and its Critics, 23 The International Journal of Evidence & Proof 5–59 (2018).

  2. Ronald J. Allen & Michael S. Pardo, Clarifying Relative Plausibility: A Rejoinder, 23 The International Journal of Evidence & Proof 205–217 (2019) (“relative plausibility is not only about jury decision-making and the decision rules at trial. Its scope is much broader. It is about the entire process of proof” at 207), Allen & Pardo, supra note 1 (“What is being observed is the entire litigation process. (…) what is the best explanation of the data, where “the data” are observations of how the American legal system structures proof at trial” at 7).

  3. Michael S. Pardo & Ronald J. Allen, Juridical Proof and the Best Explanation, 27 Law and Philosophy 223–268 (2008) (“Understanding the standards in terms of competing explanations more accurately describes what occurs at trial, [and] is consistent with the [sic] our best understanding of the reasoning processes of jurors” at 261), Allen & Pardo, supra note 2 (“The primary message of relative plausibility is that from the beginning to end the legal system pushes the parties to provide competing explanations, and these explanations structure the decision that is subsequently made (even if the decision is based on an explanation not advanced by the parties).” at 208).

  4. The critical reader might immediately invoke the question of how one is to understand ‘better’. This question will be discussed later in § 2.1.

  5. Allen & Pardo, supra note 1, at 16.

  6. Id. (“any plausible explanation consistent with innocence (one that has not been disproven or eliminated) is sufficient to raise a reasonable doubt” at 27).

  7. Id., at 29.

  8. Dale A. Nance, The Limitations of Relative Plausibility Theory, 23 The International Journal of Evidence & Proof 154–160 (2019), at 157.

  9. Id., emphasis as in original.

  10. Id.

  11. Allen & Pardo, supra note 2, at 215. A number of evidence scholars have already questioned the intelligibility of an all-or-nothing choice between atomistic and holistic approaches. For more discussion see e.g. William Twining, Hot Air in the Redwoods, A Sequel to The Wind in the Willows, 86 Michigan Law Review 1523–1547 (1988), at 1543.

  12. Allen & Pardo, supra note 1, at 6. Similarly, at 34, the same authors note that “the fact-finder is essentially asked to decide which [source of causation] is more plausible”.

  13. Allen & Pardo, supra note 2, at 210.

  14. The first formal decision-theoretic account of legal proof is widely attributed to John Kaplan, Decision Theory and the Factfinding Process, 20 Stanford Law Review 1065–1092 (1968).

  15. For a decision-theoretic analysis of relative plausibility decisions in civil cases, see Alex Biedermann & Joëlle Vuille, The Decisional Nature of Probability and Plausibility Assessments in Juridical Evidence and Proof, 16 International Commentary on Evidence 1–30 (2018).

  16. Supra note 14.

  17. E.g., In re Winship, 397 U.S. 358 (1970), at 370 (Harlan, J., concurring); United States v. Parr, 516 F.2d 458 (5th Cir. 1975), at 464; California ex rel. Cooper v. Mitchell Brothers’ Santa Ana Theater, 454 U.S. 90 (1981), at 93.

  18. E.g., Richard S. Bell, Decision Theory and Due Process: A Critique of the Supreme Court’s Lawmaking for Burdens of Proof, 78 The Journal of Criminal Law & Criminology 557–585 (1987); Bernard Grofman, Mathematical Models of Juror and Jury Decision-Making: The State of the Art, in: Bruce D. Sales (Ed.), The Trial Process 305–351 (1981); David H. Kaye, Clarifying the Burden of Persuasion: What Bayesian Decision Rules Do and Do Not Do, 3 The International Journal of Evidence & Proof 1–28 (1999); Richard O. Lempert, Modeling Relevance, 75 Michigan Law Review 1021–1057 (1977).

  19. Bell (1987) supra note 18, at 558.

  20. E.g., Dale A. Nance, The Burdens of Proof: Discriminatory Power, Weight of Evidence, and Tenacity of Belief, 2016, at 23; Richard D. Friedman, The Elements of Evidence (4th Ed.), 2017, at 592 (presenting a more general formula).

  21. E.g., Ronald J. Allen, The Error of Expected Loss Minimization, 2 Law, Probability & Risk 1–7 (2003); Allen & Pardo, supra note 1.

  22. Kaye (1999), supra note 18, at 27.

  23. Note that for shortness of notation, we do not include the conditioning on background information I and the entirety of the evidence E available to the decision-maker at the time when the decision needs to be made. A more complete notation for the probability of a proposition H would be Pr(H|I,E) where ‘|’ denotes ‘conditioned on’.

  24. Decision-theoretic analyses may be extended to more than two decisions. Also, there may be more than two propositions.

  25. Technically, though, fact-finders do not ‘find’ for the defendant in case of an acquittal in criminal adjudication, for the defendant is presumed innocent. The defendant’s default status is merely preserved.

  26. In this context, ‘better’ means having the smaller expected loss.

  27. Equation (2) can also be found, in similar form, in statistical literature, e.g., James O. Berger, Statistical Decision Theory and Bayesian Analysis (2nd Ed.) 1985, at 164; José M. Bernardo & Adrian F. M. Smith, Bayesian Theory (2nd Ed.) 2000, at 391; Giovanni Parmigiani, Modeling in Medical Decision Making, A Bayesian Approach, 2002, at 87; Giovanni Parmigiani & Lurdes Inoue, Decision Theory: Principles and Approaches, 2009, at 139.

  28. See, e.g., Kaye (1999), supra note 18, at 1, for a statement of the same result, but using utilities instead of losses.

  29. See Terry Connolly, Decision Theory, Reasonable Doubt, and the Utility of Erroneous Acquittals, 11 Law and Human Behavior 101–112 (1987) for a detailed empirical investigation and discussion of value assignments for decision consequences, including unusual hypothetical situations such as the assignment of a higher utility to an erroneous acquittal than to a correct acquittal.

  30. Bell (1987), supra note 18, at 561.

  31. This terminology is based on considering the defense’s case as the null hypothesis and defining a type I error as the false rejection of a null hypothesis (e.g., Grofman (1981), supra note 18, at 308, and Edward K. Cheng, Reconceptualizing the Burden of Proof, 122 The Yale Law Journal 1254–1279 (2013), at 1260).

  32. Paul Roberts & Adrian Zuckerman, Criminal Evidence, 2nd. Ed., 2010, at 226.

  33. For a detailed analysis and discussion see Michael L. DeKay, The Difference Between Blackstone-Like Error Ratios and Probabilistic Standards of Proof, 21 Law & Social Inquiry 95–132 (1996), and previously Michael O. Finkelstein, Quantitative Methods in Law: Studies in the Application of Mathematical Probability and Statistics to Legal Problems 65–78 (1978).

  34. As noted, e.g., by Ronald A. Howard (1 Bulletin of the American Mathematical Society 784–787 (1979), at 786) in his review of LaValle (Fundamentals of Decision Analysis, 1978), and in Ronald A. Howard, Decision Analysis in Systems Engineering, in: The Principles and Applications of Decision Analysis, Vol. 1: General Collection, Ronald A. Howard & James E. Matheson (Eds.), 59–93 (1983): “We all want good outcomes. (…) Everyone wants a good rather than a bad (…) – the question is how do we get there. The only thing you can control is the decision and how you go about making that decision. That is the key” (at 92–93).

  35. See also Eq. (1).

  36. For example, Kaye (1999), supra note 18, insists that the distributions of decision-makers’ probabilities across the two types of cases, meritorious and non-meritorious, are “fantasies” (at 24).

  37. As noted above, this is an unknowable factor.

  38. Ronald A. Howard, From Influence to Relevance to Knowledge, in: Robert M. Oliver & James Q. Smith (Eds.), Influence Diagrams, Belief Nets and Decision Analysis, 3–23 (1990); Jim Q. Smith, Decision Analysis: A Bayesian Approach, 1988, at 64.

  39. E.g., John Aitchison, Choice Against Chance: An Introduction to Statistical Decision Theory, 1970, at 202–213; Rex Brown, Rational Choice and Judgment: Decision Analysis for the Decider, 2005, at 62–64; Simon French, Readings in Decision Analysis, 1989, at 27–29; Simon French, John Maule & Nadia Papamichail, Decision Behaviour, Analysis and Support, 2009, at 13–21; C. Jackson Grayson, Decisions Under Uncertainty: Drilling Decisions by Oil and Gas Operators, 1960, at 323–336 (using the term “Information flow diagram”); Dennis V. Lindley, Making Decisions, 1971, at 140–163; James E. Matheson & Ronald A. Howard, An Introduction to Decision Analysis, in: Howard & Matheson (1983), supra note 34, 17–55, at 47–51; George E. Monahan, Management Decision Making: Spreadsheet Modeling, Analysis, and Application, 2000, at 451–525; Parmigiani & Inoue (2009), supra note 27, at 126–131; Howard Raiffa, Decision Analysis, Introductory Lectures on Choices under Uncertainty, 1968; Howard Raiffa, Decision Analysis: A Personal Account of How it Got Started and Evolved, 50 Operations Research (2000), at 179–185; Jim Q. Smith (1988), supra note 38, at 10–22; Michael D. Resnik, Choices: An Introduction to Decision Theory, 1990, at 17–19; Robert Schlaifer, Analysis of Decisions Under Uncertainty, 1969 (using the term “decision diagram”, at 37–38); Howard Thomas, Decision Theory and the Manager, 1972, at 43–75; Stephen R. Watson & Dennis M. Buede, Decision Synthesis: The Principles and Practice of Decision Analysis, 1987, at 36; Robert L. Winkler, An Introduction to Bayesian Inference and Decision, 1972, at 219–295; Detlof von Winterfeldt & Ward Edwards, Decision Analysis and Behavioral Research, 1986, at 63–89.

  40. E.g., Paul Brest & Linda Hamilton Krieger, Problem Solving, Decision Making, and Professional Judgment, A Guide for Lawyers and Policymakers, 2010, at 462–473; David P. Hoffer, Decision Analysis as a Mediator’s Tool, 1 Harvard Negotiation Law Review 113–137 (1996); Howell E. Jackson, Louis Kaplow, Steven M. Shavell, W. Kip Viscusi, David Cope, Analytical Methods for Lawyers, 2003, at 1–33; Jeffrey M. Senger, Decision Analysis: Decision Analysis in Negotiation, 87 Marquette Law Review 723–735 (2004); Marc B. Victor, The Proper Use of Decision Analysis to Assist Litigation Strategy, 40 The Business Lawyer 617–929 (1985).

  41. Trees similar in general structure are given in Larry Laudan & Harry D. Saunders, Re-Thinking the Criminal Standard of Proof: Seeking Consensus About the Utilities of Trial Outcomes, 7 International Commentary on Evidence 1–34 (2009), at 5, and Reid Hastie & Robyn M. Dawes, Rational Choice in an Uncertain World, 2001, at 35.

  42. More generally, note that there can be more than two branches, depending on the number of available decision options.

  43. “The only thing you can control is the decision (…)” (Howard (1983), supra note 34, at 93).

  44. Lindley (1971), supra note 39, at 148.

  45. French, Maule & Papamichail (2009), supra note 39, at 21.

  46. Ronald A. Howard, James E. Matheson, Miley W. (Lee) Merkhofer, Allen C. Miller & D. Warner North, Comment on Influence Diagram Retrospective, 3 Decision Analysis 117–119 (2005).

  47. E.g., Ronald A. Howard & James E. Matheson, Influence Diagrams, in: The Principles and Applications of Decision Analysis, Vol. 2: Professional Collection, Ronald A. Howard & James E. Matheson (Eds.), 1983, at 719–762; Robert M. Oliver & James Q. Smith (Eds.), Influence Diagrams, Belief Nets and Decision Analysis, 1990 (Proceedings of the Conference entitled ‘Influence Diagrams for Decision Analysis, Influence and Prediction’, Engineering Systems Research Center, University of California at Berkeley, 1988); Ross D. Shachter, Evaluating Influence Diagrams, 34 Operations Research (1986), at 871–882.

  48. E.g., Robert G. Cowell, A. Philip Dawid, Steffen L. Lauritzen & David J. Spiegelhalter, Probabilistic Networks and Expert Systems: Exact Computational Methods for Bayesian Networks, 1999, at 155–188; Uffe B. Kjærulff & Anders B. Madsen, Bayesian networks and Influence Diagrams, A Guide to Construction and Analysis, 2008, at 74–91; Finn V. Jensen & Thomas D. Nielsen, Bayesian Networks and Decision Graphs, Second Edition, 2007, at 279–428; Richard E. Neapolitan, Learning Bayesian Networks, 2004, at 252–265.

  49. E.g., Robert T. Clemen & Terence Reilly, Making Hard Decisions with Decision Tools, 2001, 52–69; Monahan (2000), supra note 39, at 11–13; Kevin B. Korb & Ann E. Nicholson, Bayesian Artificial Intelligence, Second Edition, 2011; Kevin Murphy, Machine Learning: A Probabilistic Perspective, 2012, at 330–334; Stuart Russell & Peter Norvig, Artificial Intelligence: A Modern Approach, Third Edition, 2010, 626–636 (using the term “decision network”); Franco Taroni, Alex Biedermann, Silvia Bozza, Paolo Garbolino & Colin Aitken, Bayesian Networks for Probabilistic Inference and Decision Analysis in Forensic Science, 2014.

  50. Different scales may be used and, in some applications, it may be suitable to quantify costs, gains and rewards etc. directly in monetary terms.

  51. This may be different, for example, in a medical context where the probability of a patient’s future health status may be affected by the decisions made regarding medical treatment.

  52. Stated otherwise, in decision theory, it is assumed that the decision-maker can express probabilities and utilities (losses), for which the theory then provides instructions for their coherent combination.

  53. Throughout what follows, we mean RP as advocated by Professors Allen and Pardo (supra note 1).

  54. This assumption has been justified in §2.2.

  55. We acknowledge that in a ‘typical’ criminal case, the prosecution will have (almost by definition) a plausible account, at least from their own point of view, because otherwise there would not be a trial. This is, of course, not a conceptual problem, but a doctrinal and practical one. Jurisdictions have safety measures in place to prevent trials when the evidence against the defendant is weak or implausible, in order to raise efficiency (and save public money). Here, we include d1 as a decision in order to ensure generality of the argument.

  56. Unlike RP theory, conventional explanations of the criminal process would consider sufficiency as relative to the criminal standard of proof in law.

  57. Allen & Pardo, supra note 1, at 27.

  58. Id.

  59. The reader may choose other values, but should keep in mind that the general idea is to suppose that the decision-maker holds stronger beliefs in Hd than in Hp being true, so that Pr(Hd)> Pr(Hp), and prefers accurate over erroneous decision consequences, thus assigning loss values accordingly.

  60. In fact, according to Eq. (1), EL(dp) = L(dpp)Pr(Hp) + L(dpd)Pr(Hd) = 0·0.3 + 1·0.7 = 0.7, and EL(dd) = L(ddp)Pr(Hp) + L(ddd)Pr(Hd) = 0.1·0.3 + 0·0.7 = 0.03.

  61. The analysis for other RP decisions is developed later in this section.

  62. As further explained below, the dp-branch is not sketched because it is not an admissible decision under d1.

Author information

Corresponding author

Correspondence to Alex Biedermann.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors thank Ronald Allen of Northwestern University Pritzker School of Law and Paolo Garbolino of Iuav Università di Venezia for their helpful comments and suggestions. Alex Biedermann gratefully acknowledges the support of the Swiss National Science Foundation through grant No. BSSGI0_155809.

Appendix: Further Properties of the Decision-Theoretic Account of Relative Plausibility

Consider again the example presented in §3.2.2. Fig. 6 shows the expected losses for the RP decisions d1, d2 and d3 not only for the current case where the probabilities for the prosecution’s and defence’s accounts are, respectively, 0.3 and 0.7, but for the full range of values between 0 and 1. As may be seen, RP decisions d1 (P has no plausible account, PA) and d2 (P has a PA, and D has a PA) are optimal, i.e. have the minimum expected loss, for a broad range of probabilities, including probabilities greater than 0.5 (i.e., when the probability of the prosecution’s account is greater than that of the defence’s account).

Fig. 6 Expected losses (on y-axis) of RP decisions d1 (P has no PA), d2 (P has PA, and D has PA) and d3 (P has PA, and D has no PA) as a function of the probability of the prosecution’s case (x-axis), Pr(Hp), for the example discussed in the text. The dotted vertical line at Pr(Hp) = 0.3 indicates the expected losses for decisions d1 and d2, i.e. 0.03, and decision d3, i.e. 0.7. Note that these values correspond to the values illustrated in the decision tree shown in Fig. 5. The dotted vertical line at Pr(Hp) = 0.91 indicates the transition point where, for probabilities greater than this value, decision d3 (P has PA, and D has no PA) has a smaller expected loss, and hence is better than the alternative decisions d1 and d2. The bold line indicates, for every probability Pr(Hp) between 0 and 1, the decision(s) with the minimum expected loss.
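The curves described in Fig. 6 can be approximated with a short numerical sketch. It rests on assumptions taken from the worked example rather than on any code by the authors: correct decisions carry zero loss, a wrongful conviction carries loss 1, an erroneous acquittal carries loss 0.1, Hp and Hd are treated as exhaustive so that Pr(Hd) = 1 − Pr(Hp), and d1 and d2 share the expected loss of an acquittal while d3 shares that of a conviction. The function names are hypothetical.

```python
# Illustrative loss values from the worked example (correct decisions cost 0).
LOSS_WRONGFUL_CONVICTION = 1.0   # L(Cpd): convict although Hd is true
LOSS_ERRONEOUS_ACQUITTAL = 0.1   # L(Cdp): acquit although Hp is true


def expected_losses(pr_hp):
    """Return (EL of convicting, EL of acquitting) for a given Pr(Hp)."""
    pr_hd = 1.0 - pr_hp  # assumption: Hp and Hd are exhaustive
    el_convict = LOSS_WRONGFUL_CONVICTION * pr_hd
    el_acquit = LOSS_ERRONEOUS_ACQUITTAL * pr_hp
    return el_convict, el_acquit


def optimal_rp_decisions(pr_hp):
    """Return the RP decision(s) with the minimum expected loss."""
    el_convict, el_acquit = expected_losses(pr_hp)
    if el_convict < el_acquit:
        return ["d3"]        # P has a PA, D has none: conviction
    return ["d1", "d2"]      # acquittal; ties go to acquittal (assumption)


# Sweep a few probabilities, mirroring the x-axis of Fig. 6.
for p in (0.1, 0.3, 0.5, 0.8, 0.9, 0.95):
    el_c, el_a = expected_losses(p)
    print(f"Pr(Hp)={p:.2f}  EL(d3)={el_c:.3f}  EL(d1,d2)={el_a:.3f}  "
          f"optimal={optimal_rp_decisions(p)}")
```

Under these assumptions the sweep reproduces the two values marked in the figure at Pr(Hp) = 0.3 and shows the optimal decision switching to d3 only beyond the transition point.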

There is a transition point, however, toward the right-hand side of Fig. 6. For probabilities greater than this transition point, the expected loss of decision d3 (P has PA, and D has no PA) is smaller than that of the alternative decisions d1 and d2. This threshold probability corresponds to the minimum probability necessary in the classic decision-theoretic account to ensure that the decision dp (finding for the prosecution; conviction) has a smaller expected loss than the decision dd (acquittal), given a particular loss ratio as specified by Equation (2). Recall that in the case here, an erroneous finding for the prosecution (Cpd) is considered ten times worse than an erroneous acquittal (Cdp). Thus, following Equation (2), the odds defining the transition point are 10:1, corresponding to Pr(Hp) = 10/11 ≈ 0.91.
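The step from a 10:1 loss ratio to a probability of about 0.91 can be spelled out as follows. Equation (2) is not reproduced in this appendix, so the derivation below is a sketch that assumes zero loss for correct decisions, Pr(Hd) = 1 − Pr(Hp), and the flattened symbols of the text written with subscripts; the symbol x for the loss ratio follows its use later in the appendix.

```latex
\begin{align*}
EL(d_p) &= L(C_{pd})\,\Pr(H_d), \qquad EL(d_d) = L(C_{dp})\,\Pr(H_p),\\[4pt]
EL(d_p) < EL(d_d)
  &\iff \frac{\Pr(H_p)}{\Pr(H_d)} > \frac{L(C_{pd})}{L(C_{dp})} = x,\\[4pt]
  &\iff \Pr(H_p) > \frac{x}{1+x}, \qquad
  \text{so for } x = 10:\ \Pr(H_p) > \frac{10}{11} \approx 0.91 .
\end{align*}
```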

It is important to keep this result in mind because when looking at Fig. 6, the skeptical reader may ask how it can be possible that for a probability of the prosecution’s account, Pr(Hp), as high as 0.8, or even 0.9, the decisions d1 (P has no PA) and d2 (P has PA, and D has a PA) have a lower expected loss, and hence are preferable to decision d3 (P has PA, and D has no PA). As noted above, the explanation for this observation stems from the chosen loss function, in particular the ratio of the loss associated with an erroneous finding for the prosecution (Cpd) to the loss associated with an erroneous acquittal (Cdp). In the case here we have chosen, for the sole purpose of illustration, a ratio of 10:1. Hence, a finding for the prosecution is not warranted, in decision-theoretic terms, for situations in which the probability for the prosecution’s case, Pr(Hp), is smaller than 0.91. Case examples with loss ratios such that values of Pr(Hp) as high as 0.8 or 0.9 would be sufficient for the RP decision d3 (P has PA, and D has no PA) to be optimal in a decision-theoretic sense can be found in Table 1. More generally, note that we do not suggest, at this point, that a full numerical quantification be imposed on practical RP decisions. The sole point we seek to make is that it is possible to give a formal (mathematical) justification for the intuition that the higher the stakes involved (i.e., the more one of the two ways of deciding erroneously is considered worse than the other), the lower should be one’s quantum of doubt.

As a last example, consider a case in which the prosecution’s account Hp is considerably more probable than the account presented by the defence, Hd. Specifically, let Pr(Hp) be 0.95. Table 2 summarises a few examples of cases with different loss ratios (columns 1 to 3). The optimal RP decision and related verdict for each example are given in columns 6 and 7. Note, in particular, that for loss ratios such as 100:1, or more, a current belief of 0.95 is not sufficient to warrant the RP decision d3 (P has a PA, and D has no PA).

Table 2 Extension of Table 1 with optimal RP decision(s) (in column 6) and associated verdict (in column 7) for a hypothetical case in which the decision-maker’s probability for the prosecution’s account, Hp, is 0.95
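The logic behind Table 2 can be sketched in a few lines. The loss ratios below are illustrative choices, not the rows of the published table, and the helper names are hypothetical. The sketch assumes zero loss for correct decisions and exhaustive accounts, in which case a conviction has the smaller expected loss exactly when Pr(Hp) exceeds x / (1 + x) for a loss ratio of x:1.

```python
def conviction_threshold(loss_ratio):
    """Minimum Pr(Hp) above which convicting has the smaller expected loss.

    loss_ratio is x in "x:1", i.e. how many times worse a wrongful
    conviction is considered than an erroneous acquittal.
    """
    return loss_ratio / (1.0 + loss_ratio)


def optimal_verdict(pr_hp, loss_ratio):
    """Map a probability and a loss ratio to the optimal RP decision."""
    if pr_hp > conviction_threshold(loss_ratio):
        return "d3 (convict)"
    return "d1 or d2 (acquit)"


PR_HP = 0.95  # the decision-maker's probability for the prosecution's account

# Illustrative loss ratios only; the published Table 2 may use other values.
for x in (1, 10, 100, 1000):
    print(f"ratio {x}:1  threshold={conviction_threshold(x):.3f}  "
          f"optimal={optimal_verdict(PR_HP, x)}")
```

With a ratio of 10:1 the threshold lies below 0.95, so d3 is optimal; with 100:1 it lies above 0.95, which matches the observation that a belief of 0.95 is then insufficient.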

As an aside, note that throughout this paper, all examples considered a loss ratio x ≥ 1, which means that the loss incurred by an erroneous decision dp (i.e., wrongful conviction) is at least as great as the loss incurred by an erroneous decision dd (i.e., erroneous acquittal). Consideration of opposite cases seems unnecessary given current understandings of the values upheld in contemporary legal orders.

We also emphasise the important conclusion that there is no single and absolute probability to warrant a particular RP decision d3 (P has a PA, and D has no PA) and hence a finding for the prosecution (decision dp, conviction). In our decision-theoretic account of RP, the minimum probability necessary to warrant a conclusion that the prosecution has a plausible account (and that the defence does not have a plausible account) is found by weighing the odds of the two competing accounts against the losses that may be incurred by the two ways in which a verdict may turn out erroneously. This offers an answer to the recurrent critique of the RP theory that it does not provide an explicit criterion for decision-makers to determine when exactly the prosecution’s account has reached the required level of proof and, in addition, the defence’s account is insufficient to raise a reasonable doubt.


Cite this article

Biedermann, A., Caruso, D. & Kotsoglou, K.N. Decision Theory, Relative Plausibility and the Criminal Standard of Proof. Criminal Law and Philosophy 15, 131–157 (2021). https://doi.org/10.1007/s11572-020-09527-8
