Judicial Gatekeeping in Massachusetts

by John D. Daley and Thomas F. Allen, Jr. - Harvard Law School '99

     Expert witnesses have become fixtures in today's courts. From fiber comparisons to economic projections and psychiatric evaluations, the range of offered expertise covers the span of human knowledge. Hardly a case of any consequence goes to trial without expert witnesses of some kind.

     What is the trial judge's role in overseeing the testimony of expert witnesses? Unlike lay witnesses, whose testimony a jury can evaluate with common sense and experience, expert witnesses offer conclusions based on practices and knowledge beyond the ken of the average juror. As a consequence, testimony by unpoliced expert witnesses can have a potentially prejudicial effect on jurors, who may be inclined to believe the experts based solely on their status as such. How is the trial judge to know whether the expert is merely speculating, or whether the evidence on which the expert bases his or her testimony is sufficient to support the conclusion? Judges certainly prevent lay witnesses from speculating, and are expected to exclude a witness offering wholly speculative testimony. Why should the situation differ when the witness purports to be an "expert"?

     If one accepts the proposition that the trial judge has a duty to exclude unreliable experts, a host of concerns inevitably follows. How is a trial judge to assess the scientific or technical adequacy of expert testimony if even a cursory understanding of the issues requires specialized training? Can anyone without such training fully understand the issues and come to a rational conclusion as to their validity? To what degree can the trial judge rely on the expert's own assertions about his or her qualifications? All of these concerns carry heavy weight, for in most cases the trial judge is hardly a more qualified assessor of scientific credibility than the jury itself.

     For more than half a century, American courts relied on the scientific community for assistance in their gatekeeping endeavor. In Frye v. United States, the Court of Appeals for the District of Columbia held that in order to be admissible, the basis for expert testimony must be "sufficiently established to have gained general acceptance in the particular field in which it belongs."(1) This became the standard in Federal and most state courts for seventy years. The reasoning was simple: science was to be left to the scientists. The Frye test ensured that "those most qualified to assess the general validity of a scientific method will have the determinative voice."(2)

     Yet the "general acceptance" test often seemed to be in tension with changes in federal trial procedures. The Federal Rules of Evidence, enacted in 1975, adopted a "liberal" approach for Federal courtrooms. Rule 402 states that "All relevant evidence is admissible," and when that rule was read together with Rule 702, many began to question whether the Frye test was in tune with the prevailing law.(3) The major criticism of Frye was that novel, yet otherwise completely reliable, evidence was often excluded because scientists had not yet "accepted" its validity. Aggrieved parties were forced to lie in wait as the scientific community meticulously examined the details of an otherwise "logically reliable" method.(4)

     The Supreme Court sought to remedy this problem in 1993. In Daubert v. Merrell Dow Pharmaceuticals, the Court held that the Federal Rules of Evidence superseded the Frye test.(5) The Court held that the essential concerns of the Rules, and of Rule 702 in particular, were reliability and relevancy. While Frye was designed to screen out unreliable testimony, its test was not determinative and did not preclude further analysis. The Court held that when faced with an expert witness, the trial judge has a responsibility to determine "whether the expert is proposing to testify to (1) scientific knowledge that (2) will assist the trier of fact to understand or determine a fact in issue."(6)

     The Court went on to hold that this "gatekeeping" role requires the trial judge to make a "preliminary assessment of whether the reasoning or methodology underlying the testimony is scientifically valid, i.e., whether it is reliable."(7) Furthermore, the evidence must be "relevant to the issue involved."(8) In order for expert testimony to be admissible, both prongs of the test must be met. Finally, the Court cautioned the judge to be sure that the inquiry only attacks the reliability of the underlying methods, and not the correctness of the expert's final conclusion - a determination that is reserved for the jury.(9)

     Suddenly, the Federal judge could no longer rely on the scientific community for a determination of an expert's credibility. Instead, Federal trial judges were forced to decide for themselves whether an expert's methodology -- the often highly complicated basis for the conclusions -- was reliable.

     Although the Daubert case itself deals only with trials conducted under the Federal Rules of Evidence, most states have chosen to adopt its test in one form or another. Massachusetts, for example, adopted a variant of the Daubert test in Commonwealth v. Lanigan.(10) While few state trial judges will ever hear a "mass toxic tort" case such as Daubert, this does not diminish the need to understand the Daubert rulings. Much like the evidence offered, the judicial mechanisms for dealing with a Daubert challenge are still in the frontier stage. Attorneys will develop new means of both offering and challenging expert testimony, and the trial judge must be prepared to deal with them. 

     It is the intent of this paper and those that follow to explore the many questions surrounding the judge's gatekeeping role. As a roadmap for the expansive gatekeeping terrain, the remainder of this introduction outlines the history of judicial gatekeeping in Massachusetts and offers a synopsis of the more detailed discussions to follow.

Gatekeeping in Massachusetts
     In Commonwealth v. Fatalo, the Supreme Judicial Court adopted the "general acceptance" test of Frye.(11) For the most part, Massachusetts courts relied on the test in deciding whether evidence produced by a scientific theory or process was admissible.(12) Over the years, however, the courts did not seem to apply the Frye test with any semblance of regularity. Some types of "scientific" evidence were subject to scrutiny, while others fell outside the scope of Frye's analysis. By 1994, commentators noted that "no principle adequately explains or justifies the Massachusetts approach of subjecting some expert testimony to the rigors of Frye, while granting other testimony a 'free pass' to the jury."(13)

     These observations did not go unheeded, and in 1994 the Supreme Judicial Court adopted the Daubert "reasoning," acknowledging that "[i]n some instances, perhaps without adequately articulated reasons, we simply have decided that Frye principles do not apply in deciding the admissibility of expert testimony apparently based on a scientific theory or process."(14) In an effort to clarify the standards for admissibility, the Court accepted "the basic reasoning of the Daubert opinion" in Commonwealth v. Lanigan ("Lanigan II").(15)

     In Lanigan I, a defendant charged with child sexual offenses filed motions in limine to prevent the Commonwealth from introducing evidence based on deoxyribonucleic acid (DNA) tests at trial.(16) The Supreme Judicial Court upheld the exclusion of the evidence, ruling that the process used to estimate the frequency with which the defendant's "DNA profile" would have occurred in the population (known as the "product rule") had not been "generally accepted in the field of population genetics."(17)

     On remand, the Commonwealth offered evidence based on a different process for determining the likelihood of a DNA match. This test was ruled to be admissible by the trial judge, and the defendant was convicted on all counts. Questioning the decision to admit the DNA evidence, the defendant sought direct review in the Supreme Judicial Court.(18)

Lanigan II and Daubert
     DNA testing isolates "alleles," or specific gene sites, in blood or semen samples in order to determine whether the defendant's "pattern" matches that of the incriminating evidence. But the "match" does not definitively link the defendant to the crime. Science is only able to test for a limited number of allele sites, and humans have millions of alleles. Consequently, a match only proves that the set of alleles found in one test was present in both samples. This does not necessarily mean that all of the other allele sequences are identical. By testing for "unique" alleles, and increasing the number of tests (i.e., the number of unique allele sites that are matched), the probability of another person having the same "profile" as the defendant decreases. But the "match" will always be a matter of probability, not certainty. As a consequence, the statistical analysis is the focal point of the trial judge's investigation, and "[e]vidence of a match based on currently used testing processes is meaningless without evidence indicating the significance of the match."(19)

     In Lanigan I, the Commonwealth sought to enter testimony based on a statistical method known as the "product rule." Under this test, the population frequencies of the individual alleles disclosed in the DNA test are multiplied together to produce the frequency of the combination of all alleles found.(20) This rule is based on the assumption that each person's alleles constitute statistically independent random selections from a random gene pool, and that certain subpopulations (particular racial groups, for example) do not have their own set of distinct allele frequencies or "subpools" of genes.(21)
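
     The arithmetic behind the product rule can be sketched in a few lines of Python. This is only an illustration of the multiplication described above: the function name and the four allele frequencies are hypothetical and are not drawn from the Lanigan record, and the sketch simply assumes the statistical independence on which the rule depends.

```python
from math import prod

def product_rule(allele_frequencies):
    """Multiply per-allele population frequencies.

    Assumes each allele is a statistically independent draw from one
    undivided gene pool, which is the assumption the rule rests on.
    """
    return prod(allele_frequencies)

# Hypothetical frequencies for four tested alleles (illustrative only).
frequencies = [0.10, 0.05, 0.20, 0.02]
profile_frequency = product_rule(frequencies)

print(f"Estimated profile frequency: {profile_frequency:.6f}")
print(f"Roughly 1 in {round(1 / profile_frequency):,}")
```

     Because every additional allele multiplies in another fraction, the estimate shrinks quickly as more sites are tested, which is why an error in any single assumed frequency can distort the final figure considerably.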

     That assumption, according to the Court, was not "generally accepted" in the relevant scientific community.(22) There was strong evidence among population geneticists that certain racial groups exhibited unique alleles, so that while a certain gene might appear in only 1 of every 100,000 people in the general population, its incidence among members of a particular racial group could be much higher (e.g., 1 in 15,000). Thus, the estimated probability of a "match" calculated under the product rule was suspect. The method was disfavored among scientists because it ignored the possibility of "subpopulations" in gene pools.(23)

     In Lanigan II, the Commonwealth relied on a different statistical method, one that was recommended by the National Academy of Sciences.(24) This test allegedly accounted for the possibility of "subpopulations" in the calculation of the statistical significance of a DNA "match." The "ceiling principle" utilized the frequencies calculated for the subpopulation with the highest observed frequency for each allele.(25) The test was designed to produce aggregate frequencies that are as high as possible (giving the most conservative estimate for the likelihood of a false "match").(26) The Academy of Sciences recommended the ceiling principle because "any error in calculating the profile frequencies that is caused by population substructure should accrue to the benefit of the individual against whom the DNA testing is being used."(27)
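
     A rough sketch of the ceiling calculation, as it is described above, may help to show why the result favors the defendant. The subpopulation labels and frequencies below are hypothetical, the function name is ours, and the sketch follows only the description in this paper; it omits any further refinements contained in the Academy's actual recommendation.

```python
from math import prod

def ceiling_frequency(frequencies_by_subpopulation):
    """For each allele, take the highest frequency observed in any
    subpopulation, then multiply those ceilings together."""
    # zip(*...) regroups the per-subpopulation lists into per-allele tuples.
    per_allele = zip(*frequencies_by_subpopulation.values())
    return prod(max(freqs) for freqs in per_allele)

# Hypothetical frequencies of the same three alleles in three subpopulations.
subpopulations = {
    "group_a": [0.10, 0.05, 0.20],
    "group_b": [0.12, 0.03, 0.25],
    "group_c": [0.08, 0.07, 0.15],
}

ceiling = ceiling_frequency(subpopulations)
print(f"Ceiling estimate: {ceiling:.6f}")

# Product-rule estimate within each single subpopulation, for comparison.
for name, freqs in subpopulations.items():
    print(f"{name}: {prod(freqs):.6f}")
```

     Because each factor is the largest value observed in any group, the ceiling figure can never be smaller than the product computed within a single subpopulation, so any error from population substructure runs in the defendant's favor.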

     The defendant again argued that the evidence should have been excluded because it lacked the requisite "general acceptance" under the Frye/Fatalo standard. Relying on several scientific opinions and studies criticizing the ceiling principle, he asserted that there was enough opposition among population geneticists to negate a finding of "general acceptance."(28) Rather than pass on the scientific community's "acceptance" of the ceiling principle, the SJC adopted a new standard for assessing the reliability of expert testimony.(29)

     The Court began by noting that the Frye test "has a practical usefulness because if there is general acceptance in the relevant scientific community, the prospects are high, but not certain, that the theory or process is reliable."(30) The Court asserted that the ultimate test, however, had always been the "reliability of the theory or process underlying the expert's testimony."(31) While the "general acceptance" standard usually provided an accurate measure of reliability, the Court recognized the "risk that reliable evidence might be kept from the factfinder" under Frye if the "scientific community has not yet digested and approved the foundation of a theory or process."(32)

     The Court specifically adopted Daubert, and held that a proponent of scientific evidence was allowed to demonstrate the reliability or validity of an underlying theory or process despite a lack of "general acceptance."(33) In adopting the Daubert test, however, the Court noted that "we suspect that general acceptance in the relevant scientific community will continue to be the significant, and often the only, issue."(34) Nevertheless, the Court held that the ceiling principle was sufficiently reliable, despite the dispute over its accuracy among scientists.

After Lanigan II
     The most striking difference between Daubert and Lanigan is the SJC's intention to retain as much of the "general acceptance" test as possible. Unlike in Daubert, where "general acceptance" is one of many factors, the Lanigan Court places heavy reliance on the old Frye test. Under the Lanigan formula, an inquiry into the methodological reliability of an expert's testimony is only necessary if the litigant is unable to convince the trial judge of its "acceptance" among scientists. Presumably, "general acceptance" is sufficient to withstand any challenge to methodological sufficiency. In contrast, the touchstone of Daubert is reliability, with general acceptance relegated to an auxiliary role as one of many factors to be considered by the court.(35)

     The bulk of the post-Lanigan decisions in Massachusetts have been in the criminal setting. Primarily, the debate is over DNA-based testimony, not only with respect to statistical significance, but also with respect to the physical and chemical methods through which the experts "match" their samples. But Lanigan has also been raised in other fields. For example, in Commonwealth v. Sands, the SJC held that evidence of a Horizontal Gaze Nystagmus field sobriety test was improperly admitted in a drunk driving case, because the trial judge failed to conduct an inquiry into the "reliability" of the underlying theory.(36)

     Similarly, the Massachusetts Appeals Court upheld the exclusion of expert testimony based on a clinical diagnosis of a train crash victim. Applying Lanigan, the trial judge noted that the doctor's theory had not yet been generally accepted, and that her "hypothesis had not been, and currently could not be tested."(37)

     Should evidence from "field sobriety tests" be subjected to the rigors of Daubert? Should a medical doctor's clinical diagnosis be withheld from a jury because it is novel and is not yet verifiable to a degree of certainty, or should another doctor's opinion simply attack its credibility? These are just a few of the issues surrounding the Daubert dilemma.

Joiner and the Future of Judicial Gatekeeping
     As Massachusetts courts struggle with Lanigan and Daubert, the U.S. Supreme Court will attempt to clarify the "gatekeeping" role when it hears Joiner v. General Electric next term. While the Joiner case deals only with a standard-of-review question, it seems likely that the Court will use this opportunity to clarify the issues left open in Daubert. A firm grasp of the substantive issues surrounding expert testimony is even more important now that the Supreme Court is set to re-examine its elusive "gatekeeping" decision.

     Will Massachusetts adopt Joiner wholesale, or will the SJC instead retain the Lanigan framework? Will the SJC even accept the Joiner determination as to the proper standard for appellate review of Daubert hearings? Currently, Massachusetts utilizes an abuse of discretion standard for evidentiary rulings. If the Supreme Court formulates a de novo rule, will the SJC follow suit? What about the other holdings of Daubert and its progeny? A trial judge versed in all of the Daubert issues will be well-prepared to deal with the bar's reaction to Joiner and the changes that will surely follow.

     It is with this purpose in mind that we turn to the substantive issues. What remains is a list of the major concerns surrounding Daubert. In the papers that follow, each of these issues is examined in detail, as is its importance to the trial judge in Massachusetts.

1. How has the role of judicial gatekeeping evolved in the United States? How has the balance of power between judge and jury shifted in reaction to trends in American history? Should this have any bearing on how the gatekeeping role is formulated today?

2. What is the influence of summary judgment on the development of judicial gatekeeping? Gatekeeping clearly is rooted in the traditional division of responsibilities between judge and jury and in the distinction between fact and law. One way to view Daubert gatekeeping is as another means by which the judge may assert his or her role as arbiter of questions of law. The gatekeeping role, then, echoes the decisions a judge makes when ruling on a motion for summary judgment.

3. What is the precise issue to be determined in a Daubert hearing? Different courts interpret the meaning and scope of "methodology" in different ways. Does methodology describe an expert's general approach, or does it include all the steps underlying that expert's conclusion? Once reliability is established, how much weight is to be given to the evidence for purposes of determining "fit"? Is a judge permitted to conclude that the testimony is so weak that it lacks relevance in any meaningful sense of the word?

4. Is Daubert a liberalizing or constraining change from Frye? When the Supreme Court first decided the case, some immediately viewed the decision as a liberalizing departure from the "general acceptance" test. At the same time, however, post-Daubert trial courts are now equipped with a host of criteria on which they can potentially exclude "questionable" scientific testimony. Has the "liberalizing decision" been transformed into a more restrictive test?

5. Does Daubert apply to areas other than new science? Is science merely a special case or does Daubert-type gatekeeping also apply to experts in such fields as economics and psychiatry? If so, might the judge's evaluation of social science methodology differ from that of a "hard" science like biology or chemistry?

6. What is the relationship between legal and scientific standards of proof? How much weight should judges lend to statistical measures of scientific proof? More specifically, how important are "significance levels" and "confidence intervals" in determining the reliability of a certain study? Nowhere is the uneasy interface between science and legal evidence more apparent.

7. In toxic tort cases such as Daubert, the plaintiffs all attempted to establish causation through the use of epidemiological studies, which endeavor to measure a statistical relationship between exposure to an agent and the incidence of a harmful condition. Are such studies necessary to prove causation? Are they helpful? Similarly, how should a judge evaluate a physician's testimony based on differential ("clinical") diagnosis?

8. What are the procedural issues surrounding judicial gatekeeping? Practically speaking, how is the judge to accomplish his or her task?

9. Can Daubert hearings be judicially noticed or otherwise be given precedential value? What effect has Daubert had on pre-existing precedents? Should one judge's assessment of certain scientific evidence preclude a different opinion by other judges, or can the issue of adequacy of an expert's testimony be repeatedly litigated? How much should judges worry about conflicting court decisions on the validity of certain types of methodology?

10. What is the standard for appellate review of a Daubert hearing? Should a trial judge's evidentiary determination be subject to the traditional discretionary standard? Or should de novo review be applied so as to create more consistent treatment of scientific evidence? The current split in federal courts surrounding this issue will be decided next term in the Joiner case. What are the policy implications of the Court's options?

11. Should differing standards of stringency apply in determining admissibility depending upon the type of case, evidence, or witness before the court? For example, should different standards be applied in civil and criminal cases?

12. How should judges treat the testimony of professional witnesses? On the issue of admissibility of the expert's opinion, should it matter that the witness primarily makes a living from testifying? What if the expert always testifies for one side? 


