1
Albright TD, Scurich N. A call for open science in forensics. Proc Natl Acad Sci U S A 2024; 121:e2321809121. [PMID: 38781227 PMCID: PMC11181113 DOI: 10.1073/pnas.2321809121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The modern canon of open science consists of five "schools of thought" that justify unfettered access to the fruits of scientific research: i) public engagement, ii) democratic right of access, iii) efficiency of knowledge gain, iv) shared technology, and v) better assessment of impact. Here, we introduce a sixth school: due process. Due process under the law includes a right to "discovery" by a defendant of potentially exculpatory evidence held by the prosecution. When such evidence is scientific, due process becomes a Constitutional mandate for open science. To illustrate the significance of this new school, we present a case study from forensics, which centers on a federally funded investigation that reports summary statistics indicating that identification decisions made by forensic firearms examiners are highly accurate. Because of growing concern about validity of forensic methods, the larger scientific community called for public release of the complete analyzable dataset for independent audit and verification. Those in possession of the data opposed release for three years while summary statistics were used by prosecutors to gain admissibility of evidence in criminal trials. Those statistics paint an incomplete picture and hint at flaws in experimental design and analysis. Under the circumstances, withholding the underlying data in a criminal proceeding violates due process. Following the successful open-science model of drug validity testing through "clinical trials," which place strict requirements on experimental design and timing of data release, we argue for registered and open "forensic trials" to ensure transparency and accountability.
Affiliation(s)
- Nicholas Scurich
  - Department of Psychological Science, University of California, Irvine, CA 92697
  - Department of Criminology, Law and Society, University of California, Irvine, CA 92697

2
Almazrouei MA, Kukucka J, Morgan RM, Levy I. Unpacking workplace stress and forensic expert decision-making: From theory to practice. Forensic Sci Int Synerg 2024; 8:100473. [PMID: 38737991 PMCID: PMC11087230 DOI: 10.1016/j.fsisyn.2024.100473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 04/18/2024] [Accepted: 04/18/2024] [Indexed: 05/14/2024]
Abstract
Workplace stress can affect forensic experts' job satisfaction and performance, which holds financial and other implications for forensic service providers. Therefore, it is important to understand and manage workplace stress, but that is not simple or straightforward. This paper explores stress as a human factor that influences forensic expert decision-making. First, we identify and highlight three factors that mitigate decisions under stress conditions: nature of decision, individual differences, and context of decision. Second, we situate workplace stress in forensic science within the Challenge-Hindrance Stressor Framework. We argue that stressors in forensic science workplaces can have a positive or a negative impact, depending on the type, level, and context of stress. Developing an understanding of the stressors, their sources, and their possible impact can help forensic service providers and researchers to implement context-specific interventions to manage stress at work and optimize expert performance.
Affiliation(s)
- Mohammed A. Almazrouei
  - Center for Neurocognition and Behavior, Wu Tsai Institute, Yale University, New Haven, CT, USA
  - Department of Comparative Medicine, Yale University, New Haven, CT, USA
- Jeff Kukucka
  - Department of Psychology, Towson University, Towson, MD, USA
- Ruth M. Morgan
  - Centre for the Forensic Sciences, University College London, London, UK
  - Department of Security and Crime Science, University College London, London, UK
- Ifat Levy
  - Center for Neurocognition and Behavior, Wu Tsai Institute, Yale University, New Haven, CT, USA
  - Department of Comparative Medicine, Yale University, New Haven, CT, USA

3
Swofford H, Lund S, Iyer H, Butler J, Soons J, Thompson R, Desiderio V, Jones J, Ramotowski R. Inconclusive decisions and error rates in forensic science. Forensic Sci Int Synerg 2024; 8:100472. [PMID: 38737990 PMCID: PMC11087963 DOI: 10.1016/j.fsisyn.2024.100472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 04/17/2024] [Accepted: 04/18/2024] [Indexed: 05/14/2024]
Abstract
In recent years, there has been discussion and controversy relating to the treatment of inconclusive decisions in forensic feature comparison disciplines when considering the reliability of examination methods and results. In this article, we offer a brief review of the various viewpoints and suggestions that have been recently put forth, followed by a solution that we believe addresses the treatment of inconclusive decisions. We consider the issues in the context of method conformance and method performance as two distinct concepts, both of which are necessary for the determination of reliability. Method conformance relates to an assessment of whether the outcome of a method is the result of the analyst's adherence to the procedures that define the method. Method performance reflects the capacity of a method to discriminate between different propositions of interest (e.g., mated and non-mated comparisons). We then discuss implications of these issues for the forensic science community.
Affiliation(s)
- H. Swofford
  - National Institute of Standards and Technology (NIST), USA
- S. Lund
  - National Institute of Standards and Technology (NIST), USA
- H. Iyer
  - National Institute of Standards and Technology (NIST), USA
- J. Butler
  - National Institute of Standards and Technology (NIST), USA
- J. Soons
  - National Institute of Standards and Technology (NIST), USA
- R. Thompson
  - National Institute of Standards and Technology (NIST), USA
- V. Desiderio
  - National Institute of Standards and Technology (NIST), USA
- J.P. Jones
  - National Institute of Standards and Technology (NIST), USA
- R. Ramotowski
  - National Institute of Standards and Technology (NIST), USA

4
Gutierrez RE, Prokesch EJ. The false promise of firearms examination validation studies: Lay controls, simplistic comparisons, and the failure to soundly measure misidentification rates. J Forensic Sci 2024. [PMID: 38684627 DOI: 10.1111/1556-4029.15531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 03/18/2024] [Accepted: 04/15/2024] [Indexed: 05/02/2024]
Abstract
Several studies have recently attempted to estimate practitioner accuracy when comparing fired ammunition. But whether this research has included sufficiently challenging comparisons dependent upon expertise for accurate conclusions regarding source remains largely unexplored in the literature. Control groups of lay people comprise one means of vetting this question, of assessing whether comparison samples were at least challenging enough to distinguish between experts and novices. This article therefore utilizes such a group, specifically 82 attorneys, as a post hoc control and juxtaposes their performance on a comparison set of cartridge case images from one commonly cited study (Duez et al. in J Forensic Sci. 2018;63:1069-1084) with that of the original participant pool of professionals. Despite lacking the kind of formalized training and experience common to the latter, our lay participants displayed an ability, generally, to distinguish between cartridge cases fired by the same versus different guns in the 327 comparisons they performed. And while their accuracy rates lagged substantially behind those of the original participant pool of professionals on same-source comparisons, their performance on different-source comparisons was essentially indistinguishable from that of trained examiners. This indicates that although the study we vetted may provide useful information about professional accuracy when performing same-source comparisons, it has little to offer in terms of measuring examiners' ability to distinguish between cartridge cases fired by different guns. If similar issues pervade other accuracy studies, then there is little reason to rely on the false-positive rates they have generated.
Affiliation(s)
- Richard E Gutierrez
  - Organization of Scientific Area Committees Legal Task Group, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
  - Forensic Science Division, Law Office of the Cook County Public Defender, Chicago, Illinois, USA
  - Academy Standards Board, Firearms and Toolmarks Consensus Body, Colorado Springs, Colorado, USA
- Emily J Prokesch
  - Organization of Scientific Area Committees Legal Task Group, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
  - Discovery and Forensic Support Unit, New York State Defenders Association, Albany, New York, USA
  - Columbia School of Law, New York, New York, USA

5
Li Z, Xie L, Song H. Two heads are better than one: Dual systems obtain better performance in facial comparison. Forensic Sci Int 2023; 353:111879. [PMID: 37948948 DOI: 10.1016/j.forsciint.2023.111879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 10/05/2023] [Accepted: 11/01/2023] [Indexed: 11/12/2023]
Abstract
Forensic facial image comparison based on recognition algorithms has been widely applied in forensic science. Previous research has concentrated on comparisons that use a single system, while the use of multiple systems has not yet been studied. In this paper, a dual-systems model (comprising SeetaFace and FaceNet) for facial comparison was constructed, with Bayesian networks as the basic framework. To demonstrate its superiority, a large-scale experiment on the CelebA dataset was carried out to evaluate the score-based likelihood ratio. We used three likelihood ratio evaluation tools (Empirical Cross-Entropy, Cost Likelihood Ratio, Limit Tippett Plots) to assess the performance of the model. The Wasserstein distance was also used to evaluate the detailed likelihood ratio performance. The experimental results show that the likelihood ratio performance of our dual-systems model is better than that of a single system. In addition, our method of model building and evaluation can also be applied to three or more systems.
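As a rough illustration of one of the evaluation tools named in this abstract, the sketch below computes the log-likelihood-ratio cost (Cllr), assuming that "Cost Likelihood Ratio" refers to that metric. The function name `cllr` and the example likelihood ratios are hypothetical and are not taken from the paper; the authors' own pipeline may differ.

```python
import math

def cllr(same_source_lrs, different_source_lrs):
    """Log-likelihood-ratio cost for two lists of likelihood ratios (LRs)."""
    # Same-source pairs are penalised more heavily as their LR falls below 1.
    same_term = sum(math.log2(1 + 1 / lr) for lr in same_source_lrs) / len(same_source_lrs)
    # Different-source pairs are penalised more heavily as their LR rises above 1.
    diff_term = sum(math.log2(1 + lr) for lr in different_source_lrs) / len(different_source_lrs)
    return 0.5 * (same_term + diff_term)

# Made-up likelihood ratios, for illustration only.
informative = cllr([20.0, 50.0, 8.0], [0.05, 0.1, 0.02])
uninformative = cllr([1.0, 1.0], [1.0, 1.0])  # a system that always reports LR = 1
print(informative, uninformative)
```

Lower values indicate better-performing likelihood ratios; a system that always reports LR = 1 scores exactly 1, which is the usual reference point for an uninformative system.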
Affiliation(s)
- Zhihui Li
  - Institute of Forensic Science, Ministry of Public Security, China
- Lanchi Xie
  - Institute of Forensic Science, Ministry of Public Security, China; Department of Electronic Engineering, Tsinghua University, China
- Huaqing Song
  - Institute of Forensic Science, Ministry of Public Security, China

6
Abegg C, Hoxha F, Campana L, Ekizoglu O, Schranz S, Egger C, Grabherr S, Besse M, Moghaddam N. Measuring pelvises in 3D surface scans and in MDCT generated virtual environment: Considerations for applications in the forensic context. Forensic Sci Int 2023; 352:111813. [PMID: 37742459 DOI: 10.1016/j.forsciint.2023.111813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 07/18/2023] [Accepted: 08/26/2023] [Indexed: 09/26/2023]
Abstract
Virtual Anthropology (VA) transposes the traditional methods of physical anthropology to virtual environments using imaging techniques and exploits imaging technologies to devise new methodological protocols. In this research, we investigate whether the measurements used in the Diagnose Sexuelle Probabiliste (DSP) and Ischio-Pubic Index (IPI) differ significantly when 3D models of a bone are generated using 3D surface scans (3DSS) and Multidetector Computed Tomography (MDCT) scans. Thirty pelvises were selected from the SIMON identified skeletal collection. An equal ratio of females to males was sought, as well as good preservation of the bones. The pelvises were scanned using an MDCT scanner and a 3D surface scanner. The measurements of the DSP and IPI methods were taken on the dry bones (referred to as macroscopic measurements here) and then on the 3D models. The intra- and interobserver error was assessed using the Technical Error of Measurement (TEM) and relative Technical Error of Measurement (rTEM), and we aimed to observe whether the measurements made on the MDCT- and 3DSS-generated models were significantly different from those taken on the dry bones. Additionally, the normality of the data was tested (Shapiro-Wilk test) and the differences in measurements were evaluated using parametric (Student t-tests) and non-parametric (Wilcoxon) tests. The TEM and rTEM calculations show high intra- and interobserver consistency in general. However, some measurements present insufficient inter- and intraobserver agreement. Student t and Wilcoxon tests indicate potentially significant differences in some measurements between the different environments. The results show that, especially in the virtual environment, it is not easy to find the right angle for some of the DSP measurements. However, when comparing the measurement differences between dry and virtual bones, the results show that most of the differences are less than or equal to 2.5 mm. Considering the IPI, the landmarks are already difficult to determine on the dry bone, but they are even more difficult to locate in the virtual environment. Nevertheless, this study shows that quantitative methods may be better suited for application in the virtual environment, but further research using different methods is needed.
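The TEM and rTEM statistics mentioned in this abstract can be sketched as follows, assuming the common two-session form of the formula (square root of the summed squared differences divided by twice the number of specimens). The function names and the measurement values are hypothetical, and the study may have used a different variant.

```python
import math

def tem(first, second):
    """Technical Error of Measurement for two repeated measurement series."""
    n = len(first)
    squared_differences = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_differences / (2 * n))

def rtem(first, second):
    """Relative TEM: TEM as a percentage of the grand mean of all measurements."""
    grand_mean = (sum(first) + sum(second)) / (2 * len(first))
    return 100 * tem(first, second) / grand_mean

# Made-up measurements (in mm) of the same two specimens on two occasions.
session_1 = [10.0, 20.0]
session_2 = [10.0, 22.0]
print(tem(session_1, session_2), rtem(session_1, session_2))
```

The same functions cover intraobserver error (one observer, two sessions) and interobserver error (two observers, one session each); only the meaning of the two input series changes.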
Affiliation(s)
- Claudine Abegg
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Lausanne University Hospital and University of Lausanne, Switzerland
- Fatbardha Hoxha
  - Laboratory for Prehistoric Archaeology and Anthropology, Department F.-A. Forel for Environmental and Aquatic Sciences, Faculty of Science, University of Geneva, Switzerland; Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Geneva University Hospital and University of Geneva, Switzerland
- Lorenzo Campana
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Lausanne University Hospital and University of Lausanne, Switzerland
- Oguzhan Ekizoglu
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Geneva University Hospital and University of Geneva, Switzerland; Tepecik Training and Research Hospital, Department of Forensic Medicine, Izmir, Turkey
- Sami Schranz
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Geneva University Hospital and University of Geneva, Switzerland
- Coraline Egger
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Geneva University Hospital and University of Geneva, Switzerland
- Silke Grabherr
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Lausanne University Hospital and University of Lausanne, Switzerland; Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Geneva University Hospital and University of Geneva, Switzerland
- Marie Besse
  - Laboratory for Prehistoric Archaeology and Anthropology, Department F.-A. Forel for Environmental and Aquatic Sciences, Faculty of Science, University of Geneva, Switzerland
- Negahnaz Moghaddam
  - Unit of Forensic Imaging and Anthropology, University Centre of Legal Medicine Lausanne-Geneva, Lausanne University Hospital and University of Lausanne, Switzerland; Swiss Human Institute of Forensic Taphonomy, University Centre of Legal Medicine Lausanne-Geneva, Lausanne University Hospital and University of Lausanne, Switzerland

7
Dror IE. The most consistent finding in forensic science is inconsistency. J Forensic Sci 2023; 68:1851-1855. [PMID: 37658789 DOI: 10.1111/1556-4029.15369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 08/15/2023] [Accepted: 08/15/2023] [Indexed: 09/05/2023]
Abstract
The most consistent finding in many forensic science domains is inconsistency (i.e., lack of reliability, reproducibility, repeatability, and replicability). The lack of consistency is a major problem, both from a scientific and a criminal justice point of view. Examining forensic conclusion data, from across many forensic domains, highlights the underlying cognitive issues and offers a better understanding of the issues and challenges. Such insights enable the development of ways to minimize these inconsistencies and move forward. The aim is to highlight the problem, so that it can be minimized and the reliability of forensic science evidence can be improved.
Affiliation(s)
- Itiel E Dror
  - Cognitive Consultants International (CCI-HQ), London, UK

8
Stout P. The secret life of crime labs. Proc Natl Acad Sci U S A 2023; 120:e2303592120. [PMID: 37782808 PMCID: PMC10576105 DOI: 10.1073/pnas.2303592120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
Houston, TX experienced a widely known failure of its police forensic laboratory. This gave rise to the Houston Forensic Science Center (HFSC) as a separate entity to provide forensic services to the City of Houston. HFSC is a very large forensic laboratory and has made significant progress at remediating the past failures and improving public trust in forensic testing. HFSC has a large and robust blind testing program, which has provided many insights into the challenges forensic laboratories face. HFSC's journey from a notoriously failed lab to a model also gives perspective to the resource challenges faced by all labs in the country. Challenges for labs include the pervasive reality of poor-quality evidence. Another is that forensic laboratories are necessarily part of a much wider system of interdependent functions in criminal justice, which makes blind testing something in which all parts have a role. This interconnectedness also highlights the need for an array of oversight and regulatory frameworks to function properly. The major essential databases in forensics need to be a part of blind testing programs, and work is needed to ensure that these databases are indeed producing correct results and that those results are being correctly used. Last, laboratory reports of "inconclusive" results are a significant challenge: laboratories and the system need to better understand when these results are appropriate and necessary and, most importantly, whether they are correctly used by the rest of the system.
Affiliation(s)
- Peter Stout
  - Houston Forensic Science Center, Houston, TX 77002

9
Scurich N, Faigman DL, Albright TD. Scientific guidelines for evaluating the validity of forensic feature-comparison methods. Proc Natl Acad Sci U S A 2023; 120:e2301843120. [PMID: 37782809 PMCID: PMC10576079 DOI: 10.1073/pnas.2301843120] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/04/2023]
Abstract
When it comes to questions of fact in a legal context, particularly questions about measurement, association, and causality, courts should employ ordinary standards of applied science. Applied sciences generally develop along a path that proceeds from a basic scientific discovery about some natural process to the formation of a theory of how the process works and what causes it to fail, to the development of an invention intended to assess, repair, or improve the process, to the specification of predictions of the instrument's actions and, finally, empirical validation to determine that the instrument achieves the intended effect. These elements are salient and deeply embedded in the cultures of the applied sciences of medicine and engineering, both of which primarily grew from basic sciences. However, the inventions that underlie most forensic science disciplines have few roots in basic science, and they do not have sound theories to justify their predicted actions or results of empirical tests to prove that they work as advertised. Inspired by the "Bradford Hill Guidelines" (the dominant framework for causal inference in epidemiology), we set forth four guidelines that can be used to establish the validity of forensic comparison methods generally. This framework is not intended as a checklist establishing a threshold of minimum validity, as no magic formula determines when particular disciplines or hypotheses have passed a necessary threshold. We illustrate how these guidelines can be applied by considering the discipline of firearm and tool mark examination.
Affiliation(s)
- Nicholas Scurich
  - Department of Psychological Science, Department of Criminology, Law and Society, University of California, Irvine, CA 92697
- David L. Faigman
  - University of California College of the Law, San Francisco, CA 94102

10
Koehler JJ, Mnookin JL, Saks MJ. The scientific reinvention of forensic science. Proc Natl Acad Sci U S A 2023; 120:e2301840120. [PMID: 37782789 PMCID: PMC10576124 DOI: 10.1073/pnas.2301840120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
Forensic science is undergoing an evolution in which a long-standing "trust the examiner" focus is being replaced by a "trust the scientific method" focus. This shift, which is in progress and still partial, is critical to ensure that the legal system uses forensic information in an accurate and valid way. In this Perspective, we discuss the ways in which the move to a more empirically grounded scientific culture for the forensic sciences impacts testing, error rate analyses, procedural safeguards, and the reporting of forensic results. However, we caution that the ultimate success of this scientific reinvention likely depends on whether the courts begin to engage with forensic science claims in a more rigorous way.
Affiliation(s)
- Michael J. Saks
  - Sandra Day O’Connor College of Law, Arizona State University, Phoenix, AZ 85004

11
Monson KL, Smith ED, Peters EM. Repeatability and reproducibility of comparison decisions by firearms examiners. J Forensic Sci 2023; 68:1721-1740. [PMID: 37393551 DOI: 10.1111/1556-4029.15318] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Revised: 06/12/2023] [Accepted: 06/13/2023] [Indexed: 07/04/2023]
Abstract
In a comprehensive study to assess various aspects of the performance of qualified forensic firearms examiners, volunteer examiners compared both bullets and cartridge cases fired from three different types of firearms. They rendered opinions on each comparison according to the Association of Firearm & Tool Mark Examiners (AFTE) Range of Conclusions, as Identification, Inconclusive (A, B, or C), Elimination, or Unsuitable. In this part of the study, comparison sets used previously to characterize the overall accuracy of examiners were blindly resubmitted to examiners to assess the repeatability (105 examiners; 5700 comparisons of bullets and cartridge cases) and reproducibility (191 examiners of bullets, 193 of cartridge cases; 5790 comparisons) of firearms examinations. Data gathered using the prevailing AFTE Range were also recategorized into two hypothetical scoring systems. Consistently positive differences between observed agreement and expected agreement indicate that the repeatability and reproducibility of examiners exceed chance agreement. When averaged over bullets and cartridge cases, the repeatability of comparison decisions (involving all five levels of the AFTE Range) was 78.3% for known matches and 64.5% for known nonmatches. Similarly averaged reproducibility was 67.3% for known matches and 36.5% for known nonmatches. For both repeatability and reproducibility, many of the observed disagreements were between a definitive and inconclusive category. Examiner decisions are reliable and trustworthy in the sense that identifications are unlikely when examiners are comparing non-matching items, and eliminations are unlikely when they are comparing matching items.
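The comparison of observed and expected agreement described in this abstract can be sketched as below. This is a minimal illustration that assumes expected (chance) agreement is computed from the marginal decision frequencies of the two rounds, as in Cohen's kappa; the study's exact procedure is not given in the abstract and may differ. The function names and the decision labels are hypothetical.

```python
from collections import Counter

def observed_agreement(first, second):
    """Share of comparisons that received the same decision in both rounds."""
    return sum(a == b for a, b in zip(first, second)) / len(first)

def expected_agreement(first, second):
    """Agreement expected by chance, from each round's marginal decision frequencies."""
    n = len(first)
    first_counts, second_counts = Counter(first), Counter(second)
    # A category missing from the second round contributes zero (Counter returns 0).
    return sum((first_counts[c] / n) * (second_counts[c] / n) for c in first_counts)

# Made-up decisions on the same four comparison sets in two rounds.
round_1 = ["ID", "ID", "Inconclusive", "Elimination"]
round_2 = ["ID", "Inconclusive", "Inconclusive", "Elimination"]
difference = observed_agreement(round_1, round_2) - expected_agreement(round_1, round_2)
print(difference)
```

A positive difference means the two rounds agree more often than their marginal frequencies alone would predict, which is the sense in which agreement "exceeds chance" here.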
Affiliation(s)
- Keith L Monson
  - Federal Bureau of Investigation Laboratory, Quantico, Virginia, USA
- Erich D Smith
  - Federal Bureau of Investigation Laboratory, Quantico, Virginia, USA
- Eugene M Peters
  - Federal Bureau of Investigation Laboratory, Quantico, Virginia, USA

12
Warren EM, Sheets HD. The inconclusive category, entropy, and forensic firearm identification. Forensic Sci Int 2023; 349:111741. [PMID: 37279628 DOI: 10.1016/j.forsciint.2023.111741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 05/15/2023] [Accepted: 05/30/2023] [Indexed: 06/08/2023]
Abstract
There has been extensive recent discussion of the difficulty in estimating meaningful error rates in forensic firearms examinations, and other areas of pattern evidence. The 2016 President's Council of Advisors on Science and Technology (PCAST) report was clear in criticizing many forensic disciplines as lacking the types of studies that would provide error rate measurements seen in other scientific fields. However, there is a substantial lack of consensus on the approach to measuring an "error rate" for fields such as forensic firearm examination that include in the conclusion scale the "inconclusive" category, as occurs in the Association of Firearm and Tool Mark Examiners (AFTE) Range of Conclusions and many other such fields. Many authors appear to assume the error rate calculated in the binary decision model is the only appropriate way to report errors, but there have been attempts made to adapt the error rate from the binary decision model to scientific fields in which the inconclusive category is viewed as a meaningful outcome of the examination process. In this study we present three neural networks of differing complexity and performance trained to classify the outlines of ejector marks on cartridge cases fired from different firearm models, as a model system for examining the performance of various metrics of error in systems using the inconclusive category. We also discuss an entropy-based (information-based) method to assess the similarity of classifications to ground truth that is applicable to a range of conclusion scales, even when the inconclusive category is used.
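To give a feel for how an information-based measure can handle a conclusion scale that includes "inconclusive", the sketch below computes the mutual information between ground truth and reported conclusions. This is an assumed, generic formulation chosen for illustration, not necessarily the metric the authors propose; the function name, labels, and data are hypothetical.

```python
import math
from collections import Counter

def mutual_information_bits(pairs):
    """Mutual information (in bits) between ground truth and reported conclusion."""
    n = len(pairs)
    joint = Counter(pairs)
    truth_counts = Counter(truth for truth, _ in pairs)
    decision_counts = Counter(decision for _, decision in pairs)
    total = 0.0
    for (truth, decision), count in joint.items():
        # count * n / (row total * column total) equals p(t, d) / (p(t) * p(d)).
        ratio = count * n / (truth_counts[truth] * decision_counts[decision])
        total += (count / n) * math.log2(ratio)
    return total

# Made-up outcomes: (ground truth, reported conclusion).
perfect = [("match", "ID")] * 5 + [("nonmatch", "Elimination")] * 5
all_inconclusive = [("match", "Inconclusive")] * 5 + [("nonmatch", "Inconclusive")] * 5
print(mutual_information_bits(perfect), mutual_information_bits(all_inconclusive))
```

Under this formulation an examiner who always reports "inconclusive" conveys zero bits about ground truth, without the analyst having to decide whether an inconclusive counts as an error.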
Affiliation(s)
- E M Warren
  - SEP Forensic Consultants, 296 Washington Ave., Memphis, TN 38103, USA
- H D Sheets
  - Data Analytics Program, Department of Quantitative Science, Canisius College, 2001 Main Street, Buffalo, NY 14208, USA

13
Luby A. A method for quantifying individual decision thresholds of latent print examiners. Forensic Sci Int Synerg 2023; 7:100340. [PMID: 37448982 PMCID: PMC10336733 DOI: 10.1016/j.fsisyn.2023.100340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 05/30/2023] [Accepted: 06/13/2023] [Indexed: 07/18/2023]
Abstract
In recent years, 'black box' studies in forensic science have emerged as the preferred way to provide information about the overall validity of forensic disciplines in practice. These studies provide aggregated error rates over many examiners and comparisons, but errors are not equally likely on all comparisons. Furthermore, inconclusive responses are common and vary across examiners and comparisons, but do not fit neatly into the error rate framework. This work introduces Item Response Theory (IRT) and variants for the forensic setting to account for these two issues. In the IRT framework, participant proficiency and item difficulty are estimated directly from the responses, which accounts for the different subsets of items that participants often answer. By incorporating a decision-tree framework into the model, inconclusive responses are treated as a distinct cognitive process, which allows inter-examiner differences to be estimated directly. The IRT-based model achieves superior predictive performance over standard logistic regression techniques, produces item effects that are consistent with common sense and prior work, and demonstrates that most of the variability among fingerprint examiner decisions occurs at the latent print evaluation stage and as a result of differing tendencies to make inconclusive decisions.
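The core idea of Item Response Theory mentioned in this abstract can be illustrated with the simplest IRT variant, the Rasch model, in which the probability of a correct response depends only on the gap between participant proficiency and item difficulty. This is a sketch under that assumption; the model in the paper adds a decision-tree structure for inconclusive responses that is not reproduced here, and the function name and parameter values are hypothetical.

```python
import math

def rasch_probability(proficiency, difficulty):
    """Probability of a correct response under the Rasch (one-parameter logistic) model."""
    return 1.0 / (1.0 + math.exp(-(proficiency - difficulty)))

# Made-up values: one examiner (proficiency 1.0) facing an easy and a hard comparison.
easy_item = rasch_probability(1.0, 0.0)
hard_item = rasch_probability(1.0, 2.0)
print(easy_item, hard_item)
```

Because proficiency and difficulty sit on the same scale, the model can compare examiners who answered different subsets of items, which is the property the abstract highlights.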
Affiliation(s)
- Amanda Luby
  - Department of Mathematics & Statistics, Swarthmore College, USA

14
Gutierrez RE, Addyman C. Commentary on: Monson KL, Smith ED, Peters EM. Accuracy of comparison decisions by forensic firearms examiners. J Forensic Sci. 2022;68(1):86-100. https://doi.org/10.1111/1556-4029.15152. J Forensic Sci 2023; 68:1097-1101. [PMID: 37083221 DOI: 10.1111/1556-4029.15257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 03/27/2023] [Indexed: 04/22/2023]
Affiliation(s)
- Richard E Gutierrez
  - Forensic Science Division, Law Office of the Cook County Public Defender, Chicago, Illinois, USA
  - Organization of Scientific Area Committees Legal Task Group, National Institute of Standards & Technology, Gaithersburg, Maryland, USA
- Celeste Addyman
  - Forensic Science Division, Law Office of the Cook County Public Defender, Chicago, Illinois, USA

15
Mattei A, Zampa F. Error rates and proficiency tests in the fingerprint domain: A matter of perspective and conceptualization. Forensic Sci Int 2023:111651. [PMID: 37012125 DOI: 10.1016/j.forsciint.2023.111651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 12/01/2022] [Accepted: 03/21/2023] [Indexed: 04/04/2023]
Abstract
The purpose of this work is to critically analyse aspects connected both with the measurement of error rates and with the design of proficiency tests (PTs) and collaborative exercises (CEs) in the fingerprint domain, from the dual perspective of practitioners and organizers of PTs/CEs. A thorough analysis of the types of errors, and of the methods to infer them through black-box studies and PTs/CEs, is carried out. The limits to the generalization of error rates are described, providing indications on how to design PTs/CEs in the fingerprint domain that aim to represent the complexity of casework.
Affiliation(s)
- Aldo Mattei
- Raggruppamento Carabinieri Investigazioni Scientifiche, RIS Messina, Italy
- Francesco Zampa
- Raggruppamento Carabinieri Investigazioni Scientifiche, RIS Parma, Italy
|
16
|
Monson KL, Smith ED, Peters EM. Accuracy of comparison decisions by forensic firearms examiners. J Forensic Sci 2023; 68:86-100. [PMID: 36183147 PMCID: PMC10092368 DOI: 10.1111/1556-4029.15152] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Revised: 08/29/2022] [Accepted: 09/13/2022] [Indexed: 12/31/2022]
Abstract
This black box study assessed the performance of forensic firearms examiners in the United States. It involved three different types of firearms and 173 volunteers who performed a total of 8640 comparisons of both bullets and cartridge cases. The overall false-positive error rate was estimated as 0.656% and 0.933% for bullets and cartridge cases, respectively, while the rate of false negatives was estimated as 2.87% and 1.87% for bullets and cartridge cases, respectively. The majority of errors were made by a limited number of examiners. Because chi-square tests of independence strongly suggest that error probabilities are not the same for each examiner, these are maximum-likelihood estimates based on the beta-binomial probability model and do not depend on an assumption of equal examiner-specific error rates. Corresponding 95% confidence intervals are (0.305%, 1.42%) and (0.548%, 1.57%) for false positives for bullets and cartridge cases, respectively, and (1.89%, 4.26%) and (1.16%, 2.99%) for false negatives for bullets and cartridge cases, respectively. The results of this study are consistent with prior studies, despite its comprehensive design and challenging specimens.
Affiliation(s)
- Keith L Monson
- Federal Bureau of Investigation Laboratory, Quantico, Virginia, USA
- Erich D Smith
- Federal Bureau of Investigation Laboratory, Quantico, Virginia, USA
- Eugene M Peters
- Federal Bureau of Investigation Laboratory, Quantico, Virginia, USA
|
17
|
On the (mis)calculation of forensic science error rates. Proc Natl Acad Sci U S A 2022; 119:e2215695119. [PMID: 36534798 PMCID: PMC9907143 DOI: 10.1073/pnas.2215695119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
|
18
|
Reply to Kukucka: Calculating error rates in forensic handwriting examiner decisions. Proc Natl Acad Sci U S A 2022; 119:e2217508119. [PMID: 36534793 PMCID: PMC9907131 DOI: 10.1073/pnas.2217508119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
|
19
|
Houck MM, Chin J, Swofford H, Gibb C. Registered reports in forensic science. R Soc Open Sci 2022; 9:221076. [PMID: 36465679 PMCID: PMC9709573 DOI: 10.1098/rsos.221076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/06/2022] [Accepted: 11/06/2022] [Indexed: 06/17/2023]
Abstract
Research assessing the validity and reliability of many forensic science disciplines has been published; however, the quality of this research varies depending on the methodologies employed. This was a major point of contention with the United States' President's Council of Advisors on Science and Technology, who recognized the existing literature but found the majority lacking because of methodological issues. Questionable scientific methodologies have undermined the forensic science community's ability to defend the scientific foundations and examination protocols used to examine evidence in criminal cases. Such scientific failures have significant legal implications. Registered reports, which strengthen the quality of scientific research and reliability of laboratory protocols, can provide transparency, validity and a stronger scientific foundation for forensic science.
Affiliation(s)
- M. M. Houck
- Graduate Program Director, Global Forensic and Justice Center, Florida International University, Miami, FL 33199, USA
- J. Chin
- College of Law, Australian National University Sydney, Sydney, NSW 2000, Australia
- H. Swofford
- HJS Consulting, LLC, Washington, DC, USA; Senior Editor, Forensic Science International: Synergy
- C. Gibb
- The University of Twente, Amsterdam, The Netherlands
|
20
|
Abstract
Much of forensic practice today involves human decisions about the origins of patterned sensory evidence, such as tool marks and fingerprints discovered at a crime scene. These decisions are made by trained observers who compare the evidential pattern to an exemplar pattern produced by the suspected source of the evidence. The decision consists of a determination as to whether the two patterns are similar enough to have come from the same source. Although forensic pattern comparison disciplines have for decades played a valued role in criminal investigation and prosecution, the extremely high personal and societal costs of failure (the conviction of innocent people) have elicited calls for caution and for the development of better practices. These calls have been heard by the scientific community involved in the study of human information processing, which has begun to offer much-needed perspectives on sensory measurement, discrimination, and classification in a forensic context. Here I draw from a well-established theoretical and empirical approach in sensory science to illustrate the vulnerabilities of contemporary pattern comparison disciplines and to suggest specific strategies for improvement.
|
21
|
Monson KL, Smith ED, Bajic SJ. Planning, design and logistics of a decision analysis study: The FBI/Ames study involving forensic firearms examiners. Forensic Sci Int Synerg 2022; 4:100221. [PMID: 35243285 PMCID: PMC8860930 DOI: 10.1016/j.fsisyn.2022.100221] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2021] [Revised: 02/03/2022] [Accepted: 02/04/2022] [Indexed: 11/03/2022]
Abstract
This paper describes design and logistical aspects of a decision analysis study to assess the performance of qualified firearms examiners working in accredited laboratories in the United States in terms of accuracy (error rate), repeatability, and reproducibility of decisions involving comparisons of fired bullets and cartridge cases. The purpose of the study was to validate current practice of the forensic discipline of firearms/toolmarks (F/T) examination. It elicited error rate data by counting the number of false positive and false negative conclusions. Preceded by the experimental design, decisions, and logistics described herein, testing was ultimately administered to 173 qualified, practicing F/T examiners in public and private crime laboratories. The first round of testing evaluated accuracy, while two subsequent rounds evaluated repeatability and reproducibility of examiner conclusions. This project expands on previous studies by involving many F/T examiners in challenging comparisons and by executing the study in the recommended double-blind format.
|
22
|
Rodriguez AM, Geradts Z, Worring M. Calibration of Score based Likelihood Ratio estimation in automated forensic facial image comparison. Forensic Sci Int 2022; 334:111239. [DOI: 10.1016/j.forsciint.2022.111239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Revised: 01/15/2022] [Accepted: 02/22/2022] [Indexed: 11/25/2022]
|
23
|
Forensic science and the principle of excluded middle: "Inconclusive" decisions and the structure of error rate studies. Forensic Sci Int Synerg 2021; 3:100147. [PMID: 33981984 PMCID: PMC8082088 DOI: 10.1016/j.fsisyn.2021.100147] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2021] [Revised: 03/20/2021] [Accepted: 03/29/2021] [Indexed: 11/21/2022]
Abstract
In a paper published recently in this journal, Dror and Scurich (2020) [20] critically discuss the notions of "inconclusive evidence" (i.e., test items for which it is difficult to render a categorical response) and "inconclusive decisions" (i.e., experts' conclusions or responses) in the context of forensic science error rate studies. They expose several ways in which the understanding and use of "inconclusives" in current forensic science research and practice can adversely affect the outcomes of error rate studies. A main cause of distortion, according to Dror and Scurich, is what they call "erroneous inconclusive" decisions, in particular the lack of acknowledgment of this type of erroneous conclusion in the computation of error rates. To overcome this complication, Dror and Scurich call for a more explicit monitoring of "inconclusives" using a modified error rate study design. Whilst we agree with several well-argued points raised by the authors, we disagree with their framing of "inconclusive decisions" as potential errors. In this paper, we argue that referring to an "inconclusive decision" as an error is a contradiction in terms, runs counter to an analysis based on decision logic and, hence, is questionable as a concept. We also reiterate that the very term "inconclusive decision" disregards the procedural architecture of the criminal justice system across modern jurisdictions, especially the fact that forensic experts have no decisional rights in the criminal process. These positions do not ignore the possibility that "inconclusives" - if used excessively - do raise problems in forensic expert reporting, in particular limited assertiveness (or, overcautiousness). However, these drawbacks derive from inherent limitations of experts rather than from the seemingly erroneous nature of "inconclusives" that needs to be fixed. 
More fundamentally, we argue that attempts to score "inconclusives" as errors amount to philosophical claims disguised as forensic methodology. Specifically, these attempts interfere with the metaphysical substrate underpinning empirical research. We point this out on the basis of the law of the excluded middle, i.e. the principle of "no third possibility being given" (tertium non datur).
|
24
|
Forensic science in Seychelles: An example of a micro-jurisdiction forensic delivery system. Forensic Sci Int Synerg 2021; 3:100139. [PMID: 33681750 PMCID: PMC7930355 DOI: 10.1016/j.fsisyn.2021.100139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2021] [Accepted: 02/05/2021] [Indexed: 11/30/2022]
Abstract
Forensic science has become an indispensable tool for even the smallest of jurisdictions. However, micro-jurisdictions often face significant challenges with respect to resource availability, administration and local governance. This paper examines the forensic service provision in Seychelles as an example of a micro-jurisdiction forensic delivery system. The impact of limited resources and remote access to consumables or services has prompted the prospective shift to localise commonly utilised forensic services. The potential for a solid foundation for a sustainable forensic service is examined in relation to jurisdictions with more advanced forensic service delivery. Reforms of the legal framework, administration, and governance structures are some of the key underpinnings for an effective forensic delivery system built on a culture of transparent science that promotes justice and creates public confidence.
- The Seychelles as an example of a micro-jurisdiction forensic delivery system.
- Geographically remote location brings challenges to sustainable service provision.
- Current investment into capacity building of commonly utilised forensic services.
- Innovative solutions required for effective and efficient forensic delivery system.
- Transparent science culture needed to promote justice and create public confidence.
|
25
|
Scarpazza C, Miolla A, Zampieri I, Melis G, Sartori G, Ferracuti S, Pietrini P. Translational Application of a Neuro-Scientific Multi-Modal Approach Into Forensic Psychiatric Evaluation: Why and How? Front Psychiatry 2021; 12:597918. [PMID: 33613339 PMCID: PMC7892615 DOI: 10.3389/fpsyt.2021.597918] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/22/2020] [Accepted: 01/14/2021] [Indexed: 01/01/2023] Open
Abstract
A prominent body of literature indicates that insanity evaluations, which are intended to provide influential expert reports for judges to reach a decision "beyond any reasonable doubt," suffer from low inter-rater reliability. This paper reviews the limitations of the classical approach to insanity evaluation and the criticisms of the introduction of a neuro-scientific approach in court. Here, we explain why, in our opinion, these criticisms, which seriously hamper the translational implementation of neuroscience into the forensic setting, do not survive scientific scrutiny. Moreover, we discuss how the neuro-scientific multimodal approach may improve the inter-rater reliability of insanity evaluation. Critically, neuroscience does not aim to introduce a brain-based concept of insanity. Indeed, criteria for responsibility and insanity are and should remain clinical. Rather, following the falsificationist approach and the convergence of evidence principle, the neuro-scientific multimodal approach is being proposed as a way to improve the reliability of insanity evaluation and to mitigate the influence of cognitive biases on the formulation of insanity opinions, with the final aim of reducing errors and controversies.
Affiliation(s)
- Cristina Scarpazza
- Department of General Psychology, University of Padova, Padova, Italy
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Alessio Miolla
- Department of General Psychology, University of Padova, Padova, Italy
- Ilaria Zampieri
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Lucca, Italy
- Giulia Melis
- Department of General Psychology, University of Padova, Padova, Italy
- Giuseppe Sartori
- Department of General Psychology, University of Padova, Padova, Italy
- Stefano Ferracuti
- Department of Human Neurosciences, “Sapienza” University of Rome, Rome, Italy
- Pietro Pietrini
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Lucca, Italy
|
26
|
Scurich N, Dror IE. Continued confusion about inconclusives and error rates: Reply to Weller and Morris. Forensic Sci Int Synerg 2021; 2:703-704. [PMID: 33385151 PMCID: PMC7770453 DOI: 10.1016/j.fsisyn.2020.10.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Nicholas Scurich
- University of California, 4312 Social and Behavioral Sciences Gateway, Irvine, CA, 92697, USA
- Itiel E Dror
- University College London (UCL), 35 Tavistock Square, London, WC1H 9EZ, United Kingdom
|
27
|
Dror IE, Scherr KC, Mohammed LA, MacLean CL, Cunningham L. Biasability and reliability of expert forensic document examiners. Forensic Sci Int 2020; 318:110610. [PMID: 33358191 DOI: 10.1016/j.forsciint.2020.110610] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2020] [Revised: 11/17/2020] [Accepted: 11/18/2020] [Indexed: 11/30/2022]
Abstract
The performance of experts can be characterized in terms of biasability and reliability of their judgments. The current research is the first to explore the judgments of practicing forensic document experts, professionals who examine and compare disputed handwritten evidence to handwriting exemplars of individuals involved in criminal or civil litigation. Forensic handwriting experts determine if questioned and known handwritten items are of common authorship or written by different individuals, and present their findings in legal proceedings. The expert participants in our study (N=25) were not aware that they were part of a research study. Thirteen participants were led to believe that they were working on a case commissioned by the prosecution and the other twelve that it was for the defense. We did not find evidence in this study that this information biased their judgments, which may make sense since document examiners (in contrast to those in many other forensic domains) do not primarily work within an organizational forensic laboratory culture. The lack of bias in our findings may also have been due to the stimuli we used and/or the great variability in the judgments within each group, reflecting a lack of consistency in conclusions among examiners. A detailed discussion of our findings is presented along with the limitations that may have affected our results.
Affiliation(s)
- Itiel E Dror
- University College London, London, United Kingdom
- Kyle C Scherr
- Central Michigan University, Michigan, United States
|