ABSTRACT

Biometric, chemical, physical and digital databases have become an important tool for identifying suspects. When searching, for example, a DNA database, the DNA profile of a crime stain is compared to many reference DNA profiles. The donor of the crime stain will match if that person’s profile is in the database. However, each non-donor also has a small probability of matching by chance. As databases grow, they will generate more matches with true donors, but also more “false” matches that arise by chance.

Ascertaining the evidential value of a database match has led to an intense scientific debate (which Balding called the database controversy). This chapter describes the main arguments and the mathematical solution. If Smith is the single matching person in the database, then the match is very strong evidence that Smith is the donor, but if there is no other evidence in the case to support this conclusion, it may still not be probable that Smith is the donor: the posterior probability could be small. When reporting database matches, it is a challenge to make lay persons aware of the effect of the other evidence (the prior probability), and more research is needed on the optimal way to achieve this.
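
The gap between strong evidence and a small posterior probability can be made concrete with a minimal sketch. It rests on simplifying assumptions that are not stated in this abstract: a population of equally likely, unrelated potential donors, no other evidence in the case, no typing errors, and a single random match probability for every non-donor. The function name, parameter names and numbers are hypothetical and chosen only for illustration.

```python
def posterior_single_match(population_size, database_size, match_prob):
    """Posterior probability that the single matching person is the donor.

    Assumptions (illustrative, not taken from the chapter):
    - every member of the population has the same prior probability
      1 / population_size of being the donor (no other evidence);
    - everyone else in the database is excluded with certainty;
    - each non-donor matches by chance with probability match_prob.
    """
    # Persons outside the database have not been compared, so each of
    # them remains a possible donor who would match with match_prob.
    untested = population_size - database_size
    posterior_odds = 1.0 / (untested * match_prob) if untested > 0 else float("inf")
    # Convert odds to a probability; infinite odds mean certainty.
    if posterior_odds == float("inf"):
        return 1.0
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: 10 million potential donors, 100,000 profiles
# in the database, random match probability of one in a million.
p_donor = posterior_single_match(10_000_000, 100_000, 1e-6)
print(f"posterior probability that Smith is the donor: {p_donor:.3f}")
```

Under these assumed numbers the match is a million times more probable if Smith is the donor than if he is not, yet the posterior probability is only about 9 percent, because roughly ten other people outside the database would be expected to match as well.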

The Bayesian analysis of DNA database searches is also useful in understanding “selection effects” more generally, that is, whenever multiple comparisons are made and the matching or highest scoring one becomes the focus of interest. A number of interesting court cases illustrate the issues.