1 University of East London.
2 Swansea University, United Kingdom.
3 University of East London.
World Journal of Advanced Research and Reviews, 2026, 30(02), 1466-1477
Article DOI: 10.30574/wjarr.2026.30.2.1387
Received on 09 April 2026; revised on 16 May 2026; accepted on 19 May 2026
The rapid growth of generative AI has challenged the authenticity of digital media and created new risks for digital forensic investigations. Although many deepfake detection tools report high benchmark accuracy, their suitability for forensic deployment remains unclear. Existing approaches are primarily designed for computer vision benchmarks rather than forensic requirements such as reproducibility, quantified error rates, transparency, and evidentiary admissibility. This systematic literature review evaluates open-source deepfake detection tools against forensic standards including ISO/IEC 27037, NIST SP 800-86, and UK Criminal Procedure Rules Part 32. Using a PRISMA-guided methodology, the study synthesised evidence from 10 peer-reviewed studies (2018–2025) across key forensic criteria including cross-dataset generalisation, reproducibility, explainability, error quantification, and compression robustness. Findings show that tools achieving 95–99% benchmark accuracy declined sharply to 54–75% on realistic out-of-distribution data, with no tool reaching the minimum forensic suitability threshold. Major weaknesses included poor generalisation, lack of confidence intervals and error documentation, limited explainability, and high false positive rates under realistic deployment conditions. The review concludes that current detection approaches are not forensically reliable and remain unsuitable as standalone evidence. Risk-stratified deployment recommendations and key research gaps are identified to support future development of forensic-grade deepfake detection systems.
Synthetic media; Forensic; DeepFake; False positives; False negatives
Chukwudi George-Linus Onyekwere, Otuu Obinna Ogbonnia and Stanley Muturi Githinji. Beyond benchmark accuracy: Evaluating deepfake detection tools for digital forensic admissibility through a systematic review. World Journal of Advanced Research and Reviews, 2026, 30(02), 1466-1477. Article DOI: https://doi.org/10.30574/wjarr.2026.30.2.1387