Beyond benchmark accuracy: Evaluating deepfake detection tools for digital forensic admissibility through a systematic review

Chukwudi George-Linus Onyekwere; Otuu Obinna Ogbonnia; Stanley Muturi Githinji

doi:10.30574/wjarr.2026.30.2.1387

Chukwudi George-Linus Onyekwere ¹,*, Otuu Obinna Ogbonnia ² and Stanley Muturi Githinji ³

¹ University of East London.
² Swansea University, United Kingdom.
³ University of East London.

Review Article

World Journal of Advanced Research and Reviews, 2026, 30(02), 1466-1477

Article DOI: 10.30574/wjarr.2026.30.2.1387

DOI url: https://doi.org/10.30574/wjarr.2026.30.2.1387

Publication history

Received on 09 April 2026; revised on 16 May 2026; accepted on 19 May 2026

Abstract

The rapid growth of generative AI has challenged the authenticity of digital media and created new risks for digital forensic investigations. Although many deepfake detection tools report high benchmark accuracy, their suitability for forensic deployment remains unclear. Existing approaches are primarily designed for computer vision benchmarks rather than forensic requirements such as reproducibility, quantified error rates, transparency, and evidentiary admissibility. This systematic literature review evaluates open-source deepfake detection tools against forensic standards including ISO/IEC 27037, NIST SP 800-86, and UK Criminal Procedure Rules Part 32. Using a PRISMA-guided methodology, the study synthesised evidence from 10 peer-reviewed studies (2018–2025) across key forensic criteria including cross-dataset generalisation, reproducibility, explainability, error quantification, and compression robustness. Findings show that tools achieving 95–99% benchmark accuracy declined sharply to 54–75% on realistic out-of-distribution data, with no tool reaching the minimum forensic suitability threshold. Major weaknesses included poor generalisation, lack of confidence intervals and error documentation, limited explainability, and high false positive rates under realistic deployment conditions. The review concludes that current detection approaches are not forensically reliable and remain unsuitable as standalone evidence. Risk-stratified deployment recommendations and key research gaps are identified to support future development of forensic-grade deepfake detection systems.
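To make the abstract's point about quantified error rates and confidence intervals concrete, the following is a minimal sketch of how a detector's false positive and false negative rates could be reported with an interval instead of a single accuracy figure. It is not the review's method: the function names, the choice of the Wilson score interval, and the confusion-matrix counts are all illustrative assumptions and do not come from the reviewed studies.

```python
import math


def wilson_interval(errors: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = errors / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


def error_rates(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """False positive and false negative rates from confusion-matrix counts."""
    return {
        "fpr": fp / (fp + tn),  # authentic media wrongly flagged as fake
        "fnr": fn / (fn + tp),  # fake media wrongly passed as authentic
    }


# Hypothetical counts for illustration only (not data from the review).
counts = {"tp": 80, "fp": 25, "tn": 75, "fn": 20}
rates = error_rates(**counts)
fpr_low, fpr_high = wilson_interval(counts["fp"], counts["fp"] + counts["tn"])

print(f"FPR = {rates['fpr']:.2f} (approx. 95% CI {fpr_low:.2f} to {fpr_high:.2f})")
print(f"FNR = {rates['fnr']:.2f}")
```

Reporting the interval alongside the point estimate shows how much uncertainty a small test set leaves, which is the kind of error documentation the abstract identifies as missing.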

Keywords

Synthetic media; Forensic; DeepFake; False positives; False negatives

Download Article PDF

https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2026-1387.pdf

How to cite this article

Chukwudi George-Linus Onyekwere, Otuu Obinna Ogbonnia and Stanley Muturi Githinji. Beyond benchmark accuracy: Evaluating deepfake detection tools for digital forensic admissibility through a systematic review. World Journal of Advanced Research and Reviews, 2026, 30(02), 1466-1477. Article DOI: https://doi.org/10.30574/wjarr.2026.30.2.1387

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.

