A survey on image captioning methods

Kavitha Soppari; Pakide Kavya; Kotla Pranay Teja; Bethi Pavan Sai

doi:10.30574/wjarr.2025.26.2.1705

eISSN: 2581-9615 || CODEN: WJARAI || Impact Factor 8.2 || CrossRef DOI

Kavitha Soppari ¹, Pakide Kavya ², Kotla Pranay Teja ²,* and Bethi Pavan Sai ²

¹ACE Engineering College, Hyderabad, India.

²CSE-AI and ML, ACE Engineering College, Hyderabad, India.

Research Article

World Journal of Advanced Research and Reviews, 2025, 26(02), 3134-3143

Article DOI: 10.30574/wjarr.2025.26.2.1705

DOI url: https://doi.org/10.30574/wjarr.2025.26.2.1705

Publication history

Received on 07 April 2025; revised on 19 May 2025; accepted on 21 May 2025

Abstract

Image captioning is a task that uses Natural Language Processing concepts to recognize the context of an image and describe it in a natural language such as English. It requires a good knowledge of deep learning, Python, Jupyter notebooks, the Keras library, NumPy, and natural language processing. This is a Python-based project in which we use two deep learning techniques together: Convolutional Neural Networks (CNN) and a type of Recurrent Neural Network, the Long Short-Term Memory network (LSTM). The biggest challenge is to generate a description that captures not only the objects contained in an image but also how these objects relate to each other. Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. Here, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. Such a model could have great impact, for instance by helping visually impaired people better understand the content of images on the web.
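As a rough illustration of how a generative recurrent model can produce a sentence for an image, the sketch below shows a greedy word-by-word decoding loop in plain Python. It is not the authors' implementation: the names `greedy_caption`, `toy_scorer`, and the `<start>`/`<end>` tokens are assumptions made for this example, and the scorer is a fixed lookup table standing in for a trained CNN + LSTM.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical sentence-boundary tokens

# Shape of the (hypothetical) trained model: given image features and the
# words generated so far, return a score for each candidate next word.
NextWordScorer = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float],
                   score_next: NextWordScorer,
                   max_len: int = 20) -> str:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = score_next(features, words)
        best = max(scores, key=scores.get)  # greedy choice
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scorer(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a CNN + LSTM: a fixed table keyed on the last word.

    A real model would also condition on `features` (the CNN output) and on
    the full word history; this toy ignores both and is NOT a trained model.
    """
    table = {
        START: {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "ball": 0.2},
        "dog": {"runs": 0.7, END: 0.3},
        "runs": {END: 1.0},
    }
    return table[words[-1]]


caption = greedy_caption([0.1, 0.5], toy_scorer)
print(caption)  # prints "a dog runs"
```

In a Keras-based project of the kind the abstract describes, `score_next` would typically wrap a call to the trained model's prediction function, with `features` extracted once per image by the CNN.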

Keywords

CNN; LSTM; Image detection; Deep learning; Natural Language Processing

Download Article PDF

https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2025-1705.pdf

How to cite this article

Kavitha Soppari, Pakide Kavya, Kotla Pranay Teja and Bethi Pavan Sai. A survey on image captioning methods. World Journal of Advanced Research and Reviews, 2025, 26(2), 3134-3143. Article DOI: https://doi.org/10.30574/wjarr.2025.26.2.1705

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.

All statements, opinions, and data contained in this publication are solely those of the individual author(s) and contributor(s). The journal, editors, reviewers, and publisher disclaim any responsibility or liability for the content, including accuracy, completeness, or any consequences arising from its use.
