A boundary-based tokenization technique for extractive text summarization

Nnaemeka M Oparauwah; Juliet N Odii; Ikechukwu I Ayogu; Vitalis C Iwuchukwu

doi:10.30574/wjarr.2021.11.2.0351

eISSN: 2581-9615 || CODEN: WJARAI || Impact Factor 8.2 || CrossRef DOI

Department of Computer Science, School of Information and Communication Technology, Federal University of Technology, P.M.B 1526, Owerri, Nigeria.

Research Article

World Journal of Advanced Research and Reviews, 2021, 11(02), 303-312

DOI url: https://doi.org/10.30574/wjarr.2021.11.2.0351

Publication history

Received on 25 June 2021; revised on 10 August 2021; accepted on 12 August 2021

Abstract

The need to extract and manage vital information contained in large volumes of text documents has given rise to several automatic text summarization (ATS) approaches. ATS has found application in academic research, medical health records analysis, content creation and search engine optimization, finance, and media. This study presents a boundary-based tokenization method for extractive text summarization. The proposed method performs word tokenization by defining word boundaries rather than relying on specific delimiters. An extractive summarization algorithm was then developed based on the proposed tokenization method, with word length taken into account to control redundancy in the summary output. Experimental results showed that the proposed approach enhanced word tokenization by improving the selection of appropriate keywords from a text document for use in summarization.

Keywords

Boundary-based; Tokenization; Extractive; Automatic; Text; Summarization
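
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details are not given on this page. It assumes that a word boundary can be approximated by the regular-expression anchor `\b`, that keywords are scored by raw frequency, and that the word-length consideration is a simple minimum-length filter. The names `tokenize`, `summarize`, `n_sentences`, and `min_len` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of word characters between regex word
# boundaries (\b). The paper's own boundary definition is not given here.
WORD = re.compile(r"\b\w+\b")


def tokenize(text):
    """Return lowercase tokens found between word boundaries."""
    return [m.group(0).lower() for m in WORD.finditer(text)]


def summarize(text, n_sentences=1, min_len=4):
    """Pick the n highest-scoring sentences and keep them in document order."""
    # Naive sentence split after '.', '!' or '?' followed by whitespace
    # (illustrative only).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Hypothetical word-length filter: only tokens with at least min_len
    # characters count as keywords.
    keywords = Counter(t for t in tokenize(text) if len(t) >= min_len)

    def score(sentence):
        # Sum of document-level frequencies of the sentence's keywords.
        return sum(keywords[t] for t in tokenize(sentence) if len(t) >= min_len)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Because the tokenizer looks for boundaries instead of splitting on a fixed delimiter list, punctuation such as commas and apostrophes never ends up attached to a token. For example, `tokenize("Hello, world! It's 2021.")` yields `hello`, `world`, `it`, `s`, and `2021`.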

Download Article PDF

https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2021-0351.pdf

How to cite this article

Nnaemeka M Oparauwah, Juliet N Odii, Ikechukwu I Ayogu and Vitalis C Iwuchukwu. A boundary-based tokenization technique for extractive text summarization. World Journal of Advanced Research and Reviews, 2021, 11(2), 303-312. Article DOI: https://doi.org/10.30574/wjarr.2021.11.2.0351

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.
