World Journal of Advanced Research and Reviews

eISSN: 2581-9615 || CODEN: WJARAI || Impact Factor 8.2

High-throughput prediction of small molecule binding affinities to ABL1, HSP90, and CDK2 using gradient boosting and machine learning on the BELKA dataset

Vedant Shrinivas Sagare *

Dublin High School, Dublin, Alameda County, California.
 
Research Article
World Journal of Advanced Research and Reviews, 2024, 24(01), 2426-2434
Article DOI: 10.30574/wjarr.2024.24.1.3068
DOI url: https://doi.org/10.30574/wjarr.2024.24.1.3068
 
Received on 27 August 2024; revised on 23 October 2024; accepted on 26 October 2024
 
Drug development is often burdened by the vast search space of possible drug-like molecules and by resource-intensive conventional screening methods. This work uses machine learning to predict the binding affinity of small molecules to specific protein targets, a key step in modern drug development. The aim is to make drug discovery more efficient and accurate by drawing on the Big Encoded Library for Chemical Assessment (BELKA) dataset, which covers 133 million small molecules screened for interaction against three protein targets: tyrosine-protein kinase ABL1, heat shock protein 90 (HSP90), and cyclin-dependent kinase 2 (CDK2). A LightGBM model was developed for affinity prediction using molecular descriptors derived from the SMILES representation of each molecule. The data were split into training and test sets, and features were extracted with RDKit by calculating the molecular weight and the numbers of hydrogen bond donors and acceptors for each molecule. The model achieved an overall average precision of 0.84, indicating strong predictive power. Average precision was highest for ABL1 at 0.88, with more moderate scores of 0.83 for HSP90 and 0.81 for CDK2. Feature importance analysis showed that molecular weight and hydrogen bonding capacity were among the most influential features in the model's predictions. These results suggest that LightGBM can be a powerful tool for accelerating drug discovery, given its accuracy and efficiency in predicting binding interactions; further improvements may come from including more complex molecular features and 3D descriptors.
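The descriptor step and the evaluation metric described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code: the function names (`molecular_descriptors`, `average_precision`, `mean_over_targets`) are hypothetical, the choice of `Descriptors.MolWt`, `Lipinski.NumHDonors`, and `Lipinski.NumHAcceptors` is an assumption about which RDKit calls correspond to the three named features, and the LightGBM training step is omitted. The per-target scores are the values reported in the abstract.

```python
from typing import Dict, List, Optional, Sequence


def molecular_descriptors(smiles: str) -> Optional[List[float]]:
    """Return [molecular weight, H-bond donors, H-bond acceptors] for a SMILES string.

    RDKit is imported inside the function so the metric helpers below
    can be used without it. Returns None if RDKit cannot parse the string.
    """
    from rdkit import Chem
    from rdkit.Chem import Descriptors, Lipinski

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return [
        Descriptors.MolWt(mol),
        float(Lipinski.NumHDonors(mol)),
        float(Lipinski.NumHAcceptors(mol)),
    ]


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of precision@k over the ranks k that hold a positive.

    Simplified sketch: tied scores keep their input order rather than
    being grouped into a single threshold.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_over_targets(per_target: Dict[str, float]) -> float:
    """Unweighted mean of per-target scores."""
    return sum(per_target.values()) / len(per_target)


# Per-target average precision values as reported in the abstract.
reported = {"ABL1": 0.88, "HSP90": 0.83, "CDK2": 0.81}
print(round(mean_over_targets(reported), 2))  # 0.84
```

The unweighted mean of the three reported per-target scores is 0.84, which matches the overall figure quoted in the abstract; whether the paper computed its overall score this way is an assumption.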
 
Keywords: Small molecule-protein interactions; Machine learning; LightGBM; Molecular descriptors; SMILES; Binding affinity
 
https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2024-3068.pdf

Vedant Shrinivas Sagare. High-throughput prediction of small molecule binding affinities to ABL1, HSP90, and CDK2 using gradient boosting and machine learning on the BELKA dataset. World Journal of Advanced Research and Reviews, 2024, 24(1), 2426-2434. Article DOI: https://doi.org/10.30574/wjarr.2024.24.1.3068

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.


All statements, opinions, and data contained in this publication are solely those of the individual author(s) and contributor(s). The journal, editors, reviewers, and publisher disclaim any responsibility or liability for the content, including accuracy, completeness, or any consequences arising from its use.
