Computational Pharmacology and Toxicity Modeling Datasets for Drug Discovery and Translational Research

Authors

  • Samuel Chima Ugbaja, Traditional Medicine, School of Medicine, University of KwaZulu-Natal, Durban 4000, South Africa
  • Mlungisi Ngcobo, Traditional Medicine, School of Medicine, University of KwaZulu-Natal, Durban 4000, South Africa

Keywords:

computational pharmacology, toxicity prediction, machine learning, drug discovery, molecular datasets

Abstract

The integration of computational approaches with pharmacological data has transformed modern drug discovery, enabling efficient prediction of molecular properties, toxicity, and clinical outcomes. This study presents a unified framework that leverages multiple publicly available datasets, including SIDER, ClinTox, Tox21, ToxCast, and BBBP, to model drug behavior across molecular, biological, and translational levels. Comprehensive data preprocessing, feature engineering, and machine learning techniques were applied to capture structure–activity relationships and predict pharmacological endpoints. A combination of classical machine learning and deep learning models, including graph-based architectures and multi-task learning frameworks, was employed to address dataset heterogeneity and improve predictive performance. The results demonstrate strong predictive capability for toxicity and permeability endpoints, with improved generalization achieved through scaffold-based data splitting and ensemble modeling. Feature importance analysis identified key physicochemical properties influencing drug response, while integration across datasets enabled the construction of comprehensive pharmacological profiles. The study highlights the value of combining mechanistic bioassay data with clinical outcome datasets to bridge the gap between molecular-level predictions and real-world therapeutic relevance. Limitations related to data imbalance and lack of formulation-specific information were identified, indicating areas for future improvement. Overall, the proposed framework supports efficient and scalable drug discovery by enhancing predictive accuracy and enabling informed decision-making in early-stage development.
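The abstract credits scaffold-based data splitting with improved generalization but does not describe the procedure. The sketch below illustrates the general idea only and is not the authors' implementation: molecules that share a core scaffold are kept together, so no scaffold appears in both the training and test sets. The function name `scaffold_split`, the `frac_train` parameter, and the toy scaffold strings are hypothetical, and the scaffold identifiers are assumed to be precomputed (for example, as Bemis-Murcko scaffold SMILES from a cheminformatics toolkit).

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8):
    """Split sample indices so that no scaffold spans both sets.

    `scaffolds` holds one scaffold identifier per molecule. These are
    assumed to be precomputed; computing them is outside this sketch.
    """
    # Group molecule indices by their scaffold identifier.
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest groups first (ties broken by first index), so that rare
    # scaffolds tend to end up in the test set.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    train_budget = frac_train * len(scaffolds)
    train, test = [], []
    for group in ordered:
        # A whole group goes to one side; it is never divided.
        if len(train) + len(group) <= train_budget:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


# Toy example with hypothetical scaffold strings (not from the study).
scaffolds = [
    "c1ccccc1", "c1ccccc1", "c1ccccc1", "c1ccncc1", "c1ccncc1",
    "C1CCCCC1", "c1ccoc1", "c1ccsc1", "c1ccccc1", "c1ccncc1",
]
train_idx, test_idx = scaffold_split(scaffolds, frac_train=0.8)
print("train:", sorted(train_idx))
print("test:", sorted(test_idx))
```

Because every test scaffold is unseen during training, performance measured this way reflects generalization to new chemotypes rather than to close analogues, which is usually a stricter estimate than a random split gives.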

Published

2026-04-06