AMHSR: Open Access Medical Research Journals

Evaluating Traditional Machine Learning Models for Predicting Diabetes Onset Using the Pima Indians Dataset

Author(s):

Faith Nassiwa* and Jiahui Zeng

Diabetes is one of the world's leading diseases. Given its seriousness and the complexity of its diagnosis, we aimed to build a model that helps predict the onset of diabetes. Three models (logistic regression, gradient boosting, and random forest) were trained and evaluated for this task. We used a dataset of 768 records drawn from women of Pima Indian heritage who are at least 21 years old. Preprocessing and model refinement included standardization, the Synthetic Minority Oversampling Technique (SMOTE), and hyperparameter tuning.
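To illustrate the oversampling step, the following is a minimal sketch of the core idea behind SMOTE: each synthetic minority-class sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the sample points are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; expects at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority samples (not taken from the dataset).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real minority samples, it always lies within the range already spanned by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test set contains real records.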

Random forest performed best, with an accuracy of 81.8%, followed by gradient boosting (78%) and logistic regression (76%). According to random forest feature importance, glucose, BMI, and age are the top predictors of diabetes. Because the dataset used in this study is limited in size, we hope that additional datasets will become available to improve the accuracy of the models and provide more information about the onset of diabetes. Moreover, this dataset is specific to one population group; future datasets covering broader groups (a wider range of ages, genders, and races) might offer more insight into this issue.
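For readers unfamiliar with the metric, accuracy is the fraction of test records whose predicted label matches the true label. The sketch below shows the calculation on made-up labels; the `accuracy` helper and the label lists are hypothetical and do not reproduce the study's results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 1 = diabetes onset, 0 = no onset.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]
score = accuracy(y_true, y_pred)  # 8 of 10 predictions match
```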


Abstracted/Indexed in

  • Baidu Scholar
  • CNKI (China National Knowledge Infrastructure)
  • EBSCO Publishing's Electronic Databases
  • Exlibris – Primo Central
  • Google Scholar
  • Hinari
  • Infotrieve
  • National Science Library
  • ProQuest
  • TdNet
  • African Index Medicus
The Annals of Medical and Health Sciences Research is a bi-monthly multidisciplinary medical journal.