A Comparison of Bagging and Boosting Algorithms in Ensemble Learning for Heart Disease Prediction

Authors

  • أ. بدر نجيب عويدات, Faculty of Information Technology, Azzaytuna University, Libya

Keywords:

Gradient boosting machines, light gradient boosting machine (LightGBM), extreme gradient boosting (XGBoost), accuracy, recall, precision, ensemble learning

Abstract

Ensemble learning is a general machine learning approach that seeks better predictive performance by combining the predictions of multiple models. It encompasses several methods, including bagging and boosting, each of which has its own family of algorithms, such as Random Forest, adaptive boosting (AdaBoost), and gradient boosting algorithms. In this paper, we compare bagging and boosting in predicting heart failure in terms of classification accuracy, recall, precision, Cohen's kappa coefficient, F-measure, sensitivity, specificity, and the area under the ROC curve (AUC).
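The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code: the labels and scores below are made-up toy values, and the names `Confusion`, `confusion`, `metrics`, and `auc` are hypothetical helpers introduced only for this illustration. The sketch assumes binary labels (1 = disease present) and a test set in which every denominator is non-zero.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative


def confusion(y_true, y_pred):
    """Count the four cells of a binary confusion matrix (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
    )


def metrics(c):
    """Threshold-based measures; assumes no denominator is zero."""
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)  # identical to sensitivity
    specificity = c.tn / (c.tn + c.fp)
    f_measure = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_neg = ((c.tn + c.fn) / n) * ((c.tn + c.fp) / n)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }


def auc(y_true, scores):
    """Area under the ROC curve as the probability that a random positive
    case receives a higher score than a random negative one (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and a model's predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.35, 0.6, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

results = metrics(confusion(y_true, y_pred))
results["auc"] = auc(y_true, scores)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

To compare a bagging model with a boosting model in this way, the same functions would be applied to each model's predictions on one shared held-out test set, so that the measures are directly comparable.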

Published

2023-10-29

How to Cite

أ. بدر نجيب عويدات. (2023). A Comparison of Bagging and Boosting Algorithms in Ensemble Learning for Heart Disease Prediction. Al-Bayan Scientific Journal (مجلة البيان العلمية), (15), 194–209. Retrieved from https://journal.su.edu.ly/index.php/bayan/article/view/1801