Implementation of A Hybrid Model Using K-Means Clustering and Artificial Neural Networks for Risk Prediction in Life Insurance

Kimanga Nthenge, Jeff

dc.contributor.author	Kimanga Nthenge, Jeff
dc.date.accessioned	2024-11-06T16:22:42Z
dc.date.available	2024-11-06T16:22:42Z
dc.date.issued	2024-09
dc.identifier.uri	http://repository.embuni.ac.ke/handle/embuni/4433
dc.description.abstract	Accurate assessment of the risk posed by prospective policyholders is crucial for life insurance companies to effectively price policies and manage long-term liabilities. However, the complexity of risk factors makes relying solely on traditional actuarial models insufficient, particularly with the abundance of big data and unstandardized data from various sources. This study explored the development and performance of a hybrid machine learning model that combines Artificial Neural Network and K-Means Clustering to improve risk prediction in life insurance underwriting. A quasi-experimental design was adopted to evaluate the efficacy of K-Means Clustering and ANN algorithms on benchmark datasets and develop a hybrid model for risk prediction. The proposed hybrid model utilized the strengths of Artificial Neural Networks in modelling nonlinear relationships and K-Means in pattern recognition to handle unstandardized data. Using anonymized life insurance application data from Kaggle, the ANN algorithm achieved an accuracy of 90% but showed limitations in handling nonlinear relationships. K-Means Clustering successfully identified distinct risk profiles among policyholders, revealing hidden patterns in the unlabelled data. The hybrid model, integrating K-Means Clustering and ANN with principal component analysis for feature selection and the Adam optimizer, resulted in higher model performance. Testing accuracy improved from 90% for the standalone ANN to 98% for the hybrid technique, with improvements in precision, recall, and Area Under the ROC Curve. The enhanced predictive capability highlighted the potential of the hybrid approach in modernizing underwriting practices and conducting a more sophisticated data-driven analytical evaluation of policyholder risk. However, there were limitations, such as the use of a single-sourced insurance dataset due to concerns about data privacy. Further research into integrating diverse algorithms and testing on larger real-world datasets can assist insurers in unlocking more value and gaining a competitive advantage through advanced analytical modelling.	en_US
dc.language.iso	en_US	en_US
dc.publisher	UoEm	en_US
dc.subject	Hybrid machine learning model	en_US
dc.title	Implementation of A Hybrid Model Using K-Means Clustering and Artificial Neural Networks for Risk Prediction in Life Insurance	en_US
dc.type	Thesis	en_US