| dc.description.abstract |
In this thesis, I aim to develop a machine learning model for predicting customer
churn in the telecom industry, using data from major telecom operators in India,
including Airtel, Reliance Jio, Vodafone, and BSNL. The dataset contains 243,553
customer records with demographic, usage, and geographic features, along with a
binary variable indicating whether the customer has churned. The goal of this project
is to accurately predict customer churn, providing telecom companies with valuable
insights to retain at-risk customers and optimize marketing efforts. I explore several
machine learning models, including Logistic Regression, Random Forest, and
Gradient Boosting. After preprocessing the data (addressing missing values, encoding
categorical variables, and handling class imbalance using SMOTE), I evaluate each
model’s performance using accuracy, ROC-AUC score, and classification metrics
such as precision, recall, and F1-score. Among the models tested, Gradient Boosting
outperforms the others, achieving an accuracy of 95.2% and a ROC-AUC
score of 0.9251. This model shows a balanced trade-off between precision and recall,
especially for the minority churn class. The findings demonstrate that Gradient
Boosting is a highly effective tool for churn prediction in the telecom sector, capable
of providing actionable insights for customer retention strategies. The results also
highlight the importance of feature engineering and data preprocessing in improving
model performance. This research offers a solid foundation for applying machine
learning to real-world business problems, particularly in customer retention within the
telecom industry. |
en_US |