Abstract:
Customer satisfaction and customer retention are central to business strategy, but the link between them is usually assumed rather than demonstrated quantitatively. This research addresses that gap by quantifying the effect of customer satisfaction on retention using a hybrid analysis. It uses a publicly available e-commerce dataset from Kaggle comprising 3,900 customer records. Preprocessing involved imputing missing values and engineering the two main variables: Retention (based on Subscription status) and satisfaction binary (based on binned Review rating scores). A two-stage methodology was used. First, a logistic regression model was fitted. It showed that satisfaction binary is not a statistically significant predictor of Retention (p = 0.631), so the primary hypothesis is rejected. In contrast, the model identified Gender as a highly significant predictor of retention (p < 0.001) and Discount value as a marginally significant one (p = 0.056). Second, a random forest classifier was used for prediction and to corroborate the statistical results. Its feature importance analysis showed that satisfaction binary was among the least important features, while Purchase Amount, Age, and Gender were the most important predictors of retention. The conclusion of this thesis is that the common belief that satisfaction drives retention is not supported in this dataset. The experiential factor (satisfaction) does not influence retention, whereas the demographic factor (Gender) and the financial factors (Purchase Amount, Discount value) do.
This suggests that business strategies for improving retention should focus on demographic segmentation and financial incentives rather than relying solely on maximizing customer satisfaction metrics.
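
The variable engineering step described above can be illustrated with a minimal sketch. It is not the thesis code: the field names `Subscription Status` and `Review Rating`, the `"Yes"` encoding, the 3.5 rating cutoff, and the helper `engineer_targets` are all assumptions made for illustration, and the two records are invented toy rows rather than data from the Kaggle dataset.

```python
# Hypothetical sketch of the two engineered variables named in the abstract.
# Field names, the "Yes" encoding, and the rating cutoff are assumptions.

RATING_CUTOFF = 3.5  # assumed bin edge between "not satisfied" and "satisfied"


def engineer_targets(record, cutoff=RATING_CUTOFF):
    """Return a copy of a customer record with Retention and satisfaction_binary added."""
    out = dict(record)
    # Retention: 1 if the customer holds a subscription, else 0.
    out["Retention"] = 1 if record["Subscription Status"] == "Yes" else 0
    # satisfaction_binary: 1 if the review rating falls in the upper bin, else 0.
    out["satisfaction_binary"] = 1 if record["Review Rating"] >= cutoff else 0
    return out


# Invented toy rows, used only to show the transformation.
records = [
    {"Subscription Status": "Yes", "Review Rating": 4.2},
    {"Subscription Status": "No", "Review Rating": 2.9},
]
engineered = [engineer_targets(r) for r in records]
```

With both variables coded as 0/1, Retention can serve as the binary outcome and satisfaction binary as one predictor among the others in either of the two models.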