Abstract:
Real-world datasets are large and diverse, which makes data mining a challenging task, yet one that can yield interesting and powerful insights with broad applicability. The aim of this research is to classify a large, complex dataset with highly dependent features in order to predict a customer's income range from other demographic attributes. The baseline evaluation suffered from overfitting. To improve classification performance, several well-established techniques were applied, but it was the resampling method that produced a satisfactory and significant improvement. Suitable measures were used to evaluate and validate the models. A useful outcome of this study is that it identifies the limitations of supervised classification for datasets that consist of highly dependent feature vectors and contain incorrect class information.