| dc.contributor.author | Karim, Rafat | |
| dc.contributor.author | Mahmud, Md. Rifat | |
| dc.contributor.author | Maksuda | |
| dc.contributor.author | Jannatus Saiyem, MD. | |
| dc.date.accessioned | 2020-10-12T08:12:09Z | |
| dc.date.available | 2020-10-12T08:12:09Z | |
| dc.date.issued | 2019-12-10 | |
| dc.identifier.uri | http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/4661 | |
| dc.description | Data is considered to be the oil or fuel for the next generation. Data mining is one of the most widely used methods to extract hidden information from large datasets. The main goal of mining is knowledge discovery from databases, which is known as KDD. Mining and discovery are quite similar in the domain of data mining: we discover knowledge by mining the databases. The question is then how to learn from a dataset. The answer is that the data mining field offers several classification algorithms, including the Support Vector Machine, Decision Tree, Artificial Neural Network, and AdaBoost algorithms. We train these algorithms on the dataset, and they classify the records for us. Whether the scenario is the same or different, the accuracy of these algorithms can differ. This variation arises most often for two-class datasets, but it can also occur for multi-class datasets. Another important distinction is between supervised and unsupervised learning. In supervised learning the class labels are known, and in unsupervised learning the class labels are unknown. Our work falls under supervised learning, because the algorithms we used are best suited to supervised learning. | en_US |
| dc.description.abstract | Because of the expansion of social media and globalization, petabyte-scale data is nowadays being generated every second. Data mining is the process of extracting knowledge from this huge amount of data. Data mining applications are becoming more useful and a key prerequisite for any kind of business scenario. However, for certain applications in supervised learning, a lack of sufficient data for certain classes creates a data imbalance problem. For example, in a credit card fraud detection application, most transactions are legitimate and only a few are fraudulent. In our research, we have applied several classification techniques to an imbalanced dataset. We have tested synthetic data from a financial payment system because it is a great challenge to obtain a real dataset. Synthetic data is artificially constructed data that mimics real-world events. We have tested the Decision Tree, Support Vector Machine, Artificial Neural Network, and AdaBoost algorithms to deal with the class imbalance problem. Among these algorithms, we found promising accuracy from AdaBoost compared with the others. The main goal of this paper is therefore to determine which classification algorithm performs better on an imbalanced dataset. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Daffodil International University | en_US |
| dc.subject | Data processing | en_US |
| dc.subject | Data Mining | en_US |
| dc.subject | Making Change (Money Transaction) | en_US |
| dc.title | Imbalance Data Classification to Identify Fraudulent Transactions | en_US |
| dc.type | Other | en_US |
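
The class imbalance problem described in the abstract can be illustrated with a small worked example. The sketch below is not taken from the paper: the label counts are hypothetical toy numbers, and the function names `confusion_counts` and `summarize`, as well as the use of precision and recall alongside accuracy, are assumptions made for illustration. It shows why accuracy alone can be misleading when fraudulent transactions are rare.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy labels: 990 legitimate (0) and 10 fraudulent (1) transactions.
y_true = [0] * 990 + [1] * 10

# Baseline that labels every transaction as legitimate.
always_legit = [0] * 1000
baseline = summarize(y_true, always_legit)

# Hypothetical detector: 20 false alarms, 8 of the 10 frauds caught.
detector_pred = [1] * 20 + [0] * 970 + [1] * 8 + [0] * 2
detector = summarize(y_true, detector_pred)

print("baseline:", baseline)
print("detector:", detector)
```

On these toy labels the always-legitimate baseline reaches 99% accuracy while catching no fraud at all, whereas the hypothetical detector has slightly lower accuracy (97.8%) but recovers 8 of the 10 fraudulent transactions. This is one reason comparisons of classifiers on imbalanced data often report per-class measures in addition to accuracy.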