DSpace Repository

Imbalance Data Classification to Identify Fraudulent Transactions


dc.contributor.author Karim, Rafat
dc.contributor.author Mahmud, Md. Rifat
dc.contributor.author Maksuda
dc.contributor.author Jannatus Saiyem, MD.
dc.date.accessioned 2020-10-12T08:12:09Z
dc.date.available 2020-10-12T08:12:09Z
dc.date.issued 2019-12-10
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/4661
dc.description Data is considered to be the oil or fuel of the next generation. Data mining is one of the most widely used methods for extracting hidden information from large datasets. Its main goal is knowledge discovery from databases, known as KDD; in this domain, mining and discovery are closely related, since we discover knowledge by mining the databases. The question is then how to learn from a dataset. Data mining offers several classification algorithms for this, including Support Vector Machine, Decision Tree, Artificial Neural Network, and AdaBoost. We train these algorithms on the dataset, and they then classify records for us. The accuracy of these algorithms can differ from one scenario to another, and even within the same scenario. This problem occurs most often in two-class datasets, but it can occur in multi-class datasets as well. Another important distinction is between supervised and unsupervised learning: in supervised learning the class labels are known, and in unsupervised learning they are unknown. Our dataset calls for supervised learning, because the algorithms we used are best suited to it. en_US
dc.description.abstract With the expansion of social media and globalization, petabyte-scale data is now generated every second. Data mining is the process of extracting knowledge from this huge amount of data. Data mining applications are becoming more useful and are a key prerequisite in many business scenarios. However, in certain supervised learning applications, a lack of sufficient data for some classes creates a data imbalance problem. For example, in a credit card fraud detection application, most transactions are legitimate and only a few are fraudulent. In our research, we applied several classification techniques to an imbalanced dataset. We tested on synthetic data from a financial payment system, because obtaining a real dataset is a great challenge. Synthetic data is artificially constructed to mimic real-world events. We tested the Decision Tree, Support Vector Machine, Artificial Neural Network, and AdaBoost algorithms on the class imbalance problem. Among these algorithms, AdaBoost gave promising accuracy compared with the others. The main aim of this paper is therefore to determine which classification algorithm performs better on an imbalanced dataset. en_US
dc.language.iso en en_US
dc.publisher Daffodil International University en_US
dc.subject Data processing en_US
dc.subject Data Mining en_US
dc.subject Making Change (Money Transaction) en_US
dc.title Imbalance Data Classification to Identify Fraudulent Transactions en_US
dc.type Other en_US
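
The class imbalance problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' experiment: the 1% fraud rate, the sample size, and the helper names (make_transactions, accuracy, recall) are assumptions chosen for illustration. It shows that a trivial classifier which labels every transaction as legitimate still scores high accuracy while detecting no fraud at all.

```python
import random


def make_transactions(n=10_000, fraud_rate=0.01, seed=0):
    """Return hypothetical labels: 1 = fraud, 0 = legitimate."""
    rng = random.Random(seed)
    return [1 if rng.random() < fraud_rate else 0 for _ in range(n)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


labels = make_transactions()

# Baseline that ignores its input and always predicts "legitimate".
always_legit = [0] * len(labels)

acc = accuracy(labels, always_legit)
rec = recall(labels, always_legit)
print(f"accuracy = {acc:.3f}, fraud recall = {rec:.3f}")
```

Because the minority class is so small, accuracy alone says little about how well fraud is detected, so any comparison of classifiers on such data is easier to interpret when a per-class measure such as recall is reported alongside it.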

