Abstract:
Cancer is a deadly disease caused by the uncontrolled growth of cells, which form an extra
mass of tissue known as a tumor. There are over 200 types of cancer. Breast cancer is one
of the diseases responsible for a high number of deaths every year. It is the most common
type of cancer. Breast cancer is the second leading cause of cancer
death in women. The chance that a woman will die from breast cancer is about 1 in 39
(about 2.6%) [1]. In 2020, there were 2.3 million women diagnosed with breast cancer and
685 000 deaths globally [2]. When breast cancer is detected in its early stages, there is a
30% chance that it can be treated effectively, but late detection of advanced-stage tumors
makes treatment more difficult [3,4]. Using machine learning, we have built a model which
can predict the likelihood of breast cancer. The model was trained on the Wisconsin
Diagnostic Breast Cancer (WDBC) dataset for breast cancer diagnosis prediction. In the
experiments, the data were processed and analyzed using various data pre-processing
techniques. Several classic machine learning algorithms, namely Naive Bayes, Random
Forest, Logistic Regression, K-Nearest Neighbors, Support Vector Machine (SVM),
Decision Tree, and Neural Network, were then used to build the model, and the performance
of each was measured using metrics such as prediction accuracy on the training and testing
data, precision, recall, F1 score, and support. Overall, the Support Vector Machine (SVM)
performed better than the others, so the SVM model was chosen for predicting the
disease.
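
As an illustration of the evaluation metrics named above, the following is a minimal
sketch of how accuracy, precision, recall, F1 score, and support can be computed for one
class from true and predicted labels. It is not the code used in this work: the function
name classification_metrics and the toy label lists are hypothetical, and the labels "M"
(malignant) and "B" (benign) are assumed to follow the WDBC diagnosis coding.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive="M"):
    """Return accuracy, precision, recall, F1 and support for one class.

    Hypothetical helper for illustration; `positive` is the label treated
    as the positive class (assumed here to be "M" for malignant).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count every (true label, predicted label) combination once.
    pairs = Counter(zip(y_true, y_pred))
    tp = sum(n for (t, p), n in pairs.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in pairs.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in pairs.items() if t == positive and p != positive)
    correct = sum(n for (t, p), n in pairs.items() if t == p)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "accuracy": correct / len(y_true) if y_true else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": tp + fn,  # number of true samples of the positive class
    }


# Made-up toy labels, not taken from the WDBC dataset or from our results.
y_true = ["M", "M", "M", "B", "B", "B", "B", "B"]
y_pred = ["M", "M", "B", "B", "B", "B", "M", "B"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Support is simply the number of true samples of the class, so it does not measure model
quality by itself; it indicates how much data the other per-class scores are based on.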