DSpace Repository

Lung Cancer Prediction Using Machine Learning Techniques


dc.contributor.author Akash, M.K.
dc.date.accessioned 2025-09-29T06:08:06Z
dc.date.available 2025-09-29T06:08:06Z
dc.date.issued 2024-07-13
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/14758
dc.description Project report en_US
dc.description.abstract In my thesis project, "Lung Cancer Prediction Using Machine Learning Techniques," I aimed to develop a reliable system for predicting lung cancer risk using several machine learning algorithms. The dataset was sourced from Kaggle and originates from an online lung cancer prediction system. It comprises attributes describing individuals' demographics, lifestyle choices, and health symptoms, with a binary target variable indicating the presence or absence of lung cancer. I first preprocessed the dataset, converting certain column values to binary (0 and 1) and addressing missing values. During exploratory data analysis, I identified an imbalance in the target distribution and mitigated it using oversampling. I also performed feature engineering, eliminating irrelevant features and creating new ones to enhance predictive capability. To reduce dimensionality, I applied Principal Component Analysis (PCA) before training several machine learning models: Logistic Regression, Decision Tree, K-Nearest Neighbors, Multinomial Naive Bayes, Support Vector Classifier, and Multi-layer Perceptron classifier. Among these models, Logistic Regression was the top performer, achieving an accuracy of 95%. I then applied Grid Search to Logistic Regression to optimize its hyperparameters, yielding a tuned accuracy of 94.89%, essentially level with the untuned result. Although I also experimented with ensemble techniques such as a Voting Classifier, Logistic Regression consistently outperformed the other models. Finally, I conducted K-Fold cross-validation to check model robustness, and Logistic Regression showed the highest average accuracy compared with Decision Tree and Multi-layer Perceptron. In conclusion, my research identifies Logistic Regression as the most effective model for lung cancer risk prediction on the given dataset and features. en_US
dc.description.sponsorship DIU en_US
dc.language.iso en en_US
dc.publisher Daffodil International University en_US
dc.subject Lung cancer en_US
dc.subject Machine Learning en_US
dc.subject Computer-aided diagnosis (CAD) en_US
dc.subject Medical imaging en_US
dc.title Lung Cancer Prediction Using Machine Learning Techniques en_US
dc.type Other en_US
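
The abstract mentions oversampling to correct the class imbalance and K-Fold cross-validation to check robustness, but it does not say which oversampling method or fold count was used. The following is a minimal sketch of those two steps using only the Python standard library, assuming plain random duplication of minority-class rows and a shuffled 5-fold split. The function names (`random_oversample`, `k_fold_indices`) and the toy data are hypothetical and are not taken from the project report. Oversampling is applied to the training portion of each fold only, so that duplicated rows cannot leak into the held-out fold.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class present has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample index appears in
    exactly one test fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]


# Hypothetical toy data: 8 negative rows and 2 positive rows.
rows = [[i, i % 3] for i in range(10)]
labels = [0] * 8 + [1] * 2

# Balancing the whole toy set, just to show the effect of oversampling.
_, all_balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(all_balanced_labels)

# Cross-validation: balance only the training part of each fold.
splits = list(k_fold_indices(len(rows), k=5))
fold_train_counts = []
for train_idx, test_idx in splits:
    train_rows = [rows[j] for j in train_idx]
    train_labels = [labels[j] for j in train_idx]
    _, bal_labels = random_oversample(train_rows, train_labels, seed=1)
    fold_train_counts.append(Counter(bal_labels))
```

In a real experiment, a classifier would be fitted on each balanced training fold and scored on the untouched test fold, and the per-fold accuracies averaged.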

