Abstract:
Genetic diseases, caused by mutations in DNA or chromosomal abnormalities, are a major global health concern. These illnesses, which are frequently inherited, cover a broad spectrum of conditions that significantly affect the physical, mental, and social well-being of patients and their families. Although advances in genetic testing enable early detection and well-informed decision-making, the frequency of genetic diseases is still rising, a trend made worse by a general lack of awareness of the importance of genetic testing. Most of these conditions are incurable and therefore require continuous monitoring. Predicting genetic abnormalities in advance is a difficult but important task, because early detection can have a major impact on disease management and patient outcomes. Applied to large patient medical datasets, machine learning algorithms offer a promising avenue for highly accurate identification of genetic diseases. This work proposes a prediction system for complex genetic disorders built on machine learning methods: Decision Tree (DT), K-Nearest Neighbors (KNN), Random Forest (RF), Logistic Regression (LR), Gradient Boosting (GB), and Support Vector Machine (SVM). Using datasets that cover genetic disorders and their subclasses, our ensemble method combines several of these models, namely SVM, GB, RF, and KNN. After extensive preprocessing to resolve missing values, the dataset is explored graphically for patterns in genetic illnesses, symptoms, inherited genes, and birth abnormalities. Features are then selected and standardized before the machine learning models are applied, and the models are assessed using accuracy, recall, precision, and F1 score. 
Confusion matrices are produced to evaluate classification performance. The ensemble model achieves the highest accuracy for genetic disorder prediction, 99.59%. The individual models reach accuracies of 97.54% (KNN), 96.72% (RF), 93.55% (GB), 91.9% (DT), 86.3% (SVM), and 86.3% (LR). By predicting genetic diseases more accurately than any single model, the ensemble technique offers a pragmatic and competitive strategy for preventive disease management and individualized treatment. By combining advanced computational methods with extensive medical data, this approach has the potential to support more personalized care and proactive wellness across the healthcare sector.
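As an illustration of how the reported metrics relate to a confusion matrix, the following minimal sketch derives accuracy and macro-averaged precision, recall, and F1 score from a multi-class confusion matrix. The function name `metrics_from_confusion`, the choice of macro averaging, and the example matrix are assumptions made for illustration; they are not taken from the study's implementation or data.

```python
from typing import Dict, List


def metrics_from_confusion(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (hypothetical convention for this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Illustrative 3-class matrix; the counts are invented, not study results.
example_cm = [
    [50, 2, 1],
    [3, 45, 2],
    [0, 4, 43],
]
scores = metrics_from_confusion(example_cm)
```

Macro averaging weights every class equally, which can be useful when some disorder subclasses are rare; a weighted average would be an equally reasonable choice under different assumptions.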