| dc.description.abstract |
Liver diseases such as Hepatitis, Cirrhosis, Non-Alcoholic Fatty Liver Disease (NAFLD), and
Cholestasis are major contributors to the worldwide health burden because they are often
asymptomatic and diagnosed at late stages. Accurate diagnosis at an early stage is crucial
for improving patient outcomes. This study presents a machine learning framework that
characterizes liver health by using supervised methods to categorize and subtype liver
pathology. The data come from Kaggle's Liver Patient Dataset and include
clinical and biochemical parameters such as Age, Gender, Bilirubin, Alkaline Phosphatase,
AST (SGOT), ALT (SGPT), Albumin, and the A/G Ratio. A standard data preprocessing routine
handled missing values, removed outliers, added a new result column, and normalized the
features to prepare the data for modeling. A rule-based labeling system assigned the
disease categories, and then
several machine learning classifiers were trained on the labeled dataset: Random
Forest (RF), K-Nearest Neighbors (KNN), Support Vector Classifier (SVC), and an
ensemble Voting Classifier. RF and KNN reached accuracies of up to 99% (range 65-99%),
while the Voting Classifier proved more robust owing to ensemble learning. The rule-based
system was also used to further classify Hepatitis cases into
subtypes: Acute Viral (A/E), Chronic Hepatitis B, Hepatitis C, Alcoholic Hepatitis,
Autoimmune Hepatitis, and NASH (Fatty Hepatitis). The hybrid system has been
deployed as an interactive web application built with Streamlit that provides real-time
liver disease and Hepatitis subtype predictions, making it readily accessible to clinicians,
researchers, and patients. The system supports faster early detection and serves as an
improved tool for assessing liver health. |
en_US |