Abstract:
Air pollution is a critical concern for the health of both humans and animals because it has been
linked to a number of deadly illnesses, including cancer. Industries, ships, farms, and housing all
contribute to air pollution, driven by the world's rapid urbanization and population growth.
As a result, air pollution has become a major problem in many cities, especially in
developing countries like Bangladesh. Maintaining indoor air quality requires regular forecasting
and monitoring of air pollution, and machine learning (ML) has shown potential to
outperform conventional methods in predicting the air quality index (AQI). The AQI is an
indicator of the condition of the atmosphere that estimates the short-term effects
of modest exposure on an individual's health. Its purpose is to make the public aware of the
harmful effects that ambient pollution has on health. The quantity of pollutants in the
air has significantly increased in Indian cities. Using AQI data for Dhaka, the capital city of
Bangladesh, we focus on a few parameters, starting with PM2.5, over the period from 2017 through 2022.
The objective of the research is to ascertain how well ML methods recognize and classify
observations into AQI categories. An algorithm is trained via supervised learning on labeled data
to classify AQI information and predict outcomes accurately. The models applied to achieve
this goal were XGBoost, Random Forest, K-Nearest Neighbors,
Naive Bayes, and Linear Regression. With a 99.81% classification accuracy, the Random Forest
classifier was shown to be the most accurate in our analysis, correctly classifying AQI values
into six categories: good, moderate, unhealthy for sensitive groups, unhealthy, very unhealthy,
and hazardous. Finally, a web prototype was built that classifies the AQI category using
the Random Forest model.
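
As an illustration of the six-category labeling that the classifiers are trained to reproduce, the following is a minimal sketch and not the implementation used in this work. It assumes the widely used US EPA AQI breakpoints (0-50, 51-100, 101-150, 151-200, 201-300, and above 300); the abstract does not state which breakpoints were applied, and the names `AQI_UPPER_BOUNDS`, `AQI_CATEGORIES`, and `aqi_category` are hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first five bands.
# Assumption: US EPA AQI scale; the source does not name its breakpoints.
AQI_UPPER_BOUNDS = [50, 100, 150, 200, 300]

# Category labels, ordered from cleanest to most polluted.
AQI_CATEGORIES = [
    "Good",
    "Moderate",
    "Unhealthy for Sensitive Groups",
    "Unhealthy",
    "Very Unhealthy",
    "Hazardous",
]


def aqi_category(aqi: float) -> str:
    """Map a non-negative AQI value to one of the six categories."""
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    # bisect_left returns the index of the first bound >= aqi,
    # so a value equal to a bound stays in the lower band.
    return AQI_CATEGORIES[bisect_left(AQI_UPPER_BOUNDS, aqi)]


if __name__ == "__main__":
    for value in (42, 100, 151, 250, 480):
        print(value, aqi_category(value))
```

Labels produced this way could serve as the target column for a supervised classifier such as Random Forest, with pollutant measurements such as PM2.5 as input features.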