| dc.description.abstract |
This study presents a new pest detection system based on deep learning that aims to
improve farmland productivity through real-time pest classification. Built on the
state-of-the-art Vision Transformer (ViT) and Data-efficient Image Transformer (DeiT)
architectures, it addresses the practical need for early pest prediction to avoid crop
losses and limit the use of pesticides. Using
the IP102 dataset, which covers 102 insect pest species, the study pays special
attention to the six most important ones and uses nearly 75,000 images for model
training. To improve performance, fusion strategies (early fusion, late fusion, and
majority voting) are used to combine the outputs of different models and obtain
higher classification accuracy. Pre-trained ViT and DeiT models are fine-tuned via
transfer learning so that limited labeled data can be used to the fullest. Evaluated
on accuracy, precision, recall, F1-score, and area under the curve (AUC), the models
achieve very good results: the Late Fusion model reaches 98.20% accuracy and the
Teacher Model (KD) reaches 98.33% accuracy. The Majority Voting model
(97.23%) and the Early Fusion model (96.68%) also perform well. The study
highlights the potential of ensemble methods and knowledge distillation to improve
model efficiency and achieve better classification results. This system
can help develop sustainable farming methods, minimize environmental impact, and
support food security by offering a scalable and resource-efficient approach to
pest detection. Future work will focus on improving deployment to mobile and edge
devices so that the system is available to small-scale farmers in
resource-constrained contexts. |
en_US |