End-To-End YOLOv12-Based Multi-Stage Pipeline for Bangla License Plate Recognition

Nowshin, Iffat Ara

dc.contributor.author	Nowshin, Iffat Ara
dc.date.accessioned	2026-04-27T04:24:54Z
dc.date.available	2026-04-27T04:24:54Z
dc.date.issued	2025-12-27
dc.identifier.citation	SWT	en_US
dc.identifier.uri	http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17072
dc.description	Multi-Stage Pipeline	en_US
dc.description.abstract	Automatic License Plate Recognition (ALPR) systems are critical for intelligent transportation and security infrastructure, yet they remain challenging for scripts with complex characters such as Bangla. Bangla license plates contain curved glyphs, complex conjuncts and area-specific layouts, which cause traditional OCR-based pipelines to struggle in the presence of occlusion, motion blur and low-resolution surveillance footage. This paper presents a novel multi-layer end-to-end Bangla ALPR system built on YOLOv12's attention-centric architecture. The proposed pipeline uses a lightweight family of YOLOv12 models so that feature representations are optimized consistently across vehicle, plate and character detection, improving robustness to scale variation in urban backgrounds. We introduce a three-layer approach: (1) a YOLOv12-based vehicle detection model (0.975 mAP@0.50, 0.924 mAP@0.50:0.95, 2.3 ms/inference), (2) a YOLOv12n license plate detection model (0.975 mAP@0.50, 2.3 ms/inference), and (3) a specialized YOLOv12 character recognizer for Bangla glyphs (0.986 mAP@0.50, 0.750 mAP@0.50:0.95), eliminating OCR dependencies. All layers are trained on real images of Bangladeshi traffic scenes covering varied illumination, cluttered urban scenes, diverse viewpoints and multiple plate layouts, so that the system generalizes to real roads in Bangladesh. The system processes 640×640 resolution images on a consumer-grade GPU. The character recognition model handles 102 classes, including conjuncts such as ক্ষ (kkho) and জ্ঞ (gya), and reconstructs the plate text from the coordinates of the detected characters, achieving reliable detection and recognition of Bangla license plate numbers in unconstrained traffic scenes. This study proposes a fast and reliable Bangla license plate recognition solution for real-life traffic scenes and establishes a YOLOv12-based pipeline capable of complex-script ALPR.	en_US
dc.description.sponsorship	DIU	en_US
dc.language.iso	en_US	en_US
dc.publisher	Daffodil International University	en_US
dc.subject	Computer Vision	en_US
dc.subject	Bangla License	en_US
dc.subject	Plate Recognition	en_US
dc.subject	YOLOv12	en_US
dc.subject	Object Detection	en_US
dc.subject	Multi-Stage Pipeline	en_US
dc.title	End-To-End YOLOv12-Based Multi-Stage Pipeline for Bangla License Plate Recognition	en_US
dc.type	Thesis	en_US
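
The abstract mentions that plate text is recovered from character detections "through coordinate-based reconstruction" but does not give the procedure. The following is a minimal sketch of one plausible way to do this, not the thesis's actual method: it assumes each character detection arrives as a class label plus an axis-aligned box, groups detections into rows by vertical centre, and reads each row left to right. The `Detection` class, the `reconstruct_plate_text` function and the `row_tolerance` parameter are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One detected character: class label and box corners (hypothetical type)."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def cx(self) -> float:
        return (self.x1 + self.x2) / 2

    @property
    def cy(self) -> float:
        return (self.y1 + self.y2) / 2

    @property
    def height(self) -> float:
        return self.y2 - self.y1


def reconstruct_plate_text(dets: List[Detection], row_tolerance: float = 0.5) -> str:
    """Order character detections into rows (top to bottom), then left to right.

    Assumption: a detection belongs to the current row when its vertical centre
    lies within row_tolerance * (mean box height of that row) of the row's mean
    vertical centre. Rows are joined with a newline.
    """
    if not dets:
        return ""

    ordered = sorted(dets, key=lambda d: d.cy)
    rows: List[List[Detection]] = [[ordered[0]]]
    for det in ordered[1:]:
        current = rows[-1]
        mean_cy = sum(d.cy for d in current) / len(current)
        mean_h = sum(d.height for d in current) / len(current)
        if abs(det.cy - mean_cy) <= row_tolerance * mean_h:
            current.append(det)   # same text line
        else:
            rows.append([det])    # start a new text line

    # Within each row, reading order is left to right by horizontal centre.
    return "\n".join(
        "".join(d.label for d in sorted(row, key=lambda d: d.cx))
        for row in rows
    )
```

In use, the boxes produced by a character detector would be converted to `Detection` objects and passed to `reconstruct_plate_text`; a two-row plate then yields two lines of text. The tolerance value of 0.5 is an arbitrary starting point and would need tuning against tilted or perspective-distorted plates.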