View Document

Title

Enhancing Real-Time Vehicle and Pedestrian Detection Using YOLO with Hybrid Feature Fusion

Organization

Department: Computer Science

Document Attachments

Resource	Length	Width	Thickness
Paper	0	0	0

Specimen Elements

City

Pocatello

Coverage

Unknown to Unknown

Creator

Shijon Das

Publisher

Idaho State University

Type

Thesis

Is Complete

Date Published

6/25/2025

Archive

digital

City: Pocatello

Degree

Master

Document Attachments

Abstracts

Real-time object detection is central to the development of intelligent transportation systems, autonomous vehicles, smart city monitoring, and pedestrian safety functionalities. Among several deep learning-based approaches, the You Only Look Once (YOLO) series of object detectors has been among the top choices for a long time due to the tradeoff it has attained between accuracy and inference speed. This thesis gives an in-depth review and experimental analysis of YOLO versions 7–12, utilized for the use of real-time vehicle and pedestrian object detection. While earlier versions of YOLO were performance competitive, they were poor at occlusion, low-light, and detection of small objects. In order to overcome these weaknesses, this paper proposes a novel YOLOv12-hybrid feature fusion model that integrates transformer-based attention mechanisms, bidirectional multi-scale feature aggregation, and RGB, depth, and semantic segmentation cross-modal input fusion. Large-scale experiments were conducted on the COCO 2017 dataset, comparing each iteration of YOLO based on mean Average Precision (mAP), training loss convergence, and inference speed (FPS). The results establish that YOLOv12 surpasses previous models, with an mAP of 88.2% and over 47 FPS inference rates, and yet offers consistent detection in challenging urban settings. Contrast against the traditional and state-of-the-art models also indicates the dominance of YOLOv12 for real-world deployment. This work not only establishes the benchmark for YOLO detector development but also offers a scalable, accurate, and real-time enabled model structure tailored for safety-critical use cases in traffic monitoring, autonomous vehicles, and smart infrastructure systems. Keywords: YOLO, object detection, hybrid feature fusion, FPS, CNN, FPN, deep learning, traffic systems

Enhancing Real-Time Vehicle and Pedestrian Detection Using YOLO with Hybrid Feature Fusion

Necessary Documents

View Details

Paper

Document

Information

Paper -Document

ETD

View Document

Enhancing Real-Time Vehicle and Pedestrian Detection Using YOLO with Hybrid Feature Fusion

Necessary Documents

Document

Other Projects