EAJET

Real-Time Object Detection and Voice-Based Recognition Using YOLO and Webcam

Authors

  • M. Dattatreya Goud

    Department of Computer Science, J.S University, Shikohabad, U.P
    Author

DOI:

https://doi.org/10.5281/zenodo.15863060

Keywords:

Object Detection, Deep Learning, YOLO, Real-time Detection, Bounding Boxes

Abstract

Deep learning detectors are arguably the best performers across applications, but real-world images often contain noise, blur, and rotation, which degrade detection accuracy and efficiency. This work uses YOLO (You Only Look Once) because it is a robust deep learning model that detects objects in real time. Unlike region-based detectors such as R-CNN and Fast R-CNN, which first propose candidate regions and then classify each one, YOLO looks at the image as a whole and makes a single forward pass to simultaneously predict bounding boxes and class probabilities. This single pass keeps detection fast while preserving accuracy, which makes YOLO well suited to real-time use. In this project, YOLO detects various types of objects in real time from live video feeds. YOLO-based object detection is also integrated with voice feedback in a mobile app. The app detects objects in images and live streams and announces the detected items aloud, which makes the system accessible to blind users. Combining YOLO's speed, precision, and efficiency with voice-based mobile integration yields a practical, easy-to-use approach to real-time object detection, with potential in accessibility, navigation, and smart assistive devices.
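
The step from detections to spoken feedback can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name build_announcement, the (label, confidence) input format, and the 0.5 confidence threshold are all assumptions made for the example. It filters out low-confidence detections, keeps each object label once, and builds the sentence that a text-to-speech engine would read aloud. The detector and the speech engine themselves are not shown.

```python
def build_announcement(detections, min_confidence=0.5):
    """Turn detector output into a sentence for a text-to-speech engine.

    `detections` is assumed to be a list of (label, confidence) pairs,
    one per predicted bounding box. This format is hypothetical and
    chosen only for the example.
    """
    best = {}  # highest confidence seen so far for each label
    for label, confidence in detections:
        if confidence >= min_confidence and confidence > best.get(label, -1.0):
            best[label] = confidence

    if not best:
        return "No objects detected"

    # Announce the most confident objects first, each label only once.
    labels = sorted(best, key=lambda name: best[name], reverse=True)
    return "Detected " + ", ".join(labels)


# Example frame: two people, a low-confidence chair, and a dog.
frame_detections = [("person", 0.91), ("chair", 0.42), ("person", 0.77), ("dog", 0.65)]
print(build_announcement(frame_detections))  # prints "Detected person, dog"
```

Announcing each label once per frame, rather than once per bounding box, is a design choice made here to keep the spoken output short. A real app would likely also avoid repeating the same announcement on every frame.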

Published

2025-07-05