NaviGatr: AI Assistance for the Visually Impaired

๐ŸŒ View on GitHub

๐Ÿ“„ PDFs

๐ŸงŠ STL Files (viewable)

๐Ÿ“š Docs

NaviGatr: Wearable AI Navigation Assistant for the Visually Impaired

Introduction

NaviGatr is a project that aims to aid the visually impaired using various computer vision algorithms. While prior systems have used object detection or depth sensing independently, a combined real-time approach has not been widely implemented.

Our system uses three machine learning models to extract spatial and contextual data from camera frames captured at eye level by a wearable headset: object detection, emotion detection, and monocular depth estimation.

The output of these models is aggregated, interpreted, and passed to an output module, currently implemented as a voice assistant, which guides the user with auditory cues about their surroundings.
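At a high level, each camera frame flows through the three models and into the output module in a loop along the lines of the sketch below. This is an illustrative skeleton only; every function here is a placeholder stub standing in for the components described in the following sections, not the actual NaviGatr code.

```python
# Illustrative per-frame pipeline skeleton (placeholder stubs, not the real NaviGatr code).

def capture_frame():
    return None   # placeholder: grab an eye-level RGB frame from the Pi Camera

def detect_objects(frame):
    return []     # placeholder: list of (label, bounding_box) detections

def detect_emotions(frame, detections):
    return {}     # placeholder: emotion label for each detected person

def estimate_depth(frame):
    return None   # placeholder: dense metric depth map (meters per pixel)

def speak(message):
    print(message)  # placeholder: hand the message to the voice-assistant module

def process_frame():
    frame = capture_frame()
    detections = detect_objects(frame)              # what is around the user
    emotions = detect_emotions(frame, detections)   # social context for people
    depth_map = estimate_depth(frame)               # how far away everything is
    # Aggregation and phrasing are covered in the Output Handling section below.
    for label, box in detections:
        speak(f"{label} detected")
```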



🔦 Computer Vision Models

🔍 Object Detection

We use two different object detection models, chosen based on platform constraints. Both return bounding boxes and class labels, which are combined with depth data to localize and describe objects to the user in real-world terms.
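As a concrete illustration of this step, the sketch below runs a TFLite detector on the Coral Edge TPU through the pycoral library. The model file, label file, and score threshold are assumptions made for the example, not NaviGatr's exact configuration.

```python
# Sketch: object detection on the Coral Edge TPU with pycoral.
# Model path, label file, and threshold are illustrative assumptions.
from PIL import Image
from pycoral.adapters import common, detect
from pycoral.utils.dataset import read_label_file
from pycoral.utils.edgetpu import make_interpreter

interpreter = make_interpreter("detector_edgetpu.tflite")  # hypothetical model file
interpreter.allocate_tensors()
labels = read_label_file("labels.txt")                     # hypothetical label file

# Resize the camera frame to the detector's input size and run inference.
image = Image.open("frame.jpg").resize(common.input_size(interpreter), Image.LANCZOS)
common.set_input(interpreter, image)
interpreter.invoke()

# Each detection carries a class id, a confidence score, and a pixel bounding box.
for obj in detect.get_objects(interpreter, score_threshold=0.5):
    box = obj.bbox  # xmin, ymin, xmax, ymax in image pixels
    print(labels.get(obj.id, obj.id), round(obj.score, 2),
          (box.xmin, box.ymin, box.xmax, box.ymax))
```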

๐Ÿ˜ Emotion Detection

We trained a custom model on the FER2013 dataset using the EfficientNetB0 architecture. The model runs on the Coral TPU and classifies facial expressions into the seven FER2013 emotion categories (angry, disgust, fear, happy, sad, surprise, and neutral).

This allows NaviGatr to add social context to the user's environment.
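A rough sketch of the classification step is shown below. It assumes the trained EfficientNetB0 model has been quantized and compiled for the Edge TPU and is given a cropped face region; the model path is a placeholder, and the label order follows the standard FER2013 classes rather than NaviGatr's confirmed configuration.

```python
# Sketch: emotion classification of a face crop on the Coral Edge TPU.
# Model path is a placeholder; label order assumes the standard FER2013 classes.
from PIL import Image
from pycoral.adapters import classify, common
from pycoral.utils.edgetpu import make_interpreter

FER2013_LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

interpreter = make_interpreter("efficientnetb0_fer2013_edgetpu.tflite")  # hypothetical file
interpreter.allocate_tensors()

face = Image.open("face_crop.jpg").resize(common.input_size(interpreter), Image.LANCZOS)
common.set_input(interpreter, face)
interpreter.invoke()

top = classify.get_classes(interpreter, top_k=1)[0]  # highest-scoring class
print(FER2013_LABELS[top.id], round(top.score, 2))
```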


๐Ÿ“ Depth Sensing

We use Apple's Depth Pro model, a monocular depth estimation model that returns absolute metric depth, unlike most models, which return relative depth.

Depth Pro uses a transformer-based architecture and produces sharp, accurate object boundaries in its depth maps. We use its output to determine how far away each detected object is, in meters.
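For reference, the published Depth Pro implementation (apple/ml-depth-pro) exposes a small Python API; the sketch below follows the usage example in that repository, so the exact names should be checked against the version you install.

```python
# Sketch: monocular metric depth with Depth Pro (apple/ml-depth-pro).
# Follows the repository's published usage example; verify against the installed version.
import depth_pro

model, transform = depth_pro.create_model_and_transforms()
model.eval()

image, _, f_px = depth_pro.load_rgb("frame.jpg")  # RGB frame plus estimated focal length
prediction = model.infer(transform(image), f_px=f_px)

depth_m = prediction["depth"]  # dense depth map, one value per pixel, in meters
print(depth_m.shape, float(depth_m.min()), float(depth_m.max()))
```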


🔊 Output Handling

Output delivery in NaviGatr works as follows:

The object detection model returns an array of bounding boxes, each containing (x, y) coordinates relative to the camera image. Meanwhile, the depth estimation model outputs a dense depth map containing the estimated distance (in meters) for each pixel.

🧮 Determining Object Depth:
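One straightforward way to reduce a detection to a single distance is to take a robust statistic, such as the median, of the depth map inside its bounding box, as in the sketch below. The exact aggregation used in NaviGatr may differ.

```python
# Sketch: estimate an object's distance from the dense depth map and its bounding box.
import numpy as np

def object_depth_m(depth_map: np.ndarray, bbox) -> float:
    """Median metric depth (meters) inside a bounding box (xmin, ymin, xmax, ymax).

    The median is less sensitive than the mean to background pixels that fall
    inside the box; it is one reasonable choice, not necessarily NaviGatr's.
    """
    xmin, ymin, xmax, ymax = (int(v) for v in bbox)
    return float(np.median(depth_map[ymin:ymax, xmin:xmax]))
```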

🕒 Spatial Orientation:

Example: If a chair is detected with its bounding box centered near the left-middle of the image and its depth is 2 meters, the audio output will be:

"Chair, 11 o'clock, 2 meters away"

🙂 Emotion Awareness:

Example: If a sad person is detected at the right side:

"Person, 1 o'clock, sad"

Each cue is then synthesized into speech and played back to the user, providing real-time navigation assistance.
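Cues like the examples above can be assembled and spoken with any offline text-to-speech engine. The sketch below uses pyttsx3 as one convenient option that runs locally on a Raspberry Pi; the engine behind NaviGatr's voice assistant may differ.

```python
# Sketch: assemble a cue and speak it with an offline TTS engine (pyttsx3 is one option).
from typing import Optional

import pyttsx3

def describe(label: str, direction: str,
             depth_m: Optional[float] = None,
             emotion: Optional[str] = None) -> str:
    """Build a short cue such as "Chair, 11 o'clock, 2 meters away"."""
    parts = [label.capitalize(), direction]
    if emotion is not None:
        parts.append(emotion)                       # e.g. "Person, 1 o'clock, sad"
    elif depth_m is not None:
        parts.append(f"{depth_m:.0f} meters away")  # e.g. "2 meters away"
    return ", ".join(parts)

engine = pyttsx3.init()
engine.say(describe("chair", "11 o'clock", depth_m=2.0))
engine.runAndWait()
```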



๐Ÿ› ๏ธ Hardware Overview

Raspberry Pi 5

Pi Camera

Coral TPU

GPIO Display

Power Bank

3D Printed Headband



🔗 Further Docs & Resources

📂 Raspberry Pi 5 Setup & Firmware

Full walkthrough for flashing Ubuntu, installing dependencies, setting up camera access, and running the core script.

⚡ Wiring & Power Guide

Wiring diagrams for GPIO display, fans, Coral TPU, and safe USB-C routing.

๐Ÿ“ฝ๏ธ 3D Design & Hardware Assembly

STL files, CAD renderings, and assembly instructions for the NaviGatr headset.

📄 Final Report (PDF)

Complete engineering report including benchmarking, design decisions, and testing plans.

📚 GitHub Repository

Codebase for the models, hardware integration, and deployment scripts.

🎥 Demonstration Video

(Coming soon) Live test run of NaviGatr navigating a test environment.


👥 Team Members


NaviGatr pushes the boundary of accessible computing, demonstrating how on-device machine learning and thoughtful hardware design can empower a community often overlooked by mainstream tech.