VisionMate: Where Technology Assists Vision

Beyond the White Cane

It started with a simple question: what if technology could see the world for someone who cannot?

For most of us, understanding our surroundings happens without thought - a face in front of us, an obstacle on the road, a familiar place. The human eye captures light and detail, and the brain turns it into meaning in a fraction of a second. But for a visually impaired person, every step requires awareness, trust, and courage. The world depends on vision, but not everyone can experience it the same way.

VisionMate was born from the desire to bridge this gap.

It all began in our first year (second semester) of college with a simple idea and a hackathon.

We began asking questions, reading more, and trying to understand the real, everyday needs of visually impaired individuals and how technology could actually help them. At that point, VisionMate was just a rough web prototype. The technology was basic, but there were endless discussions, sketches, late-night conversations, and a lot of excitement around what it could become.

Being one among the top 10 teams

In our second year (third semester), we came across the same problem statement again. This time it came with a condition: the solution had to work offline. This phase tested us the most. We had to move away from what we were comfortable with in web development and start thinking in terms of mobile applications, limited resources, and fewer ready-made features. Even when the tools changed and the challenges grew, we never stopped thinking about the problem itself. Brainstorming continued, ideas evolved, and we kept pushing forward.


By the fifth semester, another hackathon gave us the chance to return to VisionMate, this time with clearer understanding and more experience. Our focus shifted towards making the app truly usable - better accessibility, simpler interactions, thoughtful features, and a deeper understanding of how a user would actually navigate the application in real life. That effort paid off when we won a track prize in the healthcare theme, reaffirming our belief that we were building something meaningful.

Winning a track prize

VisionMate is a project that has stayed with us for over two years. It has been part of countless conversations about new tech stacks, real-world accessibility, reducing computational load, improving accuracy, and constantly questioning how technology can do better. VisionMate reflects our belief that technology can create new paths, offering a new way to experience the world and a new sense of vision for those who navigate life without sight.


HOW DOES VISIONMATE WORK?

VisionMate is built as an end-to-end assistive Android application with a strong emphasis on real-time perception, accessibility, and reliability.

Tech Stack:

  • The frontend is developed using Expo React Native with TypeScript, leveraging Expo’s native libraries for features such as text-to-speech, UI components, and device-level integrations.
  • Firebase is used for authentication and database management, enabling secure login.
  • Emergency communication is handled using Twilio, which powers automated voice calls and WhatsApp messages when critical events such as falls or unsafe situations are detected.
  • The backend is implemented in Python. It integrates multiple deep learning models using PyTorch, including YOLOv8n for real-time object detection and tracking, Depth Anything v2 for depth and distance estimation, and InsightFace for face recognition.
  • Deep Translate is used to support multilingual interactions, ensuring accessibility across languages.
  • Gemini 2.5 Flash is integrated specifically for the Describe Screen feature, generating natural language descriptions of the user’s surroundings based on visual context.
  • Real-time communication between the mobile application and backend services is achieved through WebSockets, enabling low-latency streaming of detections, depth information, and contextual updates.
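
The stack above pairs an object detector with a depth model, but this write-up does not spell out how the two outputs are combined. One common approach, shown here only as a minimal sketch, is to take the median depth inside each detected bounding box. The function name `estimate_distance`, the box format, and the assumption that the depth map is already expressed in metres are all hypothetical; a depth model's raw output may be relative depth that needs calibration first.

```python
from statistics import median


def estimate_distance(depth_map, box):
    """Return the median depth inside a bounding box, or None if the box is empty.

    depth_map: 2D list of floats (rows of pixels), ASSUMED to be in metres.
    box: (x1, y1, x2, y2) pixel coordinates, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = box
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0

    # Clamp the box to the image so out-of-range detections cannot raise.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)

    values = [depth_map[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not values:
        return None

    # The median is less sensitive than the mean to background pixels
    # that fall inside the box but do not belong to the object.
    return median(values)
```

For example, with a 4x4 depth map whose central 2x2 region holds the values 2, 3, 4, and 5, `estimate_distance(depth_map, (1, 1, 3, 3))` returns 3.5.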

Features:

  • Object Detection
  • Distance Estimation
  • Spatial Awareness
  • Multilingual Interface
  • User Authentication
  • Face Recognition
  • Describe
  • Fall Detection

Workflow:

VisionMate begins with secure user authentication using Face ID, fingerprint, or PIN, giving users quick and accessible entry to the application. Navigation across the app is entirely gesture-based: users swipe between screens, and each screen name is announced through audio feedback to maintain continuous awareness. The multilingual interface delivers all audio prompts in the language chosen by the user, so users can interact with the app entirely in their native language.

The Camera Screen forms the core of real-time assistance. It runs object detection, with audio prompts that announce the camera orientation and guide users while switching views. Detected objects are conveyed through speech based on priority and estimated distance. When an object crosses a predefined safety threshold, the audio feedback changes to alerts such as “Object ahead”, offering immediate spatial awareness and helping users understand both the presence and location of obstacles around them.
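
The ordering and alert logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the `Detection` fields, the priority scale (lower number means more important), the 1.5 m value for the safety threshold, and the limit of three spoken items are all hypothetical.

```python
from dataclasses import dataclass

SAFETY_THRESHOLD_M = 1.5  # hypothetical value; the real threshold is not stated


@dataclass
class Detection:
    label: str
    distance_m: float
    priority: int  # hypothetical scale: lower number = more important


def build_announcements(detections, threshold_m=SAFETY_THRESHOLD_M, limit=3):
    """Turn detections into an ordered list of phrases to speak."""
    close = [d for d in detections if d.distance_m <= threshold_m]
    far = [d for d in detections if d.distance_m > threshold_m]

    messages = []
    # Anything inside the safety threshold pre-empts normal descriptions.
    if close:
        messages.append("Object ahead")

    # Remaining objects are spoken by priority first, then by nearness.
    far.sort(key=lambda d: (d.priority, d.distance_m))
    for det in far[:limit]:
        messages.append(f"{det.label}, {det.distance_m:.1f} metres")
    return messages
```

Keeping the alert separate from the regular descriptions means a nearby obstacle is always announced first, regardless of how many other objects are in view.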

Fall detection functions independently of the active screen within the application. When the mobile device’s gyroscope detects a disturbance beyond a set threshold, VisionMate immediately asks, “Fall detected. Are you okay?” If the user responds by tapping the screen, emergency actions are cancelled. If there is no response, the system automatically initiates emergency calls and sends a WhatsApp message with the user’s live location to saved contacts within 20 seconds.
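
The fall-detection flow above is essentially a small state machine. The sketch below shows one way it could be modelled; it reads the 20 seconds as the window in which the user can respond, which is an interpretation rather than something stated outright. The class name `FallMonitor`, the threshold value, and the method names are hypothetical, and the actual calls and messages are left to the caller.

```python
from enum import Enum


class FallState(Enum):
    IDLE = "idle"
    AWAITING_RESPONSE = "awaiting_response"
    EMERGENCY_SENT = "emergency_sent"


class FallMonitor:
    """Tracks a possible fall and decides when to escalate."""

    def __init__(self, threshold=4.0, response_window_s=20.0):
        self.threshold = threshold                  # hypothetical sensor magnitude
        self.response_window_s = response_window_s  # assumed response window
        self.state = FallState.IDLE
        self._prompted_at = None

    def on_motion(self, magnitude, now):
        """Feed one sensor reading; returns the prompt to speak, or None."""
        if self.state is FallState.IDLE and magnitude > self.threshold:
            self.state = FallState.AWAITING_RESPONSE
            self._prompted_at = now
            return "Fall detected. Are you okay?"
        return None

    def on_tap(self):
        """User tapped the screen; returns True if an alert was cancelled."""
        if self.state is FallState.AWAITING_RESPONSE:
            self.state = FallState.IDLE
            self._prompted_at = None
            return True
        return False

    def tick(self, now):
        """Call periodically; returns True once when escalation is due."""
        if (self.state is FallState.AWAITING_RESPONSE
                and now - self._prompted_at >= self.response_window_s):
            self.state = FallState.EMERGENCY_SENT
            return True
        return False
```

Passing the current time in as `now` (in seconds) keeps the logic independent of any particular clock, so it can be exercised without waiting for real time to pass.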

The Describe feature allows users to gain a clearer understanding of their surroundings by providing an audio-based description of the environment. Additionally, the Face Recognition feature enables personalized interaction: individuals register themselves by capturing and storing multiple images along with their names, making future identification seamless and intuitive.
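
Registering several images per person usually means storing several face embeddings under one name and comparing a new face against all of them. The sketch below illustrates that idea with cosine similarity in plain Python. It is an assumption about how matching could work, not a description of VisionMate's code: `FaceRegistry`, the 0.5 match threshold, and the three-dimensional vectors in the usage note are hypothetical, and real embeddings would come from the face recognition model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class FaceRegistry:
    """Stores multiple embeddings per name and finds the best match."""

    def __init__(self, match_threshold=0.5):
        self.match_threshold = match_threshold  # hypothetical cut-off
        self._embeddings = {}  # name -> list of embedding vectors

    def register(self, name, embedding):
        """Add one embedding (one captured image) for a person."""
        self._embeddings.setdefault(name, []).append(list(embedding))

    def identify(self, embedding):
        """Return the best-matching name, or None if nothing is close enough."""
        best_name, best_score = None, self.match_threshold
        for name, stored in self._embeddings.items():
            for reference in stored:
                score = cosine_similarity(embedding, reference)
                if score >= best_score:
                    best_name, best_score = name, score
        return best_name
```

With two toy embeddings registered for one person and one for another, a query vector close to the first person's embeddings resolves to that name, while a vector unlike any stored embedding returns `None`.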

A demo showcasing the working of VisionMate is provided below:

VisionMate - Demonstration Video

Conclusion:

VisionMate was developed during a hackathon in the fourth semester. While it successfully addresses key accessibility challenges, there remains significant scope for improvement and the addition of new features. This project reinforced our belief in building inclusive technology that can continue to evolve and create meaningful real-world impact.

Acknowledgement:

We sincerely thank the contributors Alma Periera and Sunil Andrade, whose ideas and efforts laid the foundation for VisionMate and supported its growth.

Project Credits:

  • Aashna Ferrao

  • Aroma Peris

  • Alroy Vishanth Dsouza

  • Jeethan Elroy Roche