When I was a child, I had an uncle who was blind. Every day after returning from school, I would describe the sunset to him. What if, with the help of Generative AI and computer vision technologies, I could make this a reality for many people like him today? In this talk, I'll share a hobby project I've developed to narrate the world in real time.

The "Be My Eyes" project leverages AI to extend the scope of experience for blind or visually impaired people, utilizing OpenAI's advanced object detection and computer vision models.

The magic unfolds through a simple yet powerful setup: a video camera continuously records the user's surroundings, and these real-time recordings are fed into an AI model trained to analyze and interpret the visual content of the video. "Be My Eyes" then converts the model's textual narration of the scene into an audio description via a text-to-speech model.
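The abstract doesn't include code, but a minimal sketch of that capture, describe, speak loop might look like the following, assuming OpenCV for camera capture and the OpenAI Python SDK for both the vision and text-to-speech steps. The model names (gpt-4o, tts-1), the prompt, and the ten-second narration cadence are illustrative assumptions, not the project's actual configuration.

    # Minimal sketch of the capture -> describe -> speak loop.
    # Model names, prompt, and cadence are assumptions, not the
    # project's confirmed configuration.
    import base64
    import time

    import cv2
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def describe_frame(frame) -> str:
        """Send one camera frame to a vision-capable model, return its narration."""
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            raise RuntimeError("JPEG encoding failed")
        b64 = base64.b64encode(jpeg.tobytes()).decode()
        response = client.chat.completions.create(
            model="gpt-4o",  # assumption: any vision-capable model works here
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Describe this scene briefly for a blind listener."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
                ],
            }],
        )
        return response.choices[0].message.content

    def speak(text: str, path: str = "scene.mp3") -> None:
        """Convert the narration to audio with a text-to-speech model."""
        speech = client.audio.speech.create(model="tts-1", voice="alloy", input=text)
        speech.write_to_file(path)  # play back with any audio player or library

    camera = cv2.VideoCapture(0)  # the user-facing video camera
    try:
        while True:
            ok, frame = camera.read()
            if ok:
                speak(describe_frame(frame))
            time.sleep(10)  # narration cadence; tune for cost and responsiveness
    finally:
        camera.release()

In a real deployment the describe and speak steps would likely run asynchronously, so narration keeps pace with the camera instead of blocking it.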