Description:
We were inspired by the potential to help parents provide educational content and narrate bedtime stories for their children.
V.I.N.E is a generative model that takes the user's desired story genre, then creates and narrates the story in an interactive format through images and a user-chosen narrator.
We used Microsoft's Azure Speech Recognition to capture the user's input, then used the Stable Diffusion API and the ElevenLabs API to generate relevant images and narration in a user-chosen voice from content generated directly by a large language model.
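The flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the names `Scene`, `build_story`, `write_story`, `draw`, and `narrate` are hypothetical, the paragraph-per-scene split is an assumption, and the `fake_*` functions stand in for the language model, the image API, and the voice API so the sketch runs without any API keys.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    text: str     # one passage of the story
    image: bytes  # illustration generated for the passage
    audio: bytes  # narration generated for the passage


def build_story(
    genre: str,
    write_story: Callable[[str], str],     # stand-in for the language model
    draw: Callable[[str], bytes],          # stand-in for the image API
    narrate: Callable[[str, str], bytes],  # stand-in for the voice API
    voice: str,
) -> List[Scene]:
    """Turn a requested genre into illustrated, narrated scenes."""
    story = write_story(genre)
    # Assumption: each non-empty paragraph of the story becomes one scene.
    passages = [p.strip() for p in story.split("\n\n") if p.strip()]
    return [Scene(p, draw(p), narrate(p, voice)) for p in passages]


# Hypothetical stand-in services (no network calls, no API keys).
def fake_write_story(genre: str) -> str:
    return f"A {genre} tale begins.\n\nThe hero returns home."


def fake_draw(text: str) -> bytes:
    return f"image:{text}".encode()


def fake_narrate(text: str, voice: str) -> bytes:
    return f"{voice}:{text}".encode()


scenes = build_story("fantasy", fake_write_story, fake_draw, fake_narrate, "Grandma")
```

Passing the three services in as plain callables keeps the orchestration independent of any one provider, so a real client could be swapped in for each stand-in without changing `build_story`.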
We faced implementation issues because the multiple API calls were not synchronized with each other. We also struggled to find good open-source repositories that could help us achieve our goals.
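One common way to keep an image request and a narration request in step is to start both together and wait for both to finish before showing the scene. The sketch below uses Python's `asyncio.gather` for this. It is an illustration under stated assumptions, not the approach we shipped: `generate_image`, `generate_audio`, `render_scene`, and `render_story` are hypothetical names, and the `asyncio.sleep` calls simulate slow remote requests.

```python
import asyncio
from typing import List, Tuple


async def generate_image(text: str) -> str:
    await asyncio.sleep(0.02)  # simulates a slow image request
    return f"image for: {text}"


async def generate_audio(text: str) -> str:
    await asyncio.sleep(0.01)  # simulates a slow narration request
    return f"audio for: {text}"


async def render_scene(text: str) -> Tuple[str, str]:
    # gather runs both coroutines concurrently and returns their results
    # in argument order only once both have finished.
    image, audio = await asyncio.gather(generate_image(text), generate_audio(text))
    return image, audio


async def render_story(passages: List[str]) -> List[Tuple[str, str]]:
    # Scenes are rendered concurrently, but results keep the story order.
    return await asyncio.gather(*(render_scene(p) for p in passages))


results = asyncio.run(render_story(["Once upon a time", "The end"]))
```

Because `gather` preserves argument order, each image stays paired with its own narration even when the requests complete at different times.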
We're proud to have built a product that solves a genuine problem relevant to people in our society.
Our biggest takeaway was that great things are possible when we're motivated by a desire to create something new and exciting.
Next, we plan to support multiple characters, make the user interface more interactive with subtitles, and provide a more lifelike experience.
Two additional zip files are available on Google Drive.
The stability-sdk zip file: https://drive.google.com/file/d/1qhKJuwWbliy6SCUu1QMm6bH40z9jPOYy/view?usp=sharing
The env zip file: https://drive.google.com/file/d/1SlzgesH0SAiV8agpFchVB6TCa7fIo-W-/view?usp=sharing