Powering Talking Characters with Generative AI


Introduction

The Talking Character is a customizable 3D avatar builder that uses Generative AI to bring animated characters to life. The avatar's personality, backstory, and knowledge base can be configured to create a specialized expert with a unique perspective on any given topic. Users can interact with the avatar through either text or voice conversation.

Goals

The code template serves two main goals. The first is to provide developers and users with a test interface for experimenting with prompt engineering for character development and for layering specific datasets on top of the PaLM API to create unique experiences. The second is to enable an interactive visual experience through a 'talking' 3D avatar that animates based on the pattern and intonation of the generated voice.
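As a rough illustration of the first goal, the sketch below shows one way a character's personality, backstory, and knowledge base might be folded into a single text prompt. It is a minimal sketch and not the template's actual code: the `CharacterConfig` shape, the `buildCharacterPrompt` function, and the prompt wording are all assumptions made for this example, and the call to the language model itself is deliberately left out.

```typescript
// Hypothetical character definition; the field names are illustrative
// and are not taken from the Talking Character template.
interface CharacterConfig {
  name: string;
  personality: string;
  backstory: string;
  knowledgeBase: string[]; // short facts the character may draw on
}

// Assemble a plain-text prompt from the character settings and the
// user's latest message. Sending the prompt to a model is out of scope.
function buildCharacterPrompt(character: CharacterConfig, userMessage: string): string {
  const facts = character.knowledgeBase.map((fact) => `- ${fact}`).join("\n");
  return [
    `You are ${character.name}.`,
    `Personality: ${character.personality}`,
    `Backstory: ${character.backstory}`,
    "Knowledge base:",
    facts,
    "Stay in character and rely on the knowledge base when answering.",
    `User: ${userMessage}`,
    `${character.name}:`,
  ].join("\n");
}

// Example character, invented purely for demonstration.
const examplePrompt = buildCharacterPrompt(
  {
    name: "Captain Ada",
    personality: "cheerful and precise",
    backstory: "a retired starship navigator",
    knowledgeBase: ["The ship has three decks.", "The engine room is on deck one."],
  },
  "Where is the engine room?"
);
console.log(examplePrompt);
```

Keeping the character settings in a structured object rather than a hand-written string makes it easier to vary one trait at a time while experimenting with prompts.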

Animation

The avatar animates based on the pattern and intonation of the generated voice. We used the MediaPipe framework and a new audio-to-blendshapes machine learning model to generate facial expressions and lip movements that synchronize with the voice pattern. We then used the open source JavaScript 3D library three.js to morph the facial expressions and lip motion of the 3D avatar.
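The morphing step can be sketched roughly as follows. In three.js, a mesh with morph targets exposes a `morphTargetDictionary` (target name to index) and a `morphTargetInfluences` array (one weight per target), and writing weights into that array deforms the face. The sketch assumes the audio-to-blendshapes model emits, for each frame, a map from blendshape name to a score between 0 and 1. That output format, the `BlendshapeFrame` type, the `applyBlendshapes` helper, and the smoothing factor are assumptions for illustration, not the template's actual implementation. To keep the example self-contained, the mesh is described by a small interface instead of importing three.js.

```typescript
// Structural view of the two three.js Mesh properties used for morph targets.
// A real THREE.Mesh with morph targets satisfies this shape.
interface MorphableMesh {
  morphTargetDictionary?: { [name: string]: number };
  morphTargetInfluences?: number[];
}

// Hypothetical per-frame model output: blendshape name -> score in [0, 1].
type BlendshapeFrame = { [name: string]: number };

// Move each morph target influence toward the score for this frame.
// smoothing = 1 jumps straight to the target; smaller values ease toward it.
function applyBlendshapes(mesh: MorphableMesh, frame: BlendshapeFrame, smoothing: number = 0.5): void {
  const dictionary = mesh.morphTargetDictionary;
  const influences = mesh.morphTargetInfluences;
  if (!dictionary || !influences) return; // mesh has no morph targets

  for (const name of Object.keys(frame)) {
    const index = dictionary[name];
    if (index === undefined) continue; // the avatar lacks this blendshape

    const target = Math.min(1, Math.max(0, frame[name])); // clamp to [0, 1]
    const current = influences[index] ?? 0;
    influences[index] = current + (target - current) * smoothing;
  }
}

// Stand-in mesh with two morph targets, used here instead of a loaded model.
const demoMesh: MorphableMesh = {
  morphTargetDictionary: { jawOpen: 0, mouthSmileLeft: 1 },
  morphTargetInfluences: [0, 0],
};
applyBlendshapes(demoMesh, { jawOpen: 0.8, mouthSmileLeft: 0.2 });
console.log(demoMesh.morphTargetInfluences);
```

In a real scene this function would be called once per rendered frame with the blendshape scores aligned to the current audio playback time. The easing step is one simple way to avoid visible jitter between frames.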

Use Cases

The potential use cases for this type of technology are diverse. They include interactive creative tools for developing characters and narratives in gaming or storytelling; tech support, even for complex systems or processes; customer service tailored to specific products or services; debate practice, language learning, or subject-specific education; and bringing brand assets to life with a voice and the ability to interact. The Talking Character template allows users to create their own avatars, dress them up in different clothes and accessories, and interact with other users in virtual worlds.

Conclusion

Talking Character will also be open-sourced so that developers and startups can build on top of the experiences we have created. We believe that combining advanced language models with avatars will greatly expand what is possible in the metaverse, and we are excited to be a part of it. The demo shows how characters can bring a text corpus or knowledge base to life in any domain. It could also be used to create avatars for customers as they explore a retail environment, gamifying the experience of shopping in the real world.