Someone is developing AI Avatars for Work Meetings

Work meetings are a huge cost for companies. Can AI solve this?

Are avatars going to substitute for us in work meetings?

Some startups and research labs are working toward this goal, so today we’ll dive into this topic:

  • A quick reminder of why meetings are the #1 problem for organizations (and why they often don’t fix it)

  • What note-taking apps / ‘AI-assistants’ for meetings are already out there and what they do

  • New solutions being researched on AI Agents for meetings, including key open questions and issues

Why Meetings are the #1 Problem for Organizations

To remind you, with some stats:

  • Time spent in meetings has been rising by 8% to 10% annually since 2000

  • 83% of employees spend up to a third of the workweek in meetings

  • 71% of professionals waste time every week due to unnecessary or canceled meetings.

  • 31 hours are spent on unproductive meetings monthly (more than a full day, and close to four 8-hour workdays!)

  • An estimated $400 billion is spent on poorly organized / unproductive meetings in the US alone

We have a problem with meetings. Imagine solving this: wouldn't it have a bigger impact on productivity than mere automation of manual work?

So the question we need to keep in mind, before going forward, is "What do we need meetings for, and how can we have fewer bad ones and more good ones?"

Otherwise, we would be treating the symptoms instead of curing the root cause of the problem (which, again, is what we often hope AI will do for us with a magic wand). I think 2024 will be the year when we need to ask ourselves better questions about AI, starting with "Do we need an AI for that?", instead of following the 'shiny object syndrome' that poisoned us in 2023.

Current Solutions

The current versions of 'Avatars' on popular platforms like Zoom or Microsoft Teams are pretty useless: they mirror your head movements and facial expressions, but they only help if you don't want to turn on the camera, and they are mostly good for short (and one-time?) fun.

As of today, AI note-taking apps are becoming common in meetings (either visibly or in the background), in an attempt to solve what is arguably one of the most underrated problems for an organization's productivity.

What are the main types?

  • Speech-to-text transcription: Converts spoken words into written text

  • Intelligent note-taking: Leverages AI algorithms to automatically organize and categorize your notes

  • OCR-based (Optical Character Recognition): Uses AI algorithms to extract text from images and make it editable and searchable

  • Contextual note-taking: Leans on AI to understand the context in which notes are taken and provide relevant suggestions or actions

  • Summarization: Condenses lengthy notes into key points, making it easier to review and comprehend information

  • Collaborative note-taking: Allows multiple users to work together on the same set of notes in real-time

  • Knowledge graph: Uses AI algorithms to create interconnected notes and concepts, linking related information so that you can navigate through a network of connected ideas

Top examples and some of the key differentiations:

  1. Microsoft Teams A.I. Note-Taker: Microsoft has integrated AI note-taking capabilities into Microsoft Teams. The AI can help summarize or rewrite existing notes and assist with drafting plans, generating ideas, creating lists, and organizing information within OneNote, which is part of the Microsoft 365 suite

  2. Google Meet’s AI (Duet AI): Google has introduced AI-powered features to Google Meet, including automatic note-taking. Duet AI can take notes and capture summaries and action items during meetings. Users can also enable Duet AI to "attend" meetings on their behalf, providing a summary of what was discussed.

  3. Tactiq.io: A Chrome plugin that can turn frequent AI prompts into 1-click actions.

  4. Notta: More focused on higher quality audio transcription, covering pretty much all possible languages.

  5. Otter.ai: Automatic speaker identification and keyword searches. It has features tailored to different use cases (e.g. Education, Sales), which is a very interesting take on fine-tuning the LLM and user interfaces.

  6. Fireflies.ai: Focus on collaboration and meeting intelligence stats

  7. Colibri.ai: Well-developed vertical use cases that go even further, adding things like call coaching and to-do lists

How can AI help?

With that question in mind (addressing it properly will take a separate article), we can go back to the original topic: which types of meetings could this technology be a good fit for?

  1. Status Update Meetings:

    • Goal: To provide updates on ongoing projects, tasks, or operations.

    • AI Suitability: High. AI agents can be programmed to gather data, compile reports, and present them efficiently. They can handle routine updates, track progress, and even answer basic queries about the status of projects.

    • Estimated Time Spent: 20-30%

  2. Decision-Making Meetings:

    • Goal: To make key decisions regarding business strategies, project directions, or operational changes.

    • AI Suitability: Moderate. AI can provide data-driven insights and predictions to support decision-making but may not be able to fully participate in nuanced discussions that require human judgment, emotions, or ethical considerations.

    • Estimated Time Spent: 10-15%

  3. Problem-Solving Meetings:

    • Goal: To address specific challenges or issues that have arisen.

    • AI Suitability: Moderate to High. AI can be useful in offering data analysis, suggesting possible solutions based on past data, and simulating outcomes. However, complex problem-solving that requires creative or out-of-the-box thinking might be beyond the current capabilities of AI.

    • Estimated Time Spent: 10-15%

  4. Brainstorming or Creative Meetings:

    • Goal: To generate new ideas, creative solutions, or innovative approaches.

    • AI Suitability: Low. While AI can assist in providing information or inspiration, the spontaneous and highly creative nature of these meetings is not something AI can effectively replicate. Human imagination and creativity are key here.

    • Estimated Time Spent: 15-20%

  5. Training or Educational Meetings:

    • Goal: To educate or train employees on new skills, policies, or procedures.

    • AI Suitability: High. AI can be very effective in delivering training content, especially if it's standardized. It can also adapt to different learning styles and paces, though it may lack the personal touch of a human trainer.

    • Estimated Time Spent: 5-10%

  6. Team Building or Motivational Meetings:

    • Goal: To build team camaraderie or motivate employees.

    • AI Suitability: Low. These meetings often rely on emotional intelligence, personal interactions, and motivational skills that are uniquely human.

    • Estimated Time Spent: 5-10%

  7. Feedback or Review Meetings:

    • Goal: To provide feedback on performance, projects, or other work-related matters.

    • AI Suitability: Moderate. AI can offer objective data-driven feedback but may not be able to handle the nuanced interpersonal aspects of delivering constructive criticism or personalized encouragement.

    • Estimated Time Spent: 10-15%

  8. Planning or Strategy Meetings:

    • Goal: To plan future activities, set goals, or develop strategies.

    • AI Suitability: Moderate to High. AI can contribute valuable data analysis and predictive modeling to inform planning but may not capture the full complexity of strategic thinking that involves long-term vision and human intuition.

    • Estimated Time Spent: 10-20%
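To get a rough sense of how much meeting time falls into each suitability tier, the estimates above can be tallied. This is only a minimal sketch: it assumes the percentages refer to shares of total meeting time, takes the midpoint of each range as a point estimate, and rescales the result to 100%, because the midpoints as listed add up to more than 100. The variable names are illustrative.

```python
from collections import defaultdict

# (meeting type, AI suitability, low %, high %) as listed above
MEETING_TYPES = [
    ("Status Update", "High", 20, 30),
    ("Decision-Making", "Moderate", 10, 15),
    ("Problem-Solving", "Moderate to High", 10, 15),
    ("Brainstorming or Creative", "Low", 15, 20),
    ("Training or Educational", "High", 5, 10),
    ("Team Building or Motivational", "Low", 5, 10),
    ("Feedback or Review", "Moderate", 10, 15),
    ("Planning or Strategy", "Moderate to High", 10, 20),
]

# Assumption: the midpoint of each range is a fair point estimate.
midpoint_by_tier = defaultdict(float)
for _name, tier, low, high in MEETING_TYPES:
    midpoint_by_tier[tier] += (low + high) / 2

# The midpoints sum to more than 100, so rescale them to percentages.
total = sum(midpoint_by_tier.values())
shares = {tier: 100 * value / total for tier, value in midpoint_by_tier.items()}

for tier, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{tier}: {share:.1f}% of meeting time")
```

Under these assumptions, a little over half of meeting time lands in the 'High' and 'Moderate to High' tiers combined, which is a back-of-the-envelope reading of the list rather than a measured result.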

The Project

So back to our initial question: Are avatars going to substitute for us in meetings?

During one of the Demo Day presentations at PiSchool, Italy’s leading School of AI, I heard Sébastien Bratieres (Managing Director) announce that they had been awarded a 7M€ grant by the European Union to build technology to do just that. The project is called Metween and, interestingly, even though Zoom’s Research team is also participating, the goal is to make everything Open Source, so that it is available to anyone for adaptations and improvements in the future.

The vision: a completely autonomous Avatar that replaces you if you can't (or don't want to...) attend a particular meeting.

Use Case #1: Making Video Meetings Work

Technology-wise, this would be built on the capability to generate video from audio and vice versa, which means the first use case is a simple (but practical) one: fixing technical issues in video conferences. Slow connections can ruin a meeting with lagging video, but with this technology the AI avatar could step in and provide a seamless video feed, eliminating the issues caused by poor connectivity.

Use Case #2: Having an AI 'Butler' Meeting Assistant

The second use case involves having an AI 'Butler' Meeting Assistant. This AI avatar could handle tasks such as taking notes, setting reminders, and responding to simple queries during meetings. It could also analyze the content of the meeting in real time, providing insights and summaries and helping to keep the meeting on track. It could even take the initiative and perform actions such as muting/unmuting or setting up a follow-up meeting.

Use Case #3: Completely delegate a meeting to your AI Avatar

The third use case is the most advanced: completely delegating a meeting to your AI Avatar. If you're unable to attend a meeting, your AI avatar could step in and represent you. It would be able to understand the context of the meeting, respond to questions, and make decisions based on your preferences and past behavior. This could revolutionize the way we approach meetings, allowing for greater flexibility and efficiency.

At this point you should be asking yourself: how will it be able to do that? Simple: by having access to personal, private data.

For an AI avatar to function effectively in the described scenarios, it would need access to a variety of data types:

  1. Audio and Video Data: To create a seamless video feed or to build video based on audio, the AI would need to process live audio and video streams from the meeting.

  2. Personal Identifiers: To represent an individual, the AI would require personal identifiers such as name, job title, photographs, or video to create a realistic avatar.

  3. Behavioral Data: For the AI to make decisions based on your preferences and past behavior, it would need to collect and analyze how you typically respond in meetings, including your speech patterns and decision-making processes.

  4. Communication Content: To take notes and provide meeting summaries, the AI would need to process and understand the content of the communications during the meeting.

  5. Calendar and Contact Information: For setting reminders and scheduling follow-up meetings, the AI would need access to your calendar and contact lists.

  6. Biometric Data: If the AI is to authenticate the user or personalize the avatar, it may use biometric data such as voiceprints or facial recognition data.

  7. Contextual Data: To understand the context of the meeting, the AI would need access to background information, such as previous meeting notes, company data, and project statuses and presentations.
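One way to reason about this list is as a 'least data needed' lookup: switch on only the capabilities you want, and see which data types that implies. The sketch below is purely illustrative. The capability keys are hypothetical names that paraphrase the purpose stated in each item above, and nothing here reflects how Metween or any real product handles permissions.

```python
# Purpose served by each data type, paraphrased from the list above.
# The capability keys are hypothetical names made up for this sketch.
DATA_FOR_CAPABILITY = {
    "video_feed": "Audio and Video Data",
    "represent_individual": "Personal Identifiers",
    "decide_from_past_behavior": "Behavioral Data",
    "notes_and_summaries": "Communication Content",
    "reminders_and_scheduling": "Calendar and Contact Information",
    "authenticate_or_personalize": "Biometric Data",
    "understand_context": "Contextual Data",
}


def data_needed(capabilities):
    """Return the sorted data types implied by the enabled capabilities."""
    unknown = set(capabilities) - set(DATA_FOR_CAPABILITY)
    if unknown:
        raise ValueError(f"Unknown capabilities: {sorted(unknown)}")
    return sorted({DATA_FOR_CAPABILITY[c] for c in capabilities})


# Example: a note-taking and scheduling assistant only.
butler = data_needed(["notes_and_summaries", "reminders_and_scheduling"])
print(butler)
```

In this simplified view, a note-taking and scheduling assistant would only touch communication content plus calendar and contact information, while fully delegating a meeting would pull in most of the list.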

A lot of data privacy issues need to be solved, including legal ones (who is responsible for what your avatar says?). I'm also asking myself a fundamental question: who owns this avatar? Is it you, or is it the company you work for? What happens when you leave?

I've spoken about Human Capital before, a highly underrated topic and an under-measured value in organizations. This is a potentially 'perfect' solution to capture part of that value, if AI Avatars ever really end up being fed most of your data and capabilities. It's easy to see how it would feed the frenzy to replace humans with machines for better cost savings, more control, and more capability to measure. However, I hope that as the hype partially dies down, we'll start to see that it is specific sets of tasks in knowledge work, rather than roles per se, that can be fully replaced or augmented. The risk is that we'll get lazy and make meetings worse, because we won't get what we need out of that time.

Much has yet to be defined, including some of these other open questions:

  • Human-less Meetings: What happens if you organize a meeting where all participants are avatars?

  • Aesthetics: Does your avatar have your face and voice?

  • Morals and Ethics: How do you feel about interacting with your 'AI clone'? (This is a much broader topic: a lot of startups are working on celebrity and personal AI clones, and it's uncharted territory.)

  • Decisions: Who will decide if you're allowed to send your avatar to a meeting?

The project is set to last until 2027, with incremental releases up until then. Others are working on this, including Meta with their Seamless research team (possibly to make it part of their efforts toward the Metaverse, remember that?) as well as High Performance Language Technologies. We’re just at the beginning, but it will be interesting to follow and see where things go.