ChatGPT Can Now See, Hear, and Speak (And It's a Big Deal)

OpenAI just dropped two big upgrades that bring ChatGPT closer to becoming a true multimodal AI assistant.

First, ChatGPT can now analyze and respond to images. You upload a photo or other image to the tool, then prompt it to describe, analyze, or answer questions about that image.

Early use cases for this functionality are stunning.

Second, ChatGPT can now talk to you like Siri or Alexa.

Just speak to it via the app, and it responds.

Tech reporter Kevin Roose had it read him a bedtime story, help him analyze a dream, and chat with him about some work-related stress.

(Notably, he said it was much, much better than Siri or Alexa.)

Right now, most users don’t have access to these features. But they’ll soon be rolled out to ChatGPT Plus and ChatGPT Enterprise users.

Why It Matters

ChatGPT is now the latest and greatest example of a multimodal AI assistant: an AI assistant that can work across different mediums like text, voice, and images to perform actions and produce outputs.

This unlocks huge new use cases for marketers and businesspeople using these tools.

Connecting the Dots

On Episode 66 of The Marketing AI Show, Marketing AI Institute founder and CEO Paul Roetzer walked me through why this is such a big deal.

  1. This makes ChatGPT insanely powerful. You can now do things like have it analyze marketing data charts, write product descriptions from photos, caption social posts, identify product defects from photos, build apps and sites from a simple drawing, and more.
  2. Yet most people still don’t truly comprehend its capabilities. “They’re not really thinking about the depth of the capabilities it has,” says Roetzer. Nor are they really pushing it and experimenting with it fully. Most people are still using a few basic prompts to do rudimentary tasks. There is much, much more to the tool.
  3. Domain experts stand to gain the most from the tool. “This is the kind of tech that becomes extremely powerful when you put it in the hands of people with domain expertise,” says Roetzer. “The professionals who figure out how to use these tools to do this kind of stuff will be able to just leap their peers.”
  4. And don’t sleep on ChatGPT as a voice assistant. OpenAI’s Whisper transcription technology is already incredible. We expect the voice functionality in ChatGPT to similarly excel. It also confirms that the future of all AI assistants will be multimodal, says Roetzer.

What to Do About It

  • Get moving on testing this out. “The leaders in different industries who get there first and test this technology, they're going to find the ways to reinvent their own industries,” says Roetzer. There’s no excuse not to have a ChatGPT Plus license moving forward.
