The AI world is buzzing with a flurry of groundbreaking announcements, rumors, and releases from leading AI companies.
This week, Anthropic released an updated Claude 3.5 Sonnet and taught Claude to use a computer, a host of other AI companies shipped powerful updates, and the AI world can’t stop talking about wild rumors surrounding OpenAI and Google’s next models (including rumors that we might get significantly more powerful models this year).
If you want to understand what’s really going on, keep reading:
I got the scoop from Marketing AI Institute founder and CEO Paul Roetzer on Episode 121 of The Artificial Intelligence Show.
The Big Wave of AI Announcements
First, let’s break down the announcements we got this week. Then, we’ll unpack why they matter.
- Anthropic unveiled significant upgrades to its AI models, including an enhanced Claude 3.5 Sonnet and a new Claude 3.5 Haiku model. Most notably, they introduced Claude's ability to use computers—controlling cursors, clicking buttons, and typing text through their API.
- OpenAI is reportedly doubling down on AI-powered software development tools, spurred by growing competition from Anthropic in the coding space. They're working on integrations with popular code editors and more ambitious features to automate complex engineering tasks.
- Perplexity announced that their Pro service is evolving into a "reasoning-powered search agent" for complex queries that require extensive browsing and analysis.
- Runway unveiled Act-One, a breakthrough tool that transforms character animation creation using simple video inputs—potentially revolutionizing the animation industry.
- ElevenLabs launched Voice Design, allowing users to generate custom voices through text descriptions alone.
- Stability AI (yes, they're still around) released Stable Diffusion 3.5, their most powerful image generation model yet.
And the rumor mill is working overtime. We also saw the following rumors spread like wildfire:
First, The Verge reported that OpenAI could launch its next frontier model, codenamed “Orion,” as early as December of this year—right around ChatGPT’s two-year anniversary.
CEO Sam Altman quickly posted this was “fake news,” and an OpenAI spokesperson said the company does not plan on releasing anything called Orion this year. (They did say, however, they plan to release “other great technology.”)
Second, The Verge also reported that Google is potentially planning to release Gemini 2.0 in December. The Information piled onto that report with claims that Google is also working on something called “Project Jarvis,” which is AI that can do tasks for you in Google Chrome, including doing research, booking a flight, or buying a product.
Is your head spinning yet?
But, in all seriousness, what’s really important to pay attention to here? Roetzer had some thoughts.
Is Computer Use All Hype?
The biggest formal announcement turning heads is Anthropic giving Claude the ability to use a computer. But, beware the hype, says Roetzer.
The technology is still very, very early and very, very rudimentary. It’s also, despite what some online commentators are saying, not actually new. The idea of an AI model being able to see what’s happening on your screen and take action goes back quite a ways.
In fact, famous AI researcher Andrej Karpathy was working on this idea during his first stint at OpenAI back in 2017. It’s a concept called “World of Bits.”
The basic idea is that the “World of Atoms” is the real world we humans physically inhabit and interact with. But, there’s also another “World of Bits,” the digital world, that a machine may be able to navigate for us.
Back in 2017, it wasn’t possible to build a general agent that could figure out how to interact with a website. But that changed over the course of the following years.
“Large language models unlocked this ability to build these web-based agents,” says Roetzer. “Because what they realized is once the model could understand the language, it could actually be trained to do this computer use model where it could learn how to use keyboards and mice.”
In short, this idea of computer use has been around for a while, and top AI labs other than Anthropic have also been working on it.
Anthropic’s version, the company fully admits, is in its very early stages. (Claude counts pixels on the screen to make computer use possible.)
Not only is the technology early, says Roetzer, it’s also dangerous. Giving an untested agent access to your account logins or full access to your permissions is not advisable, Anthropic warns.
The company even said that, for safety reasons, they did not allow the model to access the internet during training.
“Oddly enough, Anthropic, the frontier model company supposed to be focused on responsible AI more than all the others, is the one that came to market with a tool that is not safe,” says Roetzer.
Their own documentation warns users to:
- Use dedicated virtual machines with minimal privileges
- Avoid giving access to sensitive data or login information
- Limit internet access to approved domains
- Have humans confirm any meaningful real-world decisions
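To make two of those warnings concrete, here is a minimal Python sketch of what "limit internet access to approved domains" and "have humans confirm meaningful decisions" could look like in a wrapper around an agent. It is an illustration only, not Anthropic's API or recommended implementation: the domain list, the action names, and the `review_action` helper are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: placeholder domains, not Anthropic recommendations.
APPROVED_DOMAINS = {"example.com", "docs.example.com"}

# Hypothetical action types treated as "meaningful real-world decisions".
NEEDS_HUMAN_CONFIRMATION = {"purchase", "send_email", "delete_file"}


def is_domain_approved(url: str) -> bool:
    """Return True only if the URL's host is an approved domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in APPROVED_DOMAINS)


def review_action(action_type: str, url: str, confirm=input) -> bool:
    """Decide whether a proposed agent action may proceed.

    `confirm` is any callable that shows a prompt and returns the human's answer;
    it defaults to the built-in input() so a person has to type "y".
    """
    if not is_domain_approved(url):
        return False  # Block anything outside the allowlist outright.
    if action_type in NEEDS_HUMAN_CONFIRMATION:
        answer = confirm(f"Allow '{action_type}' on {url}? [y/N] ")
        return answer.strip().lower() == "y"
    return True  # Low-stakes action on an approved domain.
```

In this sketch, a call like `review_action("purchase", "https://example.com/cart")` would pause and wait for a person to type "y" before the agent could continue, while any request to a domain outside the list is refused without asking.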
The company claims it is better to release computer use functionality now rather than later, because that gives more time to make the technology safer before the underlying models become much more capable.
So, computer use hasn’t changed anything yet for your average business leader, says Roetzer. And it may not for some time: Users are going to be understandably nervous about giving computer-using AI access to the tools it needs to perform tasks.
Are These Rumors For Real?
So, what about all the rumors flying around about OpenAI and Google?
As always, it’s important to remember, says Roetzer: “It’s just rumors.”
However, he says he wouldn’t be surprised if Google dropped Gemini 2.0 by year’s end as reported.
He also would bet that “something is coming” from OpenAI. We’re seeing all the typical signs here of something coming soon, including tons of rumors and leaks, as well as Sam Altman cryptically posting online.
what should we give it for a birthday present...
— Sam Altman (@sama) October 21, 2024
“So they’re absolutely coming out with something,” says Roetzer. “Though they probably won’t call it Orion.”
One of his guesses? The full version of o1, the company’s new advanced reasoning model. Right now, we’ve only got access to the preview version of the model. Roetzer suspects that will soon change.
Mike Kaput
As Chief Content Officer, Mike Kaput uses content marketing, marketing strategy, and marketing technology to grow and scale traffic, leads, and revenue for Marketing AI Institute. Mike is the co-author of Marketing Artificial Intelligence: AI, Marketing and the Future of Business (Matt Holt Books, 2022).