OpenAI o1: What You Need to Know

Written by Mike Kaput | Sep 17, 2024 12:05:51 PM

It’s finally here…

OpenAI has released an initial version of its code-named “Strawberry” project—a new AI model that displays advanced reasoning.

The new model is formally called “o1,” and it’s designed to spend more time thinking about problems before it responds, mimicking how the human brain works.

As a result, o1 can “reason through complex tasks and solve harder problems than previous models in science, coding, and math,” according to OpenAI.

But what exactly is o1, and why should you care?

I got the inside scoop from Marketing AI Institute founder and CEO Paul Roetzer on Episode 115 of The Artificial Intelligence Show.

Why o1 Matters

OpenAI sees the road to artificial general intelligence (AGI) as consisting of five levels:

Level 1: Chatbots (Where we were with GPT-4)
Level 2: Reasoners (That's o1)
Level 3: Agents (AI that can take actions)
Level 4: Innovators (AI that can aid in invention)
Level 5: Organizations (AI that can do the work of an entire organization)

Until o1, we were at Level 1. Now, we’re likely at Level 2, says Roetzer. And Level 2 is critical for what comes next, because reasoning lets AI systems solve more complex problems. That unlocks the ability to take actions (Level 3) and even to help us make AI research breakthroughs (Level 4). In turn, this leads to AI that can do the work of entire organizations (as OpenAI sees it) in Level 5.

"Reasoning is the foundation of all of this," says Roetzer.

So, o1 may be the key that unlocks the entire path to more advanced—even generally intelligent—AI.

What Makes o1 Special?

o1 has this potential to unlock advanced AI because it’s built a little differently than what came before.

“We trained these models to spend more time thinking through problems before they respond, much like a person would,” OpenAI writes of the o1 family of models. “Through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes.”

Essentially, the model natively embeds “chain-of-thought” reasoning, a popular and effective prompting strategy, into how it works.

This is similar to a type of thinking called “System 2” thinking, which humans use specifically to tackle complex problems. Because of that, o1 is not better than a model like GPT-4o at everything, says Roetzer, just at advanced reasoning tasks.

"You're only going to use the o1 model for advanced reasoning, decision making, things that require System 2 thinking," Roetzer explains.

The results of o1 on these types of tasks, so far, appear to be impressive.

OpenAI says o1 performs “similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology.” It also scored 83% on a qualifying exam for the International Mathematical Olympiad (IMO), whereas GPT-4o scored 13% on the same test.

What You Need to Know About o1 Right Now

There are some important things to remember about o1 as it exists today, says Roetzer.

"This is like the GPT-1 moment for reasoning models," says Roetzer.

Regardless of o1’s capabilities, we’re seeing the earliest version of this type of technology.

You can see that in the iterative rollout of the model. We only have access right now to the o1 preview model, and a “mini” version of the preview model that is faster and cheaper. We don’t even have access to the full o1 model yet. So, be careful reading examples online of how well (or not) it works.

You can use the o1 preview models right within ChatGPT. But they don’t have all the features of ChatGPT yet, like access to the internet or the ability to use files.

And, reminds Roetzer, o1 is designed to accomplish specific reasoning and planning tasks well—not everything. So, if you test it on things it’s not uniquely good at, it may fall short of your expectations.

It’s best for things that require multiple steps. For instance, Roetzer plans on putting it through its paces on a number of business planning challenges for the coming year.

“There are a bunch of [business] challenges I see and problems to solve and opportunities to evaluate,” he says. “I intend to use the model to help me go through a chain-of-thought to evaluate the future of our business.”
