AI agents are not as independent as headlines suggest.
Join hosts Mike Kaput and Paul Roetzer as they examine why giants like OpenAI and Google are seeing diminishing returns in their AI development, demystify the current state of AI agents, and unpack fascinating insights from Anthropic CEO Dario Amodei's recent conversation with Lex Fridman about the future of responsible AI development and the challenges ahead.
Listen or watch below, and keep scrolling for the show notes and transcript.
00:04:34 — Has AI Hit a Wall?
00:14:31 — What Is An AI Agent?
00:38:56 — Dario Amodei Interview
00:49:27 — OpenAI Nears Launch of AI Agent Tool
00:51:58 — OpenAI Co-Founder Returns to Startup After Months-Long Leave
00:53:41 — Research: How Gen AI Is Already Impacting the Labor Market
00:58:42 — Google’s Latest Gemini Model Now Tops the AI Leaderboard
01:02:53 — Microsoft Copilot Is Struggling
01:09:03 — Microsoft 200+ AI Transformation Stories
01:11:11 — xAI Is Raising Up to $6 Billion at $50 Billion Valuation
01:13:15 — Writer Raises $200M Series C at $1.9B Valuation
01:15:24 — How Spotify Views AI-Generated Music
Has AI scaling hit a wall?
This question is increasingly on the minds of the AI community, as more and more reports emerge that the major AI companies are hitting roadblocks in their race to build the next generation of AI models.
According to recent reporting from Bloomberg and The Information, OpenAI, Google, and Anthropic are all experiencing diminishing returns in their efforts to develop more advanced AI models, despite massive investments in computing power and data.
OpenAI's latest model, codenamed Orion, hasn't met the company's performance expectations, particularly struggling with coding tasks. Similarly, we’re hearing that Google's upcoming Gemini update is falling short of internal goals, and Anthropic has actually delayed the release of its anticipated Claude 3.5 Opus model.
The problem appears to have three root causes:
First, companies are running out of high-quality training data. The internet's freely available content, which powered the first wave of AI models, may no longer be enough to create significantly smarter systems.
Second, even modest improvements now require enormous computing resources, making it harder to justify the costs.
Third, the long-held belief in Silicon Valley that simply scaling up models with more data and computing power would lead to better performance—known as "scaling laws"—is being challenged.
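For context, the "scaling laws" here refer to empirical power-law fits, popularized by OpenAI's 2020 scaling-laws paper and DeepMind's "Chinchilla" paper, that predict a model's loss from its parameter count and training data. As a rough gloss (a simplified form, not a quote from either paper), the fitted relationship looks like:

```latex
% Simplified Chinchilla-style scaling law: predicted loss L as a function of
% parameter count N and training tokens D. E is the irreducible loss; A, B,
% alpha, and beta are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

The power-law form is why "diminishing returns" and "running out of data" are two sides of the same concern: each additional gain requires multiplying N or D, and the data term stops shrinking once the supply of high-quality tokens is exhausted.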
This has caused some prominent voices in AI to claim we’re “hitting a wall” when it comes to AI development—and that we’re not on as fast a path to artificial general intelligence (AGI) as some AI leaders would have you believe.
What is an AI Agent?
The importance of AI agents continues to grow, and with that importance come lots of misconceptions.
While many major tech companies like Microsoft, Salesforce and Google are touting “autonomous” AI agents, these systems are not truly autonomous yet. Instead, they still require significant human involvement in setting goals, planning, and oversight.
While the current definition and future of AI agents remain uncertain, they should be seen as opportunities rather than threats.
Experience with building AI agents could become a significant advantage in job interviews and career advancement, similar to how custom GPTs and other AI tools can demonstrate valuable skills to potential employers.
Dario Amodei Interview
Lex Fridman just dropped a massive 5-hour interview with key leaders at Anthropic, including CEO Dario Amodei; Amanda Askell, who works on fine-tuning and AI alignment; and co-founder Chris Olah, who works on mechanistic interpretability at the company.
Amodei talked a lot about scaling laws, and their limitations, including the possibility of running out of data or hitting a ceiling in terms of the complexity of the real world. He also spent a lot of time on Anthropic’s responsible scaling policy, which is designed to address the risks of AI systems becoming too powerful.
Amodei believes that it is important to start thinking about these risks now, even though AI systems are not yet powerful enough to pose a serious threat.
Of course, this is just a small sample of the topics discussed. But these types of interviews are important to keep up with, for a couple of reasons:
One, the best way to understand what the handful of people actually shaping the future of AI believe is to listen to what they tell you over time in interviews like this.
And, two, these interviews are increasingly fulfilling the role of formal company statements.
We’re increasingly seeing AI founders “go direct” to popular podcasts to get their viewpoints and perspectives out there, so these types of interviews may be the source of truth when you want details on things like model releases, product roadmaps, or company viewpoints.
In fact, instead of responding to Bloomberg's requests for interviews for the reporting we covered in the last segment, Anthropic simply pointed the publication to this podcast.
Today’s episode is brought to you by our AI for Agencies Summit, a virtual event taking place from 12pm - 5pm ET on Wednesday, November 20.
The AI for Agencies Summit is designed for marketing agency practitioners and leaders who are ready to reinvent what’s possible in their business and embrace smarter technologies to accelerate transformation and value creation.
You can get tickets by going to www.aiforagencies.com and clicking “Register Now.” When you do, use the code AIFORWARD200 for $100 off your ticket.
Disclaimer: This transcription was written by AI, thanks to Descript, and has not been edited for content.
[00:00:00] Paul Roetzer: If you hear about AI agents and you think, oh my gosh, they're taking my job next year: that is not happening. If you realize all the things that have to go into making an agent work, goal setting, planning, building it, monitoring it, improving it, that is almost always the human's job right now. Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable.
[00:00:29] Paul Roetzer: My name is Paul Roetzer. I'm the founder and CEO of Marketing AI Institute, and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer, Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career.
[00:00:50] Paul Roetzer: Join us as we accelerate AI literacy for all.
[00:00:57] Paul Roetzer: Welcome to episode 124 [00:01:00] of the Artificial Intelligence Show. I'm your host, Paul Roetzer, along with my co-host, Mike Kaput, who is our Chief Content Officer at Marketing AI Institute and co-author of our book, Marketing Artificial Intelligence. We've got, like, a lot to cover this week. It's going to be a continuation in some ways of last week's conversation about, like, have these scaling laws just stopped working?
[00:01:24] Paul Roetzer: Have we hit a wall? There was more stuff there. We've decided to do a deep dive into what is an AI agent, and I'll explain why in a minute, but this is a really important conversation, and I hope very valuable to people. And then, this unbelievable five-hour podcast from Lex Fridman, which, shockingly, I actually listened to the whole thing of, with Dario Amodei and Amanda Askell

[00:01:51] Paul Roetzer: from Anthropic. Wow. What a, what a marathon that was. Okay. So we got a lot to talk about, and a bunch of rapid fire items, and then some [00:02:00] stuff we had to cut at the last minute, because we're going to struggle to keep this one under an hour and 15 minutes, but we're going to do our best. So this week's episode is brought to us by the AI for Agencies Summit.
[00:02:11] Paul Roetzer: This is our virtual event that's taking place Wednesday, November 20th. So if you are listening to this the day it comes out on November 19th, you still have time to get in for the live event. If you are listening to this after November 20th, you can get AI for Agencies on demand. So you haven't missed out if you listen to this late.
[00:02:30] Paul Roetzer: But we are recording this on November 18th. I'm in the midst of finalizing my opening keynote, which we are going to talk about because it is related to the AI Agent topic. but AI for Agencies Summit, again, is coming up on Wednesday, November 20th. It is a half day virtual event from noon to 5 p. m.
[00:02:48] Paul Roetzer: Eastern time. We're going to hear from, I think, about six case studies from agency leaders, talking about how they're using AI, how they're infusing it into their [00:03:00] own transformation and building it into their client programs. I've got an opening keynote on AI agents and the future of agencies.
[00:03:09] Paul Roetzer: We've got an incredible closing keynote. We've got a panel that provides brand-side perspective on what's going on and how brands are thinking about working with agencies. So, there's a ton of content packed into five hours of a virtual event. You can check all that out at AIforAgencies.com, that's AIforAgencies.com.

[00:03:29] Paul Roetzer: You can use promo code AIFORWARD200; that'll get you 200 off your ticket. And again, it's AIforAgencies.com, be sure to use that promo code. And as I mentioned, there will be on-demand options, so if you can't make it live, different time zone, or you're busy that day, don't worry about it, you can catch up on demand.
[00:03:51] Paul Roetzer: Alright, and then a quick programming note. So, we are not going to have an episode next week. That would normally be the [00:04:00] drop day, Tuesday, November 26th; we will not have an episode. The next weekly will be Tuesday, December 3rd. I'm actually going to be on vacation at the end of this week, and we can't record while I'm on vacation.
[00:04:13] Paul Roetzer: So, Mike and I are going to take a week off. Hopefully nothing too crazy happens. I realize now that that will be in the midst of the two-year anniversary of ChatGPT. So, maybe some things will be happening, but, yeah. So, we'll catch up with you on December 3rd, and we'll remind you again at the end here.
[00:04:34] Paul Roetzer: Okay, Mike, has AI training hit a wall?
[00:04:40] Mike Kaput: That is the question of the day, of the week, maybe of the month, because this is a topic where we're basically just continuing a conversation that we started last week. And we've got some more news and some more sources on this topic, because more and more people in the AI community, at least [00:05:00] some of them, appear to be asking: are we hitting a wall when it comes to scaling AI and improving AI models? Because right now we're getting more and more reports that major AI model companies are hitting roadblocks in their race to build their next generation of models.
[00:05:19] Mike Kaput: So according to things like recent reporting from Bloomberg and The Information, OpenAI, Google, and Anthropic are all experiencing diminishing returns in their efforts to develop more advanced AI models, despite making massive investments in computing power and data. Last week we talked a little bit about how OpenAI's latest model, which is codenamed Orion, hasn't met the company's performance expectations.
[00:05:48] Mike Kaput: It has particularly struggled with coding tasks. Similarly, Google's upcoming Gemini update is falling short of internal goals. [00:06:00] And Anthropic has actually delayed the release of its anticipated Claude 3.5 Opus model. Now, the root of this problem appears to be threefold, according to all these sources.
[00:06:13] Mike Kaput: First, companies might be running out of high-quality training data, so the internet's freely available content, which powered the first wave of AI models, may no longer be enough to create significantly smarter systems. Second, even modest improvements now require enormous computing resources, which makes it harder to justify the costs.
[00:06:38] Mike Kaput: And third, this kind of long held belief in Silicon Valley that simply scaling up models with more data and more compute would lead to better performance, which is known as scaling laws, is being challenged. So this all has kind of come together in a narrative right now where prominent AI voices are claiming we are hitting a wall when it [00:07:00] comes to AI development.
[00:07:02] Mike Kaput: And, more importantly, we're not on as fast a path to Artificial General Intelligence or AGI as some AI leaders have previously led us to believe. For instance, Margaret Mitchell, Chief Ethics Scientist at Hugging Face, put it to Bloomberg this way, quote, the AGI bubble is bursting a little bit. So Paul, maybe start us off from the top here and walk us through, like, what's going on here.
[00:07:28] Mike Kaput: We've talked about this topic a bunch of times, not just last episode, but throughout the year. Why are these conversations about hitting a wall getting so loud and prominent right now?
[00:07:39] Paul Roetzer: Yeah, so there's a lot to unpack here. And, you know, the one about Anthropic and Claude Opus, and why we haven't seen that one, we're actually going to talk about that; it's the third main topic today, because Dario Amodei and Anthropic did that massive interview with Lex Fridman.
[00:07:58] Paul Roetzer: So we'll get into Dario's [00:08:00] thoughts on this. But, in essence, media reports and some AI antagonists are claiming the scaling laws are slowing down, or plateauing. But many voices inside the labs say there's no end in sight. So we got a tweet from Sam Altman on November 14th, I guess last week, that said: there is no wall.
[00:08:19] Paul Roetzer: We'll put the links to these in; you can go check them out for yourself. We had Oriol Vinyals from Google DeepMind, VP of Research and Deep Learning Lead at Google DeepMind. He replied, what wall, in response to a new benchmark that we'll talk about in a rapid fire item that showed Google has a forthcoming model that is now number one on the benchmark leaderboard.
[00:08:44] Paul Roetzer: And then Miles Brundage, who we talked about on episode 121, who was the former senior advisor for AGI Readiness at OpenAI. So someone who certainly is aware of what OpenAI is doing, but also no [00:09:00] longer has to toe the company line, because he is independent now and was very vocal on his way out, as we talked about in episode 121.
[00:09:08] Paul Roetzer: So he doesn't really have a stake in, you know, continuing to push OpenAI messaging if it's not true. He tweeted: betting against AI scaling continuing to yield big gains is a bad idea; would recommend that anyone staking their career, reputation, money, etc. on such a bet reconsider it. So that being said, it does appear, based on reports, that there have been delays in some of the frontier models that we expected to see in 2024.
[00:09:41] Paul Roetzer: So we could think a Gemini 2, a Claude 3.5 Opus, a GPT-5 or Orion, a Llama 4; like, we kind of assumed we might see all those models. So, a few thoughts here. One, the year isn't over yet, so there's certainly still the possibility we're going to [00:10:00] get smarter, bigger, more generally capable models.
[00:10:03] Paul Roetzer: The labs don't share their model release plans, so while we may have been anticipating these models by year end, they may not have. And then the third, and maybe the most important aspect of this, is these models are complex. They are not traditional software where you just brute-force a bunch of code and release a model that does what you want it to do, and then you fix some flaws after you release it.
[00:10:29] Paul Roetzer: These things don't work like that software. They don't do what you want them to do all the time. And oftentimes, it's not until you train the model that you find the flaws or deficiencies, or that maybe it doesn't do what you wanted it to do as well. And you have to go in and retrain it, or you have to fine-tune it after the fact.
[00:10:50] Paul Roetzer: And so, as these models get bigger, they get more complicated to train, to post-train, to red team. Red teaming, again, is the [00:11:00] idea of testing and evaluating models' vulnerabilities, limitations, risks. So maybe you train this massive thing and then you realize this is too dangerous. Like there's too many risks associated with this thing.
[00:11:11] Paul Roetzer: It has too many emergent capabilities. We can't release this thing. We got to go back and like fine tune this and do more post training to make it safe enough to put out. This is like, I think, kind of what we saw with the advanced voice mode from OpenAI. You have the thing ready, you've done the training, you've done all the testing, but then you realize it's got some capabilities that we cannot release.
[00:11:32] Paul Roetzer: And so we have to now make it safer. So as they get bigger, it's going to be harder to project what's going to happen. And as we'll hear from Dario, I'll kind of walk through what goes into training and preparing these models, and people will realize, like, this isn't a case where you do a training run and 30 days later you just release the thing.
[00:11:51] Paul Roetzer: That is not how these work. So my current bet would be that the labs will continue to push the scaling laws. [00:12:00] They will continue to do more compute, more data, likely with new approaches to maximize performance and capabilities. So the labs are going to keep buying NVIDIA chips to do the training.
[00:12:11] Paul Roetzer: We're going to keep hearing about massive data centers being built. We're going to continue to hear about massive investment in energy infrastructure. That's going to be a major priority of the incoming administration in the United States. It's going to be a major priority of people like Sam Altman to push this.
[00:12:27] Paul Roetzer: The labs and the governments will spend tens of billions of dollars next year on training and building these models. Within two to three years, they will be spending hundreds of billions of dollars to build bigger, more generally capable models. So whether the scaling laws as we have known them remain exactly true or not, I don't think it really matters.
[00:12:49] Paul Roetzer: And I don't think all these headlines about the scaling laws plateauing, or the different, you know, people kind of taking a victory lap who are the general antagonists of the AI models [00:13:00] and the scaling laws: I think those victory laps will be seen as premature in the end. So when we talked about this last week on episode 123, there were a few things I highlighted.
[00:13:12] Paul Roetzer: So one was that a lot of these leaders of the frontier labs, like Sam Altman, Demis Hassabis, they have been very public that they see there needs to be maybe two to three breakthroughs to unlock, like, the true intelligence, powerful AI, AGI, whatever you want to call it; sort of the model that takes like a massive leap forward from what we have today, like a GPT-4.
[00:13:37] Paul Roetzer: And so that has been known. Now the things I highlighted in that episode were reasoning, you know, the O1 model from OpenAI; we're under the assumption we're going to get the full O1 model soon. Multimodal training, where they're not just trained on text, but images and video and audio. The idea that there'll be a symphony of models working together, [00:14:00] that the large models will be kind of like a conductor working with a bunch of smaller models.
[00:14:04] Paul Roetzer: The concept of self play or recursive self improvement, where the models are able to identify their own flaws and kind of fix them as they're going. And then memory, there was actually an interview, I don't think we have it on the list to talk about today, but Mustafa Suleyman did an interview last week and he was talking about memory.
[00:14:21] Paul Roetzer: Maybe we touched on this last week. I don't, I don't remember. I think we did. Ironically, I don't remember. But memory is a huge one, and he thinks it'll be solved by next year.
[00:14:31] Paul Roetzer: Now, the one thing we didn't talk about, Mike, is AI agents. And so that's going to kind of lead us into our second main topic.
[00:14:42] Paul Roetzer: And to get started here, again, we've talked a little bit about how this podcast works, how the planning works; it's a very dynamic process sometimes, up until literally the minute Mike and I get on to record this. And this would be an example of [00:15:00] that. I'm preparing to do my AI Agents and the Future of Agencies keynote on Wednesday, and the deck isn't done yet.
[00:15:10] Paul Roetzer: And so as of like Sunday night, I was still going through all of my research around this idea of AI agents and what they are. And so we weren't sure how much of this we were going to weave into today's conversation. But as I kind of arrived at some personal, like, peace of mind on the topic very late Sunday night, I decided that this probably needed to be a main topic.
[00:15:37] Paul Roetzer: So the concept here is, the issue here, I guess, is earlier this year, a lot of the AI companies, like Google and Microsoft and OpenAI and Salesforce and others, started talking a lot about AI agents. And it started creating a lot of confusion for me, as someone who [00:16:00] obviously follows the space very closely, because I wasn't really clear what exactly they were talking about, like what they were considering agents to be.
[00:16:10] Paul Roetzer: And so historically for me, like when I did the episode 87 AI timeline, we talked about the explosion of AI agents starting next year, right? And I had a very clear picture in my mind of what I believed AI agents to be, based on what they have historically been talked about as. And so the simple concept, the simple definition I have historically used, is that an AI agent is a system that takes actions to achieve goals.
[00:16:37] Paul Roetzer: And so in the idea of an agent: LLMs, like the ones that power ChatGPT, Claude, Gemini, those AI systems answer questions and create outputs by predicting tokens or words. Like, they don't take an action, they just output something to answer a question or write an email or, you know, do an article or whatever, but it's just [00:17:00] predictions of words and tokens.
[00:17:01] Paul Roetzer: So, and we have other generative AI systems that create images and videos and audio, but again, they're just outputting something. Those systems don't take actions, they don't complete a workflow, they don't go through like 10 steps to do something. so when we talk about agents in a traditional sense, the concept was, you give it a goal, it plans and executes to achieve it with no human inputs or oversight.
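To make the distinction Paul is drawing concrete, here is a minimal Python sketch. Everything in it is invented for illustration; no real product's API or behavior is implied:

```python
# Illustrative contrast between a generative model and a traditional agent.
# The functions, "tools," and steps below are hypothetical.

def generate_reply(prompt: str) -> str:
    """LLM-style behavior: predict and return text. Nothing happens in the
    world; the output is just words handed back to the user."""
    return f"Here is a draft responding to: {prompt}"

def email_agent(goal: str) -> list[str]:
    """Agent-style behavior: take a sequence of actions in external systems
    to achieve a goal (simulated here as a log of actions)."""
    steps = [
        "open email tool",
        f"draft message for goal: {goal}",
        "select recipient list",
        "schedule send",
    ]
    return [f"action taken: {step}" for step in steps]

print(generate_reply("write a promo email"))            # output only
print("\n".join(email_agent("send the promo email")))   # actions, not just text
```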
[00:17:27] Paul Roetzer: It's this idea of, like, autonomy. So, an example here would be Google DeepMind's AlphaGo. I know a lot of our, you know, listeners, viewers, have probably watched the AlphaGo documentary. If you haven't, it's great. You know, while we're not here next week with you, go watch the AlphaGo documentary.
[00:17:45] Paul Roetzer: It's a great example where the machine is provided training data to win at the game of Go. It then does all these simulations to learn how to play the game, but then it functions autonomously. [00:18:00] It's just told, basically, to win the game. It does all the planning, it figures out how to do it, analyzes its own moves, it thinks 10, 20, 100 steps ahead of what the human may do.
[00:18:11] Paul Roetzer: And so that was kind of like the traditional idea of an agent. Now, the confusion comes in today because a lot of leading AI companies have been talking about their AI agents as autonomous. And that is largely not the case and can be extremely misleading. And so autonomy actually becomes sort of the sticking point here.
[00:18:35] Paul Roetzer: And the way I talk about this, and I mentioned this last week, is like, in full self-driving and autonomous cars, which Tesla and Waymo and others have been pursuing for well over a decade, the idea of full autonomy is that you don't need a steering wheel or pedals in the car. The human just gets in the car and says, I would like to go to the office.
[00:18:57] Paul Roetzer: And the car figures out [00:19:00] everything else. The human has no involvement in anything other than the goal setting. And then the machine executes the goal. And so the example I gave when we talked about this last year, or this year, is the idea of sending an email in HubSpot. If I want to send an email in HubSpot, it's a minimum of 21 clicks for me, as a human, to send an email.
[00:19:23] Paul Roetzer: The idea of an AI agent that's autonomous would just be me, the human, saying, hey, go send this email, here's what I want you to do; like, provide it some parameters and a goal. It then goes and does the 21 steps with no human oversight. So this is the problem: we've been seeing brands talking about their agents as autonomous when they are not.
[00:19:49] Paul Roetzer: They're not even close to autonomous. And so this is why I had created this human-to-machine scale years ago. It's this idea that the technology and the tasks [00:20:00] have levels of autonomy. There's kind of, like, zero, which is it's all us, we're telling it what to do, and then there's full autonomy at the end, where the machine does everything.
[00:20:10] Paul Roetzer: The human provides no real inputs or oversight. It's not dependent upon the human for anything. And so just to give you a sense, on the Salesforce Agentforce page (Agentforce is all the rage; that's what Salesforce is pushing everything into), they define an Agentforce agent as a proactive, autonomous application that provides specialized, always-on support to employees and customers.
[00:20:37] Paul Roetzer: They're equipped with necessary business knowledge to execute tasks according to their specific role. Now, they're calling it an autonomous application, and yet, on that same page, it says the user defines the role, connects the trusted data sources, defines the actions, sets the guardrails, determines the channels where they connect.
[00:20:59] Paul Roetzer: That [00:21:00] sounds like a lot of human involvement and oversight to me for something that's supposed to be autonomous, so you can understand where the confusion comes in. Then we go over to Microsoft. October 21st of this year, less than a month ago, the headline of their own blog post, New Autonomous Agents Scale Your Team Like Never Before.
[00:21:20] Paul Roetzer: If I'm a marketer or a business person and I see that headline, I think I'm going to assume that we are at the age of autonomous agents, right? Like, that's pretty definitive. In that post, it says: we're announcing new agentic capabilities that will accelerate these gains and bring AI-first business process to every organization.
[00:21:38] Paul Roetzer: First, the ability to create autonomous agents with Copilot Studio will be in public preview next month. Great, I'm the CEO of a company. Autonomous agents are here in November of 2024. Like, I don't need agencies. Maybe I don't even need employees. Like, autonomy has arrived. Second, we're introducing 10 new autonomous agents in Dynamics 365 [00:22:00] to build capacity for every sales, service, finance, and supply chain team.
[00:22:04] Paul Roetzer: They then go on to provide some context, which would actually be quite helpful if they hadn't already made all these promises in the headline. So they say: Copilot is your AI assistant. It works for you. And Copilot Studio enables you to easily create, manage, and connect agents to Copilot. Now, in here, remember, these are supposed to be autonomous.
[00:22:28] Paul Roetzer: Think of agents as the new apps for an AI-powered world. Every organization will have a constellation of agents. Now, this is the real key that maybe should have been closer to the headline: ranging from simple prompt-and-response to fully autonomous. They will work on behalf of an individual, team, or function to execute and orchestrate business processes.
[00:22:49] Paul Roetzer: Then they have another blog post same day, unlocking autonomous agent capabilities with Microsoft Copilot. And in that blog post, agents are expert systems that operate [00:23:00] autonomously on behalf of a process or a company. They also have another one, Unveiling Copilot Agents Built with Microsoft Copilot to Supercharge Your Business.
[00:23:10] Paul Roetzer: Now, in this one they talk about how they come in all shapes and sizes, and they actually don't get into the autonomy thing. So that's Microsoft. They're maybe, like, the most guilty party here in terms of claiming autonomy. Google has actually done a pretty good job of not claiming autonomy, per se.
[00:23:29] Paul Roetzer: So Sundar Pichai, May 2024, so this is right around the Google I/O conference. He defines them as intelligent systems that show reasoning, planning, and memory. They're able to think, quote unquote, multiple steps ahead and work across software and systems, all to get something done on your behalf, and most importantly, under your supervision.
[00:23:49] Paul Roetzer: They're actually, like, very directly not saying it's purely autonomous. Then Thomas Kurian, the CEO of Google Cloud, September 2024, so this [00:24:00] is two months ago: AI agents are intelligent systems that go beyond simple chat and predictions to proactively take actions. That's not bad. Like, again, Google's done a pretty nice job here of not over-promising autonomy.
[00:24:11] Paul Roetzer: NVIDIA, in October 2024, they say in a blog post, "What Is Agentic AI" is the title: AI chatbots use generative AI to provide responses based on a single interaction. A person makes a query, the chatbot uses natural language processing to reply. The next frontier of AI is agentic AI, which uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems.
[00:24:41] Paul Roetzer: So they're kind of alluding to autonomy is coming. Then Jensen Huang, the CEO, at a conference last week, the NVIDIA AI Summit in Japan, this is what he said about AI agents: The first AI is basically a digital AI worker. [00:25:00] These AI workers can understand, they can plan, and they can take action. Sometimes the digital AI workers are being asked to execute a marketing campaign, support a customer, come up with a manufacturing supply chain plan, help write software, maybe be a research assistant, a lab assistant in the drug discovery industry.
[00:25:19] Paul Roetzer: Maybe this agent is a tutor to the CEO. These AI, these digital AI workers, we call them AI agents, are essentially like digital employees. Now, I actually really like the direction Jensen goes here, and so I'm going to finish this excerpt because I think it's very representative of the reality. Just like digital employees, you have to train them.
[00:25:38] Paul Roetzer: You have to create data to welcome them to your company, teach them about your company. You have to train them for particular skills, depending on what function you would like them to have. You evaluate them after you're done training them to make sure that they learned what they're supposed to learn.
[00:25:54] Paul Roetzer: You guardrail them to make sure they perform the job they're asked to do and not the jobs they're not asked to do. [00:26:00] And of course, you operate them, you deploy them. That does not sound like autonomy to me. That sounds very clearly like the human is essential in this process. okay, so then they interact with other agents, they have the ability potentially to interact with other agents, to work as a team to solve problems.
[00:26:18] Paul Roetzer: Agentic AI is transforming every enterprise using sophisticated reasoning and iterative planning to solve complex, multi step problems. so let me go into that, okay. So what makes agentic AI so powerful, this again is still Jensen talking, is its ability to turn data into knowledge and knowledge into action.
[00:26:37] Paul Roetzer: A digital agent, in this example, can educate individuals with insights from a set of informationally dense research papers. None of these agents can do 100 percent of anyone's task, anybody's job. None of the agents can do 100%. However, all of the agents will be able to do 50 percent of your work. This is the [00:27:00] great achievement.
[00:27:00] Paul Roetzer: Instead of thinking about AI as replacing the work of 50 percent of people, you should think that AI will do 50 percent of the work for 100 percent of the people. By thinking that way, you realize that AI will boost your company's productivity. You know people have asked me, is AI going to take your job?
[00:27:19] Paul Roetzer: This again is Jensen still, and I always say, because it's true, and I'm, I'm the one who's gotten ridiculed for saying this, but now Jensen is saying it again. AI will not take your job. AI used by somebody else will take your job. And so be sure to activate using AI as soon as you can. So the first is digital AI agents.
[00:27:39] Paul Roetzer: Then, one other piece of context from Jensen, from spring of this year, following an earnings call, in a CNBC interview. He said: the world's enterprise software platforms represent approximately a trillion dollars. These application-oriented, tools-oriented platforms and data-oriented platforms are all going to be revolutionized by [00:28:00] these AI agents that sit on top of it.
[00:28:02] Paul Roetzer: And the way to think about it is very simple. Whereas these platforms used to be tools that experts could learn to use. In the future, these tool companies will also offer AI agents that you can hire to help you use these tools to help reduce the barrier. Now, someone who may have already been working on this or listened to this quote as he was building it is Dharmesh Shah, our friend that I talked about on last week's episode.
[00:28:27] Paul Roetzer: Because Dharmesh has built Agent.ai, where literally the call-to-action button is "hire," or like "add to team," an agent. And there's over a hundred agents you can go look at. So go look at Agent.ai if you want to kind of understand how this is going to work in the near term. So Dharmesh, in September, did a Future of AI Agents keynote at Inbound.
[00:28:48] Paul Roetzer: And, to Dharmesh's credit, they did a really good job here of not over-promising autonomy. He described it as software that uses AI and tools to accomplish a goal requiring multiple steps. [00:29:00] And he specifically said, some agents can have the ability to run autonomously, some have executive planning capabilities, but those are niceties, not necessities, to be an AI agent.
[00:29:12] Paul Roetzer: So, as I take a breath here: the key thing to understand about AI agents, forget all the kind of confusing, different messaging coming from these different brands, is that at the end of the day, an AI agent takes actions to achieve goals. Now, there is a spectrum of autonomy. There is not going to be an agent in the near future where the human just gives it the goal, and it just goes and does everything.
[00:29:39] Paul Roetzer: And that's it, the human has no involvement beyond that, no inputs, no oversight. So, think of autonomy as, again, this spectrum. It is not binary. It's not that something is autonomous or not; it can have kind of a level of autonomy. That's where the human-to-machine scale came in, with the different levels of autonomy.
[00:29:57] Paul Roetzer: Because if you think about what needs to happen in [00:30:00] an AI agent for it to work: someone has to set the goals. That is the human. Someone has to then do the planning of how this agent is going to function. Then there's the execution. The plan is in place. It knows what to do. Then it executes. That's where the autonomy today lives.
[00:30:17] Paul Roetzer: What they're calling autonomy is the execution step of the agent. Then there's the iterating or improving, like knowing it's doing something wrong and fixing it. Then there's analyzing the performance. So if you think about kind of those five steps, goals, planning, executing, improving, analyzing, the autonomy that Microsoft and others are talking about is basically the executing phase.
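As a rough sketch of those five phases, and of where the human still sits in each, here is a hypothetical outline in Python. The structure and function names are ours for illustration, not any vendor's implementation:

```python
# Hypothetical skeleton of the five phases: goals, planning, executing,
# improving, analyzing. Only the "executing" phase is what most vendors are
# currently marketing as autonomy; the other phases remain largely human work.

def run_agent_workflow(goal: str) -> None:
    # 1. Goals: set by the human.
    print(f"goal (human-defined): {goal}")

    # 2. Planning: today, usually a human decomposing the workflow into steps.
    plan = [f"step {i} toward '{goal}'" for i in range(1, 4)]

    # 3. Executing: the machine's part; the advertised "autonomy" lives here.
    results = [f"executed {step}" for step in plan]

    # 4. Improving: spotting failures and adjusting, often human-triggered.
    failures = [r for r in results if "error" in r]
    if failures:
        print(f"flagged for improvement (human review): {failures}")

    # 5. Analyzing: a human reviews performance and refines the agent.
    print("run log (human analysis):")
    for entry in results:
        print(f"  {entry}")

run_agent_workflow("send the monthly newsletter")
```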
[00:30:40] Paul Roetzer: So with every AI agent, there are varying levels of autonomy, and varying levels of complexity, from it doing a simple, like, five-step process to 200 steps with no error rate, which basically doesn't exist today. So there's these levels of complexity, [00:31:00] there's levels of its ability to understand, reason, plan, and remember, like memory.
[00:31:06] Paul Roetzer: They, in theory, can learn and adapt and improve and make decisions, but not all of them can. They can interact with tools like search. So ChatGPT can now go and use search, calculators, Python code; like, the ability to interact with tools and create other content, interact with other agents. They have data sources.
[00:31:25] Paul Roetzer: They have guardrails and controls. They can be multimodal or not. They can be interpretable or not, meaning I can look and see why it did what it did, the steps it took, and they can engage with humans through natural language. So every one of those characteristics I just outlined, they're not uniform across agents.
[00:31:43] Paul Roetzer: Every one of them can be a variable within an agent. So we're basically using this AI agent term to encompass every form of agent that can take an action. But there's like a dozen characteristics [00:32:00] that will all vary depending on the kind of agent you're interacting with. So my main takeaway for people here, to kind of summarize this: they are nowhere near autonomous.
[00:32:11] Paul Roetzer: If you hear about AI agents and you think, oh my gosh, they're taking my job next year, that is not happening. Like, if you realize all the things that have to go into making an agent work, goal setting, planning, building it, monitoring it, improving it, that is almost always the human's job right now. So I would actually be looking at this as the opposite of being threatened by them.
[00:32:36] Paul Roetzer: I would look at it in my company and say, well, I'm going to go play with Agent.ai today and try and figure out how to build agents. It's a waitlist right now, but once I can build agents on Agent.ai, I'm going to start building agents that are really valuable to people. If I have access to Copilot Studio, I can go build agents for my team to do things more efficiently.
[00:32:55] Paul Roetzer: The ability to build these agents, which mostly won't require coding [00:33:00] ability, is a massive superpower. So if you own an agency, if you are a brand marketer, if you are an accountant, a lawyer, I don't care what you do, think about the things that require multiple steps, that are repetitive, data driven processes in your business.
[00:33:18] Paul Roetzer: You will have agents for all of those things. It may take years. You can be the one that figures out how to build those things. The closest parallel right now is custom GPTs, right? Since that's what you're doing: you're kind of, like, building AI agents, in a way, to do a thing. And so you start to imagine the value of building a bunch of custom GPTs to take all of your processes, all these like 10-, 20-step processes, and build something that can do those.
[00:33:48] Paul Roetzer: Yeah, it's going to save a ton of time, drive efficiency, productivity, creativity. But someone's got to envision them, give them goals, plan them, build them, improve them. That is humans for the foreseeable [00:34:00] future. So, okay, I'm going to stop there. Hopefully that all makes sense. I think people need to think of this as an opportunity, not a threat.
[00:34:11] Paul Roetzer: Like, that's kind of my main takeaway right now.
[00:34:14] Mike Kaput: Hearing you outline that, it really does strike me with more clarity than I think I had in the past of just, I've been racking my brain, like, what skills are going to be really valuable moving forward outside of just nebulous, like, get good with AI, right?
[00:34:30] Mike Kaput: And being a manager, creator, and/or shepherd of AI agents immediately strikes me, like you said, as something super, super valuable that will be an obvious skill need in the next one to two years.
[00:34:48] Paul Roetzer: Yeah, I think so. Like, I think you could start to see resumes or start to see job applications where building agents and custom GPTs is a desirable capability across any [00:35:00] industry.
[00:35:00] Paul Roetzer: Now, obviously, like marketing, sales, service may move faster, or product development, things like that. They're going to be ahead of the curve looking for people with those capabilities. If you want to, like, bolster your resume, don't just take a class; like, the way you used to boost your resume or your career opportunities was go take a class, go get a certification.
[00:35:20] Paul Roetzer: That still matters. Build agents, build custom GPTs in your personal life, build them to help with your own job. And then when you go into those interviews, say, yeah, actually, I've managed to open up 50 hours per month because I built five agents that do these things I used to do, and it enabled me to go do these other things.
[00:35:41] Paul Roetzer: and like, I think the people who are proactive within their own companies are going to become more valuable there. But if you're like, I need to move, I need to go get a career opportunity with a company that's more AI forward than where I'm at, go build some agents and improve your ability to bring that to another organization.
[00:35:59] Paul Roetzer: Like [00:36:00] that's the opportunity. Or if you have an entrepreneurial mindset, think of all the agents you can build. Like, they truly are going to be part of your team. Like that Jensen interview, or the presentation, the Japan one, I would recommend people watch that, and then I would go look at Agent.ai

[00:36:13] Paul Roetzer: and see how Dharmesh and the team are kind of positioning these things as team members. And that's it: you basically add agents to your team to do things. And org charts, one to two years out, you're going to see AI agents just baked right into the org chart.
[00:36:31] Mike Kaput: And as you're saying that, I also, and it's top of mind because I'm doing a talk later this week at a graduate school. You know, everyone always asks, like, what's your advice for job interviews or skills or career advice in the age of AI? And what you just said, with even custom GPTs and/or agents, is huge.
[00:36:50] Mike Kaput: But it's so easy to do that I'd also be considering every interview I go into creating one specifically for the company I'm interviewing with. It's not hard to figure out what a company's [00:37:00] broad marketing strategy is, say, if you're in marketing, for instance, from their website. So you could pretty easily extrapolate what are they likely to be spending a ton of time on with a little research and create something valuable to show them.
[00:37:14] Paul Roetzer: Yeah, and I like that idea a lot. And you can even, like, one of the AI agents that Dharmesh and the team built on Agent.ai is like a company profile thing; it'll go build a company profile. I think there's one for earnings calls. So, I don't know, I mean, I really think that if people get through the abstract nature and uncertainty of what an agent is, and just think of it as something that can basically take actions for you. And you start thinking about all those repetitive, data-driven things you do and start thinking, maybe I can build an agent for that.
[00:37:51] Paul Roetzer: And again, it may not be today that you can go do it, but it might be first quarter of next year. And so if you [00:38:00] can be the one on your team that just starts building agents internally that other people can use, again, it's going to be so valuable. And so many people are going to think it's harder than it is because it's not going to require coding ability.
[00:38:14] Paul Roetzer: And it's, it's almost hard, like, honestly, I've spent basically my last, like, ten days of my life immersed in what is an AI agent and like, and I've been thinking about it for years, but very intensely for, like, the last week and a half. And I'm having trouble, honestly, wrapping my mind around how big the opportunity is to be the one that learns how to build these things.
[00:38:36] Paul Roetzer: Whether it's in your team or if you're an agency or, if you're an independent developer, People are going to need help doing this. Like, this is a massive consulting opportunity. It's a huge opportunity internally to create a career path for yourself. Like, it's, it's big. Like, it's real big.
[00:38:56] Mike Kaput: What's also big is our third main [00:39:00] topic.
[00:39:00] Mike Kaput: And by big, I mean literally, because Lex Fridman just dropped an insanely long interview, five hours long, with key leaders at Anthropic, including CEO Dario Amodei; Amanda Askell, who works on fine-tuning and AI alignment at Anthropic; and co-founder Chris Olah, who works on mechanistic interpretability at the company.
[00:39:25] Mike Kaput: And as you can expect, they discussed a lot of different things in the time they had. Amodei talked a lot about the scaling law limitations we just discussed. He talked about the possibility that we may run out of data or hit a ceiling, in terms of how AI models can learn about the world. He talked a lot about Anthropic's Responsible Scaling Policy, which is designed to address the risks of AI systems.
[00:39:52] Mike Kaput: Askell talked about the importance of creating a good character and personality for Claude, their model, and how this is [00:40:00] done through a process called character training. Olah discussed basically how the company aims to reverse-engineer neural networks to figure out what's going on inside. That's that term mechanistic interpretability, what that means.
[00:40:13] Mike Kaput: And of course, this is just a very small sample of what they covered in five hours. But, like we've talked about before, these types of interviews are really important to stay on top of for a couple of reasons. So, one is that the best way to understand what is shaping the future of AI is to listen to the handful of people who are actually doing it, which is actually a relatively small number.
[00:40:38] Mike Kaput: So, listen to what they tell you in interviews like this. What's also really interesting is, number two, these interviews are actually kind of increasingly fulfilling the role of formal company messaging. We're increasingly seeing AI founders, and startup founders generally, quote unquote "go direct"

[00:40:56] Mike Kaput: to, say, popular podcasts to get their viewpoints and [00:41:00] perspectives out there. So these interviews may actually be kind of the source of truth you get referred to on things like model release dates, product roadmaps, company viewpoints, etc. It's actually really funny: instead of responding to Bloomberg's requests for interviews in one of the stories we cited in the "are we hitting a wall" segment, Anthropic literally just pointed them to this podcast, multiple times, in that Bloomberg article that we've cited, that we were talking about.
[00:41:29] Mike Kaput: It literally says, Anthropic, in response to our questions, pointed to the 5 hour podcast with Lex Fridman. So, Paul, I know you've found a lot to pay attention to in this interview. Could you maybe share with us some of the most important highlights?
[00:41:44] Paul Roetzer: Yeah, luckily I had flights to and from San Diego last week, so I had, you know, 12 hours of travel to consume this at 2x speed.
[00:41:54] Paul Roetzer: So I cut through the vast majority of it. I'm just going to call out a couple of things. So I [00:42:00] referenced earlier on the scaling laws: Dario does not see it as an issue. You know, he thinks synthetic data is going to be a big thing. He thinks the reasoning path that OpenAI and others are taking is going to be a thing.
[00:42:12] Paul Roetzer: he said, I think most of the frontier companies, I would guess, are operating in roughly 1 billion scale, meaning a billion dollars for a training run, plus or minus a factor of three. Those are the models that exist now, or are being trained now. I think next year we're going to a few billion, and then 2026 we may go to above 10 billion for an individual training, for a single model.
[00:42:37] Paul Roetzer: And probably by 2027 there are ambitions to build 100 billion dollar clusters. And I think that will actually happen. So he certainly is a believer that this is going to continue. The one section I found really interesting, and I'm going to read this excerpt because I think it's really helpful, is the complexity of training these big models that I referenced earlier. [00:43:00]
[00:43:00] Paul Roetzer: So Lex says: what is the reason for the span of time between, say, Claude Opus 3.0 and 3.5? What takes that time, if you can speak on that? Dario says: so there's different processes. There's pre-training, which is just kind of the normal language model training, and that takes a very long time. Again, that's where you take all the content, all the text, everything, and you train these models on

[00:43:24] Paul Roetzer: that source data. That uses, these days, tens of thousands, sometimes many tens of thousands, of GPUs, NVIDIA chips, for training them, or, you know, different platforms, often training for months. So that initial training process, pre-training, can take months and tens of thousands of NVIDIA chips. Then he says: there's then a kind of post-training phase where we do reinforcement learning from human feedback, as well as other kinds of reinforcement learning.
[00:43:55] Paul Roetzer: And again, that's humans telling the model, this is a good output, that's a bad [00:44:00] output. And they're trying to kind of tune it to do what they, the humans, think is good, basically. And so you hire people, and they literally work with these models to fine-tune these outputs using reinforcement learning.
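As a toy illustration of the human-feedback loop Dario is describing, here is a from-scratch sketch. Real RLHF trains a separate reward model on human preference data and then optimizes the language model against it with algorithms like PPO; this sketch only captures the core idea of upweighting outputs humans rate well:

```python
import random

# Candidate outputs paired with the score a human labeler might give them.
# In real RLHF these labels would train a reward model; here we use them
# directly as the reward signal.
candidate_scores = {
    "helpful, accurate answer": 1.0,
    "vague answer": 0.2,
    "confident but wrong answer": -1.0,
}

# The "model," reduced to preference weights over its possible outputs.
weights = {reply: 1.0 for reply in candidate_scores}

def sample_reply() -> str:
    """Sample an output in proportion to the model's current weights."""
    replies = list(weights)
    return random.choices(replies, weights=[weights[r] for r in replies])[0]

# Feedback loop: reinforce outputs humans rate well, suppress the rest.
for _ in range(2000):
    reply = sample_reply()
    reward = candidate_scores[reply]  # stand-in for a human thumbs up/down
    weights[reply] = max(0.01, weights[reply] * (1 + 0.05 * reward))

print(max(weights, key=weights.get))  # converges to the human-preferred reply
```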
[00:44:19] Paul Roetzer: It often takes efforts to get it right. Models are then tested with some of our early partners to see how good they are, and they're then tested both internally and externally for their safety, particularly for catastrophic and autonomy risks. So we did, we do internal testing according to our responsible scaling policy.
[00:44:39] Paul Roetzer: And then, he says, we have an agreement with the U.S. and the UK AI Safety Institute, as well as other third-party testers in specific domains, to test our models for other risks: chemical, biological, radiological, and nuclear. We don't think that models pose these risks seriously yet, but every new model, we can evaluate to see if we're [00:45:00] starting to get close to some of these more dangerous capabilities.
[00:45:03] Paul Roetzer: So those are the phases, and then it just takes some time to get the model working in terms of inference and launching it in the API. So there's lots of steps to actually making a model work. So again, why don't we get GPT-5, like, on December 1st, like we thought we might? Well, because at any one of these steps, they could have run into obstacles.
[00:45:23] Paul Roetzer: Now, is it scaling laws they're running into? It may have nothing to do with the scaling laws. It may just be they're getting bigger and more complex. And these different steps just take longer and they're finding more and more kind of hiccups or weaknesses or threats or whatever it may be within the models.
[00:45:39] Paul Roetzer: And they're not going to tell us that stuff. So the media is going to write whatever they write; it may have nothing to do with the reality of what's going on. And then the other one I'll share from Dario was about AGI, which he prefers to call powerful AI, but whatever. He said: if you just eyeball the rate at which these capabilities are increasing, [00:46:00] it does make you think that we'll get there by 2026 or 2027.
[00:46:05] Paul Roetzer: Again, lots of things could derail it. We could run out of data. We might not be able to scale clusters as much as we want. But he doesn't really see any obstacles that aren't able to be overcome. And then, I won't dive into Amanda's whole section, but I love Amanda's interview, because she's the person that's basically building the character of Claude, the personality behind Claude.
[00:46:26] Paul Roetzer: I would listen to that; like, even if you just want to jump ahead and listen to her, it's so intriguing, like, how she thinks about prompting, character development, the system prompt that goes into Claude that kind of guides its behavior. It's really intriguing; it's kind of a very approachable, non-technical

[00:46:44] Paul Roetzer: overview. And then the interpretability, mechanistic interpretability, is a more dense, technical topic. But the reason why this matters is because, and we've said this before, if you're kind of new to these models: we don't [00:47:00] know why they do what they do. If it starts misbehaving, or if it has some risk that's identified, or has some emergent capability that wasn't expected when it comes out of training, they can't just go look at the code and be like, oh, there's the line that's causing this.
[00:47:15] Paul Roetzer: That is not how these things work. They function much closer to, like, the human brain, where you just have neurons and they're firing and doing all kinds of things. So if you ask where memories are stored in the human brain, or how memories are created, or what dreams are, or, like, why did you have that thought? Why did you say that word?
[00:47:30] Paul Roetzer: Why did you say that word? You can't just go into the human brain and pick that thing out or like find the exact neuron that fired or neurons that fired together. That's how these things work. They, they just have all these parameters and they do all these things, basically like the human brain has the neurons.
[00:47:48] Paul Roetzer: And so, like, the interpretability is trying to understand why they do what they do, how they do what they do. And so it's a very important, like, bigger-picture topic. If you like the more [00:48:00] scientific, technical side of this, that would be a great listen for you. If that's overwhelming to you, then just don't stick around for the last hour and a half.
[00:48:08] Mike Kaput: Yeah, I think what's also notable here is it's proof positive of exactly what you were saying in the first segment, that despite everyone shouting their head off about us hitting a wall, there are many, many people, many of whom are deep within the actual AI labs that do not appear to believe this. Yeah.
[00:48:27] Paul Roetzer: And again, if they were selling something to us that was like a future sci-fi, ten-years-out thing, you could, like, question their motives. We'll know in like three to six months whether they're full of crap or not. And like, if you're Sam Altman or Dario Amodei and you're staking your entire reputation and career on these

[00:48:50] Paul Roetzer: being right, I feel like you might hedge a little bit more if we were all going to know in three months that you were lying about [00:49:00] it all, or you were just being misleading. Like, this is near-term stuff. We'll know when the next models come out whether we hit scaling law walls or not. And they don't think we did.
[00:49:12] Paul Roetzer: And so, I don't know, like I said earlier, I just feel like there's probably some elements of truth to it, but I would not overreact to it. I wouldn't bet against these things continuing to get bigger and better.
[00:49:27] Mike Kaput: Alright, let's dive into this week's rapid fire topics. So first up, OpenAI is set to release an AI agent tool of their own in January, according to Bloomberg.
[00:49:38] Mike Kaput: This new tool, which is codenamed Operator, will be able to perform complex tasks on behalf of users, from things like writing code to booking travel arrangements, all by directly controlling a computer. The tool will be released as both a research preview and through OpenAI's developer API. In a [00:50:00] recent AMA on Reddit, CEO Sam Altman said, quote, We will have better and better models, but I think the thing that will feel like the next giant breakthrough will be agents.
[00:50:09] Mike Kaput: To that end, Operator is apparently just one of multiple agent-related research projects that OpenAI is working on, according to the sources interviewed by Bloomberg. So Paul, we've known everyone's working on agents, we just talked a bunch about agents, but OpenAI formally getting into the game, and quite soon, seems like potentially a big deal.
[00:50:31] Paul Roetzer: Yeah. And this is more like computer use, like we talked about with Anthropic's Claude. I think we talked last week about Google working on something like this. This is more in the realm of what traditional AI agents were considered within the labs. Like, I give you a goal, say, book my trip to Florida, and you have the ability to use my computer and other tools, and you're able to go and do the thing. I rely on you, I trust you to have my credit card, I trust you to have the [00:51:00] login to the apps you're going to need.

[00:51:01] Paul Roetzer: And you just go fulfill the goal. You take actions to fulfill a goal. So, again, this is kind of why the confusion exists: this is the traditional AI agent that's far more capable, more autonomous. What we're instead getting are AI agents where I determine the 25 steps you're going to take. I tell you what steps to take, and you just go do the thing.
[00:51:22] Paul Roetzer: It's more like automation. But yes, they're all working on this. We've known since 2017 that they've been working on this. And I think next year we'll probably get a bunch of cool demonstrations, but I do not expect that in your consumer life or in your business life you're going to be using these kinds of truly more autonomous AI agents that take over your screen and do things.

[00:51:47] Paul Roetzer: Apple is working on this kind of stuff. So I think next year you'll start to experience it, but this will not, like, be life changing for you in 2025.
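As a rough illustration of the distinction Paul is drawing, here is a hedged sketch in Python, with assumed names and structure (not OpenAI's Operator or any vendor's actual API), contrasting today's "I define the 25 steps" automation with a goal-driven agent loop:

```python
# Illustrative sketch only: scripted automation vs. a goal-driven agent loop.
from typing import Callable, Optional

def scripted_automation(steps: list[Callable[[], None]]) -> None:
    # Today's common "AI agents": a human predefines every step;
    # the system just executes them in order.
    for step in steps:
        step()

def agent_loop(
    goal: str,
    choose_action: Callable[[str, Optional[str]], str],
    act: Callable[[str], str],
    is_done: Callable[[str, Optional[str]], bool],
    max_turns: int = 25,
) -> None:
    # The more autonomous vision: given only a goal, the model picks each
    # next action (click, type, call a tool) based on what it observes.
    observation: Optional[str] = None
    for _ in range(max_turns):
        if is_done(goal, observation):
            break
        action = choose_action(goal, observation)  # the model decides, not me
        observation = act(action)                  # e.g., control the computer

# Tiny demo of the scripted path:
scripted_automation([lambda: print("open travel site"), lambda: print("fill form")])
```

The difference is who holds the plan: in the first function the human does, in the second the model does, which is why the second requires the trust Paul describes (credit cards, logins) before it's useful.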
[00:51:58] Mike Kaput: All right, some other [00:52:00] OpenAI news. Co-founder Greg Brockman has returned to the company after a three-month sabbatical. So in an internal memo to staff last week, Brockman announced he was officially starting work again.
[00:52:13] Mike Kaput: He also said he'd been working with Sam Altman to create a new role for him, which is focused on tackling major technical challenges. Back in August, Brockman said he was taking his first break since helping start OpenAI nine years ago. Now, Paul, no doubt Greg deserves a break, but there's more to the story than just that.

[00:52:37] Mike Kaput: Because in past episodes, we had talked about some drama, some reports at OpenAI that some people kind of saw Brockman's leadership style as perhaps problematic or counterproductive. Is this all just kind of a quieter way of making sure Greg stays in the fold, unlike all these other executives, while shifting him away from managing teams?
[00:52:58] Mike Kaput: Like, what's going on here?
[00:52:59] Paul Roetzer: I [00:53:00] have no idea. I mean, the whole idea of a more technical role probably implies, like, hey man, you're not going to be the president anymore. And maybe that's what he wants. Maybe he just wants to get back in. I think he liked being involved on the technical side.

[00:53:15] Paul Roetzer: With all the stuff they've got coming, with their o1 release, and whatever Orion is, and Sora, you've got all these technical things. Maybe that's where he wants to be, or maybe it's just where they've decided is best. So I guess we'll just have to wait and see what role he ends up being involved in, but I can certainly see him taking a significant role as we move forward, because they've got a lot going on.
[00:53:41] Mike Kaput: So we got some research from earlier this year on generative AI's impact on jobs, and this research is kind of getting some new life and some new buzz. So the research we're talking about is from February of 2024, but it was highlighted just this week in Harvard Business Review, because the research is now going to be featured in the [00:54:00] peer-reviewed journal Management Science.
[00:54:03] Mike Kaput: This research paper is called, quote, Who is AI Replacing? The Impact of Gen AI on Online Freelancing Platforms. And it's notable because it's a comprehensive study that analyzed over 1.3 million job postings from a major freelancing platform before and after the introduction of ChatGPT. The data they looked at runs from July 2021 to July 2023.
[00:54:30] Mike Kaput: And they actually found that the introduction of ChatGPT led to a 21 percent decline in demand for certain types of freelance work compared to jobs requiring manual skills. Now, this impact was not uniform across all the categories. Writing-related jobs were hit hardest; they experienced a 30 percent drop in demand.
[00:54:50] Mike Kaput: Software and web development saw a 21 percent decline. Engineering-related posts dropped by about 10 percent. And after the release of AI image generation [00:55:00] tools, demand for graphic design and 3D modeling work fell by approximately 17 percent. Now, it's not all bad news here. The study did find that the remaining job postings in AI-impacted categories actually saw slight increases in budget and complexity, suggesting that while simple tasks might be automated, there was still demand for more sophisticated work that combines human creativity with AI tools.

[00:55:27] Mike Kaput: Now, Paul, obviously this is data from quite a while ago. It's a study that's probably going to be talked about quite a bit more, given that it's going to appear in Management Science, but we have to keep in mind it was published in February 2024. However, it does seem to highlight some interesting trends, which is that there was a pretty immediate and material impact on what types of work people wanted to hire for once something like ChatGPT came out.
[00:55:56] Paul Roetzer: Yeah, I'm always happy to see this kind of research. I don't know how meaningful [00:56:00] it is, honestly, mainly because the time period that they pulled the data from ends in July 2023, which is four months after GPT-4 came out, when there was almost no enterprise adoption. So, if anything, it might be an early sign of something that has since gotten far worse.

[00:56:21] Paul Roetzer: Like, I could imagine these numbers are much, much higher now for those traditional roles, because honestly, by summer of 2023, I don't really know too many enterprises that were using it to replace those roles. So I guess what I'm saying is, I would love to see some updated data through summer of 2024, whether those trends continued or grew.
[00:56:46] Paul Roetzer: My assumption is they would have. My hypothesis would be that they grew significantly in terms of the impact on those jobs and postings. But the other thing I would be fascinated to see is what [00:57:00] other jobs emerged, because I would guess that there are tons of postings for things like AI agent building and AI training and all these other things.

[00:57:09] Paul Roetzer: And again, the opportunity, or the other viewpoint here, is that if you're someone in these roles that is being impacted, or may be impacted, or the trends show you may be, you should really be thinking about the future. Study where the emerging roles are, because AI agent training, gen AI training, all of those things, your skills are transferable.
[00:57:35] Paul Roetzer: It's not the end of the world. You just have to look at where the opportunities are going to be and move in that direction. I'm not saying give up on your career path and what you went to school for. But the markets are going to shift, and there are going to be new jobs that emerge that people didn't go to college for.

[00:57:54] Paul Roetzer: And so maybe that's kind of what you're going to be doing. Again, think about that post-training [00:58:00] example from Dario, the importance of reinforcement learning from human feedback. Where does that come from? It comes from experts in their fields. They need experts in writing, in medicine, in biology, in math, and in business consulting. They need the experts to teach these things how to do what they do.
[00:58:18] Paul Roetzer: And there's no end in sight for that. In fact, they're going to be paying more money for those experts. So I don't know, like, again, this study, is it super reliable data? It's old data. That's for sure. But it's directionally worth paying attention to, and I think, you know, maybe an impetus for people to be a little bit more proactive in figuring out where their career moves might come from next.
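For a sense of what that expert feedback becomes mechanically, here is a minimal sketch of the pairwise preference loss commonly used in reward-model training for RLHF. The scores are placeholder numbers, and none of this is any particular lab's code:

```python
# Minimal sketch: Bradley-Terry style preference loss used in reward modeling.
# In practice these scores come from a reward model scoring the response an
# expert preferred vs. the one they rejected; here they are made-up tensors.
import torch
import torch.nn.functional as F

reward_chosen = torch.tensor([1.7, 0.4])    # scores for expert-preferred answers
reward_rejected = torch.tensor([0.9, 0.6])  # scores for rejected answers

# Loss = -log sigmoid(r_chosen - r_rejected): pushes the model to rank
# the expert-preferred answer higher than the rejected one.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss)
```

Every expert comparison becomes one of these pairs, which is why labs keep paying domain experts to generate them.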
[00:58:42] Mike Kaput: Alright, next up, Google's most recent version of their Gemini model is now at the top of a popular AI leaderboard. This new model is an experimental model called Gemini-Exp-1114. And it now beats out every other model on [00:59:00] the popular Chatbot Arena leaderboard, which we've talked about before. It uses Elo ratings and human rankings to rank over 150 of the most popular AI models.

[00:59:11] Mike Kaput: The organization behind the leaderboard made the announcement in a post on X on November 14th. And in that post they said the new Gemini model jumped from rank number three to number one overall, which puts it ahead of every other model, GPT included. It also made leaps in specific categories: it went from number three to number one in math, number two to number one in creative writing, number two to number one in vision, and number five to number three in coding.

[00:59:39] Mike Kaput: Now, you can test this new model out, along with other models that aren't in commercial deployment yet, if you go to Google AI Studio, which is aistudio.google.com. Now, there were a couple of nuances I saw here, Paul. They have something on Chatbot Arena called a style control rating, and this is basically an [01:00:00] evaluation method they developed to do what they call, quote, de-biasing the user ratings.
[01:00:06] Mike Kaput: And they do that by kind of accounting for things like style elements that might influence how you or I would rate a model's performance. They say, for instance, quote, style indeed has a strong effect on model's performance in the leaderboard. This makes sense from the perspective of human preference.
[01:00:22] Mike Kaput: It's not just what you say, but how you say it. But now we have a way of separating the effect of writing style from the content, so you can see both effects individually. And when you look at the rating with style control, which they also provide, Gemini actually hasn't moved at all. It's still sitting at number four, behind o1, GPT-4o, and Claude Sonnet.
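For listeners who want the mechanics, the Elo math behind a leaderboard like this is easy to sketch. Here is an illustrative single-vote update in Python; the K-factor and ratings are assumptions, and Chatbot Arena's actual methodology, including the style-control adjustment, is more involved than this:

```python
# Illustrative only: a textbook Elo update from one head-to-head human vote.
def expected_score(rating_a: float, rating_b: float) -> float:
    # Probability that model A beats model B under the Elo model.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    # Nudge both ratings toward the observed outcome of a single vote.
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_won else 0.0
    return rating_a + k * (s_a - e_a), rating_b + k * ((1.0 - s_a) - (1.0 - e_a))

# One preference vote where the lower-rated model wins:
print(elo_update(1250.0, 1260.0, a_won=True))
```

Aggregate thousands of such votes and you get the rankings Mike is describing; style control then tries to separate "won on substance" from "won on formatting."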
[01:00:46] Mike Kaput: So, Paul, this is just one leaderboard, though it's a very important one. Maybe take a step back and just walk me through, like, why should we be tracking who's on top and who's not? How often is this changing? [01:01:00] What do we have to pay attention to here?
[01:01:02] Paul Roetzer: I'm not really sure I understand the style control thing, but, you know, whatever.
[01:01:08] Paul Roetzer: I mean, I get the premise of it, but I don't think I really understand how exactly that would work. Yeah, I think it's interesting for people like us to kind of keep an eye on it. I think it's increasingly intriguing because it's actually a pretty good indicator of when new models are about to get dropped.
[01:01:26] Paul Roetzer: Because all the frontier model companies are putting their models in here, sometimes under different names. In this case, it's actually Gemini Experimental, so you know it's a Google model. Sometimes they don't put the name of the model in there. But when you see something jump like this, it's a really good indicator that we may be on the precipice of a major new model coming out.

[01:01:45] Paul Roetzer: So that's part of why we follow it: it's an indicator of things to come. And obviously, when you see a leap like this, it could be an indication that maybe there's something major, maybe it's actually a whole other leap up, like a [01:02:00] Gemini 2. I'm not saying that's what this is, but when you see big jumps, you might get indications of something much bigger coming.

[01:02:08] Paul Roetzer: Sundar did tweet "more to come"; he replied to Logan Kilpatrick's tweet about how good this experimental Gemini model is. And again, these CEOs aren't going to boast if they don't know some things on the frontier. They don't want to get out there and say stuff like that otherwise. So yeah, definitely worth watching.
[01:02:31] Paul Roetzer: I would think in the next couple weeks here, you might see something. And then the other note is the Gemini app is now available for download on iPhone. So if you have an iPhone and haven't been able to get the Gemini app, you can now go grab it. I've been playing around with Gemini Live, which is their version of, like, advanced voice mode.
[01:02:47] Paul Roetzer: Pretty slick. So, yeah, it's an easier interface for people.
[01:02:53] Mike Kaput: Alright, next up, Microsoft Copilot is having a bit of a bad week. Business Insider [01:03:00] just dropped an in-depth investigation into how Copilot is falling pretty short of customer expectations. Business Insider says it reviewed internal emails, spoke with customers and competitors, and interviewed 15 current and former Microsoft insiders for the report we're about to talk about.
[01:03:20] Mike Kaput: They then report that many customers appear dissatisfied with what Copilot can actually do, especially when compared to what was promised by Microsoft and how much the tool costs. They cite a number of third-party research reports showing that customers are struggling to see the value of the tool, including a Gartner report from October that says only 4 out of 123 IT leaders they surveyed believe it provides significant value to their companies.
[01:03:49] Mike Kaput: Customers also appear to be seriously concerned about Copilot's security. The tool relies in part on browsing and indexing internal company information, and many have run [01:04:00] into issues with what Copilot can access and what that means for employees, writes Business Insider. Quote, as a result, many customers have deployed Copilot only to discover it can enable employees to read an executive's inbox

[01:04:13] Mike Kaput: or access sensitive HR documents. And this is a quote from an employee: Now, when Joe Blow logs into an account and kicks off Copilot, they can see everything, said one Microsoft employee familiar with customer complaints. All of a sudden, Joe Blow can see the CEO's emails. Another Microsoft employee said the tool, quote, works really darn well at sharing information that the customer doesn't want to share or didn't think it had made available to its employees, such as salary info.
[01:04:45] Mike Kaput: According to that Gartner survey again, a full 40 percent of IT managers said that their company had delayed implementing the tool for at least three months due to these types of concerns. This is affecting how much value companies get out of the tool. A customer they [01:05:00] talked to said his company had to disable the meeting summary tool, which he found really valuable, because the legal team was wary of it saving transcripts.
[01:05:10] Mike Kaput: And last but not least, the worst criticism in this article for Copilot came from Microsoft itself. One long-time employee told Business Insider, quote, I really feel like I'm living in a group delusion here at Microsoft, in reference to the gap between what the company was promising and what it can actually do.
[01:05:30] Mike Kaput: So Paul, this is a rough picture of Microsoft Copilot. I mean, anecdotally, we've definitely heard rumblings from some people we talked to about gripes with Copilot. Like how bad is this?
[01:05:46] Paul Roetzer: Damn, really bad. I mean, I've been following Marc Benioff, the CEO of Salesforce, and he's, you know, living his best life retweeting these negative things about Microsoft Copilot. He is like [01:06:00] the chief antagonist at the moment for this stuff. But yeah, I want to try and be as objective as possible here.

[01:06:15] Paul Roetzer: I have yet to meet with an enterprise that loves Copilot. And Mike, you do these talks too. We've been in workshops, we've met with big enterprises who have Copilot. I have yet to talk to a single person that says it's amazing, it's life changing. I had assumed a lot of the lack of value creation or utility was coming from a lack of education and training and change management, like people not being trained how to use it properly. More and more, it does seem like it's just not ready for prime time, and maybe they're trying to sell a whole [01:07:00] massive thing when they should be focusing on smaller use cases or features within it that are immediately valuable.
[01:07:07] Paul Roetzer: Because what I'll say to people is: maybe your company has Copilot, or you're thinking about getting Copilot, or you're in a situation where the company has it but it hasn't been rolled out due to different concerns. If you're a leader of a company and you're sitting around doing nothing because you're hearing Copilot doesn't work, go get ChatGPT, build some custom GPTs for people that help them do their specific thing, that don't need to be connected to any systems or data, and get to work.

[01:07:36] Paul Roetzer: Don't let these articles make you think that generative AI in an enterprise has no value. That is ridiculous. When generative AI isn't personalized to individuals for their workflows, or when there isn't a plan to prioritize use cases and roll those out across teams and departments, then yes, it doesn't [01:08:00] work.

[01:08:01] Paul Roetzer: There are hundreds of use cases, I promise you, in every company, across every department, where you can get value without all these headaches, where it doesn't run into the issue of surfacing the CEO's emails or salary information for your peers. You just need to think this through differently and start coming at it from a different angle.

[01:08:21] Paul Roetzer: Do not wait until the middle of 2025, when your IT and legal teams finally allow you to roll out Copilot, to do something about this. You are going to fall behind. So yeah, tough look for Microsoft. Hopefully they get some stuff fixed, or hopefully it's not as bad as it appears. But from a user perspective, don't wait around.
[01:08:45] Paul Roetzer: Go invest the money and just get some licenses to ChatGPT or something. Do something.
[01:08:51] Mike Kaput: Well, there is some good news from Microsoft. Maybe this is totally coincidental they released this.

[01:08:57] Paul Roetzer: You know, I earned some time in PR. Like, [01:09:00] sometimes we gotta balance the negatives. Yeah, for sure.
[01:09:03] Mike Kaput: So, Microsoft just dropped this awesome list of over 200 examples of real-life companies using Microsoft AI to get results.

[01:09:12] Mike Kaput: This includes Copilot and Azure. They mention a few companies: BlackRock purchased more than 24,000 Copilot licenses to improve productivity. Finastra uses Copilot to save employees 20 to 50 percent of their time on content creation, personalization, etc. Honeywell employees are saving 92 minutes per week, which is 74 hours a year, using AI from Microsoft.

[01:09:37] Mike Kaput: McKinsey created an agent to reduce lead time during onboarding by 90 percent and admin work by 30 percent. And much, much more. Go check out the show notes; you'll see a link to the full list, and they're really valuable to take a look at. Microsoft claims more than 85 percent of Fortune 500 companies are using its AI products.

[01:09:58] Mike Kaput: And they also mentioned an additional [01:10:00] study they commissioned with IDC that says for every $1 that organizations invest in generative AI, they're realizing an average return of $3.70. So, Paul, obviously Microsoft is talking their own book here, but this certainly seems to show that despite criticism of generative AI generally, and Copilot as we just saw, some companies are getting a ton of value out of these tools.
[01:10:25] Paul Roetzer: Yeah, and again, use this as education, to help inspire adoption, you know, inspire ideas of how to use it. My guess is a lot of these are custom builds from Microsoft. So a lot of times they'll sell the Copilot licenses, then they'll go in and, for three million, build some custom solution for something.

[01:10:42] Paul Roetzer: And I have no doubt that you can see massive value from those. But again, Microsoft's Copilot issues aside, these are all real things, and I would go check them out. You never know when you're going to see something that aligns with your business, where it's like, I hadn't thought about that.

[01:10:58] Paul Roetzer: That's cool. So it's a quick [01:11:00] read. It's like a sentence or two for each one. You can scan all 200 in like five minutes.
[01:11:04] Mike Kaput: And they all link to, I think, further case studies. Yeah. You can kind of skim very quickly and then pick and choose what you want to read about.
[01:11:11] Mike Kaput: Alright, next up, Elon Musk's AI company, xAI, is reportedly seeking a massive new funding round of up to $6 billion at a $50 billion valuation.

[01:11:22] Mike Kaput: According to CNBC, the deal is set to actually close early next week, and the money is going to be used largely to buy 100,000 NVIDIA chips. The company, if you recall, raised a $6 billion Series B in May at a $24 billion valuation. And the majority of this funding round is apparently expected to come from Middle Eastern sovereign wealth funds.

[01:11:43] Mike Kaput: So, Paul, we've talked about Elon Musk recently. He seems very well capitalized headed into 2025. Even more importantly, he seems more well connected than ever, given the recent Trump election win. Like, what do the new money and the election result mean for [01:12:00] xAI's direction?
[01:12:01] Paul Roetzer: I don't know if he's going to take it public in 2025, but this company can raise as much money as they want.
[01:12:06] Paul Roetzer: Like, given his clout within the incoming government and his influence over everything that's going to happen, yeah, I can't imagine they're not going to raise a Series C and a Series D at some point next year, if not start looking at an IPO. I don't know. I mean, this company is going to just skyrocket.

[01:12:28] Paul Roetzer: And whether they start delivering right away or not, they have their own distribution. They have Tesla cars, they have the X platform, they have Neuralink, they have SpaceX, they have all of Elon's companies. And this company is going to be like the AI platform for all of those companies.

[01:12:44] Paul Roetzer: Yeah, it's just going to be wild to watch this. I don't think we talked about this, but there was a story last week, I don't know if it was The Information or somebody who had it, about how OpenAI was rumored to have hired a plane to fly over the [01:13:00] data center that Musk built in Memphis, because they were so shocked that he was able to build the thing so fast. They were basically doing a reconnaissance mission, trying to figure out how they were doing it. That's wild.
[01:13:12] Paul Roetzer: Yeah, it's gonna be nuts to follow.
[01:13:15] Mike Kaput: Alright, some other fundraising news. Writer, which is a leading generative AI startup, has just raised $200 million in a Series C round that values the company at $1.9 billion. Writer, we've worked with them several times and talked about them a bunch. They have typically been a generative AI platform that helps enterprise teams generate content securely at scale.

[01:13:37] Mike Kaput: However, they've expanded beyond that initial valuable use case to now offer a, quote, full-stack generative AI platform for enterprises. In this funding announcement, it also sounds like some further evolution could be underway, because Writer says that, quote, the new capital will help cement the company's leadership in the enterprise generative AI category

[01:13:58] Mike Kaput: and fuel Writer's [01:14:00] development of enterprise-grade agentic AI.
[01:14:03] Paul Roetzer: There's those agents again. You're gonna hear it in every press release, every funding round, every earnings report. You're going to see AI agent or agentic AI in all of them.

[01:14:12] Mike Kaput: So Paul, we've known the folks at Writer for a long time. What can we learn about their trajectory, and the overall trajectory of the startup market, based on this funding?
[01:14:22] Paul Roetzer: Yeah. Good people. I'm a huge fan. I think May Habib, their CEO and co-founder, is awesome. I've had a chance to spend time with her and get to know them. They've made a bet that big frontier models aren't necessary to create enterprise value, and they're building their own, in some cases domain-specific, models to start going after verticals.

[01:14:45] Paul Roetzer: And it's probably been viewed as counterintuitive for a couple of years, as these scaling laws have kind of gone on and the research has suggested the frontier models are just going to obsolete these smaller models, that they'll just be smarter than them all. So, you [01:15:00] know, good on them for continuing to stick to that vision and that bet.

[01:15:03] Paul Roetzer: And I do think there's going to be a place in the market for these vertical, domain-specific smaller models where the post-training, that reinforcement learning, all that stuff, is very fine-tuned to specific domains. I think there's a massive market for that. And so I think companies like Writer are well positioned to take advantage of that and keep growing.
[01:15:24] Mike Kaput: All right, Paul, we've got one final topic here, then two quick announcements, and I'm going to kind of roll these together as we wrap everything up. But first up: in a new episode of the Big Technology Podcast, Spotify's Chief Technology Officer, Gustav Söderström, shared some really interesting ideas about how AI is reshaping the music industry and how Spotify is reacting to all of it.
[01:15:46] Mike Kaput: Rather than viewing AI-generated music as a threat, he says Spotify sees it as the latest evolution in music creation tools. He emphasized Spotify doesn't plan to generate music itself, but it will serve as a platform for creators [01:16:00] who use AI tools, as long as they follow the right laws and licensing.

[01:16:04] Mike Kaput: And he said on the recommendation front, Spotify is evolving from simply algorithmic suggestions to become kind of a, quote, ambient friend: an AI-powered presence that understands context and can engage in a conversation about music. He also said that as AI capabilities grow, the scarcity of genuine human connection might make it more valuable than ever.
[01:16:27] Mike Kaput: Will we care if our favorite new song was created by AI, or will the human stories behind music become even more important? Paul, these are some pretty interesting points that kind of hint at a larger tension that creative industries are trying to figure out. What exactly does human creativity mean when AI can generate great music or art?
[01:16:48] Mike Kaput: How much are we going to care if something was created by AI or not, as long as it resonates? Like, that seems like there are some really big questions at play here.
[01:16:57] Paul Roetzer: I would go listen to this. I think Alex [01:17:00] Kantrowitz does a great job. He asks very direct, challenging questions. I love listening to his podcast.

[01:17:05] Paul Roetzer: It's the first time I've heard Gustav speak, and I found him to be incredibly thoughtful and very balanced, sharing insights but also being honest about his uncertainty about what comes next. I thought he was being very transparent about Spotify's approach to this. And I like that. I'd like to believe he's right, because he talked about humans valuing human experiences and creativity even more in the age of AI content abundance and overload, which is what I've been betting everything on.
[01:17:38] Paul Roetzer: That's the whole "more intelligent, more human" idea.
[01:17:40] Mike Kaput: Yeah.

[01:17:40] Paul Roetzer: I'm very bullish on in-person events and, you know, podcasts like this, where perspectives and points of view are shared, not just AI-generated stuff from a PDF. So I don't know, I just thought it was a very vulnerable interview, where I just felt like I really liked this guy.
[01:17:56] Paul Roetzer: I want to hear this guy talk more, because I feel like [01:18:00] that's the kind of deep thinker that I love to learn from. So I would just suggest going and listening to it. It was a pretty quick clip that I saw, like a 10-minute clip or something. So, lots to be learned, and I think big areas that we all need to be exploring more as we go forward.
[01:18:15] Mike Kaput: Alright, Paul, so at the end here, two quick announcements you've got for us on a webinar and an upcoming special podcast episode.
[01:18:23] Paul Roetzer: Yeah, so, you know, on recent episodes I've talked about this co-CEO GPT that I've built for myself for internal purposes, and we've gotten lots of inquiries about this and how it works and things like that.
[01:18:35] Paul Roetzer: And so what I've decided to do is we're gonna host a webinar on December 17th and I'm gonna actually demo what I've built. We're going to show you how to build your own and share a prompt you can use for it. So whether you are a CEO or you just want to be able to talk to the CEO and understand how they think and work and kind of approach things like a CEO would, we're going to give you the tools to do that.
[01:18:59] Paul Roetzer: So stay [01:19:00] tuned. We don't have the webinar page live yet, but you can go to www.smarterx.ai/newsletter and subscribe to the newsletter, and we will alert everyone that's a subscriber as soon as the page is live so you can register. So that's the one big one. And then the second is, we're gonna do a special episode in December. I don't remember the date on this one, Mike.
[01:19:26] Paul Roetzer: Do you have the date?
[01:19:27] Mike Kaput: I think we had said we're going to release a special episode on Thursday, December 20th, whoops, Thursday, December 19th, so before Christmas break.
[01:19:37] Paul Roetzer: Okay, so same week. And what we're gonna do is 25 AI questions for 2025. We were looking at the data, and of the top 10 podcast episodes we've done,

[01:19:48] Paul Roetzer: three of them are these Q&A episodes. We figured, alright, let's give the people what they want, I guess, and focus on things to think about going into next year. But we're going to do a twist on this one and let you [01:20:00] contribute questions. So go to bit.ly/25-questions-episode, or go to the show notes.

[01:20:08] Paul Roetzer: It's going to be in there. It's a Bitly link to a Google form where you're going to be able to submit questions. And then Mike and I will curate those and integrate a bunch of them into that special episode. So again, coming in December: a co-CEO webinar on how to build your own co-CEO, and 25 AI questions for 2025. Check the show notes for both of those.
[01:20:31] Mike Kaput: Great, Paul, as always, thanks so much for breaking everything down for us this week.
[01:20:36] Paul Roetzer: Thanks, everyone. And again, final reminder: no episode next week, November 26th. We'll be back on December 3rd. Thank you, as always, for listening. Thanks for listening to The AI Show. Visit MarketingAIInstitute.com to continue your AI learning journey and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, [01:21:00] attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community.
[01:21:07] Paul Roetzer: Until next time, stay curious and explore AI.