Marketing AI Institute | Blog

[The AI Show Episode 106]: Enterprise AI Adoption, GPT-4o Mini, New Research on Prompting, OpenAI “Strawberry,” & AI’s Impact on the CrowdStrike Outage

Written by Claire Prudhomme | Jul 23, 2024 12:10:00 PM

Join our hosts as they unpack Chevron CIO Bill Braun's candid insights on the challenges of implementing AI in large corporations, explore OpenAI's latest GPT-4o mini model, and discuss the latest findings in prompting research. Plus, get updates on Apple’s AI training data, AI’s impact on the CrowdStrike outage, and the latest developments in AI at Meta.

Listen Now

 

Watch the Video

Timestamps

00:04:03 — AI Enterprise Adoption

00:18:07 — GPT-4o mini

00:27:53 — New Prompting Report

00:38:02 — OpenAI “Strawberry”

00:48:40 — Apple’s AI Steals YouTube Videos

00:51:48 — AI’s Impact on the CrowdStrike Outage

00:57:09 — Meta Llama 3 405B Release

00:59:09 — Meta Will Avoid the EU

01:01:57 — Intuit’s AI-Reorganization Plan

01:04:00 — Andrej Karpathy Starts Eureka Labs

01:06:41 — Update on Lattice Digital Workers

Summary

AI Enterprise Adoption

In a recent conversation with Chevron Chief Information Officer Bill Braun, The Information got a reality check on the current state of AI adoption in enterprises.

In that conversation, Braun said that despite Chevron creating an Enterprise AI team last summer to find valuable AI applications, he isn't yet convinced that large language models will significantly transform Chevron’s business or boost employee productivity enough to justify the costs.

Braun indicated that about 20,000 employees are testing Microsoft Copilot, but, he said, “the jury is still out on whether it’s helpful enough to staff to justify the cost.”

(The company is also using an internal chatbot to help with finding company knowledge across departments.)

In the interview, Braun also notes that Chevron, while increasing its cloud spend, isn’t doing so because of AI, but rather because of the ongoing process of moving computing workloads to the cloud.

OpenAI Releases GPT-4o mini

OpenAI has just unveiled its latest AI model, GPT-4o mini, positioning it as their most cost-efficient small model to date.

GPT-4o mini boasts impressive capabilities, scoring 82% on the MMLU benchmark, which measures textual intelligence and reasoning. It even outperforms GPT-4 on chat preferences in the LMSYS leaderboard.

Now, the game-changer: cost.

GPT-4o mini is priced at just 15 cents per million input tokens and 60 cents per million output tokens. This makes it more affordable than previous frontier models and over 60% cheaper than GPT-3.5 Turbo.

The model supports text and vision inputs in its API, with plans to expand to audio and video in the future. It also has a substantial context window of 128K tokens and can handle up to 16K output tokens per request.
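For a sense of scale, here's what those prices imply per request. This is just a sketch using the figures above; the helper function and variable names are illustrative, not part of any OpenAI SDK.

```python
# Back-of-the-envelope cost of a GPT-4o mini API call, using the published
# prices: $0.15 per 1M input tokens, $0.60 per 1M output tokens.

INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Even a request that fills the full 128K context window and returns the
# maximum 16K output tokens costs under three cents:
print(round(request_cost(128_000, 16_000), 4))  # 0.0288
```

In other words, a maximally sized request is roughly $0.03, which is what makes the model practical to build on at scale.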

In benchmark tests, GPT-4o mini outperforms other small models like Gemini Flash and Claude Haiku across certain tasks, including math, coding, and multimodal reasoning.

GPT-4o mini is now accessible through OpenAI's various APIs, and it's replacing GPT-3.5 for ChatGPT users across Free, Plus, and Team tiers.

Prompting Report

New research on prompting is making waves in the AI community. This research comes from AI education company LearnPrompting and researchers at OpenAI, Microsoft, Stanford, and a handful of other leading educational institutions.

It’s called The Prompt Report, and over 76 pages it surveys 1,500 prompting papers and over 200 different prompting techniques.

Now, the research doesn’t present the gospel truth on which prompt is always best or how to always increase prompt performance.

However, they did find that few-shot prompting (as in, providing at least a few examples while you prompt) using chain-of-thought (where you encourage the model to show its reasoning step by step) performs the best in the limited tests the researchers did.

Though, they emphasize that much more research into prompt performance needs to be done, and performance significantly varies depending on the task or model.
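As a rough sketch of the winning combination the report found, few-shot prompting plus chain-of-thought means each example pairs a question with worked-out, step-by-step reasoning, and the prompt ends with an open reasoning cue. The example questions and phrasing below are invented for illustration; they are not taken from the report.

```python
# Few-shot + chain-of-thought prompt construction: each worked example
# shows its reasoning, nudging the model to reason step by step on the
# new question before answering.

EXAMPLES = [
    {
        "question": "A newsletter has 400 subscribers and grows 10% a month. How many after two months?",
        "reasoning": "After one month: 400 * 1.10 = 440. After two months: 440 * 1.10 = 484.",
        "answer": "484",
    },
    {
        "question": "A team records 3 podcast episodes a week. How many in 12 weeks?",
        "reasoning": "3 episodes per week times 12 weeks is 3 * 12 = 36.",
        "answer": "36",
    },
]

def build_few_shot_cot_prompt(examples, new_question):
    """Assemble a few-shot, chain-of-thought prompt as a single string."""
    parts = [
        f"Q: {ex['question']}\nReasoning: {ex['reasoning']}\nA: {ex['answer']}"
        for ex in examples
    ]
    # End with an open "Reasoning:" cue so the model continues with
    # step-by-step reasoning before stating its final answer.
    parts.append(f"Q: {new_question}\nReasoning:")
    return "\n\n".join(parts)
```

The resulting string is what you'd send to whatever model you're using. Whether it actually helps still varies by task and model, as the researchers stress.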

Links Referenced in the Show

This week’s episode is brought to you by MAICON, our 5th annual Marketing AI Conference, happening in Cleveland, Sept. 10-12. Early bird pricing ends Friday. If you’re thinking about registering, now is the best time. The code POD100 saves $100 on all pass types.

For more information on MAICON and to register for this year’s conference, visit www.MAICON.ai.

Read the Transcription

Disclaimer: This transcription was written by AI, thanks to Descript, and has not been edited for content.

[00:00:00] Paul Roetzer: I think that by now it's pretty common knowledge that YouTube videos and transcripts are being used to train. And what does that mean to the creators? Like, who knows?

[00:00:10] Paul Roetzer: Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of Marketing AI Institute, and I'm your host. Each week, I'm joined by my co-host and Marketing AI Institute Chief Content Officer, Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career.

[00:00:41] Paul Roetzer: Join us as we accelerate AI literacy for all.

[00:00:48] Paul Roetzer: Welcome to episode 106 of the Artificial Intelligence Show. I'm your host, Paul Roetzer, along with my co-host, Mike Kaput. We have some fascinating topics [00:01:00] this week. I actually had some fun prepping for this one this morning; there were some where I kind of like dug back into some history, some articles I hadn't read in a few years, to give some context to some of the big topics we're going to talk about today.

[00:01:14] Paul Roetzer: So I'm excited to get into it. There's potentially some big news coming out this week, maybe a new AI model coming out very soon we'll talk about. But this week's episode is brought to us by the Marketing AI Conference, or MAICON. The fifth annual event is happening in Cleveland, September 10th to the 12th.

[00:01:34] Paul Roetzer: It's coming up really fast, a good part of my summer is dedicated to finalizing this agenda and preparing for all the different sessions and everything we got going on at the event. There's going to be more than 60 sessions, by far our biggest event yet. Last year we had about 700 attendees.

[00:01:52] Paul Roetzer: This year we're trending toward 1,500. There's going to be amazing keynotes. We have 10 general sessions, you [00:02:00] know, featured talks. One from Mike Walsh, who Mike and I have both read. He's the author of The Algorithmic Leader. It was one of the more influential AI books in kind of the early days of the Institute and our, you know, thinking around AI and the potential for it.

[00:02:15] Paul Roetzer: So, day one is going to close with a keynote from Mike Walsh. I can't wait for that one. We have Andrew Davis, our good friend Andrew, Andrew Davis, doing the digital doppelganger. It's an amazing talk. I've seen it like three times and it changes every time. He's going to be right after my opening keynote on the AI timeline, sort of the road to AGI.

[00:02:35] Paul Roetzer: What comes next? Andrew is going to bring an insane amount of energy, as he always does. We have Liz Grin, partner at QuantumBlack, AI by McKinsey. She's going to be talking about digital trust in the trust economy and AI, and just an amazing list of speakers. So go check it out. It's MAICON.AI, so MAICON.AI

[00:02:57] Paul Roetzer: early bird pricing ends this Friday. So [00:03:00] that's what, July 26th, Mike,

[00:03:02] Mike Kaput: Yes.

[00:03:03] Paul Roetzer: Yeah, so July 26th is the last chance to get the early bird pricing. You can get an additional $100 off any of the passes with the POD100 promo code. And then if you're bringing a group of five or more, we actually have a promotion going where you can get our Scaling AI on-demand series for everyone that registers.

[00:03:23] Paul Roetzer: Again, that's for groups of five or more. They'll get instant access, or, you know, near-instant access, to the Scaling AI 10-course series. So you can accelerate AI learning now, and then, you know, get in person at MAICON in September. So we would love to have you there. Again, it's MAICON, MAICON.AI

[00:03:44] Paul Roetzer: and don't forget to use that POD100 promo code to get $100 off in addition to the final early bird pricing that ends on Friday. Okay, uh, AI adoption in the enterprise, Mike. We've got an [00:04:00] interesting case study, I guess, going on here.

[00:04:03] AI Enterprise Adoption

[00:04:03] Mike Kaput: Yeah, for sure. So this past week, The Information seems to have kind of like gotten a reality check on the current state of AI adoption in enterprises. They just did a recent conversation with Chevron's Chief Information Officer, Bill Braun, and they talked to him about how the company is using AI.

[00:04:26] Mike Kaput: And kind of surprisingly, in that conversation, Braun said that despite Chevron creating an enterprise AI team last summer to find valuable AI applications, he isn't yet convinced that large language models will significantly transform Chevron's business or boost employee productivity enough to justify the costs.

[00:04:49] Mike Kaput: Braun also indicated that about 20,000 employees are testing Microsoft Copilot, but he said, quote, the jury is still out on whether it's [00:05:00] helpful enough to staff to justify the cost. The company is also using an internal chatbot to help find company knowledge across departments. Braun further went on to say, quote, We're a little dissatisfied with our ability to know how well it's working.

[00:05:15] Mike Kaput: We have to survey employees a lot, which isn't our favorite thing to do, but we're giving Microsoft a lot of feedback, and they're trying to help figure it out because they know they need to. He also noted that Chevron, while they are increasing their cloud spend, they're not doing it because of AI.

[00:05:31] Mike Kaput: They're rather just continuing to move their computing workloads to the cloud. So, Paul, kind of an interesting angle here, one we don't always hear about. And you had posted about this on LinkedIn, kind of pointing out maybe a larger consideration here. And you wrote, quote, as a reminder, the cost of a Copilot license is about 30 bucks per user per month, though they probably pay less with that many licenses.[00:06:00] 

[00:06:00] Mike Kaput: You know, using it with 20,000 people. Here's my opinion on this. If a company can't justify 30 bucks for Copilot or ChatGPT, Gemini, or Claude, then it is more likely due to a lack of education, training, and planning than it is to a deficiency in the AI's capabilities. This is both a challenge for the company licensing the technology and a weakness in how the AI tech companies are selling and supporting the platforms.

[00:06:28] Mike Kaput: So Paul, I guess given that commentary, maybe could you walk us through what you see as the possible problem here, both with the Chevron case study, specifically, but also more generally with enterprises and also kind of what should we do about it?

[00:06:44] Paul Roetzer: We keep hearing about this lack of value or this uncertainty of value created, and I think the media, I don't blame the media for running with these stories because it's, it's the majority of the stories that are out there [00:07:00] is, you know, these organizations and again, like, The Information article doesn't explicitly state that they didn't have an onboarding plan or they didn't do education and training around how to get value out of Copilot.

[00:07:12] Paul Roetzer: They didn't help people find use cases, things like that, but you can certainly make that assumption that that is probably the case, because it's what we keep hearing over and over again. So you hear these stories about not sure if we're getting value. Now, 20,000 licenses is a lot. I don't blame them for wanting value.

[00:07:31] Paul Roetzer: I didn't do the math in my head, but that's probably, I mean, what, at least a couple million a month in licenses, I think. so it's not insignificant. And I could imagine the CFO is putting some pressure on saying, are we actually getting any value here? Are we increasing productivity? Are we saving costs?
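Paul says he didn't do the math in his head; as a quick back-of-the-envelope sketch at the $30 list price cited in the LinkedIn post (the actual negotiated rate is unknown and, as Paul notes, likely lower at that volume):

```python
# List-price ceiling for 20,000 Copilot seats at $30 per user per month.
# Chevron almost certainly pays less at that volume; this is just the
# ceiling implied by the public price.
seats = 20_000
list_price = 30  # USD per user per month

monthly = seats * list_price
annual = monthly * 12
print(f"${monthly:,} per month, ${annual:,} per year")
# $600,000 per month, $7,200,000 per year
```

Either way, it is a spend large enough that a CFO will ask for proof of value.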

[00:08:30] Paul Roetzer: And you go look at the line item for Gen AI and it's like, well, are we even getting anything out of this? Now, all the while, the people that are deciding whether or not we're getting anything out of this or whether or not it's worth the budget may not even understand AI. And so, you know, when I first read this article, that was what came to my mind, was like, man, this is exactly what we see every time we talk to big enterprises.

[00:08:53] Paul Roetzer: and even small businesses too, like, we're guilty of this at times. Like, we have ChatGPT, we have Claude, we have Gemini, we have all of 'em, [00:09:00] and we haven't even done this, but we're a little different than that. Everyone on our team kind of knows what they're doing, knows the use cases. Like, we trust that they can get the value and we don't really think too deeply about it.

[00:09:12] Paul Roetzer: But even in a small business, like when I was running our agency, like, you know, I sold in 2021, but that's about 20 people. And I could imagine that feeling of like, wow, we're paying 30 bucks a month for 20 people. Like, as a small business, that's a lot of money. So to your question, what do you do about this?

[00:09:29] Paul Roetzer: And so, you know, I was reading this and I'm like, all right, just like real quick framework, like, kind of off the top of my head, I put these five things on LinkedIn and it seemed to resonate with people. 'Cause last I looked, it was at like 25,000 impressions on this post and over 320 engagements. So it definitely, like, resonated with people.

[00:09:47] Paul Roetzer: So the five things I outlined for any size business: number one, pilot with small groups in select departments over a 90-day period, prove the value and create internal user champions, then scale it. [00:10:00] So buying 20,000 licenses probably isn't the right call. Buying 10 or 20 or a hundred and making sure you have a very specific group of people who have a responsibility, a KPI, an OKR, like, whatever you want to call it, for 90 days to prove the value.

[00:10:18] Paul Roetzer: So you're going to do this. And then if it works, now you go buy the 20,000 licenses or 10,000 or whatever you can do. But we've seen time and again, these big enterprises, for whatever reason, just buy into these massive packages and buy all the licenses at once with no plan of what they're going to do with them.

[00:10:35] Paul Roetzer: The second thing, and this is really critical, is to prioritize use cases specific to employee roles and responsibilities. So don't just give them Copilot and say, yeah, it's got email writing assistance and it can help you write things in docs and it can maybe help create some images in PowerPoint.

[00:10:52] Paul Roetzer: Like, yeah, don't just rely on the obvious use cases that are being sold by the tech [00:11:00] companies, what they're positioning it as. Think about the individual person's role, the tasks that they do in their job, and then assess the value of AI at that task level, and then pick three to five use cases initially for each person.

[00:11:13] Paul Roetzer: That will have an immediate and measurable impact. So, you know, this is what you and I do all the time. Like, it's like, okay, what are the campaigns we run? Like, you know, the podcast as an example, what are the tasks that we're doing over and over again? And then let's make sure we're infusing these platforms into those core things.

[00:11:32] Paul Roetzer: Now, if we find other use cases along the way, great. But if all we do is these three to five things every week with this tool, it pays for itself 10 times over in a month. And that's like, so when I look at these, like, how can you not get $30 worth of value out of this tool? I'm like, if you, if you just took

[00:11:52] Paul Roetzer: 10 minutes to look at the tasks you do in your job and pick three to five things these things can help with. There is no [00:12:00] way you're not saving at least an hour each week, if not each month. Like that's nuts. so that, that's the other thing. So small group and give them very specific use cases, help them figure out the ways to get value.

[00:12:12] Paul Roetzer: The third is also extremely critical, and we talked in depth about this on last week's podcast, episode 105.

[00:12:19] Mike Kaput: Provide

[00:12:20] Paul Roetzer: Provide generative AI education and training to maximize the value, but don't just do general training, like tailor learning journeys for individuals, including specific coursework and experiences in the core platforms.

[00:12:33] Paul Roetzer: So help them figure out how to use these tools specific to what they do. We're going to, you and I have a webinar on Thursday, July 25th. We're going to introduce the State of Marketing AI research findings for 2024. I know we talked a little bit about that last week, but two of the key findings we're going to go through: 75 percent of respondents' employers do not have internal AI-focused education and training.

[00:12:58] Paul Roetzer: We asked that question [00:13:00] specifically. And the second, when we asked about barriers to adoption in their organization, in their marketing: 67 percent said lack of education and training is the top barrier to adoption of AI. And for four straight years, education and training has been the number one barrier people identify.

[00:13:15] Paul Roetzer: We give them like 15 choices. So we know that education and training is an issue and these organizations have to provide that. And it's, I mean, you can't rely on the tech companies to do it. Like, I don't think Google and Microsoft and OpenAI and Anthropic can be expected at this point in their life cycle of AI to deliver all the education and training you need.

[00:13:38] Paul Roetzer: And honestly, when they do, it's often too product-specific and not, you know, general enough overall. Number four: monitor utilization. This is a little hard one. And I actually appreciated the point the CIO made in The Information article about utilization. It's kind of hard. Like, they've been providing feedback to Microsoft, [00:14:00] like, hey, we need more tools, more analytics to know how it's being utilized, but you need to find ways

[00:14:07] Paul Roetzer: to monitor whether or not the people you're providing these licenses to on your team are actually using the technology. And then invest in the employees who are actually experimenting and finding value in them. And remove the licenses from the employees that aren't using them. So we've seen this a lot where you assign a hundred licenses, say, to ChatGPT or Microsoft, and maybe 10 of the people who have those licenses are actually using them for anything other than rewriting your emails or writing drafts of documents.

[00:14:36] Paul Roetzer: Those aren't your power users. Those aren't the people that are going to find value and help others along. So, utilization is key. And then, number five: reporting performance benchmarks. And so what we mean by that is, once you know, like, the three to five things that someone's going to use it for, have them do a report of, okay, I'm going to use it to help me in writing articles, developing video scripts, [00:15:00] and, you know, producing the podcast each week.

[00:15:03] Paul Roetzer: Okay, great. Create a task list for each of those things, and then do an estimate of how much time you're currently spending to do those, and then we're going to, at 30, 60, 90 days, we're going to assess how that's changed based on your utilization of these models. And then you just have a very cut-and-dried report.
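The 30/60/90-day report Paul describes can be sketched as a simple before/after comparison. The task names and hour estimates below are made up for illustration:

```python
# Weekly hours per task: a baseline estimate before adopting the AI tool,
# then the re-measured figure at the 30/60/90-day check-in.
tasks = {
    "writing articles":         {"before": 6.0, "after": 3.5},
    "developing video scripts": {"before": 3.0, "after": 1.5},
    "producing the podcast":    {"before": 4.0, "after": 2.5},
}

total_before = sum(t["before"] for t in tasks.values())
total_after = sum(t["after"] for t in tasks.values())
saved = total_before - total_after

print(f"{saved:.1f} hours saved per week ({saved / total_before:.0%} reduction)")
# 5.5 hours saved per week (42% reduction)
```

That one number, hours saved against a stated baseline, is the transparent "this is what we were doing before, this is what it looks like now" comparison the report is meant to produce.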

[00:15:20] Paul Roetzer: It's like it's transparent. This is what we were doing before. This is what it looks like now. So by doing that, you're able to then directly prove the value. So these five steps, and again, I'll just recap them. Pilot with a small group. Prioritize use cases. Provide education and training for generative AI specific to the tool you're bringing in.

[00:15:39] Paul Roetzer: Monitor utilization and report performance versus benchmarks. So in short, what I said on LinkedIn is, Have a plan. Like, the value is absolutely there if it's rolled out in a strategic way and as part of a larger change management plan, not a technology solution we just bought and we gave to everybody.

[00:15:58] Paul Roetzer: So, as a quick reminder, I [00:16:00] mean, this is what we do. So, like, SmarterX, Marketing AI Institute, it's all about helping people develop learning journeys to, you know, provide this education and training, help people on, like, a pursuit to mastery of, of AI. And so there's three things I'll recommend. So one, I teach a free monthly intro to AI class on Zoom.

[00:16:18] Paul Roetzer: It's had over 25,000 people go to it. I'm, we're going to do episode, or, session 40, I think, in August. That's INTROTOAI.COM

[00:16:27] Paul Roetzer: intro to AI, INTROTOAI.COM. It is free. It's a 30-minute class with 30 minutes of Q&A every month. So that's a great way to do intro. We have Piloting AI, which Mike and I have created now two years in a row.

[00:16:42] Paul Roetzer: This version was released in late January 2024. And it's 18 on-demand courses that's meant as a step-by-step learning path for beginners at all levels. While it's tailored to marketers, and a lot of the focus of the middle courses in that series is specific to marketing use cases, it is broadly [00:17:00] 

[00:17:02] Paul Roetzer: applicable to any knowledge worker that wants to understand how to pilot AI. We teach how to prioritize use cases and how to adopt this technology. So that's about nine hours of content with a professional certificate. That's pilotingai.com. You can go learn more about that. And then just in June of this year, we launched Scaling AI, which is sort of the next step. So again: intro, piloting, scaling. It's all designed as a learning journey.

[00:17:22] Paul Roetzer: That's 10 on-demand courses for business leaders, not marketing specific. It is for business leaders, director level and above, is kind of how we think about it. And that is about six hours of content and walks through how to build an AI academy, the education and training, how to develop an AI council,

[00:17:39] Paul Roetzer: generative AI policies, responsible AI principles, how to conduct impact assessments on your team, your partners, your tech stack, and how to build an AI roadmap. So that's scalingai.com. So again: intro to AI is introtoai.com, pilotingai.com, scalingai.com. So if you want to [00:18:00] accelerate education, those are the three key components that we offer to help people in this area.

[00:18:05] Paul Roetzer: And I think we're going to flip

[00:18:07] GPT-4o mini

[00:18:07] Mike Kaput: All right. So another big topic this week: OpenAI has unveiled GPT-4o mini, which is their most cost-efficient small model to date.

[00:18:30] Mike Kaput: And it actually, they are saying, outperforms GPT-4 on chat preferences on the LMSYS leaderboard.

[00:18:37] Mike Kaput: Now, the big thing to note here is the cost. GPT-4o mini is priced at just 15 cents per million input tokens and 60 cents per million output tokens. So this makes it an order of magnitude more affordable than previous frontier models, and it's over 60 percent cheaper than [00:19:00] GPT-3.5 Turbo. Now, GPT-4o mini also supports both text and vision inputs in its API.

[00:19:07] Mike Kaput: There are plans to expand to audio and video in the future. It has a substantial context window of 128,000 tokens and can handle up to 16,000 output tokens per request. And in benchmark tests, GPT-4o mini outperforms other small models like Gemini Flash and Claude Haiku across certain tasks like math, coding, and multimodal reasoning.

[00:19:33] Mike Kaput: Now right now, GPT-4o mini is accessible through OpenAI's APIs, and it is going to be replacing GPT-3.5 for ChatGPT users across Free, Plus, and Team tiers. So, Paul, like, first up, let's kind of talk about this announcement specifically. What does this mean for businesses using ChatGPT or building on it?

[00:19:58] Paul Roetzer: [00:20:00] Intelligence rises, costs fall. It's kind of like the high level here. They're getting smarter, but they're getting smaller, faster, more cost-efficient, and that opens up the ability to build intelligence. So, to rewind back to episode 87, where we talked about this potential AI timeline, I'll, I'll just pull a couple of excerpts from, from that.

[00:20:23] Paul Roetzer: So I had, I had written, you know, the 2024 to 2030, sort of this road to AGI. The first bullet under 2024 is continued advancements and potential leaps in multimodal reasoning, planning, decisioning, expanded context windows, memory, personalization, reliability, accuracy. Then when you look into 2025 to 2026, which I headlined multimodal AI explosion, large language models built multimodal from the ground up, which is kind of what we're hearing here, but with video and audio, not just text and image.

[00:20:58] Paul Roetzer: Frontier models become [00:21:00] 10 to 100 times more powerful and generally capable. We're going to talk about that in a minute. But here's the key one: smaller, faster, more efficient models enable vast applications on device: phones, earbuds, watches, possibly wearables. Um, so, and then synthetic data is a key one to come back to.

[00:21:20] Paul Roetzer: So at this point, I've put synthetic data potentially becomes the dominant source of vision training, but also, as we learn, it plays a key role in training the big frontier models.

[00:21:32] Paul Roetzer: So I want to jump over to Andrej Karpathy, who we talk about all the time, legendary AI researcher, ran AI at Tesla for five years, a couple of stints at OpenAI.

[00:21:43] Paul Roetzer: We'll actually talk in a few minutes about Andrej's new initiative, but he tweeted when this model came out, this is July 18th: Large language model size competition is intensifying backwards. My bet is that we'll see models that think very well and [00:22:00] reliably that are very, very small. There is most likely a setting, even in GPT-2 parameters.

[00:22:06] Paul Roetzer: So we're talking a couple of generations ago, for which most people will consider GPT-2 smart. The reason current models are so large is because we're still being very wasteful during training. We're asking them to memorize the internet, and remarkably, they do, and can recite complex things. But imagine if you were going to be tested, closed book, on reciting arbitrary passages of the internet, given the first few words.

[00:22:34] Paul Roetzer: This is the standard pre-training objective for models today. The reason doing better is hard is because demonstrations of thinking are entangled with knowledge in the training data. Therefore, the models have to first get larger before they can get smaller, because we need their automated help

[00:22:54] Paul Roetzer: to refactor and mold the training data into ideal synthetic formats. [00:23:00] So in essence, he's saying, like, by making them bigger, we can now make them smaller. And then by making them smaller, we can actually make them bigger, because we can now create synthetic data that can be used to train the bigger models.

[00:23:13] Paul Roetzer: He goes on to give an analogy of Tesla and their full self-driving that I found kind of fascinating as well. ChatGPT, so if you don't follow, ChatGPT has its own Twitter account, and it tweeted: Pushing the limits of intelligence we can serve for free is part of the quest to make sure AGI benefits all of humanity.

[00:23:34] Paul Roetzer: So the cost reduction not only has practical application to businesses, it has a broader application to this pursuit of AGI and making it readily available. Sam Altman also tweeted: Towards intelligence too cheap to meter. Way back in 2022, the best model in the world was text-davinci-003, which was available in their playground.

[00:23:57] Paul Roetzer: I remember, you know, using that before [00:24:00] ChatGPT came out. It was much, much worse than this new model. It cost 100 times more. So just two years ago, the best model in the world cost 100 times more than this new model and was nowhere near its capabilities. So the rate of change, again, is so complicated to understand, um, process. But long story short, AI gets smaller and cheaper, drives access to it, and accelerates innovation, because now startups who build on the APIs can innovate faster.

[00:24:36] Paul Roetzer: They can build tools that are more practical to different industries, and it drives disruption faster, because intelligence becomes cheaper and it potentially accelerates the impact on the knowledge workforce that we've talked about. So, um, yeah, the bigger models are getting bigger and more generally capable, and the smaller models are [00:25:00] achieving performance, outputs, and capabilities that a year ago were what the most advanced frontier models did.

[00:25:08] Paul Roetzer: And this is the cycle we're going to be in. They're going to keep getting bigger and they're going to keep getting smaller. And that's a weird dynamic to be planning for the future in, when intelligence seems to be everywhere and the cost of it is dramatically plummeting.

[00:25:24] Mike Kaput: This is really good timing, too, with last week's topic around the Carl Shulman podcast, which I know I'm sure for some people just seemed like so future-looking, but this is the cycle it seems he's talking about: when intelligence that is good enough gets cheap enough, it unlocks all sorts of kind of exponential or runaway improvements in what's possible.

[00:25:50] Mike Kaput: Is that kind of how you're looking at this?

[00:25:51] Paul Roetzer: Yeah, for sure. I think it's a good call too. Like, if you didn't listen to last week's podcast, I would, I would go listen to it. I [00:26:00] think that it's helpful context. And then, like, I was trying to, I was like, was it 1969 the Apollo mission goes to the moon? And I was thinking back to, 'cause it was in July of, of '69.

[00:26:14] Paul Roetzer: So, you know, anniversary week, or right around the anniversary. And I was remembering back to the stories of how the computer, the supercomputer, quote unquote, of that time is nowhere near the capabilities of what we now have in our pockets with the iPhone. And just that idea, now that was over an extended period of time.

[00:26:34] Paul Roetzer: We're talking about decades of innovation to lead to the shrinking

[00:26:37] Paul Roetzer: of compute capability into our pockets.

[00:26:40] Paul Roetzer: But that's in essence what's now happening every, like, 12 to 24 months: this like doubling, tripling, 10x-ing of intelligence to where that's what these models are going to feel like. It's kind of like what Sam was saying: that text-davinci in 2022 was 100 times more expensive than what we have now.

[00:26:59] Paul Roetzer: [00:27:00] And just keep playing that out, as Shulman said. Like, just play this out. And even Leopold, when we talked about situational awareness a few episodes ago: just play this out over every, like, one to two years and assume a 10 to 100 times increase in, in intelligence and a 10 to 100 times reduction in cost of intelligence.

[00:27:18] Paul Roetzer: There is no end in sight to that. And that's why we talk about this idea of an exponential; it's really hard for our minds to comprehend. Two years from now, we may look back and say, oh, remember when they introduced GPT-4o mini and it was 100 times cheaper? And now we have, like, GPT-6o mini and it's 100 times cheaper than that.

[00:27:37] Paul Roetzer: Like that's where this is all going. And then yeah, Shulman's point was what happens when this occurs? Like, this is the future that is predictable right now. And no one seems to be planning for it other than OpenAI, Google, Microsoft, Anthropic.

[00:27:53] Prompting Report

[00:27:53] Mike Kaput: Yeah.

[00:27:55] Mike Kaput: All right. Our third main topic this week is that some new research [00:28:00] has come out on prompting that is making some waves in the AI community. This research comes from an AI education company called Learn Prompting and researchers at OpenAI, Microsoft, Stanford, and a handful of other leading educational institutions.

[00:28:17] Mike Kaput: This research is called the Prompt Report, and over 76 pages it surveys 1,500 papers on prompting and looks at over 200 different prompting techniques. Now, unfortunately, the research doesn't have the full gospel truth on which prompt is always the best or how to always increase prompt performance.

[00:28:39] Mike Kaput: However, it did find that right now, few-shot prompting (as in providing at least a few examples when you prompt) combined with chain-of-thought reasoning (where you encourage the model to show its reasoning step by step) can perform the best in the limited tests the researchers did. Though they do emphasize [00:29:00] that much more research into actual prompt performance needs to be done, and performance varies significantly depending on the task or the model.

[00:29:09] Mike Kaput: However, in a related interview about the report, Sander Schulhoff, who is the founder and CEO of Learn Prompting, gave six pieces of advice to improve the results you get from few-shot prompting. And these are pretty helpful if you're doing any type of prompting along the lines of the marketing and business use cases we talk about all the time on the podcast.

[00:29:30] Mike Kaput: So, just very quickly, he says, generally, including more exemplars, that's the fancy term they use for well-selected examples of what you're trying to do, typically improves performance. Again, not always; a little bit of art, a little bit of science. Randomly ordering your examples can, in some cases, help,

[00:29:52] Mike Kaput: because it reduces the tendency of the model to bias towards certain ones. For instance, if you were showing it both positive and negative [00:30:00] examples of the outcome you want, grouping those by positive and negative could force it to be biased towards one or the other. You definitely want a balanced representation of the different outcomes you're trying to achieve with your examples, and you want to label them as correctly as possible.

[00:30:18] Mike Kaput: You want to choose a common, consistent format for your examples. And finally, when possible, you want to use examples that are similar to the test instance you're asking about. So basically, making your examples very close to what you're trying to do can help you improve performance. Now, if you fail to do all these things, it's not like your prompt is going to totally fail.

[00:30:39] Mike Kaput: This is just a way to get some better results. So, Paul, this certainly, so far as I've seen, looks like the most comprehensive review yet of kind of the literature out there on prompting, possible ways to prompt better. Like, what did you take away from this [00:31:00] report?
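The few-shot guidelines above can be sketched as a small prompt builder. This is a hypothetical illustration, not code from the Prompt Report; the function name, task, and exemplars are invented for the example:

```python
import random

def build_few_shot_prompt(exemplars, test_input, seed=None):
    """Assemble a few-shot prompt per the report's advice: randomly
    order exemplars (reduces ordering bias) and use one consistent
    Review/Sentiment format for every example."""
    rng = random.Random(seed)
    shuffled = list(exemplars)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    lines = ["Classify the sentiment of each review as Positive or Negative.", ""]
    for text, label in shuffled:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    # End with the test instance in the same format, label left blank
    lines.append(f"Review: {test_input}\nSentiment:")
    return "\n".join(lines)

# Balanced, correctly labeled exemplars, similar to the test instance
exemplars = [
    ("The product arrived quickly and works great.", "Positive"),
    ("Broke after two days, very disappointed.", "Negative"),
    ("Support was friendly and solved my issue.", "Positive"),
    ("Misleading description, would not buy again.", "Negative"),
]

prompt = build_few_shot_prompt(
    exemplars, "Setup was painless and it runs well.", seed=42
)
print(prompt)
```

Sending the prompt to a model would be a separate step; the point is that every exemplar shares one format, the labels are balanced two-and-two, and the ordering is shuffled rather than grouped by label.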

[00:31:00] Paul Roetzer: I thought it was great. You know, one takeaway for people is that the models are weird; we still don't really know why they work. Some of the leading researchers continue to be surprised that you can just give these models all this data and they somehow make all these predictions about the next token or word in a sequence, and they do it in this really comprehensive and intelligent way.

[00:31:23] Paul Roetzer: So we have to realize that this is all still pretty new. Us humans interacting with these machines and learning how to talk to them, it's still pretty new. It does demonstrate that prompting ability still matters. So while I am a big believer that eventually it becomes less and less important for the average knowledge worker to be able to prompt this way, right now it is still highly relevant.

[00:31:53] Paul Roetzer: It is a differentiator. It's a competitive advantage to be able to talk to the machine and get a better output [00:32:00] from it. I think this demonstrates the importance of onboarding education and training that we just talked about with the Chevron example. So imagine day one, you're a Chevron employee or Brand X, whatever the company is, and you are given a license to Copilot or Gemini or Anthropic Claude or ChatGPT Team or whatever it is.

[00:32:22] Paul Roetzer: And day one with that license, you have a one hour training course. That one hour training course synthesizes how to prompt the machine. And it gives you 10, 15, 20 examples relevant to your daily workflow of how to craft prompts. You cannot tell me that greater value won't be unlocked from those platforms if training like this was provided to people.

[00:34:07] Paul Roetzer: Like, as you're looking at your career potential and the value you can create in your organization, something like this is time really well spent to understand the fundamentals. Even when they just break down the components of a prompt: giving it a directive, giving it examples, telling it the format you want in the output, giving it instructions on style, giving it a role (what is it supposed to be?),

[00:34:29] Paul Roetzer: and then providing any additional context. If you just do those things, you can be way ahead. The other thing I'll say, and we talked about this on some recent episodes, is if you're using Anthropic or Google or ChatGPT or Runway for video, search for, and we'll put them in the links, the prompting guides provided by those organizations.
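The prompt components Paul breaks down (role, directive, context, examples, style, output format) can be sketched as a simple reusable template. This is a hypothetical sketch; the function and its parameter names are illustrative, not taken from any of the prompting guides mentioned:

```python
def assemble_prompt(directive, role=None, context=None,
                    examples=None, style=None, output_format=None):
    """Join the optional prompt components into one prompt string,
    skipping any component that isn't provided."""
    parts = []
    if role:
        parts.append(f"You are {role}.")
    parts.append(directive)  # the directive is the one required piece
    if context:
        parts.append(f"Context: {context}")
    if examples:
        parts.append("Examples:\n" + "\n".join(f"- {e}" for e in examples))
    if style:
        parts.append(f"Style: {style}")
    if output_format:
        parts.append(f"Format the output as {output_format}.")
    return "\n\n".join(parts)

prompt = assemble_prompt(
    directive="Summarize the customer feedback below.",
    role="a marketing analyst",
    context="Feedback was collected after our July product launch.",
    examples=["'Love the new dashboard' -> theme: UI, sentiment: positive"],
    style="concise and neutral",
    output_format="a bulleted list of themes",
)
print(prompt)
```

A template like this makes it easy to experiment with one component at a time, which is the kind of variation testing discussed later in the episode.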

[00:34:49] Paul Roetzer: So while there are universals that do seem to work across models, because they're generally trained on similar or, in many cases, the same data, so prompting styles seem to kind of work [00:35:00] across models, there are nuances to them. So go read the Anthropic prompt library. Look at their Google Sheets prompting tutorial.

[00:35:08] Paul Roetzer: Go to Google Gemini and get their prompting guide, a quick-start handbook for effective prompts. Read the Runway guide. There are some great resources provided. And then the last thing, and again, Mike and I try and avoid the podcast being promotional, like, outside of the upfront promo for something we do.

[00:35:25] Paul Roetzer: We don't really talk too much about what we provide. But we did actually recently introduce a Prompting for Professionals workshop for this specific reason, because that was the number one thing we kept hearing from people: we don't know what to do, we don't know how to guide our team. So we now offer a workshop where we'll go in, teach people the art and science of prompting, and create a collection of template prompts that teams can immediately apply to their workflows.

[00:35:49] Paul Roetzer: But you don't have to use us. You can go to smarterx.ai, click on Workshops, and learn more about it. There are tons of resources out there for this stuff. Just find people you [00:36:00] trust who can help you with prompting and give it a go, even if it's a free resource or reading these guides. But there are also professional solutions, like the workshops Mike and I offer.

[00:36:10] Paul Roetzer: But find people who can help you do this. If you don't have the time to do it yourself, because this is very valuable to your team.

[00:36:16] Mike Kaput: Yeah, I couldn't agree more with that. After having gone through this entire report, I've already got to-dos listed out to go through it a second and third time, because it is, like you said, one of the highest-value uses of your time in terms of ROI. In all the workshops and engagements that we do, I can tell you I don't see 99 percent of people getting anywhere close to prompting as well as you can after reading this report.

[00:36:47] Mike Kaput: Yeah.

[00:36:47] Paul Roetzer: Yeah. And the other thing I'll tell you, and Mike, I know you've done this: go build a GPT in ChatGPT, or go build a project in Anthropic, and experiment with variations [00:37:00] of the prompt, like the different components they describe. It is shocking sometimes how talking to the model differently, the prompt you provide, the system prompt,

[00:37:10] Paul Roetzer: how it affects the output. And it's a hundred percent true that examples are critical. I have built a GPT where I didn't have examples, and then I said, no, no, no, I want it to be like this, and boom, it nails it. It even improves on the example I gave it. We say this all the time: you have to experiment with these things.

[00:37:29] Paul Roetzer: You cannot just go in and click the Help Me Write button in Gemini or in Copilot and think you're going to get value, or take an email and say, change the tone to be more professional. That is so surface-level right now. And that, again, is the fault of the technology companies selling this stuff, that they're not really teaching you how to prompt in ways that matter to you.

[00:37:51] Paul Roetzer: And the companies that are enabling these licenses at 30 bucks per user per month, and then questioning the value,

[00:37:56] Mike Kaput: Right.

[00:37:57] Paul Roetzer: just do the basics, like have a plan, [00:38:00] teach them how to prompt, give them use cases.

[00:38:02] OpenAI “Strawberry”

[00:38:02] Mike Kaput: Alright, let's dive into a bunch of rapid fire topics for this week. So first up, according to reporting from Reuters, OpenAI is working on a secretive new project code-named Strawberry.

[00:38:14] Mike Kaput: This information comes from an internal document that was reviewed by Reuters and someone who is familiar with the matter whom they interviewed.

[00:38:22] Mike Kaput: Strawberry appears to be a novel approach to AI models aimed at dramatically improving their reasoning capabilities. The project's goal is to enable AI to plan ahead and navigate the internet autonomously to perform what OpenAI calls, quote, deep research. Now, details about how all this works and what this is are really, really light.

[00:38:43] Mike Kaput: We don't have a ton right now. But Paul, this seems like pretty big news. Is this GPT-5, or an important component of it? What's going on?


[00:38:54] Paul Roetzer: So sometimes prepping for this podcast is a lot of fun. And this was [00:39:00] one where

[00:39:01] Paul Roetzer: it was probably a little too much fun for me this morning. So, a couple of things of context here. One, this seems to possibly be related to the Q-Star news from fall of 2023, when Sam was temporarily ousted as CEO of OpenAI and there were rumors swirling around Q-Star and these advancements in reasoning.

[00:39:23] Paul Roetzer: It seems like this may be a continuation of that, possibly under a different code name or something. So my immediate takeaway is more innovation is definitely coming, whether this is part of GPT-5 or it's going to be something in the future. It's coming. But the part I thought was hilarious is the code name, Strawberry.

[00:39:43] Paul Roetzer: It's like, what the heck does Strawberry have to do with anything? I don't remember if I saw a tweet about this, I don't remember how I ended up going down this path, but it appeared as though it might be an effort by OpenAI to troll Elon Musk. So as we'll [00:40:00] recall from many past episodes, Elon and Sam Altman are not the biggest fans of each other.

[00:40:04] Paul Roetzer: Specifically, Elon is not a fan of OpenAI and Sam Altman. Quick recap: Elon puts 40 million in to start OpenAI and creates the name OpenAI; he and Sam and a few others build OpenAI as the counterpunch to Google and their pursuit of AGI. Sam and Elon have a falling out in 2019. Elon exits OpenAI, and now he is obviously doing all the things he's doing, including xAI to compete with OpenAI. If that's all new to you, go back a few episodes and catch up. Okay, so now, the Strawberry trolling possibility. In April 2017, Vanity Fair published an article based on an interview conducted in March 2017. So this is a couple months before the Transformer paper, before Attention Is All You Need [00:41:00] comes out of Google Brain.

[00:41:01] Paul Roetzer: the invention of the Transformer that leads to the building of GPT. So we are about three to four months prior to the publishing of Attention Is All You Need in spring 2017. Now, Elon and Sam are still boys at this point, fresh, a year into building OpenAI.

[00:41:21] Paul Roetzer: The headline is Elon Musk's Billion Dollar Crusade to Stop the AI Apocalypse. The lead to the story, Elon Musk is famous for his futuristic gambles, but Silicon Valley's latest rush to embrace AI scares him. And he thinks you should be frightened too. Inside his efforts to influence the rapidly advancing field and its proponents, and to save humanity from machine learning overlords.

[00:41:47] Paul Roetzer: It talks about a meeting between Musk and Demis Hassabis. Again, we're back in the early days, 2015, '16, '17, at SpaceX, and they're talking about Musk's goals to colonize [00:42:00] Mars. And Demis Hassabis, the CEO of Google DeepMind, replied that, in fact, he was working on the most important project in the world: developing artificial superintelligence.

[00:42:14] Paul Roetzer: Musk countered that this was one reason we needed to colonize Mars, so that we'll have a bolt-hole, I don't even know what a bolt-hole is, if AI goes rogue and turns on humanity. Amused, Hassabis said that AI would simply follow humans to Mars. The article goes on to say: the field of AI is rapidly developing, but still far from the powerful self-evolving software that haunts Musk.

[00:42:39] Paul Roetzer: Facebook uses AI to target advertising, photo tagging, and curated news feeds. Microsoft and Apple use AI to power their digital assistants, Cortana and Siri. Google's search engine from the beginning has been dependent on AI. All of these small advances are part of the chase to eventually create flexible, [00:43:00] self-teaching AI that will mirror human learning.

[00:43:03] Paul Roetzer: There are some very important words being used here. Other quick context: this is about a year after I started Marketing AI Institute, so we were deep in AI research by this time, following very closely along with Demis and Elon and Sam and others. The article goes on to say, again in spring 2017, that before DeepMind was gobbled up by Google in 2014 as part of its AI shopping spree, Musk had been an investor in the company, in DeepMind, with Demis.

[00:43:32] Paul Roetzer: He told me that his involvement was not about a return on his money, but rather to keep a wary eye on the arc of AI. Quote, it gave me more visibility into the rate at which things were improving, and I think they're really improving at an accelerating rate far faster than people realize. Musk warned that they could be creating the means of their own destruction.

[00:43:53] Paul Roetzer: He told Bloomberg's Ashley Vance, who just last week launched an HBO show [00:44:00] about space exploration, which I have to watch, sounds awesome. So he told Ashley Vance, the author of the biography Elon Musk, that he was afraid that his friend Larry Page, a co-founder of Google and, at that time, CEO of its parent company, Alphabet, could have perfectly good intentions but still

[00:44:18] Paul Roetzer: quote, produce something evil by accident, including possibly, quote, a fleet of artificial intelligence-enhanced robots capable of destroying mankind. So again, in 2017, Elon Musk is pushing this idea that we may accidentally create robots that destroy mankind. The author also spoke with Eliezer Yudkowsky, who we've talked about on the show before, who stated:

[00:44:44] Paul Roetzer: The AI doesn't have to take over the whole internet. It doesn't need drones. It's not dangerous because it has guns. It's dangerous because it's smarter than us. This is Eliezer still: It's impossible for me to predict exactly how we'd lose, because the AI [00:45:00] will be smarter than I am. When you're building something smarter than you, you have to get it right on the first try.

[00:45:06] Paul Roetzer: This is an inside look at the doomer's perspective going back to 2017. So now, here is the strawberry thing. The author says, I'm thinking back, after I talked to Eliezer, about my conversations with Elon and Sam, and this is what Elon told her in explaining the potential for AI to take over and destroy mankind. This is a quote from Elon: Let's say you create a self-improving AI to pick strawberries, and it gets better and better at picking strawberries, and picks more and more, and it is self-improving, so all it really wants to do is pick strawberries. So then it would have all the world be strawberry fields, strawberry fields forever, and there would be no room for human beings. So, I don't know for a fact that this is the origin of it, but it's possible, and I [00:46:00] would not put it past them, that Sam and OpenAI are simply trolling Elon Musk for this strawberry quote. But the key here, and this is what I think is very important: go back to that quote, picking strawberries and picks more and more, and it is self-improving.

[00:46:17] Paul Roetzer: We know self-improving is critical. We talked about that on, I think it was, episode 105, where we talked about advancements that were potentially the foundation of self-improvement. Andrej Karpathy talked about self-improvement as a goal of all AI research labs right now. And so I think that Strawberry may actually be a hint.

[00:46:40] Paul Roetzer: A way not just toward reasoning, but also toward self-improvement of these models. Altman tried to capture, again going back to the Vanity Fair article, the chilling grandeur of what's at stake. This is from the article, quote: it's a very exciting time to be alive, because in the next few [00:47:00] decades we are either going to head towards self-destruction or toward human descendants eventually colonizing the universe.

[00:47:08] Paul Roetzer: Now I will wrap up this segment, because this was supposed to be a rapid fire item, but I got a little carried away this morning. Elon Musk tweets: Nice work by xAI team, X team, NVIDIA, and supporting companies getting Memphis Supercluster training started at 4:20 a.m. local time with 100,000 liquid-cooled H100s on a [00:47:35] single RDMA fabric.

[00:47:37] Paul Roetzer: It's the most powerful AI training cluster in the world. This is a significant advantage in training the world's most powerful AI by every metric by December of this year. Then Emad Mostaque, the former CEO of Stability AI, who, well, I was going to say ousted, who left the company a couple of months ago,[00:48:00]

[00:48:00] Paul Roetzer: replied: likely to be the fastest supercomputer in the world at 2.5 exaflops; the current fastest is Frontier at 1.2 exaflops, then Aurora at one, and, with the latest advances, it could train a GPT-4-class model in about a week. So Elon Musk in 2017 is worried about the destruction of humanity and, you know, strawberry fields taking over the world.

[00:48:26] Paul Roetzer: Leaves OpenAI, and now apparently has built the largest super cluster in the world of NVIDIA chips and has the most powerful supercomputer in the world. As the AI world turns.

[00:48:40] Apple’s AI Steals YouTube Videos

[00:48:40] Mike Kaput: No kidding. Alright, in some other news, a new investigation has revealed that some of the biggest names in technology, including Apple, NVIDIA, Anthropic, and Salesforce, have used thousands of YouTube videos to train their AI models without creators' [00:49:00] knowledge or consent. This investigation was conducted by WIRED and co-published with Proof News, and they found that subtitles from over 173,000 YouTube videos, taken from more than 48,000 channels, have been used in AI training.

[00:49:19] Mike Kaput: So this dataset, called YouTube Subtitles, contains transcripts from a wide range of sources, including educational channels, news outlets, and popular YouTubers. Now, many creators were unaware their content had been used. YouTube's terms of service prohibit harvesting materials from the platform without permission.

[00:49:41] Mike Kaput: However, several AI companies argue that their use of this data falls under fair use. So, Paul, it's, I guess, unfortunately at this point not really a surprise that some companies have used YouTube videos to train their models. This isn't the first time we've heard of this happening, but do [00:50:00] any of the companies mentioned here engaging in this behavior surprise you at all?

[00:50:05] Paul Roetzer: Nothing at this point surprises me. Again, we don't have a ton of details here, but the way this generally works is they can hire third parties who then use this data, and it gives them a little bit of arm's length, to be able to say, you know, we didn't necessarily know that was happening.

[00:50:30] Paul Roetzer: But I think by now it's pretty common knowledge that YouTube videos and transcripts are being used for training. It seemed quite apparent when Sora was first introduced by OpenAI, and the CTO was asked and said, oh, I don't know if we used YouTube videos.

[00:50:49] Paul Roetzer: It's like, yeah, okay, we know you did. Everybody, just admit that this is going on. The question then becomes, well, why doesn't Google, which owns YouTube, stop it? Well, because Google's [00:51:00] likely training on them too, which is a violation of their own terms of use. So it just seems like a well-known secret that everyone is doing this.

[00:51:09] Paul Roetzer: And what does that mean for the creators? Who knows? Again, every time we talk about innovation and how amazing this stuff is, we can't every single time stop and say, but they're training on stuff they shouldn't be training on. But that is the underlying story here.

[00:51:27] Paul Roetzer: What's going to keep coming up is how these models were trained, how they are trained, how they will be trained moving forward, and whether or not how they did it will end up being legal, and whether or not it is currently ethical. These are debates that don't really have an answer right now, but it's important for people to stay educated on what's going on.

[00:51:48] Crowdstrike Outage

[00:51:48] Mike Kaput: Alright, another huge piece of news from this past week was a massive global IT outage that caused widespread disruptions to travel, healthcare, and businesses [00:52:00] worldwide. This is, of course, what we've all heard of, due to a defective software update from security firm CrowdStrike, and it largely affected Windows computers and systems.

[00:52:11] Mike Kaput: Even though this issue was identified and fixed pretty early on Friday, the ripple effects really rapidly caused chaos. I mean, thousands of flights were canceled, hospitals were affected, shoppers were encountering the blue screen of death on self-checkout terminals, and even the Paris Olympics organizers reported issues with things like getting uniforms delivered.

[00:52:36] Mike Kaput: Now, as companies are scrambling to pick up the pieces, get everything back online, and repair the damage done from the outage, there's actually, Paul, like we've discussed, kind of a compelling AI angle. AI was obviously not necessarily involved in causing any of these issues, but there have been some interesting ways that people are using AI to pick up the [00:53:00] pieces.

[00:53:00] Mike Kaput: Could you maybe walk us through, connecting the dots here?

[00:53:04] Paul Roetzer: There's a couple of things we wanted to touch on with this. So yeah, on the surface, not an AI story. Obviously, people started pushing rumors that the code was written by AI and that's what happened. There's no validation to that.

[00:53:17] Paul Roetzer: But I think for us, a couple of things are worth noting. First, how fragile our infrastructure is.

[00:53:25] Paul Roetzer: So this is a single piece of code. It wasn't a cyberattack, wasn't a hack of any kind, nothing nefarious. It was just a piece of code that was executed on Windows devices. And I believe they have 29,000 customers, I think is what I read. That's just customers, and some of those customers have tens of thousands of devices that use Windows.

[00:53:46] Paul Roetzer: So this single piece of code was executed at like 4:30 a.m. Eastern time; I woke up at 6 a.m. and this CrowdStrike thing was [00:54:00] everywhere. That single point of failure can impact so many industries. And so I think it's important that people are aware how fragile the infrastructure is worldwide.

[00:54:11] Paul Roetzer: This was not just a U.S.-based thing. This was everywhere. And as AI becomes more involved in the writing and editing of code, again, not saying it had anything to do with this, but more and more coders are relying on things like Copilot to assist in the creation of code, the question becomes: do risks like this become greater?

[00:54:37] Paul Roetzer: Is there a higher risk that these sorts of events start to occur more regularly? Does it become harder for humans to monitor and discover errors if they become too trusting of AI's ability to do this? And that's the challenge we're going to have with all AI: if it gets to human level or beyond [00:55:00] and people just start trusting it to do the work for them, in things like this, where the risk of it being wrong is high, that imprecision we've talked about, those little errors can have massive impact.

[00:55:13] Paul Roetzer: So that's the thing I started thinking: wow, as we race forward into this future where AI has more and more involvement in the creation of things, including code, how does that play out when our infrastructure is so fragile? So that was the one. And then the other thing that you mentioned or alluded to is: is AI the answer to help fix this?

[00:55:33] Paul Roetzer: So, not being an IT person, my understanding from having read some articles is that the fix to this is largely manual. It requires rebooting these systems and going through a series of tasks to fix them, which in many cases requires IT support to do it. Which is why it may be weeks, or in some cases months, before all of these devices are fixed.

[00:55:57] Paul Roetzer: So Cognition, an AI research [00:56:00] lab building Devin, which we talked about in episode 88, and we'll put the link in the show notes to episode 88, where we talked about the first AI software engineer. Devin came out with a lot of acclaim, didn't really live up to what it was portrayed to be, but it's certainly a signal of where AI agents are going.

[00:56:20] Paul Roetzer: And so Cognition published a post. They said that to test Devin's ability to help with something like this, they set up a Windows machine in a cloud environment with simulated CrowdStrike failure conditions. And then they played out what they did: they basically instructed Devin with a playbook of how to recover a machine, gave it the initial eight steps, and then they let it go.

[00:56:40] Paul Roetzer: And Devin was able to do it. And so their whole point was, listen, there's like hundreds of thousands of devices that need to get fixed. There aren't enough IT people in the world to go do this. Maybe AI agents are a way to actually accelerate this kind of thing. Not replacing IT people, but assisting and stuff like this.

[00:56:58] Paul Roetzer: So, yeah, [00:57:00] just worth noting, because it was a major event that happened in the world that has a relation to where we're going with AI.

[00:57:09] Llama 3 405B Release

[00:57:09] Mike Kaput: All right. Next up, according to reporting from The Information, Meta plans to release the largest version of its open source Llama 3 model on July 23rd, the day this podcast is coming out, and this is according to a Meta employee interviewed by the publication. The new model is 405 billion parameters.

[00:57:33] Mike Kaput: And this version of Llama 3 is also multimodal. This capability puts it in direct competition with all sorts of other advanced AI models on the market, with the notable exception that Meta's approach to AI embraces an open source strategy. In April, they had released two smaller Llama 3 models that we talked about, both of which were open source

[00:57:56] Mike Kaput: and as a result were quickly adopted by [00:58:00] developers building solutions on top of these models. So Paul, it sounds like if this information ends up correct, Llama 3 405B could drop the day this episode airs. What does this mean for Meta, for businesses, for open source as a whole?

[00:58:17] Paul Roetzer: Well, they're obviously pushing the limits of open source. We don't know that this model will be open source; that was unknown at the time, so we'll find out. But I think it just continues to show the race is on. It's not just OpenAI and Anthropic and Google. Meta has a major say here. xAI, which we just talked about with Elon Musk, has a major say.

[00:58:40] Paul Roetzer: There's Mistral, there's Cohere, there's others, and it's just this constant leaping of capabilities. But if, if they open source this, it would be a big deal. But I think this will also give us an indication of Meta's larger play into enterprise AI potentially, you know, are they actually going to try and make money?

[00:58:59] Paul Roetzer: Are they going to try and [00:59:00] build enterprise solutions for people? That'll be interesting to see when they do finally announce this, whether it's July 23rd or a later date.

[00:59:09] Meta Will Avoid the EU

[00:59:09] Mike Kaput: So some other interesting Meta news. They've actually decided to withhold the upcoming multimodal Llama model and future ones right now from customers in the European Union. So Meta says that they are getting a lack of clarity from EU regulators. So according to Meta, it's not just impacting their direct customers.

[00:59:33] Mike Kaput: but European companies won't be able to use these new models, even when they're being released under an open license. This could also prevent non EU companies from offering products and services in Europe that use these new models. Now, we've talked a lot about EU legislation, but interestingly, META's issue is not with the EU's upcoming AI Act.

[00:59:55] Mike Kaput: They're more concerned with how they're allowed or not allowed to train models [01:00:00] using data from European customers in compliance with GDPR, which is the EU's data protection law that's been in effect for quite some time now. Now, META claims that they had briefed EU regulators on their intentions to train these models on EU citizen data, giving them plenty of opportunities to opt out and notices, et cetera.

[01:00:22] Mike Kaput: Yet, it sounds like, at least from their side of the story, EU regulators surprised them in June, telling them to pause training on EU data after Meta publicly announced that they were doing such training. So, Paul, do you see this becoming a trend? Should we expect to see more major AI companies start avoiding Europe without appropriate clarification on some of these issues?

[01:00:47] Paul Roetzer: Yeah, I don't think there's any doubt. I mean, we saw this already last year with Google, where they didn't release Gemini to 450 million citizens in the EU. I think there's just a lot of uncertainty around the AI Act, [01:01:00] and now that it's official, I think companies have something like 180 days, or maybe over the next year as the regulations play out, to get in line.

[01:01:12] Paul Roetzer: And I think a lot of these tech companies are just going to be uncertain about what exactly that means. And so I think we're probably going to see a lot of these slow-played rollouts, or companies withholding things to get clarifications. And there's probably going to be a story about this every week related to one of these companies.

[01:01:28] Paul Roetzer: So I think the interesting thing will be, you know, what it does to innovation there. And I've seen a lot of stories around just how hard it's going to be to build AI startups within the EU because of these rules. And, you know, the big thing, maybe we'll talk about this more next week, is in the U.S., what lessons are learned as we look at, you know, tighter regulations, and what the different potential administrations may do there.

[01:01:57] Intuit’s AI-Reorganization Plan

[01:01:57] Mike Kaput: So next up, the financial software company [01:02:00] Intuit has announced a major AI-focused reorganization plan. This plan includes laying off about 10 percent of their workforce, about 1,800 people, though they say they expect to hire at least that many new employees in fiscal year 2025, as they're basically focused on reshaping their business to incorporate AI into their products and services.

[01:02:22] Mike Kaput: Their CEO, Sasan Goodarzi, stated in an email to employees that, quote, companies that aren't prepared to take advantage of the AI revolution will fall behind and over time will no longer exist. So Paul, that last quote sounds very similar to things we've been saying for a couple years. How do you see this news fitting into the overall picture of existing companies becoming AI emergent?

[01:02:50] Paul Roetzer: Yeah, I mean, we've talked about it before, but I wrote a blog post called The Future of Business Is AI or Obsolete back in 2022. And this is exactly the premise: [01:03:00] every company is going to come to the realization, no matter what industry they're in, that if they don't become AI emergent, if they don't become AI forward and find ways to infuse it into their operations, their tech stacks, and the work of all of their talent, they're just going to become obsolete.

[01:03:13] Paul Roetzer: And depending on the industry, it'll happen over different time periods. SaaS this year; legal, healthcare, financial services, three to five years. It's going to vary by industry. But this is the inevitable outcome, and CEOs are going to realize it if they haven't already, and there's going to be massive pressure to do things like this.

[01:03:33] Paul Roetzer: I think in 2024 we'll hear sporadic stories about this. In 2025, 2026, we're going to hear about this daily. You know, companies making massive shifts in their workforce because of AI. Again, I talked about this at length last week. I just think it's inevitable. There's only a few things that can slow it down.

[01:03:55] Paul Roetzer: And I don't think any of those things are going to stop the widespread application of [01:04:00] this.

[01:04:00] Andrej Karpathy Starts Eureka Labs

[01:04:00] Mike Kaput: All right, so former OpenAI star researcher Andrej Karpathy, who we had just mentioned previously, has launched Eureka Labs, a company designed to create a, quote, AI-native school. Now, basically, they envision a learning experience where students can work through high-quality course materials with the guidance of an AI teaching assistant that basically embodies the knowledge and teaching style of world-class experts.

[01:04:28] Mike Kaput: So they're looking to create a symbiosis between human teachers who design course materials and AI assistants who help guide students through them. As part of this announcement, they have also revealed their first product, which will showcase this approach. It is an undergraduate-level AI course called LLM101n.

[01:04:50] Mike Kaput: This course will teach students how to train their own AI, similar to the AI teaching assistant that's going to be teaching and shepherding the students through the [01:05:00] course itself. They also plan to offer this course both online and in physical cohorts. Now, none of these courses are live yet. This is really just kind of an intro announcement previewing the company, but given the people behind it, it's definitely worth paying attention to.

[01:05:16] Mike Kaput: Paul, how much of a need is there for this kind of AI-first education, for a total reinvention of education based on what AI is capable of?

[01:05:27] Paul Roetzer: My first thought here is, Andrej could literally do anything right now. I mean, Google, OpenAI, Microsoft, Anthropic, or start his own thing. He can name his price and go work anywhere in the world right now. So the fact that he's choosing to put his time and energy into education and training, I think, is all you need to know about the importance of education and training, like we talked about last week and this week.

[01:05:56] Paul Roetzer: So, it just validates for me how important it is that we, [01:06:00] you know, prepare people. My initial look at the site, which is a single page (I love developers, it's just six paragraphs: here's what we do, here's some links), is that it appears to be for technical audiences. So this is not like you're a business leader or a marketer and you're going to go take this. This is for developers.

[01:06:16] Paul Roetzer: But the other thing, to your question: I think with what they do here, you're going to see their innovations trickle down into the future of education. As an organization that does education and training, I'm extremely interested in what they're doing and how they're doing it, because we're constantly asking, what is the next-gen version of what we do?

[01:06:37] Paul Roetzer: And so we will follow this very closely.

[01:06:41] Update on Lattice Digital Workers

[01:06:41] Mike Kaput: All right, very quickly, our last topic today is just a quick update from last week. Last week we talked about how the CEO of an HR platform, Lattice, announced that they were going to start treating digital AI workers within the platform just like human employees. So AI workers were going to be onboarded, [01:07:00] given appropriate systems access, and have their work facilitated in all the ways that Lattice already does for human employees.

[01:07:07] Mike Kaput: HR professionals. Now, many, perhaps most, in the world of AI were both a bit skeptical and confused by the short announcement. It was like very light on details, pretty heavy on jargon. We are also skeptical and confused, and now the company has issued a retraction on the original post, backpedaling from its initial messaging and plan here, saying, quote, This innovation sparked a lot of conversation and questions that have no answers yet.

[01:07:37] Mike Kaput: We look forward to continuing to work with our customers on AI, but we will not further pursue digital workers in the product. End quote. Paul, I can't say this is surprising, but it did happen really fast. I know we were both very unsure of this to begin with. What are your thoughts here?

[01:07:55] Paul Roetzer: I mean, one, [01:08:00] kudos to them for realizing the error of their ways. Yeah, I think I said last week I didn't want to crush them for what was obviously a really bad idea. And, you know, they did what they had to do and retracted it and just kind of said, okay, we heard you. So, good.

[01:08:17] Paul Roetzer: The second and final thing I'll say is, there's going to be a lot of innovation in the AI space.

[01:08:21] Paul Roetzer: There is a lot of hype, and you have to be able to see through that. So, we try on this podcast to be as objective as possible about these innovations. You know, obviously I wasn't a big proponent of the Rabbit when that device came out. I kind of thought it was a scam, and it was. The AI Pin, you know, I just didn't see that playing out.

[01:08:46] Paul Roetzer: It didn't. We try really hard, when we see something that we think is not legitimate or not the best path forward, to be, again, as objective as [01:09:00] possible, not overly negative. I think innovation is critical. People have to try and push the limits, but sometimes it's being pushed in the wrong directions.

[01:09:08] Paul Roetzer: And this was a case of that. And I'm happy that they realized that very quickly and made a change in direction.

[01:09:15] Mike Kaput: All right, that's a wrap on this week. Paul, thanks as always for walking us through the complex, sometimes confusing, but always exciting...

[01:09:25] Paul Roetzer: Man, it's fun to dive back into the history sometimes.

[01:09:27] Mike Kaput: No, no kidding. I love it. Yeah. I'd say at least it doesn't feel as much like drinking from a fire hose every week now that we're back to our weekly cadence.

[01:09:35] Mike Kaput: Yeah,

[01:09:36] Paul Roetzer: seriously.

[01:09:36] Mike Kaput: But a few quick final announcements. If you have not checked out our newsletter yet, go to marketingaiinstitute.com/newsletter. We cover, in more depth, all the news that we talked about here, as well as other topics we don't have time to get to. And if you have not left us a review on your podcast platform of choice, we would love it if you could do that for us.

[01:09:58] Mike Kaput: We really take into account [01:10:00] all feedback possible to make the show as good as possible and get it in front of as many people as possible. All right, Paul, thanks so much.

[01:10:08] Paul Roetzer: Thanks, Mike. We'll do it again next week. Thanks, everyone, for listening.

[01:10:12] Paul Roetzer: Thanks for listening to The AI Show. Visit MarketingAIInstitute.com to continue your AI learning journey and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community.

[01:10:35] Paul Roetzer: Until next time, stay curious and explore AI.