This week in AI, a $157 billion valuation, groundbreaking AI tools, and a tech giant's bold move. Join Mike and Paul as they unpack OpenAI's massive $6.6 billion funding round and the growing tension between its nonprofit roots and financial ambitions. They'll dive into OpenAI's latest product announcements and Accenture's bold move to form a dedicated Nvidia business group. Stay tuned for our rapid-fire section covering updates on NotebookLM, Copilot, Meta's smart glasses, and more in this AI-packed episode.
Listen or watch below—and see below for show notes and the transcript.
Today’s episode is brought to you by rasa.io. Rasa.io makes staying in front of your audience easy. Their smart newsletter platform does the impossible by tailoring each email newsletter for each subscriber, ensuring every email you send is not just relevant but compelling.
Visit rasa.io/maii and sign up with the code 5MAII for an exclusive 5% discount for podcast listeners.
Listen Now
Watch the Video
Timestamps
00:03:18 — OpenAI’s Latest Funding Round
- New funding to scale the benefits of AI - OpenAI
- New Credit Facility Enhances Financial Flexibility - OpenAI
- Tim Brooks X Status
- A co-lead on Sora, OpenAI’s video generator, has left for Google - TechCrunch
- OpenAI Completes Deal That Values Company at $157 Billion - New York Times
- OpenAI Nearly Doubles Valuation to $157 Billion in Funding Round - Wall Street Journal
- OpenAI asks investors to avoid five AI startups including Sutskever's SSI, sources say - Reuters
- Why Microsoft Will Likely Keep Its Iron Grip on OpenAI’s Future Profits - The Information
00:11:32 — OpenAI Canvas / DevDay
00:23:13 — Accenture Forms Nvidia Business Group
- Accenture forms Nvidia business group to scale enterprise AI adoption - VentureBeat
- Episode 91 of The Artificial Intelligence Show
00:30:52 — Nvidia Open Model
00:35:30 — Google AI Search Updates
00:38:55 — NotebookLM Updates
- Raiza Martin X Status
- Logan Kilpatrick NotebookLM Updates X Post
- NotebookLM adds audio and YouTube support, plus easier sharing of Audio Overviews - Google Blog
- Andrej Karpathy X Status
- Ethan Mollick X Post
- Jason Spielman LinkedIn Post
00:46:07 — Microsoft Copilot Updates
- Microsoft Copilot can now read your screen, think deeply, and speak aloud to you - TechCrunch
- An AI companion for everyone - Microsoft Blog
00:51:41 — Meta Smart Glasses Training
00:56:34 — Meta Smart Glasses Doxxing
01:02:13 — Meta Movie Gen
- Meta Movie Gen - Meta AI
- How Meta Movie Gen could usher in a new AI-enabled era for content creators - Meta AI Blog
- Movie Gen: A Cast of Media Foundation Models - Meta AI
Summary
OpenAI’s Significant Funding Round
OpenAI has completed a $6.6 billion funding round that values the company at $157 billion. This nearly doubles the company's valuation from just nine months ago, when it was valued at $80 billion.
The funding round was led by Thrive Capital, with participation from tech giants Microsoft and Nvidia, as well as SoftBank and the United Arab Emirates investment firm MGX. Thrive Capital alone invested about $1.3 billion, with an option to invest up to $1 billion more at the same valuation through 2025.
This massive valuation comes despite OpenAI's current financial losses. While the company expects about $3.7 billion in sales this year, it's projecting losses of roughly $5 billion due to the high costs associated with developing and running AI technologies like ChatGPT.
The funding comes with certain conditions. OpenAI has two years to transform into a for-profit business, or the funding will convert into debt.
This highlights the ongoing tension between OpenAI's original nonprofit mission and the financial realities of developing cutting-edge AI technology.
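For readers who want to sanity-check the headline numbers, here is a minimal sketch of the arithmetic. It uses only figures quoted in this episode; the $11.6 billion revenue projection is the one Paul cites in the transcript. The function names are illustrative, and this back-of-the-envelope approach is an assumption about how such rounds are commonly framed, not a description of how the deal was actually priced.

```python
# Figures quoted in the episode, in billions of US dollars.
post_money = 157.0         # valuation after the round
new_investment = 6.6       # size of the funding round
prior_valuation = 80.0     # valuation roughly nine months earlier
projected_revenue = 11.6   # next-12-month projection cited in the episode


def implied_pre_money(post_money_valuation, investment):
    """Pre-money valuation is post-money valuation minus the new cash raised."""
    return post_money_valuation - investment


def revenue_multiple(valuation, revenue):
    """How many times projected revenue the valuation represents."""
    return valuation / revenue


pre_money = implied_pre_money(post_money, new_investment)
growth = post_money / prior_valuation
multiple = revenue_multiple(post_money, projected_revenue)

print(f"Implied pre-money valuation: ${pre_money:.1f}B")
print(f"Growth over prior valuation: {growth:.2f}x")
print(f"Valuation / projected revenue: {multiple:.1f}x")
```

The implied pre-money figure lands close to the "150 pre-money" number mentioned in the conversation, and the growth factor is just under 2x, which is why the round is described as nearly doubling the valuation.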
OpenAI Canvas and DevDay Updates
OpenAI has announced major updates to its AI offerings, improving developer tools and user interfaces to enhance accessibility, efficiency, and customization for various applications.
First, OpenAI introduced Canvas, a new interface for ChatGPT designed for more complex writing and coding projects. Canvas allows users to collaborate with the AI in a separate window, offering direct control, inline feedback, and targeted editing. This feature enhances ChatGPT's ability to assist with tasks that require multiple revisions and contextual understanding.
Second, the company launched the Realtime API in public beta. This API enables developers to integrate fast speech-to-speech functionalities into their apps, supporting natural, multimodal conversations with low latency.
Third, OpenAI introduced vision fine-tuning for the GPT-4o model. This allows developers to fine-tune the model using both images and text, opening up new possibilities for applications in visual search, object detection, autonomous vehicles, and medical image analysis.
Fourth, the company unveiled Prompt Caching, a feature that helps developers reduce costs and processing times when using repeated inputs across multiple API calls.
Lastly, OpenAI announced Model Distillation, a new offering that allows developers to fine-tune smaller, cost-efficient models using outputs from larger, more capable models. This streamlines the process of improving smaller models with real-world data, making it easier to deploy powerful AI capabilities at a lower cost.
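To make the Prompt Caching savings concrete, here is a rough sketch of the cost math. The 50 percent discount on reused input tokens is the figure mentioned in the episode; the per-token price, the token counts, and the `input_cost` helper are hypothetical values chosen for illustration, not OpenAI's actual pricing or API.

```python
def input_cost(total_tokens, cached_tokens, price_per_million, cached_discount=0.5):
    """Return the input cost in dollars when cached tokens are billed at a discount.

    cached_discount=0.5 reflects the 50 percent discount quoted in the episode;
    every other number passed in is an assumption supplied by the caller.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    cached_price = price_per_million * (1 - cached_discount)
    total = uncached_tokens * price_per_million + cached_tokens * cached_price
    return total / 1_000_000


PRICE = 2.50  # hypothetical dollars per million input tokens

# A 10,000-token prompt where 8,000 tokens repeat a shared prefix.
without_cache = input_cost(10_000, 0, PRICE)
with_cache = input_cost(10_000, 8_000, PRICE)

print(f"Without caching: ${without_cache:.4f}")
print(f"With caching:    ${with_cache:.4f}")
```

Under these assumptions, a prompt that is 80 percent reused costs 40 percent less on the input side, whatever the base price is. The savings scale with how much of each request repeats.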
Accenture to Form a Dedicated Nvidia Business Group
Accenture is forming a dedicated NVIDIA business group. The initiative aims to accelerate the adoption and scaling of AI technologies across various industries, with a particular focus on generative AI and the emerging field of agentic AI systems.
The newly formed group will comprise 30,000 professionals who will receive specialized training to help enterprises reinvent processes and scale their AI adoption.
This massive investment in AI expertise comes as Accenture reported $3 billion in bookings related to generative AI in its recent fiscal year, highlighting the surging demand for these technologies.
One of the key focuses of this collaboration is the development and implementation of agentic AI systems. These represent the next frontier of generative AI, capable of acting on user intent, creating new workflows, and taking appropriate actions to reinvent entire processes or functions without constant human input.
To support this initiative, Accenture is expanding its network of AI Refinery Engineering Hubs globally, adding new locations in Singapore, Tokyo, Malaga, and London. These hubs will provide engineering skills and technical capacity for transforming large-scale operations using agentic AI systems.
Read the Transcription
Disclaimer: This transcription was written by AI, thanks to Descript, and has not been edited for content.
[00:00:00] Paul Roetzer: As a society, we're struggling to grasp the current technology like people's heads are going to explode if they try and start comprehending the complexities of everyone walking around with glasses that can record them
[00:00:13] Paul Roetzer: This is already the reality that people are wearing these things and they're going to be able to analyze things.
[00:00:18] Paul Roetzer: And that stuff you ask it to analyze is going to automatically record if it's in your home, if it's your family. Like it's just. There's no way to get that data out. And it's just a thing that I don't feel like we're prepared for.
[00:00:31] Paul Roetzer: Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of Marketing AI Institute, and I'm your host. Each week, I'm joined by my co-host and Marketing AI Institute Chief Content Officer, Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your [00:01:00] career.
[00:01:01] Paul Roetzer: Join us as we accelerate AI literacy for all.
[00:01:08] Paul Roetzer: Welcome to episode 118 of the Artificial Intelligence Show. I am your host, Paul Roetzer, along with my co-host, Mike Kaput. We're actually recording this one on Friday afternoon, October 4th, 2 p.m. Eastern. So if anything last-minute happens on a Friday and we miss it, that is why.
[00:01:25] Paul Roetzer: We have a full agenda to get through as is, so hopefully nothing happens in real time while we're doing this. Today's episode is brought to us by Rasa again. We talked about Rasa last week. Let's talk about a common challenge we all face: making our email newsletters truly engaging. I've been at this marketing institute thing for a while.
[00:01:45] Paul Roetzer: Obviously, I started in 2016. I can confidently say that Rasa.io is changing the email newsletter landscape. We've been followers of theirs for a long time. Mike and I use it more as like an internal tool. So our newsletters are not run through Rasa, but we'll use it to keep track of like research in [00:02:00] the industry and things, kind of, you know, find things for us, send us links that we can look at and kind of track what's going on.
[00:02:07] Paul Roetzer: So imagine each of your subscribers receiving a newsletter tailored just for them. Sounds impossible, right? Well, Rasa.io's AI-powered platform makes this easy. We've known the team at Rasa for about six years and they were one of our earliest partners and sponsors. And so they've been doing the personalized newsletter game for a long time.
[00:02:24] Paul Roetzer: You can check them out at Rasa.io/MAII and use the code 5MAII. And that's for a 5 percent discount on a Rasa subscription. So you can give it a try. Your subscribers and your engagement rates will thank you. So again, Rasa.io slash MAII. All right. Yeah. It seems impossible, Mike, to go a week without talking about OpenAI.
[00:02:49] Paul Roetzer: Last week was like the madness that was going on within. This week we've got some new funding, some new product updates, some more people leaving. It's just the never-ending [00:03:00] saga with OpenAI, but a lot of other stuff going on this week too with Accenture and NVIDIA, and Meta just showed up this morning and dropped a new Movie Gen model on us that we've been kind of scrambling to figure out.
[00:03:11] Paul Roetzer: So tons to talk about, but let's kick it off with OpenAI, the latest on their news.
[00:03:18] OpenAI Latest Funding Round
[00:03:18] Mike Kaput: Sounds good, Paul. Okay. So first up, the topic is OpenAI. They have completed a significant 6.6 billion funding round that values the company at 157 billion, which basically doubles the company's valuation from just nine months ago when it was valued around 80 billion. This funding round was led by Thrive Capital.
[00:03:43] Mike Kaput: There was participation from Microsoft and NVIDIA, as well as SoftBank and the United Arab Emirates investment firm, MGX. Thrive Capital alone invested about 1.3 billion. They have an option to invest up to a billion more at the same valuation through [00:04:00] 2025. This, of course, as we've talked about on past episodes, comes despite OpenAI's current financial losses.
[00:04:09] Mike Kaput: I mean, the company expects about 3.7 billion in sales this year but is projecting losses of roughly 5 billion due to the costs associated with developing and running AI technology like ChatGPT. Interestingly, this funding comes with certain conditions. OpenAI has two years to transform into a for-profit business or the funding will convert into debt.
[00:04:35] Mike Kaput: So, Paul, we have been talking about this news for a while. It's been very well rumored, we've covered many of the salient details here, now it's official. Can you kind of walk through, like, what matters most to pay attention to here now that we know exactly what the details are of the fundraise?
[00:04:54] Paul Roetzer: Yeah, anyone who listened to episode 117, this was the number that was being rumored. We talked about [00:05:00] that and we touched on the valuation, like how do you get to 150 pre-money, 157 roughly post-money. It's because they're projecting 11.6 billion in future revenue the next 12 months, basically. And then you apply a multiple to that, is roughly how it's done.
[00:05:18] Paul Roetzer: There may be some nuances this time around, but it gives you a ballpark of that number. So it's actually a reasonable number given that approach. A couple of other elements to this. One, in addition to the 6.6, was it 6.6 billion? 6.6. They also secured a 4 billion line of credit, a revolving line of credit.
[00:05:43] Paul Roetzer: So in their own post where they announced the 4 billion credit facility, they said they have 10 billion in liquidity, which gives them flexibility to invest in new initiatives. And then in their other post, so they had a post announcing the credit line and a post [00:06:00] announcing the equity,
[00:06:01] Paul Roetzer: so, we're making progress on our mission to ensure that AGI benefits all of humanity. That was the lead to the blog post. So again, bringing it back to like their overall mission, and said that the new funding is going to go toward leadership. So invest in talent, basically, increasing compute capacity, i.e.,
[00:06:17] Paul Roetzer: buying more Nvidia chips, and continue building tools that help people solve hard problems. They're going to need the money. They continue to lose people. So just, you know, last week we talked about all the people who have left this year. They just lost the co-lead for Sora. So the guy who was building Sora, which we talked about last week had its delays,
[00:06:36] Paul Roetzer: he is leaving to go to DeepMind and he tweeted, I will be joining Google DeepMind to work on video generation and world simulators. And then they also had another, let's see, what was this guy's name? He was one of the co-founding members, Durk Kingma, and Durk said, I'm joining Anthropic. Anthropic's approach to AI development resonates significantly with my own beliefs.
[00:06:59] Paul Roetzer: [00:07:00] Looking forward to contributing to Anthropic's mission of developing responsible AI. So, you know, things kind of keep evolving. The other thing that's sort of a nuance to this is, apparently, the word is that they wanted exclusives with their investors. So they apparently, and I think OpenAI has come out and just said this is not true, but there's lots of sources saying it is in fact true.
[00:07:26] Paul Roetzer: They asked for exclusives from their investors, meaning they were not allowed to invest in five companies that OpenAI identified as key competitors. One was Elon Musk's xAI, another is Anthropic. Another is Safe Superintelligence, which is Ilya Sutskever, one of the co-founders of OpenAI. That's his company.
[00:07:46] Paul Roetzer: Perplexity is another one. And then Glean, I thought was interesting, that made this list. And anybody who listens weekly to the show might recognize Glean. On episode 115, Mike and I talked about Glean. [00:08:00] They had just raised a 260 million Series E funding at a 4.6 billion valuation to build what they called the Google for Work using generative AI.
[00:08:09] Paul Roetzer: It is co-founded by four guys. Three of them are former Googlers and one is formerly from Facebook. So on the surface, it's a ton of money. One of the biggest, if not the biggest raise in history, a massive valuation. But as we have talked about many times on this show, it is a bridge to the next round.
[00:08:30] Paul Roetzer: This is not enough money to go where OpenAI and Sam Altman intend to go. They are going to need at least another 50 to a hundred billion in the next 12 months is my guess. I'm guessing it's going to probably be north of a hundred billion. So sometime in the next 12 to 18 months, they're going to do another massive round and, or they're going to go public.
[00:08:50] Paul Roetzer: My guess is it's going to be really complex to switch to the for profit that they're going to need to do before they can go public. So chances are they raise another 50 to a hundred billion in the [00:09:00] next 12 months. as the final bridge to, to the IPO. And at that point, they're probably valued at half a trillion dollars or more.
[00:09:08] Paul Roetzer: Like it's just, and I know the numbers are nuts, but that's the thing is keep in context, while this sounds like a whole bunch of money, this is not enough money to do what they're intending to do.
[00:09:18] Mike Kaput: So I wanted to talk just a little bit about that funding condition about becoming a for profit business in the next couple years, like is that the biggest hurdle they have to figure out right now?
[00:09:29] Paul Roetzer: It does seem to be a massive hurdle. I'm sure that there's other complexities such as Microsoft's deal with them. So Microsoft is rumored to have put in about 13 billion, and I think that gives them, this is going back a year or so, but if I remember correctly, it's like 49 percent ownership of the for-profit, the current for-profit arm that's underneath the nonprofit right now.
[00:09:50] Paul Roetzer: And I think Microsoft has access to like, I want to say it was like the first hundred billion in profits or something crazy like that. Like it was some sort of [00:10:00] condition. So Microsoft's not going to just give away this position they had. So I'm sure there's all kinds of complexities. And the other thing that caught my attention in OpenAI's blog post announcing the 6.6 billion,
[00:10:14] Paul Roetzer: they said, We aim to make advanced intelligence a widely accessible resource, and then it went on a little bit, and that's it, by collaborating with key partners, including the U.S. and allied governments. That's really interesting. That is a very intentional phrase, I would say, so my expectation is the next round, I don't know that we'll ever hear about the U.S.
[00:10:38] Paul Roetzer: government money going into this, but I wouldn't be shocked if there's something there. It's the allied governments that's setting the stage for other governments, which we've previously talked about on the show, possibly getting involved heavily in the funding of the future build-out. Because what that means is you look at allied countries where data centers can be built.
[00:10:59] Paul Roetzer: And [00:11:00] so this country, you know, country A may put in, I don't know, like 50 billion and in exchange, we're going to build 50 data centers and they're like that kind of, that's the sort of stuff you're going to hear about over the next three to five years is these really complicated partnerships that are money plus basically.
[00:11:17] Mike Kaput: Yeah, we are starting to see some of that from Aschenbrenner's Situational Awareness, his kind of contours that he painted of like great power competition, essentially, or geopolitical wrangling around funding these companies.
[00:11:29] Paul Roetzer: It's going to be complicated, for sure.
[00:11:32] OpenAI Canvas / DevDay
[00:11:32] Mike Kaput: All right, so next up, some more OpenAI news. So, OpenAI unveiled several significant updates to its AI offerings. This was both an individual announcement of a specific update we're going to talk about, and also a bunch of announcements that came during their recent DevDay. So, first up, before DevDay, they introduced something called Canvas, which is a new interface for ChatGPT
[00:11:58] Mike Kaput: that is designed for [00:12:00] more complex writing and coding projects. And Canvas allows users to collaborate with AI in a separate window, basically side by side, both prompting it and seeing the outputs during kind of inline feedback and targeted editing projects when you're doing things like writing or coding.
[00:12:19] Mike Kaput: So this kind of enhances ChatGPT's ability to assist with tasks that require quick multiple revisions and contextual understanding. Now second, at DevDay, and the rest of these updates come from DevDay as well, the company launched the Realtime API in public beta. So this API enables developers to integrate fast speech-to-speech functionality into their apps, supporting natural multimodal conversations with low latency.
[00:12:49] Mike Kaput: And it appears to be doing that through using many of the features of advanced voice mode, which we all got access to this past week. So this basically simplifies the process [00:13:00] of managing speech interactions by combining multiple steps into a single API call. Now third, OpenAI introduced vision fine-tuning for GPT-4o.
[00:13:11] Mike Kaput: This allows developers to fine-tune the model using both images and text, which opens up tons more possibilities for applications in visual search, object detection, medical image analysis, etc. Fourth, the company unveiled Prompt Caching, which is a feature that helps developers reduce costs and processing times when using repeated inputs across multiple API calls.
[00:13:38] Mike Kaput: This offers a 50 percent discount on reused input tokens, which optimizes expenses and improves latency for applications that have repetitive interactions. Last but not least, OpenAI announced Model Distillation, which is a new offering that allows developers to fine-tune smaller, cost-efficient models using outputs [00:14:00] from larger, more capable models.
[00:14:02] Mike Kaput: This basically streamlines the process of improving smaller models with real-world data, making it easier to deploy powerful AI capabilities at a lower cost. So, that's a lot to unpack here, but Paul, let's first talk about Canvas. Like, this basically seems like an answer to some of the functionality that Claude has, like projects and or artifacts where it kind of shows up and like pairs with you as you're building an app, writing code, writing complex language. Like, you also talked offline with me a bit though about other possible businesses and use cases that Canvas kind of challenges.
[00:14:42] Mike Kaput: Can you walk us through your thoughts there?
[00:14:44] Paul Roetzer: Yeah, so the, so Canvas was split off as an announcement, so most of the stuff, Mike, you were outlining came from their DevDay, which was October 2nd, I think was the DevDay. And so for our listeners who aren't developers, [00:15:00] some of that may be like, okay, yeah, so what? The so-what is developers are going to build a lot of cool stuff.
[00:15:06] Paul Roetzer: Like in essence, what it means is OpenAI is making their capabilities, their models available by these open APIs, including some variation of advanced voice, that's going to allow developers to accelerate innovation very affordably and start building more and more tools and applications that the non-developer crowd like us can enjoy and benefit from.
[00:15:29] Paul Roetzer: So that's kind of like the key takeaway from the dev. So Canvas comes out, I think this was October 3rd, I think it was yesterday or something, or in the last two days, this came out and it's one of those initially you're like, Oh, this seems like a big deal. Let me, let me go kind of play around with this a little bit.
[00:15:44] Paul Roetzer: So I did have a chance to do it because everyone should have it. So if you're a paid user, I think it's Plus and Enterprise or Team, everybody now has access. And you can just go in and choose GPT-4o with Canvas. It's actually one of the models in the drop-down. So they [00:16:00] position it as an early beta that introduces a new way of working together, not just through conversation, but by creating and refining ideas side by side.
[00:16:08] Paul Roetzer: My first reaction when I saw it was, oh, this is like just their copy of Anthropic's projects. It doesn't actually seem true. I think for the user interface that is true, but it seems like it's doing a lot of interesting things that I don't think Anthropic currently does. Now I'm not the biggest user of Claude and projects, so I can't say that confidently.
[00:16:29] Paul Roetzer: But in my early testing, it definitely starts blurring into a true writing assistant because, so like the example I gave, just to play around, I just went in, chose the dropdown. I said, write an abstract for an AI presentation. And then it does its usual thing. Like, sure thing. Here you go. Before I give it anything else, it just writes it.
[00:16:50] Paul Roetzer: And it's like, okay, that was jumping the gun a little bit. Like I wasn't really like, I didn't tell you anything else about it. It just gave me one. And so then I just started kind of playing around with it. So now what happens is your chat, [00:17:00] rather than just being down the center, moves to the left and now three quarters of your screen.
[00:17:05] Paul Roetzer: is the canvas. And now your document is kind of hand holding off to the right there. So I said, let's use bullets to make it easy to read. So it immediately kind of rewrites it. And then it gives you the edited form of that. And I was like, make it more fun. Don't use the word delve ever. Cause it had the word delve in there.
[00:17:21] Paul Roetzer: And so it started doing it. And then I was just playing with it. I was like, okay, can you write this more in the style of Paul Roetzer? And I was like, I wonder if it'll do this. So it did. Revised it, and it's like, I've revised the text to align more closely with Paul Roetzer's style, making it more engaging and focused on the transformative power of AI.
[00:17:35] Paul Roetzer: And so I said, do you know who Paul Roetzer is? Now I was like curious, is it actually doing this or is it just BSing me? And it said, yeah, he's the founder and CEO of Marketing AI Institute. He is well known for his work in making AI more accessible and applicable for marketing professionals. He's also an advocate for integrating AI into business strategies to drive growth, efficiency, and innovation.
[00:17:52] Paul Roetzer: He often focuses on simplifying AI concepts and making them actionable for business leaders, with an emphasis on practical use cases, responsible adoption, and ethical [00:18:00] considerations. I was like, damn, this is actually pretty good. His approach is characterized by a practical, optimistic view of AI's potential, while also ensuring transparency and responsible usage.
[00:18:10] Paul Roetzer: I actually, I couldn't have written that one that well.
[00:18:12] Paul Roetzer: So then I said, okay, what specific elements of what you just wrote in that draft are in my style? And then it actually explained in five bullet points the components of the writing that were like tied to my style. So you can see like the power of the underlying model that's able to do the things GPT-4o
[00:18:33] Paul Roetzer: has always been able to do. But now you have these style buttons. So in the bottom right corner, you can click and it's got suggest edits, adjust the length, reading level, add final polish, add emojis. And what it does is it lets you, it gives you like these cool sliding scales. So like the user interface is really slick how they did it.
[00:18:53] Paul Roetzer: So when it pops up reading level, you just drag like this book up and down. I want high school, PhD level. I want like, so it's pretty, [00:19:00] it's pretty cool. When I stepped back for a moment after kind of testing it myself, the thing I realized is like, we're just still not ready for this. Like most companies we meet with, most schools I talk to aren't teaching how to use ChatGPT at all.
[00:19:18] Paul Roetzer: And now all of a sudden we have this whole true writing assistant, and I'm not going to get into the coding. Cause I, Mike and I aren't coders. I couldn't really tell you how good the coding part is, but as a writer, as someone who came out of journalism school, as you know, Mike and I spent our lives writing, this is really impressive and I have, I don't know if I've talked about this on the show.
[00:19:38] Paul Roetzer: But I've been teaching my daughter, who's 12, how to use ChatGPT to become a more creative writer. And so I went to school to become the kind of writer I am today. I spent years trying to become a really good creative writer. And what I have found is Because she likes to develop story ideas and things like that.
[00:19:57] Paul Roetzer: What I have found is I'm able [00:20:00] to teach her how to be a creative writer way, way faster. And what I'm doing is having her go in and say, okay, let's have ChatGPT write the first paragraph of this idea. Now, I want you to write it in the style that you learned by how it used different words, how it created these visualizations.
[00:20:17] Paul Roetzer: Like, you know how when it wrote that first paragraph, you could see what it was saying? I want you to now write the next paragraph in that same way. And so rather than me trying to figure out how to teach her to be a writer, cause I've never been an instructor. Like I don't know how to actually teach her that way, but I am able to explain to her how to use the tool to learn that way.
[00:20:36] Paul Roetzer: And I just, I really find myself with tools like Canvas. So again, I'll kind of leave it at, go try it. It is really impressive. My early work with it is fascinating. I do think it starts to creep into tools like Grammarly and things like that. You do start to wonder the competitive environment as you see what these things are, knowing this is a beta, but they're obviously coming for writing.
[00:20:57] Paul Roetzer: But my bigger thing becomes, [00:21:00] how are we going to use these tools in schools and in businesses to accelerate people's capabilities in learning and not have it become a crutch to critical thinking? And I don't know the answer to this, but every time I go do a talk, I get asked these questions. Like, how are we going to teach the next generation to do things when they can just have ChatGPT do it?
[00:21:19] Paul Roetzer: And every day these tools get smarter and they have more and more capabilities to where if you want to take the shortcut, it is there to be taken. And I don't know. So that's kind of my overall thoughts. Like it's just, it's an impressive user interface. It's a really cool tool. It creates more questions in my mind about how people do work in the future where they don't just let the AI do it for them.
[00:21:42] Mike Kaput: Yeah, and it's interesting with the previous topic that something like a company like Glean is being mentioned as a big competitor. This, compared with that, this certainly feels like, okay, we're trying to get into enterprise productivity essentially in a more formal way than [00:22:00] ChatGPT
[00:22:01] Paul Roetzer: I mean, honestly, like I even had the thought about what about Microsoft,
[00:22:03] Mike Kaput: Right. Right.
[00:22:05] Paul Roetzer: Word, Google Docs. Like that was the first time where I wondered is OpenAI going to build like a productivity platform? Like, are they going to just build their own version of Excel and docs? And it sure seems like they, they could be going that direction, which is a really fascinating thing I hadn't really thought about before, but I would certainly understand why Microsoft a few episodes ago, Mike, I think you called out, Microsoft was now listing OpenAI as a competitor in like their public filings.
[00:22:33] Paul Roetzer: it starts to make a lot more sense when you think that maybe they are going to go at that enterprise productivity market, not just through a ChatGPT interface, but different interfaces.
[00:22:44] Mike Kaput: Yeah, especially with all our talk of this research company becoming a product company, they have to find revenue to
[00:22:51] Paul Roetzer: Hired a chief product officer. Yeah. They're very much positioning themselves. I would be, oh man, I would give anything to see their pitch deck. I would love to know their roadmap of what, [00:23:00] where the, cause they're what? 11.6 billion in revenue next year. And then I think it was 20. It was in the 20s. 20-some billion the year
[00:23:07] Mike Kaput: Yes. Yeah. Where is that coming from in there? Yeah. Yeah.
[00:23:13] Accenture Nvidia Group
[00:23:13] Mike Kaput: All right. Our third big topic this week. Accenture, the consulting firm, is forming a dedicated NVIDIA business group. So this newly formed group will comprise 30,000 professionals who will receive specialized training to help enterprises reinvent processes and scale AI adoption.
[00:23:34] Mike Kaput: At the heart of this kind of initiative is Accenture's AI Refinery platform, and this uses a ton of NVIDIA products. It leverages NVIDIA's full AI stack, including NVIDIA AI Foundry, NVIDIA AI Enterprise, and NVIDIA Omniverse. So, one of the key focuses here is the development and implementation, interestingly, of agentic AI systems.
[00:23:59] Mike Kaput: [00:24:00] So, basically, the next frontier of generative AI, agents that are capable of acting on user intent, creating new workflows, and taking action to reinvent processes or functions without total, constant human input. Now apparently to support this initiative, Accenture is expanding its network of AI Refinery engineering hubs globally.
[00:24:23] Mike Kaput: They're adding new locations in Singapore, Tokyo, Malaga, and London. Basically, these provide deep engineering skills and technical capacity for transforming operations using agentic systems. And Accenture claims the partnership with NVIDIA is already yielding practical applications. They developed an NVIDIA NIM agent blueprint for virtual facility robot fleet simulation, which basically could help industrial companies build autonomous, robot-operated software factories and facilities.
[00:24:56] Mike Kaput: So, Paul, obviously, it's a bit of a, you [00:25:00] know, PR win for Accenture and NVIDIA, but it does seem like a pretty substantial initiative and interestingly focuses on agents. Like, what does this mean for enterprises trying to deploy both AI and AI agents?
[00:25:16] Paul Roetzer: I guess I'll start saying we do not give investing advice on this show. Do not take anything I say as investing advice. I will just say, if you think all Nvidia does is make chips, like, you got to zoom out a little bit. They are everywhere. Like they're embedded in the future of business and the economy at almost every level of the infrastructure.
[00:25:40] Paul Roetzer: It is remarkable how every other major tech company just wants to tout the relationship between them and NVIDIA. Like, it is shocking to me how prevalent they are in all technology circles. So good on Accenture for, you know, deepening the relationship [00:26:00] with NVIDIA. That is a great win. You and I talked about Accenture's GenAI bookings back in episode 91.
[00:26:07] Paul Roetzer: I went back and looked. Episode 91, April 9th of this year. And at that time, they were on track to do 2.4 billion. So this is obviously, like, if we can just zoom out, this is a massive growth area for not only them, but other consulting firms. We also talked in episode 104 about the people who we know are making money in Gen AI are the consulting firms, McKinsey, Deloitte, Accenture, obviously.
[00:26:33] Paul Roetzer: Now, how much of this is net new? I have no idea. And I don't even know if they broke it out in their earnings calls, but like, it's great they're doing, you know, 3 billion or whatever, but is that net new consulting that they wouldn't have done prior to Gen AI? Or is like, is the money just moving from, we used to do this
[00:26:51] Paul Roetzer: consulting, now we're doing this instead? I don't know, but the growth is there. The demand is there. We talked in episode 104 about like, what are the services people [00:27:00] need? It's what to do with these language models, whether you're fine-tuning them, whether you're integrating them into your business, finding use cases, personalizing use cases, driving innovation, like new markets, new ideas, new products, change management.
[00:27:13] Paul Roetzer: Like there's so much that needs to happen in enterprises. And there's so few people in those enterprises trained to do this. And I'm not talking about the technical stuff. I'm talking about the business side, the HR side of all of this. And that's where the consulting firms have a massive window of opportunity here.
[00:27:31] Paul Roetzer: And I don't see it going away anytime soon. And then you mentioned the agentic systems. That's a whole nother element to the service mix that we didn't talk about in episode 104. But if you go back to episode 116, where we had the AI and AI agents in the enterprise conversation, that's where this is all going.
[00:27:48] Paul Roetzer: Like now you have this whole world of, we can go build agents in HubSpot and Salesforce and Google and wherever we're going to build our agents. Who's going to build those? They don't have to be developers. They can be [00:28:00] business people. Like I've built JobsGPT and CampaignsGPT. I'm not a developer.
[00:28:04] Paul Roetzer: So who's going to go in, identify business problems, analyze business processes, build agents and GPTs that do those things more efficiently, in a more innovative way, more creatively? Who, who on your, if you're in an enterprise and you're listening to this, who on your team can do that? My guess is you're going to struggle to, to count on one hand.
[00:28:24] Paul Roetzer: At least, I don't know how many people could actually do that. There just aren't business people trained to do these things. And in my opinion, those are the people who should be doing them. It's the people who understand the business pain points and the processes and can interview the people on the team and understand what they go through each day, identify the tasks, build agents.
[00:28:43] Paul Roetzer: That's the opportunity here. And so either you build those capabilities yourself within your company, or you got to turn to somebody like Accenture to do it for you. And I think a lot of big companies are going to be turning to companies like Accenture to do it for them.
[00:28:55] Mike Kaput: Yeah, I know that in some circles, big consulting firms occasionally get a [00:29:00] bad rap. It's like, ah, we're paying a bunch of money for someone to come in and tell us what we already know or that we've been saying, and it just doesn't come from us. So I sympathize with that. I'm not saying like, go hire a big consulting firm, but to your point.
[00:29:12] Mike Kaput: We, how many enterprises have we talked to at this point where it's like, good luck if you, if you think you have all this talent ready to go today to do this stuff, very few people do.
[00:29:23] Paul Roetzer: Yeah. And I think that was the thing that stuck out to me most. And again, who knows how real these numbers are? Like this is a press release from them basically, but 30,000 people is a big commitment. I don't know the total employment at Accenture. But what they're basically saying is like, we're going to make a massive bet here.
[00:29:40] Paul Roetzer: We are going to train our workforce on this. We're going to, I assume, you know, infuse the internal education training around AI, drive change management among their team, improve staffing, add new staffing, like good on them. Like I love to see this idea that AI is actually creating a growth engine for the economy [00:30:00] and for this company
[00:30:01] Paul Roetzer: to hopefully employ more people and train those people to do this thing. That's the kind of stuff we want to see. So again, is it actually 30,000 people? Is it going to be everything they're claiming it is in the post? Who knows? Like it never, never really is. There's always a PR element to this. I love the vision for it.
[00:30:17] Paul Roetzer: Like, I hope, I hope they see it through. I hope they build it and I hope they help a bunch of companies along the way, because a lot of companies need the help right now. And, you know, it's, it's on a lot of these consultancies, as you're saying, like sometimes they can get a bad rap or just assume it's, you know, kind of blowing your money, just getting those outside opinions.
[00:30:32] Paul Roetzer: But a lot of times that's, that's what these companies need and nothing's going to happen until they go get that third party to come in and drive this change for them.
[00:30:39] Mike Kaput: Alright, let's dive into this week's rapid fire. So first up, some other NVIDIA related news.
[00:30:47] Paul Roetzer: Not stock advice, but more NVIDIA
[00:30:51] Mike Kaput: but pay attention.
[00:30:52] Nvidia Open Model
[00:30:52] Mike Kaput: NVIDIA actually just released a very powerful open source AI model called NVLM [00:31:00] 1.0. The flagship model in this model family is called NVLM-D-72B. So, you know, real good marketing here.
[00:31:09] Paul Roetzer: Oh, it
[00:31:10] Mike Kaput: It sounds like a robot from like a sci fi movie. But this model has 72 billion parameters. It's designed to compete with proprietary systems from OpenAI, Google, and others. It is set apart by its exceptional performance across both vision and language tasks. It demonstrates state-of-the-art results in vision language tasks, rivaling leading proprietary models like GPT-4o.
[00:31:38] Mike Kaput: Now, notably, unlike many multimodal models, NVLM-D-72B actually improves its performance on text only tasks after multimodal training, which is an interesting development. Now what's also worth noting here is that NVIDIA has made the decision to make the model [00:32:00] weights publicly available, and they have promised, as of right now, to release the training code, which is kind of a departure from the trend of keeping systems closed, but also some of the ones that are open don't always go this far in their openness.
[00:32:15] Mike Kaput: So, Paul, while the name is a mouthful, sounds a bit technical, it's notable because it's NVIDIA. It also sounds like this is actually open source with the publicly available model weights and training code eventually if they follow through on that. Like, how big a deal is that? Because I don't think even Meta has gone necessarily that far.
[00:32:35] Paul Roetzer: Yeah. I mean, welcome to the party, I guess. Like, you know, you talk about companies with infinite resources. This is where I think like, it's hard to underplay Meta's role because they have billions to throw at this stuff. It's the argument I made for Google over OpenAI last week. It's like these massive companies, I mean, I don't know what the R&D budget at Nvidia is, but it's gotta be
[00:32:55] Paul Roetzer: 20 billion a year or something. I mean, it's nothing. Like, they can throw stuff at this [00:33:00] with no big deal. And they're using their own chips. Like that's the thing is if they want to be a major player in the model game, all these other companies are lined up to get Nvidia's GPUs to train their frontier models.
[00:33:12] Paul Roetzer: If NVIDIA wants to be a major player in the frontier model world, they just pull them off the shelves. Like it's their inventory that's doing this. So I find that fascinating. And then I think just going back to the technical side of like the OpenAI Dev Day stuff, why does this matter to the average listener that isn't going to be building these models?
[00:33:31] Paul Roetzer: Because it accelerates innovation. It pushes the other model companies to do more. It drives the cost of intelligence down close to zero. And that's what we keep seeing time and time again is these models come out. Like, let's say advanced voice, for example. I think the numbers I saw was roughly, you know, if you wanted advanced voice to be used in a customer service environment, like to do calls and stuff, it would come out to somewhere between like $18 and $21 per hour [00:34:00] to use it.
[00:34:01] Paul Roetzer: So we start talking about what does it cost for AI to start doing the work humans do? That's around where advanced voice is today. By next year, it'll be under 10 bucks. It may be within six months it'll be under $10 per hour. And a year after that, it'll be down to a dollar per hour or a penny per hour.
[00:34:16] Paul Roetzer: Like the cost of intelligence is plummeting to zero and every new frontier model that comes out or any open source model that comes from someone like an NVIDIA, it just pushes the cost down. The pressure for Google and OpenAI to drop their prices becomes so massive if Meta and NVIDIA just give this stuff away.
[00:34:36] Paul Roetzer: And so that's the outcome, is all of us, in theory, benefit from commoditized intelligence because every, like you're going to have five or six companies spending billions a year to build the most advanced intelligence and then fighting each other to push that cost to basically zero for all of us. So that's what it means.
[00:34:55] Paul Roetzer: This is like intelligence keeps getting more affordable, and better, [00:35:00] and smarter,
[00:35:00] Mike Kaput: And like we've talked about on a few episodes, not enough people are ready for essentially free intelligence.
[00:35:07] Paul Roetzer: on demand, everywhere, yep. Yeah, we just, like, yeah, people are still trying to figure out how to use, like, a custom GPT or, like, ChatGPT, and now you have Canvas, and now you, like, yeah, just, it just keeps coming. This is why every day we say, like, every day is, like, the dumbest form of AI you're ever gonna have.
[00:35:21] Paul Roetzer: Like, it only gets smarter from here, it only gets more capable. And enterprises are not keeping up with the rate of innovation.
[00:35:30] Google AI Search Updates
[00:35:30] Mike Kaput: All right. So next up, Google announced some updates to search, mostly around AI advancements. So one of the key developments is an evolution of Google Lens, which now incorporates video understanding capabilities. That means users can take a video and ask questions about the moving objects they see in the video.
[00:35:52] Mike Kaput: AI can provide comprehensive answers. This is available now globally in the Google app for Search Labs [00:36:00] users. Additionally, Google has introduced voice input for Lens. It allows users to ask questions verbally while taking photos. And the company has also improved Google Lens shopping capabilities. So when users photograph products, Lens now provides a more detailed results page, including reviews, price comparisons, and purchase options.
[00:36:23] Mike Kaput: In audio search, Google has expanded its functionality around Circle to Search, to identify songs playing in various contexts. That's available on 150 million Android devices. They are also rolling out AI organized search results pages, starting with recipes and meal inspiration on mobile in the US.
[00:36:45] Mike Kaput: To enhance connections to web content, Google has redesigned AI Overviews, which we've talked about in the past, to include prominent links to supporting web pages within the text. This change reportedly [00:37:00] has increased traffic to those websites and improved user experience. And lastly, Google is introducing ads in AI Overviews for relevant queries in the US.
[00:37:12] Mike Kaput: So, Paul, definitely seems like Google is leaning in even more to AI powered search. AI Overviews are starting to get monetized. Like, as a marketer or business leader, how should I be thinking about these changes to Google search?
[00:37:28] Paul Roetzer: Yeah, I mean, we're seeing all this tech we keep hearing about gradually infused into different features. Sometimes I have trouble like knowing where to go for some of this stuff. Like Google Lens, I thought I knew in my browser, cause I'm a Chrome user. We have Google Workspace. Like we use Google all the time.
[00:37:44] Paul Roetzer: I don't know where to use Lens at. Like, I'm not even sure how to get to it. I thought it was in my browser, but I'm not seeing it there. So sometimes it's like, I want to try some of these things, but I'm not even sure where to go. Maybe it's only in their mobile device and sometimes I get confused.
[00:37:57] Paul Roetzer: Like, is this a mobile only thing? Is it only on Pixel? Is it also on the [00:38:00] iPhone? Is it just in my Chrome browser? I did notice the AI Overviews though. I actually searched something yesterday or this morning and I noticed the citation. There's the link now next to each thing and you click it and it'll pop up over to the right and it'll show you where that
[00:38:13] Paul Roetzer: information is coming from, and I assume that's kind of how they're doing like the ad units and stuff too. So yeah, it's just, it's a lot. And I sympathize with Google. I mean, they have so much reach, so much distribution, so many different products and features. How you manage that portfolio of features and capabilities and AI,
[00:38:31] Paul Roetzer: it's challenging. But yeah, some of these things like I'd love to check out. I just got to figure out how to check them out.
[00:38:38] Mike Kaput: Well, and we've talked about this a little bit with like Project Astra, right, where it's like we're getting into that idea of being able to actually look at stuff in the physical world, answer questions about it, kind of a prelude to that. Okay, so more Google news coming up.
[00:38:55] NotebookLM Updates
[00:38:55] Mike Kaput: The popular Google tool, NotebookLM, which we've talked about, is an [00:39:00] AI research assistant that allows you to engage with, understand, summarize, query up to 50 different sources of material. The tool attracted a ton of buzz when it launched Audio Overviews a few weeks ago, which turns your material into a deep dive podcast between two AI hosts,
[00:39:17] Mike Kaput: both of whom sound ultra realistic. Now we've seen NotebookLM get a bunch more updates, including a big one that allows you to add public YouTube URLs and audio files to your notebooks. And we actually just heard from Google's product lead on NotebookLM, Raiza Martin, who recently teased some new functionality coming out in the tool around custom chatbots.
[00:39:41] Mike Kaput: So Martin responded to a post on X that showed a video of a work in progress custom chatbots feature that basically allows you to build a custom chatbot based on the notebooks you build in NotebookLM. And so Martin said of this feature, quote, custom chatbots, I have a [00:40:00] lot to say. This is pretty widely used internally at Google, and literally every day someone pings me to say this has 10x'd our team's productivity.
[00:40:09] Mike Kaput: Not joking. And in the video she references that has been posted, you're still looking at the old version, so I'm excited for what you all think when the new version launches. So Paul, NotebookLM certainly seems to be the darling of the AI world right now. It's pretty incredible. You and I both used it quite a bit.
[00:40:28] Mike Kaput: But how big an update would custom chatbots be for this tool?
[00:40:34] Paul Roetzer: Yeah, how it could work, but I think what we're seeing is just the continued evolution of the user interface. Like for two years, we've all just been interacting with chatbots. Now we're kind of starting to interact with voice more regularly, in a more reliable way.
[00:40:47] Paul Roetzer: But you're seeing kind of innovation at the user interface level, where it's a mix of chat and something else. So NotebookLM allows you to have a chat with these documents, but you can also create a podcast with the documents. You [00:41:00] can have it output, you know, FAQs and all these things. And so, yeah, it's like, it's interesting now to start seeing the evolution of how these tools allow people to interact with the information. And quick note on the YouTube: what they're doing right now is they're not actually, you know, processing the YouTube video and using like computer vision to know what's in the, you know, in the videos and everything. They're pulling the transcripts.
[00:41:22] Paul Roetzer: So it's adding a transcript to it, but you can add the URL links. You can add YouTube videos and then it'll automatically get the transcripts. But I have seen a lot of this product just in the last few days on mainstream media, and it's always hilarious when you see these like CNBC or something like that, that's using it and they're just like completely shocked by what they're hearing, that it's like real people.
[00:41:44] Paul Roetzer: So we're seeing a lot of that. And then one other thing I'll mention is there's another guy on the team that I just kind of came across. I think it was yesterday or this morning, Jason Spielman. He's a senior interactive designer at Google Labs. He seems to be more active on LinkedIn than on Twitter.
[00:41:59] Paul Roetzer: He [00:42:00] doesn't have an overly active Twitter account. But he posted kind of a cool, like what I love that Raiza and Jason are doing is like this inside look, which we don't always get at Google. Like, it's like these very approachable personalities. And so he posted, said, our newest feature, Audio Overviews, has taken over the internet the past few days.
[00:42:17] Paul Roetzer: The team has been sprinting. We went from idea to prototype in weeks, then launched publicly in under two months, which is very un-Google-like. And he said, it's not perfect, but that's the point. Here are a few takeaways. And I'll just highlight these because I thought a couple of these are really good.
[00:42:30] Paul Roetzer: So, it's about building products with our users, not just for them. We're not waiting to launch. We're shipping early and iterating. So getting it out to users, getting their feedback. They have an active Discord channel, apparently. And then the second one I really like: built in, not bolted on.
[00:42:45] Paul Roetzer: We're building net new AI native products. This isn't just AI for the sake of AI. We're working to bridge the gap between state of the art research and human problems. I love this approach. And this is the challenge like a Microsoft faces or even Google with [00:43:00] Workspace is you're putting AI into places people already are, and it might not be natural to them to find the value.
[00:43:06] Paul Roetzer: Maybe that's part of the issue people are having with Copilot and Workspace is I'm good in Excel. I don't need this AI thing here. Whereas what they're doing here is create this standalone thing that has such immense value because there's such obvious use cases that sometimes a net new product is what is needed to drive adoption.
[00:43:25] Paul Roetzer: And so that's what we're seeing here. He also said meetings are spent building, not talking about building. I love that just as an overall business takeaway. Have a point, have a reason to be there, have an output you're looking to get, and then meet. If you don't, don't meet. And then putting user feedback and community engagement at the heart of everything we do, building quickly, and have a lot more coming soon.
[00:43:45] Paul Roetzer: So yeah, again, good on Google, good on, you know, the Labs team for giving these two the freedom to like share a little behind the scenes. I think it gives more personality to Google and that's not a bad thing. And so hopefully we see more of this, this kind of stuff [00:44:00] from their teams. I think people love to get that inside look and it also gives more patience when stuff goes wrong, I think.
[00:44:07] Paul Roetzer: Like when you've got someone who's like a voice, we see that with Logan Kilpatrick there, you know, in the, you know, building the AI Studio and working with the development group, came from OpenAI, that's like a personality people like, you know, respect and that, that, it's like, if something does go wrong, it's cool as long as you're transparent with us.
[00:44:23] Paul Roetzer: And so I can see that kind of stuff working well with this team here.
[00:44:26] Mike Kaput: There was an interesting post on X yesterday from Ethan Mollick that I think really hammers home what you're talking about. He said, Google's NotebookLM has been available for a year before this new podcast feature made it go viral. And I love this part, because I think it's underrated by a lot of people.
[00:44:44] Mike Kaput: There is a lesson here about accessible magic. Making this stuff more tangible and accessible through these, like, lightbulb features almost where like the lightbulb goes off and you say, Oh my God, it can do this. That's really important to getting [00:45:00] more attention and adoption to these tools. And I wonder how much this feature has like caused Google to devote more resources to NotebookLM.
[00:45:08] Paul Roetzer: Yeah, that's like, it goes back to, we talked so much about, you know, if you think about ChatGPT, how many companies still struggle to justify the 20 bucks a month per license for ChatGPT? Why? It's because they don't get it. There isn't that light bulb moment. And that's why I'm a huge believer, like, just, if you're, if you're in charge or involved in rolling out ChatGPT or Google Workspace, Gemini or whatever, Copilot, roll it out with customized or personalized use cases for the people you're rolling it out to.
[00:45:38] Paul Roetzer: You got 20 writers on your team, show them how to use Canvas with their writing, like give them the one or two or three use cases where they're going to immediately understand the value. And if that's all they use the tool for, fine. But if they discover the other thousand ways they can use it, even better.
[00:45:54] Paul Roetzer: But so many people don't just hold the hand to the first few use cases where the value becomes so obvious. And [00:46:00] yeah, I think like a tool like NotebookLM, it's just, you immediately see it. They use it once. It's like, Oh my God, I got 20 other ways. I want to use this tool right now.
[00:46:06] Mike Kaput: Yeah.
[00:46:07] Microsoft Copilot Updates
[00:46:07] Mike Kaput: All right. So speaking of, you had mentioned Copilot. Microsoft actually announced a few other feature additions to its Copilot products. One of the first, or most notable rather, is Copilot Vision, which is an experimental feature available to Copilot Pro subscribers through Copilot Labs. And this tool allows Copilot to actually analyze and respond to questions about what's on your screen, particularly content in Microsoft Edge.
[00:46:37] Mike Kaput: So users can ask questions about images or text on web pages and Copilot will provide insights and suggestions.
[00:46:45] Mike Kaput: Microsoft emphasizes, based on some past issues they've had, that this feature is designed with privacy in mind, immediately deleting processed data after conversations. Another new capability is something called [00:47:00] Think Deeper, which enables Copilot to reason through more complex problems.
[00:47:05] Mike Kaput: Using advanced reasoning models, Think Deeper takes more time to provide step by step answers to challenging questions.
[00:47:13] Mike Kaput: This feature is initially available to a limited number of Copilot Labs users in select countries. We are also seeing Copilot Voice being introduced, allowing users to have spoken conversations with Copilot.
[00:47:26] Mike Kaput: This also includes four synthetic voices, it can adapt its tone based on the user's conversation style, and it's launching in English with, in several languages.
[00:47:37] Mike Kaput: So, Paul, like, what do you make of these updates as you're reading them? Like, I'll be honest, think deeper and voice don't seem like coincidences given that OpenAI just
[00:47:47] Paul Roetzer: the morning of dev day.
[00:47:50] Mike Kaput: just as, since we just got O1 and advanced voice and the real time API, like, and also what do you make of this vision feature?
[00:47:57] Paul Roetzer: So the vision, I don't remember when we talked about this, that it [00:48:00] wasn't called Copilot Vision.
[00:48:01] Mike Kaput: It was like Recall, I
[00:48:02] Paul Roetzer: Recall there. Is that what it was? So that was a few months back. And the pushback is, like, a light way of saying they made a really stupid move and had to back off it. So what happened was when they first debuted this product, it was like ready to go, like they were going to be shipping computers with this baked into it.
[00:48:24] Paul Roetzer: And it was going to remember everything on your screen. So anybody who's listened to the show for a while would recall this conversation. We'll put, we'll find the episode and put it in the show notes for reference. But they were basically going to out of the box by default, record everything that happens on your screen.
[00:48:37] Paul Roetzer: And anyone, apparently, outside of Microsoft who heard this was like, well, what about this? What about that? How about when I do this? How about when I do that? And they didn't have answers for this. Like they, they apparently just thought it was a good idea to just record everybody's stuff and didn't think through the ramifications of that.
[00:48:53] Paul Roetzer: So they've had to now, what did you say? With privacy in mind, I
[00:48:57] Mike Kaput: Yeah.
[00:48:58] Paul Roetzer: term, the term you used. So [00:49:00] after significant pushback on a terrible idea, they have rebundled it as Copilot Vision, with something you apparently probably opt in to use now. So yeah, that, that's my thoughts on that. Like, I guess there's some useful purposes for that product, but I still have massive concerns around the privacy side.
[00:49:21] Paul Roetzer: Okay. Then yes, I'm kind of with you on this voice and reasoning thing. So what I find myself wondering is I know Microsoft is a very innovative organization. I know they build their own stuff. I know they're building their own smaller models. I know they're invested heavily in OpenAI. But all this stuff I hear from, I know they acquired Mustafa Suleyman, or at least acquired him from Inflection AI, who was one of the founders of Google DeepMind.
[00:49:46] Paul Roetzer: And he's now head of Consumer AI. Like I get it, but it just seems like everything they do is a wrapper on top of OpenAI
[00:49:51] Mike Kaput: Right.
[00:49:52] Paul Roetzer: And I find myself wondering like, what in the world would I need Microsoft's voice for? I have advanced voice from OpenAI. What would I need their [00:50:00] reasoning for? I have O1 from ChatGPT.
[00:50:02] Paul Roetzer: Like, I don't understand how Microsoft is differentiated. Other than the fact that they can put them into Microsoft Word and Excel and PowerPoint, what else is different? Because they're just wrapping everything on top of OpenAI's models. And if OpenAI chooses to come after the productivity market and build ChatGPT docs and ChatGPT spreadsheets or whatever they want to call it, then it's like literally they're in direct competition.
[00:50:31] Paul Roetzer: And the only thing Microsoft has is distribution because they're built on top of the same infrastructure as OpenAI. And it's not theirs. I don't know. It's weird. It's a very weird relationship that just keeps getting more bizarre. And it doesn't look great from an innovative perspective for Microsoft because it looks like just, yeah, we were actually built on top of that too.
[00:50:52] Paul Roetzer: We call it Copilot Voice or like we're built on top of doing Copilot
[00:50:55] Mike Kaput: Yeah, in a weird way I wish they had just, instead of giving this, like, a name, like, [00:51:00] Think Deeper, like, just tell me.
[00:51:01] Paul Roetzer: Yeah. It's Straw... Why don't we just call it Strawberry? Yeah, I don't, I don't know. It's, it's weird. And maybe I'm just not understanding their marketplace, but I feel like I have a pretty decent understanding of their partnership with OpenAI and how they're building things. But maybe I got to go listen to some recent Mustafa Suleyman
[00:51:20] Paul Roetzer: stuff. Like maybe he's explained this differently and maybe they're not just doing everything on top of OpenAI's models. But that's my current understanding. So I don't know if anybody in Microsoft listens to the show and wants to like hit us up and give us, you know, a better understanding of the situation.
[00:51:35] Paul Roetzer: Like I'm all ears, but everything I've researched to date, that's kind of what it seems like.
[00:51:41] Meta Smart Glasses Training
[00:51:41] Mike Kaput: Alright, our last few stories here all are about Meta. So, first up, Meta has confirmed that it may use any image analyzed by its AI assistant on its Ray-Ban Meta smart glasses for training its AI. So, according to Meta's policy [00:52:00] communications, images and videos shared with Meta AI in regions where multimodal AI is available, which is currently the U.S.
[00:52:06] Mike Kaput: and Canada, can be used to improve the AI as per the company's privacy policy. That means that while photos and videos captured on Ray-Ban Meta are not used for training if users don't submit them to AI, the moment a user asks Meta AI to analyze them, they fall under a different set of policies. So, Paul, this is a thing where it seems like it's going to just become more of a problem as AI-powered wearables roll out.
[00:52:36] Mike Kaput: Like, can you even build an AI wearable or AI glasses that don't collect data from what they see?
[00:52:42] Paul Roetzer: I don't, I don't know how you would. I think that this is, again, we're as a society, as a business community, we're struggling to grasp the current technology. We're, we're struggling to grasp the implications of language models and the ability to put text in and text out and now [00:53:00] images in and videos and things like that. Like, that's still so new to everyone. People's heads are going to explode if they try and start comprehending the complexities of everyone walking around with glasses on that can record them. And the thing is, like, we don't need Meta Orion that we talked about last week, that isn't coming for years. We already have Ray-Ban glasses.
[00:53:24] Paul Roetzer: Like, these things are already in the world. Maybe you have a family member, maybe you have them, you know, maybe you're using them. This is already the reality, that people are wearing these things and they can see things and they're going to have computer vision. They're going to be able to analyze things.
[00:53:37] Paul Roetzer: And that stuff you ask it to analyze is going to automatically record if it's in your home, if it's your family. Like, it's just, there's no way to get that data out. And that's a, again, it's just a thing that I don't feel like we're prepared for. I wonder if, I just started thinking like, I wonder if there's like school policies around this.
[00:53:55] Paul Roetzer: Like, are, you know, I know a lot of schools are now like, hey, leave your phone here. Like, do you have to take your [00:54:00] Meta glasses off too if you walk into a classroom? I assume you would. I don't know.
[00:54:04] Mike Kaput: I would guess, that's probably, that's when it hits, you know, real mass consumer product, if we have policies
[00:54:10] Paul Roetzer: Right. It's like, I mean, I feel, and again, this kind of goes to like, some of this is just personal, but, if, if I'm in a meeting with someone, it's kind of like when their note taker shows up on zoom and
[00:54:21] Mike Kaput: I was gonna say this exactly,
[00:54:22] Paul Roetzer: your note taker there, if I'm talking to you, and you're wearing Meta glasses, I'm not saying anything that I wouldn't assume is being recorded.
[00:54:33] Paul Roetzer: And that, that doesn't mean I'm saying something that I'm, like, ashamed to say or anything like that. It's like, I'm not going to talk to you about my personal life. I'm not going to talk to you about, like, financials of my business. Like, if I'm talking entrepreneur to entrepreneur and we're just, like, kind of, like, having an honest conversation with each other.
[00:54:47] Paul Roetzer: If you're wearing those glasses, I'm just kind of assuming that maybe they're recording. I don't, I don't even know how they work. So I feel like there's all these unanswered questions in society. And I know the next topic, Mike, we're going to [00:55:00] talk about sort of expands on this a little bit. There's a lot of open questions and problems with this technology that we just haven't dealt with yet.
[00:55:09] Mike Kaput: Before we even get to wearables, like, isn't it possible I could be running an app right now that just records our facial expressions on our screens to, like, figure out what you're feeling? Like, we know that technology already exists. I've tried it.
[00:55:23] Mike Kaput: This is going to just open up this whole can of worms around surveillance and how we interact, which is really
[00:55:30] Paul Roetzer: Yeah. And I think we, we've focused a lot of our talks on this show over the last year and a half have been about laws and regulations related to like copyright and intellectual property and the training of these models and the, you know, the harm and risk, my guess is the reality going into next year now that I'm kind of thinking about this is going to, there's gonna be far more movement on privacy and things like this, like protecting people against, you know, yeah.
[00:55:59] Paul Roetzer: Someone [00:56:00] running emotion detection software when they're interviewing for a job or like at that application level where you start to find these things where there's bias and there's, you know, more harmful things where it's not catastrophic, but at an individual level and it starts to invade people's privacy and their rights.
[00:56:21] Paul Roetzer: And I could see a lot of legislation soon that starts to focus on that. Maybe that's already out there and we just haven't, you know, we haven't dug deep on it, but there's a problem. And again, we're going to find out why in a moment.
[00:56:34] Meta Smart Glasses Doxxing
[00:56:34] Mike Kaput: Yeah, let's talk about that because we have another Meta-related story about how this can go really wrong. So, two Harvard students created a controversial project called iXray, which combines Meta's Ray-Ban smart glasses with facial recognition technology to instantly identify and gather personal info about strangers.
[00:56:58] Mike Kaput: The iXray [00:57:00] system works by using Meta's commercially available Ray-Ban smart glasses to capture images of people. It then employs the facial recognition service PimEyes to match faces with online images. We talked about PimEyes like many, many episodes ago about how crazy it is. You can find literally many people's faces online.
[00:57:18] Mike Kaput: That system scrapes information from webpages and uses a large language model to then infer personal details about the individual. Going a step further, iXray then performs a lookup on people search sites, which are data brokers that offer extensive personal information. This process allows the glasses wearer to potentially access a stranger's name, job, education history, home address, phone number, and even information about their family members.
[00:57:47] Mike Kaput: Now, the two students that created this, they claim their project is designed to raise awareness about the potential risks of the technology. They tested it on unsuspecting people in public [00:58:00] places. They're not releasing their code, but it's pretty noteworthy in the sense that despite being designed to raise awareness, Paul, like you could do this.
[00:58:12] Mike Kaput: Someone can replicate some version of this using off the shelf technology. Like, is this the future we're headed towards with these things?
[00:58:19] Paul Roetzer: Yup. I don't have a better answer. Like, this is exactly the stuff I worry about all the time. Like, again, I have a 12-year-old daughter. Like, I think deeply about this stuff all the time. Just like you, you, you... I don't want to get into like exact scenarios and stuff, but you can imagine, like, even me, like, I don't need people at the gym
[00:58:44] Paul Roetzer: knowing, like, who I am or what I do or anything like that. Like, you just assume some level of privacy, even when you're out in public. And I get that, like, people may go on Facebook or, you know, wherever and try and find people, [00:59:00] but to think that someone's just wearing Meta glasses, which I think are harmless, like, I don't, you know, maybe I don't know any better, and they've just got some off the shelf, open source thing that some college kids built and they're actually like scanning faces unbeknownst to everyone and doing lookups and having ChatGPT write summaries of who they are and what they do and where they live and how much money they make.
[00:59:23] Paul Roetzer: Is that what we want in society? Like, and yeah, you're right. Like, this is two Harvard kids. This could be knocked off in an hour. You could probably, if Claude would do it, if it wasn't red teamed, you could probably write the code for this. So even you and I as non coders could probably use a language model to write the code to emulate this program. And yeah, it's, it's, it's terrifying. Like, and there's no, there's no logical way to stop it. Like, the tech is there, Pandora's box is open. Like, people know you can do this kind of stuff. They're going to do it. And then it [01:00:00] gets to the societal thing of like, I'm just not going to trust anybody wearing AI glasses.
[01:00:03] Paul Roetzer: Like, I don't care what brand it is. 'Cause, what apps are they running on the thing that I don't know about? It's... I hate thinking about this stuff, honestly. Like,
[01:00:13] Paul Roetzer: I get asked all the time about that. Like, how do you not think about the dark stuff? It's, it's very intentional. This is like a Black Mirror episode
[01:00:20] Paul Roetzer: hundred
[01:00:20] Mike Kaput: I was going to say, it's like a sci fi novel, right? Where you're this ends. Yeah.
[01:00:27] Paul Roetzer: that it's going to be here, but I don't want this.
[01:00:31] Mike Kaput: Especially as we talk more and more about models' ability to persuade or reason and understand, kind of like to coerce people, so to speak. Like, this is just like you're playing a poker game against someone that, like, they know all your cards if they're, like, looking at your facial expression. Like, that's so weird to me to even think about. I guarantee you people aren't thinking about that.
[01:00:52] Paul Roetzer: Yeah, and if you've got like AirPods in, the AirPods are connected to an app that's telling you what to say and how to, you know, persuade them to do that.
[01:00:59] Paul Roetzer: All of it's gonna happen. [01:01:00] All of it. Like, if your mind like goes this direction, I apologize if you're now heading in the wrong direction, but all of it is going to happen.
[01:01:09] Paul Roetzer: And soon, like this tech is, is here. There's no scientific barriers to doing these kinds of things. And I just, it's why we, it's like, again, the only way I have some peace at the end of the day about any of this is more people are becoming aware of these issues. And hopefully, the more people we help become aware of it, the more gets done to prevent misuse of the technology.
[01:01:37] Paul Roetzer: Because the tech's going to be there, bad people are going to do bad things. But if everyone's completely oblivious to the bad things that can be done, then they just happen without anybody knowing. But at least if we have an educated society that's aware of the downsides of AI, at least we can try and do something to ensure the positive outcome for all this, because bad stuff's going to happen, but we got to offset it with [01:02:00] the good stuff.
[01:02:00] Paul Roetzer: And that's not going to happen on its own.
[01:02:02] Mike Kaput: For sure. Alright, let's end on a high note, because Meta hasn't been all in negative
[01:02:09] Paul Roetzer: And I'll come up with ways this can be misused. Just give me a minute.
[01:02:11] Mike Kaput: Sorry, yeah, yeah, yeah.
[01:02:13] Meta MovieGen
[01:02:13] Mike Kaput: But, on the surface at least, they did just unveil something called MovieGen, which is a breakthrough in generative
[01:02:21] Paul Roetzer: And this, by the way, this morning, so like Mike and I are, this is on the fly. We're kind of doing this one. So give us some grace on if we don't get all the
[01:02:29] Mike Kaput: Exactly, right. Yeah, so this is basically, they had a research breakthrough in media generation, generative AI, to generate images, videos, and audio. And this new suite of models represents basically their third wave of generative AI work. It builds on some things we've talked about in the past, like their Make A Scene technology, their Llama image projects.
[01:02:50] Mike Kaput: What MovieGen does is it generates videos from text as the primary capability of it. It uses a 30 billion parameter transformer model to create high [01:03:00] quality, high def videos up to 16 seconds long from text prompts. It can generate videos featuring a specific person based on a single image input and a text prompt.
[01:03:10] Mike Kaput: It offers precise video editing. It can make localized and global edits to existing videos based on text instructions, preserving original content while targeting specific elements. And a 13 billion parameter model can generate high quality audio up to 45 seconds long, including ambient sound, sound effects, and instrumental background music synced to the video content.
[01:03:35] Mike Kaput: Meta claims that MovieGen outperforms similar industry models across these tasks when evaluated by humans. So, Paul, this is just the latest in advanced video models coming out. We heard there were delays and inadequacies with Sora, it sounds like. Meta's a major player to take seriously. Like, how seriously should we take this video generation model?
[01:03:59] Mike Kaput: [01:04:00] Like, did they get a leg up on the other players? Heh
[01:04:03] Paul Roetzer: I put the research paper into NotebookLM to create a deep dive podcast on it. It wasn't done rendering by the time you and I got on to record. So I'll be listening to that summary of the research paper later today. Thank you, NotebookLM. What it means is, video is a major frontier that progress is being made on, and at some point, you and I and others will have access to actually generate 10 to 20 second clips reliably and quickly at an affordable cost.
[01:04:33] Paul Roetzer: None of those things are true today. You, you can go into Runway, you can use Pika, you can use, I don't know, some of the other tools, but you can go in and create videos. But the consistency isn't great. Like, the characters will change, things won't remain consistent frame to frame. It takes forever to output them.
[01:04:54] Paul Roetzer: So like a 15 second video, if you could do it in Runway, might take 10 minutes. [01:05:00] Costs a lot of money. Like, so the tech isn't there yet. It's not ready to kind of scale in the business world. But we know Sora is coming eventually. We know Veo from Google DeepMind is coming. NVIDIA is a major player in this space.
[01:05:14] Paul Roetzer: Again, they're, they're kind of everywhere. I think we still have time, but it's interesting that we need to figure out how this is going to work and how it impacts the creative profession. So it's so funny, like Runway in particular, they make a lot of efforts to make it sound like they're doing everything in collaboration with creators, that it is only augmenting what creators can do.
[01:05:40] Paul Roetzer: And there's certainly an element to that, but everybody glosses over the negative impacts. So even with Meta, which again, this tech isn't available. This is just research. They're sharing this, but it's not like, I don't think you can go into ai and
[01:05:52] Paul Roetzer: start playing around with these tools. So they say, this is in their kind of release post:
[01:05:58] Paul Roetzer: Whether a person is an aspiring filmmaker [01:06:00] hoping to make it in Hollywood or a creator who enjoys making videos for their audience, we believe everyone should have access to tools that help enhance their creativity. So we're excited to premiere Meta MovieGen. We anticipate these models enabling various new products that could accelerate creativity.
[01:06:17] Paul Roetzer: While there are many, many exciting use cases for these foundation models, it's important to note that generative AI isn't a replacement for the work of artists or animators. We're sharing this research because we believe in the power of this technology to help people express themselves in new ways and provide opportunities to people who might not otherwise have them.
[01:06:33] Paul Roetzer: Our hope is that perhaps one day in the future, everyone will have the opportunity to bring their artistic visions to life and create high definition videos and audio using MovieGen. So it's kind of like their vision for what they're doing. They're, there's going to be good and bad.
[01:06:49] Paul Roetzer: And I don't know. So it seems like really impressive tech. I think it's a race now with, you know, OpenAI and Google and Runway and Luma and Pika and NVIDIA and Meta and [01:07:00] everybody's building for the same stuff. Text, we've got images, video, audio, code. Like those are the five main modalities we talk about all the time.
[01:07:09] Paul Roetzer: And they're all pursuing those same modalities.
[01:07:12] Mike Kaput: And like we talked about last week with YouTube, because this is Meta, expect to see a lot more video generation on social platforms.
[01:07:21] Paul Roetzer: Yes, and assume that any video you've ever uploaded to Meta is being used to train the model.
[01:07:28] Mike Kaput: media. Good call. Alright, so we, we didn't end on exactly the highest note, but it wasn't like super dark, but it was, yeah, middle of the road gray, a little gray.
[01:07:39] Paul Roetzer: That's great.
[01:07:41] Mike Kaput: All right, Paul.
[01:07:42] Paul Roetzer: I'm glad it's a Friday and I can go have a drink now. Kidding.
[01:07:44] Mike Kaput: Well, thank you as always for breaking everything down. Just a couple quick housekeeping announcements.
[01:07:52] Mike Kaput: Go sign up for our newsletter, marketingaiinstitute.com/newsletter. We have tons of topics we don't get to every week [01:08:00] that are all in the newsletter broken down for you. And if you have not left us a review and are able to, we would love to hear your feedback on the show and help us get the show to more people.
[01:08:11] Mike Kaput: Paul, thanks for weathering the, some of the doom and gloom topics this week, but always interesting.
[01:08:18] Paul Roetzer: Thank you, Mike. We'll talk with everyone again next week. Thanks for listening.
[01:08:22] Paul Roetzer: Thanks for listening to The AI Show. Visit MarketingAIInstitute.com to continue your AI learning journey, and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community.
[01:08:45] Paul Roetzer: Until next time, stay curious and explore AI.
Claire Prudhomme
Claire Prudhomme is the Marketing Manager of Media and Content at the Marketing AI Institute. With a background in content marketing, video production and a deep interest in AI public policy, Claire brings a broad skill set to her role. Claire combines her skills, passion for storytelling, and dedication to lifelong learning to drive the Marketing AI Institute's mission forward.