AI News Roundup: Perplexity Shopping, Microsoft Ignite 2024, AI “Manhattan Project,” Suno v4 and More

Written by Mike Kaput | Dec 3, 2024 3:56:43 PM

In the past week, we’ve seen some huge AI news stories break…

Here’s what you need to know about what's going on:

Perplexity Launches AI-Powered Shopping Experience

Perplexity, the AI-powered search engine, just released a huge new shopping feature that allows you to research and buy products right within Perplexity.

The feature, called Buy with Pro, lets Perplexity's paid users in the United States complete purchases directly through the platform.

Here’s how it works:

When you ask Perplexity a question related to shopping, you still get natural language responses like normal.

But you also may now see product cards that show the most relevant items available for purchase and their product details.

Perplexity says these cards “aren’t sponsored—they’re unbiased recommendations, tailored to your search by our AI.”

When you see a product you like, you can click “Buy with Pro” to use one-click checkout right in Perplexity (if you’ve saved your shipping and billing info with them).

Right now, you get free shipping on all Buy with Pro orders courtesy of Perplexity.

Microsoft Positions Copilot as the "UI for AI" at Ignite 2024

At its Ignite 2024 event, Microsoft doubled down on its AI strategy, positioning Copilot as the default interface for AI interactions and hyping up the ability to create AI agents in Copilot.

In his keynote at the event, CEO Satya Nadella said Copilot is the “UI for AI.” To that end, he made some major AI announcements, including:

Copilot Actions, which are customizable prompt templates that automate repetitive tasks
New purpose-built AI agents in Copilot that take on specialized roles, including an agent for HR and IT questions, one for meeting collaboration, and a project manager agent
Copilot Studio is getting enhanced with autonomous capabilities, allowing agents to take actions in the background without human prompting
For Teams users, Copilot will soon analyze screen-shared content during meetings, providing insights from both the conversation and visual content
And PowerPoint is getting new AI features, including a Narrative Builder that can create presentations from referenced documents, and the ability to translate entire presentations into 40 different languages while maintaining the design

The keynote concluded with Microsoft's commitment to AI education, noting that they've helped train over 23 million people in AI and digital skills over the past year, with plans to reach millions more.

Google Gemini Gets Memory and App Control

Google is rolling out two significant updates to Gemini.

First, a new memory feature allows the AI to remember personal information for more contextual responses, according to TechCrunch. (For instance, Gemini might remember your favorite foods or details about your work, so it can reference that information when relevant and helpful.)

The feature, available only to Google One AI Premium subscribers at $20 per month, can be turned off at any time and currently works only in English on the web client.

Second, a report from The Verge indicates that Google may be prepping Gemini to take action within apps.

Hidden in Android 16's developer preview is a new "app functions" API that could give Gemini the ability to take direct actions within apps.

This could mean, for example, ordering food through DoorDash without ever opening the app—you'd just tell Gemini what you want.

Meta Makes Major Enterprise AI Play

Meta has poached Salesforce's CEO of AI, Clara Shih, to lead a new business AI group.

With Meta's Llama models seeing over 600 million downloads and Meta AI claiming 500 million monthly active users, the company appears poised for a significant enterprise push.

Shih said in a post on X: “Our vision for this new product group is to make cutting-edge AI accessible to every business, empowering all to find success and own their future in the AI era.”

U.S. Calls for "Manhattan Project" Style AI Initiative

A major U.S. congressional commission is calling for what could be one of the most ambitious government AI projects in history: a Manhattan Project-style initiative to develop artificial general intelligence.

The U.S.-China Economic and Security Review Commission, a bipartisan group established by Congress in 2000, says America needs this type of large-scale public-private partnership to stay competitive with China in the race to develop AI systems that match or exceed human intelligence.

The historical parallel they're drawing here is significant.

The original Manhattan Project was a massive collaboration between the U.S. government and private sector during World War Two that led to the development of the first atomic bombs.

While the commission is advocating for a similar scale of effort, they haven't yet specified exactly how much money should be invested or how the partnership would be structured.

One specific suggestion did emerge from Jacob Helberg, a commissioner and senior advisor to Palantir's CEO.

He proposed streamlining the permitting process for data centers, noting that energy infrastructure is currently a major bottleneck in training large AI models.

Amazon Doubles Down on Anthropic with $4 Billion Investment

Amazon just doubled down on its relationship with Anthropic by investing another $4 billion in the company (bringing its total investment to $8 billion)—and what’s curious is that it’s doing so while also actively making moves to reduce its reliance on the company.

Anthropic announced on November 22 that they had received another $4 billion from Amazon and that AWS would be the company’s primary cloud and training partner.

A key part of this expanded partnership involves close collaboration on AWS Trainium hardware.

AWS Trainium is a purpose-built machine learning accelerator that enables high-performance model training.

Anthropic's engineers are working directly with Amazon's Annapurna Labs team “on the development and optimization of future generations of Trainium accelerators, advancing the capabilities of specialized machine learning hardware.”

Despite this deep collaboration, however, Amazon also appears to be trying to actively reduce its reliance on Anthropic.

A report from The Information reveals that Amazon has developed a new AI model that can process images and video, a move intended to make the company less dependent on Anthropic.

Says The Information:

“In developing the new model, Amazon is showing that it still hopes its internally developed AI can gain traction among its cloud customers, making it less dependent on AI from Anthropic, whose Claude chatbot is now a popular offering on Amazon Web Services.”

AI Agent Startup Raises Massive $56M Seed Round

A new AI startup with an unconventional name—/dev/agents—just secured a whopping $56 million seed round at a $500 million valuation. The company is building what it calls an operating system for AI agents, autonomous programs that can handle complex tasks without human supervision.

Led by former Stripe CTO David Singleton, and backed by an impressive team of ex-Android and Meta Oculus veterans, the company draws an ambitious parallel: Just as Android created the foundation for the mobile revolution, /dev/agents aims to build the essential platform for AI agents to reach their full potential.

The startup has attracted high-profile investors including Andrej Karpathy and Scale AI's Alexander Wang. However, details about the actual product remain scarce—their website is remarkably minimal, reminiscent of other recent AI startups that have raised massive rounds with little public information.

Salesforce Brings AI Agents to Slack

Salesforce is now adding its AI agents to Slack, where they can act as digital coworkers right within your workspaces.

Instead of having AI assistants work in isolation, Salesforce’s Agentforce platform gives them access to an organization's Slack conversations and enterprise data, allowing them to understand context and take more relevant actions directly in the flow of work.

What makes this particularly interesting is how Salesforce is positioning these AI agents.

They're not just passive assistants—they can actively suggest and execute actions on behalf of employees across different departments.

For instance, HR agents can handle onboarding and benefits questions, IT agents can resolve help desk tickets, and sales agents can prepare executive briefings and create proposals.

The company has already lined up some impressive early adopters.

Accenture reports they've developed IT support agents that handle first-tier questions in Slack channels, while Box CEO Aaron Levie describes it as the realization of a decades-old dream in computing—having a single conversational interface where humans and AI work together seamlessly.

Salesforce says Agentforce will become available through Slack, though specific pricing and availability details haven't been announced yet.

Apple Preps AI-Powered Siri Upgrade

Apple is getting ready to launch an AI-powered version of Siri that it hopes can help it catch up to OpenAI, Google, Microsoft, and others in the AI arms race.

According to Bloomberg, Apple is developing what employees internally call "LLM Siri," a major upgrade that aims to transform the 13-year-old digital assistant into a much more conversational AI powered by large language models.

The upgrade promises several key improvements: more human-like interactions, better handling of complex tasks, expanded control of third-party apps, and integration with Apple Intelligence features like text generation and summarization.

The new Siri will be built on Apple's own AI models and is currently being tested as a separate app on iPhones, iPads, and Macs.

The project also represents Apple's bid to catch up with competitors like ChatGPT and Google's Gemini.

While Apple recently launched its Apple Intelligence platform, the company is still trailing behind in many AI features offered by other tech giants.

But, as Apple Intelligence taught us: Just because we’re hearing about this doesn’t mean you’re getting it any time soon.

The company plans to formally announce the updates as early as 2025 as part of iOS 19 and macOS 16.

But consumers won't actually get to use these features until spring 2026, about a year and a half from now.

Suno v4 Raises the Bar for AI Music

Popular AI music generator Suno just released a major update that’s getting a lot of attention.

The new version of Suno, called Suno v4, promises better overall audio quality and more sophisticated song structures.

Additionally, the system can now apparently handle more complex musical arrangements and produce sharper, clearer lyrics.

The company has also introduced some interesting new features.

There's a "Remaster" function that can upgrade existing tracks to the new v4 quality level.

They've added a Cover Art generator to create visuals that match your music's style.

And there are two particularly interesting additions: a "Covers" feature that can reimagine songs in different styles, and "Personas" that let users maintain a consistent musical identity across multiple creations.

The update is currently in beta and available to Suno's Pro and Premier subscribers.

ElevenLabs Doubles Down on Voice AI

AI voice cloning company ElevenLabs just dropped two big updates: a platform for building conversational agents and a product that turns written content into podcasts.

First, ElevenLabs is rolling out tools that let developers create customizable conversational AI agents on their platform.

These agents can be fine-tuned across multiple variables—from tone of voice to response length—and can work with various language models including OpenAI’s models, Google Gemini, and Claude from Anthropic.

Developers can even integrate their own custom language models and knowledge bases.

What makes this particularly interesting is that ElevenLabs is leveraging their expertise in voice technology to create more natural-sounding AI conversations.

The platform includes features for handling real-world challenges like customer interruptions and data collection during conversations.

Second, ElevenLabs has also launched GenFM, a creative new service that transforms written content into AI-hosted podcast discussions.

Available through their ElevenReader app, it can take any PDF, article, or ebook and turn it into a podcast featuring two AI co-hosts discussing the content in any of 32 languages.

Runway's "Frames" Tackles AI Image Consistency

Runway just unveiled a new AI image model that it claims makes major advances in stylistic control, which is a significant limitation of today’s AI image generation tools.

The model, called Frames, purportedly maintains a consistent visual style across multiple generations, so you can produce many images that all share the same look.

Runway has organized the model's capabilities around the concept of "worlds"—coherent visual styles that can be consistently applied across multiple generations.

Writes Runway:

“With Frames, you can begin to architect worlds that represent very specific points of view and aesthetic characteristics. The model allows you to design with precision the look, feel and atmosphere of the world you want to create.”

They offer some examples of worlds they’ve created with the new model, including one that duplicates the aesthetic of 1970s album art and another that duplicates the visual style of taking photos on a disposable camera.

The technology is being gradually rolled out through Runway's Gen-3 Alpha platform and API.
