Code with Claude Opening Keynote — Transcript

Anthropic's Code with Claude keynote unveils Claude 4 Opus and Sonnet models, highlighting AI advancements for developers and coding productivity.

Key Takeaways

Claude 4 Opus and Sonnet models represent a leap forward in AI coding and agentic task performance.
AI is positioned as a tool to augment, not replace, human creativity and productivity.
Hybrid model modes enable both quick responses and deep reasoning for complex tasks.
Anthropic’s platform supports scalable, efficient AI usage with new API features like prompt caching and MCP.
Ongoing improvements and developer collaboration are central to Anthropic’s AI development strategy.

Summary

Anthropic held its first developer conference, Code with Claude, focusing on empowering developers with AI tools.
Mike Kger, Chief Product Officer, introduced the event and emphasized AI's role in augmenting human creativity.
CEO Dario Amodei announced the release of Claude 4 Opus and Claude 4 Sonnet models across product services.
Opus 4 is designed for coding and agentic tasks, showing significant productivity improvements and state-of-the-art benchmarks.
Sonnet 4 is a mid-level model optimized for efficiency and intelligence, improving on Sonnet 3.7 by reducing overeagerness and reward hacking.
Both models feature hybrid modes for near-instant responses and extended reasoning to handle complex workflows.
New API capabilities include expanded prompt caching and model context protocol (MCP) for enhanced data retrieval and execution.
Anthropic is committed to continuous model improvements and actively seeks developer feedback to refine its platform.
The conference includes technical deep dives, customer sessions, office hours, and workshops for hands-on experience.
The keynote highlighted real-world applications and integration with platforms like GitHub to extend AI capabilities.

Chapters

Full Transcript — Download SRT & Markdown

Speaker A

Hey, hey, hey.

Speaker A

Welcome.

Speaker A

Hey, hey, hey.

Speaker A

Hey, hey, hey.

Speaker A

Heat, heat.

Speaker A

Happy birthday, heat, heat.

Speaker A

Doo doo doo doo doo doo doo doo doo doo doo.

Speaker A

Let me do it.

Speaker A

Hey, come on.

Speaker A

Heat.

Speaker A

Heat, heat, heat.

Speaker A

Happy.

Speaker A

Hey, hey, hey.

Speaker A

Oh, hey, hey, you are.

Speaker A

I'm.

Speaker A

Hey, over here.

Speaker A

What?

Speaker A

Hey, hey, hey, one, one, one.

Speaker A

1.11.

Speaker A

Here come.

Speaker A

One, one, one, one, one, one, one.

Speaker A

Hey, hey, hey.

Speaker A

I love me.

Speaker A

Come on, hey.

Speaker A

Hey, hey, hey.

Speaker A

Hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey.

Speaker A

Party, don't get, get yourself.

Speaker A

Hey, down, down, down, down, down, down, down.

Speaker A

Come on, come on.

Speaker A

Down, down, down, down, down, down.

Speaker A

Gabbit jump.

Speaker A

Dick, dick.

Speaker A

Down.

Speaker A

Get down.

Speaker A

Hey, hey, hey.

Speaker A

Down, hey.

Speaker A

Black, yeah.

Speaker A

Yeah, yeah.

Speaker A

Yeah, yeah.

Speaker A

Heat, heat.

Speaker A

Okay, okay.

Speaker A

Please welcome to the stage chief product officer of Anthropic, Mike.

Speaker A

Good morning, everyone, and welcome to Code with Claude, Anthropic's first developer conference.

Speaker A

I'm really happy to see you all here.

Speaker A

I'm Mike Kger.

Speaker A

I am chief product officer here at Anthropic.

Speaker A

I just hit my one-year mark, which in AI years is about like three years, but I'm having a blast.

Speaker A

Before this, I co-founded Instagram and also an AI-powered news app called Artifact, which is where I first started getting exposed to a lot of these AI technologies.

Speaker A

I joined Anthropic because of its founder's vision: building AI systems that are powerful as well as helpful and trustworthy.

Speaker A

Today, that vision includes something immediate and concrete: a commitment to empower developers like yourselves to transform how work gets done and how companies get built.

Speaker A

This transformation is about augmenting, not replacing, human creativity.

Speaker A

AI agents are changing the way we work and the way we innovate.

Speaker A

They're expanding what we can build by removing bottlenecks that have limited human productivity.

Speaker A

Today, you'll hear from our product and engineering leaders as well as some of our customers about how they're pushing the frontier.

Speaker A

To give you a sense of what you can expect today at Code with Claude, you can attend three technical deep dives to transform how you build with Claude and five sessions from leading players already using Anthropic's platform to reshape their industries, and dedicated office hours and workshops for hands-on experience.

Speaker A

But before we talk about some exciting new API capabilities I have for you, I want to invite a guest on stage.

Speaker A

Please welcome our CEO and co-founder, Dario Amodei.

Speaker A

Hey everyone, I'm going to be back in 20 minutes for a fireside, so I'll be really, really brief with this appearance.

Speaker A

I'm not one to hype things up, so I'll just say this without any further fanfare: I'm happy to announce that as of exactly this moment, we're releasing Claude 4 Opus and Claude 4 Sonnet on all of our relevant product services.

Speaker A

Now, I know that we haven't had an Opus model in a while, so just as a reminder, Opus is the most capable and intelligent model, and Sonnet is the mid-level model that you all know and love and have been using for the last approximately year.

Speaker A

That's a good balance between intelligence and efficiency.

Speaker A

We tried to design both of them so that there are use cases and times when it's optimal to use each one.

Speaker A

So I will talk very briefly about the two of them and then turn it back over to Mike, and then I'll be back for the fireside.

Speaker A

First, let's talk about Opus.

Speaker A

It is especially designed for coding and agentic tasks.

Speaker A

It gets state-of-the-art on Sweetbench, Terminal Bench, some other things like that.

Speaker A

But I think in many ways, as we're often finding with large models, the benchmarks don't fully do justice to it.

Speaker A

Customers who we've previewed it to have found that it can do tasks that take humans up to six or seven hours autonomously.

Speaker A

Within Anthropic, I've seen some of our most senior engineers be surprised at how much more productive it has made them.

Speaker A

For the first time, actually, when I've looked and seen Claude write internal summaries, documents, and ideas, in the past the quality was often good, but you could never really quite mistake it for a human because it always had that specific style.

Speaker A

This was the first time I actually got fooled, where I actually got back and then, you know, I just read the name really fast and I thought it referred to someone on the team, and I'm like, no, the name was Claude.

Speaker A

So I think there's a lot in Opus.

Speaker A

On Sonnet, I think this will be for many people a strict update, a strict improvement from Sonnet 3.7 at the same cost and better intelligence.

Speaker A

Many customers are simply switching directly from one to the other.

Speaker A

It actually does just as well as Opus on some of the coding benchmarks, but I think it's leaner and more narrowly focused.

Speaker A

I think in particular it addresses some of the feedback we got on Sonnet 3.7 around overeagerness, the tendency to do more than you asked for, which is sort of the opposite of laziness, which was an earlier problem, and some of the reward hacking issues.

Speaker A

Many of our customers have been trying it out and view it as a strong upgrade from 3.7.

Speaker A

For example, Cursor here has been one of our well-known customers, has been trying it and says this is a state-of-the-art coding model.

Speaker A

It's a leap forward in complex codebase understanding, and we expect developers will experience across-the-board capability improvements.

Speaker A

Someone who was playing with the model in person, one customer said, "What the f is this model?"

Speaker A

It's really amazing.

Speaker A

I'll leave the details to others, but the last thing I'll say is we are going to continue to improve the Claude 4 series of models.

Speaker A

We expect to periodically release perhaps minor version updates, ideally even more frequently than we have for Sonnet.

Speaker A

So it should be out there.

Speaker A

You should be able to try it on basically all the surfaces as of now, except I think the free tier has Sonnet only, but all the other surfaces, all the API surfaces have both.

Speaker A

So really hope you enjoy the model, and I'll turn it back over to Mike.

Speaker A

Thank you, Dario.

Speaker A

Two new models, and you heard it here first.

Speaker A

We'll be seeing Dario again, as you mentioned, at the end of our agenda for a Q&A where I'll get to ask him the questions that are likely on your mind right now as well.

Speaker A

I'm personally very excited for our customers to try both Claude Opus 4 and Sonnet 4.

Speaker A

Our teams have loved working with them, and we think you will too.

Speaker A

Now that Dario has shared our big model news, I'll talk more about our detailed API roadmap.

Speaker A

Our goals for building Claude 4 were clear from the start: wanted to build powerful AI that safely introduces new model capabilities, continue to advance the frontier for coding and AI agents, and ensure that Claude becomes your virtual collaborator.

Speaker A

And that's exactly what we've delivered with Opus 4 and Sonnet 4.

Speaker A

Like Sonnet 3.7, both Claude 4 models are what we call hybrid models that have two modes: near-instant responses and extended thinking for when you need deeper reasoning.

Speaker A

I've been surprised at how many customers use the deeper reasoning even for non-coding and non-math use cases.

Speaker A

Opus 4 is great at understanding yo—

Speaker A

the web and execute uh complex workflows across any data source and across anything in the API via MCP and finally efficient scaling as of today you can use expanded 1hour prompt caching to optimize performance and cost at scale each advancement builds on what we've discussed today with Cloud

Speaker A

4 as the foundation Opus 4 for your most complex uh agentic workflows Sonnet 4 as your daily driver for everyday intelligence we're enabling a new class of applications code execution expands the hours of work that cloud can do mcp expands the comprehensive information that cloud can retrieve

Speaker A

and our platform updates ensure our models become increasingly efficient for every dollar spent we're actively learning from developers like yourselves uh to how you use these tools so please keep the feedback coming i love API feedback if you don't know this about me like absolutely like

Speaker A

ping me uh I love hearing the feedback and how we can continue to improve the API for developers or yourselves and MCP is a perfect example of this it started as an internal idea and then began and graduated to community experimentation and now it's a core platform feature if you watch the

Speaker A

Microsoft Build keynote they're building MCP into so much of their uh of their real infrastructure as well we want to create an ecosystem of AI agents where we have the feedback loops to make them actually useful for you today we stand at a major threshold our latest models combined with

Speaker A

all the latest tools that we've released are giving the seeds of a new era the future isn't about AI doing human work it's about AI helping humans do superhuman work and I'm really excited to build this vision together with you and I can't wait to see the kinds of applications it

Speaker A

powers for all of your companies and to show you what's possible I'm next going to hand the mic to Catwoo from our product team to demonstrate how accessing our new models inside Cloud Code transforms your development workflows helping you ship complex multi-day tasks in a single

Speaker A

conversation welcome again to Code with Claude and thanks again hope you enjoy the rest of your day hi everyone I'm Cat Woo product manager for Cloud Code as Mike mentioned we recently launched Cloud Code our agent coding tool in research preview claude Code

Speaker A

gives developers direct access to the raw power of Enthropics models right where they work in their terminals as of today quad code is generally available throughout computing history we've continually moved to higher levels of abstraction from machine code to assembly to highlevel languages with cloud code and

Speaker A

increasingly agentic models we're witnessing another step forward developers are shifting from asking for specific functions to describing entire features guiding AI and changing how software is built today we're bringing the new Claude 4 models to Claude Code making it an even more powerful and capable coding agent

Speaker A

and on top of new models we're releasing several new features in Cloud Code focused on making it a more versatile coding agent across your whole dev life cycle first Quad Code now integrates with VS Code and Jet Brains bringing it to familiar interfaces for millions of developers as Cloud

Speaker A

Code works you can now see its proposed changes in line in your editor we're also releasing the Quad Code SDK which allows developers to use Quad Code as a building block in your applications and workflows the possibilities are endless with the SDK to showcase these possibilities

Speaker A

we're releasing an open-source example of the SDK in action with Claude Code in GitHub you can tag Claude directly on poll requests and issues in GitHub and Cloud Code will respond to reviewer feedback fix CI errors and add new functionality with these additions Cloud Code

Speaker A

now works everywhere you do acting as a virtual teammate across all surfaces in the terminal for deep development work in remote environments like GitHub for automated workflows built on the SDK and in the IDE for seamless review all in Quad Code is a versatile coding agent

Speaker A

for accelerating development wherever you are whether you're working directly with cloud code interactively or using it asynchronously great my favorite part let's see what these updates look like in a demo i'm going to show Quad Code tackling a real dev task in a product that many

Speaker A

of you are familiar with we'll use Excaladrol an open- source whiteboarding tool and ask Quad Code to implement one of their most requested features adding a table component how many of you have gotten that feature request that's been on your backlog for ages that you know your users would

Speaker A

love but you just haven't had the time to build this is the kind of task that we can handle much faster with cloud code normally for a task like this I would set call to work make some coffee

Speaker A

catch up on email and Slack and come back when the outputs are ready but I only have 10 minutes with you all today so let's show a sped up but real workflow here's the Excal repo open in VS

Speaker A

Code let's write a prompt to tell Cloud Code our requirements we'll ask Quad Code to add a table component that supports custom dimensions drag to resize and all of Excal's other styling options here's where it gets exciting quad Code will first create a to-do list for how

Speaker A

it'll approach the entire problem then we can see that Quad Code will start to explore the codebase starting with the file that we already have open for context the best part of the ID integration is the ability to see diffs in line in the editor this way you can see the surrounding

Speaker A

code for more context so you can accept changes with confidence or give quad code feedback we can approve each edit as Cloud Code works or we can let Quad Code continue making edits with auto accept mode letting us balance visibility and control in this demo we gave Quad Code the ability

Speaker A

to make edits run lint and tests and make PRs so Quad Code worked for 90 minutes on this task i wish I could show you the whole thing but we need to speed things up what you're seeing is actual

Speaker A

unedited output from quad code an hour and a half later and it's done it added table functionality wrote tests to validate the change and iterated until lint and test passed this normally required us to understand the codebase architecture and how every single other tool was implemented

Speaker A

in this case cloud code is literally doing hours of work for us pretty impressive right now now let's run Excal locally and just make sure the feature works as we expect let's check that we have a fully functional table component by making a three

Speaker A

row by three column table great we can reposition the table we can drag to resize we can change the border pattern and color and we can add text to cells this also integrates with Excal's existing UI all of this was done with one prompt in Cloud Code

Speaker A

[Applause] next we'll ask Cloud Code to use the GitHub CLI to create a pull request for this branch cool let's click in now we have our pull request this is where the Quad Code SDK shines it lets us build custom workflows on top of cloud code including through GitHub actions for this PR

Speaker A

I'd like to update the docs instead of going back to the IDE we can just tag at Claude and ask it to update our documentation for us behind the scenes this triggers a GitHub action that runs Cloud Code

Speaker A

claude comments on the PR as it works and it'll it'll make a commit for us when it's done you can also tag at Quad on a GitHub issue and I'll also make a PR for you there with this feature Quad

Speaker A

Code meets users on even more surfaces where they're already working devs no longer need to context switch in their local environment and you can even kick off runs on the go this is all built on the Cloud Code SDK beyond powering GitHub actions we've seen customers do incredible things

Speaker A

with the SDK including running many Quad codes in parallel to fix flaky tests increase test coverage and even do on call triage cool it looks like the action is done running and we can see Quad Code updating its comments to let us know what it did let's click into the commit and see cloud's

Speaker A

changes it updated the documentation for us in our PR and committed it without us having to do a thing in just 10 minutes you've seen quad code tackle a complex tasks that would have taken days to implement manually writing hundreds of lines of code integrating seamlessly with Excal's

Speaker A

existing features and doing hours of work for us all of this is available to you today quad code in GitHub actions powered by our SDK is available in beta and you can install it by running a simple command on the screen within quad the VS code and Jetrains IDE extensions are also live in beta

Speaker A

just run quad from your IDE to install last but not least our latest models Quad Opus 4 and Claude Sonnet 4 are available to Cloud Code users today quad code shows what's possible when AI can truly understand and work with code

Speaker A

to build powerful agents whether coding assistants or applications in any domain you need more than just intelligent models you need the right platform please welcome Michael Gersonenhober who will show you exactly how we're making that possible thanks so much Cat and good morning everybody thank you so much for being here

Speaker A

i'm Michael Gersonenhopper head of product for the API platform at Enthropic how many people here use AI generated code already to write their applications yeah and how many of those are using AI at their core feature delivery like everybody here that's what

Speaker A

I thought most applications in the world will be built by people already trying to solve the world's problems whether you pass Ble Code whiteboard interviews or getting started with Vibes we're all software engineers now but writing code is just the start you need to more

Speaker A

quickly build stable secure and maintainable AI applications and that's why we built the anthropic platform a complete toolkit designed for building state-of-the-art AI applications and agents our platform is already powering most of the world's AI delivery in every

Speaker A

domain in finance Turboax helps millions of customers confidently file taxes with federal tax explainers in healthcare Novo Nordisk is using Claude to draft clinical study reports in less than 10 minutes instead of 15 weeks and the world's best coding assistants run on our platform

Speaker A

each of these companies took Claude's intelligence and turned it into something uniquely valuable for their users at its foundation our platform provides reliable access to Claude through our model inference service which includes the messages API and essential tools like

Speaker A

prompt caching to optimize performance and costs over 50% of all input tokens are cached on the platform doubling the effective context window for our models notion can put vast amounts of your documents in the context window but maintain snappy real-time

Speaker A

execution this lets them adopt your voice for creative writing and virtually eliminate hallucination starting today we're extending the cash time to live from 5 minutes to 1 hour your agents can now maintain complex context across the entire user session without breaking

Speaker A

the bank but that's just a foundation to build powerful agents our platform provides powerful building blocks as Mike shared we're releasing two new capabilities the files API and a code execution tool just like you and me there are some problems that are easier to solve by writing a

Speaker A

script our platform lets your agents write their own code in production just like you would these new features join existing components like web search for real-time information and citations for grounding responses in source documents when Thompson Reuters provides analysis to attorneys

Speaker A

in co-consel it's critical that they ground this in their legal research in case law not in the models training data our platform also connects your agents and your data and businesses business systems through model context protocol mcp has taken off within our developer ecosystem with

Speaker A

over 3,000 integrations built by the community whether your agent is accessing application errors with Sentry triggering Zapier workflows or creating Asana tasks the MCP connector enables the model to interact with any tool data or app your task requires and today the platform makes

Speaker A

it even easier by handling all the technical complexity of tool and API calling for you one thing that I want to emphasize about the platform is the composability of the APIs they're building blocks that work together as well as they work apart helping to solve unique problems that

Speaker A

can't be coerced into a cookie cutter shape think of Cloud as the architect and general contractor for your agent it doesn't execute predefined sequences or stack components randomly instead it intelligently determines which materials you need in what order and how they fit together to create

Speaker A

something far more powerful than any individual element let me show you what I mean when you build an agent for complex financial analysis Claude intelligently assesses the task and orchestrates the right tools using MCP to access financial data spinning up code execution for statistical

Speaker A

analysis searching the web for real-time market data and grounding insights with citations for accuracy and compliance iterating and refining based on results no hard-coded workflow no brittle scripts just intelligent orchestration that allows you to build powerful agent and is

Speaker A

seamless and seamlessly adopt new capabilities as our researcher research brings them to life we understand that prompt quality can make or break an AI application which is why we created dev tools like the prompt improver and evaluations along with new observability features that help

Speaker A

you get to production and scale faster today we're already helping developers build faster with resources like cookbooks and guides that show you how to implement features like memory into your applications in the future we'll adapt these for programmatic access and host them directly on

Speaker A

the platform so you can build even more powerful agents that can research and remember on their own in production everything we've built centers on one goal helping you ship better AI faster the anthropic platform isn't just tools it's your path to building industryleading agents so

Speaker A

thank you all for being here today with me at at Code with Claude i'll be on the floor the rest of the conference but it's my privilege to welcome Mario Rodriguez from GitHub to show you exactly what this looks like in production [Applause] [Music] thank you thank you Michael and I am here

Speaker A

um thrilled to be with you all we at GitHub are incredibly excited to be part of this energy and innovation and to share more about our deepening partnership with Anthropic um this amazing team everything GitHub does is anchor on two core beliefs right number one is giving developers

Speaker A

choice and number two is giving them the best developer experience at GitHub Universe last year we kicked off the relationship with Anthropic we announced Cloud Sonet 3.5 support in VS Code and also in our conversational experiences and we did this because we share fundamental belief with

Speaker A

Anthropic that AI can be a powerful force and a force multiplier for developers augmenting their capabilities not replacing augmenting their capabilities and freeing them up to focus on what they do best which is imagination and creativity are of being a software developer

Speaker A

is being a whizzer since we haven't expanded since then we have expanded the partnership and experiences across VS Code Github.com and our mobile app just to mention a few and today I am delighted to announce that GitHub copilot supports Cloud Sonet 4 and Opus 4 available right

Speaker A

now we just pulled the trigger right when Dario announced it and every one of those services That is what SIM shipping is all about let me tell you it's really hard to do i don't know if you've done it with every application that you have done but it's incredibly hard to

Speaker A

do so thanks to all of the teams that make that happen now as you all surely know the future of code is what agent an agent mode in VS Code is our autonomous perprogrammer that can perform multi-step coding tasks based on your natural language commands we've seen

Speaker A

firsthand how having cloud's intelligence directly within the editor truly helps developers understand complex code bases um get faster code to production and increase their productivity without ever leaving the environment they already know love and trust but even that right even that is singlethreaded and in my opinion the future is multi-threaded you

Speaker A

think about it you're in your editor it becomes a waiting room you're you're going faster but it's still a waiting room and that's why on Monday we took one step further and announces GitHub's copilot coding agent now our coding agents this are autonomous asynchronous peer programmer not

Speaker A

pair anymore now it's your peer programmer embedded directly into GitHub copilo's coding agent is currently powered by you probably guessed it cloud sonnet uh and you know the reason we chose that was very clear to me so let me just walk you through three things that made that

Speaker A

decision possible number one our evaluation showed that cloud demonstrated three main strengths right strong software engineering and coding knowledge powerful problem solving and that's very important because sometimes you have to go and look at the code and find the right place to make that

Speaker A

edit and then number three excellent instruction following and specifically when thinking about tools and MCP so when you're building for ejected coding dealing with these things and large code bases and system prompts you also need something else which is caching right and that prom caching

Speaker A

bless you that prom caching support we get from the anthropic API let us build these experiences in a most cost effective way every token counts and every token counts also on the price side so the more we save those the better experience we could provide our customers now on top of that

Speaker A

cloud was already the most frequently selected model in agent mode so once we put all of those things together it was very clear to us that cloud set was the right model choice for agent coding in GitHub scenarios now with cloud set 4 we seen improvement in all of these areas not

Speaker A

just aggregate benchmarks like sweet benchmarks but more importantly on our real world evaluation suites as well now our collaboration goes deeper than this right it's not just about integrating models directly we've been working closely with Anthropic to officially adopt and scale MCP we're

Speaker A

combining intelligence if you think about this like these models are incredibly intelligent you stack like three PhDs on them with knowledge so how do you get knowledge into that intelligent model well the answer to us is MCP and tools and that really unlocks the next acceleration

Speaker A

of developer tools recently Kevin Scott that's Microsoft CTO made the analogy that MCP is like the HTD protocol of the web and I completely agree with him so if you have not adopted MCP do it today right after this keynote go and play with it it's that important it's the way you get

Speaker A

knowledge into these intelligent models now as we step into this new era of software development we're transforming GitHub's platform from an AI infuse into AI native from creation to deployment we envision this SDLC powered by an agentic layer at the top of it that spans that inner

Speaker A

where you are coding and that outer loop those asynchronous experiences and you are going to be an active collaborator every single step of the way the reason why we say co-pilot is the human is at the center and then there's agents helping you that is why we're announcing a new partnership

Speaker A

that integrates what Kai just showed you cloud code and the extensible cloud code SDA directly into GitHub's agent platform this opens up new possibilities to customize cloud code remotely invoke it from new surfaces that are embedded into GitHub and our workflows again all on the GitHub

Speaker A

platform now we're already done a lot uh but the journey with anthropic is still just beginning in our opinion we believe that by bringing together GitHub's deep deep understanding of developers and Anthropic's AI capabilities through cloud and the platform APIs we will and we can unlock

Speaker A

a future that is more intuitive more efficient more ultimately more human that human power is important so I'm excited to see what we continue to build together and also what each of you builds with us so thank you so much and please welcome back to the stage Mike Kriger thank you

Speaker A

sir hello again and thanks again to Mario to Michael and to Cat um I love the GitHub integration the last project I did I actually was like "Oh I can actually just install cloud code into a GitHub code space." And all of a sudden I have Cloud Code against the repo that I've

Speaker A

already been building it was really great to hear from each of them and hear all about the exciting work being done with Cloud so to close out the show I'd like to dive a little bit deeper into Cloud 4 our research direction uh and what developers can expect next from Enthropic um

Speaker A

so please help me welcome back to the stage Dario for our one-on-one conversation welcome back Dario hello again this is great this is like our one-on-one in front of the whole audience this is great um so Cloud 4 uh uh is out cloud Sonet 4 and Cloud Cloud Opus 4 are available um what excites

Speaker A

you the most about the Cloud 4 models and how does it change your thinking about what's possible in the next 12 months yeah so um I I think abstractly the thing I'm most excited about is you know every

Speaker A

time you have a new class of models there's like more you can do with it right so uh uh you know we're we're we're going to be releasing uh models after Claude 4 there'll probably be a Claude 4.1

Speaker A

at some point just like we did with uh with Sonnet uh uh 3.5 and I think we're just at the beginning of of of of you know what what what we can do with the new the new generation of model in

Speaker A

terms of tasks i think the autonomy is going to go uh uh is going to go much further than it has already just the ability to give you know set your model free and and give it the ability to

Speaker A

you know do something for for a long period of time i think we're I think we're very much very much still still at the beginning of that um uh I'm I'm actually increasingly excited about the models for cyber security tasks i mean you can think of cyber security as like a a subset of of

Speaker A

of of coding tasks but they tend to be higherend coding tasks and so I think we're maybe finally hitting the threshold for that and then as a as a former biologist I'm I'm always excited about use of the models for uh for you know biomedical and kind of kind of kind of detailed

Speaker A

uh scientific research work which I think opus and opus in particular is going to be good opus in particular I think is going to be particularly particularly strong at that um it really connects I think to machines of loving grace so how does cloud 4 fit into that trajectory overall i like

Speaker A

to joke that people think of Machines of Loving Grace as an essay and I think of it as a product road map for the next few years and curious how Cloud 4 fits into that journey yeah it was sort

Speaker A

of a product road map that I wrote without knowing how to how to actually get to it and and kind of said all right guys then this is your work this is your job um uh yeah uh you know we're I think

Speaker A

we're increasingly thinking about on the biology side of things and and software is part of that right where and you know increasing amount because biology increasingly involves data even involved data 10 years ago when when I uh when I was a biologist uh uh I I think I think I think

Speaker A

more and more of it is is is going to be okay we have these models that know a lot about biology and they can help write code and so if you're a computational biologist I think these models will

Speaker A

will really accelerate what what you can do and you know we have a number of customers who are who are who are trying out the models for for these tasks i guess we'll we'll get to that in a bit yeah I think uh one of the first hackathons we did after we uh released MCP somebody hooked

Speaker A

up MCP to one of those like plotters that so to do drawing and so cloud could draw for it's actually really fun to like see what cloud draws for itself but it was like the first one was like MCPs don't

Speaker A

just have to be connecting to digital systems they could also be connecting to the real world so like when you'll be able to drive lab equipment VMCP I think is an interesting uh question for the soon we'll be able to test Claude by connecting it to a polygraph yeah I love that idea are you lying

Speaker A

who needs interpretability when we have the polygraph um uh you mentioned that moment where you were you know convinced that Claude the claude written content was was human written um any other breakthrough moments in watching us all you know

Speaker A

uh dog food uh Cloud 4 or even try it yourself that made you realize this model felt different um you know I I didn't actually understand I didn't actually understand the details but like there were several people in our side there was a moment a few weeks before the model launched

Speaker A

where someone said "Oh my god this model just like oneshotted this like incredibly difficult performance engineering task." And and no model had ever had ever done anything like that before i I I will say that there there's there's this almost almost like superstitious process in the

Speaker A

model development where like it it it it somehow all comes together at the last moment even if the training plot process is all planned out like just some of the models abilities maybe it's something about their interaction with people maybe it's something about like just making it the last bit

Speaker A

better matters maybe it's people getting used to the model and prompting it but but you you always find the the early versions of the model um you know people are struggling to figure out how to use them and then and then you finally get to a point and people are like this works for me

Speaker A

all the time and there's that there's that alchemy that happens somehow always the last moment if you read uh the Creativity Inc by Ed Catmol he talks about the same process with all the Pixar movies like they're really bad until like two days before they're supposed to go out and I feel the same

Speaker A

way about our models not that they're really bad but they're like there's like they're not quite there and then suddenly they click and we're like I can't wait to get this out to people it it doesn't make it doesn't make any sense because like the training process is uniform and you know

Speaker A

you know you you would think that that it doesn't work that way that it's all a rational process but it's absolutely not there's no point on the RL curve at all that that they come together it comes together at the last minute i don't know why it's a real moment um many people in the audience

Speaker A

are developers here and a question that I know has come up internally as people you know think about uh how AI is developing is which parts of the software engineering job will AI take over um and what becomes more important in a world where we have autonomous agents being able to do do a

Speaker A

lot of software engineering yeah um so probably like many people here I I read with great interest uh Steve Jay's blog post a couple months ago uh revenge of the junior developer uh he had some uh he had some similar blog posts uh he had some similar blog posts around that actually uh came in

Speaker A

to visit us even um uh um uh uh and and that laid out I think the vision of where things are going maybe maybe even better than I could which is that we're gradually go we're gradually going to more

Speaker A

and more autonomy of the models right we had this phase where you would do basically autocomplete now there's this thing that I guess people have called vibe coding um uh uh uh and and you know then then we're going more to kind of like you can dispatch the agents to to do things and I think

Speaker A

with with claude code we're going to go go more in the direction of you know you can dispatch the agents to do things and I'm sure we'll have other product surfaces that that that allow you to do that as well and I think we're we're heading to a world where a human developer can kind of manage a

Speaker A

fleet of agents and say you go off and do this you go off and do this you go off and do that but but I think continued human involvement is going to be going to be important for the quality control

Speaker A

to make sure they do the right things to get the details right and so you know working together on both the models and the product surface around it to get the details right is going to be really important i think it's also highlighted to me it makes the stuff that is inefficient in your work

Speaker A

way more painful because it's taking you away from like this flow of building and so at least it's made me realize like where we're spending too much time on crossunctional alignment and you know road mapping when like we just should be trying to get more building so it's I've I've

Speaker A

it's become more painful as the engineering part has has been sped up as well um so there's endless debate uh you know around the industry around you know uh bigger models or smaller architectures which will win in the long run um you're famous for you know popularizing and and pioneering the

Speaker A

scaling laws paper what's your current take on you know the extreme being is pre-training dead is pre-training all that matter still and its role you know relative to to post-training i mean without getting too specific I would say that you know the Clawed 4 models embody advances in

Speaker A

both pre-training and post-training um so we're continuing to see the pre-training scaling laws work the way that they've worked before um uh and we're also continuing to see continued advances in uh post-training and and they kind of they kind of complement each other uh and I I think we're going

Speaker A

to continue seeing advances in both of those i think we're also going to continue to scale up so we have these these multiple trends these multiple sources of exponential growth and they're they're all going to compound with each other right that's that's why I think all of this is going to go very

Speaker A

fast one of the reasons I liked Jiega's blog post is that it was someone who was not me repeating the mantra of like it's only going to be a year or two until until these things are like you know

Speaker A

are basically peers to us it's insane that 37 was just in February right it's it feels like a year ago but it was just three months ago i I I know it i know it feels like it's like oh this is

Speaker A

this feels like an obsolete model or something and you know it was it's less is like two and a half months or something it's like the the time scales are the time scales are compressing and I often say that uh being in the AI field I will go on a very brief digress be being in the AI field

Speaker A

it feels like you're getting on a a spaceship from leaving Earth at relativistic speeds and uh you know one day you wake up and you know it's like you know one day on your spaceship two days on Earth so you have to take in the news of two days it accelerates one day on your spaceship

Speaker A

three days on Earth and and and you know that that's that's just what it feels like being being on this ride that resonates i've heard the metaphor before but it absolutely does um maybe on the post- training front one of the things that I got really excited about seeing developed in Cloud

Speaker A

4 has been this concept of memory and having the the model being able to manage it memory maybe talk a sec about why that's important and what that kind of enables uh sorry repeat the question like for uh the model to be able to manage its own memory and be able to handle those

Speaker A

long horizon tasks as well yes yes we have found that to be uh very useful i think one one place we found it to be useful is Pokemon right um uh where the model's able to like remember its state

Speaker A

but you know presumably it's it's useful for many things other than just Pokemon um uh uh but uh um no I think I think it's great that you know the model you know just as a human would like when I'm

Speaker A

thinking I'll write a bunch of notes and uh you know then I'll like recall those notes at a later time or you know that there's just a lot of lot of intermediate work that I have to do that that

Speaker A

you know and models do that to some extent when they when they reason when they have you know like our our reasoning traces but uh you know not not everything I do can be incorporated in one scratch pad right there's like presentations there's um you know individual documents that I that I write

Speaker A

and so models are the same right the the idea for them to kind of you know be able to create files to do things with those files to load data and to kind of seamlessly interle those things right the

Speaker A

the one of the new features that we have is this this kind of interled re interled reasoning and taking actions and some of those actions can be storing data recalling data again the affordances that the models have are gradually converging towards the affordances that a human has which

Speaker A

I think is is the way that it should be one of my mind-blowing moments in Cloud 4 so far was we added like basically a to-do list scratch pad to cloud code and just watching it turn through the to-do list and then as it thought of more things to do add to the to-do list check things

Speaker A

off strike out what was no longer relevant it really mimicked I think how people managed their own work and how they think about uh completion along the way and then the interled reasoning uh and tool use as well i saw a write up this morning on Mac stories where it was using a tool it was an

Speaker A

MCP and it hit a rate limit with the backend MCP server and because it was doing the reasoning it was long I was like hm I probably hit a rate limit let me try this other approach to do this as well

Speaker A

and so like that ability to reason and remediate as part of tool use I think is is really powerful um I'd love to touch on race at the top so um uh safety and and capabilities are often you know

Speaker A

uh thought of as being at odds with each other and your thesis is exactly the opposite and that these two things can move in tandem i found that very inspiring and one of the reasons I joined here but maybe touch on how you think of of race to the top yeah so you know I think I think it it it uh

Speaker A

applies to things you know from the from the uh from the very mundane and simple and commercial to kind of you know the grand directions that that that that that that that AI is going in the future um so you know I you know I think I think when we when we talk to customers we

Speaker A

have a number of customers who you know care a lot about making sure that the behavior of their AI models is predictable that it's trustworthy um uh and I think that's aligned with what some of we're what what we're trying to do in the long term for uh you know making sure that models in

Speaker A

a more grand sense stay in line with human intent um so there's there's this nice there's this nice synergy here and you know I think whenever we're able to do so whenever we think it's reasonable or responsible to do so we do want to provide tools for the community so M MC MCP MCP

Speaker A

is an example of that um I I myself was actually surprised at the the pace at which everyone seems to have standardized around around MCP i mean it was it was very strange we released it in November i wouldn't say there was like a huge reaction immediately but then but then within 3 or 4 months

Speaker A

you know it kind of become it kind of become the standard again there's again this this feeling of like being on the spaceship accelerating from Earth and and and you know experiencing you know larger and larger time time dilation constants yeah um where it's you know like you know think

Speaker A

of like USB and other standards you know think of like standards in the '9s or the two like you know this would take it would take years for people to converge on something yeah and even in talking to other participants in the industry around MCP they're like we don't want to slow down

Speaker A

whatever is working on MCP like we do want like some you know help on steering but like this is you've captured lightning in a bottle let's make sure it becomes the new protocol and the standard by which we interoperate agents as well um uh maybe tied together the race to the top i loved

Speaker A

your urgency of interpretability essay you have a background in neuroscience as well can you talk a little bit about how you see the co-development of interpretability and um machine intelligence yeah so um you know I think 10 years ago uh many people thought that neuroscience would tell us about how

Speaker A

to do AI um uh and indeed there you know are a number of former neuroscientists in the field i'm not I'm not the only one there you know there are other lab leaders some who have that uh who have

Speaker A

that background um and you know I found at a high level there's some inspiration but I wouldn't say I've said oh you know this is how the you know this thing we know from the hypothalamus we can use for you know for for for making these models it's it's all been pretty much from scratch but

Speaker A

interestingly things have gone the other way more which is that using interpretability we're able to see inside models and although of course they're not ex made in exactly the same way the human brain is at at a you know the a kind of superficial level there's there's a lot of

Speaker A

differences a lot of the conceptual patterns we have found inside models sometimes they then get replicated in replicated in neuroscience research there was something about like high low frequency detectors in vision um that uh was found via interpretability via via one of one of the

Speaker A

people on Chris Ola's team and then a couple years later a neuroscientist actually replicated it in in animal brains um the idea that for example vision models separate out you know they have one path that that tends to correspond to color and you know another path that corresponds to

Speaker A

uh you know uh h a brightness or to the boundaries between objects these seem to be natural distinctions in the world right that are that are kind of there to be discovered and anytime you have any kind of abstract learning system whether it's artificial or biological you kind of discover

Speaker A

the same thing so it's very interesting i'm really curious how the circuits paper ends up affecting neuroscience research as well um let's move into the 5 to 10 year time horizon um to the extent that that is even possible in AI as as we move relativistically maybe relativistically that's

Speaker A

probably one year in real time um when do you think there'll be the first billion dollar company with one human employee 2026 yeah I absolutely buy that um do you have any advice for people building with Claude um for the next year how to think about building at that frontier as well yeah

Speaker A

um I you know I think there's like a lot of very specific things you could say about like how about how to use the models but I feel like because of this whole like relativistic time dilation thing this like speeding things up like almost all the advice is drowned out by like one sentence which

Speaker A

is or maybe two words which is just be ambitious um like build something that's greater than you think is is possible and even if it doesn't quite work yet another model will come out in the next generation which right now is three months but like probably it's going to go down

Speaker A

to two months then one month and you know then then if I want to come up this year maybe I'll be giving advice that's like oh you know don't build anything today you know we're releasing something today but by tonight it'll be you know you won't want to be building with this tonight

Speaker A

i talked to a founder who started a company two years ago in the sort of autonomous AI coding agent space and he basically tried every single model and his startup wasn't working and then it was actually 37 where he's like my startup works now and it was the same thing of like this thing

Speaker A

that I was trying that was really hard all of a sudden is now um possible but hitting your head against the wall actually sometimes can be useful because you put all the other pieces in place and and everything works except the model and then when the model works it's almost like you've

Speaker A

built something that's like more robust than it needs to And that can be like a positive property um so so you know as much as I joke about like oh you should you know you can just wait for the

Speaker A

next model actually hitting your head against the wall as long as it's something that's like almost possible if it's not like you know like three years out from from what's possible um I think it can actually be productive we saw that even with advanced research internally like our our research

Speaker A

and cloud skills team had built a prototype of this the model kind of lost its way it wasn't good at using tools and then with 37 especially with cloud 4 I think you'll find that it does advanced research really really well as well and it's because we were trying and kind of failing

Speaker A

along the way as well yeah it's it's almost as if you want to run your you want to run your startup as like speculative execution against the next model right there's some kind of like I don't know I love that yeah I think that's exactly right um all right so last question to wrap up um for many

Speaker A

of us today um who aren't Dario we couldn't have imagined the progress that AI has made and the rapid pace of change what are you most excited about for the coming year and in the next five years um yeah so uh I think for in the next coming year uh we are going to see incredible things in

Speaker A

in in code i would refer again to kind of the you know taking where we are with cloud code and where we are with the coding models and going from there to kind of to kind of the agent fleets um I think

Speaker A

this will have an interesting effect in the world which is I don't know that we've thought carefully like from an economic or business perspective about what happens when the cost of producing software goes down it's kind of an assumption an article of faith that you only make software

Speaker A

if it's only worth it to make it if millions of people use it or at least hundreds of thousands or maybe tens of thousands like you wouldn't make you you know you you you you like wouldn't make a

Speaker A

whole piece of software for this event right like you might throw together something but like when it just becomes really cheap when it costs you 20 cents to like oh let's just let's just throw let's just throw together something that you know you know changes you know ch changes my

Speaker A

vision for this particular event or something like that um uh I think the world is going to be very different when these things can be made ad hoc on a on a one one one oneoff basis in like a

Speaker A

few seconds for for less than a for for for less than a dollar what are what is the role of the developer there what is the role of businesses what is the role of startups um and what is what

Speaker A

is the experience of the of the you know of the of the people using it i think we don't know the answer to any of those questions so that's very interesting on the on the fiveyear time scale I will return again to biology i think the biomedical stuff will not be revolutionized in the

Speaker A

next year because it's it's kind of you know slow to slow to happen but uh yeah yeah I hope that uh five years from now we will have uh vanquished uh many of the diseases that now uh that now exist i

Speaker A

love we'll leave it at that unfortunately we do have to wrap up i feel like we could talk for another 40 minutes so first I want to thank Daario for spending time with us today thank you Dario i also want to thank all of you who are here in person and those watching via liveream uh

Speaker A

but before we close I almost forgot one thing um as a special thank you to everyone who joined us today at Code with Cloud in person i'm excited to announce that each of you will receive free access to Max 20X our highest tier plan for three months so look out for that

Speaker A

i especially love using Macs with cloud code so you'll be able to do that as well so we can't wait to see what you build have a great rest of your day with the different um sessions and welcome again to Code with Claude thanks for coming thanks for coming everyone

Speaker A

[Music] [Music] [Music] oh yeah oh [Music] oh hey hey [Music] [Music] hey hey hey

Topics:AnthropicClaude 4Opus modelSonnet modelAI codingdeveloper conferenceAI agentsAPI roadmapmachine learningproductivity

Frequently Asked Questions

What are the main differences between Claude 4 Opus and Sonnet models?

Claude 4 Opus is designed for complex coding and agentic tasks with state-of-the-art performance, while Sonnet 4 is a more efficient mid-level model optimized for everyday intelligence and improved over Sonnet 3.7 by reducing overeagerness and reward hacking.

How does Anthropic position AI in relation to human creativity?

Anthropic emphasizes AI as a tool to augment human creativity and productivity rather than replace it, aiming to remove bottlenecks and expand what developers can build.

What new API capabilities were introduced at the conference?

New API features include expanded one-hour prompt caching for better performance and cost efficiency, as well as the Model Context Protocol (MCP) to enable comprehensive data retrieval and execution across various sources.

Get More with the Söz AI App

Transcribe recordings, audio files, and YouTube videos — with AI summaries, speaker detection, and unlimited transcriptions.

App Store Google Play

Or transcribe another YouTube video here →