Building AI-native at enterprise scale: monday.com, Doc… — Transcript

Tech leaders from monday.com, Doctolib, and Delivery Hero discuss their AI-native transformation using Claude to enhance enterprise-scale operations.

Key Takeaways

Legacy enterprises can successfully pivot to AI-native by integrating AI tools into existing products and workflows.
Broad organizational adoption beyond engineering is critical to maximize AI benefits.
Building platforms that enable sharing and scaling of AI skills accelerates innovation and efficiency.
Autonomous software delivery powered by AI can significantly increase engineering throughput.
Customer-facing AI applications can enhance user experience by automating complex tasks like product specification.

Summary

Panel features tech leaders from monday.com, Doctolib, and Delivery Hero discussing AI-native enterprise transformation.
All three companies were founded before the LLM era and are pivoting to integrate AI deeply into their products and workflows.
monday.com is using Claude to evolve from a work management platform to one that executes work via AI-powered agents.
Doctolib has achieved near 100% adoption of Claude across all roles, building a skills marketplace to share AI capabilities company-wide.
Delivery Hero developed HeroGen, an autonomous software delivery system that automates pull requests from Jira or GitHub issues.
Each company manages legacy codebases, balancing monolithic and distributed systems while integrating AI solutions.
Focus on organizational adoption, with cross-functional teams including engineers, product managers, and designers using Claude.
monday.com’s customer-facing AI product, Monday Vibe, turns simple prompts into detailed product requirement documents rapidly.
Doctolib emphasizes collaborative learning and scaling AI skills through internal communities and shared tools.
Delivery Hero’s HeroGen system has processed thousands of pull requests, demonstrating exponential growth and traction.

Full Transcript

Speaker A

Hello, everyone. My name is Rebecca, and I'm on the Anthropic Go-To-Market team. It's wonderful to see so many developers here today at our first Code with Claude in London. We have a really special panel here today with tech leaders from monday.com,

Speaker A

Doctolib, and Delivery Hero. For those that don't know, all three companies were founded between 2011 and 2013, so before the rise of the LLM era.

Speaker A

And today's conversation is really going to be about their pivot to becoming an AI-native enterprise. With that, would love to pass it off to the panelists, would love a quick intro: name, role, company. And then what's the Claude-powered system

Speaker A

that's now critical for your organization? And roughly how old is the code base that it lives in? Ruslan, if you want to start. Hi. I'm one of the engineering leaders at monday.com, and we are the company which tries to reinvent everything about

Speaker A

how we do and what we build for our customers. And Claude is actually powering us on this huge new mission of reinventing ourselves from a platform to manage work into a platform that actually helps execute work for our customers, building teams of

Speaker A

agents and tons of native AI capabilities. The company is roughly 14 years old, so the code base of a company that's been all this time in a startup mode, you can imagine it's a monolith journey that takes a

Speaker A

long time to break out and a lot of imperfections. So, yeah, it's quite a journey we're on. Hi, everyone. So at Doctolib, we have two missions. First one is to make people healthier, and the second one is to improve the daily life of healthcare professionals. So for people, we

Speaker A

build a full health companion where you can find and book an appointment with a doctor, patient messaging, manage your personal health record, and then for healthcare practitioners, it's all the solutions that they need to do their job, from clinical, financial, care corporations,

Speaker A

manage the patients. In terms of usage of Claude, we have pretty much now 100% adoption of people kind of building with Claude. And building, it's not just about engineers, but it's everybody, kind of product managers, designers. And now with Claude Cowork, it's actually

Speaker A

a large proportion of the population outside of kind of tech and product that is also now beginning kind of to use Claude to do their daily work. And then in terms of kind of the code base, we're about half of it is a

Speaker A

monolith, which was created now over a decade ago, kind of when we were founded.

Speaker A

And then about half is also kind of similar journey as what Ruslan was mentioning, being built in the last couple of years in terms of distributed systems and kind of services. Yeah. So my name is Ulrik Schäfer. I'm VP for Tech

Speaker A

Foundations at Delivery Hero. Delivery Hero is the world's leading local delivery network. So we are delivering food, groceries, and other things to our customers worldwide in over 60 markets.

Speaker A

What we're doing with Claude is obviously we're also rolling out Claude Code to our engineers. We're using Claude also within our products. But what I want to talk about today is mostly how we build our autonomous software delivery system

Speaker A

called HeroGen. Awesome. Ulrik, I think with that, my next question is, none of the three of you really got to greenfield this. You all had an existing working product, a lot of customers and a large functioning engineering org

Speaker A

and many years of code base, right? And so would love you to walk us through what you built with Claude and how it actually plugs into what was already there. Yeah, so if we look at this autonomous software delivery agent, what we wanted

Speaker A

to achieve there, and this morning we heard already in the keynote that when you look at things like Claude, you need to build for the next model, not for the current one. We did exactly that last year. We looked at where

Speaker A

the trajectory is going, and we decided to try to build basically an agent that takes a Jira ticket or a GitHub issue and takes it to production readiness in terms of a pull request that can then be merged, right? And

Speaker A

that's what we built over the last two quarters of last year, and we launched it in Q1, and it has gained huge traction across all of our engineers and subsidiaries in the group.

Speaker A

So, just to give you some numbers on where we are right now: we're at around 173 merged pull requests going into production per day, as an average over the last 10 days. And, yeah, we have

Speaker A

around 7,000 merged pull requests in total since the launch in February, and the trajectory is really exponential. Awesome. Alex, would love to hear a little bit about the experience at Doctolib as well. Sure. So a lot of our

Speaker A

focus has been how to get everybody to start using Claude, but also using it effectively. And one of the things that we did very consciously was we said, OK, it's not going to be only the teams that are working on the platform that

Speaker A

will be working on building everything around Claude and everything around how we build with Claude. But we wanted to really leverage the expertise, the creativity, the innovation of all the engineers, right? Because your best engineers are going to be the ones who are

Speaker A

going to find the most interesting ways of using it, of applying the new technology.

Speaker A

So what we said was the job of the platform teams is just as much about enabling and finding the best practices that people are using, and then helping accelerate those. So either remove bottlenecks or industrialize them and then scale them

Speaker A

across the teams as a whole. What we ended up doing was we said, okay, we want everybody, as they're building their own skills, make them available to everybody else.

Speaker A

We ended up building a skills marketplace where everything is discoverable. You can see which skills get the most usage, which ones are trending. And then we also have an environment that we provide for developers that has all the tools automatically connected as you

Speaker A

start working. And we packaged many of those skills directly into that environment. So immediately, as you onboard at Doctolib, you get access to all of those skills. They're available to you. And then there's also plugins for experimental skills as people are trying

Speaker A

out new things. And what really has worked for us as well is not just doing this in silos, but the most popular channel, I think, of the whole company right now is called Build with AI, where people are sharing everything that they're learning,

Speaker A

asking questions, promoting the skills that they have. It's really the liveliest conversation that we have. That's awesome. So it's not just individual adoption at the engineering level, but across the org, there's some streamlining and broader organizational efforts. The goal has

Speaker A

been really to try to go down the learning curve together, rather than everybody kind of doing things and doing amazing things, but by themselves. Awesome. Ruslan, I know we were catching up a bit yesterday on Monday's agent, but would love to hear more

Speaker A

about the evolution in that journey. Yes. So at Monday, we, of course, do everything, right?

Speaker A

We reinvent how we work and what we do for our customers at the same time. I want to focus more on, like, product implementation and customer-facing products where we use Claude. Of course, we build platform agents on top of Monday that help execute

Speaker A

work and also invite external agents to be first-class citizens of the platform, which is creating a lot of challenges for identity systems and permission models, of course. But at the same time, one of the biggest successful releases Monday ever had was Monday Vibe.

Speaker A

It's a prompt-to-application tool that we opened to our users, and customer adoption has been growing exponentially. So, in short, it's basically turning a simple prompt into a detailed PRD for the customer, to interpret their intention well and refine it together, and then

Speaker A

build a working application literally in minutes. There is an interesting advantage that we got to boost this.

Speaker A

Despite having an old code base with a lot of baggage coming with it, the open platform approach helped to literally contain it and literally let the Vibe coding tool do what external developers are doing, using the same APIs,

Speaker A

using the same SDKs, and also the same deployment mechanism, publishing applications in exactly the same way. That really massively boosted the initial phase of it. And in later stages, of course, to unlock full potential, we still deal with, like, as soon

Speaker A

as you touch every feature of the platform, you want to interact with it, an application that you built to be integrated and build more and more complex applications, you need to make sure that every feature needs to be open. Every feature needs to

Speaker A

be accessible to it properly, which is, of course, a much longer journey we are on, but that's kind of an interesting factor: we felt the early investment in that open platform really paid off in this case. That's

Speaker A

awesome to hear. I'd also be curious, I think as everyone in the room is probably aware of, the model under all of this changes every few months, especially with the advancements in the industry. And so I'm curious to understand across the three orgs,

Speaker A

when a new Claude model ships, what actually happens inside your organization that week? Maybe walk me through from Anthropic releasing a new model to "we've actually rolled it out into production." So with HeroGen, our autonomous software delivery agent, I think the key moment in time where a model change

Speaker A

really introduced a step change was last November with the new Opus models, where our vision of this system actually working became a reality because before that, it was more of a fancy idea that we had, right? We just took a big bet that models are going to improve that much to be

Speaker A

able to take whole features and just do all the work for the engineers. So this was a step change. Since then, with that particular system, we have stayed with Opus 4.5. We have not yet made bigger model changes mostly because we do not yet have the A/B testing setup and the

Speaker A

necessary volume to make good decisions in terms of moving to a different model. I think in our case, when we've seen some of the new updates, first of all, it's the excitement of people going and saying, okay, what can I now do with these models that I

Speaker A

haven't been able to do before? Because many times, maybe if you were building your skills or if you were building your workflows, you were potentially kind of trying to compensate for some things. And now you can go and say, hey, do I still

Speaker A

need to do this? The good thing is if you already have an experimentation culture, like it's something that people are very much looking forward to and they want to try that instead of having to have somebody go tell them, hey, did you see

Speaker A

this? Are you going to do something about it? So you get some natural excitement and say, hey, what can I do now with it? And I think that's what makes it really a community that's building for us all of those skills

Speaker A

together and sharing those lessons. And do you guys have a team internally at Doctolib

Speaker A

focused on those evals of new models? Yeah. So when it comes to the products that are AI-based, products that we offer to our customers, we have teams that have very strict evals, and every time there's a new release,

Speaker A

we go and check, okay, what's the performance of that? What's the tradeoff between all the different variables that we look at? I think when it comes to more of the development process, it's still a little bit more vibes for me

Speaker A

than potentially it could be, where I think as the models continue to improve, it would be very interesting to see how we can have better verification to give us more confidence and even to go faster with some of the

Speaker A

existing models as things come out. Awesome. Ruslan, yeah, would love to hear from you, especially as you guys think about Monday agents and customer-facing agents. Yeah, so again, continuing the story of Monday Vibe as an app-building tool. I think it actually

Speaker A

works as a multi-model system, right? So, there's an orchestrator, for example, that uses the model, and then there is, like, a workflow underneath, of course, that has deterministic actions and, like, simpler models using different things, executing different subactions. So, I

Speaker A

think the release of a model usually impacts only one part of it. So, it's, like, actually the end-to-end evaluation that is the key here. But of course, each of them runs their own local evals on specific atomic actions. So there was a story, for

Speaker A

example, when we migrated from Opus 4.5 to 4.6, I think that was quite a change because, yes, it brings all these amazing capabilities, but at the same time, all the system prompts we've been optimizing so far just didn't transfer well, right?

Speaker A

It was a completely different beast, and it required us to go and rethink and fine-tune prompt techniques for this new orchestrator to work and maximize its value for us. We actually worked, our team worked heavily with the

Speaker A

solution engineers from Anthropic, who actually really helped us go into depth and really understand the best practices there. So I think since then, this is the practice we follow for major model releases, but a lot of smaller changes of course go also through

Speaker A

still the whole end-to-end testing and then still the good old A/B testing in production and all these other things that normal products go through. But those are some of the examples I can share. Awesome. So it seems like there's an internal kind of

Speaker A

evaluation phase with your own engineers and maybe in tandem with the Anthropic folks and then there's A/B testing for the customers and end users. It's quite a journey, right?

Speaker A

So to release such a major release for a model change, it's quite a journey.

Speaker A

You cannot just assume it's compatible with the old one. It's literally treating it as a completely different thing and harnessing it in a way that works for it, right? I think that's definitely the approach we take now. Awesome. I'm also curious from the three of you, what is the

Speaker A

architectural or platform decision you would make differently if you were starting this journey today?

Speaker A

Yeah, when it comes to the architecture, what we tried to do with the system was integrate it as well as possible into the existing software delivery ecosystem that we have in place at Delivery Hero, which is, by the way,

Speaker A

also quite fragmented. So we wanted to connect our agent with all the different systems out there like Jira, like GitHub Issues, soon also GitLab, as a first touchpoint, integration point, with the person assigning the task to the agent, right? So we try to not

Speaker A

change the whole interface to something else, like a chat window or something similar. We try to stay with the current environment that people have to drive that kind of adoption, where they just assign these tickets to the agent, right? That was one thing.

Speaker A

We also, of course, added integrations into our continuous integration system. So we're running these tests with the agent, and we're feeding back any problems that are found during the tests to the agent to fix them. We also have

Speaker A

the issue with flaky CI that the agent is fixing now itself. Other integration points that we did with our ecosystem are, for example, with our security team.

Speaker A

So we want to build a way for security vulnerabilities that are code related to be automatically assigned to the agent, automatically fixed so that the repository owners just have to look at the pull requests and say, okay, that's fine. I take it, right? Another thing from an architectural perspective, and that

Speaker A

was a thing that really drove the success rate, and success rate, in our case, we define as the ratio between the pull requests that are actually accepted into code, so merged, and the ones that are actively rejected by a software engineer.
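
The success rate described here can be sketched in a few lines. This is a minimal illustration, not HeroGen's implementation: it assumes the ratio means merged pull requests as a share of all decided ones (merged plus actively rejected), which is one reading of the spoken definition, and the function name and counts are hypothetical.

```python
def success_rate(merged: int, rejected: int) -> float:
    """Share of decided pull requests that were merged.

    Assumption: "decided" means merged or actively rejected by an engineer;
    pull requests that are still open are left out of the ratio.
    """
    decided = merged + rejected
    if decided == 0:
        raise ValueError("no decided pull requests yet")
    return merged / decided


# Hypothetical counts, chosen only to illustrate the calculation.
print(f"{success_rate(merged=85, rejected=15):.0%}")
```

Leaving open pull requests out keeps the metric from dropping simply because reviewers have not reached a ticket yet.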

Speaker A

So one thing that drove that success rate to up to 85% was what we call a council of agents. This is a set of different models that is looking at the same code and reviews it. The reason why we choose

Speaker A

several different models for this is mostly that we want to avoid a model having some sort of blind spot or bias and then not detecting the issues with the code it generated itself, right? And so this was a pretty meaningful

Speaker A

change and actually did not drive up the cost as much as we expected. So it's very, very doable, and I can just suggest that anyone try it out. In our case, when it comes to the overall product architecture, I mentioned we're moving from a more monolithic system

Speaker A

to distributed. About half of our PRs are still going in the monolith, half are outside. And you can see a notable difference between how easy it is to adopt all the tooling outside of the monolith, where you have smaller code bases, or where

Speaker A

you have better, well-defined patterns, versus inside. And I think one of the choices that we made about everything that we've been building outside has been to be very opinionated. So, it means that there's a very standard way of how we build

Speaker A

all the services, all the applications that we do outside. That makes it a lot easier. And then when we go back inside the monolith, you actually have to be able to provide a lot more context about, hey, here's the right way of doing

Speaker A

things, or maybe this was the old way, right? Because over time, you may have had multiple patterns and the model is very good at figuring out hey, how have things been done before? So you have to spend an extra effort to be able

Speaker A

to tell it, okay, this is the new way that you want to do it.

Speaker A

Don't just follow the pattern that you see in the code base. So I think that's one big learning that, at least for now, having smaller code bases that have more standardization and more documentation built into them makes a big difference

Speaker A

in the model performance. Second one is, I think, many of the things that you kind of were okay with before when it came to automation are now becoming much costlier. So when the limiting factor was, well, how

Speaker A

long is it going to take us to write the code? You didn't care if some things required a little bit more human interaction. Now that everything when it comes to coding can be done by agents, you begin to constantly

Speaker A

find all of those new bottlenecks. And I think that's been one of our lessons as well, that to be able to go faster now, you need to go back and rethink the many different touch points. Sometimes it's the different kind of processes,

Speaker A

not just architecture that you've built in and question everything. Because what has worked for you before with a very different kind of distribution of what people were working on, which tasks took the longest time, it's not working now. So

Speaker A

everything has to be rethought. Awesome. Ruslan, if you guys were to start over today at monday.com, what's the architectural or platform decision you would rethink? Yeah, I think I would double-click on the API-first approach. As I mentioned before, I think if

Speaker A

we had invested in this earlier and more, and made it consistent across every service with clean boundaries, I think that would have really accelerated us even more on this journey. It's not just for the Vibe application. As I mentioned, it was using publicly exposed

Speaker A

GraphQL APIs, but even internal APIs and boundaries between services. In a company that was heavily investing in UI as a surface for when the users will love our customers who build the best UI for them, that moves super fast and of course,

Speaker A

it was always the main use case. Now agents, even external agents, interacting with our systems need API access that sometimes was just not there by default.

Speaker A

And I think that definitely would be number one thing I would invest in earlier.

Speaker A

I think, again, the other thing to mention is rethinking identity systems and authorization systems completely, to have an agent as a first-class user in the system that interacts with people as a user. With the granularity of permission models we have in our system, it's been quite a challenge.

Speaker A

So I think starting this earlier would also definitely help. Assuming agents are here to stay, right? So I think that's definitely something I would do earlier. Awesome.
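
The idea of an agent as a first-class user can be illustrated with a small permission model. This is a sketch under stated assumptions, not monday.com's design: the `Principal` type, the permission strings, and the rule that an agent acting for a person never exceeds that person's permissions are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PrincipalKind(Enum):
    HUMAN = "human"
    AGENT = "agent"


@dataclass(frozen=True)
class Principal:
    """Any actor in the system, human or agent (hypothetical model)."""
    id: str
    kind: PrincipalKind
    permissions: frozenset
    acts_for: Optional["Principal"] = None  # the user who delegated to the agent


def effective_permissions(principal: Principal) -> frozenset:
    # Assumption: a delegated agent gets the intersection of its own grants
    # and the effective permissions of the user it acts for.
    if principal.kind is PrincipalKind.AGENT and principal.acts_for is not None:
        return principal.permissions & effective_permissions(principal.acts_for)
    return principal.permissions


def can(principal: Principal, permission: str) -> bool:
    return permission in effective_permissions(principal)


alice = Principal("alice", PrincipalKind.HUMAN,
                  frozenset({"board:read", "board:write"}))
helper = Principal("helper-agent", PrincipalKind.AGENT,
                   frozenset({"board:read", "board:delete"}), acts_for=alice)

print(can(helper, "board:read"))    # granted to both the agent and Alice
print(can(helper, "board:delete"))  # granted to the agent only, so denied
```

Treating humans and agents as the same kind of principal means one permission check covers both, which is the property the speaker describes wanting earlier.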

Speaker A

I have a quick lightning round. So a few different questions for the three of you. One word, one sentence. I guess, Ruslan, to start, would you rather have one generalist coding agent or many specialists? I think I would go with many specialists

Speaker A

orchestrated by one generalist. At least for now, different specialists orchestrated by a generalist, but who knows? We'll see as the model gets better. I'm also definitely in the specialist camp, so I think that's where it's going.

Speaker A

Awesome. What are your best engineers spending time on now that they weren't 18 months ago? Yeah, so that's actually an interesting question because if you think about it, the best engineers are typically the principal engineers that you have, right? And

Speaker A

principal engineers, what are they doing usually? They're reviewing other people's code, they're part of architectural discussions and so on. They're usually not creating your code. And this, I think, is something where we see the script being flipped. So all of these people are

Speaker A

now producing a lot of code with the help of Claude and similar tools, right? So they're actually a lot more hands-on suddenly than they were before.

Speaker A

And I think we also see that with other people who are even non-engineers, like product managers and similar roles, or even engineering managers like this group here, who probably weren't that hands-on in the last 18 months and are now again

Speaker A

in a situation where we can actually deliver things ourselves or build prototypes and some other things. So I would definitely double down on the fact that there is more code generation output, but I think what's also interesting is how they're

Speaker A

doing it because they're not just sitting down and working in a synchronous manner with one terminal. What they're doing is orchestrating kind of the agents to do the work for them. For example, one really good use case that we found, we have a

Speaker A

data scientist who was doing prompt optimization before, and now he's orchestrated multiple agents and created a skill for this to be able to go and apply genetic algorithms to try different variations of prompts and then run all the evals on them. So

Speaker A

it's amplifying their own abilities using agents to do this at a much bigger scale than what they've been able to do before.
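
The prompt-optimization use case mentioned here, applying genetic algorithms to prompt variations and scoring them with evals, can be sketched as a toy loop. Everything below is illustrative: the fragment list is hypothetical, and `score` is a stand-in for a real eval run, which in practice would call a model and grade its outputs.

```python
import random

# Hypothetical instruction fragments that a prompt can include or leave out.
FRAGMENTS = [
    "Answer concisely.",
    "Think step by step.",
    "Cite the source document.",
    "Ask a clarifying question if the request is ambiguous.",
    "Reply in plain language.",
]


def score(prompt: frozenset) -> float:
    """Stand-in for an eval run (assumption): rewards two fragments, penalizes length."""
    wanted = {"Think step by step.", "Cite the source document."}
    return len(wanted & prompt) - 0.1 * len(prompt)


def mutate(prompt: frozenset, rng: random.Random) -> frozenset:
    """Toggle one randomly chosen fragment in or out of the prompt."""
    fragment = rng.choice(FRAGMENTS)
    return prompt - {fragment} if fragment in prompt else prompt | {fragment}


def crossover(a: frozenset, b: frozenset, rng: random.Random) -> frozenset:
    """Keep fragments shared by both parents; keep the others with probability 0.5."""
    child = set()
    for fragment in FRAGMENTS:
        in_a, in_b = fragment in a, fragment in b
        if (in_a and in_b) or ((in_a or in_b) and rng.random() < 0.5):
            child.add(fragment)
    return frozenset(child)


def evolve(generations: int = 20, population_size: int = 8, seed: int = 0) -> frozenset:
    rng = random.Random(seed)
    population = [frozenset(rng.sample(FRAGMENTS, k=2)) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: population_size // 2]  # selection: keep the better half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)


best = evolve()
print(" ".join(f for f in FRAGMENTS if f in best))
```

Because the better half of each generation is carried over unchanged, the best score never gets worse from one generation to the next; the expensive part in a real setup is the eval behind `score`, which is where running multiple agents in parallel would help.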

Speaker A

I'll try to be even shorter. So, I agree with everything said. On top of that, of course, best engineers spend time on building context layer, unified shared context layer, and what we call AI brains. It's definitely a thing that now, again, exists and

Speaker A

didn't exist even a year and a half ago. Awesome. What's the metric you actually look at every single morning? So for me, it's the merged pull requests and the success ratio to see how well our agent is performing. For me, it's still kind of our quality and reliability metrics because those are

Speaker A

kind of the control KPIs that tell you that if you're going kind of faster, everything is still working well. I'll go in an unusual way. Thinking about customer-facing AI experiences, yes, we look at success rates first, right, but I think

Speaker A

the most interesting part is to look at failure rates and go deep into those failures and find out why the right tool didn't get called or why the user didn't get what they wanted, and sometimes actually find a whole goldmine of new use cases

Speaker A

to build on. So I think that's definitely my favorite. Awesome. I have one final question before we close out this panel. For the engineers in the room who are maybe six months behind the three of you, what's the first thing you'd have an

Speaker A

agent do in their codebase on Monday morning? Yeah, I think it's simple. Start using it, right? So we've seen resistance from engineers in the overall group

Speaker A

with AI in the last 12 months, let's say, and at the beginning the models weren't that good, so I can see where that's coming from, but right now you can really achieve a lot by just trying it out. So what we're doing in

Speaker A

our company is actually we have given out kind of a mandate for this quarter to have every team develop a feature end-to-end with AI, right?

Speaker A

Just to help engineers kind of overcome that initial resistance and see the result and see how well it works. So my advice is use it. If you can't use it in your company, use it privately, just try

Speaker A

it out, build something amazing and you will feel the same enthusiasm that we heard this morning from Boris, right? And then I'm pretty sure you'll be convinced. For sure you have to try it, and maybe something as simple as: ask the agent to examine your code base, maybe tell it

Speaker A

even kind of what you like, what you don't like about it, and then ask the agent to come up kind of with a plan to improve those things, right?

Speaker A

And then like work kind of with the agent to create like a full like execution kind of flow. And then you'll be very pleasantly surprised about like how effective that's going to be. And I hope that that will get you to

Speaker A

believe more in the capabilities that are there today, and even more so what's coming in the future. I would say stop waiting for perfect conditions, perfect use cases, or enough AI-ready work. I think as engineers we know how imperfect our systems

Speaker A

are, how many flaws and things are not right in our code. "Only once we complete this big monolith split can we then do it full on. Only once we do that big refactoring project," which will probably take a quarter or two, right?

Speaker A

So I think that's kind of what we want to stop doing, right? None of it matters that much anymore. I think you can pick a thing tomorrow, maybe a piece of work that your team is doing repetitively and it's toil and kind of

Speaker A

like is a burden, right? Just unleash Claude full on and see its full potential.

Speaker A

And I think that's definitely what's here with us today already, so just go and try it, don't wait for anything. Awesome. It seems like the three of you are very aligned to just try and iterate using the agent to help guide you. Well,

Speaker A

that's it, folks. Thank you so much for spending the time with us today and hope you enjoy the rest of the Code with Claude conference.

Topics: AI-native enterprise, Claude AI, monday.com, Doctolib, Delivery Hero, autonomous software delivery, HeroGen, skills marketplace, enterprise AI adoption, legacy code modernization

Frequently Asked Questions

What is the main focus of the panel discussion in this video?

The panel discusses how monday.com, Doctolib, and Delivery Hero are transforming into AI-native enterprises by integrating Claude AI into their existing products and workflows.

How is Delivery Hero using Claude AI in their engineering process?

Delivery Hero built an autonomous software delivery system called HeroGen that uses Claude AI to convert Jira tickets or GitHub issues into production-ready pull requests, significantly increasing engineering throughput.

What strategies has Doctolib implemented to encourage company-wide AI adoption?

Doctolib created a skills marketplace to share AI capabilities across teams and roles, and fosters a collaborative environment through channels like 'Build with AI' to promote learning and innovation beyond just the engineering teams.
